Overview

Dataset statistics

Number of variables: 16
Number of observations: 41
Missing cells: 221
Missing cells (%): 33.7%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 5.4 KiB
Average record size in memory: 135.2 B
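The summary figures above can be cross-checked against each other and against the per-column missing counts listed under Alerts. The sketch below is illustrative arithmetic only: the variable names are hypothetical and every input is copied from this report.

```python
# Figures copied from this report.
n_variables = 16
n_observations = 41
avg_record_bytes = 135.2

# Per-column missing counts as listed in the Alerts section.
missing_per_column = [14, 26, 20, 39, 40, 41, 41]

total_cells = n_variables * n_observations
missing_cells = sum(missing_per_column)
missing_pct = 100 * missing_cells / total_cells
total_kib = avg_record_bytes * n_observations / 1024

print(total_cells)            # 656
print(missing_cells)          # 221
print(round(missing_pct, 1))  # 33.7
print(round(total_kib, 1))    # 5.4
```

The per-column counts sum to the reported 221 missing cells, and the average record size multiplied by the number of observations matches the reported total size.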

Variable types

Numeric: 2
Text: 6
Categorical: 4
DateTime: 2
Unsupported: 2

Dataset

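Description: 사업번호 (project number), 사업명 (project name), 승인기관 (approving agency), 주관기관 (supervising agency), 사업분야 (project field), 사업지역주소 (project site address), 등록일 (registration date), 평가대행자명 (assessment agent name), 사업자명 (project operator name), 사업시작일 (project start date), 사업종료일 (project end date), 사업규모(예산) (project scale (budget)), 작성계획생략여부 (whether the preparation plan was omitted), 진행단계 (progress stage), 평가서초안공개여부 (whether the draft assessment is public), 검토결과공개여부 (whether the review result is public)
Author: 서울특별시 (Seoul Metropolitan Government)
URL: https://data.seoul.go.kr/dataList/OA-2196/S/1/datasetView.do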

Alerts

사업종료일 has constant value "" [Constant]
사업번호 is highly overall correlated with 등록일 [High correlation]
등록일 is highly overall correlated with 사업번호 and 1 other field [High correlation]
평가서초안공개여부 is highly overall correlated with 등록일 [High correlation]
사업분야 is highly imbalanced (71.9%) [Imbalance]
작성계획생략여부 is highly imbalanced (83.5%) [Imbalance]
평가서초안공개여부 is highly imbalanced (53.9%) [Imbalance]
승인기관 has 14 (34.1%) missing values [Missing]
주관기관 has 26 (63.4%) missing values [Missing]
평가대행자명 has 20 (48.8%) missing values [Missing]
사업시작일 has 39 (95.1%) missing values [Missing]
사업종료일 has 40 (97.6%) missing values [Missing]
사업규모(예산) has 41 (100.0%) missing values [Missing]
검토결과공개여부 has 41 (100.0%) missing values [Missing]
사업번호 has unique values [Unique]
사업명 has unique values [Unique]
사업지역주소 has unique values [Unique]
사업규모(예산) is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
검토결과공개여부 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
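The imbalance percentages in the alerts above are consistent with one minus the normalized Shannon entropy of the category counts. That formula is an assumption checked against the counts in this report, not a statement of how ydata-profiling is implemented, and the function name `imbalance_score` is hypothetical.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log(k): 0 for a uniform split, near 1 when one class dominates."""
    k = len(counts)
    if k < 2:
        raise ValueError("need at least two categories")
    total = sum(counts)
    # Shannon entropy of the observed class proportions (empty classes contribute 0).
    entropy = -sum((c / total) * math.log(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log(k)

# Category counts taken from the Common Values tables in this report.
examples = {
    "사업분야": [39, 2],
    "작성계획생략여부": [40, 1],
    "평가서초안공개여부": [37, 4],
}
for name, counts in examples.items():
    print(name, f"{imbalance_score(counts):.1%}")
```

Under this assumption the three columns score 71.9%, 83.5% and 53.9%, matching the alert values.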

Reproduction

Analysis started: 2024-05-17 21:58:59.114183
Analysis finished: 2024-05-17 21:59:03.614939
Duration: 4.5 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

사업번호
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 41
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.6371874 × 10^12
Minimum: 1.5386407 × 10^12
Maximum: 1.7159056 × 10^12
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 501.0 B

Quantile statistics

Minimum: 1.5386407 × 10^12
5-th percentile: 1.5386407 × 10^12
Q1: 1.5386407 × 10^12
Median: 1.6638116 × 10^12
Q3: 1.7028649 × 10^12
95-th percentile: 1.7137457 × 10^12
Maximum: 1.7159056 × 10^12
Range: 1.7726485 × 10^11
Interquartile range (IQR): 1.6422419 × 10^11

Descriptive statistics

Standard deviation: 6.941529 × 10^10
Coefficient of variation (CV): 0.042399112
Kurtosis: -1.4802041
Mean: 1.6371874 × 10^12
Median Absolute Deviation (MAD): 4.3304772 × 10^10
Skewness: -0.47530791
Sum: 6.7124682 × 10^13
Variance: 4.8184826 × 10^21
Monotonicity: Strictly decreasing
Histogram with fixed size bins (bins=41)
Common values
Value | Count | Frequency (%)
1715905562753 | 1 | 2.4%
1538640714911 | 1 | 2.4%
1652247115552 | 1 | 2.4%
1635300766245 | 1 | 2.4%
1632976968491 | 1 | 2.4%
1626076698746 | 1 | 2.4%
1589161042870 | 1 | 2.4%
1578631220046 | 1 | 2.4%
1578630783787 | 1 | 2.4%
1538640715111 | 1 | 2.4%
Other values (31) | 31 | 75.6%

Minimum 10 values
Value | Count | Frequency (%)
1538640712791 | 1 | 2.4%
1538640712811 | 1 | 2.4%
1538640713571 | 1 | 2.4%
1538640713591 | 1 | 2.4%
1538640713631 | 1 | 2.4%
1538640714131 | 1 | 2.4%
1538640714331 | 1 | 2.4%
1538640714371 | 1 | 2.4%
1538640714831 | 1 | 2.4%
1538640714911 | 1 | 2.4%

Maximum 10 values
Value | Count | Frequency (%)
1715905562753 | 1 | 2.4%
1713836281857 | 1 | 2.4%
1713745723587 | 1 | 2.4%
1711586919427 | 1 | 2.4%
1710463401224 | 1 | 2.4%
1707351808324 | 1 | 2.4%
1707116392853 | 1 | 2.4%
1705019734188 | 1 | 2.4%
1703776283324 | 1 | 2.4%
1703738558285 | 1 | 2.4%
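
The 사업번호 values have 13 digits and fall in the range of Unix epoch timestamps in milliseconds, which would also explain the high correlation with 등록일 flagged in the alerts. This reading is an assumption, not something the report states. A minimal sketch of the check, using the largest and smallest values listed above (the helper name `epoch_ms_to_utc` is hypothetical):

```python
from datetime import datetime, timezone

def epoch_ms_to_utc(value: int) -> datetime:
    """Interpret an integer as milliseconds since the Unix epoch (an assumption)."""
    return datetime.fromtimestamp(value / 1000, tz=timezone.utc)

# Largest and smallest 사업번호 values from this report.
newest = epoch_ms_to_utc(1715905562753)
oldest = epoch_ms_to_utc(1538640712791)

print(newest.date())  # compare with the 등록일 maximum, 20240517
print(oldest.date())  # compare with the 등록일 minimum, 20101108
```

Under this reading the newest identifier converts to 2024-05-17, matching the latest 등록일, while the oldest identifiers fall in 2018, well after their 등록일 values. That pattern would be consistent with older records having been loaded in bulk, but the report itself confirms neither point.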

사업명
Text

UNIQUE 

Distinct: 41
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 460.0 B

Length

Max length: 40
Median length: 27
Mean length: 19.195122
Min length: 10

Characters and Unicode

Total characters: 787
Distinct characters: 179
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 41
Unique (%): 100.0%

Sample

1st row: 쌍문역 서측 도심 공공주택 복합사업
2nd row: 테스트(240423)
3rd row: 봉천 제14구역 주택재개발정비사업
4th row: 서울중앙지방검찰청 증축사업
5th row: 봉은사로 120 일원 복합시설 신축공사

Value | Count | Frequency (%)
주택사업 | 6 | 4.9%
리모델링 | 6 | 4.9%
도시환경정비사업 | 4 | 3.3%
재개발정비사업 | 3 | 2.5%
재건축정비사업 | 3 | 2.5%
개발사업 | 2 | 1.6%
재개발사업 | 2 | 1.6%
도시정비형 | 2 | 1.6%
신축공사 | 2 | 1.6%
주택재개발정비사업 | 2 | 1.6%
Other values (90) | 90 | 73.8%

Most occurring characters

ValueCountFrequency (%)
81
 
10.3%
39
 
5.0%
34
 
4.3%
23
 
2.9%
22
 
2.8%
21
 
2.7%
17
 
2.2%
17
 
2.2%
1 13
 
1.7%
13
 
1.7%
Other values (169) 507
64.4%

Most occurring categories

ValueCountFrequency (%)
Other Letter 630
80.1%
Space Separator 81
 
10.3%
Decimal Number 36
 
4.6%
Open Punctuation 12
 
1.5%
Close Punctuation 12
 
1.5%
Uppercase Letter 7
 
0.9%
Dash Punctuation 5
 
0.6%
Math Symbol 2
 
0.3%
Other Punctuation 2
 
0.3%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
39
 
6.2%
34
 
5.4%
23
 
3.7%
22
 
3.5%
21
 
3.3%
17
 
2.7%
17
 
2.7%
13
 
2.1%
13
 
2.1%
13
 
2.1%
Other values (147) 418
66.3%
Decimal Number
ValueCountFrequency (%)
1 13
36.1%
2 9
25.0%
4 4
 
11.1%
3 3
 
8.3%
6 2
 
5.6%
7 2
 
5.6%
0 2
 
5.6%
9 1
 
2.8%
Uppercase Letter
ValueCountFrequency (%)
D 2
28.6%
V 1
14.3%
F 1
14.3%
P 1
14.3%
S 1
14.3%
G 1
14.3%
Open Punctuation
ValueCountFrequency (%)
( 11
91.7%
1
 
8.3%
Close Punctuation
ValueCountFrequency (%)
) 11
91.7%
1
 
8.3%
Space Separator
ValueCountFrequency (%)
81
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 5
100.0%
Math Symbol
ValueCountFrequency (%)
~ 2
100.0%
Other Punctuation
ValueCountFrequency (%)
, 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 630
80.1%
Common 150
 
19.1%
Latin 7
 
0.9%

Most frequent character per script

Hangul
ValueCountFrequency (%)
39
 
6.2%
34
 
5.4%
23
 
3.7%
22
 
3.5%
21
 
3.3%
17
 
2.7%
17
 
2.7%
13
 
2.1%
13
 
2.1%
13
 
2.1%
Other values (147) 418
66.3%
Common
ValueCountFrequency (%)
81
54.0%
1 13
 
8.7%
( 11
 
7.3%
) 11
 
7.3%
2 9
 
6.0%
- 5
 
3.3%
4 4
 
2.7%
3 3
 
2.0%
~ 2
 
1.3%
6 2
 
1.3%
Other values (6) 9
 
6.0%
Latin
ValueCountFrequency (%)
D 2
28.6%
V 1
14.3%
F 1
14.3%
P 1
14.3%
S 1
14.3%
G 1
14.3%

Most occurring blocks

ValueCountFrequency (%)
Hangul 630
80.1%
ASCII 155
 
19.7%
None 2
 
0.3%

Most frequent character per block

ASCII
ValueCountFrequency (%)
81
52.3%
1 13
 
8.4%
( 11
 
7.1%
) 11
 
7.1%
2 9
 
5.8%
- 5
 
3.2%
4 4
 
2.6%
3 3
 
1.9%
D 2
 
1.3%
~ 2
 
1.3%
Other values (10) 14
 
9.0%
Hangul
ValueCountFrequency (%)
39
 
6.2%
34
 
5.4%
23
 
3.7%
22
 
3.5%
21
 
3.3%
17
 
2.7%
17
 
2.7%
13
 
2.1%
13
 
2.1%
13
 
2.1%
Other values (147) 418
66.3%
None
ValueCountFrequency (%)
1
50.0%
1
50.0%

승인기관
Text

MISSING 

Distinct: 15
Distinct (%): 55.6%
Missing: 14
Missing (%): 34.1%
Memory size: 460.0 B

Length

Max length: 5
Median length: 3
Mean length: 3.2592593
Min length: 3

Characters and Unicode

Total characters: 88
Distinct characters: 28
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 5
Unique (%): 18.5%

Sample

1st row: 관악구
2nd row: 서초구
3rd row: 송파구
4th row: 강서구
5th row: 광진구

Value | Count | Frequency (%)
관악구 | 3 | 11.1%
영등포구 | 3 | 11.1%
강서구 | 2 | 7.4%
광진구 | 2 | 7.4%
성북구 | 2 | 7.4%
성동구 | 2 | 7.4%
강남구 | 2 | 7.4%
강동구 | 2 | 7.4%
마포구 | 2 | 7.4%
동대문구 | 2 | 7.4%
Other values (5) | 5 | 18.5%

Most occurring characters

ValueCountFrequency (%)
26
29.5%
6
 
6.8%
6
 
6.8%
5
 
5.7%
4
 
4.5%
3
 
3.4%
3
 
3.4%
3
 
3.4%
3
 
3.4%
3
 
3.4%
Other values (18) 26
29.5%

Most occurring categories

ValueCountFrequency (%)
Other Letter 88
100.0%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
26
29.5%
6
 
6.8%
6
 
6.8%
5
 
5.7%
4
 
4.5%
3
 
3.4%
3
 
3.4%
3
 
3.4%
3
 
3.4%
3
 
3.4%
Other values (18) 26
29.5%

Most occurring scripts

ValueCountFrequency (%)
Hangul 88
100.0%

Most frequent character per script

Hangul
ValueCountFrequency (%)
26
29.5%
6
 
6.8%
6
 
6.8%
5
 
5.7%
4
 
4.5%
3
 
3.4%
3
 
3.4%
3
 
3.4%
3
 
3.4%
3
 
3.4%
Other values (18) 26
29.5%

Most occurring blocks

ValueCountFrequency (%)
Hangul 88
100.0%

Most frequent character per block

Hangul
ValueCountFrequency (%)
26
29.5%
6
 
6.8%
6
 
6.8%
5
 
5.7%
4
 
4.5%
3
 
3.4%
3
 
3.4%
3
 
3.4%
3
 
3.4%
3
 
3.4%
Other values (18) 26
29.5%

주관기관
Text

MISSING 

Distinct: 12
Distinct (%): 80.0%
Missing: 26
Missing (%): 63.4%
Memory size: 460.0 B

Length

Max length: 5
Median length: 3
Mean length: 3.4
Min length: 3

Characters and Unicode

Total characters: 51
Distinct characters: 26
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 9
Unique (%): 60.0%

Sample

1st row: 송파구
2nd row: 관악구
3rd row: 영등포구
4th row: 성북구
5th row: 관악구

Value | Count | Frequency (%)
관악구 | 2 | 13.3%
영등포구 | 2 | 13.3%
동대문구 | 2 | 13.3%
송파구 | 1 | 6.7%
성북구 | 1 | 6.7%
중랑구 | 1 | 6.7%
성동구 | 1 | 6.7%
강동구 | 1 | 6.7%
광진구 | 1 | 6.7%
용산구 | 1 | 6.7%
Other values (2) | 2 | 13.3%

Most occurring characters

ValueCountFrequency (%)
14
27.5%
4
 
7.8%
3
 
5.9%
2
 
3.9%
2
 
3.9%
2
 
3.9%
2
 
3.9%
2
 
3.9%
2
 
3.9%
2
 
3.9%
Other values (16) 16
31.4%

Most occurring categories

ValueCountFrequency (%)
Other Letter 51
100.0%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
14
27.5%
4
 
7.8%
3
 
5.9%
2
 
3.9%
2
 
3.9%
2
 
3.9%
2
 
3.9%
2
 
3.9%
2
 
3.9%
2
 
3.9%
Other values (16) 16
31.4%

Most occurring scripts

ValueCountFrequency (%)
Hangul 51
100.0%

Most frequent character per script

Hangul
ValueCountFrequency (%)
14
27.5%
4
 
7.8%
3
 
5.9%
2
 
3.9%
2
 
3.9%
2
 
3.9%
2
 
3.9%
2
 
3.9%
2
 
3.9%
2
 
3.9%
Other values (16) 16
31.4%

Most occurring blocks

ValueCountFrequency (%)
Hangul 51
100.0%

Most frequent character per block

Hangul
ValueCountFrequency (%)
14
27.5%
4
 
7.8%
3
 
5.9%
2
 
3.9%
2
 
3.9%
2
 
3.9%
2
 
3.9%
2
 
3.9%
2
 
3.9%
2
 
3.9%
Other values (16) 16
31.4%

사업분야
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): 4.9%
Missing: 0
Missing (%): 0.0%
Memory size: 460.0 B
도시의 개발: 39
철도의 건설: 2

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 도시의 개발
2nd row: 도시의 개발
3rd row: 도시의 개발
4th row: 도시의 개발
5th row: 도시의 개발

Common Values

Value | Count | Frequency (%)
도시의 개발 | 39 | 95.1%
철도의 건설 | 2 | 4.9%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
도시의 | 39 | 47.6%
개발 | 39 | 47.6%
철도의 | 2 | 2.4%
건설 | 2 | 2.4%

사업지역주소
Text

UNIQUE 

Distinct: 41
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 460.0 B

Length

Max length: 35
Median length: 25
Mean length: 18.292683
Min length: 3

Characters and Unicode

Total characters: 750
Distinct characters: 107
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 41
Unique (%): 100.0%

Sample

1st row: 도봉구 쌍문동 138-1 일대
2nd row: 테스트
3rd row: 관악구 봉천동 4-51번지 일대
4th row: 서초구 서초동 1724번지 일원
5th row: 강남구 역삼동 602-4번지 일원

Value | Count | Frequency (%)
일원 | 14 | 8.4%
일대 | 12 | 7.2%
서울특별시 | 8 | 4.8%
강남구 | 5 | 3.0%
영등포구 | 5 | 3.0%
관악구 | 4 | 2.4%
동대문구 | 3 | 1.8%
성동구 | 3 | 1.8%
마포구 | 2 | 1.2%
신정동 | 2 | 1.2%
Other values (103) | 109 | 65.3%

Most occurring characters

ValueCountFrequency (%)
126
 
16.8%
45
 
6.0%
43
 
5.7%
28
 
3.7%
27
 
3.6%
26
 
3.5%
- 25
 
3.3%
1 25
 
3.3%
2 22
 
2.9%
18
 
2.4%
Other values (97) 365
48.7%

Most occurring categories

ValueCountFrequency (%)
Other Letter 434
57.9%
Decimal Number 151
 
20.1%
Space Separator 126
 
16.8%
Dash Punctuation 25
 
3.3%
Other Punctuation 9
 
1.2%
Lowercase Letter 2
 
0.3%
Math Symbol 1
 
0.1%
Open Punctuation 1
 
0.1%
Close Punctuation 1
 
0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
45
 
10.4%
43
 
9.9%
28
 
6.5%
27
 
6.2%
26
 
6.0%
18
 
4.1%
15
 
3.5%
14
 
3.2%
14
 
3.2%
10
 
2.3%
Other values (78) 194
44.7%
Decimal Number
ValueCountFrequency (%)
1 25
16.6%
2 22
14.6%
8 16
10.6%
4 16
10.6%
3 16
10.6%
7 14
9.3%
5 14
9.3%
9 13
8.6%
0 10
 
6.6%
6 5
 
3.3%
Other Punctuation
ValueCountFrequency (%)
, 8
88.9%
. 1
 
11.1%
Lowercase Letter
ValueCountFrequency (%)
k 1
50.0%
m 1
50.0%
Space Separator
ValueCountFrequency (%)
126
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 25
100.0%
Math Symbol
ValueCountFrequency (%)
~ 1
100.0%
Open Punctuation
ValueCountFrequency (%)
( 1
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 434
57.9%
Common 314
41.9%
Latin 2
 
0.3%

Most frequent character per script

Hangul
ValueCountFrequency (%)
45
 
10.4%
43
 
9.9%
28
 
6.5%
27
 
6.2%
26
 
6.0%
18
 
4.1%
15
 
3.5%
14
 
3.2%
14
 
3.2%
10
 
2.3%
Other values (78) 194
44.7%
Common
ValueCountFrequency (%)
126
40.1%
- 25
 
8.0%
1 25
 
8.0%
2 22
 
7.0%
8 16
 
5.1%
4 16
 
5.1%
3 16
 
5.1%
7 14
 
4.5%
5 14
 
4.5%
9 13
 
4.1%
Other values (7) 27
 
8.6%
Latin
ValueCountFrequency (%)
k 1
50.0%
m 1
50.0%

Most occurring blocks

ValueCountFrequency (%)
Hangul 434
57.9%
ASCII 316
42.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
126
39.9%
- 25
 
7.9%
1 25
 
7.9%
2 22
 
7.0%
8 16
 
5.1%
4 16
 
5.1%
3 16
 
5.1%
7 14
 
4.4%
5 14
 
4.4%
9 13
 
4.1%
Other values (9) 29
 
9.2%
Hangul
ValueCountFrequency (%)
45
 
10.4%
43
 
9.9%
28
 
6.5%
27
 
6.2%
26
 
6.0%
18
 
4.1%
15
 
3.5%
14
 
3.2%
14
 
3.2%
10
 
2.3%
Other values (78) 194
44.7%

등록일
Real number (ℝ)

HIGH CORRELATION 

Distinct: 40
Distinct (%): 97.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 20200406
Minimum: 20101108
Maximum: 20240517
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 501.0 B

Quantile statistics

Minimum: 20101108
5-th percentile: 20110419
Q1: 20181114
Median: 20220922
Q3: 20231218
95-th percentile: 20240422
Maximum: 20240517
Range: 139409
Interquartile range (IQR): 50104

Descriptive statistics

Standard deviation: 46801.487
Coefficient of variation (CV): 0.0023168587
Kurtosis: -0.26672387
Mean: 20200406
Median Absolute Deviation (MAD): 19283
Skewness: -1.1284173
Sum: 8.2821665 × 10^8
Variance: 2.1903792 × 10^9
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=40)
Common values
Value | Count | Frequency (%)
20200110 | 2 | 4.9%
20240517 | 1 | 2.4%
20220627 | 1 | 2.4%
20220511 | 1 | 2.4%
20211027 | 1 | 2.4%
20210930 | 1 | 2.4%
20210712 | 1 | 2.4%
20200511 | 1 | 2.4%
20140919 | 1 | 2.4%
20130614 | 1 | 2.4%
Other values (30) | 30 | 73.2%

Minimum 10 values
Value | Count | Frequency (%)
20101108 | 1 | 2.4%
20101116 | 1 | 2.4%
20110419 | 1 | 2.4%
20110831 | 1 | 2.4%
20110904 | 1 | 2.4%
20130614 | 1 | 2.4%
20130823 | 1 | 2.4%
20140919 | 1 | 2.4%
20150727 | 1 | 2.4%
20150914 | 1 | 2.4%

Maximum 10 values
Value | Count | Frequency (%)
20240517 | 1 | 2.4%
20240423 | 1 | 2.4%
20240422 | 1 | 2.4%
20240328 | 1 | 2.4%
20240315 | 1 | 2.4%
20240208 | 1 | 2.4%
20240205 | 1 | 2.4%
20240112 | 1 | 2.4%
20231229 | 1 | 2.4%
20231228 | 1 | 2.4%
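
등록일 is profiled as a real number, but its values look like dates encoded as YYYYMMDD integers. If so, the mean, range and standard deviation reported above are computed on the encoded integers rather than on elapsed time. Treating the column as a date is an assumption about the data; a minimal sketch with a hypothetical helper `parse_yyyymmdd`:

```python
from datetime import date, datetime

def parse_yyyymmdd(value: int) -> date:
    """Parse an integer such as 20240517 into a calendar date."""
    return datetime.strptime(str(value), "%Y%m%d").date()

first = parse_yyyymmdd(20101108)  # Minimum reported for 등록일
last = parse_yyyymmdd(20240517)   # Maximum reported for 등록일

# Elapsed time in days, as opposed to the integer Range of 139409.
span_days = (last - first).days
integer_range = 20240517 - 20101108

print(first, last, span_days, integer_range)
```

Converting the column before profiling would let the report treat it as a DateTime variable, in the same way as 사업시작일 and 사업종료일.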

평가대행자명
Text

MISSING 

Distinct: 12
Distinct (%): 57.1%
Missing: 20
Missing (%): 48.8%
Memory size: 460.0 B

Length

Max length: 14
Median length: 8
Mean length: 8.8095238
Min length: 5

Characters and Unicode

Total characters: 185
Distinct characters: 48
Distinct categories: 5
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 8
Unique (%): 38.1%

Sample

1st row: (주)대한콘설탄트
2nd row: (주)동해종합기술공사
3rd row: (주)동해종합기술공사
4th row: (주)유연이앤씨
5th row: (주)유연이앤씨

Value | Count | Frequency (%)
주)예평이앤씨 | 6 | 27.3%
주)동해종합기술공사 | 3 | 13.6%
주)유연이앤씨 | 2 | 9.1%
주)동림피엔디 | 2 | 9.1%
주)대한콘설탄트 | 1 | 4.5%
주)청마 | 1 | 4.5%
주)다원피앤디 | 1 | 4.5%
주)대영이이씨 | 1 | 4.5%
주)한국종합공해시험연구소 | 1 | 4.5%
주)와이디엔에스 | 1 | 4.5%
Other values (3) | 3 | 13.6%

Most occurring characters

ValueCountFrequency (%)
21
 
11.4%
( 20
 
10.8%
) 20
 
10.8%
11
 
5.9%
9
 
4.9%
9
 
4.9%
6
 
3.2%
6
 
3.2%
5
 
2.7%
5
 
2.7%
Other values (38) 73
39.5%

Most occurring categories

ValueCountFrequency (%)
Other Letter 143
77.3%
Open Punctuation 20
 
10.8%
Close Punctuation 20
 
10.8%
Space Separator 1
 
0.5%
Uppercase Letter 1
 
0.5%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
21
 
14.7%
11
 
7.7%
9
 
6.3%
9
 
6.3%
6
 
4.2%
6
 
4.2%
5
 
3.5%
5
 
3.5%
5
 
3.5%
5
 
3.5%
Other values (34) 61
42.7%
Open Punctuation
ValueCountFrequency (%)
( 20
100.0%
Close Punctuation
ValueCountFrequency (%)
) 20
100.0%
Space Separator
ValueCountFrequency (%)
1
100.0%
Uppercase Letter
ValueCountFrequency (%)
E 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 143
77.3%
Common 41
 
22.2%
Latin 1
 
0.5%

Most frequent character per script

Hangul
ValueCountFrequency (%)
21
 
14.7%
11
 
7.7%
9
 
6.3%
9
 
6.3%
6
 
4.2%
6
 
4.2%
5
 
3.5%
5
 
3.5%
5
 
3.5%
5
 
3.5%
Other values (34) 61
42.7%
Common
ValueCountFrequency (%)
( 20
48.8%
) 20
48.8%
1
 
2.4%
Latin
ValueCountFrequency (%)
E 1
100.0%

Most occurring blocks

ValueCountFrequency (%)
Hangul 143
77.3%
ASCII 42
 
22.7%

Most frequent character per block

Hangul
ValueCountFrequency (%)
21
 
14.7%
11
 
7.7%
9
 
6.3%
9
 
6.3%
6
 
4.2%
6
 
4.2%
5
 
3.5%
5
 
3.5%
5
 
3.5%
5
 
3.5%
Other values (34) 61
42.7%
ASCII
ValueCountFrequency (%)
( 20
47.6%
) 20
47.6%
1
 
2.4%
E 1
 
2.4%

사업자명
Text

Distinct: 31
Distinct (%): 75.6%
Missing: 0
Missing (%): 0.0%
Memory size: 460.0 B

Length

Max length: 27
Median length: 23
Mean length: 10.97561
Min length: 2

Characters and Unicode

Total characters: 450
Distinct characters: 140
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 29
Unique (%): 70.7%

Sample

1st row: 한국토지주택공사
2nd row: 테스트
3rd row: 조합
4th row: 법무부
5th row: 마스턴제116호강남프리미어프로젝트금융투자(주)

Value | Count | Frequency (%)
조합 | 10 | 15.9%
리모델링주택조합 | 3 | 4.8%
재건축정비사업조합 | 2 | 3.2%
리모델링 | 2 | 3.2%
주택조합 | 2 | 3.2%
한국토지주택공사 | 2 | 3.2%
주식회사 | 2 | 3.2%
성균관대학교 | 1 | 1.6%
학교법인 | 1 | 1.6%
고려중앙학원 | 1 | 1.6%
Other values (37) | 37 | 58.7%

Most occurring characters

ValueCountFrequency (%)
22
 
4.9%
19
 
4.2%
19
 
4.2%
18
 
4.0%
13
 
2.9%
10
 
2.2%
9
 
2.0%
9
 
2.0%
( 8
 
1.8%
) 8
 
1.8%
Other values (130) 315
70.0%

Most occurring categories

ValueCountFrequency (%)
Other Letter 399
88.7%
Space Separator 22
 
4.9%
Open Punctuation 8
 
1.8%
Close Punctuation 8
 
1.8%
Decimal Number 8
 
1.8%
Other Punctuation 2
 
0.4%
Uppercase Letter 2
 
0.4%
Other Symbol 1
 
0.2%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
19
 
4.8%
19
 
4.8%
18
 
4.5%
13
 
3.3%
10
 
2.5%
9
 
2.3%
9
 
2.3%
7
 
1.8%
7
 
1.8%
7
 
1.8%
Other values (119) 281
70.4%
Decimal Number
ValueCountFrequency (%)
1 4
50.0%
2 2
25.0%
6 1
 
12.5%
4 1
 
12.5%
Uppercase Letter
ValueCountFrequency (%)
S 1
50.0%
G 1
50.0%
Space Separator
ValueCountFrequency (%)
22
100.0%
Open Punctuation
ValueCountFrequency (%)
( 8
100.0%
Close Punctuation
ValueCountFrequency (%)
) 8
100.0%
Other Punctuation
ValueCountFrequency (%)
, 2
100.0%
Other Symbol
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 400
88.9%
Common 48
 
10.7%
Latin 2
 
0.4%

Most frequent character per script

Hangul
ValueCountFrequency (%)
19
 
4.8%
19
 
4.8%
18
 
4.5%
13
 
3.2%
10
 
2.5%
9
 
2.2%
9
 
2.2%
7
 
1.8%
7
 
1.8%
7
 
1.8%
Other values (120) 282
70.5%
Common
ValueCountFrequency (%)
22
45.8%
( 8
 
16.7%
) 8
 
16.7%
1 4
 
8.3%
, 2
 
4.2%
2 2
 
4.2%
6 1
 
2.1%
4 1
 
2.1%
Latin
ValueCountFrequency (%)
S 1
50.0%
G 1
50.0%

Most occurring blocks

ValueCountFrequency (%)
Hangul 399
88.7%
ASCII 50
 
11.1%
None 1
 
0.2%

Most frequent character per block

ASCII
ValueCountFrequency (%)
22
44.0%
( 8
 
16.0%
) 8
 
16.0%
1 4
 
8.0%
, 2
 
4.0%
2 2
 
4.0%
6 1
 
2.0%
4 1
 
2.0%
S 1
 
2.0%
G 1
 
2.0%
Hangul
ValueCountFrequency (%)
19
 
4.8%
19
 
4.8%
18
 
4.5%
13
 
3.3%
10
 
2.5%
9
 
2.3%
9
 
2.3%
7
 
1.8%
7
 
1.8%
7
 
1.8%
Other values (119) 281
70.4%
None
ValueCountFrequency (%)
1
100.0%

사업시작일
Date

MISSING 

Distinct: 2
Distinct (%): 100.0%
Missing: 39
Missing (%): 95.1%
Memory size: 460.0 B
Minimum: 2023-01-05 00:00:00
Maximum: 2024-02-08 00:00:00
Histogram with fixed size bins (bins=2)

사업종료일
Date

CONSTANT  MISSING 

Distinct: 1
Distinct (%): 100.0%
Missing: 40
Missing (%): 97.6%
Memory size: 460.0 B
Minimum: 2023-02-06 00:00:00
Maximum: 2023-02-06 00:00:00
Histogram with fixed size bins (bins=1)

사업규모(예산)
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 41
Missing (%): 100.0%
Memory size: 501.0 B

작성계획생략여부
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): 4.9%
Missing: 0
Missing (%): 0.0%
Memory size: 460.0 B
작성계획서 생략: 40
생략 안함: 1

Length

Max length: 8
Median length: 8
Mean length: 7.9268293
Min length: 5

Unique

Unique: 1
Unique (%): 2.4%

Sample

1st row: 작성계획서 생략
2nd row: 작성계획서 생략
3rd row: 작성계획서 생략
4th row: 작성계획서 생략
5th row: 작성계획서 생략

Common Values

Value | Count | Frequency (%)
작성계획서 생략 | 40 | 97.6%
생략 안함 | 1 | 2.4%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
생략 | 41 | 50.0%
작성계획서 | 40 | 48.8%
안함 | 1 | 1.2%

진행단계
Categorical

Distinct: 5
Distinct (%): 12.2%
Missing: 0
Missing (%): 0.0%
Memory size: 460.0 B
평가서 초안(검토완료): 30
평가서 초안(검토중): 5
평가서 초안(검토의견완료): 3
평가서 초안(등록완료): 2
평가서 초안(등록대기): 1

Length

Max length: 14
Median length: 12
Mean length: 12.02439
Min length: 11

Unique

Unique: 1
Unique (%): 2.4%

Sample

1st row: 평가서 초안(등록완료)
2nd row: 평가서 초안(등록대기)
3rd row: 평가서 초안(검토중)
4th row: 평가서 초안(검토중)
5th row: 평가서 초안(검토완료)

Common Values

Value | Count | Frequency (%)
평가서 초안(검토완료) | 30 | 73.2%
평가서 초안(검토중) | 5 | 12.2%
평가서 초안(검토의견완료) | 3 | 7.3%
평가서 초안(등록완료) | 2 | 4.9%
평가서 초안(등록대기) | 1 | 2.4%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
평가서 | 41 | 50.0%
초안(검토완료 | 30 | 36.6%
초안(검토중 | 5 | 6.1%
초안(검토의견완료 | 3 | 3.7%
초안(등록완료 | 2 | 2.4%
초안(등록대기 | 1 | 1.2%

평가서초안공개여부
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 4.9%
Missing: 0
Missing (%): 0.0%
Memory size: 460.0 B
공개: 37
비공개: 4

Length

Max length: 3
Median length: 2
Mean length: 2.097561
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 공개
2nd row: 공개
3rd row: 공개
4th row: 공개
5th row: 공개

Common Values

Value | Count | Frequency (%)
공개 | 37 | 90.2%
비공개 | 4 | 9.8%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
공개 | 37 | 90.2%
비공개 | 4 | 9.8%

검토결과공개여부
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 41
Missing (%): 100.0%
Memory size: 501.0 B

Interactions


Correlations

Variable | 사업번호 | 사업명 | 승인기관 | 주관기관 | 사업분야 | 사업지역주소 | 등록일 | 평가대행자명 | 사업자명 | 사업시작일 | 작성계획생략여부 | 진행단계 | 평가서초안공개여부
사업번호 | 1.000 | 1.000 | 0.000 | 0.780 | 0.000 | 1.000 | 0.784 | 0.626 | 0.854 | 0.000 | 0.000 | 0.000 | 0.383
사업명 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 0.000 | 1.000 | 1.000 | 1.000
승인기관 | 0.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 0.814 | 0.821 | 0.876 | 0.000 | NaN | 0.919 | 0.866
주관기관 | 0.780 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 0.956 | 0.943 | 0.891 | NaN | NaN | 1.000 | 1.000
사업분야 | 0.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 0.288 | 1.000 | 1.000 | NaN | 0.000 | 0.180 | 0.000
사업지역주소 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 0.000 | 1.000 | 1.000 | 1.000
등록일 | 0.784 | 1.000 | 0.814 | 0.956 | 0.288 | 1.000 | 1.000 | 0.000 | 0.972 | NaN | 0.413 | 0.000 | 0.762
평가대행자명 | 0.626 | 1.000 | 0.821 | 0.943 | 1.000 | 1.000 | 0.000 | 1.000 | 1.000 | NaN | 0.000 | 0.000 | 0.472
사업자명 | 0.854 | 1.000 | 0.876 | 0.891 | 1.000 | 1.000 | 0.972 | 1.000 | 1.000 | 0.000 | 0.000 | 0.927 | 1.000
사업시작일 | 0.000 | 0.000 | 0.000 | NaN | NaN | 0.000 | NaN | NaN | 0.000 | 1.000 | NaN | NaN | NaN
작성계획생략여부 | 0.000 | 1.000 | NaN | NaN | 0.000 | 1.000 | 0.413 | 0.000 | 0.000 | NaN | 1.000 | 0.000 | 0.000
진행단계 | 0.000 | 1.000 | 0.919 | 1.000 | 0.180 | 1.000 | 0.000 | 0.000 | 0.927 | NaN | 0.000 | 1.000 | 0.000
평가서초안공개여부 | 0.383 | 1.000 | 0.866 | 1.000 | 0.000 | 1.000 | 0.762 | 0.472 | 1.000 | NaN | 0.000 | 0.000 | 1.000

Variable | 평가서초안공개여부 | 작성계획생략여부 | 사업분야 | 진행단계
평가서초안공개여부 | 1.000 | 0.000 | 0.000 | 0.000
작성계획생략여부 | 0.000 | 1.000 | 0.000 | 0.000
사업분야 | 0.000 | 0.000 | 1.000 | 0.206
진행단계 | 0.000 | 0.000 | 0.206 | 1.000

Variable | 사업번호 | 등록일 | 사업분야 | 작성계획생략여부 | 진행단계 | 평가서초안공개여부
사업번호 | 1.000 | 0.997 | 0.000 | 0.000 | 0.000 | 0.351
등록일 | 0.997 | 1.000 | 0.474 | 0.413 | 0.000 | 0.803
사업분야 | 0.000 | 0.474 | 1.000 | 0.000 | 0.206 | 0.000
작성계획생략여부 | 0.000 | 0.413 | 0.000 | 1.000 | 0.000 | 0.000
진행단계 | 0.000 | 0.000 | 0.206 | 0.000 | 1.000 | 0.000
평가서초안공개여부 | 0.351 | 0.803 | 0.000 | 0.000 | 0.000 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.

Sample

First rows
Row | 사업번호 | 사업명 | 승인기관 | 주관기관 | 사업분야 | 사업지역주소 | 등록일 | 평가대행자명 | 사업자명 | 사업시작일 | 사업종료일 | 사업규모(예산) | 작성계획생략여부 | 진행단계 | 평가서초안공개여부 | 검토결과공개여부
0 | 1715905562753 | 쌍문역 서측 도심 공공주택 복합사업 | <NA> | <NA> | 도시의 개발 | 도봉구 쌍문동 138-1 일대 | 20240517 | <NA> | 한국토지주택공사 | <NA> | <NA> | <NA> | 작성계획서 생략 | 평가서 초안(등록완료) | 공개 | <NA>
1 | 1713836281857 | 테스트(240423) | <NA> | <NA> | 도시의 개발 | 테스트 | 20240423 | <NA> | 테스트 | <NA> | <NA> | <NA> | 작성계획서 생략 | 평가서 초안(등록대기) | 공개 | <NA>
2 | 1713745723587 | 봉천 제14구역 주택재개발정비사업 | 관악구 | <NA> | 도시의 개발 | 관악구 봉천동 4-51번지 일대 | 20240422 | <NA> | 조합 | <NA> | <NA> | <NA> | 작성계획서 생략 | 평가서 초안(검토중) | 공개 | <NA>
3 | 1711586919427 | 서울중앙지방검찰청 증축사업 | 서초구 | <NA> | 도시의 개발 | 서초구 서초동 1724번지 일원 | 20240328 | <NA> | 법무부 | <NA> | <NA> | <NA> | 작성계획서 생략 | 평가서 초안(검토중) | 공개 | <NA>
4 | 1710463401224 | 봉은사로 120 일원 복합시설 신축공사 | <NA> | <NA> | 도시의 개발 | 강남구 역삼동 602-4번지 일원 | 20240315 | <NA> | 마스턴제116호강남프리미어프로젝트금융투자(주) | <NA> | <NA> | <NA> | 작성계획서 생략 | 평가서 초안(검토완료) | 공개 | <NA>
5 | 1707351808324 | 마천3재정비촉진구역 주택재개발정비사업 | 송파구 | 송파구 | 도시의 개발 | 송파구 마천동 215번지 일대 | 20240208 | <NA> | 조합 | 2024-02-08 | <NA> | <NA> | 작성계획서 생략 | 평가서 초안(검토완료) | 공개 | <NA>
6 | 1707116392853 | 케이스퀘어 그랜드강서 PFV 신축공사 | 강서구 | <NA> | 도시의 개발 | 강서구 가양동 449-19번지 일원 | 20240205 | <NA> | 케이스퀘어그랜드강서피에프브이 주식회사 | <NA> | <NA> | <NA> | 작성계획서 생략 | 평가서 초안(검토중) | 공개 | <NA>
7 | 1705019734188 | 대치현대아파트 리모델링 주택사업 | <NA> | <NA> | 도시의 개발 | 서울특별시 강남구 대치동 974 | 20240112 | <NA> | 대치현대아파트 리모델링 주택조합 | <NA> | <NA> | <NA> | 작성계획서 생략 | 평가서 초안(검토완료) | 공개 | <NA>
8 | 1703776283324 | 자양우성1차아파트 리모델링 주택사업 | 광진구 | <NA> | 도시의 개발 | 광진구 자양동 579번지 일대 | 20231229 | <NA> | 자양우성1차아파트 리모델링주택조합 | <NA> | <NA> | <NA> | 작성계획서 생략 | 평가서 초안(검토완료) | 공개 | <NA>
9 | 1703738558285 | 신림1재정비촉진구역 재개발정비사업 | 관악구 | 관악구 | 도시의 개발 | 관악구 신림동 808번지 일대 | 20231228 | <NA> | 조합 | <NA> | <NA> | <NA> | 작성계획서 생략 | 평가서 초안(검토완료) | 공개 | <NA>

Last rows
Row | 사업번호 | 사업명 | 승인기관 | 주관기관 | 사업분야 | 사업지역주소 | 등록일 | 평가대행자명 | 사업자명 | 사업시작일 | 사업종료일 | 사업규모(예산) | 작성계획생략여부 | 진행단계 | 평가서초안공개여부 | 검토결과공개여부
31 | 1538640714911 | 한강로구역 도시환경정비사업 | 용산구 | 용산구 | 도시의 개발 | 용산구 한강로1가 158번지 일원 | 20130614 | (주)대영이이씨 | 한강로구역 도시환경정비사업조합 | <NA> | <NA> | <NA> | 작성계획서 생략 | 평가서 초안(검토의견완료) | 공개 | <NA>
32 | 1538640714831 | 세운 재정비촉진지구 3-1~9구역 도시환경정비사업 환경영향평가서(재협의) | <NA> | <NA> | 도시의 개발 | 중구 입정동 2-4번지 일원 | 20181114 | (주)예평이앤씨 | 더센터시티제이차(주)외 1개사 | <NA> | <NA> | <NA> | 작성계획서 생략 | 평가서 초안(검토완료) | 공개 | <NA>
33 | 1538640714371 | 성균관대학교 인문사회과학캠퍼스(교수회관 신축) | <NA> | <NA> | 도시의 개발 | 종로구 성균관로 25-2 일대 | 20150914 | (주)동해종합기술공사 | 성균관대학교 | <NA> | <NA> | <NA> | 작성계획서 생략 | 평가서 초안(검토완료) | 공개 | <NA>
34 | 1538640714331 | 서울시립대학교(기숙사 증축) 환경영향평가서 초안 | <NA> | <NA> | 도시의 개발 | 동대문구 서울시립대로 163 | 20150727 | (주)한국종합공해시험연구소 | 서울시립대학교 | <NA> | <NA> | <NA> | 작성계획서 생략 | 평가서 초안(검토의견완료) | 공개 | <NA>
35 | 1538640714131 | 행복주택건설을 위한 오류동 주택지구 | <NA> | <NA> | 도시의 개발 | 구로구 오류동 33-177, 개봉동 237-3번지 일원 | 20130823 | (주)동림피엔디 | 한국토지주택공사 | <NA> | <NA> | <NA> | 생략 안함 | 평가서 초안(검토완료) | 공개 | <NA>
36 | 1538640713631 | 신림선 경전철 민간투자사업 환경영향평가(간이평가서) | <NA> | <NA> | 철도의 건설 | 영등포구 여의도동 ~ 관악구 대학동(7.8km) | 20110904 | (주)와이디엔에스 | 남서울경전철(주) | <NA> | <NA> | <NA> | 작성계획서 생략 | 평가서 초안(검토완료) | 공개 | <NA>
37 | 1538640713591 | 합정2구역 도시환경정비사업(재협의) | 마포구 | 마포구 | 도시의 개발 | 마포구 합정동 385-1 | 20110831 | (주)예평이앤씨 | (주)파나씨티 | <NA> | <NA> | <NA> | 작성계획서 생략 | 평가서 초안(검토완료) | 비공개 | <NA>
38 | 1538640713571 | 청량리 제4구역 도시환경정비사업 | 동대문구 | 동대문구 | 도시의 개발 | 동대문구 전농동 620-1번지 외 210필지 | 20110419 | (주)동림피엔디 | 청량리 제4구역 도시환경정비사업 추진위원회 | <NA> | <NA> | <NA> | 작성계획서 생략 | 평가서 초안(검토완료) | 비공개 | <NA>
39 | 1538640712811 | 동부청과 시장정비사업 | 동대문구 | 동대문구 | 도시의 개발 | 동대문구 용두동 39-1번지 | 20101116 | (주)포도E | 동부청과 주식회사 | <NA> | <NA> | <NA> | 작성계획서 생략 | 평가서 초안(검토완료) | 비공개 | <NA>
40 | 1538640712791 | 수도권 고속철도(수서~평택) 환경영향평가서(초안) | 국토해양부 | 서울특별시 | 철도의 건설 | 강남구, 경기도 성남시, 용인시, 화성시, 오산시, 평택시 일원 | 20101108 | 주식회사 도화종합기술공사 | 한국철도시설공단 | <NA> | <NA> | <NA> | 작성계획서 생략 | 평가서 초안(검토의견완료) | 비공개 | <NA>