Overview

Dataset statistics

Number of variables: 10
Number of observations: 6079
Missing cells: 3092
Missing cells (%): 5.1%
Duplicate rows: 21
Duplicate rows (%): 0.3%
Total size in memory: 492.9 KiB
Average record size in memory: 83.0 B
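
The percentages and the average record size above follow from the raw counts. A minimal sketch of that arithmetic, using only the figures printed in this block (the variable names are illustrative, not part of the report):

```python
# Figures copied from the "Dataset statistics" block.
n_variables = 10
n_observations = 6079
missing_cells = 3092
duplicate_rows = 21
total_size_kib = 492.9

# A "cell" is one value of one variable for one observation.
total_cells = n_variables * n_observations

missing_pct = 100 * missing_cells / total_cells
duplicate_pct = 100 * duplicate_rows / n_observations
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells (%):   {missing_pct:.1f}%")        # 5.1%
print(f"Duplicate rows (%):  {duplicate_pct:.1f}%")      # 0.3%
print(f"Average record size: {avg_record_bytes:.1f} B")  # 83.0 B
```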

Variable types

Categorical: 1
Text: 5
DateTime: 1
Numeric: 3

Alerts

Dataset has 21 (0.3%) duplicate rows [Duplicates]
정제우편번호 is highly overall correlated with 정제WGS84위도 and 1 other field [High correlation]
정제WGS84위도 is highly overall correlated with 정제우편번호 and 1 other field [High correlation]
정제WGS84경도 is highly overall correlated with 시군명 [High correlation]
시군명 is highly overall correlated with 정제우편번호 and 2 other fields [High correlation]
취득일 has 1333 (21.9%) missing values [Missing]
정제도로명주소 has 1197 (19.7%) missing values [Missing]
정제우편번호 has 194 (3.2%) missing values [Missing]
정제WGS84위도 has 184 (3.0%) missing values [Missing]
정제WGS84경도 has 184 (3.0%) missing values [Missing]
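
The five "Missing" alerts can be cross-checked against the dataset-level count of missing cells. A small sketch, assuming the remaining five variables have no missing values (which is what their per-variable sections report):

```python
# Missing-value counts copied from the alerts above.
missing_by_column = {
    "취득일": 1333,
    "정제도로명주소": 1197,
    "정제우편번호": 194,
    "정제WGS84위도": 184,
    "정제WGS84경도": 184,
}
n_observations = 6079

# Per-column share of missing values, rounded as in the report.
missing_pct = {
    name: round(100 * count / n_observations, 1)
    for name, count in missing_by_column.items()
}

# The sum over columns should equal the "Missing cells" figure (3092).
total_missing = sum(missing_by_column.values())

for name, pct in missing_pct.items():
    print(f"{name}: {pct}%")
print("Total missing cells:", total_missing)  # 3092
```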

Reproduction

Analysis started: 2024-05-17 20:28:06.884912
Analysis finished: 2024-05-17 20:28:12.502227
Duration: 5.62 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
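
The reported duration is simply the difference between the two timestamps. A minimal check using the standard library:

```python
from datetime import datetime

# Timestamps copied from the "Reproduction" block.
started = datetime.fromisoformat("2024-05-17 20:28:06.884912")
finished = datetime.fromisoformat("2024-05-17 20:28:12.502227")

duration_s = (finished - started).total_seconds()
print(f"Duration: {duration_s:.2f} seconds")  # Duration: 5.62 seconds
```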

Variables

시군명 (city/county name)
Categorical

HIGH CORRELATION

Distinct: 31
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 47.6 KiB

Value | Count
수원시 | 640
안양시 | 537
포천시 | 505
이천시 | 463
파주시 | 453
Other values (26) | 3481

Length

Max length: 4
Median length: 3
Mean length: 3.0435927
Min length: 3

Unique

Unique: 2
Unique (%): < 0.1%

Sample

1st row: 경기도
2nd row: 경기도
3rd row: 경기도
4th row: 경기도
5th row: 경기도

Common Values

Value | Count | Frequency (%)
수원시 | 640 | 10.5%
안양시 | 537 | 8.8%
포천시 | 505 | 8.3%
이천시 | 463 | 7.6%
파주시 | 453 | 7.5%
양평군 | 445 | 7.3%
안산시 | 413 | 6.8%
부천시 | 365 | 6.0%
가평군 | 336 | 5.5%
성남시 | 276 | 4.5%
Other values (21) | 1646 | 27.1%

Length

[Figure: histogram of lengths of the category]

건축물명 (building name)
Text

Distinct: 5587
Distinct (%): 91.9%
Missing: 0
Missing (%): 0.0%
Memory size: 47.6 KiB

Length

Max length: 41
Median length: 30
Mean length: 9.4393815
Min length: 2

Characters and Unicode

Total characters: 57382
Distinct characters: 664
Distinct categories: 12
Distinct scripts: 3
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 5380
Unique (%): 88.5%

Sample

1st row: 수리산 탐방안내소
2nd row: 포천병원 본관동 환경개선
3rd row: 안성 원곡119안전센터
4th row: 안산 신길119안전센터
5th row: 화성 봉담 119안전센터

Words

Value | Count | Frequency (%)
경로당 | 122 | 1.4%
마을회관 | 110 | 1.2%
행정복지센터 | 103 | 1.2%
공중화장실 | 83 | 0.9%
(value not shown) | 75 | 0.8%
화장실 | 69 | 0.8%
수원환경사업소 | 41 | 0.5%
하수종말처리장 | 31 | 0.3%
경기도 | 30 | 0.3%
주민센터 | 28 | 0.3%
Other values (6120) | 8173 | 92.2%

Most occurring characters

ValueCountFrequency (%)
2826
 
4.9%
1888
 
3.3%
1745
 
3.0%
1525
 
2.7%
1162
 
2.0%
1099
 
1.9%
1069
 
1.9%
986
 
1.7%
955
 
1.7%
936
 
1.6%
Other values (654) 43191
75.3%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 50565 | 88.1%
Space Separator | 2826 | 4.9%
Decimal Number | 1705 | 3.0%
Close Punctuation | 924 | 1.6%
Open Punctuation | 905 | 1.6%
Uppercase Letter | 281 | 0.5%
Dash Punctuation | 114 | 0.2%
Other Punctuation | 33 | 0.1%
Lowercase Letter | 14 | < 0.1%
Other Symbol | 8 | < 0.1%
Other values (2) | 7 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
1888
 
3.7%
1745
 
3.5%
1525
 
3.0%
1162
 
2.3%
1099
 
2.2%
1069
 
2.1%
986
 
1.9%
955
 
1.9%
936
 
1.9%
921
 
1.8%
Other values (595) 38279
75.7%
Uppercase Letter
ValueCountFrequency (%)
B 45
16.0%
A 44
15.7%
C 39
13.9%
E 21
7.5%
D 21
7.5%
M 19
 
6.8%
T 12
 
4.3%
S 11
 
3.9%
G 11
 
3.9%
L 7
 
2.5%
Other values (15) 51
18.1%
Decimal Number
ValueCountFrequency (%)
1 624
36.6%
2 433
25.4%
3 197
 
11.6%
9 106
 
6.2%
4 89
 
5.2%
5 64
 
3.8%
6 54
 
3.2%
0 48
 
2.8%
8 46
 
2.7%
7 44
 
2.6%
Lowercase Letter
ValueCountFrequency (%)
i 4
28.6%
c 2
14.3%
t 2
14.3%
v 1
 
7.1%
n 1
 
7.1%
y 1
 
7.1%
h 1
 
7.1%
b 1
 
7.1%
a 1
 
7.1%
Other Punctuation
ValueCountFrequency (%)
. 12
36.4%
/ 8
24.2%
, 4
 
12.1%
: 3
 
9.1%
· 3
 
9.1%
' 2
 
6.1%
? 1
 
3.0%
Close Punctuation
ValueCountFrequency (%)
) 923
99.9%
] 1
 
0.1%
Space Separator
ValueCountFrequency (%)
2826
100.0%
Open Punctuation
ValueCountFrequency (%)
( 905
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 114
100.0%
Other Symbol
ValueCountFrequency (%)
8
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 5
100.0%
Math Symbol
ValueCountFrequency (%)
~ 2
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 50565 | 88.1%
Common | 6522 | 11.4%
Latin | 295 | 0.5%

Most frequent character per script

Hangul
ValueCountFrequency (%)
1888
 
3.7%
1745
 
3.5%
1525
 
3.0%
1162
 
2.3%
1099
 
2.2%
1069
 
2.1%
986
 
1.9%
955
 
1.9%
936
 
1.9%
921
 
1.8%
Other values (595) 38279
75.7%
Latin
ValueCountFrequency (%)
B 45
15.3%
A 44
14.9%
C 39
13.2%
E 21
 
7.1%
D 21
 
7.1%
M 19
 
6.4%
T 12
 
4.1%
S 11
 
3.7%
G 11
 
3.7%
L 7
 
2.4%
Other values (24) 65
22.0%
Common
ValueCountFrequency (%)
2826
43.3%
) 923
 
14.2%
( 905
 
13.9%
1 624
 
9.6%
2 433
 
6.6%
3 197
 
3.0%
- 114
 
1.7%
9 106
 
1.6%
4 89
 
1.4%
5 64
 
1.0%
Other values (15) 241
 
3.7%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 50565 | 88.1%
ASCII | 6806 | 11.9%
CJK Compat | 8 | < 0.1%
None | 3 | < 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2826
41.5%
) 923
 
13.6%
( 905
 
13.3%
1 624
 
9.2%
2 433
 
6.4%
3 197
 
2.9%
- 114
 
1.7%
9 106
 
1.6%
4 89
 
1.3%
5 64
 
0.9%
Other values (47) 525
 
7.7%
Hangul
ValueCountFrequency (%)
1888
 
3.7%
1745
 
3.5%
1525
 
3.0%
1162
 
2.3%
1099
 
2.2%
1069
 
2.1%
986
 
1.9%
955
 
1.9%
936
 
1.9%
921
 
1.8%
Other values (595) 38279
75.7%
CJK Compat
ValueCountFrequency (%)
8
100.0%
None
ValueCountFrequency (%)
· 3
100.0%

면적 (area)
Text

Distinct: 5246
Distinct (%): 86.3%
Missing: 0
Missing (%): 0.0%
Memory size: 47.6 KiB

Length

Max length: 31
Median length: 15
Mean length: 6.5305149
Min length: 1

Characters and Unicode

Total characters: 39699
Distinct characters: 70
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 4823
Unique (%): 79.3%

Sample

1st row: 연면적 685.59㎡
2nd row: 리모델링공사
3rd row: 연면적 942㎡
4th row: 연면적 990㎡
5th row: 연면적 893㎡
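
Because 면적 is stored as free text (a label such as 연면적, a number, an optional ㎡ sign, or no number at all), it cannot be used numerically as-is. The sketch below shows one possible way to pull out the first number in each value. The helper name `parse_area` is hypothetical, and it assumes that a comma is a thousands separator and that only the first number in a value matters; neither assumption is confirmed by the report.

```python
import re
from typing import Optional

# First number in a string: digits, optional thousands commas, optional decimals.
_NUMBER = re.compile(r"\d[\d,]*(?:\.\d+)?")


def parse_area(raw: Optional[str]) -> Optional[float]:
    """Return the first number found in `raw` as a float, or None if there is none."""
    if raw is None:
        return None
    match = _NUMBER.search(raw)
    if match is None:
        return None
    # Assumption: commas are thousands separators, e.g. "4,893" means 4893.
    return float(match.group().replace(",", ""))


# Values taken from the sample rows of this report.
samples = ["연면적 685.59㎡", "리모델링공사", "연면적 942㎡", "4,893", "195.4"]
for value in samples:
    print(repr(value), "->", parse_area(value))
```

Values without any digits, such as 리모델링공사, come back as `None`, so they can be counted and reviewed separately rather than silently coerced.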

Words

Value | Count | Frequency (%)
연면적 | 1437 | 19.0%
건축면적 | 34 | 0.4%
12.96 | 21 | 0.3%
100.98 | 16 | 0.2%
60 | 13 | 0.2%
232.81 | 13 | 0.2%
198 | 12 | 0.2%
40 | 12 | 0.2%
36 | 10 | 0.1%
12 | 10 | 0.1%
Other values (5240) | 5996 | 79.2%

Most occurring characters

ValueCountFrequency (%)
. 4741
11.9%
1 4009
10.1%
2 3350
 
8.4%
4 2878
 
7.2%
3 2705
 
6.8%
6 2658
 
6.7%
9 2647
 
6.7%
8 2522
 
6.4%
5 2452
 
6.2%
7 2328
 
5.9%
Other values (60) 9409
23.7%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 27475 | 69.2%
Other Punctuation | 4988 | 12.6%
Other Letter | 4540 | 11.4%
Space Separator | 1495 | 3.8%
Other Symbol | 1191 | 3.0%
Uppercase Letter | 4 | < 0.1%
Lowercase Letter | 2 | < 0.1%
Close Punctuation | 2 | < 0.1%
Open Punctuation | 2 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
1478
32.6%
1477
32.5%
1438
31.7%
36
 
0.8%
34
 
0.7%
4
 
0.1%
3
 
0.1%
3
 
0.1%
3
 
0.1%
3
 
0.1%
Other values (41) 61
 
1.3%
Decimal Number
ValueCountFrequency (%)
1 4009
14.6%
2 3350
12.2%
4 2878
10.5%
3 2705
9.8%
6 2658
9.7%
9 2647
9.6%
8 2522
9.2%
5 2452
8.9%
7 2328
8.5%
0 1926
7.0%
Other Punctuation
ValueCountFrequency (%)
. 4741
95.0%
, 247
 
5.0%
Other Symbol
ValueCountFrequency (%)
1190
99.9%
1
 
0.1%
Space Separator
ValueCountFrequency (%)
1495
100.0%
Uppercase Letter
ValueCountFrequency (%)
F 4
100.0%
Lowercase Letter
ValueCountFrequency (%)
m 2
100.0%
Close Punctuation
ValueCountFrequency (%)
) 2
100.0%
Open Punctuation
ValueCountFrequency (%)
( 2
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 35153 | 88.5%
Hangul | 4540 | 11.4%
Latin | 6 | < 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
1478
32.6%
1477
32.5%
1438
31.7%
36
 
0.8%
34
 
0.7%
4
 
0.1%
3
 
0.1%
3
 
0.1%
3
 
0.1%
3
 
0.1%
Other values (41) 61
 
1.3%
Common
ValueCountFrequency (%)
. 4741
13.5%
1 4009
11.4%
2 3350
9.5%
4 2878
8.2%
3 2705
7.7%
6 2658
7.6%
9 2647
7.5%
8 2522
7.2%
5 2452
7.0%
7 2328
6.6%
Other values (7) 4863
13.8%
Latin
ValueCountFrequency (%)
F 4
66.7%
m 2
33.3%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 33968 | 85.6%
Hangul | 4540 | 11.4%
CJK Compat | 1191 | 3.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
. 4741
14.0%
1 4009
11.8%
2 3350
9.9%
4 2878
8.5%
3 2705
8.0%
6 2658
7.8%
9 2647
7.8%
8 2522
7.4%
5 2452
7.2%
7 2328
6.9%
Other values (7) 3678
10.8%
Hangul
ValueCountFrequency (%)
1478
32.6%
1477
32.5%
1438
31.7%
36
 
0.8%
34
 
0.7%
4
 
0.1%
3
 
0.1%
3
 
0.1%
3
 
0.1%
3
 
0.1%
Other values (41) 61
 
1.3%
CJK Compat
ValueCountFrequency (%)
1190
99.9%
1
 
0.1%

취득일 (acquisition date)
Text

MISSING

Distinct: 2951
Distinct (%): 62.2%
Missing: 1333
Missing (%): 21.9%
Memory size: 47.6 KiB

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Characters and Unicode

Total characters: 47460
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2133
Unique (%): 44.9%

Sample

1st row: 2019-05-09
2nd row: 2017-09-21
3rd row: 2014-03-26
4th row: 2011-02-28
5th row: 2022-02-15
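
취득일 is profiled as Text, yet every non-missing value is exactly 10 characters long and the samples look like ISO 8601 dates (YYYY-MM-DD). A minimal sketch of converting such values to dates follows; the helper name `parse_acquired` is hypothetical, and treating the string `<NA>` as missing is an assumption based on how missing values are printed in the sample rows.

```python
from datetime import date
from typing import Optional


def parse_acquired(raw: Optional[str]) -> Optional[date]:
    """Parse a YYYY-MM-DD string; return None for missing values."""
    if raw is None or raw == "<NA>":
        return None
    # Raises ValueError if the text is not a valid ISO date.
    return date.fromisoformat(raw)


samples = ["2019-05-09", "2017-09-21", "<NA>"]
parsed = [parse_acquired(value) for value in samples]
print(parsed)

# Consistency check on the report's own figures:
# non-missing rows times 10 characters each should equal "Total characters".
non_missing = 6079 - 1333
total_characters = non_missing * 10
print(non_missing, total_characters)  # 4746 47460
```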

Words

Value | Count | Frequency (%)
2018-01-31 | 27 | 0.6%
1995-02-28 | 27 | 0.6%
2007-12-28 | 23 | 0.5%
2019-01-01 | 21 | 0.4%
2003-10-30 | 19 | 0.4%
2012-06-29 | 17 | 0.4%
2005-02-28 | 17 | 0.4%
2012-05-22 | 16 | 0.3%
2018-01-01 | 16 | 0.3%
1997-03-25 | 15 | 0.3%
Other values (2941) | 4548 | 95.8%

Most occurring characters

ValueCountFrequency (%)
0 10738
22.6%
- 9492
20.0%
1 7963
16.8%
2 7537
15.9%
9 3389
 
7.1%
3 1773
 
3.7%
8 1683
 
3.5%
5 1379
 
2.9%
6 1251
 
2.6%
7 1223
 
2.6%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 37968 | 80.0%
Dash Punctuation | 9492 | 20.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 10738
28.3%
1 7963
21.0%
2 7537
19.9%
9 3389
 
8.9%
3 1773
 
4.7%
8 1683
 
4.4%
5 1379
 
3.6%
6 1251
 
3.3%
7 1223
 
3.2%
4 1032
 
2.7%
Dash Punctuation
ValueCountFrequency (%)
- 9492
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 47460 | 100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 10738
22.6%
- 9492
20.0%
1 7963
16.8%
2 7537
15.9%
9 3389
 
7.1%
3 1773
 
3.7%
8 1683
 
3.5%
5 1379
 
2.9%
6 1251
 
2.6%
7 1223
 
2.6%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 47460 | 100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 10738
22.6%
- 9492
20.0%
1 7963
16.8%
2 7537
15.9%
9 3389
 
7.1%
3 1773
 
3.7%
8 1683
 
3.5%
5 1379
 
2.9%
6 1251
 
2.6%
7 1223
 
2.6%

데이터기준일자 (data reference date)
DateTime

Distinct: 22
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 47.6 KiB
Minimum: 2022-12-31 00:00:00
Maximum: 2024-04-30 00:00:00

[Figure: histogram with fixed size bins (bins=22)]

정제도로명주소 (cleaned road-name address)
Text

MISSING

Distinct: 3581
Distinct (%): 73.4%
Missing: 1197
Missing (%): 19.7%
Memory size: 47.6 KiB

Length

Max length: 29
Median length: 26
Mean length: 19.725113
Min length: 13

Characters and Unicode

Total characters: 96298
Distinct characters: 401
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 3011
Unique (%): 61.7%

Sample

1st row: 경기도 군포시 속달로 347-4
2nd row: 경기도 포천시 포천로 1648
3rd row: 경기도 안성시 원곡면 원곡물류단지로 162-20
4th row: 경기도 안산시 단원구 삼일로 50
5th row: 경기도 화성시 봉담읍 동화새터길 135

Words

Value | Count | Frequency (%)
경기도 | 4882 | 21.3%
수원시 | 507 | 2.2%
포천시 | 435 | 1.9%
안양시 | 424 | 1.8%
이천시 | 368 | 1.6%
파주시 | 344 | 1.5%
안산시 | 332 | 1.4%
부천시 | 314 | 1.4%
양평군 | 311 | 1.4%
성남시 | 265 | 1.2%
Other values (3805) | 14753 | 64.3%

Most occurring characters

ValueCountFrequency (%)
18053
18.7%
5077
 
5.3%
4974
 
5.2%
4965
 
5.2%
4517
 
4.7%
4192
 
4.4%
1 3542
 
3.7%
2 2249
 
2.3%
2234
 
2.3%
3 1969
 
2.0%
Other values (391) 44526
46.2%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 60006 | 62.3%
Space Separator | 18053 | 18.7%
Decimal Number | 17274 | 17.9%
Dash Punctuation | 965 | 1.0%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
5077
 
8.5%
4974
 
8.3%
4965
 
8.3%
4517
 
7.5%
4192
 
7.0%
2234
 
3.7%
1849
 
3.1%
1611
 
2.7%
1557
 
2.6%
1480
 
2.5%
Other values (379) 27550
45.9%
Decimal Number
ValueCountFrequency (%)
1 3542
20.5%
2 2249
13.0%
3 1969
11.4%
4 1630
9.4%
5 1516
8.8%
6 1377
 
8.0%
0 1363
 
7.9%
7 1247
 
7.2%
8 1221
 
7.1%
9 1160
 
6.7%
Space Separator
ValueCountFrequency (%)
18053
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 965
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 60006 | 62.3%
Common | 36292 | 37.7%

Most frequent character per script

Hangul
ValueCountFrequency (%)
5077
 
8.5%
4974
 
8.3%
4965
 
8.3%
4517
 
7.5%
4192
 
7.0%
2234
 
3.7%
1849
 
3.1%
1611
 
2.7%
1557
 
2.6%
1480
 
2.5%
Other values (379) 27550
45.9%
Common
ValueCountFrequency (%)
18053
49.7%
1 3542
 
9.8%
2 2249
 
6.2%
3 1969
 
5.4%
4 1630
 
4.5%
5 1516
 
4.2%
6 1377
 
3.8%
0 1363
 
3.8%
7 1247
 
3.4%
8 1221
 
3.4%
Other values (2) 2125
 
5.9%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 60006 | 62.3%
ASCII | 36292 | 37.7%

Most frequent character per block

ASCII
ValueCountFrequency (%)
18053
49.7%
1 3542
 
9.8%
2 2249
 
6.2%
3 1969
 
5.4%
4 1630
 
4.5%
5 1516
 
4.2%
6 1377
 
3.8%
0 1363
 
3.8%
7 1247
 
3.4%
8 1221
 
3.4%
Other values (2) 2125
 
5.9%
Hangul
ValueCountFrequency (%)
5077
 
8.5%
4974
 
8.3%
4965
 
8.3%
4517
 
7.5%
4192
 
7.0%
2234
 
3.7%
1849
 
3.1%
1611
 
2.7%
1557
 
2.6%
1480
 
2.5%
Other values (379) 27550
45.9%

정제지번주소 (cleaned lot-number address)
Text

Distinct: 4474
Distinct (%): 73.6%
Missing: 0
Missing (%): 0.0%
Memory size: 47.6 KiB

Length

Max length: 47
Median length: 42
Mean length: 20.300872
Min length: 13

Characters and Unicode

Total characters: 123409
Distinct characters: 377
Distinct categories: 9
Distinct scripts: 4
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 3813
Unique (%): 62.7%

Sample

1st row: 경기도 군포시 속달동 306번지 일원
2nd row: 경기도 포천시 신읍동 243-1번지
3rd row: 경기도 안성시 원곡면 칠곡리 928-3
4th row: 경기도 안산시 단원구 신길동 1691번지
5th row: 경기도 화성시 봉담읍 동화리 621번지

Words

Value | Count | Frequency (%)
경기도 | 6077 | 20.5%
수원시 | 624 | 2.1%
포천시 | 509 | 1.7%
안양시 | 506 | 1.7%
이천시 | 465 | 1.6%
파주시 | 455 | 1.5%
양평군 | 446 | 1.5%
안산시 | 428 | 1.4%
부천시 | 368 | 1.2%
가평군 | 338 | 1.1%
Other values (5107) | 19460 | 65.6%

Most occurring characters

ValueCountFrequency (%)
23598
19.1%
6185
 
5.0%
6158
 
5.0%
6115
 
5.0%
5420
 
4.4%
1 4833
 
3.9%
4750
 
3.8%
- 3883
 
3.1%
2 2917
 
2.4%
3 2562
 
2.1%
Other values (367) 56988
46.2%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 72191 | 58.5%
Decimal Number | 23606 | 19.1%
Space Separator | 23598 | 19.1%
Dash Punctuation | 3883 | 3.1%
Close Punctuation | 48 | < 0.1%
Open Punctuation | 48 | < 0.1%
Uppercase Letter | 22 | < 0.1%
Other Punctuation | 11 | < 0.1%
Lowercase Letter | 2 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
6185
 
8.6%
6158
 
8.5%
6115
 
8.5%
5420
 
7.5%
4750
 
6.6%
2263
 
3.1%
2187
 
3.0%
1963
 
2.7%
1937
 
2.7%
1754
 
2.4%
Other values (337) 33459
46.3%
Uppercase Letter
ValueCountFrequency (%)
B 8
36.4%
A 4
18.2%
I 2
 
9.1%
L 1
 
4.5%
C 1
 
4.5%
D 1
 
4.5%
N 1
 
4.5%
T 1
 
4.5%
P 1
 
4.5%
J 1
 
4.5%
Decimal Number
ValueCountFrequency (%)
1 4833
20.5%
2 2917
12.4%
3 2562
10.9%
5 2301
9.7%
4 2279
9.7%
6 1980
8.4%
7 1897
 
8.0%
0 1703
 
7.2%
8 1592
 
6.7%
9 1542
 
6.5%
Other Punctuation
ValueCountFrequency (%)
, 8
72.7%
. 2
 
18.2%
? 1
 
9.1%
Lowercase Letter
ValueCountFrequency (%)
e 1
50.0%
c 1
50.0%
Space Separator
ValueCountFrequency (%)
23598
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 3883
100.0%
Close Punctuation
ValueCountFrequency (%)
) 48
100.0%
Open Punctuation
ValueCountFrequency (%)
( 48
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 72190 | 58.5%
Common | 51194 | 41.5%
Latin | 24 | < 0.1%
Han | 1 | < 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
6185
 
8.6%
6158
 
8.5%
6115
 
8.5%
5420
 
7.5%
4750
 
6.6%
2263
 
3.1%
2187
 
3.0%
1963
 
2.7%
1937
 
2.7%
1754
 
2.4%
Other values (336) 33458
46.3%
Common
ValueCountFrequency (%)
23598
46.1%
1 4833
 
9.4%
- 3883
 
7.6%
2 2917
 
5.7%
3 2562
 
5.0%
5 2301
 
4.5%
4 2279
 
4.5%
6 1980
 
3.9%
7 1897
 
3.7%
0 1703
 
3.3%
Other values (7) 3241
 
6.3%
Latin
ValueCountFrequency (%)
B 8
33.3%
A 4
16.7%
I 2
 
8.3%
L 1
 
4.2%
C 1
 
4.2%
D 1
 
4.2%
e 1
 
4.2%
c 1
 
4.2%
N 1
 
4.2%
T 1
 
4.2%
Other values (3) 3
 
12.5%
Han
ValueCountFrequency (%)
1
100.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 72190 | 58.5%
ASCII | 51218 | 41.5%
CJK | 1 | < 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
23598
46.1%
1 4833
 
9.4%
- 3883
 
7.6%
2 2917
 
5.7%
3 2562
 
5.0%
5 2301
 
4.5%
4 2279
 
4.4%
6 1980
 
3.9%
7 1897
 
3.7%
0 1703
 
3.3%
Other values (20) 3265
 
6.4%
Hangul
ValueCountFrequency (%)
6185
 
8.6%
6158
 
8.5%
6115
 
8.5%
5420
 
7.5%
4750
 
6.6%
2263
 
3.1%
2187
 
3.0%
1963
 
2.7%
1937
 
2.7%
1754
 
2.4%
Other values (336) 33458
46.3%
CJK
ValueCountFrequency (%)
1
100.0%

정제우편번호 (cleaned postal code)
Real number (ℝ)

HIGH CORRELATION, MISSING

Distinct: 2152
Distinct (%): 36.6%
Missing: 194
Missing (%): 3.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 13910.261
Minimum: 1377
Maximum: 18626
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 53.6 KiB

Quantile statistics

Minimum: 1377
5-th percentile: 10801
Q1: 12232
Median: 13901
Q3: 16069
95-th percentile: 17526.6
Maximum: 18626
Range: 17249
Interquartile range (IQR): 3837

Descriptive statistics

Standard deviation: 2337.2812
Coefficient of variation (CV): 0.16802569
Kurtosis: -0.94128228
Mean: 13910.261
Median Absolute Deviation (MAD): 2092
Skewness: 0.14749723
Sum: 81861887
Variance: 5462883.5
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=50)]
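
Several of the derived statistics above can be recomputed from the others, which is a quick way to confirm the flattened figures were read correctly. A minimal sketch using the values reported for 정제우편번호 (variable names are illustrative):

```python
# Values copied from the statistics reported for 정제우편번호.
minimum, maximum = 1377, 18626
q1, q3 = 12232, 16069
mean, std = 13910.261, 2337.2812
n_non_missing = 6079 - 194  # observations minus missing values

value_range = maximum - minimum   # reported Range: 17249
iqr = q3 - q1                     # reported IQR: 3837
cv = std / mean                   # reported CV: 0.16802569
variance = std ** 2               # reported Variance: 5462883.5
total = mean * n_non_missing      # reported Sum: 81861887

print("Range:", value_range)
print("IQR:", iqr)
print(f"CV: {cv:.6f}")
print(f"Variance: {variance:.1f}")
print(f"Sum: {total:.0f}")
```

The recomputed variance and sum agree with the report only up to rounding, because the mean and standard deviation are themselves printed with limited precision.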

Common values

Value | Count | Frequency (%)
17379 | 38 | 0.6%
18130 | 35 | 0.6%
12422 | 30 | 0.5%
11139 | 30 | 0.5%
13912 | 29 | 0.5%
14089 | 28 | 0.5%
13922 | 27 | 0.4%
18355 | 27 | 0.4%
11101 | 25 | 0.4%
17406 | 24 | 0.4%
Other values (2142) | 5592 | 92.0%
(Missing) | 194 | 3.2%

Minimum 10 values

Value | Count | Frequency (%)
1377 | 2 | < 0.1%
10046 | 1 | < 0.1%
10068 | 1 | < 0.1%
10109 | 1 | < 0.1%
10210 | 2 | < 0.1%
10212 | 1 | < 0.1%
10215 | 1 | < 0.1%
10218 | 2 | < 0.1%
10222 | 2 | < 0.1%
10223 | 7 | 0.1%

Maximum 10 values

Value | Count | Frequency (%)
18626 | 1 | < 0.1%
18527 | 1 | < 0.1%
18388 | 2 | < 0.1%
18358 | 8 | 0.1%
18355 | 27 | 0.4%
18298 | 1 | < 0.1%
18151 | 3 | < 0.1%
18150 | 1 | < 0.1%
18148 | 1 | < 0.1%
18147 | 1 | < 0.1%
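
Postal codes are identifiers rather than quantities, and South Korean postal codes have five digits. Profiling the column as a real number therefore drops any leading zero, which may be why the minimum appears as 1377 while every other value shown has five digits. The sketch below restores a five-digit string form. The helper name `format_postcode` is hypothetical, and whether 1377 really stands for 01377 or is an upstream data error cannot be determined from this report.

```python
import math
from typing import Optional


def format_postcode(value: Optional[float]) -> Optional[str]:
    """Render a numeric postal code as a zero-padded five-digit string."""
    if value is None or (isinstance(value, float) and math.isnan(value)):
        return None
    return f"{int(value):05d}"


print(format_postcode(1377.0))         # 01377
print(format_postcode(18626))          # 18626
print(format_postcode(float("nan")))   # None
```

Keeping the column as a string from the start (rather than converting back afterwards) would also stop statistics such as mean and skewness from being computed on an identifier.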

정제WGS84위도 (cleaned WGS84 latitude)
Real number (ℝ)

HIGH CORRELATION, MISSING

Distinct: 4187
Distinct (%): 71.0%
Missing: 184
Missing (%): 3.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 37.498918
Minimum: 36.916889
Maximum: 38.18295
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 53.6 KiB

Quantile statistics

Minimum: 36.916889
5-th percentile: 37.149035
Q1: 37.300591
Median: 37.439277
Q3: 37.70323
95-th percentile: 37.933689
Maximum: 38.18295
Range: 1.2660606
Interquartile range (IQR): 0.40263861

Descriptive statistics

Standard deviation: 0.25001354
Coefficient of variation (CV): 0.00666722
Kurtosis: -0.5639163
Mean: 37.498918
Median Absolute Deviation (MAD): 0.15665065
Skewness: 0.49857476
Sum: 221056.12
Variance: 0.062506769
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
37.3889976474 | 27 | 0.4%
37.4183798984 | 27 | 0.4%
37.8773715659 | 24 | 0.4%
37.4018698265 | 21 | 0.3%
37.1404782493 | 19 | 0.3%
37.3831019563 | 18 | 0.3%
37.8557615302 | 18 | 0.3%
37.3810305854 | 17 | 0.3%
37.5131520747 | 17 | 0.3%
37.8161951551 | 16 | 0.3%
Other values (4177) | 5691 | 93.6%
(Missing) | 184 | 3.0%

Minimum 10 values

Value | Count | Frequency (%)
36.9168894764 | 1 | < 0.1%
36.9350364703 | 1 | < 0.1%
36.9433047732 | 2 | < 0.1%
36.943676386 | 1 | < 0.1%
36.9517666318 | 1 | < 0.1%
36.9597572254 | 1 | < 0.1%
36.9602913498 | 1 | < 0.1%
36.9645754499 | 2 | < 0.1%
36.9651786777 | 1 | < 0.1%
36.9673433617 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
38.182950099 | 1 | < 0.1%
38.1666893514 | 1 | < 0.1%
38.1664994489 | 1 | < 0.1%
38.1661954165 | 1 | < 0.1%
38.1659979498 | 1 | < 0.1%
38.1623108065 | 1 | < 0.1%
38.1604142789 | 1 | < 0.1%
38.1603753855 | 1 | < 0.1%
38.1588427797 | 2 | < 0.1%
38.1581025812 | 1 | < 0.1%

정제WGS84경도 (cleaned WGS84 longitude)
Real number (ℝ)

HIGH CORRELATION, MISSING

Distinct: 4187
Distinct (%): 71.0%
Missing: 184
Missing (%): 3.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 127.09761
Minimum: 126.38913
Maximum: 127.79132
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 53.6 KiB

Quantile statistics

Minimum: 126.38913
5-th percentile: 126.75424
Q1: 126.86752
Median: 127.05105
Q3: 127.27898
95-th percentile: 127.54356
Maximum: 127.79132
Range: 1.4021954
Interquartile range (IQR): 0.41145995

Descriptive statistics

Standard deviation: 0.26653403
Coefficient of variation (CV): 0.0020970814
Kurtosis: -0.77912947
Mean: 127.09761
Median Absolute Deviation (MAD): 0.19966396
Skewness: 0.40681353
Sum: 749240.41
Variance: 0.071040391
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
126.9280462689 | 27 | 0.4%
126.9183187069 | 27 | 0.4%
126.8048065471 | 24 | 0.4%
126.9667477669 | 21 | 0.3%
127.0645822452 | 19 | 0.3%
126.9704727553 | 18 | 0.3%
127.1190825052 | 18 | 0.3%
126.9778048454 | 17 | 0.3%
126.7452297663 | 17 | 0.3%
127.5209053191 | 16 | 0.3%
Other values (4177) | 5691 | 93.6%
(Missing) | 184 | 3.0%

Minimum 10 values

Value | Count | Frequency (%)
126.3891261111 | 1 | < 0.1%
126.3915488852 | 1 | < 0.1%
126.3923701162 | 1 | < 0.1%
126.3929010601 | 1 | < 0.1%
126.450515871 | 1 | < 0.1%
126.4532254318 | 1 | < 0.1%
126.547640313 | 2 | < 0.1%
126.5526162746 | 1 | < 0.1%
126.5540584899 | 1 | < 0.1%
126.5669484651 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
127.7913215403 | 2 | < 0.1%
127.7910921485 | 1 | < 0.1%
127.7766029235 | 1 | < 0.1%
127.7727365668 | 1 | < 0.1%
127.7711531862 | 1 | < 0.1%
127.7706997381 | 1 | < 0.1%
127.7705441813 | 1 | < 0.1%
127.770290524 | 1 | < 0.1%
127.7688750232 | 1 | < 0.1%
127.7666402442 | 1 | < 0.1%

Interactions

[Figures: 9 interaction plots, not reproduced in this text version]

Correlations

[Figure: correlation heatmap]

 | 시군명 | 데이터기준일자 | 정제우편번호 | 정제WGS84위도 | 정제WGS84경도
시군명 | 1.000 | 1.000 | 0.991 | 0.936 | 0.911
데이터기준일자 | 1.000 | 1.000 | 0.953 | 0.888 | 0.839
정제우편번호 | 0.991 | 0.953 | 1.000 | 0.934 | 0.842
정제WGS84위도 | 0.936 | 0.888 | 0.934 | 1.000 | 0.732
정제WGS84경도 | 0.911 | 0.839 | 0.842 | 0.732 | 1.000

[Figure: correlation heatmap]

 | 정제우편번호 | 정제WGS84위도 | 정제WGS84경도 | 시군명
정제우편번호 | 1.000 | -0.922 | 0.059 | 0.852
정제WGS84위도 | -0.922 | 1.000 | -0.020 | 0.689
정제WGS84경도 | 0.059 | -0.020 | 1.000 | 0.620
시군명 | 0.852 | 0.689 | 0.620 | 1.000
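
The "High correlation" alerts in the overview can be reproduced from the second matrix above by flagging every pair whose absolute coefficient reaches a cut-off. The sketch below assumes a cut-off of 0.5; that value is an assumption chosen because it matches the alerts listed in this report, not something the report itself states.

```python
columns = ["정제우편번호", "정제WGS84위도", "정제WGS84경도", "시군명"]

# Coefficients copied from the second correlation matrix.
matrix = [
    [1.000, -0.922, 0.059, 0.852],
    [-0.922, 1.000, -0.020, 0.689],
    [0.059, -0.020, 1.000, 0.620],
    [0.852, 0.689, 0.620, 1.000],
]

THRESHOLD = 0.5  # assumed cut-off, not taken from the report


def highly_correlated(names, values, threshold):
    """Map each column to the other columns with |coefficient| >= threshold."""
    result = {}
    for i, name in enumerate(names):
        result[name] = [
            names[j]
            for j, coefficient in enumerate(values[i])
            if j != i and abs(coefficient) >= threshold
        ]
    return result


flags = highly_correlated(columns, matrix, THRESHOLD)
for name, partners in flags.items():
    print(name, "->", partners)
```

With this cut-off, 정제우편번호 and 정제WGS84위도 each get two partners, 정제WGS84경도 gets only 시군명, and 시군명 gets all three, which is the pattern the alerts describe. Note that the strong negative coefficient between postal code and latitude (-0.922) counts because the absolute value is used.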

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
[Figure] The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
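
Nullity correlation can be illustrated as the Pearson correlation between "is missing" indicators of two columns: 1 means the two columns are always missing together, -1 means one is missing exactly when the other is present. The sketch below uses hypothetical toy columns, not data from this report, and the function name `nullity_correlation` is illustrative.

```python
import math


def nullity_correlation(col_a, col_b):
    """Pearson correlation between the missing-value indicators of two columns."""
    a = [1.0 if value is None else 0.0 for value in col_a]
    b = [1.0 if value is None else 0.0 for value in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        # A column that is never (or always) missing has no defined correlation.
        return float("nan")
    return cov / math.sqrt(var_a * var_b)


# Hypothetical toy columns: latitude and longitude are missing in the same rows.
latitude = [37.3, None, 37.9, None, 37.0]
longitude = [126.9, None, 127.1, None, 127.2]
acquired = ["2019-05-09", "2017-09-21", None, "2014-03-26", None]

corr_lat_lon = nullity_correlation(latitude, longitude)
corr_lat_acq = nullity_correlation(latitude, acquired)
print(corr_lat_lon)
print(corr_lat_acq)
```

In this report 정제WGS84위도 and 정제WGS84경도 both have 184 missing values, so a nullity correlation close to 1 between them would be unsurprising, though the figure itself is not recoverable from this text version.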

Sample

First rows

Index | 시군명 | 건축물명 | 면적 | 취득일 | 데이터기준일자 | 정제도로명주소 | 정제지번주소 | 정제우편번호 | 정제WGS84위도 | 정제WGS84경도
0 | 경기도 | 수리산 탐방안내소 | 연면적 685.59㎡ | <NA> | 2022-12-31 | 경기도 군포시 속달로 347-4 | 경기도 군포시 속달동 306번지 일원 | 15889 | 37.348002 | 126.900526
1 | 경기도 | 포천병원 본관동 환경개선 | 리모델링공사 | <NA> | 2022-12-31 | 경기도 포천시 포천로 1648 | 경기도 포천시 신읍동 243-1번지 | 11142 | 37.903093 | 127.198349
2 | 경기도 | 안성 원곡119안전센터 | 연면적 942㎡ | <NA> | 2022-12-31 | 경기도 안성시 원곡면 원곡물류단지로 162-20 | 경기도 안성시 원곡면 칠곡리 928-3 | 17555 | 37.042071 | 127.158226
3 | 경기도 | 안산 신길119안전센터 | 연면적 990㎡ | <NA> | 2022-12-31 | 경기도 안산시 단원구 삼일로 50 | 경기도 안산시 단원구 신길동 1691번지 | 15403 | 37.335039 | 126.783624
4 | 경기도 | 화성 봉담 119안전센터 | 연면적 893㎡ | <NA> | 2022-12-31 | 경기도 화성시 봉담읍 동화새터길 135 | 경기도 화성시 봉담읍 동화리 621번지 | 18298 | 37.215031 | 126.962915
5 | 경기도 | 의정부병원 본관동 환경개선 및 장례식장 증축 | 연면적 969㎡ | <NA> | 2022-12-31 | 경기도 의정부시 흥선로 142 | 경기도 의정부시 의정부동 433번지 | 11671 | 37.741076 | 127.042514
6 | 구리시 | 갈매동 제설작업 전진기지 | 4,893 | 2019-05-09 | 2023-03-15 | 경기도 구리시 금강로 164 | 경기도 구리시 갈매동 8-10 외1필지 | 11901 | 37.640195 | 127.128664
7 | 구리시 | 구리 남자청소년 쉼터 | 195.4 | 2017-09-21 | 2023-03-15 | 경기도 구리시 안골로 32-1 | 경기도 구리시 교문동 736 | 11934 | 37.597102 | 127.134452
8 | 구리시 | 구리시 멀티스포츠센터 | 10,692 | 2014-03-26 | 2023-03-15 | 경기도 구리시 체육관로 137-25 | 경기도 구리시 교문동 153-1 외12필지 | 11934 | 37.596158 | 127.13517
9 | 경기도 | 양주 옥정119안전센터 | 연면적 992㎡ | <NA> | 2022-12-31 | <NA> | 경기도 양주시 옥정동 119-6번지 | <NA> | 37.82745 | 127.094035

Last rows

Index | 시군명 | 건축물명 | 면적 | 취득일 | 데이터기준일자 | 정제도로명주소 | 정제지번주소 | 정제우편번호 | 정제WGS84위도 | 정제WGS84경도
6069 | 포천시 | 폐수종말처리시설 가동 | 349.37 | 2007-12-31 | 2024-03-06 | 경기도 포천시 영중면 양문공단로 98 | 경기도 포천시 영중면 양문리 994 | 11128 | 38.007934 | 127.257803
6070 | 포천시 | 금주2리경로당 | 199.93 | 2003-03-04 | 2024-03-06 | 경기도 포천시 영중면 물안3길 14 | 경기도 포천시 영중면 금주리 275-6 | 11131 | 37.976763 | 127.273652
6071 | 포천시 | 거사2리경로당 | 179.88 | 2002-03-18 | 2024-03-06 | 경기도 포천시 영중면 금화봉길 569 | 경기도 포천시 영중면 거사리 295-1 | 11130 | 37.987725 | 127.235913
6072 | 포천시 | 성동4리 경로당 리모델링 대상 건물 | 110.07 | 2021-06-08 | 2024-03-06 | 경기도 포천시 영중면 성장로166번길 12-17 | 경기도 포천시 영중면 성동리 139-4 | 11128 | 38.027582 | 127.275593
6073 | 포천시 | 성동3리 경로당 화장실(옥외) | 144 | 2008-11-10 | 2024-03-06 | 경기도 포천시 영중면 나삼길 197 | 경기도 포천시 영중면 성동리 241-3 | 11128 | 38.019963 | 127.265089
6074 | 포천시 | 희망애찬 제작소 | 66 | 2021-07-31 | 2024-03-06 | 경기도 포천시 영중면 전영로 1382 | 경기도 포천시 영중면 영평리 209-4 | 11126 | 38.017187 | 127.212111
6075 | 포천시 | 영평2리 경로당 | 137.16 | 2013-06-05 | 2024-03-06 | 경기도 포천시 영중면 전영로 1355-41 | 경기도 포천시 영중면 영평리 453-2 | 11126 | 38.018566 | 127.208768
6076 | 포천시 | 영송리 분뇨처리장 나동 | 161 | 1986-12-30 | 2024-03-06 | <NA> | 경기도 포천시 영중면 영송리 616 | 11130 | 37.998872 | 127.20747
6077 | 포천시 | 영송리 분뇨처리장 다동 | 108 | 1992-10-31 | 2024-03-06 | <NA> | 경기도 포천시 영중면 영송리 616 | 11130 | 37.998872 | 127.20747
6078 | 포천시 | 영송리 분뇨처리장 라동 | 32.8 | 1992-10-31 | 2024-03-06 | <NA> | 경기도 포천시 영중면 영송리 616 | 11130 | 37.998872 | 127.20747

Duplicate rows

Most frequently occurring

Index | 시군명 | 건축물명 | 면적 | 취득일 | 데이터기준일자 | 정제도로명주소 | 정제지번주소 | 정제우편번호 | 정제WGS84위도 | 정제WGS84경도 | # duplicates
13 | 안양시 | 관양두산벤처다임 | 232.81 | 2007-12-28 | 2024-03-18 | 경기도 안양시 동안구 학의로 250 | 경기도 안양시 동안구 관양동 1307-37번지 | 14056 | 37.40187 | 126.966748 | 13
1 | 가평군 | 산장관광지 숙박시설 | 16.74 | 2009-03-12 | 2024-02-14 | <NA> | 경기도 가평군 상면 덕현리 산 74-6 | 12446 | 37.753713 | 127.410839 | 5
14 | 안양시 | 관양두산벤처다임 | 244.19 | 2007-12-28 | 2024-03-18 | 경기도 안양시 동안구 학의로 250 | 경기도 안양시 동안구 관양동 1307-37번지 | 14056 | 37.40187 | 126.966748 | 4
2 | 가평군 | 산장국민관광지 | 9 | 2005-12-22 | 2024-02-14 | <NA> | 경기도 가평군 상면 덕현리 산 74-6 | 12446 | 37.753713 | 127.410839 | 3
6 | 동두천시 | 종합운동장 | 56.54 | 1997-09-10 | 2024-04-30 | 경기도 동두천시 어등로 45 | 경기도 동두천시 생연동 70 종합운동장(화장실) | 11320 | 37.899696 | 127.070811 | 3
0 | 가평군 | 가평정수장 | 9.22 | 2004-11-29 | 2024-02-14 | <NA> | 경기도 가평군 가평읍 달전리 510-3 | 12422 | 37.807265 | 127.514661 | 2
3 | 가평군 | 연인산캠핑장공동화장실 | 138.4 | 2008-06-26 | 2024-02-14 | <NA> | 경기도 가평군 북면 백둔리 360 | 12406 | 37.901667 | 127.459329 | 2
4 | 가평군 | 연하희망마을센터 | 202.65 | 2012-08-31 | 2024-02-14 | <NA> | 경기도 가평군 상면 연하리 218-12 | 12444 | 37.803148 | 127.348497 | 2
5 | 광명시 | 3급관사 | 364.28 | <NA> | 2023-03-20 | <NA> | 경기도 광명시 모세로 27 (철산동) | <NA> | <NA> | <NA> | 2
7 | 부천시 | 삼삼약수경로당 | 34.31 | <NA> | 2023-03-20 | 경기도 부천시 지양로158번길 66 | 경기도 부천시 고강동 422-19 | 14469 | 37.522973 | 126.821937 | 2
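
The "# duplicates" column counts how many times each fully identical row occurs. The ten groups listed here already add up to 38 occurrences, more than the 21 duplicate rows reported in the overview, so the headline figure presumably counts distinct duplicated rows rather than occurrences; the report does not say so explicitly. The sketch below shows both ways of counting on hypothetical toy rows (shortened to three fields, not real records from the dataset).

```python
from collections import Counter

# Hypothetical toy rows; each tuple stands for one full record.
rows = [
    ("안양시", "관양두산벤처다임", "232.81"),
    ("안양시", "관양두산벤처다임", "232.81"),
    ("안양시", "관양두산벤처다임", "232.81"),
    ("가평군", "가평정수장", "9.22"),
    ("가평군", "가평정수장", "9.22"),
    ("포천시", "영평2리 경로당", "137.16"),
]

counts = Counter(rows)
duplicated = {row: n for row, n in counts.items() if n > 1}

n_groups = len(duplicated)                # distinct rows that occur more than once
n_occurrences = sum(duplicated.values())  # all rows belonging to those groups
n_redundant = n_occurrences - n_groups    # rows removed by de-duplication

print(n_groups, n_occurrences, n_redundant)  # 2 5 3

# Occurrence counts copied from the "# duplicates" column of the ten listed groups.
listed = [13, 5, 4, 3, 3, 2, 2, 2, 2, 2]
listed_total = sum(listed)
print(listed_total)  # 38
```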