Overview

Dataset statistics

Number of variables: 6
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 5
Duplicate rows (%): 0.1%
Total size in memory: 546.9 KiB
Average record size in memory: 56.0 B
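The percentage and per-record figures in the statistics above follow from the raw counts. A minimal sketch of that arithmetic, assuming percentages are taken over the number of observations (or cells) and that 1 KiB is 1024 bytes; the variable names are illustrative and not taken from ydata-profiling:

```python
# Values copied from the Dataset statistics in this report.
n_variables = 6
n_observations = 10_000
missing_cells = 0
duplicate_rows = 5
total_size_kib = 546.9

total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells
duplicate_pct = 100 * duplicate_rows / n_observations
# Assumption: KiB means 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells (%): {missing_pct:.1f}")
print(f"Duplicate rows (%): {duplicate_pct:.2f}")
print(f"Average record size (B): {avg_record_bytes:.1f}")
```

Five duplicate rows out of 10000 is 0.05%, which the report presumably shows rounded to one decimal place as 0.1%.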

Variable types

Text: 2
Categorical: 2
DateTime: 2

Dataset

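Description: Data provided by the Korea Authority of Land and Infrastructure Safety (국토안전관리원). It covers items such as inspection and diagnosis records for public facilities (excluding apartment housing), by facility type.
URL: https://www.data.go.kr/data/15083082/fileData.do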

Alerts

Dataset has 5 (0.1%) duplicate rows [Duplicates]
점검구분 is highly imbalanced (63.4%) [Imbalance]
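The 63.4% imbalance figure is consistent with an entropy-based score: one minus the Shannon entropy of the class shares divided by the maximum possible entropy. The sketch below is a reconstruction from the 점검구분 counts in this report (9300 and 700), not a quotation of ydata-profiling's source; the function name is hypothetical.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the entropy (in bits) of the class shares."""
    total = sum(counts)
    k = len(counts)
    if k < 2 or total == 0:
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)

# Counts of the two 점검구분 values reported in this document.
score = imbalance_score([9300, 700])
print(f"{score:.1%}")
```

Under this definition a perfectly balanced variable scores 0 and a variable with a single dominant class approaches 1.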

Reproduction

Analysis started: 2023-12-12 19:13:50.157756
Analysis finished: 2023-12-12 19:13:51.107801
Duration: 0.95 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
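A report like this one can be regenerated with ydata-profiling. The following is a sketch only: the CSV file name, the `cp949` encoding, the report title, and the choice to parse 시작일 and 종료일 as dates are assumptions, since the report does not record how the data was loaded.

```python
# Metadata copied from the Dataset section of this report.
DATASET_META = {
    "description": "Inspection and diagnosis records for public facilities "
                   "(excluding apartment housing), by facility type.",
    "url": "https://www.data.go.kr/data/15083082/fileData.do",
}

def build_report(csv_path, encoding="cp949"):
    """Build a profile for the CSV at csv_path (path and encoding are assumptions)."""
    # Imported lazily so this module can be loaded without the packages installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path, encoding=encoding, parse_dates=["시작일", "종료일"])
    return ProfileReport(df, title="Facility inspection profile", dataset=DATASET_META)

# Usage with a hypothetical file name:
# build_report("inspections.csv").to_file("report.html")
```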

Variables

시설물명 (facility name)
Text

Distinct: 9786
Distinct (%): 97.9%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 49
Median length: 36
Mean length: 8.7471
Min length: 2

Characters and Unicode

Total characters: 87471
Distinct characters: 626
Distinct categories: 13
Distinct scripts: 3
Distinct blocks: 7
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
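As an illustration of those character properties, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below counts categories for two of the sample values from this report; the report's labels "Other Letter", "Uppercase Letter" and "Decimal Number" correspond to the codes `Lo`, `Lu` and `Nd`. The helper name is hypothetical.

```python
import unicodedata
from collections import Counter

def category_counts(values):
    """Count characters by Unicode general category code (e.g. 'Lo', 'Nd')."""
    return Counter(unicodedata.category(ch) for text in values for ch in text)

sample = ["내수IC교", "미산9배수문"]
counts = category_counts(sample)
print(counts)
```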

Unique

Unique: 9600
Unique (%): 96.0%
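The Distinct and Unique figures differ because they count different things. A small sketch, assuming Distinct is the number of different values and Unique is the number of values that occur exactly once; the list below is a toy example, not the dataset:

```python
from collections import Counter

def distinct_and_unique(values):
    """Return (number of different values, number of values seen exactly once)."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for n in counts.values() if n == 1)
    return distinct, unique

names = ["영천교", "신천교", "영천교", "번영교"]
result = distinct_and_unique(names)
print(result)
```

Under that reading, 9786 distinct names with 9600 unique ones would mean 186 names appear more than once.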

Sample

1st row: 중부내륙선(창원) 222.2k 절토사면
2nd row: 하구암2교(양평)
3rd row: 내수IC교
4th row: 미산9배수문
5th row: 중부선 204.35k(통영방향)절토사면
Value | Count | Frequency (%)
절토사면 | 248 | 1.7%
옹벽 | 121 | 0.8%
터널 | 96 | 0.7%
본관동 | 90 | 0.6%
교사동 | 78 | 0.5%
본관 | 74 | 0.5%
역사 | 62 | 0.4%
중앙선 | 61 | 0.4%
배수문 | 51 | 0.4%
보강토옹벽 | 46 | 0.3%
Other values (11074) | 13344 | 93.5%

Most occurring characters

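Value | Count | Frequency (%)
[glyph lost] | 4720 | 5.4%
(space) | 4285 | 4.9%
) | 2960 | 3.4%
( | 2959 | 3.4%
1 | 2457 | 2.8%
[glyph lost] | 2363 | 2.7%
2 | 1971 | 2.3%
0 | 1792 | 2.0%
[glyph lost] | 1388 | 1.6%
[glyph lost] | 1360 | 1.6%
Other values (616) | 61216 | 70.0%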

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 60538 | 69.2%
Decimal Number | 11419 | 13.1%
Space Separator | 4285 | 4.9%
Close Punctuation | 2979 | 3.4%
Open Punctuation | 2978 | 3.4%
Uppercase Letter | 2411 | 2.8%
Other Punctuation | 1085 | 1.2%
Math Symbol | 625 | 0.7%
Lowercase Letter | 561 | 0.6%
Dash Punctuation | 557 | 0.6%
Other values (3) | 33 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
[glyph lost] | 4720 | 7.8%
[glyph lost] | 2363 | 3.9%
[glyph lost] | 1388 | 2.3%
[glyph lost] | 1360 | 2.2%
[glyph lost] | 1288 | 2.1%
[glyph lost] | 1262 | 2.1%
[glyph lost] | 1163 | 1.9%
[glyph lost] | 1106 | 1.8%
[glyph lost] | 1085 | 1.8%
[glyph lost] | 1073 | 1.8%
Other values (545) | 43730 | 72.2%

Uppercase Letter
Value | Count | Frequency (%)
C | 335 | 13.9%
A | 242 | 10.0%
I | 238 | 9.9%
T | 232 | 9.6%
D | 196 | 8.1%
S | 192 | 8.0%
B | 143 | 5.9%
K | 115 | 4.8%
R | 107 | 4.4%
U | 105 | 4.4%
Other values (16) | 506 | 21.0%

Lowercase Letter
Value | Count | Frequency (%)
k | 455 | 81.1%
m | 42 | 7.5%
a | 22 | 3.9%
p | 19 | 3.4%
t | 8 | 1.4%
e | 4 | 0.7%
s | 4 | 0.7%
i | 2 | 0.4%
y | 2 | 0.4%
o | 1 | 0.2%
Other values (2) | 2 | 0.4%

Decimal Number
Value | Count | Frequency (%)
1 | 2457 | 21.5%
2 | 1971 | 17.3%
0 | 1792 | 15.7%
3 | 1155 | 10.1%
4 | 878 | 7.7%
5 | 810 | 7.1%
6 | 675 | 5.9%
7 | 661 | 5.8%
8 | 541 | 4.7%
9 | 479 | 4.2%

Other Punctuation
Value | Count | Frequency (%)
. | 892 | 82.2%
, | 132 | 12.2%
/ | 32 | 2.9%
# | 17 | 1.6%
: | 9 | 0.8%
· | 3 | 0.3%

Letter Number
Value | Count | Frequency (%)
[glyph lost] | 11 | 40.7%
[glyph lost] | 10 | 37.0%
[glyph lost] | 2 | 7.4%
[glyph lost] | 2 | 7.4%
[glyph lost] | 1 | 3.7%
[glyph lost] | 1 | 3.7%

Math Symbol
Value | Count | Frequency (%)
~ | 374 | 59.8%
+ | 245 | 39.2%
[glyph lost] | 6 | 1.0%

Close Punctuation
Value | Count | Frequency (%)
) | 2960 | 99.4%
] | 19 | 0.6%

Open Punctuation
Value | Count | Frequency (%)
( | 2959 | 99.4%
[ | 19 | 0.6%

Space Separator
Value | Count | Frequency (%)
(space) | 4285 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 557 | 100.0%

Connector Punctuation
Value | Count | Frequency (%)
_ | 3 | 100.0%

Other Symbol
Value | Count | Frequency (%)
[glyph lost] | 3 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 60538 | 69.2%
Common | 23934 | 27.4%
Latin | 2999 | 3.4%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
[glyph lost] | 4720 | 7.8%
[glyph lost] | 2363 | 3.9%
[glyph lost] | 1388 | 2.3%
[glyph lost] | 1360 | 2.2%
[glyph lost] | 1288 | 2.1%
[glyph lost] | 1262 | 2.1%
[glyph lost] | 1163 | 1.9%
[glyph lost] | 1106 | 1.8%
[glyph lost] | 1085 | 1.8%
[glyph lost] | 1073 | 1.8%
Other values (545) | 43730 | 72.2%

Latin
Value | Count | Frequency (%)
k | 455 | 15.2%
C | 335 | 11.2%
A | 242 | 8.1%
I | 238 | 7.9%
T | 232 | 7.7%
D | 196 | 6.5%
S | 192 | 6.4%
B | 143 | 4.8%
K | 115 | 3.8%
R | 107 | 3.6%
Other values (34) | 744 | 24.8%

Common
Value | Count | Frequency (%)
(space) | 4285 | 17.9%
) | 2960 | 12.4%
( | 2959 | 12.4%
1 | 2457 | 10.3%
2 | 1971 | 8.2%
0 | 1792 | 7.5%
3 | 1155 | 4.8%
. | 892 | 3.7%
4 | 878 | 3.7%
5 | 810 | 3.4%
Other values (17) | 3775 | 15.8%

Most occurring blocks

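Value | Count | Frequency (%)
Hangul | 60537 | 69.2%
ASCII | 26894 | 30.7%
Number Forms | 27 | < 0.1%
Math Operators | 6 | < 0.1%
None | 3 | < 0.1%
CJK Compat | 3 | < 0.1%
Compat Jamo | 1 | < 0.1%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
[glyph lost] | 4720 | 7.8%
[glyph lost] | 2363 | 3.9%
[glyph lost] | 1388 | 2.3%
[glyph lost] | 1360 | 2.2%
[glyph lost] | 1288 | 2.1%
[glyph lost] | 1262 | 2.1%
[glyph lost] | 1163 | 1.9%
[glyph lost] | 1106 | 1.8%
[glyph lost] | 1085 | 1.8%
[glyph lost] | 1073 | 1.8%
Other values (544) | 43729 | 72.2%

ASCII
Value | Count | Frequency (%)
(space) | 4285 | 15.9%
) | 2960 | 11.0%
( | 2959 | 11.0%
1 | 2457 | 9.1%
2 | 1971 | 7.3%
0 | 1792 | 6.7%
3 | 1155 | 4.3%
. | 892 | 3.3%
4 | 878 | 3.3%
5 | 810 | 3.0%
Other values (52) | 6735 | 25.0%

Number Forms
Value | Count | Frequency (%)
[glyph lost] | 11 | 40.7%
[glyph lost] | 10 | 37.0%
[glyph lost] | 2 | 7.4%
[glyph lost] | 2 | 7.4%
[glyph lost] | 1 | 3.7%
[glyph lost] | 1 | 3.7%

Math Operators
Value | Count | Frequency (%)
[glyph lost] | 6 | 100.0%

None
Value | Count | Frequency (%)
· | 3 | 100.0%

CJK Compat
Value | Count | Frequency (%)
[glyph lost] | 3 | 100.0%

Compat Jamo
Value | Count | Frequency (%)
[glyph lost] | 1 | 100.0%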

시설물구분 (facility type)
Categorical

Distinct: 11
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
교량: 3569
하천: 1792
건축물: 1398
절토사면: 1006
터널: 912
Other values (6): 1323

Length

Max length: 4
Median length: 2
Mean length: 2.4456
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 절토사면
2nd row: 교량
3rd row: 교량
4th row: 하천
5th row: 절토사면

Common Values

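Value | Count | Frequency (%)
교량 (bridge) | 3569 | 35.7%
하천 (river) | 1792 | 17.9%
건축물 (building) | 1398 | 14.0%
절토사면 (cut slope) | 1006 | 10.1%
터널 (tunnel) | 912 | 9.1%
상하수도 (water supply and sewerage) | 600 | 6.0%
옹벽 (retaining wall) | 419 | 4.2%
[glyph lost] | 162 | 1.6%
항만 (port) | 108 | 1.1%
기타 (other) | 26 | 0.3%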

Length

[Figure: histogram of lengths of the category]

소재지 (location)
Text

Distinct: 279
Distinct (%): 2.8%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 13
Median length: 7
Mean length: 7.6757
Min length: 5

Characters and Unicode

Total characters: 76757
Distinct characters: 151
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 8
Unique (%): 0.1%

Sample

1st row: 충청북도충주시
2nd row: 충청북도충주시
3rd row: 충청북도청주시 청원구
4th row: 경기도연천군
5th row: 대전광역시동구
Value | Count | Frequency (%)
경상남도진주시 | 152 | 1.4%
경상남도김해시 | 151 | 1.4%
경기도평택시 | 141 | 1.3%
경상북도경주시 | 128 | 1.2%
경상남도창원시 | 120 | 1.1%
충청북도청주시 | 120 | 1.1%
강원특별자치도평창군 | 120 | 1.1%
충청북도충주시 | 119 | 1.1%
경기도화성시 | 116 | 1.1%
경상북도상주시 | 108 | 1.0%
Other values (269) | 9547 | 88.2%

Most occurring characters

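Value | Count | Frequency (%)
[glyph lost] | 7792 | 10.2%
[glyph lost] | 7004 | 9.1%
[glyph lost] | 4043 | 5.3%
[glyph lost] | 3327 | 4.3%
[glyph lost] | 3244 | 4.2%
[glyph lost] | 3104 | 4.0%
[glyph lost] | 2723 | 3.5%
[glyph lost] | 2427 | 3.2%
[glyph lost] | 1942 | 2.5%
[glyph lost] | 1942 | 2.5%
Other values (141) | 39209 | 51.1%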

Most occurring categories

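Value | Count | Frequency (%)
Other Letter | 75935 | 98.9%
Space Separator | 822 | 1.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
[glyph lost] | 7792 | 10.3%
[glyph lost] | 7004 | 9.2%
[glyph lost] | 4043 | 5.3%
[glyph lost] | 3327 | 4.4%
[glyph lost] | 3244 | 4.3%
[glyph lost] | 3104 | 4.1%
[glyph lost] | 2723 | 3.6%
[glyph lost] | 2427 | 3.2%
[glyph lost] | 1942 | 2.6%
[glyph lost] | 1942 | 2.6%
Other values (140) | 38387 | 50.6%

Space Separator
Value | Count | Frequency (%)
(space) | 822 | 100.0%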

Most occurring scripts

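Value | Count | Frequency (%)
Hangul | 75935 | 98.9%
Common | 822 | 1.1%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
[glyph lost] | 7792 | 10.3%
[glyph lost] | 7004 | 9.2%
[glyph lost] | 4043 | 5.3%
[glyph lost] | 3327 | 4.4%
[glyph lost] | 3244 | 4.3%
[glyph lost] | 3104 | 4.1%
[glyph lost] | 2723 | 3.6%
[glyph lost] | 2427 | 3.2%
[glyph lost] | 1942 | 2.6%
[glyph lost] | 1942 | 2.6%
Other values (140) | 38387 | 50.6%

Common
Value | Count | Frequency (%)
(space) | 822 | 100.0%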

Most occurring blocks

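Value | Count | Frequency (%)
Hangul | 75935 | 98.9%
ASCII | 822 | 1.1%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
[glyph lost] | 7792 | 10.3%
[glyph lost] | 7004 | 9.2%
[glyph lost] | 4043 | 5.3%
[glyph lost] | 3327 | 4.4%
[glyph lost] | 3244 | 4.3%
[glyph lost] | 3104 | 4.1%
[glyph lost] | 2723 | 3.6%
[glyph lost] | 2427 | 3.2%
[glyph lost] | 1942 | 2.6%
[glyph lost] | 1942 | 2.6%
Other values (140) | 38387 | 50.6%

ASCII
Value | Count | Frequency (%)
(space) | 822 | 100.0%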

점검구분 (inspection type)
Categorical

IMBALANCE

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
정밀안전점검: 9300
정밀안전진단: 700

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 정밀안전점검
2nd row: 정밀안전점검
3rd row: 정밀안전점검
4th row: 정밀안전점검
5th row: 정밀안전점검

Common Values

Value | Count | Frequency (%)
정밀안전점검 (precision safety inspection) | 9300 | 93.0%
정밀안전진단 (precision safety diagnosis) | 700 | 7.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

시작일 (start date)
DateTime

Distinct: 454
Distinct (%): 4.5%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Minimum: 2020-06-12 00:00:00
Maximum: 2023-06-19 00:00:00
[Figure: histogram with fixed size bins (bins=50)]

종료일 (end date)
DateTime

Distinct: 510
Distinct (%): 5.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Minimum: 2022-01-01 00:00:00
Maximum: 2023-06-30 00:00:00
[Figure: histogram with fixed size bins (bins=50)]

Correlations

Variable | 시설물구분 | 점검구분
시설물구분 | 1.000 | 0.162
점검구분 | 0.162 | 1.000

Variable | 시설물구분 | 점검구분
시설물구분 | 1.000 | 0.155
점검구분 | 0.155 | 1.000

Variable | 시설물구분 | 점검구분
시설물구분 | 1.000 | 0.155
점검구분 | 0.155 | 1.000
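The report does not say which coefficient each matrix uses. For two categorical variables such as 시설물구분 and 점검구분, a common choice is Cramér's V, so the sketch below shows the plain (uncorrected) version as an assumption, not as the method used here. The contingency table behind the reported values is not included in the report, so the tables below are toy inputs and the function name is hypothetical.

```python
import math

def cramers_v(table):
    """Plain Cramér's V for a contingency table given as a list of rows of counts."""
    n = sum(sum(row) for row in table)
    if n == 0:
        return 0.0
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                chi2 += (observed - expected) ** 2 / expected
    k = min(len(row_totals), len(col_totals)) - 1
    return math.sqrt(chi2 / (n * k)) if k > 0 else 0.0

perfect = cramers_v([[10, 0], [0, 10]])     # categories fully determine each other
independent = cramers_v([[5, 5], [5, 5]])   # no association
print(perfect, independent)
```

Values near 0.16, as in the matrices above, would indicate a weak association under this measure. Some libraries apply a bias correction, which can give slightly different numbers.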

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

Index | 시설물명 | 시설물구분 | 소재지 | 점검구분 | 시작일 | 종료일
18943 | 중부내륙선(창원) 222.2k 절토사면 | 절토사면 | 충청북도충주시 | 정밀안전점검 | 2022-03-28 | 2022-06-15
18988 | 하구암2교(양평) | 교량 | 충청북도충주시 | 정밀안전점검 | 2022-05-09 | 2022-08-26
7638 | 내수IC교 | 교량 | 충청북도청주시 청원구 | 정밀안전점검 | 2023-02-14 | 2023-03-24
1593 | 미산9배수문 | 하천 | 경기도연천군 | 정밀안전점검 | 2022-03-24 | 2022-05-07
17237 | 중부선 204.35k(통영방향)절토사면 | 절토사면 | 대전광역시동구 | 정밀안전점검 | 2022-03-30 | 2022-07-22
9388 | 능주IC교 | 교량 | 전라남도화순군 | 정밀안전점검 | 2022-03-25 | 2022-09-20
13379 | 서해안선 235.79(목포) 절토사면 | 절토사면 | 충청남도서산시 | 정밀안전점검 | 2022-03-30 | 2022-11-24
14774 | 두지배수펌프장 | 하천 | 경기도파주시 | 정밀안전점검 | 2022-11-04 | 2022-12-21
7623 | 신천교 | 교량 | 충청남도아산시 | 정밀안전점검 | 2022-05-02 | 2022-10-28
6709 | 좌로2교 | 교량 | 광주광역시동구 | 정밀안전점검 | 2021-09-08 | 2022-04-29

Index | 시설물명 | 시설물구분 | 소재지 | 점검구분 | 시작일 | 종료일
627 | 성환6배수문 | 하천 | 충청남도천안시 서북구 | 정밀안전점검 | 2023-03-09 | 2023-04-18
9717 | 계산1교(상) | 교량 | 전라남도곡성군 | 정밀안전점검 | 2022-04-01 | 2022-11-15
20139 | 안양동초등학교 1호동 | 건축물 | 경기도안양시 | 정밀안전점검 | 2022-04-22 | 2022-05-31
10595 | 경민지하차도B | 터널 | 경기도의정부시 | 정밀안전점검 | 2022-02-21 | 2022-05-21
21321 | 서울청구초등학교 동산관 | 건축물 | 서울특별시중구 | 정밀안전점검 | 2022-05-10 | 2022-06-15
20224 | 영천교 | 교량 | 경기도화성시 | 정밀안전점검 | 2022-04-05 | 2022-06-30
2917 | 영산교 | 교량 | 전라남도나주시 | 정밀안전점검 | 2023-02-22 | 2023-05-22
10707 | 번영교 | 교량 | 울산광역시남구 | 정밀안전점검 | 2022-07-11 | 2022-12-07
21294 | 화랑마을 교육관 | 건축물 | 경상북도경주시 | 정밀안전점검 | 2022-04-27 | 2022-06-25
1227 | 대청2교 | 교량 | 경상남도김해시 | 정밀안전점검 | 2022-01-26 | 2022-04-25

Duplicate rows

Most frequently occurring

Index | 시설물명 | 시설물구분 | 소재지 | 점검구분 | 시작일 | 종료일 | # duplicates
0 | 17.850k~18.020k(근덕) 절토사면 | 절토사면 | 강원특별자치도동해시 | 정밀안전점검 | 2022-03-30 | 2022-09-16 | 2
1 | 구삼호교 | 교량 | 울산광역시중구 | 정밀안전점검 | 2023-04-06 | 2023-06-04 | 2
2 | 덕수교 | 교량 | 경기도고양시 덕양구 | 정밀안전점검 | 2022-02-21 | 2022-04-21 | 2
3 | 임학가압장 | 상하수도 | 인천광역시계양구 | 정밀안전점검 | 2023-05-17 | 2023-06-14 | 2
4 | 정문제 | 하천 | 경기도화성시 | 정밀안전점검 | 2022-07-22 | 2022-10-19 | 2
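Rows like these can be found by counting identical records across all six columns. A minimal sketch using only the standard library; the helper name is hypothetical, and the input is a three-row excerpt built from values shown in this report rather than the full dataset:

```python
from collections import Counter

def duplicate_rows(rows):
    """Return {row: count} for every row that occurs more than once."""
    counts = Counter(tuple(row) for row in rows)
    return {row: n for row, n in counts.items() if n > 1}

rows = [
    ("구삼호교", "교량", "울산광역시중구", "정밀안전점검", "2023-04-06", "2023-06-04"),
    ("덕수교", "교량", "경기도고양시 덕양구", "정밀안전점검", "2022-02-21", "2022-04-21"),
    ("구삼호교", "교량", "울산광역시중구", "정밀안전점검", "2023-04-06", "2023-06-04"),
]
dups = duplicate_rows(rows)
for row, n in dups.items():
    print(row[0], n)
```

With pandas, `DataFrame.duplicated(keep=False)` gives a comparable row-level mask.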