Overview

Dataset statistics

Number of variables: 9
Number of observations: 2206
Missing cells: 1
Missing cells (%): < 0.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 157.4 KiB
Average record size in memory: 73.1 B
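
The derived figures in this table follow from the raw counts. The sketch below recomputes them, assuming that the cell count is variables times observations and that 1 KiB is 1024 bytes; the variable names are illustrative and not part of the report.

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 9
n_observations = 2206
missing_cells = 1
total_size_kib = 157.4

# Assumption: total cells = variables x observations.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.4f}%")
print(f"average record size: {avg_record_bytes:.1f} B")
```

The missing share comes out far below 0.1%, which is why the report shows it as "< 0.1%".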

Variable types

Numeric: 1
Text: 3
Categorical: 5

Dataset

Description: File download (파일 다운로드)
Author: Seoul Metro (서울교통공사)
URL: https://data.seoul.go.kr/dataList/OA-12926/F/1/datasetView.do

Alerts

기능 is highly overall correlated with 필터유무 (High correlation)
필터유무 is highly overall correlated with 기능 (High correlation)

Reproduction

Analysis started: 2024-04-29 16:38:47.789999
Analysis finished: 2024-04-29 16:38:48.886071
Duration: 1.1 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

호선 (line)
Real number (ℝ)

Distinct: 8
Distinct (%): 0.4%
Missing: 1
Missing (%): < 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.4875283
Minimum: 1
Maximum: 8
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 19.5 KiB

Quantile statistics

Minimum: 1
5th percentile: 1.2
Q1: 3
Median: 5
Q3: 6
95th percentile: 8
Maximum: 8
Range: 7
Interquartile range (IQR): 3
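
The non-integer 5th percentile (1.2) on an integer-valued column is what linear interpolation between neighbouring ranks would produce. The sketch below reproduces the quantiles from the value counts reported for 호선 in this section. The interpolation scheme is an assumption (the report does not state its method), and `percentile` is a hypothetical helper.

```python
# Value counts for 호선, copied from the report's frequency table
# (the single missing value is excluded).
counts = {1: 111, 2: 438, 3: 268, 4: 178, 5: 459, 6: 273, 7: 365, 8: 113}

# Expand the counts into a sorted list of observations.
values = [v for v, c in sorted(counts.items()) for _ in range(c)]


def percentile(sorted_values, q):
    """Percentile with linear interpolation between the two closest ranks."""
    pos = q * (len(sorted_values) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] + frac * (sorted_values[hi] - sorted_values[lo])


p05 = percentile(values, 0.05)
q1 = percentile(values, 0.25)
median = percentile(values, 0.50)
q3 = percentile(values, 0.75)
p95 = percentile(values, 0.95)
iqr = q3 - q1

print(p05, q1, median, q3, p95, iqr)
```

With 2205 non-missing values, the 5% position falls at rank 110.2, between the last 1 and the first 2, which gives the interpolated value.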

Descriptive statistics

Standard deviation: 2.036298
Coefficient of variation (CV): 0.45376828
Kurtosis: -1.2121972
Mean: 4.4875283
Median Absolute Deviation (MAD): 2
Skewness: -0.035548376
Sum: 9895
Variance: 4.1465095
Monotonicity: Increasing

Figure: Histogram with fixed size bins (bins=8)

Values by count

Value      Count  Frequency (%)
5          459    20.8%
2          438    19.9%
7          365    16.5%
6          273    12.4%
3          268    12.1%
4          178    8.1%
8          113    5.1%
1          111    5.0%
(Missing)  1      < 0.1%

Values in ascending order

Value  Count  Frequency (%)
1      111    5.0%
2      438    19.9%
3      268    12.1%
4      178    8.1%
5      459    20.8%
6      273    12.4%
7      365    16.5%
8      113    5.1%

Values in descending order

Value  Count  Frequency (%)
8      113    5.1%
7      365    16.5%
6      273    12.4%
5      459    20.8%
4      178    8.1%
3      268    12.1%
2      438    19.9%
1      111    5.0%
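
The descriptive statistics for 호선 can be cross-checked against its value counts. The following is a minimal sketch, assuming the variance uses the sample (n - 1) denominator and that MAD is the median of absolute deviations from the median; both assumptions are consistent with the reported numbers but are not stated in the report.

```python
import math
from statistics import median

# Value counts for 호선 from the frequency table (missing value excluded).
counts = {1: 111, 2: 438, 3: 268, 4: 178, 5: 459, 6: 273, 7: 365, 8: 113}

n = sum(counts.values())
total = sum(v * c for v, c in counts.items())
mean = total / n

# Assumption: sample variance with an (n - 1) denominator.
variance = sum(c * (v - mean) ** 2 for v, c in counts.items()) / (n - 1)
std = math.sqrt(variance)
cv = std / mean

# Assumption: MAD = median of |x - median(x)|.
values = [v for v, c in sorted(counts.items()) for _ in range(c)]
med = median(values)
mad = median(abs(v - med) for v in values)

print(n, total, mean, variance, std, cv, mad)
```

The sum (9895) and the count of non-missing rows (2205) come straight from the table, so the reported mean is 9895 / 2205.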

번호 (number)
Text

Distinct: 525
Distinct (%): 23.8%
Missing: 0
Missing (%): 0.0%
Memory size: 17.4 KiB

Length

Max length: 6
Median length: 3
Mean length: 2.6781505
Min length: 1

Characters and Unicode

Total characters: 5908
Distinct characters: 12
Distinct categories: 3
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
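
As a rough illustration of how such a per-category breakdown can be obtained, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. The mapping from two-letter codes to the long names used in this report covers only the categories that appear here, and the sample strings are illustrative (`"12-1"` is a made-up value; the others are taken from the sample rows). This is not necessarily how the profiler itself computes the table.

```python
import unicodedata
from collections import Counter

# Long names for the general-category codes that occur in this report.
CATEGORY_NAMES = {
    "Nd": "Decimal Number",
    "Pd": "Dash Punctuation",
    "Po": "Other Punctuation",
    "Lo": "Other Letter",
    "Lu": "Uppercase Letter",
    "Sm": "Math Symbol",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
}


def category_counts(values):
    """Count characters per Unicode general category across all strings."""
    counter = Counter()
    for value in values:
        for ch in value:
            code = unicodedata.category(ch)
            counter[CATEGORY_NAMES.get(code, code)] += 1
    return counter


# "\u223c" is TILDE OPERATOR, which lives in the Mathematical Operators block.
sample = ["3", "12-1", "서울역~시청", "강동구청\u223c토성"]
result = category_counts(sample)
print(result)
```

Both the ASCII tilde and the tilde operator fall in the Math Symbol category, while Hangul syllables are counted as Other Letter.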

Unique

Unique: 110
Unique (%): 5.0%

Sample

1st row: 3
2nd row: 4
3rd row: 5
4th row: 6
5th row: 7

Value               Count  Frequency (%)
3                   8      0.4%
67                  8      0.4%
92                  8      0.4%
78                  8      0.4%
77                  8      0.4%
76                  8      0.4%
75                  8      0.4%
74                  8      0.4%
73                  8      0.4%
69                  8      0.4%
Other values (515)  2126   96.4%

Most occurring characters

Value             Count  Frequency (%)
1                 1131   19.1%
2                 900    15.2%
3                 726    12.3%
4                 544    9.2%
5                 454    7.7%
6                 440    7.4%
7                 430    7.3%
0                 417    7.1%
9                 405    6.9%
8                 394    6.7%
Other values (2)  67     1.1%

Most occurring categories

Value              Count  Frequency (%)
Decimal Number     5841   98.9%
Dash Punctuation   65     1.1%
Other Punctuation  2      < 0.1%

Most frequent character per category

Decimal Number

Value  Count  Frequency (%)
1      1131   19.4%
2      900    15.4%
3      726    12.4%
4      544    9.3%
5      454    7.8%
6      440    7.5%
7      430    7.4%
0      417    7.1%
9      405    6.9%
8      394    6.7%

Dash Punctuation

Value  Count  Frequency (%)
-      65     100.0%

Other Punctuation

Value  Count  Frequency (%)
/      2      100.0%

Most occurring scripts

Value   Count  Frequency (%)
Common  5908   100.0%

Most frequent character per script

Common: all 5908 characters belong to this script, so the table is identical to the one under "Most occurring characters".

Most occurring blocks

Value  Count  Frequency (%)
ASCII  5908   100.0%

Most frequent character per block

ASCII: all 5908 characters belong to this block, so the table is identical to the one under "Most occurring characters".

구간 (section)
Text

Distinct: 481
Distinct (%): 21.8%
Missing: 0
Missing (%): 0.0%
Memory size: 17.4 KiB

Length

Max length: 15
Median length: 12
Mean length: 4.5299184
Min length: 2

Characters and Unicode

Total characters: 9993
Distinct characters: 212
Distinct categories: 7
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
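
The reported mean lengths of the three Text variables equal total characters divided by the number of observations. The sketch below recomputes them and adds a simple sanity bound relating median and mean length. `min_mean_given_median` is a hypothetical helper, and the bound is an inference from the reported figures rather than something the report states.

```python
n_rows = 2206

# (total characters, reported mean length) for the three Text variables.
text_stats = {
    "번호": (5908, 2.6781505),
    "구간": (9993, 4.5299184),
    "평균높이(m)": (8435, 3.8236627),
}

mean_lengths = {name: total / n_rows for name, (total, _) in text_stats.items()}


def min_mean_given_median(n, min_len, median_len):
    """Lower bound on the mean of n lengths.

    At least n // 2 values are >= the median; the rest are >= the minimum.
    """
    upper_half = n // 2
    lower_half = n - upper_half
    return (upper_half * median_len + lower_half * min_len) / n


# 구간 reports min length 2 and median length 12.
bound = min_mean_given_median(n_rows, min_len=2, median_len=12)

for name, value in mean_lengths.items():
    print(name, round(value, 7))
print("lower bound on mean for 구간:", bound)
```

For 구간 the bound exceeds the reported mean of about 4.53, so the reported median length of 12 is probably not a median of string lengths in the usual sense and is best treated with caution.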

Unique

Unique: 18
Unique (%): 0.8%

Sample

1st row: 서울역
2nd row: 서울역
3rd row: 서울역
4th row: 서울역
5th row: 서울역~시청

Value               Count  Frequency (%)
종로3가             20     0.9%
영등포구청          18     0.8%
충정로              17     0.8%
동대문역사문화공원  15     0.7%
신설동~용두         14     0.6%
사당                14     0.6%
시청                13     0.6%
잠원~고속터미널     12     0.5%
삼각지              12     0.5%
합정                12     0.5%
Other values (471)  2059   93.3%

Most occurring characters

Value               Count  Frequency (%)
~                   820    8.2%
(not rendered)      376    3.8%
(not rendered)      316    3.2%
(not rendered)      291    2.9%
(not rendered)      247    2.5%
(not rendered)      192    1.9%
(not rendered)      177    1.8%
(not rendered)      175    1.8%
(not rendered)      159    1.6%
(not rendered)      157    1.6%
Other values (202)  7083   70.9%

Most occurring categories

Value              Count  Frequency (%)
Other Letter       8923   89.3%
Math Symbol        933    9.3%
Decimal Number     100    1.0%
Dash Punctuation   27     0.3%
Uppercase Letter   6      0.1%
Open Punctuation   2      < 0.1%
Close Punctuation  2      < 0.1%

Most frequent character per category

Other Letter

Value               Count  Frequency (%)
(not rendered)      376    4.2%
(not rendered)      316    3.5%
(not rendered)      291    3.3%
(not rendered)      247    2.8%
(not rendered)      192    2.2%
(not rendered)      177    2.0%
(not rendered)      175    2.0%
(not rendered)      159    1.8%
(not rendered)      157    1.8%
(not rendered)      145    1.6%
Other values (193)  6688   75.0%

Decimal Number

Value  Count  Frequency (%)
3      57     57.0%
4      24     24.0%
5      19     19.0%

Math Symbol

Value  Count  Frequency (%)
~      820    87.9%
∼      113    12.1%

Dash Punctuation

Value  Count  Frequency (%)
-      27     100.0%

Uppercase Letter

Value  Count  Frequency (%)
U      6      100.0%

Open Punctuation

Value  Count  Frequency (%)
(      2      100.0%

Close Punctuation

Value  Count  Frequency (%)
)      2      100.0%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  8923   89.3%
Common  1064   10.6%
Latin   6      0.1%

Most frequent character per script

Hangul: the 8923 Hangul characters are the same as the Other Letter characters, so the table is identical to the "Other Letter" table under "Most frequent character per category".

Common

Value  Count  Frequency (%)
~      820    77.1%
∼      113    10.6%
3      57     5.4%
-      27     2.5%
4      24     2.3%
5      19     1.8%
(      2      0.2%
)      2      0.2%

Latin

Value  Count  Frequency (%)
U      6      100.0%

Most occurring blocks

Value           Count  Frequency (%)
Hangul          8923   89.3%
ASCII           957    9.6%
Math Operators  113    1.1%

Most frequent character per block

ASCII

Value  Count  Frequency (%)
~      820    85.7%
3      57     6.0%
-      27     2.8%
4      24     2.5%
5      19     2.0%
U      6      0.6%
(      2      0.2%
)      2      0.2%

Hangul: the table is identical to the "Other Letter" table under "Most frequent character per category".

Math Operators

Value  Count  Frequency (%)
∼      113    100.0%

용도 (use)
Categorical

Distinct: 7
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 17.4 KiB

Length

Max length: 3
Median length: 2
Mean length: 2.0784225
Min length: 2

Unique

Unique: 2
Unique (%): 0.1%

Sample

1st row: 역사
2nd row: 역사
3rd row: 역사
4th row: 역사
5th row: 본선

Common Values

Value                       Count  Frequency (%)
역사 (station building)    1085   49.2%
본선 (main line)           947    42.9%
변전실 (substation room)   105    4.8%
냉각탑 (cooling tower)     62     2.8%
유치선 (stabling track)    5      0.2%
기타 (other)               1      < 0.1%
출고선 (depot exit track)  1      < 0.1%

Length

Figure: Histogram of lengths of the category

Common Values (Plot)

Figure: Common values plot (same counts as the Common Values table above)

기능 (function)
Categorical

HIGH CORRELATION

Distinct: 4
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 17.4 KiB

Length

Max length: 3
Median length: 2
Mean length: 2.0222121
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 급기
2nd row: 배기
3rd row: 배기
4th row: 급기
5th row: 자연

Common Values

Value                        Count  Frequency (%)
배기 (exhaust)              1015   46.0%
급기 (air supply)           825    37.4%
자연 (natural)              317    14.4%
급배기 (supply and exhaust) 49     2.2%

Length

Figure: Histogram of lengths of the category

Common Values (Plot)

Figure: Common values plot (same counts as the Common Values table above)

위치 (location)
Categorical

Distinct: 5
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 17.4 KiB

Length

Max length: 5
Median length: 2
Mean length: 2.1958296
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 중앙분리대
2nd row: 중앙분리대
3rd row: 녹지
4th row: 녹지
5th row: 보도

Common Values

Value                      Count  Frequency (%)
보도 (sidewalk)           1565   70.9%
녹지 (green area)         416    18.9%
중앙분리대 (median strip) 144    6.5%
기타 (other)              57     2.6%
차도 (roadway)            24     1.1%

Length

Figure: Histogram of lengths of the category

Common Values (Plot)

Figure: Common values plot (same counts as the Common Values table above)

평균높이(m) (mean height, m)
Text

Distinct: 403
Distinct (%): 18.3%
Missing: 0
Missing (%): 0.0%
Memory size: 17.4 KiB

Length

Max length: 7
Median length: 5
Mean length: 3.8236627
Min length: 1

Characters and Unicode

Total characters: 8435
Distinct characters: 17
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 124
Unique (%): 5.6%

Sample

1st row: 0.8
2nd row: 0.8
3rd row: 0.95
4th row: 1.15
5th row: 0.65

Value               Count  Frequency (%)
0.6                 63     2.9%
1.3                 56     2.5%
0.65                51     2.3%
0.55                49     2.2%
1.45                47     2.1%
1.6                 46     2.1%
1.5                 39     1.8%
1.15                39     1.8%
1.55                37     1.7%
0.2                 36     1.6%
Other values (393)  1743   79.0%

Most occurring characters

Value             Count  Frequency (%)
.                 2135   25.3%
1                 1498   17.8%
5                 1375   16.3%
0                 953    11.3%
2                 579    6.9%
6                 402    4.8%
3                 390    4.6%
7                 341    4.0%
4                 314    3.7%
8                 236    2.8%
Other values (7)  212    2.5%

Most occurring categories

Value              Count  Frequency (%)
Decimal Number     6288   74.5%
Other Punctuation  2141   25.4%
Uppercase Letter   6      0.1%

Most frequent character per category

Decimal Number

Value  Count  Frequency (%)
1      1498   23.8%
5      1375   21.9%
0      953    15.2%
2      579    9.2%
6      402    6.4%
3      390    6.2%
7      341    5.4%
4      314    5.0%
8      236    3.8%
9      200    3.2%

Other Punctuation

Value  Count  Frequency (%)
.      2135   99.7%
#      2      0.1%
/      2      0.1%
!      2      0.1%

Uppercase Letter

Value  Count  Frequency (%)
D      2      33.3%
I      2      33.3%
V      2      33.3%

Most occurring scripts

Value   Count  Frequency (%)
Common  8429   99.9%
Latin   6      0.1%

Most frequent character per script

Common

Value             Count  Frequency (%)
.                 2135   25.3%
1                 1498   17.8%
5                 1375   16.3%
0                 953    11.3%
2                 579    6.9%
6                 402    4.8%
3                 390    4.6%
7                 341    4.0%
4                 314    3.7%
8                 236    2.8%
Other values (4)  206    2.4%

Latin

Value  Count  Frequency (%)
D      2      33.3%
I      2      33.3%
V      2      33.3%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  8435   100.0%

Most frequent character per block

ASCII: all 8435 characters belong to this block, so the table is identical to the one under "Most occurring characters".

구조물형태 (structure type)
Categorical

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 17.4 KiB

Length

Max length: 3
Median length: 2
Mean length: 2.1146872
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 탑형
2nd row: 탑형
3rd row: 탑형
4th row: 탑형
5th row: 탑형

Common Values

Value                      Count  Frequency (%)
탑형 (tower type)         1953   88.5%
지면형 (ground-level type) 253    11.5%

Length

Figure: Histogram of lengths of the category

Common Values (Plot)

Figure: Common values plot (same counts as the Common Values table above)

필터유무 (filter presence)
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 17.4 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: (not rendered)
2nd row: (not rendered)
3rd row: (not rendered)
4th row: (not rendered)
5th row: (not rendered)

Common Values

Value           Count  Frequency (%)
(not rendered)  1331   60.3%
(not rendered)  875    39.7%

Length

Figure: Histogram of lengths of the category

Common Values (Plot)

Figure: Common values plot (same counts as the Common Values table above)

Interactions


Correlations

            호선    용도    기능    위치    구조물형태  필터유무
호선        1.000  0.368  0.634  0.371  0.340       0.253
용도        0.368  1.000  0.388  0.156  0.102       0.231
기능        0.634  0.388  1.000  0.208  0.291       0.986
위치        0.371  0.156  0.208  1.000  0.200       0.090
구조물형태  0.340  0.102  0.291  0.200  1.000       0.232
필터유무    0.253  0.231  0.986  0.090  0.232       1.000

            기능    위치    구조물형태  필터유무  용도
기능        1.000  0.171  0.193       0.895     0.276
위치        0.171  1.000  0.244       0.110     0.100
구조물형태  0.193  0.244  1.000       0.149     0.109
필터유무    0.895  0.110  0.149       1.000     0.246
용도        0.276  0.100  0.109       0.246     1.000

            호선    용도    기능    위치    구조물형태  필터유무
호선        1.000  0.207  0.325  0.238  0.255       0.190
용도        0.207  1.000  0.276  0.100  0.109       0.246
기능        0.325  0.276  1.000  0.171  0.193       0.895
위치        0.238  0.100  0.171  1.000  0.244       0.110
구조물형태  0.255  0.109  0.193  0.244  1.000       0.149
필터유무    0.190  0.246  0.895  0.110  0.149       1.000
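
One way to relate a correlation matrix to the two alerts listed near the top of the report is to scan its upper triangle for values above a cutoff. The sketch below does this for the last six-variable matrix in this section. The cutoff of 0.5 is an assumed, illustrative value (the report does not state its threshold), and the report does not say which matrix its alerts are based on.

```python
from itertools import combinations

names = ["호선", "용도", "기능", "위치", "구조물형태", "필터유무"]

# Values copied from the last six-variable matrix in this section.
matrix = [
    [1.000, 0.207, 0.325, 0.238, 0.255, 0.190],
    [0.207, 1.000, 0.276, 0.100, 0.109, 0.246],
    [0.325, 0.276, 1.000, 0.171, 0.193, 0.895],
    [0.238, 0.100, 0.171, 1.000, 0.244, 0.110],
    [0.255, 0.109, 0.193, 0.244, 1.000, 0.149],
    [0.190, 0.246, 0.895, 0.110, 0.149, 1.000],
]

THRESHOLD = 0.5  # assumed cutoff for illustration; not stated in the report

# Each unordered pair is visited once (upper triangle only).
high_pairs = [
    (names[i], names[j], matrix[i][j])
    for i, j in combinations(range(len(names)), 2)
    if abs(matrix[i][j]) > THRESHOLD
]

for a, b, value in high_pairs:
    print(f"{a} / {b}: {value}")
```

Under this assumed cutoff, the only pair that stands out is 기능 and 필터유무, which matches the pair named in the Alerts section.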

Missing values

Figure: A simple visualization of nullity by column.
Figure: Nullity matrix, a data-dense display that lets you quickly pick out patterns in data completion.

Sample

First rows

(index)  호선  번호  구간         용도  기능  위치        평균높이(m)  구조물형태  필터유무
0        1     3     서울역       역사  급기  중앙분리대  0.8          탑형
1        1     4     서울역       역사  배기  중앙분리대  0.8          탑형
2        1     5     서울역       역사  배기  녹지        0.95         탑형
3        1     6     서울역       역사  급기  녹지        1.15         탑형
4        1     7     서울역~시청  본선  자연  보도        0.65         탑형
5        1     8     서울역~시청  본선  자연  보도        1.65         탑형
6        1     9     서울역~시청  본선  자연  보도        0.75         탑형
7        1     10    서울역~시청  본선  자연  녹지        0.8          탑형
8        1     11    서울역~시청  본선  자연  보도        1.55         탑형
9        1     12    서울역~시청  본선  자연  차도        0            지면형

Last rows

(index)  호선  번호  구간           용도  기능  위치        평균높이(m)  구조물형태  필터유무
2196     8     102   강동구청       역사  배기  중앙분리대  0.91         탑형
2197     8     103   강동구청       역사  급기  보도        1.265        탑형
2198     8     104   강동구청∼토성  본선  배기  보도        0.6          탑형
2199     8     105   강동구청∼토성  본선  급기  녹지        0.97         탑형
2200     8     106   강동구청∼토성  본선  배기  보도        0.74         탑형
2201     8     107   몽촌토성       역사  배기  녹지        0.455        탑형
2202     8     108   몽촌토성       역사  급기  녹지        1.45         탑형
2203     8     109   몽촌토성       역사  배기  보도        0.6          탑형
2204     8     110   몽촌토성       역사  급기  보도        1.55         탑형
2205     8     111   몽촌토성∼잠실  본선  배기  보도        0.63         탑형