Overview

Dataset statistics

Number of variables: 7
Number of observations: 275
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 16.0 KiB
Average record size in memory: 59.5 B

Variable types

Numeric: 3
Text: 2
Categorical: 2

Dataset

Description: File download (파일 다운로드)
Author: Seoul Metro (서울교통공사)
URL: https://data.seoul.go.kr/dataList/OA-11572/S/1/datasetView.do

Alerts

High correlation: 호선 is highly overall correlated with 길이(M) and 1 other field
High correlation: 길이(M) is highly overall correlated with 호선 and 2 other fields
High correlation: 준공년도 is highly overall correlated with 호선 and 1 other field
High correlation: 층수 is highly overall correlated with 길이(M)

Reproduction

Analysis started: 2024-04-29 15:52:22.049618
Analysis finished: 2024-04-29 15:52:23.258577
Duration: 1.21 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

호선 (line number)
Real number (ℝ)

HIGH CORRELATION

Distinct: 8
Distinct (%): 2.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.6109091
Minimum: 1
Maximum: 8
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.5 KiB

Quantile statistics

Minimum: 1
5th percentile: 2
Q1: 3
Median: 5
Q3: 6
95th percentile: 8
Maximum: 8
Range: 7
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 2.00307
Coefficient of variation (CV): 0.43441975
Kurtosis: -1.1455822
Mean: 4.6109091
Median Absolute Deviation (MAD): 2
Skewness: -0.058341332
Sum: 1268
Variance: 4.0122893
Monotonicity: Increasing

[Figure: histogram with fixed size bins (bins=8)]
Common values (by count)

Value  Count  Frequency (%)
5      56     20.4%
2      50     18.2%
7      42     15.3%
6      39     14.2%
3      34     12.4%
4      26     9.5%
8      18     6.5%
1      10     3.6%

Smallest values (ascending)

Value  Count  Frequency (%)
1      10     3.6%
2      50     18.2%
3      34     12.4%
4      26     9.5%
5      56     20.4%
6      39     14.2%
7      42     15.3%
8      18     6.5%

Largest values (descending)

Value  Count  Frequency (%)
8      18     6.5%
7      42     15.3%
6      39     14.2%
5      56     20.4%
4      26     9.5%
3      34     12.4%
2      50     18.2%
1      10     3.6%
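
The summary figures reported for 호선 can be cross-checked from its value counts alone. The following is a minimal standard-library sketch, assuming the reported variance is the sample variance (n - 1 denominator); the `counts` dictionary is copied from the value counts listed for this variable, and the variable names are illustrative.

```python
import statistics

# Value counts for 호선 as listed in the report (value -> count).
counts = {1: 10, 2: 50, 3: 34, 4: 26, 5: 56, 6: 39, 7: 42, 8: 18}

# Expand the counts back into one value per observation.
values = [v for v, c in sorted(counts.items()) for _ in range(c)]

n = len(values)
total = sum(values)
mean = statistics.mean(values)
variance = statistics.variance(values)  # sample variance (n - 1); an assumption
std = statistics.stdev(values)
cv = std / mean  # coefficient of variation

print(n, total)
print(round(mean, 7), round(variance, 7), round(std, 5), round(cv, 8))
```

Under that assumption the recomputed values agree with the Mean, Sum, Variance, Standard deviation, and CV rows of the report to the precision shown.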

역명 (station name)
Text

Distinct: 240
Distinct (%): 87.3%
Missing: 0
Missing (%): 0.0%
Memory size: 2.3 KiB

Length

Max length: 9
Median length: 2
Mean length: 2.9127273
Min length: 2

Characters and Unicode

Total characters: 801
Distinct characters: 207
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
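
As an illustration of how such character properties can be read, the sketch below tallies the Unicode general category of each character using Python's standard `unicodedata` module. It uses only the five sample values of 역명 shown in this report, so the totals differ from the full-column figures; the variable names are illustrative.

```python
import unicodedata
from collections import Counter

# First five sample values of 역명 as shown in the report.
names = ["서울", "시청", "종각", "종로3가", "종로5가"]

# Normalize to NFC so each Hangul syllable is a single code point,
# then tally the general category of every character.
category_counts = Counter(
    unicodedata.category(ch)
    for name in names
    for ch in unicodedata.normalize("NFC", name)
)

# "Lo" is Other Letter (Hangul syllables); "Nd" is Decimal Number (digits).
print(dict(category_counts))
```

The standard library exposes the general category but not the script or block of a character, so those two breakdowns would need an additional data source.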

Unique

Unique: 207
Unique (%): 75.3%

Sample

1st row: 서울
2nd row: 시청
3rd row: 종각
4th row: 종로3가
5th row: 종로5가

Value               Count  Frequency (%)
종로3가              3      1.1%
동대문역사문화공원    3      1.1%
영등포구청           2      0.7%
삼각지               2      0.7%
서울                 2      0.7%
가락시장             2      0.7%
충정로               2      0.7%
사당                 2      0.7%
대림                 2      0.7%
교대                 2      0.7%
Other values (230)  253    92.0%

Most occurring characters

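Value               Count  Frequency (%)
(not captured)      32     4.0%
(not captured)      28     3.5%
(not captured)      24     3.0%
(not captured)      22     2.7%
(not captured)      19     2.4%
(not captured)      15     1.9%
(not captured)      15     1.9%
(not captured)      15     1.9%
(not captured)      14     1.7%
(not captured)      14     1.7%
Other values (197)  603    75.3%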

Most occurring categories

Value           Count  Frequency (%)
Other Letter    793    99.0%
Decimal Number  8      1.0%

Most frequent character per category

Other Letter
Value               Count  Frequency (%)
(not captured)      32     4.0%
(not captured)      28     3.5%
(not captured)      24     3.0%
(not captured)      22     2.8%
(not captured)      19     2.4%
(not captured)      15     1.9%
(not captured)      15     1.9%
(not captured)      15     1.9%
(not captured)      14     1.8%
(not captured)      14     1.8%
Other values (194)  595    75.0%

Decimal Number
Value  Count  Frequency (%)
3      5      62.5%
4      2      25.0%
5      1      12.5%

Most occurring scripts

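Value   Count  Frequency (%)
Hangul  793    99.0%
Common  8      1.0%

Most frequent character per script

Hangul
Value               Count  Frequency (%)
(not captured)      32     4.0%
(not captured)      28     3.5%
(not captured)      24     3.0%
(not captured)      22     2.8%
(not captured)      19     2.4%
(not captured)      15     1.9%
(not captured)      15     1.9%
(not captured)      15     1.9%
(not captured)      14     1.8%
(not captured)      14     1.8%
Other values (194)  595    75.0%

Common
Value  Count  Frequency (%)
3      5      62.5%
4      2      25.0%
5      1      12.5%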

Most occurring blocks

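Value   Count  Frequency (%)
Hangul  793    99.0%
ASCII   8      1.0%

Most frequent character per block

Hangul
Value               Count  Frequency (%)
(not captured)      32     4.0%
(not captured)      28     3.5%
(not captured)      24     3.0%
(not captured)      22     2.8%
(not captured)      19     2.4%
(not captured)      15     1.9%
(not captured)      15     1.9%
(not captured)      15     1.9%
(not captured)      14     1.8%
(not captured)      14     1.8%
Other values (194)  595    75.0%

ASCII
Value  Count  Frequency (%)
3      5      62.5%
4      2      25.0%
5      1      12.5%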

형식 (platform type)
Categorical

Distinct: 3
Distinct (%): 1.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.3 KiB
상대식 (side platform): 195
섬식 (island platform): 67
복합식 (combined): 13

Length

Max length: 3
Median length: 3
Mean length: 2.7563636
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 섬식
2nd row: 상대식
3rd row: 상대식
4th row: 상대식
5th row: 상대식

Common Values

Value   Count  Frequency (%)
상대식   195    70.9%
섬식     67     24.4%
복합식   13     4.7%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot; its counts and percentages match the Common Values table]
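
The percentages and the mean length reported for 형식 follow directly from the three category counts. A minimal sketch of that arithmetic, assuming one-decimal rounding as displayed in the report (variable names are illustrative):

```python
# Counts for 형식 as listed in the Common Values table.
counts = {"상대식": 195, "섬식": 67, "복합식": 13}

n = sum(counts.values())

# Share of each category, rounded to one decimal place.
percentages = {k: round(100 * c / n, 1) for k, c in counts.items()}

# Mean string length: each category's length weighted by its count.
mean_length = sum(len(k) * c for k, c in counts.items()) / n

print(n, percentages, round(mean_length, 7))
```

The weighted mean explains why the mean length (about 2.76) sits below the median length of 3: only the 67 섬식 rows have two characters.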

길이(M) (length in m)
Real number (ℝ)

HIGH CORRELATION

Distinct: 6
Distinct (%): 2.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 178.61818
Minimum: 90
Maximum: 210
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.5 KiB

Quantile statistics

Minimum: 90
5th percentile: 125
Q1: 165
Median: 165
Q3: 205
95th percentile: 205
Maximum: 210
Range: 120
Interquartile range (IQR): 40

Descriptive statistics

Standard deviation: 25.092886
Coefficient of variation (CV): 0.14048338
Kurtosis: -0.33725366
Mean: 178.61818
Median Absolute Deviation (MAD): 0
Skewness: -0.42705306
Sum: 49120
Variance: 629.65295
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=6)]
Common values (by count)

Value  Count  Frequency (%)
165    140    50.9%
205    104    37.8%
125    18     6.5%
210    10     3.6%
130    2      0.7%
90     1      0.4%

Smallest values (ascending)

Value  Count  Frequency (%)
90     1      0.4%
125    18     6.5%
130    2      0.7%
165    140    50.9%
205    104    37.8%
210    10     3.6%

Largest values (descending)

Value  Count  Frequency (%)
210    10     3.6%
205    104    37.8%
165    140    50.9%
130    2      0.7%
125    18     6.5%
90     1      0.4%
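
The quartiles and the zero MAD reported for 길이(M) can be reproduced from the value counts. The sketch below uses the standard library's `statistics.quantiles` with the `"inclusive"` method (linear interpolation between closest ranks); whether the profiler applies the same interpolation rule is an assumption, although for this column the result does not depend on it because the relevant ranks fall inside runs of identical values.

```python
import statistics

# Value counts for 길이(M) as listed in the report (value -> count).
counts = {90: 1, 125: 18, 130: 2, 165: 140, 205: 104, 210: 10}
values = [v for v, c in sorted(counts.items()) for _ in range(c)]

# Quartiles via linear interpolation between closest ranks.
q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
iqr = q3 - q1

# Median absolute deviation: the median of |x - median|.
mad = statistics.median(abs(v - median) for v in values)

print(q1, median, q3, iqr, mad)
```

The MAD is 0 because 140 of the 275 observations, more than half, equal the median of 165, so the median of the absolute deviations is itself 0.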

층수 (floor levels)
Categorical

HIGH CORRELATION

Distinct: 18
Distinct (%): 6.5%
Missing: 0
Missing (%): 0.0%
Memory size: 2.3 KiB
B2: 106
B3: 83
B4: 34
B5: 16
3F: 14
Other values (13): 22

Length

Max length: 4
Median length: 2
Mean length: 2.0581818
Min length: 2

Unique

Unique: 10
Unique (%): 3.6%

Sample

1st row: B2
2nd row: B2
3rd row: B2
4th row: B2
5th row: B2

Common Values

Value             Count  Frequency (%)
B2                106    38.5%
B3                83     30.2%
B4                34     12.4%
B5                16     5.8%
3F                14     5.1%
2F                7      2.5%
B6                3      1.1%
1FB3              2      0.7%
5FB2              1      0.4%
1F                1      0.4%
Other values (8)  8      2.9%

Length

[Figure: histogram of lengths of the category]

Value counts (values shown in lower case)

Value             Count  Frequency (%)
b2                106    38.5%
b3                83     30.2%
b4                34     12.4%
b5                16     5.8%
3f                14     5.1%
2f                7      2.5%
b6                3      1.1%
1fb3              2      0.7%
1fb5              1      0.4%
2fb2              1      0.4%
Other values (8)  8      2.9%

면적(㎡) (area in m²)
Text

Distinct: 273
Distinct (%): 99.3%
Missing: 0
Missing (%): 0.0%
Memory size: 2.3 KiB

Length

Max length: 9
Median length: 8
Mean length: 8.2581818
Min length: 8

Characters and Unicode

Total characters: 2271
Distinct characters: 12
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 271
Unique (%): 98.5%

Sample

1st row: 10,805.00
2nd row: 11,317.00
3rd row: 10,410.20
4th row: 9,311.00
5th row: 10,465.00

Value               Count  Frequency (%)
6,439.00            2      0.7%
6,086.00            2      0.7%
8,133.20            1      0.4%
5,530.70            1      0.4%
7,667.00            1      0.4%
6,518.70            1      0.4%
10,805.00           1      0.4%
5,723.10            1      0.4%
6,193.80            1      0.4%
11,229.50           1      0.4%
Other values (263)  263    95.6%

Most occurring characters

Value             Count  Frequency (%)
0                 480    21.1%
,                 275    12.1%
.                 275    12.1%
1                 179    7.9%
6                 150    6.6%
9                 148    6.5%
8                 144    6.3%
5                 144    6.3%
7                 133    5.9%
2                 119    5.2%
Other values (2)  224    9.9%

Most occurring categories

Value              Count  Frequency (%)
Decimal Number     1721   75.8%
Other Punctuation  550    24.2%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0      480    27.9%
1      179    10.4%
6      150    8.7%
9      148    8.6%
8      144    8.4%
5      144    8.4%
7      133    7.7%
2      119    6.9%
4      113    6.6%
3      111    6.4%

Other Punctuation
Value  Count  Frequency (%)
,      275    50.0%
.      275    50.0%

Most occurring scripts

Value   Count  Frequency (%)
Common  2271   100.0%

Most frequent character per script

Common
Value             Count  Frequency (%)
0                 480    21.1%
,                 275    12.1%
.                 275    12.1%
1                 179    7.9%
6                 150    6.6%
9                 148    6.5%
8                 144    6.3%
5                 144    6.3%
7                 133    5.9%
2                 119    5.2%
Other values (2)  224    9.9%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  2271   100.0%

Most frequent character per block

ASCII
Value             Count  Frequency (%)
0                 480    21.1%
,                 275    12.1%
.                 275    12.1%
1                 179    7.9%
6                 150    6.6%
9                 148    6.5%
8                 144    6.3%
5                 144    6.3%
7                 133    5.9%
2                 119    5.2%
Other values (2)  224    9.9%
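
The area column 면적(㎡) is profiled as Text because every value carries a thousands separator (275 commas across 275 rows), so it is excluded from the numeric statistics and correlations. A minimal sketch of converting such strings to numbers before profiling; `parse_area` is a hypothetical helper, not part of the dataset or the profiler:

```python
def parse_area(text: str) -> float:
    """Convert a string such as '10,805.00' to a float, dropping commas."""
    return float(text.replace(",", ""))

# First five sample values of the column as shown in the report.
samples = ["10,805.00", "11,317.00", "10,410.20", "9,311.00", "10,465.00"]
areas = [parse_area(s) for s in samples]

print(areas)
```

When the data is loaded with pandas, passing `thousands=","` to `read_csv` is another way to achieve the same conversion at read time.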

준공년도 (year of completion)
Real number (ℝ)

HIGH CORRELATION

Distinct: 23
Distinct (%): 8.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1992.9818
Minimum: 1974
Maximum: 2022
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.5 KiB

Quantile statistics

Minimum: 1974
5th percentile: 1980
Q1: 1985
Median: 1996
Q3: 2000
95th percentile: 2001
Maximum: 2022
Range: 48
Interquartile range (IQR): 15

Descriptive statistics

Standard deviation: 9.0756973
Coefficient of variation (CV): 0.0045538284
Kurtosis: 0.59634383
Mean: 1992.9818
Median Absolute Deviation (MAD): 5
Skewness: 0.27740893
Sum: 548070
Variance: 82.368281
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=23)]
Common values (by count)

Value              Count  Frequency (%)
1985               47     17.1%
1996               46     16.7%
2001               41     14.9%
1997               26     9.5%
2000               19     6.9%
1984               16     5.8%
1983               14     5.1%
1995               12     4.4%
1980               11     4.0%
1974               9      3.3%
Other values (13)  34     12.4%

Smallest values (ascending)

Value  Count  Frequency (%)
1974   9      3.3%
1980   11     4.0%
1982   5      1.8%
1983   14     5.1%
1984   16     5.8%
1985   47     17.1%
1990   1      0.4%
1992   2      0.7%
1993   8      2.9%
1994   1      0.4%

Largest values (descending)

Value  Count  Frequency (%)
2022   1      0.4%
2021   3      1.1%
2020   2      0.7%
2019   1      0.4%
2010   3      1.1%
2005   2      0.7%
2002   1      0.4%
2001   41     14.9%
2000   19     6.9%
1999   4      1.5%

Interactions

[Figures: nine interaction plots; images not included]

Correlations

[Figure: correlation plot]

          호선    형식    길이(M)  층수    준공년도
호선       1.000  0.236  0.825   0.497  0.935
형식       0.236  1.000  0.105   0.000  0.255
길이(M)    0.825  0.105  1.000   0.859  0.761
층수       0.497  0.000  0.859   1.000  0.763
준공년도    0.935  0.255  0.761   0.763  1.000

[Figure: correlation plot]

       층수    형식
층수    1.000  0.000
형식    0.000  1.000

[Figure: correlation plot]

          호선     길이(M)  준공년도  형식    층수
호선       1.000   -0.841  0.790    0.152  0.229
길이(M)    -0.841  1.000   -0.741   0.078  0.643
준공년도    0.790   -0.741  1.000    0.155  0.414
형식       0.152   0.078   0.155    1.000  0.000
층수       0.229   0.643   0.414    0.000  1.000
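
To read the strongest relationships off a matrix like the first one in this section, each unordered pair can be listed once and filtered by a cutoff. The sketch below does this in plain Python; the cutoff of 0.8 is an illustrative choice and not the threshold the profiler uses for its High correlation alerts.

```python
from itertools import combinations

names = ["호선", "형식", "길이(M)", "층수", "준공년도"]

# First correlation matrix from the report, row by row.
matrix = [
    [1.000, 0.236, 0.825, 0.497, 0.935],
    [0.236, 1.000, 0.105, 0.000, 0.255],
    [0.825, 0.105, 1.000, 0.859, 0.761],
    [0.497, 0.000, 0.859, 1.000, 0.763],
    [0.935, 0.255, 0.761, 0.763, 1.000],
]

CUTOFF = 0.8  # illustrative value, not the profiler's alert threshold

# Visit each unordered pair once (upper triangle) and keep the strong ones,
# sorted from the largest absolute coefficient down.
strong_pairs = sorted(
    (
        (names[i], names[j], matrix[i][j])
        for i, j in combinations(range(len(names)), 2)
        if abs(matrix[i][j]) > CUTOFF
    ),
    key=lambda pair: -abs(pair[2]),
)

for a, b, r in strong_pairs:
    print(f"{a} ~ {b}: {r:.3f}")
```

Using the absolute value in the filter means the same code also works for matrices with negative coefficients, such as the third one in this section.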

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First 10 rows

Index  호선  역명      형식    길이(M)  층수   면적(㎡)    준공년도
0      1    서울      섬식    210     B2    10,805.00  1974
1      1    시청      상대식  210     B2    11,317.00  1974
2      1    종각      상대식  210     B2    10,410.20  1974
3      1    종로3가   상대식  210     B2    9,311.00   1974
4      1    종로5가   상대식  210     B2    10,465.00  1974
5      1    동대문    상대식  210     B2    5,490.00   1974
6      1    동묘앞    상대식  210     5FB2  7,031.70   2005
7      1    신설동    상대식  210     B2    7,240.00   1974
8      1    제기동    상대식  210     B2    8,662.00   1974
9      1    청량리    섬식    210     B2    7,125.00   1974

Last 10 rows

Index  호선  역명          형식    길이(M)  층수  면적(㎡)   준공년도
265    8    문정          상대식  125     B2   5,194.00  1997
266    8    장지          상대식  125     B2   5,727.90  1997
267    8    복정          섬식    125     B2   6,585.90  1997
268    8    남위례        상대식  125     B2   4,177.50  2021
269    8    산성          상대식  125     B4   6,526.60  1997
270    8    남한산성입구  상대식  125     B3   5,412.30  1997
271    8    단대오거리    상대식  125     B3   8,133.20  1997
272    8    신흥          상대식  125     B2   4,861.60  1997
273    8    수진          상대식  125     B2   5,067.30  1997
274    8    모란          상대식  125     B3   9,918.80  1997