Overview

Dataset statistics

Number of variables: 6
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 585.9 KiB
Average record size in memory: 60.0 B

Variable types

Numeric: 4
Text: 1
Categorical: 1

Dataset

Description: File download
Author: Seoul Metropolitan Government (서울특별시)
URL: https://data.seoul.go.kr/dataList/OA-15067/S/1/datasetView.do

Alerts

NODE_ID is highly overall correlated with ARS_ID and 1 other field [High correlation]
ARS_ID is highly overall correlated with NODE_ID and 1 other field [High correlation]
Y좌표 is highly overall correlated with NODE_ID and 1 other field [High correlation]
NODE_ID has unique values [Unique]
ARS_ID has unique values [Unique]

Reproduction

Analysis started: 2024-05-11 09:37:00.498113
Analysis finished: 2024-05-11 09:37:09.341985
Duration: 8.84 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
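
The reported duration can be cross-checked from the two timestamps above. The following is a minimal sketch using only the Python standard library; the variable names are illustrative and not part of the report.

```python
from datetime import datetime

# Timestamps copied from the Reproduction block of this report.
started = datetime.fromisoformat("2024-05-11 09:37:00.498113")
finished = datetime.fromisoformat("2024-05-11 09:37:09.341985")

# Subtracting two datetimes yields a timedelta; total_seconds() keeps the
# fractional part, which the report rounds to two decimals.
duration_seconds = (finished - started).total_seconds()
print(f"Duration: {duration_seconds:.2f} seconds")
```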

Variables

NODE_ID
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 10000
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.1322055 × 10^8
Minimum: 1 × 10^8
Maximum: 1.6700064 × 10^8
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1 × 10^8
5-th percentile: 1.0100032 × 10^8
Q1: 1.0790011 × 10^8
Median: 1.1390007 × 10^8
Q3: 1.1990009 × 10^8
95-th percentile: 1.2300049 × 10^8
Maximum: 1.6700064 × 10^8
Range: 67000639
Interquartile range (IQR): 11999982

Descriptive statistics

Standard deviation: 6971204.1
Coefficient of variation (CV): 0.061571897
Kurtosis: -0.79676554
Mean: 1.1322055 × 10^8
Median Absolute Deviation (MAD): 5999992
Skewness: -0.11193869
Sum: 1.1322055 × 10^12
Variance: 4.8597687 × 10^13
Monotonicity: Not monotonic
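
Several of the NODE_ID statistics above are derived from one another, so they can be sanity-checked with a few lines of arithmetic. The sketch below uses the values as displayed in this report (already rounded), so small differences from the reported figures are expected; the variable names are illustrative only.

```python
import math

# Values copied from the NODE_ID statistics in this report (rounded as displayed).
n = 10000
mean = 1.1322055e8
std = 6971204.1
minimum = 100000001   # smallest value listed for NODE_ID
maximum = 167000640   # largest value listed for NODE_ID
q1, q3 = 1.0790011e8, 1.1990009e8

cv = std / mean                 # coefficient of variation = std / mean
variance = std ** 2             # variance = squared standard deviation
value_range = maximum - minimum # range = maximum - minimum
iqr = q3 - q1                   # from rounded quartiles, so only approximate
total = mean * n                # sum = mean * number of observations

print(f"CV       = {cv:.9f}")
print(f"Variance = {variance:.7e}")
print(f"Range    = {value_range}")
print(f"IQR      = {iqr:.0f}")
print(f"Sum      = {total:.7e}")
```

The same relationships hold for the other numeric variables; for ARS_ID, for example, the range is 25999 - 1001 = 24998 and the IQR is 20694.25 - 8588.75 = 12105.5.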
[Figure: histogram with fixed size bins (bins=50)]
Value                Count  Frequency (%)
116000148            1      < 0.1%
115000305            1      < 0.1%
107000525            1      < 0.1%
110000244            1      < 0.1%
110000376            1      < 0.1%
121000013            1      < 0.1%
119900285            1      < 0.1%
105000271            1      < 0.1%
119900021            1      < 0.1%
113900149            1      < 0.1%
Other values (9990)  9990   99.9%

Value                Count  Frequency (%)
100000001            1      < 0.1%
100000002            1      < 0.1%
100000003            1      < 0.1%
100000006            1      < 0.1%
100000007            1      < 0.1%
100000009            1      < 0.1%
100000011            1      < 0.1%
100000012            1      < 0.1%
100000014            1      < 0.1%
100000015            1      < 0.1%

Value                Count  Frequency (%)
167000640            1      < 0.1%
124900140            1      < 0.1%
124900139            1      < 0.1%
124900136            1      < 0.1%
124900135            1      < 0.1%
124900134            1      < 0.1%
124900133            1      < 0.1%
124900129            1      < 0.1%
124900128            1      < 0.1%
124900127            1      < 0.1%

ARS_ID
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 10000
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 14332.294
Minimum: 1001
Maximum: 25999
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1001
5-th percentile: 2708.9
Q1: 8588.75
Median: 14600.5
Q3: 20694.25
95-th percentile: 24421.05
Maximum: 25999
Range: 24998
Interquartile range (IQR): 12105.5

Descriptive statistics

Standard deviation: 6959.7223
Coefficient of variation (CV): 0.48559724
Kurtosis: -1.1247855
Mean: 14332.294
Median Absolute Deviation (MAD): 6029.5
Skewness: -0.151614
Sum: 1.4332294 × 10^8
Variance: 48437734
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=50)]
Value                Count  Frequency (%)
17239                1      < 0.1%
16419                1      < 0.1%
8410                 1      < 0.1%
11345                1      < 0.1%
11477                1      < 0.1%
22013                1      < 0.1%
20984                1      < 0.1%
6359                 1      < 0.1%
20579                1      < 0.1%
14877                1      < 0.1%
Other values (9990)  9990   99.9%

Value                Count  Frequency (%)
1001                 1      < 0.1%
1002                 1      < 0.1%
1003                 1      < 0.1%
1006                 1      < 0.1%
1007                 1      < 0.1%
1009                 1      < 0.1%
1010                 1      < 0.1%
1011                 1      < 0.1%
1013                 1      < 0.1%
1014                 1      < 0.1%

Value                Count  Frequency (%)
25999                1      < 0.1%
25998                1      < 0.1%
25997                1      < 0.1%
25996                1      < 0.1%
25995                1      < 0.1%
25994                1      < 0.1%
25990                1      < 0.1%
25783                1      < 0.1%
25782                1      < 0.1%
25781                1      < 0.1%

정류소명
Text

Distinct: 6667
Distinct (%): 66.7%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 26
Median length: 19
Mean length: 7.7769
Min length: 2

Characters and Unicode

Total characters: 77769
Distinct characters: 661
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
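
As a rough illustration of how such character properties can be read, the sketch below counts Unicode general categories for the five sample stop names shown in this report, using Python's standard `unicodedata` module. It is not the profiler's own implementation, and the `CATEGORY_NAMES` mapping is a hand-written assumption covering only the categories this report lists.

```python
import unicodedata
from collections import Counter

# Stop names copied from the Sample rows of this report.
names = [
    "디지털단지입구",
    "개봉역.영화아파트",
    "강변역.테크노마트앞",
    "고려슈퍼",
    "금호베스트빌.래미안하이베르",
]

# Hand-written mapping (an assumption) from two-letter category codes
# to the long names used in the report's tables.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Lu": "Uppercase Letter",
    "Ll": "Lowercase Letter",
    "Nd": "Decimal Number",
    "Po": "Other Punctuation",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Pd": "Dash Punctuation",
    "Zs": "Space Separator",
}

# Normalise to NFC so each Hangul syllable is a single code point,
# then look up the general category of every character.
category_counts = Counter(
    unicodedata.category(ch)
    for name in names
    for ch in unicodedata.normalize("NFC", name)
)

for code, count in category_counts.most_common():
    print(CATEGORY_NAMES.get(code, code), count)
```

Hangul syllables fall under "Other Letter" (`Lo`) and the full stop under "Other Punctuation" (`Po`), which is consistent with those two categories appearing in the category table for this variable.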

Unique

Unique: 4056
Unique (%): 40.6%

Sample

1st row: 디지털단지입구
2nd row: 개봉역.영화아파트
3rd row: 강변역.테크노마트앞
4th row: 고려슈퍼
5th row: 금호베스트빌.래미안하이베르

Value                Count  Frequency (%)
벽산아파트            11     0.1%
새마을금고            10     0.1%
현대아파트            9      0.1%
구로디지털단지역      9      0.1%
가산디지털단지역      9      0.1%
성원아파트            8      0.1%
합정역                8      0.1%
신대방역              8      0.1%
건영아파트            8      0.1%
북서울꿈의숲          8      0.1%
Other values (6658)  9913   99.1%

Most occurring characters

ValueCountFrequency (%)
2274
 
2.9%
2134
 
2.7%
2116
 
2.7%
. 2112
 
2.7%
2045
 
2.6%
1805
 
2.3%
1589
 
2.0%
1496
 
1.9%
1313
 
1.7%
1271
 
1.6%
Other values (651) 59614
76.7%

Most occurring categories

Value              Count  Frequency (%)
Other Letter       72197  92.8%
Decimal Number     2494   3.2%
Other Punctuation  2138   2.7%
Uppercase Letter   645    0.8%
Close Punctuation  127    0.2%
Open Punctuation   125    0.2%
Lowercase Letter   33     < 0.1%
Dash Punctuation   9      < 0.1%
Space Separator    1      < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
2274
 
3.1%
2134
 
3.0%
2116
 
2.9%
2045
 
2.8%
1805
 
2.5%
1589
 
2.2%
1496
 
2.1%
1313
 
1.8%
1271
 
1.8%
1259
 
1.7%
Other values (606) 54895
76.0%
Uppercase Letter
ValueCountFrequency (%)
T 80
12.4%
S 71
11.0%
K 67
10.4%
C 63
9.8%
A 55
8.5%
P 48
 
7.4%
G 41
 
6.4%
M 35
 
5.4%
B 33
 
5.1%
D 29
 
4.5%
Other values (13) 123
19.1%
Decimal Number
ValueCountFrequency (%)
1 749
30.0%
2 474
19.0%
3 339
13.6%
4 210
 
8.4%
5 165
 
6.6%
0 155
 
6.2%
7 124
 
5.0%
6 116
 
4.7%
9 102
 
4.1%
8 60
 
2.4%
Other Punctuation
ValueCountFrequency (%)
. 2112
98.8%
· 13
 
0.6%
& 11
 
0.5%
, 2
 
0.1%
Lowercase Letter
ValueCountFrequency (%)
e 25
75.8%
k 4
 
12.1%
t 2
 
6.1%
s 2
 
6.1%
Close Punctuation
ValueCountFrequency (%)
) 127
100.0%
Open Punctuation
ValueCountFrequency (%)
( 125
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 9
100.0%
Space Separator
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  72197  92.8%
Common  4894   6.3%
Latin   678    0.9%

Most frequent character per script

Hangul
ValueCountFrequency (%)
2274
 
3.1%
2134
 
3.0%
2116
 
2.9%
2045
 
2.8%
1805
 
2.5%
1589
 
2.2%
1496
 
2.1%
1313
 
1.8%
1271
 
1.8%
1259
 
1.7%
Other values (606) 54895
76.0%
Latin
ValueCountFrequency (%)
T 80
11.8%
S 71
10.5%
K 67
9.9%
C 63
9.3%
A 55
 
8.1%
P 48
 
7.1%
G 41
 
6.0%
M 35
 
5.2%
B 33
 
4.9%
D 29
 
4.3%
Other values (17) 156
23.0%
Common
ValueCountFrequency (%)
. 2112
43.2%
1 749
 
15.3%
2 474
 
9.7%
3 339
 
6.9%
4 210
 
4.3%
5 165
 
3.4%
0 155
 
3.2%
) 127
 
2.6%
( 125
 
2.6%
7 124
 
2.5%
Other values (8) 314
 
6.4%

Most occurring blocks

Value   Count  Frequency (%)
Hangul  72197  92.8%
ASCII   5559   7.1%
None    13     < 0.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
2274
 
3.1%
2134
 
3.0%
2116
 
2.9%
2045
 
2.8%
1805
 
2.5%
1589
 
2.2%
1496
 
2.1%
1313
 
1.8%
1271
 
1.8%
1259
 
1.7%
Other values (606) 54895
76.0%
ASCII
ValueCountFrequency (%)
. 2112
38.0%
1 749
 
13.5%
2 474
 
8.5%
3 339
 
6.1%
4 210
 
3.8%
5 165
 
3.0%
0 155
 
2.8%
) 127
 
2.3%
( 125
 
2.2%
7 124
 
2.2%
Other values (34) 979
17.6%
None
ValueCountFrequency (%)
· 13
100.0%

X좌표
Real number (ℝ)

Distinct: 9997
Distinct (%): > 99.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 126.98602
Minimum: 126.45723
Maximum: 127.18176
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 126.45723
5-th percentile: 126.84248
Q1: 126.91678
Median: 126.99461
Q3: 127.05134
95-th percentile: 127.12762
Maximum: 127.18176
Range: 0.72453
Interquartile range (IQR): 0.13456685

Descriptive statistics

Standard deviation: 0.086292844
Coefficient of variation (CV): 0.00067954601
Kurtosis: -0.7337225
Mean: 126.98602
Median Absolute Deviation (MAD): 0.068258669
Skewness: -0.060877533
Sum: 1269860.2
Variance: 0.007446455
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=50)]
Value                Count  Frequency (%)
127.1480886874       2      < 0.1%
127.013138           2      < 0.1%
127.0520250488       2      < 0.1%
126.8764578415       1      < 0.1%
127.0685176937       1      < 0.1%
127.0596902572       1      < 0.1%
127.039436762        1      < 0.1%
127.0571622955       1      < 0.1%
127.0590758831       1      < 0.1%
127.0230319441       1      < 0.1%
Other values (9987)  9987   99.9%

Value                Count  Frequency (%)
126.45723            1      < 0.1%
126.7210313414       1      < 0.1%
126.7974938496       1      < 0.1%
126.797811           1      < 0.1%
126.7978638462       1      < 0.1%
126.798335           1      < 0.1%
126.7984631135       1      < 0.1%
126.7985207144       1      < 0.1%
126.7985641294       1      < 0.1%
126.7987623811       1      < 0.1%

Value                Count  Frequency (%)
127.18176            1      < 0.1%
127.1817343335       1      < 0.1%
127.1816669472       1      < 0.1%
127.18013794         1      < 0.1%
127.18013            1      < 0.1%
127.1799002887       1      < 0.1%
127.1798392415       1      < 0.1%
127.179726           1      < 0.1%
127.1797196537       1      < 0.1%
127.1794626974       1      < 0.1%

Y좌표
Real number (ℝ)

HIGH CORRELATION 

Distinct: 9993
Distinct (%): 99.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 37.549895
Minimum: 37.43052
Maximum: 37.690177
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 37.43052
5-th percentile: 37.471726
Q1: 37.502544
Median: 37.549192
Q3: 37.589605
95-th percentile: 37.646853
Maximum: 37.690177
Range: 0.25965706
Interquartile range (IQR): 0.087060405

Descriptive statistics

Standard deviation: 0.05463045
Coefficient of variation (CV): 0.0014548762
Kurtosis: -0.75913267
Mean: 37.549895
Median Absolute Deviation (MAD): 0.044012701
Skewness: 0.27174826
Sum: 375498.95
Variance: 0.002984486
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=50)]
Value                Count  Frequency (%)
37.5361286235        2      < 0.1%
37.477818            2      < 0.1%
37.5042586234        2      < 0.1%
37.6015704448        2      < 0.1%
37.4892113897        2      < 0.1%
37.450172246         2      < 0.1%
37.5625221156        2      < 0.1%
37.5830057298        1      < 0.1%
37.4646017803        1      < 0.1%
37.5608966456        1      < 0.1%
Other values (9983)  9983   99.8%

Value                Count  Frequency (%)
37.4305199435        1      < 0.1%
37.4309469125        1      < 0.1%
37.4345128931        1      < 0.1%
37.4347964213        1      < 0.1%
37.4348585994        1      < 0.1%
37.4349735461        1      < 0.1%
37.4350042057        1      < 0.1%
37.4355241561        1      < 0.1%
37.4371542291        1      < 0.1%
37.4379594347        1      < 0.1%

Value                Count  Frequency (%)
37.690177            1      < 0.1%
37.6899483575        1      < 0.1%
37.6898762161        1      < 0.1%
37.6893500743        1      < 0.1%
37.6893310475        1      < 0.1%
37.689202857         1      < 0.1%
37.6890118581        1      < 0.1%
37.688568            1      < 0.1%
37.6879883235        1      < 0.1%
37.6879397664        1      < 0.1%

정류소타입
Categorical

Distinct: 6
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 5
Median length: 4
Mean length: 4.046
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 일반차로
2nd row: 중앙차로
3rd row: 마을버스
4th row: 마을버스
5th row: 일반차로

Common Values

Value        Count  Frequency (%)
일반차로      5495   54.9%
마을버스      3689   36.9%
중앙차로      356    3.6%
가로변시간    244    2.4%
가로변전일    134    1.3%
가상정류장    82     0.8%
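
The shares and the mean category length reported for 정류소타입 follow directly from the counts in the table above. A minimal sketch, using only the counts shown in this report and illustrative variable names:

```python
import unicodedata

# Category counts copied from the Common Values table of 정류소타입.
raw_counts = {
    "일반차로": 5495,
    "마을버스": 3689,
    "중앙차로": 356,
    "가로변시간": 244,
    "가로변전일": 134,
    "가상정류장": 82,
}

# Normalise keys to NFC so len() counts one code point per Hangul syllable.
counts = {unicodedata.normalize("NFC", k): v for k, v in raw_counts.items()}

total = sum(counts.values())

# Share of each category as a percentage of all observations.
shares = {label: 100 * count / total for label, count in counts.items()}

# Mean label length, weighted by how often each label occurs.
mean_length = sum(len(label) * count for label, count in counts.items()) / total

for label, share in shares.items():
    print(f"{label}: {share:.2f}%")
print(f"Total observations: {total}")
print(f"Mean length: {mean_length:.3f}")
```

The six counts add up to the 10000 observations of the dataset, and the weighted label length agrees with the "Mean length" figure listed for this variable.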

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values listed under Common Values]

Interactions

[Figure: interaction plots (16 panels)]

Correlations

[Figure: correlation heatmap]

            NODE_ID  ARS_ID  X좌표    Y좌표    정류소타입
NODE_ID     1.000    0.977   0.822   0.835   0.130
ARS_ID      0.977    1.000   0.791   0.859   0.288
X좌표        0.822    0.791   1.000   0.401   0.202
Y좌표        0.835    0.859   0.401   1.000   0.169
정류소타입    0.130    0.288   0.202   0.169   1.000

[Figure: correlation heatmap]

            NODE_ID  ARS_ID  X좌표    Y좌표    정류소타입
NODE_ID     1.000    0.998   -0.061  -0.673  0.088
ARS_ID      0.998    1.000   -0.061  -0.674  0.156
X좌표        -0.061   -0.061  1.000   0.226   0.113
Y좌표        -0.673   -0.674  0.226   1.000   0.089
정류소타입    0.088    0.156   0.113   0.089   1.000
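
The high-correlation alerts listed in the Alerts section can be related to the second correlation table above. The sketch below flags every pair whose absolute coefficient reaches a cutoff. The cutoff of 0.5 and the helper `highly_correlated` are assumptions made for illustration; this report does not state which threshold or which matrix the profiler used.

```python
# Column order and coefficients copied from the second correlation table.
columns = ["NODE_ID", "ARS_ID", "X좌표", "Y좌표", "정류소타입"]
matrix = [
    [1.000, 0.998, -0.061, -0.673, 0.088],
    [0.998, 1.000, -0.061, -0.674, 0.156],
    [-0.061, -0.061, 1.000, 0.226, 0.113],
    [-0.673, -0.674, 0.226, 1.000, 0.089],
    [0.088, 0.156, 0.113, 0.089, 1.000],
]

THRESHOLD = 0.5  # assumed cutoff, not taken from the report


def highly_correlated(columns, matrix, threshold):
    """Map each column to the other columns with |coefficient| >= threshold."""
    result = {}
    for i, name in enumerate(columns):
        partners = [
            columns[j]
            for j, value in enumerate(matrix[i])
            if j != i and abs(value) >= threshold
        ]
        if partners:
            result[name] = partners
    return result


alerts = highly_correlated(columns, matrix, THRESHOLD)
for name, partners in alerts.items():
    print(f"{name} is highly correlated with {', '.join(partners)}")
```

Under this assumed cutoff, NODE_ID, ARS_ID, and Y좌표 are each flagged with two partners while X좌표 and 정류소타입 are not flagged, which matches the pattern of the alerts in this report.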

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

(index)  NODE_ID    ARS_ID  정류소명                        X좌표        Y좌표       정류소타입
6969     116000148  17239   디지털단지입구                  126.876458  37.486424  일반차로
6832     116000009  17009   개봉역.영화아파트               126.860179  37.496907  중앙차로
1616     104900038  5543    강변역.테크노마트앞             127.094859  37.536393  마을버스
2720     107900077  8466    고려슈퍼                        127.027768  37.598641  마을버스
526      101000154  2259    금호베스트빌.래미안하이베르     127.024051  37.561645  일반차로
8216     119000011  20011   노량진역                        126.943811  37.513733  중앙차로
9944     122000143  23246   래미안하이스턴.대치순복음교회   127.063866  37.499945  일반차로
8947     120900058  21531   충남슈퍼.생각대로통신           126.922789  37.474919  마을버스
1117     103000167  4269    성수2가3동주민센터              127.054238  37.550425  일반차로
4553     111000253  12343   하나은행역촌동지점              126.920424  37.608336  일반차로

(index)  NODE_ID    ARS_ID  정류소명                        X좌표        Y좌표       정류소타입
4792     111901125  12867   상신초등학교입구                126.90848   37.5966    마을버스
7886     118000576  19291   영진시장                        126.915497  37.500155  일반차로
10394    123000041  24130   잠실중학교.장미종합상가         127.10066   37.51715   일반차로
3947     110000155  11255   상계주공6단지                   127.061145  37.653184  가로변전일
1515     104000106  5199    구의역3번출구                   127.088289  37.537519  일반차로
6510     115900232  16464   강원슈퍼                        126.856929  37.535661  마을버스
2088     106000054  7148    동부시장남문입구                127.077665  37.590431  일반차로
3358     108900041  9834    수유역.강북구청                 127.025923  37.63764   마을버스
10176    122000724  23489   래미안개포루체하임아파트        127.081475  37.489303  일반차로
7515     117900091  18540   박미마을                        126.903169  37.440932  마을버스