Overview

Dataset statistics

Number of variables: 5
Number of observations: 10000
Missing cells: 10000
Missing cells (%): 20.0%
Duplicate rows: 230
Duplicate rows (%): 2.3%
Total size in memory: 517.6 KiB
Average record size in memory: 53.0 B
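The derived figures above follow from the raw counts. A minimal sketch of that arithmetic, assuming that cells are counted as variables times observations and that one KiB is 1024 bytes (the variable names are illustrative, not taken from the report):

```python
# Raw figures copied from the dataset statistics.
n_variables = 5
n_observations = 10_000
missing_cells = 10_000
duplicate_rows = 230
total_size_kib = 517.6

# Assumption: every variable contributes one cell per observation.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells
duplicate_pct = 100 * duplicate_rows / n_observations
# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Duplicate rows: {duplicate_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

Under these assumptions the three printed values agree with the reported percentages and record size. The 10000 missing cells are fully accounted for by the single all-missing column 도로명.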

Variable types

Numeric: 4
Unsupported: 1

Alerts

Dataset has 230 (2.3%) duplicate rows [Duplicates]
시작노드아이디 is highly overall correlated with 링크아이디 and 1 other field [High correlation]
링크아이디 is highly overall correlated with 시작노드아이디 and 1 other field [High correlation]
종점노드아이디 is highly overall correlated with 시작노드아이디 and 1 other field [High correlation]
도로명 has 10000 (100.0%) missing values [Missing]
도로명 is an unsupported type; check whether it needs cleaning or further analysis [Unsupported]

Reproduction

Analysis started: 2023-12-10 21:09:04.699141
Analysis finished: 2023-12-10 21:09:06.775268
Duration: 2.08 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
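A report of this kind can be regenerated with ydata-profiling. The following is a minimal sketch only: the helper name `build_profile`, the file paths, and the report title are hypothetical, the input is assumed to be a CSV file, and default settings are used rather than the linked config.json.

```python
def build_profile(csv_path: str, html_path: str) -> None:
    """Profile a CSV file and write an HTML report (hypothetical helper)."""
    # Third-party imports stay inside the function so that defining it
    # does not require pandas or ydata-profiling to be installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path)  # assumption: the source data is a CSV file
    report = ProfileReport(df, title="Link dataset profile")  # illustrative title
    report.to_file(html_path)
```

A call such as `build_profile("links.csv", "report.html")` would write the report, where both file names are placeholders.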

Variables

시작노드아이디
Real number (ℝ)

HIGH CORRELATION 

Distinct: 8703
Distinct (%): 87.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.216906 × 10^9
Minimum: 1.0000002 × 10^9
Maximum: 3.9700005 × 10^9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1.0000002 × 10^9
5-th percentile: 1.1600006 × 10^9
Q1: 2.110007 × 10^9
Median: 2.2401006 × 10^9
Q3: 2.360027 × 10^9
95-th percentile: 3.0700466 × 10^9
Maximum: 3.9700005 × 10^9
Range: 2.9700003 × 10^9
Interquartile range (IQR): 2.5002008 × 10^8

Descriptive statistics

Standard deviation: 4.9856451 × 10^8
Coefficient of variation (CV): 0.22489204
Kurtosis: 2.146233
Mean: 2.216906 × 10^9
Median Absolute Deviation (MAD): 1.1997115 × 10^8
Skewness: 0.047261728
Sum: 2.216906 × 10^13
Variance: 2.4856657 × 10^17
Monotonicity: Not monotonic
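Several of the statistics above are derived from others, which gives a quick consistency check. A small sketch using the rounded figures as printed, so agreement is only expected up to rounding (variable names are illustrative):

```python
import math

# Rounded figures copied from the 시작노드아이디 statistics.
minimum, maximum = 1.0000002e9, 3.9700005e9
q1, q3 = 2.110007e9, 2.360027e9
mean, std = 2.216906e9, 4.9856451e8
total, n = 2.216906e13, 10_000

value_range = maximum - minimum  # reported Range: 2.9700003e9
iqr = q3 - q1                    # reported IQR: 2.5002008e8
cv = std / mean                  # reported CV: 0.22489204
variance = std ** 2              # reported Variance: 2.4856657e17

# Compare each recomputed value with the reported one, tolerating rounding.
checks = {
    "range": math.isclose(value_range, 2.9700003e9, rel_tol=1e-5),
    "iqr": math.isclose(iqr, 2.5002008e8, rel_tol=1e-5),
    "cv": math.isclose(cv, 0.22489204, rel_tol=1e-5),
    "variance": math.isclose(variance, 2.4856657e17, rel_tol=1e-5),
    "mean": math.isclose(total / n, mean, rel_tol=1e-5),
}
print(checks)
```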
Histogram with fixed size bins (bins=50)
Most frequent values
Value                Count  Frequency (%)
2160033700           5      0.1%
2160023300           4      < 0.1%
1020011600           4      < 0.1%
1090005200           4      < 0.1%
2030015400           4      < 0.1%
2140093700           4      < 0.1%
2030016800           4      < 0.1%
2270006700           4      < 0.1%
2240075200           4      < 0.1%
2070034000           4      < 0.1%
Other values (8693)  9959   99.6%

Smallest values
Value                Count  Frequency (%)
1000000200           1      < 0.1%
1000000300           1      < 0.1%
1000000400           1      < 0.1%
1000000700           1      < 0.1%
1000001200           1      < 0.1%
1000001800           1      < 0.1%
1000002800           1      < 0.1%
1000004000           1      < 0.1%
1000005300           2      < 0.1%
1000005400           1      < 0.1%

Largest values
Value                Count  Frequency (%)
3970000500           1      < 0.1%
3970000400           1      < 0.1%
3970000100           1      < 0.1%
3950001400           1      < 0.1%
3950001200           1      < 0.1%
3950001001           1      < 0.1%
3940000900           1      < 0.1%
3930000101           1      < 0.1%
3920002400           1      < 0.1%
3920001700           1      < 0.1%

링크거리
Real number (ℝ)

Distinct: 9165
Distinct (%): 91.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 660.06878
Minimum: 0
Maximum: 59426.22
Zeros: 1
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 60.9475
Q1: 157.2325
Median: 294.06
Q3: 562.5775
95-th percentile: 2270.021
Maximum: 59426.22
Range: 59426.22
Interquartile range (IQR): 405.345

Descriptive statistics

Standard deviation: 1581.5981
Coefficient of variation (CV): 2.3961111
Kurtosis: 278.82767
Mean: 660.06878
Median Absolute Deviation (MAD): 167.955
Skewness: 12.091326
Sum: 6600687.8
Variance: 2501452.5
Monotonicity: Not monotonic
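The skewness and kurtosis above indicate a long right tail. One way to make that concrete is to compare the mean with the median and to compute an upper outlier fence. The sketch below uses Tukey's 1.5 × IQR rule, which is a common convention brought in here for illustration and not something the report itself applies.

```python
# Figures copied from the 링크거리 statistics.
mean, median = 660.06878, 294.06
q3, iqr = 562.5775, 405.345
p95 = 2270.021

# The mean sits well above the median when the right tail is heavy.
mean_to_median = mean / median

# Tukey's rule (an assumed convention, not part of the report):
# values above Q3 + 1.5 * IQR are treated as high outliers.
upper_fence = q3 + 1.5 * iqr

print(f"mean / median = {mean_to_median:.2f}")
print(f"upper fence   = {upper_fence:.3f}")
print(f"95th percentile above fence: {p95 > upper_fence}")
```

Because the 95th percentile already lies above this fence, at least 5% of the link distances would be flagged as outliers under that rule, which suggests a log scale may be more informative than the fixed-bin histogram.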
Histogram with fixed size bins (bins=50)
Most frequent values
Value                Count  Frequency (%)
292.59               4      < 0.1%
127.81               4      < 0.1%
247.92               4      < 0.1%
219.59               4      < 0.1%
115.73               3      < 0.1%
98.49                3      < 0.1%
162.35               3      < 0.1%
45.14                3      < 0.1%
269.28               3      < 0.1%
122.27               3      < 0.1%
Other values (9155)  9966   99.7%

Smallest values
Value                Count  Frequency (%)
0.0                  1      < 0.1%
8.79                 1      < 0.1%
13.15                1      < 0.1%
13.39                1      < 0.1%
13.52                1      < 0.1%
15.62                1      < 0.1%
16.42                1      < 0.1%
16.63                1      < 0.1%
17.56                1      < 0.1%
17.9                 1      < 0.1%

Largest values
Value                Count  Frequency (%)
59426.22             1      < 0.1%
36610.3              1      < 0.1%
30849.29             1      < 0.1%
30593.61             1      < 0.1%
23813.07             1      < 0.1%
23751.76             1      < 0.1%
19771.01             1      < 0.1%
19365.02             1      < 0.1%
18742.95             1      < 0.1%
17733.28             1      < 0.1%

링크아이디
Real number (ℝ)

HIGH CORRELATION 

Distinct: 9770
Distinct (%): 97.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.2167236 × 10^9
Minimum: 1.0000056 × 10^9
Maximum: 3.9700006 × 10^9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1.0000056 × 10^9
5-th percentile: 1.160004 × 10^9
Q1: 2.1100242 × 10^9
Median: 2.2402718 × 10^9
Q3: 2.3600434 × 10^9
95-th percentile: 3.0701314 × 10^9
Maximum: 3.9700006 × 10^9
Range: 2.969995 × 10^9
Interquartile range (IQR): 2.5001922 × 10^8

Descriptive statistics

Standard deviation: 4.982746 × 10^8
Coefficient of variation (CV): 0.22477976
Kurtosis: 2.1316995
Mean: 2.2167236 × 10^9
Median Absolute Deviation (MAD): 1.198846 × 10^8
Skewness: 0.04153746
Sum: 2.2167236 × 10^13
Variance: 2.4827758 × 10^17
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Most frequent values
Value                Count  Frequency (%)
2240019000           2      < 0.1%
1630017700           2      < 0.1%
2370038400           2      < 0.1%
2400162400           2      < 0.1%
2370048300           2      < 0.1%
2030008000           2      < 0.1%
2080014400           2      < 0.1%
2330145000           2      < 0.1%
2370178900           2      < 0.1%
2340101800           2      < 0.1%
Other values (9760)  9980   99.8%

Smallest values
Value                Count  Frequency (%)
1000005600           1      < 0.1%
1000006800           1      < 0.1%
1000007600           1      < 0.1%
1000007700           1      < 0.1%
1000008500           1      < 0.1%
1000010000           1      < 0.1%
1000010600           1      < 0.1%
1000011700           1      < 0.1%
1000013300           2      < 0.1%
1000013700           1      < 0.1%

Largest values
Value                Count  Frequency (%)
3970000600           1      < 0.1%
3970000300           1      < 0.1%
3970000200           1      < 0.1%
3950002600           1      < 0.1%
3950000201           1      < 0.1%
3930000201           1      < 0.1%
3920003100           1      < 0.1%
3920002200           1      < 0.1%
3910004700           1      < 0.1%
3910004500           1      < 0.1%

도로명
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 10000
Missing (%): 100.0%
Memory size: 166.0 KiB

종점노드아이디
Real number (ℝ)

HIGH CORRELATION 

Distinct: 8673
Distinct (%): 86.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.2158872 × 10^9
Minimum: 1.0000009 × 10^9
Maximum: 3.970001 × 10^9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1.0000009 × 10^9
5-th percentile: 1.1600016 × 10^9
Q1: 2.1100111 × 10^9
Median: 2.240101 × 10^9
Q3: 2.3600178 × 10^9
95-th percentile: 3.0700421 × 10^9
Maximum: 3.970001 × 10^9
Range: 2.9700001 × 10^9
Interquartile range (IQR): 2.5000672 × 10^8

Descriptive statistics

Standard deviation: 4.9903793 × 10^8
Coefficient of variation (CV): 0.22520908
Kurtosis: 2.1303501
Mean: 2.2158872 × 10^9
Median Absolute Deviation (MAD): 1.199585 × 10^8
Skewness: 0.040229969
Sum: 2.2158872 × 10^13
Variance: 2.4903885 × 10^17
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Most frequent values
Value                Count  Frequency (%)
2310018100           5      0.1%
2230017200           4      < 0.1%
2050011700           3      < 0.1%
2140069900           3      < 0.1%
2330004500           3      < 0.1%
1110003100           3      < 0.1%
2160025300           3      < 0.1%
1630004000           3      < 0.1%
1650011400           3      < 0.1%
2720007400           3      < 0.1%
Other values (8663)  9967   99.7%

Smallest values
Value                Count  Frequency (%)
1000000900           1      < 0.1%
1000001500           1      < 0.1%
1000002300           1      < 0.1%
1000002400           2      < 0.1%
1000002500           1      < 0.1%
1000002700           2      < 0.1%
1000003900           1      < 0.1%
1000004100           1      < 0.1%
1000004400           1      < 0.1%
1000004600           1      < 0.1%

Largest values
Value                Count  Frequency (%)
3970001000           1      < 0.1%
3970000900           1      < 0.1%
3970000400           1      < 0.1%
3950003800           1      < 0.1%
3950002500           1      < 0.1%
3950001101           1      < 0.1%
3940001000           1      < 0.1%
3940000300           1      < 0.1%
3930000201           1      < 0.1%
3920001400           1      < 0.1%

Interactions

[Figures: 16 pairwise interaction plots (Matplotlib v3.7.2 SVG), not reproduced in this text version]

Correlations

                 시작노드아이디  링크거리  링크아이디  종점노드아이디
시작노드아이디    1.000           0.266     1.000       1.000
링크거리          0.266           1.000     0.265       0.230
링크아이디        1.000           0.265     1.000       1.000
종점노드아이디    1.000           0.230     1.000       1.000

                 시작노드아이디  링크거리  링크아이디  종점노드아이디
시작노드아이디    1.000           0.194     0.985       0.981
링크거리          0.194           1.000     0.194       0.192
링크아이디        0.985           0.194     1.000       0.985
종점노드아이디    0.981           0.192     0.985       1.000
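The high-correlation alerts in the overview can be cross-checked against the first correlation matrix above. The sketch below lists the column pairs whose coefficient reaches a cutoff. The cutoff of 0.5 is an assumption made for illustration, since the report does not state the threshold it used.

```python
from itertools import combinations

columns = ["시작노드아이디", "링크거리", "링크아이디", "종점노드아이디"]

# First correlation matrix from the report, row by row.
matrix = [
    [1.000, 0.266, 1.000, 1.000],
    [0.266, 1.000, 0.265, 0.230],
    [1.000, 0.265, 1.000, 1.000],
    [1.000, 0.230, 1.000, 1.000],
]

THRESHOLD = 0.5  # assumed cutoff, not stated in the report

# Visit each unordered pair of distinct columns once.
high_pairs = [
    (columns[i], columns[j], matrix[i][j])
    for i, j in combinations(range(len(columns)), 2)
    if abs(matrix[i][j]) >= THRESHOLD
]

for a, b, r in high_pairs:
    print(f"{a} ~ {b}: {r:.3f}")
```

With this cutoff the three identifier columns are flagged pairwise and 링크거리 is not flagged with any of them, which matches the alerts: each identifier column is reported as correlated with two others.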

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out visual patterns in data completion.

Sample

(index)  시작노드아이디  링크거리  링크아이디   도로명  종점노드아이디
29864    2340048500      5869.69   2300096300   <NA>    2300008700
54122    2330024200      950.16    2330416000   <NA>    2330163500
9913     1160011200      438.45    1160027700   <NA>    1160012300
44864    3640008200      826.28    3640019400   <NA>    3640010100
91482    2160007500      168.62    2160053900   <NA>    2160006000
90221    2310027300      423.26    2310096600   <NA>    2310027200
33642    2220035000      348.86    2220101600   <NA>    2220099900
60888    2350007600      127.86    2350052200   <NA>    2350007700
92304    2520041600      136.18    2520013100   <NA>    2520041700
95043    3700000301      42.1      3700005100   <NA>    3700002300

(index)  시작노드아이디  링크거리  링크아이디   도로명  종점노드아이디
96879    1180009800      406.97    1180007200   <NA>    1180008600
39931    1070007600      438.1     1070020600   <NA>    1070007500
15743    2250009200      357.46    2250035600   <NA>    2250012500
71103    2330055700      592.98    2330058400   <NA>    2330055800
42402    2070013700      30.9      2070023800   <NA>    2070047100
43443    1610002000      303.19    1610004400   <NA>    1610001900
81770    1170001000      231.28    1170007200   <NA>    1170001100
35354    2310025600      550.24    2310094400   <NA>    2310070400
60223    2230037800      158.5     2230103200   <NA>    2230028300
52885    2180050600      170.65    2180135300   <NA>    2180055500

Duplicate rows

Most frequently occurring

(index)  시작노드아이디  링크거리  링크아이디   종점노드아이디  # duplicates
0        1010002800      313.24    1010006800   1010002900      2
1        1010003800      241.25    1010005600   1010003900      2
2        1010004000      291.93    1000013300   1000002400      2
3        1020011600      374.61    1020004000   1020012200      2
4        1090005200      595.61    1090005700   1090001800      2
5        1090005900      88.6      1090010500   1090006100      2
6        1100000200      264.22    1100013500   1100003700      2
7        1110004200      296.55    1110009500   1110003100      2
8        1180010900      159.76    1180032800   1180011000      2
9        1180014900      503.02    1180010300   1180013700      2
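The "# duplicates" column counts how many times an identical row occurs. A minimal sketch of that kind of count using only the standard library is shown below. The row list is illustrative: it repeats two rows from the table above and adds one unrelated row, and it leaves out 도로명 as the table does.

```python
from collections import Counter

# Illustrative rows: (시작노드아이디, 링크거리, 링크아이디, 종점노드아이디).
rows = [
    (1010002800, 313.24, 1010006800, 1010002900),
    (1010003800, 241.25, 1010005600, 1010003900),
    (1010002800, 313.24, 1010006800, 1010002900),  # repeat of the first row
    (1180009800, 406.97, 1180007200, 1180008600),  # occurs only once
    (1010003800, 241.25, 1010005600, 1010003900),  # repeat of the second row
]

# Count identical rows and keep those that occur more than once.
counts = Counter(rows)
duplicates = {row: n for row, n in counts.items() if n > 1}

for row, n in duplicates.items():
    print(row, n)
```

How the report arrives at its headline figure of 230 duplicate rows is not stated, so this sketch only illustrates the per-row counts shown in the table.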