Overview

Dataset statistics

Number of variables: 4
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 419.9 KiB
Average record size in memory: 43.0 B
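
The average record size can be checked against the total memory figure. A minimal sketch, assuming 1 KiB = 1024 bytes; the variable names are illustrative and not part of the report:

```python
# Figures copied from the dataset statistics.
total_size_kib = 419.9
n_observations = 10_000

# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Average record size: {avg_record_bytes:.1f} B")
```

Under that assumption the result rounds to the reported 43.0 B.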

Variable types

Numeric: 3
Categorical: 1

Dataset

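Description: Violation statistics table data from the Bus Information System of Gumi City, Gyeongsangbuk-do. It provides the bus terminal number (버스단말기번호), violation type ID (위반유형ID), and count (건수).
Author: Gumi City, Gyeongsangbuk-do (경상북도 구미시)
URL: https://www.data.go.kr/data/15049479/fileData.do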

Alerts

High correlation: 위반유형 is highly overall correlated with 건수
High correlation: 건수 is highly overall correlated with 위반유형
Zeros: 위반유형 has 1423 (14.2%) zeros
Zeros: 건수 has 8409 (84.1%) zeros
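
The zero percentages in the alerts are the zero counts divided by the 10000 observations. A small sketch of that arithmetic; the helper name `zero_share` is hypothetical:

```python
n_observations = 10_000

# Zero counts as reported in the alerts.
zero_counts = {"위반유형": 1423, "건수": 8409}


def zero_share(zeros: int, total: int) -> float:
    """Return the percentage of records whose value is zero."""
    return 100 * zeros / total


for column, zeros in zero_counts.items():
    print(f"{column}: {zero_share(zeros, n_observations):.1f}% zeros")
```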

Reproduction

Analysis started: 2023-12-12 18:06:32.735711
Analysis finished: 2023-12-12 18:06:34.433461
Duration: 1.7 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
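
The reported duration is the difference between the two timestamps. A minimal sketch using only the standard library:

```python
from datetime import datetime

# Timestamps copied from the reproduction details.
started = datetime.fromisoformat("2023-12-12 18:06:32.735711")
finished = datetime.fromisoformat("2023-12-12 18:06:34.433461")

duration_s = (finished - started).total_seconds()
print(f"Duration: {duration_s:.1f} seconds")
```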

Variables

버스단말기번호 (bus terminal number)
Real number (ℝ)

Distinct: 236
Distinct (%): 2.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5622.7185
Minimum: 3333
Maximum: 9148
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 3333
5th percentile: 5010
Q1: 5058
Median: 5511
Q3: 5569
95th percentile: 9109
Maximum: 9148
Range: 5815
Interquartile range (IQR): 511
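
The range and IQR follow directly from the other quantile statistics. A short sketch of the two subtractions, with illustrative variable names:

```python
# Quantile statistics for 버스단말기번호, copied from the report.
minimum, q1, q3, maximum = 3333, 5058, 5569, 9148

value_range = maximum - minimum  # spread between the extremes
iqr = q3 - q1                    # spread of the middle 50% of records

print(f"Range: {value_range}, IQR: {iqr}")
```

The IQR of 511 is small next to the range of 5815, which fits the reported positive skewness: most terminal numbers sit between about 5000 and 5600, with a smaller group near 9100.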

Descriptive statistics

Standard deviation: 1103.8546
Coefficient of variation (CV): 0.19632044
Kurtosis: 5.6983611
Mean: 5622.7185
Median Absolute Deviation (MAD): 414
Skewness: 2.6070917
Sum: 56227185
Variance: 1218494.9
Monotonicity: Not monotonic
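
Several of these figures are tied together by simple identities: mean = sum / n, CV = standard deviation / mean, and variance = standard deviation squared. A sketch that recomputes them from the reported values (small differences are expected because the inputs are rounded):

```python
n_observations = 10_000
total = 56_227_185   # reported Sum
std = 1103.8546      # reported standard deviation (rounded)

mean = total / n_observations
cv = std / mean
variance = std ** 2

print(f"mean={mean:.4f}  cv={cv:.8f}  variance={variance:.1f}")
```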
[Figure: histogram with fixed-size bins (bins=50)]
Common values
Value  Count  Frequency (%)
5002   59     0.6%
5538   58     0.6%
5090   55     0.5%
5558   54     0.5%
9102   53     0.5%
5575   53     0.5%
5591   53     0.5%
9108   52     0.5%
5095   52     0.5%
5584   51     0.5%
Other values (226)  9460  94.6%
Smallest values
Value  Count  Frequency (%)
3333   47     0.5%
5001   44     0.4%
5002   59     0.6%
5003   46     0.5%
5004   45     0.4%
5005   51     0.5%
5006   39     0.4%
5007   49     0.5%
5008   43     0.4%
5009   47     0.5%
Largest values
Value  Count  Frequency (%)
9148   36     0.4%
9135   28     0.3%
9133   46     0.5%
9125   40     0.4%
9119   51     0.5%
9117   38     0.4%
9115   49     0.5%
9114   41     0.4%
9113   35     0.4%
9112   38     0.4%

년월일 (date: year, month, day)
Categorical

Distinct: 30
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
2022-08-22: 374
2022-08-12: 368
2022-08-19: 359
2022-08-08: 354
2022-08-20: 354
Other values (25): 8191

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2022-08-17
2nd row: 2022-08-17
3rd row: 2022-08-11
4th row: 2022-08-13
5th row: 2022-07-24

Common Values

Value       Count  Frequency (%)
2022-08-22  374    3.7%
2022-08-12  368    3.7%
2022-08-19  359    3.6%
2022-08-08  354    3.5%
2022-08-20  354    3.5%
2022-08-11  353    3.5%
2022-07-28  348    3.5%
2022-08-17  343    3.4%
2022-07-31  342    3.4%
2022-07-26  340    3.4%
Other values (20)  6465  64.6%

Length

[Figure: histogram of category lengths]

위반유형 (violation type)
Real number (ℝ)

Alerts: High correlation, Zeros

Distinct: 7
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.0047
Minimum: 0
Maximum: 6
Zeros: 1423
Zeros (%): 14.2%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 1
Median: 3
Q3: 5
95th percentile: 6
Maximum: 6
Range: 6
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 2.0023683
Coefficient of variation (CV): 0.66641206
Kurtosis: -1.2527704
Mean: 3.0047
Median Absolute Deviation (MAD): 2
Skewness: -0.001075013
Sum: 30047
Variance: 4.0094789
Monotonicity: Not monotonic
[Figure: histogram with fixed-size bins (bins=7)]
Common values
Value  Count  Frequency (%)
6      1444   14.4%
1      1436   14.4%
4      1428   14.3%
3      1425   14.2%
5      1424   14.2%
0      1423   14.2%
2      1420   14.2%
Smallest values
Value  Count  Frequency (%)
0      1423   14.2%
1      1436   14.4%
2      1420   14.2%
3      1425   14.2%
4      1428   14.3%
5      1424   14.2%
6      1444   14.4%
Largest values
Value  Count  Frequency (%)
6      1444   14.4%
5      1424   14.2%
4      1428   14.3%
3      1425   14.2%
2      1420   14.2%
1      1436   14.4%
0      1423   14.2%
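
Because 위반유형 takes only seven values, its summary statistics can be rebuilt from the frequency table alone. A sketch under the assumption that the report uses the sample variance (n - 1 denominator), which is what reproduces the reported 4.0094789:

```python
import math

# Value -> count, copied from the frequency table for 위반유형.
frequencies = {0: 1423, 1: 1436, 2: 1420, 3: 1425, 4: 1428, 5: 1424, 6: 1444}

n = sum(frequencies.values())
total = sum(value * count for value, count in frequencies.items())
mean = total / n

# Assumption: sample variance with an n - 1 denominator.
sum_sq_dev = sum(count * (value - mean) ** 2 for value, count in frequencies.items())
variance = sum_sq_dev / (n - 1)
std = math.sqrt(variance)

print(f"n={n}  sum={total}  mean={mean:.4f}  variance={variance:.7f}  std={std:.7f}")
```

The counts are nearly equal across the seven values, which fits the near-zero skewness and the negative kurtosis reported above.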

건수 (count)
Real number (ℝ)

Alerts: High correlation, Zeros

Distinct: 234
Distinct (%): 2.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 8.1984
Minimum: 0
Maximum: 327
Zeros: 8409
Zeros (%): 84.1%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 0
Median: 0
Q3: 0
95th percentile: 15
Maximum: 327
Range: 327
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 37.973215
Coefficient of variation (CV): 4.6317836
Kurtosis: 30.043789
Mean: 8.1984
Median Absolute Deviation (MAD): 0
Skewness: 5.4527759
Sum: 81984
Variance: 1441.965
Monotonicity: Not monotonic
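
With 8409 of 10000 records equal to zero, the median and quartiles are all 0 and the mean of 8.1984 comes entirely from the remaining records. A rough sketch of what the reported totals imply for the non-zero records; the derived figures are not part of the report:

```python
n_observations = 10_000
n_zeros = 8409
total = 81_984  # reported Sum of 건수

n_nonzero = n_observations - n_zeros
mean_all = total / n_observations
mean_nonzero = total / n_nonzero  # zeros add nothing to the sum

print(f"non-zero records: {n_nonzero}")
print(f"mean over all records: {mean_all:.4f}")
print(f"mean over non-zero records: {mean_nonzero:.2f}")
```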
[Figure: histogram with fixed-size bins (bins=50)]
Common values
Value  Count  Frequency (%)
0      8409   84.1%
5      127    1.3%
6      118    1.2%
4      101    1.0%
7      100    1.0%
3      86     0.9%
1      83     0.8%
8      80     0.8%
2      79     0.8%
9      68     0.7%
Other values (224)  749  7.5%
Smallest values
Value  Count  Frequency (%)
0      8409   84.1%
1      83     0.8%
2      79     0.8%
3      86     0.9%
4      101    1.0%
5      127    1.3%
6      118    1.2%
7      100    1.0%
8      80     0.8%
9      68     0.7%
Largest values
Value  Count  Frequency (%)
327    1      < 0.1%
325    1      < 0.1%
312    2      < 0.1%
310    1      < 0.1%
309    2      < 0.1%
307    1      < 0.1%
300    1      < 0.1%
298    1      < 0.1%
297    1      < 0.1%
296    1      < 0.1%

Interactions

[Figures: nine interaction plots; images not included]

Correlations

Variable        버스단말기번호  년월일  위반유형  건수
버스단말기번호  1.000           0.000   0.000     0.079
년월일          0.000           1.000   0.000     0.075
위반유형        0.000           0.000   1.000     0.375
건수            0.079           0.075   0.375     1.000

Variable        버스단말기번호  위반유형  건수    년월일
버스단말기번호  1.000           -0.008    -0.010  0.000
위반유형        -0.008          1.000     0.511   0.000
건수            -0.010          0.511     1.000   0.024
년월일          0.000           0.000     0.024   1.000

Missing values

[Figure: a simple visualization of nullity by column.]
[Figure: nullity matrix, a data-dense display that lets you quickly pick out patterns in data completion.]

Sample

Index  버스단말기번호  년월일      위반유형  건수
16844  5006            2022-08-17  4         0
13784  5552            2022-08-17  6         16
64     5520            2022-08-11  4         0
29713  5508            2022-08-13  3         0
44774  5065            2022-07-24  0         0
14374  5047            2022-08-15  4         0
40015  5604            2022-07-27  1         0
1480   5093            2022-08-10  6         2
11059  5046            2022-08-08  0         0
20957  5082            2022-08-19  4         0

Index  버스단말기번호  년월일      위반유형  건수
12959  5042            2022-08-17  0         0
28186  5526            2022-08-12  2         0
3056   5083            2022-08-11  0         0
18745  5593            2022-08-18  1         0
1622   5507            2022-08-10  1         0
4399   5509            2022-07-28  6         6
26475  5517            2022-08-09  6         6
27785  5076            2022-08-12  0         0
17933  5044            2022-08-20  6         5
41975  5569            2022-08-02  3         0
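
The 20 sample rows allow a rough look at the relationship between 위반유형 and 건수 that the alerts flag. The sketch below computes a plain Pearson coefficient over those rows. It rests on several assumptions: the split of each sample row into columns is inferred from the value ranges reported earlier, the report does not state which coefficient it uses, and 20 rows are far too few to reproduce the reported values. The `pearson` helper is hypothetical.

```python
import math

# (위반유형, 건수) pairs read from the 20 sample rows; the column split is an assumption.
sample_pairs = [
    (4, 0), (6, 16), (4, 0), (3, 0), (0, 0), (4, 0), (1, 0), (6, 2), (0, 0), (4, 0),
    (0, 0), (2, 0), (0, 0), (1, 0), (1, 0), (6, 6), (6, 6), (0, 0), (6, 5), (3, 0),
]


def pearson(pairs):
    """Pearson correlation coefficient for a list of (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    return cov / math.sqrt(var_x * var_y)


r = pearson(sample_pairs)
print(f"Pearson r on the 20 sample rows: {r:.3f}")
```

In these sample rows every non-zero 건수 occurs with 위반유형 = 6, so the coefficient is positive. Whether that pattern holds across all 10000 records cannot be confirmed from the sample alone.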