Overview

Dataset statistics

Number of variables: 8
Number of observations: 247
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 16.5 KiB
Average record size in memory: 68.5 B

Variable types

Numeric: 3
Text: 1
Categorical: 3
Boolean: 1

Dataset

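Description: Status of postal-code registrations for business sites and business operators, by region, in the Waste Disposal Charge System (폐기물처분부담금시스템) operated by the Korea Environment Corporation (한국환경공단).
URL: https://www.data.go.kr/data/15069390/fileData.do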

Alerts

사용여부 has constant value "True" (Constant)
기타 is highly overall correlated with 레벨 (High correlation)
레벨 is highly overall correlated with 기타 (High correlation)
시군구코드 is highly overall correlated with 시도코드 and 1 other field (High correlation)
시도코드 is highly overall correlated with 시군구코드 and 1 other field (High correlation)
시도명 is highly overall correlated with 시군구코드 and 1 other field (High correlation)
레벨 is highly imbalanced (63.8%) (Imbalance)
기타 is highly imbalanced (63.8%) (Imbalance)
시군구코드 has unique values (Unique)
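
The 63.8% imbalance figure for 레벨 and 기타 can be reproduced from their class counts (230 and 17). The sketch below assumes the score is one minus the Shannon entropy normalized by its maximum; that definition is an assumption here, chosen because it matches the reported value. The function name `imbalance_score` is illustrative only.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / H_max for a list of class counts.

    0.0 means perfectly balanced classes; values near 1.0 mean one class
    dominates. With fewer than two classes this sketch returns 0.0.
    """
    if len(counts) < 2:
        return 0.0
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in probs)
    return 1.0 - entropy / math.log2(len(counts))

# Class counts for 레벨 (and 기타), taken from the Common Values tables.
score = imbalance_score([230, 17])
print(f"{score:.1%}")  # 63.8%
```

A 50/50 split would score 0%, so 63.8% reflects how far the 230/17 split is from balanced.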

Reproduction

Analysis started: 2023-12-12 14:46:32.398280
Analysis finished: 2023-12-12 14:46:33.897336
Duration: 1.5 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
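
As a quick consistency check, the reported duration can be recomputed from the two timestamps above. This is a minimal sketch using only the standard library; the format string is an assumption based on how the timestamps are printed.

```python
from datetime import datetime

# Timestamp layout assumed from the "Analysis started/finished" lines.
FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2023-12-12 14:46:32.398280", FMT)
finished = datetime.strptime("2023-12-12 14:46:33.897336", FMT)

elapsed = (finished - started).total_seconds()
print(f"{elapsed:.6f} s")  # 1.499056 s, which rounds to the reported 1.5 seconds
```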

Variables

시군구코드
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 247
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.8250947 × 10^9
Minimum: 1.1 × 10^9
Maximum: 5.183 × 10^9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.3 KiB

Quantile statistics

Minimum: 1.1 × 10^9
5-th percentile: 1.1389 × 10^9
Q1: 2.8485 × 10^9
Median: 4.3745 × 10^9
Q3: 4.714 × 10^9
95-th percentile: 5.1204 × 10^9
Maximum: 5.183 × 10^9
Range: 4.083 × 10^9
Interquartile range (IQR): 1.8655 × 10^9

Descriptive statistics

Standard deviation: 1.212776 × 10^9
Coefficient of variation (CV): 0.31705778
Kurtosis: -0.065304792
Mean: 3.8250947 × 10^9
Median Absolute Deviation (MAD): 4.375 × 10^8
Skewness: -1.0361967
Sum: 9.447984 × 10^11
Variance: 1.4708257 × 10^18
Monotonicity: Not monotonic
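
Several of the 시군구코드 statistics above follow from one another. The sketch below recomputes the mean, coefficient of variation, variance, range, and IQR from the reported sum, standard deviation, quartiles, and extremes. All inputs are copied from this section, and the variable names are illustrative. Because the inputs are rounded, the results agree with the report only to the printed precision.

```python
# Values copied from the 시군구코드 statistics in this section.
n = 247                      # number of observations
total = 9.447984e11          # Sum
std = 1.212776e9             # Standard deviation
q1, q3 = 2.8485e9, 4.714e9   # Q1 and Q3
minimum, maximum = 1.1e9, 5.183e9

mean = total / n                 # Mean = Sum / n
cv = std / mean                  # Coefficient of variation = std / mean
variance = std ** 2              # Variance = std squared
iqr = q3 - q1                    # Interquartile range
value_range = maximum - minimum  # Range

print(f"mean={mean:.7e}")
print(f"cv={cv:.6f}")
print(f"variance={variance:.5e}")
print(f"iqr={iqr:.4e}")
print(f"range={value_range:.3e}")
```

Note that these codes are identifiers rather than measurements, so statistics such as the mean or skewness describe the numbering scheme, not a physical quantity.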
Histogram with fixed size bins (bins=50)

Common Values

Value               Count  Frequency (%)
1100000000          1      0.4%
4680000000          1      0.4%
5176000000          1      0.4%
4380000000          1      0.4%
4477000000          1      0.4%
4575000000          1      0.4%
4679000000          1      0.4%
4772000000          1      0.4%
4874000000          1      0.4%
1138000000          1      0.4%
Other values (237)  237    96.0%

Minimum 10 values

Value               Count  Frequency (%)
1100000000          1      0.4%
1111000000          1      0.4%
1114000000          1      0.4%
1117000000          1      0.4%
1120000000          1      0.4%
1121500000          1      0.4%
1123000000          1      0.4%
1126000000          1      0.4%
1129000000          1      0.4%
1130500000          1      0.4%

Maximum 10 values

Value               Count  Frequency (%)
5183000000          1      0.4%
5182000000          1      0.4%
5181000000          1      0.4%
5180000000          1      0.4%
5179000000          1      0.4%
5178000000          1      0.4%
5177000000          1      0.4%
5176000000          1      0.4%
5175000000          1      0.4%
5173000000          1      0.4%

시군구명
Text

Distinct: 223
Distinct (%): 90.3%
Missing: 0
Missing (%): 0.0%
Memory size: 2.1 KiB

Length

Max length: 7
Median length: 3
Mean length: 3.0850202
Min length: 2

Characters and Unicode

Total characters: 762
Distinct characters: 139
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 214
Unique (%): 86.6%

Sample

1st row: 서울특별시
2nd row: 부산광역시
3rd row: 대구광역시
4th row: 인천광역시
5th row: 광주광역시
Value               Count  Frequency (%)
동구                6      2.4%
중구                6      2.4%
서구                5      2.0%
남구                4      1.6%
북구                4      1.6%
군위군              2      0.8%
세종특별자치시      2      0.8%
고성군              2      0.8%
강서구              2      0.8%
서대문구            1      0.4%
Other values (213)  213    86.2%

Most occurring characters

Rank                Count  Frequency (%)
1                   87     11.4%
2                   87     11.4%
3                   75     9.8%
4                   23     3.0%
5                   22     2.9%
6                   18     2.4%
7                   18     2.4%
8                   18     2.4%
9                   17     2.2%
10                  15     2.0%
Other values (129)  382    50.1%

Most occurring categories

Value         Count  Frequency (%)
Other Letter  762    100.0%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  762    100.0%

Most occurring blocks

Value   Count  Frequency (%)
Hangul  762    100.0%

레벨
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Memory size: 2.1 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value  Count  Frequency (%)
1      230    93.1%
0      17     6.9%

Length

Histogram of lengths of the category

시도명
Categorical

HIGH CORRELATION 

Distinct: 17
Distinct (%): 6.9%
Missing: 0
Missing (%): 0.0%
Memory size: 2.1 KiB

Length

Max length: 7
Median length: 5
Mean length: 4.4939271
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 서울특별시
2nd row: 부산광역시
3rd row: 대구광역시
4th row: 인천광역시
5th row: 광주광역시

Common Values

Value             Count  Frequency (%)
경기도            32     13.0%
서울특별시        26     10.5%
경상북도          24     9.7%
전라남도          23     9.3%
강원특별자치도    19     7.7%
경상남도          19     7.7%
부산광역시        17     6.9%
충청남도          16     6.5%
전라북도          15     6.1%
충청북도          12     4.9%
Other values (7)  44     17.8%

Length

Histogram of lengths of the category

시도코드
Real number (ℝ)

HIGH CORRELATION 

Distinct: 17
Distinct (%): 6.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.7821862 × 10^9
Minimum: 1.1 × 10^9
Maximum: 5.1 × 10^9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.3 KiB

Quantile statistics

Minimum: 1.1 × 10^9
5-th percentile: 1.1 × 10^9
Q1: 2.8 × 10^9
Median: 4.3 × 10^9
Q3: 4.7 × 10^9
95-th percentile: 5.1 × 10^9
Maximum: 5.1 × 10^9
Range: 4 × 10^9
Interquartile range (IQR): 1.9 × 10^9

Descriptive statistics

Standard deviation: 1.205242 × 10^9
Coefficient of variation (CV): 0.31866278
Kurtosis: -0.018415962
Mean: 3.7821862 × 10^9
Median Absolute Deviation (MAD): 5 × 10^8
Skewness: -1.0526831
Sum: 9.342 × 10^11
Variance: 1.4526082 × 10^18
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=17)

Common Values

Value             Count  Frequency (%)
4100000000        32     13.0%
1100000000        26     10.5%
4700000000        24     9.7%
4600000000        23     9.3%
4800000000        19     7.7%
5100000000        19     7.7%
2600000000        17     6.9%
4400000000        16     6.5%
4500000000        15     6.1%
4300000000        12     4.9%
Other values (7)  44     17.8%

Minimum 10 values

Value             Count  Frequency (%)
1100000000        26     10.5%
2600000000        17     6.9%
2700000000        10     4.0%
2800000000        11     4.5%
2900000000        6      2.4%
3000000000        6      2.4%
3100000000        6      2.4%
3600000000        2      0.8%
4100000000        32     13.0%
4300000000        12     4.9%

Maximum 10 values

Value             Count  Frequency (%)
5100000000        19     7.7%
5000000000        3      1.2%
4800000000        19     7.7%
4700000000        24     9.7%
4600000000        23     9.3%
4500000000        15     6.1%
4400000000        16     6.5%
4300000000        12     4.9%
4100000000        32     13.0%
3600000000        2      0.8%

기타
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Memory size: 2.1 KiB

Length

Max length: 4
Median length: 3
Mean length: 3.0688259
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 광역시도
2nd row: 광역시도
3rd row: 광역시도
4th row: 광역시도
5th row: 광역시도

Common Values

Value     Count  Frequency (%)
지자체    230    93.1%
광역시도  17     6.9%

Length

Histogram of lengths of the category

정렬
Real number (ℝ)

Distinct: 32
Distinct (%): 13.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 10.182186
Minimum: 1
Maximum: 32
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.3 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 4
Median: 9
Q3: 15
95-th percentile: 23.7
Maximum: 32
Range: 31
Interquartile range (IQR): 11

Descriptive statistics

Standard deviation: 7.1427949
Coefficient of variation (CV): 0.70149914
Kurtosis: -0.072259808
Mean: 10.182186
Median Absolute Deviation (MAD): 5
Skewness: 0.75760583
Sum: 2515
Variance: 51.019519
Monotonicity: Increasing
Histogram with fixed size bins (bins=32)

Common Values

Value              Count  Frequency (%)
1                  17     6.9%
2                  17     6.9%
3                  16     6.5%
4                  15     6.1%
5                  15     6.1%
6                  15     6.1%
7                  12     4.9%
8                  12     4.9%
9                  12     4.9%
10                 12     4.9%
Other values (22)  104    42.1%

Minimum 10 values

Value              Count  Frequency (%)
1                  17     6.9%
2                  17     6.9%
3                  16     6.5%
4                  15     6.1%
5                  15     6.1%
6                  15     6.1%
7                  12     4.9%
8                  12     4.9%
9                  12     4.9%
10                 12     4.9%

Maximum 10 values

Value              Count  Frequency (%)
32                 1      0.4%
31                 1      0.4%
30                 1      0.4%
29                 1      0.4%
28                 1      0.4%
27                 1      0.4%
26                 2      0.8%
25                 2      0.8%
24                 3      1.2%
23                 4      1.6%

사용여부
Boolean

CONSTANT 

Distinct: 1
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 379.0 B

Value  Count  Frequency (%)
True   247    100.0%

Interactions

Nine pairwise interaction plots (images not preserved).

Correlations

            시군구코드  레벨   시도명  시도코드  기타   정렬
시군구코드  1.000       0.166  0.978   0.960     0.166  0.326
레벨        0.166       1.000  0.000   0.172     0.999  0.545
시도명      0.978       0.000  1.000   1.000     0.000  0.085
시도코드    0.960       0.172  1.000   1.000     0.172  0.316
기타        0.166       0.999  0.000   0.172     1.000  0.545
정렬        0.326       0.545  0.085   0.316     0.545  1.000

        시도명  기타   레벨
시도명  1.000   0.000  0.000
기타    0.000   1.000  0.968
레벨    0.000   0.968  1.000

            시군구코드  시도코드  정렬   레벨   시도명  기타
시군구코드  1.000       0.996     0.120  0.083  0.880   0.083
시도코드    0.996       1.000     0.051  0.112  0.979   0.112
정렬        0.120       0.051     1.000  0.414  0.028   0.414
레벨        0.083       0.112     0.414  1.000  0.000   0.968
시도명      0.880       0.979     0.028  0.000  1.000   0.000
기타        0.083       0.112     0.414  0.968  0.000   1.000
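
The 0.968 reported between 기타 and 레벨 may look low, given that the sample rows suggest the two columns encode the same split (레벨 0 with 광역시도, 레벨 1 with 지자체). One plausible explanation, offered here as an assumption because the matrix labels did not survive, is that the value is a bias-corrected Cramér's V computed from a Yates-corrected chi-squared statistic. The sketch below applies that formula to the 2x2 table [[230, 0], [0, 17]]. The function name and the assumption that the two columns coincide on every row are illustrative.

```python
import math

def cramers_v_2x2(a, b, c, d):
    """Bias-corrected Cramér's V for the 2x2 table [[a, b], [c, d]].

    Uses a Yates continuity-corrected chi-squared statistic. Assumes every
    row and column total is non-zero.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    # Yates correction: shrink |ad - bc| by n/2 before squaring.
    diff = max(abs(a * d - b * c) - n / 2, 0.0)
    chi2 = n * diff ** 2 / (row1 * row2 * col1 * col2)
    phi2 = chi2 / n
    r = k = 2  # number of rows and columns
    # Bias correction of phi squared and of the table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    return math.sqrt(phi2_corr / min(k_corr - 1, r_corr - 1))

# Assumed cross-tabulation: 230 rows (레벨 1, 지자체), 17 rows (레벨 0, 광역시도).
v = cramers_v_2x2(230, 0, 0, 17)
print(round(v, 3))  # 0.968
```

Under these assumptions the continuity correction alone accounts for the gap between 0.968 and 1.0, so the figure is consistent with the two columns being redundant.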

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

Index  시군구코드  시군구명        레벨  시도명          시도코드    기타      정렬  사용여부
0      1100000000  서울특별시      0     서울특별시      1100000000  광역시도  1     Y
1      2600000000  부산광역시      0     부산광역시      2600000000  광역시도  1     Y
2      2700000000  대구광역시      0     대구광역시      2700000000  광역시도  1     Y
3      2800000000  인천광역시      0     인천광역시      2800000000  광역시도  1     Y
4      2900000000  광주광역시      0     광주광역시      2900000000  광역시도  1     Y
5      3000000000  대전광역시      0     대전광역시      3000000000  광역시도  1     Y
6      3100000000  울산광역시      0     울산광역시      3100000000  광역시도  1     Y
7      3600000000  세종특별자치시  0     세종특별자치시  3600000000  광역시도  1     Y
8      4100000000  경기도          0     경기도          4100000000  광역시도  1     Y
9      5100000000  강원특별자치도  0     강원특별자치도  5100000000  광역시도  1     Y

Last rows

Index  시군구코드  시군구명  레벨  시도명      시도코드    기타    정렬  사용여부
237    1171000000  송파구    1     서울특별시  1100000000  지자체  25    Y
238    4159000000  화성시    1     경기도      4100000000  지자체  25    Y
239    1174000000  강동구    1     서울특별시  1100000000  지자체  26    Y
240    4161000000  광주시    1     경기도      4100000000  지자체  26    Y
241    4163000000  양주시    1     경기도      4100000000  지자체  27    Y
242    4165000000  포천시    1     경기도      4100000000  지자체  28    Y
243    4167000000  여주시    1     경기도      4100000000  지자체  29    Y
244    4180000000  연천군    1     경기도      4100000000  지자체  30    Y
245    4182000000  가평군    1     경기도      4100000000  지자체  31    Y
246    4183000000  양평군    1     경기도      4100000000  지자체  32    Y
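
In the sample rows, each 시도코드 equals the 시군구코드 with everything after its two leading digits set to zero, which would explain the high correlation between the two columns. The sketch below checks that pattern against a few pairs copied from the sample. Whether it holds for all 247 rows is an assumption, and the helper name `sido_code` is illustrative.

```python
def sido_code(sigungu_code: int) -> int:
    """Keep the two leading digits of a 10-digit code and zero out the rest."""
    return sigungu_code // 100_000_000 * 100_000_000

# (시군구코드, 시도코드) pairs copied from the sample rows.
rows = [
    (1100000000, 1100000000),  # 서울특별시
    (1171000000, 1100000000),  # 송파구
    (4159000000, 4100000000),  # 화성시
    (4183000000, 4100000000),  # 양평군
    (5100000000, 5100000000),  # 강원특별자치도
]

matches = all(sido_code(code) == sido for code, sido in rows)
print(matches)  # True
```

If the pattern holds throughout, 시도코드 carries no information beyond 시군구코드, and both columns are arguably better handled as string identifiers than as real numbers.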