Overview

Dataset statistics

Number of variables: 9
Number of observations: 100
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 7.6 KiB
Average record size in memory: 78.3 B
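
The dataset statistics are related by simple arithmetic. As a rough check (a sketch only; the variable names are illustrative and not part of the report), the cell count and an approximate total size can be derived from the reported values:

```python
# Values copied from the dataset statistics in this report.
n_variables = 9
n_observations = 100
missing_cells = 0
avg_record_bytes = 78.3  # reported average record size, already rounded

# Every observation has one cell per variable.
total_cells = n_variables * n_observations

# Missing cells as a share of all cells.
missing_pct = 100 * missing_cells / total_cells

# Approximate total size: average record size times the number of records.
approx_total_kib = avg_record_bytes * n_observations / 1024

print(total_cells, f"{missing_pct:.1f}%", f"{approx_total_kib:.1f} KiB")
```

Because the inputs are rounded, the derived total is only approximate, but it agrees with the reported 7.6 KiB to one decimal place.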

Variable types

Categorical: 8
Numeric: 1

Dataset

Description: Sample data
Author: 지디에스컨설팅그룹
URL: https://www.bigdata-environment.kr/user/data_market/detail.do?id=e8039240-2dff-11ea-9713-eb3e5186fb38

Alerts

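오염원 지역 코드 (pollution source region code) has constant value "27000". [Constant]
오염원 경도 (pollution source longitude) is highly overall correlated with 오염원 고유번호 (pollution source ID) and 3 other fields. [High correlation]
오염원 고유번호 (pollution source ID) is highly overall correlated with 오염원 종류 명 (pollution source type name) and 3 other fields. [High correlation]
오염원 종류 명 (pollution source type name) is highly overall correlated with 오염원 고유번호 and 3 other fields. [High correlation]
오염원 상세 종류 명 (pollution source detailed type name) is highly overall correlated with 오염원 고유번호 and 3 other fields. [High correlation]
오염원 위도 (pollution source latitude) is highly overall correlated with 오염원 고유번호 and 3 other fields. [High correlation]
인구수 (population count) is highly overall correlated with 연령대 (age group). [High correlation]
연령대 (age group) is highly overall correlated with 인구수. [High correlation]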
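
A Constant alert corresponds to a column that holds a single distinct value. The following minimal sketch shows that check on a hypothetical two-column miniature rebuilt from the value counts in this report; `constant_columns` is an illustrative helper, not a ydata-profiling function.

```python
from typing import Dict, List, Sequence


def constant_columns(table: Dict[str, Sequence[object]]) -> List[str]:
    """Return the names of columns that contain exactly one distinct value."""
    return [name for name, values in table.items() if len(set(values)) == 1]


# Hypothetical miniature of the dataset: only the value counts come from the
# report; the row order is not preserved.
table = {
    "오염원 지역 코드": ["27000"] * 100,
    "성별": ["F"] * 50 + ["M"] * 50,
}

print(constant_columns(table))
```

Only 오염원 지역 코드 is flagged here, since 성별 has two distinct values.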

Reproduction

Analysis started: 2023-12-10 12:33:57.701455
Analysis finished: 2023-12-10 12:33:58.963659
Duration: 1.26 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
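
The reported duration is the difference between the two timestamps. A short sketch of that calculation, assuming both timestamps are in the same time zone (the format string is inferred from how the timestamps are printed):

```python
from datetime import datetime

# Timestamp layout inferred from the report: date, time, microseconds.
FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2023-12-10 12:33:57.701455", FMT)
finished = datetime.strptime("2023-12-10 12:33:58.963659", FMT)

# Elapsed wall-clock time in seconds.
duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```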

Variables

오염원 고유번호 (pollution source ID)
Categorical

HIGH CORRELATION

Distinct: 3
Distinct (%): 3.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B
Value counts: 1: 40; 2: 40; 3: 20

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 40 | 40.0% |
| 2 | 40 | 40.0% |
| 3 | 20 | 20.0% |
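
The Distinct and Common Values figures follow directly from counting values. A minimal sketch using only the standard library, with the column rebuilt from the reported counts (the original row order is not preserved, and the variable names are illustrative):

```python
from collections import Counter

# 오염원 고유번호 rebuilt from the reported counts.
ids = ["1"] * 40 + ["2"] * 40 + ["3"] * 20

counts = Counter(ids)
n = len(ids)

# Distinct (%) is the number of distinct values relative to the row count.
distinct_pct = 100 * len(counts) / n

# One (value, count, frequency %) row per distinct value, most frequent first.
rows = [(value, count, 100 * count / n) for value, count in counts.most_common()]

for value, count, pct in rows:
    print(f"{value}\t{count}\t{pct:.1f}%")
print(f"Distinct: {len(counts)} ({distinct_pct:.1f}%)")
```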

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values: 1 (40), 2 (40), 3 (20)]

오염원 지역 코드 (pollution source region code)
Categorical

CONSTANT

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B
Value counts: 27000: 100

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 27000
2nd row: 27000
3rd row: 27000
4th row: 27000
5th row: 27000

Common Values

| Value | Count | Frequency (%) |
|---|---|---|
| 27000 | 100 | 100.0% |

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values: 27000 (100)]

오염원 종류 명 (pollution source type name)
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B
Value counts: 주유소 (gas station): 60; 세차장 (car wash): 40

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 주유소
2nd row: 주유소
3rd row: 주유소
4th row: 주유소
5th row: 주유소

Common Values

| Value | Count | Frequency (%) |
|---|---|---|
| 주유소 (gas station) | 60 | 60.0% |
| 세차장 (car wash) | 40 | 40.0% |

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values: 주유소 (60), 세차장 (40)]

오염원 상세 종류 명 (pollution source detailed type name)
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B
Value counts: 주유소 (gas station): 60; 세차장 (car wash): 40

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 주유소
2nd row: 주유소
3rd row: 주유소
4th row: 주유소
5th row: 주유소

Common Values

| Value | Count | Frequency (%) |
|---|---|---|
| 주유소 (gas station) | 60 | 60.0% |
| 세차장 (car wash) | 40 | 40.0% |

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values: 주유소 (60), 세차장 (40)]

오염원 경도 (pollution source longitude)
Categorical

HIGH CORRELATION

Distinct: 3
Distinct (%): 3.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B
Value counts: 1095595.01174: 40; 1096153.37895: 40; 1096684.3279: 20

Length

Max length: 13
Median length: 13
Mean length: 12.8
Min length: 12

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1095595.01174
2nd row: 1095595.01174
3rd row: 1095595.01174
4th row: 1095595.01174
5th row: 1095595.01174

Common Values

| Value | Count | Frequency (%) |
|---|---|---|
| 1095595.01174 | 40 | 40.0% |
| 1096153.37895 | 40 | 40.0% |
| 1096684.3279 | 20 | 20.0% |
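
Because this variable is profiled as categorical, its length statistics describe the string form of each value. The sketch below recomputes them from the reported value counts, assuming the values are stored exactly as printed (the variable names are illustrative):

```python
from statistics import mean, median

# Value counts as reported for 오염원 경도, kept as strings.
value_counts = {
    "1095595.01174": 40,
    "1096153.37895": 40,
    "1096684.3279": 20,
}

# One string length per row: each value repeated by its count.
lengths = [len(value) for value, count in value_counts.items() for _ in range(count)]

stats = {
    "max": max(lengths),
    "median": median(lengths),
    "mean": mean(lengths),
    "min": min(lengths),
}
print(stats)
```

The 20 rows holding the 12-character value pull the mean down to 12.8 while the median stays at 13.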

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values: 1095595.01174 (40), 1096153.37895 (40), 1096684.3279 (20)]

오염원 위도 (pollution source latitude)
Categorical

HIGH CORRELATION

Distinct: 3
Distinct (%): 3.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B
Value counts: 1760933.90633: 40; 1761643.77909: 40; 1761283.12639: 20

Length

Max length: 13
Median length: 13
Mean length: 13
Min length: 13

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1760933.90633
2nd row: 1760933.90633
3rd row: 1760933.90633
4th row: 1760933.90633
5th row: 1760933.90633

Common Values

| Value | Count | Frequency (%) |
|---|---|---|
| 1760933.90633 | 40 | 40.0% |
| 1761643.77909 | 40 | 40.0% |
| 1761283.12639 | 20 | 20.0% |

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values: 1760933.90633 (40), 1761643.77909 (40), 1761283.12639 (20)]

연령대 (age group; the suffix 세 means years of age)
Categorical

HIGH CORRELATION

Distinct: 20
Distinct (%): 20.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B
Value counts: 0 - 4세: 6; 45 - 49세: 6; 15 - 19세: 6; 25 - 29세: 6; 20 - 24세: 6; Other values (15): 70

Length

Max length: 8
Median length: 8
Mean length: 7.76
Min length: 6

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0 - 4세
2nd row: 75 - 79세
3rd row: 45 - 49세
4th row: 15 - 19세
5th row: 15 - 19세

Common Values

| Value | Count | Frequency (%) |
|---|---|---|
| 0 - 4세 | 6 | 6.0% |
| 45 - 49세 | 6 | 6.0% |
| 15 - 19세 | 6 | 6.0% |
| 25 - 29세 | 6 | 6.0% |
| 20 - 24세 | 6 | 6.0% |
| 35 - 39세 | 6 | 6.0% |
| 5 - 9세 | 6 | 6.0% |
| 30 - 34세 | 6 | 6.0% |
| 10 - 14세 | 6 | 6.0% |
| 40 - 44세 | 6 | 6.0% |
| Other values (10) | 40 | 40.0% |

Length

[Figure: Histogram of lengths of the category]

Word frequencies

| Value | Count | Frequency (%) |
|---|---|---|
| (empty) | 100 | 33.3% |
| 0 | 6 | 2.0% |
| 35 | 6 | 2.0% |
| 40 | 6 | 2.0% |
| 14세 | 6 | 2.0% |
| 10 | 6 | 2.0% |
| 34세 | 6 | 2.0% |
| 30 | 6 | 2.0% |
| 9세 | 6 | 2.0% |
| 5 | 6 | 2.0% |
| Other values (31) | 146 | 48.7% |

성별 (sex)
Categorical

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B
Value counts: F: 50; M: 50

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: F
2nd row: M
3rd row: F
4th row: F
5th row: M

Common Values

| Value | Count | Frequency (%) |
|---|---|---|
| F | 50 | 50.0% |
| M | 50 | 50.0% |

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values: F (50), M (50)]

인구수 (population count)
Real number (ℝ)

HIGH CORRELATION

Distinct: 60
Distinct (%): 60.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 545.01
Minimum: 2
Maximum: 1094
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 2
5th percentile: 20
Q1: 334.75
Median: 492
Q3: 802
95th percentile: 1082
Maximum: 1094
Range: 1092
Interquartile range (IQR): 467.25
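
Range and IQR are derived from the other quantile statistics rather than measured separately. A minimal check using the reported values (variable names are illustrative):

```python
# Quantile statistics as reported for 인구수.
minimum = 2
q1 = 334.75
q3 = 802
maximum = 1094

# Range spans the full data; IQR spans the middle half.
value_range = maximum - minimum
iqr = q3 - q1

print(value_range, iqr)
```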

Descriptive statistics

Standard deviation: 321.67282
Coefficient of variation (CV): 0.59021453
Kurtosis: -1.0250131
Mean: 545.01
Median Absolute Deviation (MAD): 282
Skewness: 0.05902979
Sum: 54501
Variance: 103473.4
Monotonicity: Not monotonic
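
Several of the descriptive statistics are functions of one another. The sketch below recomputes variance, coefficient of variation, and sum from the reported mean, standard deviation, and row count. Since the inputs are rounded, the results agree only up to rounding; the variable names are illustrative.

```python
# Reported values for 인구수.
n = 100
mean_value = 545.01
std_dev = 321.67282

# Variance is the square of the standard deviation.
variance = std_dev ** 2

# Coefficient of variation is the standard deviation relative to the mean.
cv = std_dev / mean_value

# The sum is the mean times the number of observations.
total = mean_value * n

print(f"variance={variance:.1f} cv={cv:.8f} sum={total:.0f}")
```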
[Figure: Histogram with fixed size bins (bins=50)]

Common Values

| Value | Count | Frequency (%) |
|---|---|---|
| 320 | 2 | 2.0% |
| 210 | 2 | 2.0% |
| 492 | 2 | 2.0% |
| 472 | 2 | 2.0% |
| 666 | 2 | 2.0% |
| 339 | 2 | 2.0% |
| 736 | 2 | 2.0% |
| 10 | 2 | 2.0% |
| 71 | 2 | 2.0% |
| 369 | 2 | 2.0% |
| Other values (50) | 80 | 80.0% |
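
Smallest 10 values

| Value | Count | Frequency (%) |
|---|---|---|
| 2 | 2 | 2.0% |
| 10 | 2 | 2.0% |
| 20 | 2 | 2.0% |
| 54 | 2 | 2.0% |
| 71 | 2 | 2.0% |
| 88 | 1 | 1.0% |
| 119 | 1 | 1.0% |
| 158 | 1 | 1.0% |
| 160 | 1 | 1.0% |
| 166 | 1 | 1.0% |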
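
Largest 10 values

| Value | Count | Frequency (%) |
|---|---|---|
| 1094 | 2 | 2.0% |
| 1089 | 2 | 2.0% |
| 1082 | 2 | 2.0% |
| 1046 | 2 | 2.0% |
| 1036 | 2 | 2.0% |
| 1007 | 2 | 2.0% |
| 994 | 2 | 2.0% |
| 926 | 2 | 2.0% |
| 893 | 2 | 2.0% |
| 854 | 2 | 2.0% |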

Interactions

[Figure: interactions plot]

Correlations

[Figure: correlation heatmap]

| | 오염원 고유번호 | 오염원 종류 명 | 오염원 상세 종류 명 | 오염원 경도 | 오염원 위도 | 연령대 | 성별 | 인구수 |
|---|---|---|---|---|---|---|---|---|
| 오염원 고유번호 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 0.000 | 0.000 | 0.481 |
| 오염원 종류 명 | 1.000 | 1.000 | 0.999 | 1.000 | 1.000 | 0.000 | 0.000 | 0.000 |
| 오염원 상세 종류 명 | 1.000 | 0.999 | 1.000 | 1.000 | 1.000 | 0.000 | 0.000 | 0.000 |
| 오염원 경도 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 0.000 | 0.000 | 0.481 |
| 오염원 위도 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 0.000 | 0.000 | 0.481 |
| 연령대 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 1.000 | 0.000 | 0.936 |
| 성별 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 1.000 | 0.201 |
| 인구수 | 0.481 | 0.000 | 0.000 | 0.481 | 0.481 | 0.936 | 0.201 | 1.000 |

[Figure: correlation heatmap]

| | 오염원 경도 | 오염원 고유번호 | 오염원 종류 명 | 오염원 상세 종류 명 | 연령대 | 오염원 위도 | 성별 |
|---|---|---|---|---|---|---|---|
| 오염원 경도 | 1.000 | 1.000 | 0.995 | 0.995 | 0.000 | 1.000 | 0.000 |
| 오염원 고유번호 | 1.000 | 1.000 | 0.995 | 0.995 | 0.000 | 1.000 | 0.000 |
| 오염원 종류 명 | 0.995 | 0.995 | 1.000 | 0.979 | 0.000 | 0.995 | 0.000 |
| 오염원 상세 종류 명 | 0.995 | 0.995 | 0.979 | 1.000 | 0.000 | 0.995 | 0.000 |
| 연령대 | 0.000 | 0.000 | 0.000 | 0.000 | 1.000 | 0.000 | 0.000 |
| 오염원 위도 | 1.000 | 1.000 | 0.995 | 0.995 | 0.000 | 1.000 | 0.000 |
| 성별 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 1.000 |

[Figure: correlation heatmap]

| | 인구수 | 오염원 고유번호 | 오염원 종류 명 | 오염원 상세 종류 명 | 오염원 경도 | 오염원 위도 | 연령대 | 성별 |
|---|---|---|---|---|---|---|---|---|
| 인구수 | 1.000 | 0.317 | 0.000 | 0.000 | 0.317 | 0.317 | 0.576 | 0.144 |
| 오염원 고유번호 | 0.317 | 1.000 | 0.995 | 0.995 | 1.000 | 1.000 | 0.000 | 0.000 |
| 오염원 종류 명 | 0.000 | 0.995 | 1.000 | 0.979 | 0.995 | 0.995 | 0.000 | 0.000 |
| 오염원 상세 종류 명 | 0.000 | 0.995 | 0.979 | 1.000 | 0.995 | 0.995 | 0.000 | 0.000 |
| 오염원 경도 | 0.317 | 1.000 | 0.995 | 0.995 | 1.000 | 1.000 | 0.000 | 0.000 |
| 오염원 위도 | 0.317 | 1.000 | 0.995 | 0.995 | 1.000 | 1.000 | 0.000 | 0.000 |
| 연령대 | 0.576 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 1.000 | 0.000 |
| 성별 | 0.144 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 1.000 |
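
The extracted text does not say which association measure each matrix uses. As an illustration only, the sketch below computes Cramér's V, a common measure for pairs of categorical variables, in both its plain and bias-corrected forms. The data are an assumption: the columns are rebuilt from the reported value counts, and IDs 1 and 3 are taken to be 주유소 and ID 2 to be 세차장, which matches the counts and the sample rows but is not stated outright. `cramers_v` is a hypothetical helper, not a ydata-profiling function.

```python
from collections import Counter
from math import sqrt
from typing import Sequence, Tuple


def cramers_v(x: Sequence[str], y: Sequence[str]) -> Tuple[float, float]:
    """Return (plain, bias-corrected) Cramér's V for two categorical columns.

    Both columns must have at least two distinct values.
    """
    n = len(x)
    joint = Counter(zip(x, y))
    row_totals = Counter(x)
    col_totals = Counter(y)

    # Pearson chi-squared statistic over the full contingency table.
    chi2 = 0.0
    for a, row_total in row_totals.items():
        for b, col_total in col_totals.items():
            expected = row_total * col_total / n
            observed = joint.get((a, b), 0)
            chi2 += (observed - expected) ** 2 / expected

    r, k = len(row_totals), len(col_totals)
    phi2 = chi2 / n
    plain = sqrt(phi2 / min(r - 1, k - 1))

    # Bias-corrected variant: shrink phi2 and the table dimensions.
    phi2_corr = max(0.0, phi2 - (r - 1) * (k - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    corrected = sqrt(phi2_corr / min(r_corr - 1, k_corr - 1))
    return plain, corrected


# Assumed reconstruction: ID 1 and ID 3 are 주유소, ID 2 is 세차장.
ids = ["1"] * 40 + ["2"] * 40 + ["3"] * 20
types = ["주유소"] * 40 + ["세차장"] * 40 + ["주유소"] * 20

v_plain, v_corrected = cramers_v(ids, types)
print(f"plain={v_plain:.3f} corrected={v_corrected:.3f}")
```

Under these assumptions the plain statistic is 1.000 and the bias-corrected one rounds to 0.995, which is consistent with the 0.995 entries in the second and third matrices, although the report itself does not confirm the method.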

Missing values

[Figure: count plot] A simple visualization of nullity by column.
[Figure: nullity matrix] The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

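First 10 rows

| | 오염원 고유번호 | 오염원 지역 코드 | 오염원 종류 명 | 오염원 상세 종류 명 | 오염원 경도 | 오염원 위도 | 연령대 | 성별 | 인구수 |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 27000 | 주유소 | 주유소 | 1095595.01174 | 1760933.90633 | 0 - 4세 | F | 320 |
| 1 | 1 | 27000 | 주유소 | 주유소 | 1095595.01174 | 1760933.90633 | 75 - 79세 | M | 369 |
| 2 | 1 | 27000 | 주유소 | 주유소 | 1095595.01174 | 1760933.90633 | 45 - 49세 | F | 893 |
| 3 | 1 | 27000 | 주유소 | 주유소 | 1095595.01174 | 1760933.90633 | 15 - 19세 | F | 584 |
| 4 | 1 | 27000 | 주유소 | 주유소 | 1095595.01174 | 1760933.90633 | 15 - 19세 | M | 485 |
| 5 | 1 | 27000 | 주유소 | 주유소 | 1095595.01174 | 1760933.90633 | 25 - 29세 | M | 780 |
| 6 | 1 | 27000 | 주유소 | 주유소 | 1095595.01174 | 1760933.90633 | 65 - 69세 | F | 926 |
| 7 | 1 | 27000 | 주유소 | 주유소 | 1095595.01174 | 1760933.90633 | 45 - 49세 | M | 1007 |
| 8 | 1 | 27000 | 주유소 | 주유소 | 1095595.01174 | 1760933.90633 | 50 - 54세 | F | 1036 |
| 9 | 1 | 27000 | 주유소 | 주유소 | 1095595.01174 | 1760933.90633 | 20 - 24세 | M | 727 |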
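
Last 10 rows

| | 오염원 고유번호 | 오염원 지역 코드 | 오염원 종류 명 | 오염원 상세 종류 명 | 오염원 경도 | 오염원 위도 | 연령대 | 성별 | 인구수 |
|---|---|---|---|---|---|---|---|---|---|
| 90 | 3 | 27000 | 주유소 | 주유소 | 1096684.3279 | 1761283.12639 | 0 - 4세 | F | 88 |
| 91 | 3 | 27000 | 주유소 | 주유소 | 1096684.3279 | 1761283.12639 | 0 - 4세 | M | 119 |
| 92 | 3 | 27000 | 주유소 | 주유소 | 1096684.3279 | 1761283.12639 | 5 - 9세 | F | 166 |
| 93 | 3 | 27000 | 주유소 | 주유소 | 1096684.3279 | 1761283.12639 | 15 - 19세 | F | 187 |
| 94 | 3 | 27000 | 주유소 | 주유소 | 1096684.3279 | 1761283.12639 | 35 - 39세 | F | 367 |
| 95 | 3 | 27000 | 주유소 | 주유소 | 1096684.3279 | 1761283.12639 | 40 - 44세 | F | 346 |
| 96 | 3 | 27000 | 주유소 | 주유소 | 1096684.3279 | 1761283.12639 | 5 - 9세 | M | 158 |
| 97 | 3 | 27000 | 주유소 | 주유소 | 1096684.3279 | 1761283.12639 | 40 - 44세 | M | 384 |
| 98 | 3 | 27000 | 주유소 | 주유소 | 1096684.3279 | 1761283.12639 | 45 - 49세 | F | 427 |
| 99 | 3 | 27000 | 주유소 | 주유소 | 1096684.3279 | 1761283.12639 | 45 - 49세 | M | 453 |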