Overview

Dataset statistics

Number of variables: 5
Number of observations: 1984
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 83.4 KiB
Average record size in memory: 43.1 B

Variable types

Numeric: 3
Categorical: 1
Text: 1

Dataset

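Description: Provides groundwater usage information for river basins by year. Fields provided: 연도 (year), 대권역 (major basin), 중권역코드 (mid-level basin code), 이용량 (usage, m3/year)
Author: 한국수자원공사 (Korea Water Resources Corporation, K-water)
URL: https://www.data.go.kr/data/15054545/fileData.do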

Alerts

중권역코드 is highly overall correlated with 대권역 (High correlation)
대권역 is highly overall correlated with 중권역코드 (High correlation)

Reproduction

Analysis started: 2024-05-04 07:48:05.841288
Analysis finished: 2024-05-04 07:48:10.517777
Duration: 4.68 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
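
The reported duration follows from the two timestamps above. A minimal sketch of that arithmetic, assuming both timestamps are in the same timezone and use the format shown (the variable names are illustrative only):

```python
from datetime import datetime

# Timestamps copied from the Reproduction section of the report.
FMT = "%Y-%m-%d %H:%M:%S.%f"
started = datetime.strptime("2024-05-04 07:48:05.841288", FMT)
finished = datetime.strptime("2024-05-04 07:48:10.517777", FMT)

# Elapsed wall-clock time in seconds, rounded to two decimals as in the report.
duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.2f} seconds")  # prints "4.68 seconds"
```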

Variables

연도
Real number (ℝ)

Distinct: 17
Distinct (%): 0.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2013.9829
Minimum: 2006
Maximum: 2022
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 17.6 KiB

Quantile statistics

Minimum: 2006
5-th percentile: 2006
Q1: 2010
Median: 2014
Q3: 2018
95-th percentile: 2022
Maximum: 2022
Range: 16
Interquartile range (IQR): 8

Descriptive statistics

Standard deviation: 4.894109
Coefficient of variation (CV): 0.0024300649
Kurtosis: -1.2058869
Mean: 2013.9829
Median Absolute Deviation (MAD): 4
Skewness: 0.0031603796
Sum: 3995742
Variance: 23.952303
Monotonicity: Increasing
Histogram with fixed size bins (bins=17)
Value  Count  Frequency (%)
2006  117  5.9%
2013  117  5.9%
2018  117  5.9%
2017  117  5.9%
2016  117  5.9%
2007  117  5.9%
2014  117  5.9%
2015  117  5.9%
2012  117  5.9%
2011  117  5.9%
Other values (7)  814  41.0%

Minimum 10 values
Value  Count  Frequency (%)
2006  117  5.9%
2007  117  5.9%
2008  117  5.9%
2009  117  5.9%
2010  117  5.9%
2011  117  5.9%
2012  117  5.9%
2013  117  5.9%
2014  117  5.9%
2015  117  5.9%

Maximum 10 values
Value  Count  Frequency (%)
2022  115  5.8%
2021  116  5.8%
2020  116  5.8%
2019  116  5.8%
2018  117  5.9%
2017  117  5.9%
2016  117  5.9%
2015  117  5.9%
2014  117  5.9%
2013  117  5.9%
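
The summary statistics for 연도 can be recomputed from the per-year counts in the value tables above. The sketch below assumes the sample variance (divisor n - 1), which is the pandas default; that the report uses this divisor is an assumption, although the numbers agree with it:

```python
import math

# Row counts per year, read from the value tables of the report:
# 2006-2018 have 117 rows each, 2019-2021 have 116, and 2022 has 115.
counts = {year: 117 for year in range(2006, 2019)}
counts.update({2019: 116, 2020: 116, 2021: 116, 2022: 115})

n = sum(counts.values())
total = sum(year * c for year, c in counts.items())
mean = total / n

# Sample variance with divisor n - 1 (assumed; matches the pandas default).
variance = sum(c * (year - mean) ** 2 for year, c in counts.items()) / (n - 1)
std = math.sqrt(variance)
cv = std / mean  # coefficient of variation

print(n, total)            # 1984 3995742
print(round(mean, 4))      # 2013.9829
print(round(variance, 4))  # 23.9523
print(round(std, 4))       # 4.8941
```

The near-zero coefficient of variation is expected here, since the years span only 16 units around a mean of roughly 2014.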

대권역
Categorical

HIGH CORRELATION 

Distinct: 23
Distinct (%): 1.2%
Missing: 0
Missing (%): 0.0%
Memory size: 15.6 KiB
한강  403
낙동강  374
금강  238
섬진강  153
영산강  136
Other values (18)  680

Length

Max length: 5
Median length: 4
Mean length: 3.1043347
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 한강
2nd row: 한강
3rd row: 한강
4th row: 한강
5th row: 한강

Common Values

Value  Count  Frequency (%)
한강  403  20.3%
낙동강  374  18.9%
금강  238  12.0%
섬진강  153  7.7%
영산강  136  6.9%
섬진강남해  102  5.1%
제주도  68  3.4%
낙동강남해  68  3.4%
금강서해  51  2.6%
낙동강동해  51  2.6%
Other values (13)  340  17.1%
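
The frequency column is each count divided by the number of observations. A short sketch that recomputes it from the counts listed above (the dictionary is transcribed from the table; nothing else is assumed):

```python
# Counts copied from the Common Values table for 대권역.
counts = {
    "한강": 403, "낙동강": 374, "금강": 238, "섬진강": 153, "영산강": 136,
    "섬진강남해": 102, "제주도": 68, "낙동강남해": 68, "금강서해": 51,
    "낙동강동해": 51, "Other values (13)": 340,
}

# The counts should add up to the number of observations in the dataset.
n_rows = sum(counts.values())

# Percentage share of each value, rounded to one decimal as in the report.
percentages = {name: round(100 * c / n_rows, 1) for name, c in counts.items()}
for name, pct in percentages.items():
    print(f"{name}: {pct}%")
```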

Length

Histogram of lengths of the category
Value  Count  Frequency (%)
한강  403  20.3%
낙동강  374  18.9%
금강  238  12.0%
섬진강  153  7.7%
영산강  136  6.9%
섬진강남해  102  5.1%
제주도  68  3.4%
낙동강남해  68  3.4%
한강동해  51  2.6%
영산강서해  51  2.6%
Other values (13)  340  17.1%

중권역코드
Real number (ℝ)

HIGH CORRELATION 

Distinct: 117
Distinct (%): 5.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2764.4254
Minimum: 1001
Maximum: 6004
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 17.6 KiB

Quantile statistics

Minimum: 1001
5-th percentile: 1006
Q1: 1303
Median: 2403
Q3: 4004
95-th percentile: 5302
Maximum: 6004
Range: 5003
Interquartile range (IQR): 2701

Descriptive statistics

Standard deviation: 1454.8492
Coefficient of variation (CV): 0.52627543
Kurtosis: -0.75655644
Mean: 2764.4254
Median Absolute Deviation (MAD): 1202
Skewness: 0.52843584
Sum: 5484620
Variance: 2116586.1
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value  Count  Frequency (%)
1001  17  0.9%
3012  17  0.9%
4003  17  0.9%
4002  17  0.9%
4001  17  0.9%
3303  17  0.9%
3302  17  0.9%
3301  17  0.9%
3203  17  0.9%
3202  17  0.9%
Other values (107)  1814  91.4%

Minimum 10 values
Value  Count  Frequency (%)
1001  17  0.9%
1002  17  0.9%
1003  17  0.9%
1004  17  0.9%
1005  17  0.9%
1006  17  0.9%
1007  17  0.9%
1008  16  0.8%
1009  17  0.9%
1010  17  0.9%

Maximum 10 values
Value  Count  Frequency (%)
6004  17  0.9%
6003  17  0.9%
6002  17  0.9%
6001  17  0.9%
5303  17  0.9%
5302  17  0.9%
5301  17  0.9%
5202  17  0.9%
5201  17  0.9%
5101  17  0.9%

중권역
Text

Distinct: 117
Distinct (%): 5.9%
Missing: 0
Missing (%): 0.0%
Memory size: 15.6 KiB

Length

Max length: 6
Median length: 5
Mean length: 3.5544355
Min length: 2

Characters and Unicode

Total characters: 7052
Distinct characters: 110
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 남한강상류
2nd row: 평창강
3rd row: 충주댐
4th row: 달천
5th row: 충주댐하류
Value  Count  Frequency (%)
남한강상류  17  0.9%
미호천  17  0.9%
오수천  17  0.9%
섬진강댐하류  17  0.9%
섬진강댐  17  0.9%
새만금  17  0.9%
동진강  17  0.9%
만경강  17  0.9%
금강서해  17  0.9%
부남방조제  17  0.9%
Other values (107)  1814  91.4%

Most occurring characters

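Value  Count  Frequency (%)
[missing]  747  10.6%
[missing]  625  8.9%
[missing]  356  5.0%
[missing]  238  3.4%
[missing]  238  3.4%
[missing]  204  2.9%
[missing]  187  2.7%
[missing]  187  2.7%
[missing]  170  2.4%
[missing]  153  2.2%
Other values (100)  3947  56.0%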

Most occurring categories

Value  Count  Frequency (%)
Other Letter  7052  100.0%

Most frequent character per category

Other Letter
Value  Count  Frequency (%)
[missing]  747  10.6%
[missing]  625  8.9%
[missing]  356  5.0%
[missing]  238  3.4%
[missing]  238  3.4%
[missing]  204  2.9%
[missing]  187  2.7%
[missing]  187  2.7%
[missing]  170  2.4%
[missing]  153  2.2%
Other values (100)  3947  56.0%

Most occurring scripts

Value  Count  Frequency (%)
Hangul  7052  100.0%

Most frequent character per script

Hangul
Value  Count  Frequency (%)
[missing]  747  10.6%
[missing]  625  8.9%
[missing]  356  5.0%
[missing]  238  3.4%
[missing]  238  3.4%
[missing]  204  2.9%
[missing]  187  2.7%
[missing]  187  2.7%
[missing]  170  2.4%
[missing]  153  2.2%
Other values (100)  3947  56.0%

Most occurring blocks

Value  Count  Frequency (%)
Hangul  7052  100.0%

Most frequent character per block

Hangul
Value  Count  Frequency (%)
[missing]  747  10.6%
[missing]  625  8.9%
[missing]  356  5.0%
[missing]  238  3.4%
[missing]  238  3.4%
[missing]  204  2.9%
[missing]  187  2.7%
[missing]  187  2.7%
[missing]  170  2.4%
[missing]  153  2.2%
Other values (100)  3947  56.0%

이용량(톤_년)
Real number (ℝ)

Distinct: 1853
Distinct (%): 93.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 30825041
Minimum: 0
Maximum: 2.1042865 × 10^8
Zeros: 15
Zeros (%): 0.8%
Negative: 0
Negative (%): 0.0%
Memory size: 17.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 2631455.4
Q1: 12358954
Median: 20486232
Q3: 40863592
95-th percentile: 88500521
Maximum: 2.1042865 × 10^8
Range: 2.1042865 × 10^8
Interquartile range (IQR): 28504638

Descriptive statistics

Standard deviation: 29137286
Coefficient of variation (CV): 0.9452473
Kurtosis: 5.8959155
Mean: 30825041
Median Absolute Deviation (MAD): 11629540
Skewness: 2.097906
Sum: 6.1156881 × 10^10
Variance: 8.4898146 × 10^14
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value  Count  Frequency (%)
0.0  15  0.8%
20106802.0  2  0.1%
13855092.0  2  0.1%
25655979.0  2  0.1%
8544825.0  2  0.1%
8987104.0  2  0.1%
7926577.0  2  0.1%
15457198.0  2  0.1%
9563386.0  2  0.1%
11355448.0  2  0.1%
Other values (1843)  1951  98.3%

Minimum 10 values
Value  Count  Frequency (%)
0.0  15  0.8%
76761.0  1  0.1%
84872.0  1  0.1%
86954.0  1  0.1%
88208.0  1  0.1%
88714.0  1  0.1%
89891.0  1  0.1%
89983.0  1  0.1%
102306.0  1  0.1%
102703.0  1  0.1%

Maximum 10 values
Value  Count  Frequency (%)
210428646.0  1  0.1%
182583526.0  1  0.1%
182528841.0  1  0.1%
181768402.0  1  0.1%
181400178.0  1  0.1%
176259929.0  1  0.1%
176177154.0  1  0.1%
171426391.0  1  0.1%
170733364.0  1  0.1%
170050268.0  1  0.1%
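
Several of the figures reported for 이용량(톤_년) are derived from one another. The sketch below recomputes them from the rounded values printed in the report, so the sum and variance are approximate; the variable names are illustrative only:

```python
import math

# Summary values copied from the report for 이용량(톤_년).
n = 1984
mean = 30825041
std = 29137286
q1, q3 = 12358954, 40863592
zeros = 15

iqr = q3 - q1                # interquartile range
cv = std / mean              # coefficient of variation
total = mean * n             # approximate, since the mean is rounded
variance = std ** 2          # approximate, since the std is rounded
zeros_pct = 100 * zeros / n  # share of zero-valued rows

print(iqr)                   # 28504638
print(round(cv, 5))          # 0.94525
print(f"{total:.4e}")        # 6.1157e+10
print(f"{variance:.4e}")     # 8.4898e+14
print(round(zeros_pct, 1))   # 0.8
```

A coefficient of variation close to 1, together with a mean well above the median, is consistent with the right-skewed distribution the report describes (skewness 2.097906).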

Interactions


Correlations

Variable  연도  대권역  중권역코드  이용량(톤_년)
연도  1.000  0.000  0.000  0.100
대권역  0.000  1.000  0.959  0.709
중권역코드  0.000  0.959  1.000  0.489
이용량(톤_년)  0.100  0.709  0.489  1.000

Variable  연도  중권역코드  이용량(톤_년)  대권역
연도  1.000  0.004  -0.070  0.000
중권역코드  0.004  1.000  0.036  0.791
이용량(톤_년)  -0.070  0.036  1.000  0.352
대권역  0.000  0.791  0.352  1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows
Index  연도  대권역  중권역코드  중권역  이용량(톤_년)
0  2006  한강  1001  남한강상류  10532608.0
1  2006  한강  1002  평창강  19333138.0
2  2006  한강  1003  충주댐  79977581.0
3  2006  한강  1004  달천  60773531.0
4  2006  한강  1005  충주댐하류  31825004.0
5  2006  한강  1006  섬강  60180911.0
6  2006  한강  1007  남한강하류  164501344.0
7  2006  한강  1008  금강산댐  76761.0
8  2006  한강  1009  평화의댐  3790490.0
9  2006  한강  1010  춘천댐  13085620.0

Last rows
Index  연도  대권역  중권역코드  중권역  이용량(톤_년)
1974  2022  탐진강  5101  탐진강  13620654.0
1975  2022  영산강남해  5201  진도  6274501.0
1976  2022  영산강남해  5202  영암방조제  31668176.0
1977  2022  영산강서해  5301  주진천  28476598.0
1978  2022  영산강서해  5302  와탄천  43692316.0
1979  2022  영산강서해  5303  신안군  1735487.0
1980  2022  제주도  6001  제주서해  54854843.22
1981  2022  제주도  6002  제주북해  53427500.0
1982  2022  제주도  6003  제주남해  45693364.0
1983  2022  제주도  6004  제주동해  106803303.0
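
In the sample rows, the leading two digits of 중권역코드 appear to identify the 대권역, which would be consistent with the High correlation alert between the two variables. The sketch below checks this on the 20 sample rows only; that the prefix rule holds for the full dataset is an assumption the report does not confirm:

```python
from collections import defaultdict

# (대권역, 중권역코드) pairs copied from the first and last sample rows.
sample = [
    ("한강", 1001), ("한강", 1002), ("한강", 1003), ("한강", 1004),
    ("한강", 1005), ("한강", 1006), ("한강", 1007), ("한강", 1008),
    ("한강", 1009), ("한강", 1010),
    ("탐진강", 5101),
    ("영산강남해", 5201), ("영산강남해", 5202),
    ("영산강서해", 5301), ("영산강서해", 5302), ("영산강서해", 5303),
    ("제주도", 6001), ("제주도", 6002), ("제주도", 6003), ("제주도", 6004),
]

# Group basin names by the leading two digits of the four-digit code.
basins_by_prefix = defaultdict(set)
for basin, code in sample:
    basins_by_prefix[code // 100].add(basin)

# One basin per prefix means the code prefix determines 대권역 in this sample.
for prefix, basins in sorted(basins_by_prefix.items()):
    print(prefix, sorted(basins))
```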