Overview

Dataset statistics

Number of variables: 5
Number of observations: 2082
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 87.6 KiB
Average record size in memory: 43.1 B
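
The average record size follows from the total memory size and the number of observations. A minimal sketch of that arithmetic, assuming 1 KiB = 1024 bytes and using only the figures reported above (the variable names are illustrative):

```python
# Figures copied from the "Dataset statistics" table above.
total_kib = 87.6   # total size in memory, KiB
n_rows = 2082      # number of observations

# Assumption: 1 KiB = 1024 bytes.
avg_bytes = total_kib * 1024 / n_rows
print(f"Average record size: {avg_bytes:.1f} B")
```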

Variable types

Numeric: 3
Categorical: 1
Text: 1

Dataset

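Description: One of the statistical indices developed by Gimhae City to assess the state of the city on a statistical basis. It consists of the statistical year (통계연도), province or metropolitan city name (시도명), city/county/district name (시군구명), share of the national population in percent (전국대비인구비율(퍼센트)), and total population in persons (총인구수(명)). Because the index is centered on Gimhae, information for regions outside Gimhae may be missing owing to difficulties in data collection and processing.
Author: Gimhae-si, Gyeongsangnam-do (경상남도 김해시)
URL: https://www.data.go.kr/data/15110126/fileData.do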

Alerts

High correlation: 전국대비인구비율(퍼센트) is highly overall correlated with 총인구수(명)
High correlation: 총인구수(명) is highly overall correlated with 전국대비인구비율(퍼센트)
Skewed: 전국대비인구비율(퍼센트) is highly skewed (γ1 = 21.96343211)
Skewed: 총인구수(명) is highly skewed (γ1 = 21.83804163)
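
The skewness alerts report γ1, the standardized third moment. The sketch below uses the plain moment estimator g1 = m3 / m2^1.5 on toy data to show how a single very large value drives the statistic up. The report may use a bias-corrected estimator instead, so this is an illustration under that assumption, not the tool's exact computation; the function name and the toy values are hypothetical.

```python
def skewness(values):
    """Moment-based (biased) skewness g1 = m3 / m2**1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5

symmetric = [1, 2, 3, 4, 5]        # toy data, symmetric around 3
with_outlier = [1] * 99 + [100]    # toy data: one extreme value

print("symmetric:", skewness(symmetric))
print("with outlier:", skewness(with_outlier))
```

In this dataset the same pattern appears: the maximum of 총인구수(명) is 13630821 while the 95th percentile is 651474.8, so a few very large regions dominate the third moment.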

Reproduction

Analysis started: 2024-04-29 23:02:15.418644
Analysis finished: 2024-04-29 23:02:18.378870
Duration: 2.96 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

통계연도
Real number (ℝ)

Distinct: 8
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2019.5029
Minimum: 2016
Maximum: 2023
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 18.4 KiB

Quantile statistics

Minimum: 2016
5-th percentile: 2016
Q1: 2018
Median: 2020
Q3: 2022
95-th percentile: 2023
Maximum: 2023
Range: 7
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 2.292675
Coefficient of variation (CV): 0.001135267
Kurtosis: -1.2391607
Mean: 2019.5029
Median Absolute Deviation (MAD): 2
Skewness: -0.0014396359
Sum: 4204605
Variance: 5.2563588
Monotonicity: Increasing
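
Several of the derived figures for 통계연도 can be checked directly from the reported values. A short sketch, assuming the usual definitions (CV = standard deviation / mean, variance = standard deviation squared, IQR = Q3 - Q1, range = maximum - minimum); the variable names are illustrative:

```python
# Values copied from the 통계연도 tables above.
mean = 2019.5029
std = 2.292675
q1, q3 = 2018, 2022
minimum, maximum = 2016, 2023

cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance is the squared standard deviation
iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # range

print(f"CV={cv:.9f} variance={variance:.5f} IQR={iqr} range={value_range}")
```

Small differences in the last digits are expected because the inputs are already rounded in the report.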
[Figure: histogram with fixed size bins (bins=8)]
Common values

Value  Count  Frequency (%)
2022  261  12.5%
2023  261  12.5%
2016  260  12.5%
2017  260  12.5%
2018  260  12.5%
2019  260  12.5%
2020  260  12.5%
2021  260  12.5%

시도명
Categorical

Distinct: 19
Distinct (%): 0.9%
Missing: 0
Missing (%): 0.0%
Memory size: 16.4 KiB

Length

Max length: 7
Median length: 5
Mean length: 4.17195
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 서울특별시
2nd row: 서울특별시
3rd row: 서울특별시
4th row: 서울특별시
5th row: 서울특별시

Common Values

Value  Count  Frequency (%)
경기도  384  18.4%
서울특별시  200  9.6%
경상북도  200  9.6%
경상남도  184  8.8%
전라남도  176  8.5%
충청남도  136  6.5%
부산광역시  128  6.1%
충청북도  120  5.8%
강원도  108  5.2%
전라북도  96  4.6%
Other values (9)  350  16.8%

Length

[Figure: histogram of lengths of the category]

시군구명
Text

Distinct: 237
Distinct (%): 11.4%
Missing: 0
Missing (%): 0.0%
Memory size: 16.4 KiB

Length

Max length: 5
Median length: 3
Mean length: 2.9558117
Min length: 2

Characters and Unicode

Total characters: 6154
Distinct characters: 142
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 종로구
2nd row: 중구
3rd row: 용산구
4th row: 성동구
5th row: 광진구
Value  Count  Frequency (%)
중구  48  2.3%
동구  48  2.3%
남구  42  2.0%
북구  40  1.9%
서구  40  1.9%
강서구  16  0.8%
고성군  16  0.8%
남원시  8  0.4%
덕진구  8  0.4%
군산시  8  0.4%
Other values (227)  1808  86.8%

Most occurring characters

Value  Count  Frequency (%)
?  848  13.8%
?  680  11.0%
?  626  10.2%
?  176  2.9%
?  176  2.9%
?  160  2.6%
?  160  2.6%
?  152  2.5%
?  152  2.5%
?  128  2.1%
Other values (132)  2896  47.1%

Most occurring categories

Value  Count  Frequency (%)
Other Letter  6154  100.0%


Most occurring scripts

Value  Count  Frequency (%)
Hangul  6154  100.0%


Most occurring blocks

Value  Count  Frequency (%)
Hangul  6154  100.0%


전국대비인구비율(퍼센트)
Real number (ℝ)

Alerts: High correlation, Skewed

Distinct: 681
Distinct (%): 32.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.47932356
Minimum: 0.017488663
Maximum: 26.557689
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 18.4 KiB

Quantile statistics

Minimum: 0.017488663
5-th percentile: 0.05
Q1: 0.12
Median: 0.37
Q3: 0.66958225
95-th percentile: 1.26
Maximum: 26.557689
Range: 26.5402
Interquartile range (IQR): 0.54958225

Descriptive statistics

Standard deviation: 0.73788775
Coefficient of variation (CV): 1.5394356
Kurtosis: 752.23289
Mean: 0.47932356
Median Absolute Deviation (MAD): 0.26080478
Skewness: 21.963432
Sum: 997.95165
Variance: 0.54447833
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=50)]
Common values

Value  Count  Frequency (%)
0.08  68  3.3%
0.05  62  3.0%
0.06  54  2.6%
0.1  51  2.4%
0.09  51  2.4%
0.07  43  2.1%
0.12  42  2.0%
0.13  32  1.5%
0.45  24  1.2%
0.22  24  1.2%
Other values (671)  1631  78.3%
Minimum 10 values

Value  Count  Frequency (%)
0.017488663  1  < 0.1%
0.017685225  1  < 0.1%
0.02  6  0.3%
0.03  6  0.3%
0.030513199  1  < 0.1%
0.03114755  1  < 0.1%
0.04  16  0.8%
0.040072678  1  < 0.1%
0.040882349  1  < 0.1%
0.041024579  1  < 0.1%
Maximum 10 values

Value  Count  Frequency (%)
26.55768851  1  < 0.1%
5.840021016  1  < 0.1%
4.976731859  1  < 0.1%
4.150229607  1  < 0.1%
3.51525657  1  < 0.1%
3.418890895  1  < 0.1%
3.104644492  1  < 0.1%
2.976711557  1  < 0.1%
2.80994984  1  < 0.1%
2.765178573  1  < 0.1%

총인구수(명)
Real number (ℝ)

Alerts: High correlation, Skewed

Distinct: 2077
Distinct (%): 99.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 247623.4
Minimum: 8867
Maximum: 13630821
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 18.4 KiB

Quantile statistics

Minimum: 8867
5-th percentile: 27315.65
Q1: 62055.5
Median: 191281
Q3: 344651.25
95-th percentile: 651474.8
Maximum: 13630821
Range: 13621954
Interquartile range (IQR): 282595.75

Descriptive statistics

Standard deviation: 379425.49
Coefficient of variation (CV): 1.5322683
Kurtosis: 746.30623
Mean: 247623.4
Median Absolute Deviation (MAD): 136623.5
Skewness: 21.838042
Sum: 5.1555191 × 10^8
Variance: 1.439637 × 10^11
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=50)]
Common values

Value  Count  Frequency (%)
122499  2  0.1%
9077  2  0.1%
50093  2  0.1%
51750  2  0.1%
30139  2  0.1%
547178  1  < 0.1%
238311  1  < 0.1%
308867  1  < 0.1%
806067  1  < 0.1%
152737  1  < 0.1%
Other values (2067)  2067  99.3%
Minimum 10 values

Value  Count  Frequency (%)
8867  1  < 0.1%
8996  1  < 0.1%
9077  2  0.1%
9617  1  < 0.1%
9832  1  < 0.1%
9975  1  < 0.1%
10001  1  < 0.1%
15661  1  < 0.1%
16022  1  < 0.1%
16320  1  < 0.1%
Maximum 10 values

Value  Count  Frequency (%)
13630821  1  < 0.1%
2997410  1  < 0.1%
2554324  1  < 0.1%
2130119  1  < 0.1%
1804217  1  < 0.1%
1754757  1  < 0.1%
1593469  1  < 0.1%
1527807  1  < 0.1%
1442216  1  < 0.1%
1419237  1  < 0.1%

Interactions

[Figures: nine interaction plots]

Correlations

[Figure: correlation heatmap]
Variable  통계연도  시도명  전국대비인구비율(퍼센트)  총인구수(명)
통계연도  1.000  0.246  0.128  0.128
시도명  0.246  1.000  0.039  0.039
전국대비인구비율(퍼센트)  0.128  0.039  1.000  1.000
총인구수(명)  0.128  0.039  1.000  1.000
[Figure: correlation heatmap]
Variable  통계연도  전국대비인구비율(퍼센트)  총인구수(명)  시도명
통계연도  1.000  0.013  0.009  0.102
전국대비인구비율(퍼센트)  0.013  1.000  1.000  0.021
총인구수(명)  0.009  1.000  1.000  0.021
시도명  0.102  0.021  0.021  1.000
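
The correlation of 1.000 between 전국대비인구비율(퍼센트) and 총인구수(명) is what one would expect if the ratio is simply the population divided by a national total. The report does not state which coefficient each matrix uses, so the sketch below assumes a plain Pearson correlation and applies it to the ten 2023 rows shown at the end of the Sample section; the helper name `pearson` is illustrative.

```python
from math import sqrt

# (ratio in percent, population) pairs from the last ten sample rows (2023).
rows = [
    (0.111218, 57083), (0.096381, 49468), (0.079454, 40780),
    (0.081063, 41606), (0.065761, 33752), (0.071982, 36945),
    (0.116993, 60047), (0.080278, 41203), (0.957917, 491654),
    (0.357714, 183598),
]

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

ratios = [r for r, _ in rows]
populations = [p for _, p in rows]
r = pearson(ratios, populations)
print(f"Pearson r = {r:.6f}")
```

Because one column is close to a rescaled copy of the other, keeping both in a downstream model adds little information.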

Missing values

[Figure: nullity bar chart] A simple visualization of nullity by column.
[Figure: nullity matrix] The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

First rows

Index  통계연도  시도명  시군구명  전국대비인구비율(퍼센트)  총인구수(명)
0  2016  서울특별시  종로구  0.3  152737
1  2016  서울특별시  중구  0.24  125249
2  2016  서울특별시  용산구  0.45  230241
3  2016  서울특별시  성동구  0.58  299259
4  2016  서울특별시  광진구  0.69  357215
5  2016  서울특별시  동대문구  0.69  355069
6  2016  서울특별시  중랑구  0.8  411005
7  2016  서울특별시  성북구  0.87  450355
8  2016  서울특별시  강북구  0.63  327195
9  2016  서울특별시  도봉구  0.67  348220
Last rows

Index  통계연도  시도명  시군구명  전국대비인구비율(퍼센트)  총인구수(명)
2072  2023  경상남도  창녕군  0.111218  57083
2073  2023  경상남도  고성군  0.096381  49468
2074  2023  경상남도  남해군  0.079454  40780
2075  2023  경상남도  하동군  0.081063  41606
2076  2023  경상남도  산청군  0.065761  33752
2077  2023  경상남도  함양군  0.071982  36945
2078  2023  경상남도  거창군  0.116993  60047
2079  2023  경상남도  합천군  0.080278  41203
2080  2023  제주특별자치도  제주시  0.957917  491654
2081  2023  제주특별자치도  서귀포시  0.357714  183598
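
The two numeric columns appear to be linked by ratio = population / national total × 100. This relation is inferred from the column names and is an assumption, not something the report states. Under it, each row implies a national population, and rows from the same year should agree. A minimal sketch using the two 제주특별자치도 rows for 2023 (the function name is hypothetical):

```python
# (region, ratio in percent, population) from the last two 2023 sample rows.
rows = [
    ("제주시", 0.957917, 491654),
    ("서귀포시", 0.357714, 183598),
]

def implied_national_total(ratio_percent, population):
    """Invert ratio = population / total * 100 to recover the total."""
    return population / (ratio_percent / 100)

totals = {name: implied_national_total(r, p) for name, r, p in rows}
for name, total in totals.items():
    print(f"{name}: implied national population ~ {total:,.0f}")
```

Both rows should yield nearly the same total; a large disagreement within one year would point to a data-entry or rounding problem in that row.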