Overview

Dataset statistics

Number of variables: 10
Number of observations: 208
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 16.8 KiB
Average record size in memory: 82.6 B

Variable types

Numeric: 2
Categorical: 6
Text: 2

Dataset

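Description: Data on the designation status of landslide-vulnerable areas in Gyeongsan-si, Gyeongsangbuk-do, covering location, designated area, vulnerable-area type, grade, and related fields.
Author: 경상북도 경산시 (Gyeongsan-si, Gyeongsangbuk-do)
URL: https://www.data.go.kr/data/15123801/fileData.do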

Alerts

시군구 has constant value "경산시" [Constant]
연번 is highly overall correlated with 행정동 and 1 other field [High correlation]
지정면적(제곱미터) is highly overall correlated with 행정동 [High correlation]
행정동 is highly overall correlated with 연번 and 2 other fields [High correlation]
법정리동 is highly overall correlated with 연번 and 1 other field [High correlation]
지목 is highly imbalanced (59.9%) [Imbalance]
취약지유형 is highly imbalanced (55.9%) [Imbalance]
연번 has unique values [Unique]
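
The two imbalance percentages are consistent with a score of one minus the normalized Shannon entropy of the category counts. The sketch below is an assumption about how such a figure can be derived, not a statement of the profiler's exact implementation. `imbalance_score` is a hypothetical helper, and the counts are the ones listed for 지목 and 취약지유형 in this report.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / H_max, where H is the Shannon entropy of the counts.

    0.0 means perfectly balanced categories; values near 1.0 mean that a
    single category dominates. Hypothetical helper for illustration only.
    """
    total = sum(counts)
    if len(counts) < 2 or total == 0:
        return 0.0
    probs = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in probs)
    # H_max for k categories is log2(k), reached when all are equally likely.
    return 1.0 - entropy / math.log2(len(counts))


# Category counts as listed in this report.
jimok_counts = [162, 33, 9, 2, 1, 1]  # 지목, 6 categories
type_counts = [189, 19]               # 취약지유형, 2 categories

print(f"지목: {imbalance_score(jimok_counts):.1%}")
print(f"취약지유형: {imbalance_score(type_counts):.1%}")
```

Under this definition the two variables score about 59.9% and 55.9%, matching the alert text, which suggests the alerts measure how far each distribution is from uniform instead of simply reporting the share of the largest category.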

Reproduction

Analysis started: 2024-03-30 07:59:07.561468
Analysis finished: 2024-03-30 07:59:15.907900
Duration: 8.35 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

연번
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 208
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 104.5
Minimum: 1
Maximum: 208
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 11.35
Q1: 52.75
Median: 104.5
Q3: 156.25
95-th percentile: 197.65
Maximum: 208
Range: 207
Interquartile range (IQR): 103.5

Descriptive statistics

Standard deviation: 60.188592
Coefficient of variation (CV): 0.57596739
Kurtosis: -1.2
Mean: 104.5
Median Absolute Deviation (MAD): 52
Skewness: 0
Sum: 21736
Variance: 3622.6667
Monotonicity: Strictly increasing
Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
1 | 1 | 0.5%
106 | 1 | 0.5%
134 | 1 | 0.5%
135 | 1 | 0.5%
136 | 1 | 0.5%
137 | 1 | 0.5%
138 | 1 | 0.5%
139 | 1 | 0.5%
140 | 1 | 0.5%
141 | 1 | 0.5%
Other values (198) | 198 | 95.2%

Minimum 10 values
Value | Count | Frequency (%)
1 | 1 | 0.5%
2 | 1 | 0.5%
3 | 1 | 0.5%
4 | 1 | 0.5%
5 | 1 | 0.5%
6 | 1 | 0.5%
7 | 1 | 0.5%
8 | 1 | 0.5%
9 | 1 | 0.5%
10 | 1 | 0.5%

Maximum 10 values
Value | Count | Frequency (%)
208 | 1 | 0.5%
207 | 1 | 0.5%
206 | 1 | 0.5%
205 | 1 | 0.5%
204 | 1 | 0.5%
203 | 1 | 0.5%
202 | 1 | 0.5%
201 | 1 | 0.5%
200 | 1 | 0.5%
199 | 1 | 0.5%
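
Because 연번 has 208 distinct values, a minimum of 1, a maximum of 208, and is strictly increasing, it behaves like the serial numbers 1 to 208. The sketch below assumes exactly that integer sequence and recomputes several of the reported statistics with the standard library. The `percentile` helper is hypothetical and uses linear interpolation, which is an assumption about how the quantiles were computed.

```python
import statistics

values = list(range(1, 209))  # assumed: 연번 is exactly 1..208


def percentile(sorted_vals, q):
    """Linear-interpolation percentile (q in [0, 1]) on pre-sorted data."""
    pos = q * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (pos - lo) * (sorted_vals[hi] - sorted_vals[lo])


summary = {
    "sum": sum(values),
    "mean": statistics.mean(values),
    "variance": statistics.variance(values),  # sample variance (n - 1)
    "stdev": statistics.stdev(values),
    "p5": percentile(values, 0.05),
    "q1": percentile(values, 0.25),
    "q3": percentile(values, 0.75),
}

for name, value in summary.items():
    print(f"{name}: {value:.4f}")
```

The results agree with the figures reported for this variable (sum 21736, mean 104.5, variance 3622.6667, Q1 52.75, Q3 156.25), so the column carries no information beyond row order.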

시군구
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 1.8 KiB
경산시: 208

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 경산시
2nd row: 경산시
3rd row: 경산시
4th row: 경산시
5th row: 경산시

Common Values

Value | Count | Frequency (%)
경산시 | 208 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
경산시 | 208 | 100.0%

행정동
Categorical

HIGH CORRELATION 

Distinct: 11
Distinct (%): 5.3%
Missing: 0
Missing (%): 0.0%
Memory size: 1.8 KiB
남천면: 61
용성면: 38
하양읍: 31
와촌면: 29
남산면: 22
Other values (6): 27

Length

Max length: 4
Median length: 3
Mean length: 3.0576923
Min length: 3

Unique

Unique: 1
Unique (%): 0.5%

Sample

1st row: 하양읍
2nd row: 하양읍
3rd row: 하양읍
4th row: 하양읍
5th row: 하양읍

Common Values

Value | Count | Frequency (%)
남천면 | 61 | 29.3%
용성면 | 38 | 18.3%
하양읍 | 31 | 14.9%
와촌면 | 29 | 13.9%
남산면 | 22 | 10.6%
진량읍 | 8 | 3.8%
남천면 | 6 | 2.9%
서부1동 | 5 | 2.4%
동부동 | 4 | 1.9%
남부동 | 3 | 1.4%

Length

Histogram of lengths of the category

Value | Count | Frequency (%)
남천면 | 67 | 32.2%
용성면 | 39 | 18.8%
하양읍 | 31 | 14.9%
와촌면 | 29 | 13.9%
남산면 | 22 | 10.6%
진량읍 | 8 | 3.8%
서부1동 | 5 | 2.4%
동부동 | 4 | 1.9%
남부동 | 3 | 1.4%
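
The two 행정동 tables disagree: Common Values lists 남천면 twice (61 and 6) and 용성면 with 38, while the table above gives 남천면 67 and 용성면 39. One plausible reading, which the report does not confirm, is that a few rows carry stray whitespace such as a trailing space. The mean length of 3.0576923 implies 12 characters beyond three per row; the five 서부1동 rows account for 5 of them, and the remaining 7 would match 6 + 1 padded rows. The sketch below uses hypothetical data to show how stripping whitespace before counting merges such variants.

```python
from collections import Counter

# Hypothetical raw values: a trailing space turns "남천면 " into its own category.
raw = ["남천면"] * 61 + ["남천면 "] * 6 + ["용성면"] * 38 + ["용성면 "] * 1

before = Counter(raw)                             # counts on the raw strings
after = Counter(value.strip() for value in raw)   # counts after normalization

print("distinct before:", len(before))
print("distinct after:", len(after))
print(dict(after))
```

If the raw file does contain such variants, normalizing them before profiling would reduce the distinct count for 행정동 and avoid splitting one district across two categories.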

법정리동
Categorical

HIGH CORRELATION 

Distinct: 45
Distinct (%): 21.6%
Missing: 0
Missing (%): 0.0%
Memory size: 1.8 KiB
대곡리: 19
송백리: 15
송림리: 13
대한리: 12
매남리: 9
Other values (40): 140

Length

Max length: 3
Median length: 3
Mean length: 2.9759615
Min length: 2

Unique

Unique: 9
Unique (%): 4.3%

Sample

1st row: 남하리
2nd row: 남하리
3rd row: 남하리
4th row: 대곡리
5th row: 대곡리

Common Values

Value | Count | Frequency (%)
대곡리 | 19 | 9.1%
송백리 | 15 | 7.2%
송림리 | 13 | 6.2%
대한리 | 12 | 5.8%
매남리 | 9 | 4.3%
산전리 | 9 | 4.3%
하도리 | 9 | 4.3%
금곡리 | 8 | 3.8%
음양리 | 8 | 3.8%
사기리 | 7 | 3.4%
Other values (35) | 99 | 47.6%

Length

Histogram of lengths of the category

Value | Count | Frequency (%)
대곡리 | 19 | 9.1%
송백리 | 15 | 7.2%
송림리 | 13 | 6.2%
대한리 | 12 | 5.8%
매남리 | 9 | 4.3%
산전리 | 9 | 4.3%
하도리 | 9 | 4.3%
금곡리 | 8 | 3.8%
음양리 | 8 | 3.8%
사기리 | 7 | 3.4%
Other values (35) | 99 | 47.6%

번지
Text

Distinct: 176
Distinct (%): 84.6%
Missing: 0
Missing (%): 0.0%
Memory size: 1.8 KiB

Length

Max length: 11
Median length: 10
Mean length: 5.3413462
Min length: 2

Characters and Unicode

Total characters: 1111
Distinct characters: 21
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
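
As a minimal sketch of what "character properties" means here, Python's standard `unicodedata` module returns the general category of each code point. The category names used in this report (Decimal Number, Other Letter, Dash Punctuation, Space Separator) correspond to the short codes `Nd`, `Lo`, `Pd`, and `Zs`. The sample string is hypothetical and only resembles a 번지 value; `category_counts` is an illustrative helper, not part of the profiler.

```python
import unicodedata
from collections import Counter

# Short general-category codes mapped to the long names used in this report.
LONG_NAMES = {
    "Nd": "Decimal Number",
    "Lo": "Other Letter",
    "Pd": "Dash Punctuation",
    "Zs": "Space Separator",
}


def category_counts(text):
    """Count the characters of `text` by Unicode general category code."""
    return Counter(unicodedata.category(ch) for ch in text)


sample = "산70-1 임"  # hypothetical 번지-like string
for code, count in category_counts(sample).items():
    print(LONG_NAMES.get(code, code), count)
```

Hangul syllables fall under Other Letter, digits under Decimal Number, and the ASCII hyphen under Dash Punctuation, which is why 번지 shows exactly four categories.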

Unique

Unique: 154
Unique (%): 74.0%

Sample

1st row: 산149구
2nd row: 산127임
3rd row: 672임
4th row: 산65
5th row: 산190구
ValueCountFrequency (%)
25
 
9.7%
2필 7
 
2.7%
산33임 6
 
2.3%
1필 6
 
2.3%
3필 6
 
2.3%
1849구 4
 
1.5%
산36임 3
 
1.2%
산44임 3
 
1.2%
산283임 3
 
1.2%
4필 3
 
1.2%
Other values (174) 193
74.5%

Most occurring characters

ValueCountFrequency (%)
167
15.0%
132
11.9%
1 127
11.4%
2 83
 
7.5%
3 70
 
6.3%
- 64
 
5.8%
4 61
 
5.5%
51
 
4.6%
6 51
 
4.6%
5 50
 
4.5%
Other values (11) 255
23.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 601
54.1%
Other Letter 395
35.6%
Dash Punctuation 64
 
5.8%
Space Separator 51
 
4.6%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 127
21.1%
2 83
13.8%
3 70
11.6%
4 61
10.1%
6 51
8.5%
5 50
 
8.3%
8 45
 
7.5%
9 39
 
6.5%
7 38
 
6.3%
0 37
 
6.2%
Other Letter
ValueCountFrequency (%)
167
42.3%
132
33.4%
33
 
8.4%
25
 
6.3%
25
 
6.3%
9
 
2.3%
2
 
0.5%
1
 
0.3%
1
 
0.3%
Dash Punctuation
ValueCountFrequency (%)
- 64
100.0%
Space Separator
ValueCountFrequency (%)
51
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 716
64.4%
Hangul 395
35.6%

Most frequent character per script

Common
ValueCountFrequency (%)
1 127
17.7%
2 83
11.6%
3 70
9.8%
- 64
8.9%
4 61
8.5%
51
7.1%
6 51
7.1%
5 50
 
7.0%
8 45
 
6.3%
9 39
 
5.4%
Other values (2) 75
10.5%
Hangul
ValueCountFrequency (%)
167
42.3%
132
33.4%
33
 
8.4%
25
 
6.3%
25
 
6.3%
9
 
2.3%
2
 
0.5%
1
 
0.3%
1
 
0.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 716
64.4%
Hangul 395
35.6%

Most frequent character per block

Hangul
ValueCountFrequency (%)
167
42.3%
132
33.4%
33
 
8.4%
25
 
6.3%
25
 
6.3%
9
 
2.3%
2
 
0.5%
1
 
0.3%
1
 
0.3%
ASCII
ValueCountFrequency (%)
1 127
17.7%
2 83
11.6%
3 70
9.8%
- 64
8.9%
4 61
8.5%
51
7.1%
6 51
7.1%
5 50
 
7.0%
8 45
 
6.3%
9 39
 
5.4%
Other values (2) 75
10.5%

지목
Categorical

IMBALANCE 

Distinct: 6
Distinct (%): 2.9%
Missing: 0
Missing (%): 0.0%
Memory size: 1.8 KiB
162 
33 
 
9
 
2
 
1

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 2
Unique (%): 1.0%

Sample

1st row
2nd row
3rd row
4th row
5th row

Common Values

ValueCountFrequency (%)
162
77.9%
33
 
15.9%
9
 
4.3%
2
 
1.0%
1
 
0.5%
1
 
0.5%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
162
77.9%
33
 
15.9%
9
 
4.3%
2
 
1.0%
1
 
0.5%
1
 
0.5%

지정면적(제곱미터)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 190
Distinct (%): 91.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3126.1923
Minimum: 58
Maximum: 128760
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.0 KiB

Quantile statistics

Minimum: 58
5-th percentile: 203.35
Q1: 715.5
Median: 1349
Q3: 2498
95-th percentile: 5437.15
Maximum: 128760
Range: 128702
Interquartile range (IQR): 1782.5

Descriptive statistics

Standard deviation: 11280.535
Coefficient of variation (CV): 3.6083945
Kurtosis: 90.016811
Mean: 3126.1923
Median Absolute Deviation (MAD): 809
Skewness: 9.1235735
Sum: 650248
Variance: 1.2725047 × 10^8
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
3000 | 5 | 2.4%
1249 | 3 | 1.4%
500 | 3 | 1.4%
1000 | 3 | 1.4%
2083 | 2 | 1.0%
1191 | 2 | 1.0%
1056 | 2 | 1.0%
300 | 2 | 1.0%
1341 | 2 | 1.0%
198 | 2 | 1.0%
Other values (180) | 182 | 87.5%

Minimum 10 values
Value | Count | Frequency (%)
58 | 1 | 0.5%
64 | 1 | 0.5%
71 | 1 | 0.5%
85 | 1 | 0.5%
91 | 1 | 0.5%
157 | 1 | 0.5%
197 | 1 | 0.5%
198 | 2 | 1.0%
200 | 1 | 0.5%
203 | 1 | 0.5%

Maximum 10 values
Value | Count | Frequency (%)
128760 | 1 | 0.5%
86083 | 1 | 0.5%
53554 | 1 | 0.5%
17872 | 1 | 0.5%
12265 | 1 | 0.5%
9398 | 1 | 0.5%
7458 | 1 | 0.5%
7449 | 1 | 0.5%
5834 | 1 | 0.5%
5685 | 1 | 0.5%
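
The summary figures for 지정면적(제곱미터) can be cross-checked against each other with simple arithmetic. The sketch below uses only numbers printed in this report (observation count, mean, standard deviation, median); the variable names are illustrative.

```python
n = 208              # number of observations
mean = 3126.1923     # reported mean, square metres
std = 11280.535      # reported standard deviation
median = 1349        # reported median

cv = std / mean                   # coefficient of variation
variance = std ** 2               # variance is the squared standard deviation
total = mean * n                  # should recover the reported sum
mean_to_median = mean / median    # a ratio well above 1 signals right skew

print(f"CV: {cv:.4f}")
print(f"Variance: {variance:.4e}")
print(f"Sum: {total:.0f}")
print(f"Mean / median: {mean_to_median:.2f}")
```

The mean is more than twice the median, in line with the reported skewness of 9.12: a handful of very large designations (128760, 86083, and 53554 square metres) pull the mean and standard deviation far above what is typical for the other rows, so the median and IQR describe a typical site better.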

취약지유형
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 1.8 KiB
토석류: 189
산사태: 19

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 토석류
2nd row: 토석류
3rd row: 토석류
4th row: 산사태
5th row: 토석류

Common Values

Value | Count | Frequency (%)
토석류 | 189 | 90.9%
산사태 | 19 | 9.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
토석류 | 189 | 90.9%
산사태 | 19 | 9.1%

등급(A-C)
Categorical

Distinct: 3
Distinct (%): 1.4%
Missing: 0
Missing (%): 0.0%
Memory size: 1.8 KiB
C: 91
B: 62
A: 55

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: C
2nd row: B
3rd row: A
4th row: C
5th row: A

Common Values

Value | Count | Frequency (%)
C | 91 | 43.8%
B | 62 | 29.8%
A | 55 | 26.4%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
c | 91 | 43.8%
b | 62 | 29.8%
a | 55 | 26.4%

고시번호
Text

Distinct: 182
Distinct (%): 87.5%
Missing: 0
Missing (%): 0.0%
Memory size: 1.8 KiB

Length

Max length: 10
Median length: 9
Mean length: 9.0384615
Min length: 8

Characters and Unicode

Total characters: 1880
Distinct characters: 13
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 180
Unique (%): 86.5%

Sample

1st row: 제2013-1호
2nd row: 제2016-1호
3rd row: 제2017-12호
4th row: 제2013-5호
5th row: 제2018-4호

Value | Count | Frequency (%)
제2023-205호 | 26 | 12.5%
제2017-11호 | 2 | 1.0%
제2015-37호 | 1 | 0.5%
제2015-79호 | 1 | 0.5%
제2013-1호 | 1 | 0.5%
제2015-36호 | 1 | 0.5%
제2015-38호 | 1 | 0.5%
제2015-87호 | 1 | 0.5%
제2015-109호 | 1 | 0.5%
제2015-110호 | 1 | 0.5%
Other values (172) | 172 | 82.7%
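
Every notice number shown in this section follows the pattern 제, a four-digit group, a hyphen, a serial number, and 호. The sketch below splits that pattern into two integers so the values can be grouped or sorted. Treating the four-digit group as the notice year is an assumption, and `parse_notice` is a hypothetical helper; the inputs are the five sample rows listed above.

```python
import re
from collections import Counter

# 제<4 digits>-<serial>호, e.g. 제2015-109호
NOTICE_RE = re.compile(r"^제(\d{4})-(\d+)호$")


def parse_notice(notice):
    """Split a notice number such as '제2013-1호' into (year, serial)."""
    match = NOTICE_RE.match(notice.strip())
    if match is None:
        raise ValueError(f"unexpected notice format: {notice!r}")
    return int(match.group(1)), int(match.group(2))


samples = ["제2013-1호", "제2016-1호", "제2017-12호", "제2013-5호", "제2018-4호"]
by_year = Counter(parse_notice(s)[0] for s in samples)

for year, count in sorted(by_year.items()):
    print(year, count)
```

Applied to the full column, the same grouping would show how designations are spread over time, and it would make the 26 rows sharing 제2023-205호 easy to isolate.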

Most occurring characters

Value | Count | Frequency (%)
2 | 303 | 16.1%
1 | 273 | 14.5%
0 | 261 | 13.9%
제 | 208 | 11.1%
- | 208 | 11.1%
호 | 208 | 11.1%
5 | 166 | 8.8%
3 | 71 | 3.8%
6 | 48 | 2.6%
7 | 45 | 2.4%
Other values (3) | 89 | 4.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1256
66.8%
Other Letter 416
 
22.1%
Dash Punctuation 208
 
11.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 303
24.1%
1 273
21.7%
0 261
20.8%
5 166
13.2%
3 71
 
5.7%
6 48
 
3.8%
7 45
 
3.6%
8 34
 
2.7%
4 30
 
2.4%
9 25
 
2.0%
Other Letter
Value | Count | Frequency (%)
제 | 208 | 50.0%
호 | 208 | 50.0%
Dash Punctuation
ValueCountFrequency (%)
- 208
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 1464
77.9%
Hangul 416
 
22.1%

Most frequent character per script

Common
ValueCountFrequency (%)
2 303
20.7%
1 273
18.6%
0 261
17.8%
- 208
14.2%
5 166
11.3%
3 71
 
4.8%
6 48
 
3.3%
7 45
 
3.1%
8 34
 
2.3%
4 30
 
2.0%
Hangul
Value | Count | Frequency (%)
제 | 208 | 50.0%
호 | 208 | 50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1464
77.9%
Hangul 416
 
22.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2 303
20.7%
1 273
18.6%
0 261
17.8%
- 208
14.2%
5 166
11.3%
3 71
 
4.8%
6 48
 
3.3%
7 45
 
3.1%
8 34
 
2.3%
4 30
 
2.0%
Hangul
Value | Count | Frequency (%)
제 | 208 | 50.0%
호 | 208 | 50.0%

Interactions

[Four interaction plots; images not reproduced.]

Correlations

Variable | 연번 | 행정동 | 법정리동 | 지목 | 지정면적(제곱미터) | 취약지유형 | 등급(A-C)
연번 | 1.000 | 0.889 | 0.995 | 0.047 | 0.041 | 0.102 | 0.206
행정동 | 0.889 | 1.000 | 0.991 | 0.000 | 0.765 | 0.000 | 0.333
법정리동 | 0.995 | 0.991 | 1.000 | 0.636 | 0.544 | 0.470 | 0.569
지목 | 0.047 | 0.000 | 0.636 | 1.000 | 0.000 | 0.094 | 0.068
지정면적(제곱미터) | 0.041 | 0.765 | 0.544 | 0.000 | 1.000 | 0.000 | 0.000
취약지유형 | 0.102 | 0.000 | 0.470 | 0.094 | 0.000 | 1.000 | 0.154
등급(A-C) | 0.206 | 0.333 | 0.569 | 0.068 | 0.000 | 0.154 | 1.000

Variable | 법정리동 | 행정동 | 지목 | 취약지유형 | 등급(A-C)
법정리동 | 1.000 | 0.824 | 0.296 | 0.349 | 0.286
행정동 | 0.824 | 1.000 | 0.000 | 0.000 | 0.202
지목 | 0.296 | 0.000 | 1.000 | 0.066 | 0.026
취약지유형 | 0.349 | 0.000 | 0.066 | 1.000 | 0.253
등급(A-C) | 0.286 | 0.202 | 0.026 | 0.253 | 1.000

Variable | 연번 | 지정면적(제곱미터) | 행정동 | 법정리동 | 지목 | 취약지유형 | 등급(A-C)
연번 | 1.000 | -0.007 | 0.651 | 0.849 | 0.020 | 0.075 | 0.122
지정면적(제곱미터) | -0.007 | 1.000 | 0.548 | 0.238 | 0.000 | 0.000 | 0.000
행정동 | 0.651 | 0.548 | 1.000 | 0.824 | 0.000 | 0.000 | 0.202
법정리동 | 0.849 | 0.238 | 0.824 | 1.000 | 0.296 | 0.349 | 0.286
지목 | 0.020 | 0.000 | 0.000 | 0.296 | 1.000 | 0.066 | 0.026
취약지유형 | 0.075 | 0.000 | 0.000 | 0.349 | 0.066 | 1.000 | 0.253
등급(A-C) | 0.122 | 0.000 | 0.202 | 0.286 | 0.026 | 0.253 | 1.000
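
The High correlation alerts in the overview can be traced back to the third matrix above (the one with -0.007 for 연번 against 지정면적(제곱미터)). The sketch below lists, for each variable, the other variables whose absolute coefficient reaches a cutoff. The cutoff of 0.5 is an assumption, since this report does not state the threshold, and `high_correlations` is a hypothetical helper.

```python
THRESHOLD = 0.5  # assumed cutoff; not stated in this report

labels = ["연번", "지정면적(제곱미터)", "행정동", "법정리동",
          "지목", "취약지유형", "등급(A-C)"]

# Coefficients copied from the third correlation matrix in this report.
matrix = [
    [1.000, -0.007, 0.651, 0.849, 0.020, 0.075, 0.122],
    [-0.007, 1.000, 0.548, 0.238, 0.000, 0.000, 0.000],
    [0.651, 0.548, 1.000, 0.824, 0.000, 0.000, 0.202],
    [0.849, 0.238, 0.824, 1.000, 0.296, 0.349, 0.286],
    [0.020, 0.000, 0.000, 0.296, 1.000, 0.066, 0.026],
    [0.075, 0.000, 0.000, 0.349, 0.066, 1.000, 0.253],
    [0.122, 0.000, 0.202, 0.286, 0.026, 0.253, 1.000],
]


def high_correlations(names, coeffs, threshold):
    """Map each variable to the others with |coefficient| >= threshold."""
    result = {}
    for i, name in enumerate(names):
        partners = [names[j] for j, value in enumerate(coeffs[i])
                    if j != i and abs(value) >= threshold]
        if partners:
            result[name] = partners
    return result


flagged = high_correlations(labels, matrix, THRESHOLD)
for name, partners in flagged.items():
    print(name, "->", ", ".join(partners))
```

With this cutoff the flagged variables are 연번, 지정면적(제곱미터), 행정동, and 법정리동, with the same partner counts as the alert text. The strong link between 연번 and the location columns most likely reflects that rows are listed district by district, not a substantive relationship.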

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

연번시군구행정동법정리동번지지목지정면적(제곱미터)취약지유형등급(A-C)고시번호
01경산시하양읍남하리산149구3000토석류C제2013-1호
12경산시하양읍남하리산127임417토석류B제2016-1호
23경산시하양읍남하리672임197토석류A제2017-12호
34경산시하양읍대곡리산653000산사태C제2013-5호
45경산시하양읍대곡리산190구1179토석류A제2018-4호
56경산시하양읍대곡리552구1766토석류C제2015-1호
67경산시하양읍대곡리산51임2040토석류C제2015-2호
78경산시하양읍대곡리산70-1임1932토석류C제2015-3호
89경산시하양읍대곡리산95-2임2953토석류C제2015-4호
910경산시하양읍대곡리산100임3500토석류C제2015-5호
연번시군구행정동법정리동번지지목지정면적(제곱미터)취약지유형등급(A-C)고시번호
198199경산시남부동백천동146-12임17872토석류C제2015-65호
199200경산시서부1동사정동산7300토석류C제2013-20호
200201경산시서부1동사정동314-4구2672토석류C제2015-52호
201202경산시서부1동사정동산2-1임984토석류C제2015-53호
202203경산시서부1동사정동산7임1353토석류C제2015-116호
203204경산시서부1동옥곡동산32-2임3274토석류C제2015-54호
204205경산시동부동유곡동산79-2임3318토석류B제2017-19호
205206경산시동부동유곡동360구2083토석류B제2018-2호
206207경산시동부동점촌동360천376토석류B제2017-11호
207208경산시동부동점촌동산37임4949산사태A제2017-22호