Overview

Dataset statistics

Number of variables: 9
Number of observations: 58
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 4.5 KiB
Average record size in memory: 80.3 B

Variable types

Categorical: 5
Numeric: 4

Dataset

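Description: Provides the annual status of local tax levies and exemptions (taxable and non-taxable), broken down by tax item, for 2018 through 2022. It is used to gauge the scale of tax benefits granted to the public.
URL: https://www.data.go.kr/data/15078527/fileData.do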

Alerts

시도명 has constant value "" (Constant)
시군구명 has constant value "" (Constant)
자치단체코드 has constant value "" (Constant)
과세건수 is highly overall correlated with 과세금액 and 3 other fields (High correlation)
과세금액 is highly overall correlated with 과세건수 and 3 other fields (High correlation)
비과세건수 is highly overall correlated with 과세건수 and 3 other fields (High correlation)
비과세금액 is highly overall correlated with 과세건수 and 3 other fields (High correlation)
세목명 is highly overall correlated with 과세건수 and 3 other fields (High correlation)
과세건수 has 14 (24.1%) zeros (Zeros)
과세금액 has 14 (24.1%) zeros (Zeros)
비과세건수 has 23 (39.7%) zeros (Zeros)
비과세금액 has 23 (39.7%) zeros (Zeros)
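
The Zeros and Constant alerts can be sanity-checked with plain counting. The sketch below is a minimal illustration and not ydata-profiling's own implementation; the helper names `zero_share` and `is_constant` are hypothetical, and the inputs are the counts reported in this section.

```python
def zero_share(zero_count: int, n_rows: int) -> float:
    """Return the percentage of rows equal to zero, rounded to one decimal."""
    return round(100 * zero_count / n_rows, 1)


def is_constant(values) -> bool:
    """A column is constant when it holds exactly one distinct value."""
    return len(set(values)) == 1


N_ROWS = 58  # number of observations reported in the overview

print(zero_share(14, N_ROWS))  # 과세건수 and 과세금액: 24.1
print(zero_share(23, N_ROWS))  # 비과세건수 and 비과세금액: 39.7
print(is_constant(["대구광역시"] * N_ROWS))  # 시도명: True
```

Both percentages match the alert text, which suggests the alerts are computed over all 58 rows.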

Reproduction

Analysis started: 2023-12-12 13:43:30.005323
Analysis finished: 2023-12-12 13:43:32.262017
Duration: 2.26 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
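
A report like this one could be regenerated along the following lines. This is a sketch under stated assumptions, not the exact command used for this analysis: the CSV file name, the encoding default, and the report title are hypothetical, and the original `config.json` is not applied here. It assumes `pandas` and `ydata-profiling` are installed.

```python
def build_report(csv_path: str, html_path: str = "report.html",
                 encoding: str = "utf-8") -> None:
    """Profile a CSV file and write the HTML report to html_path."""
    # Imported lazily so this module still loads where the libraries are absent.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # The encoding is an assumption; adjust it if the downloaded file differs.
    df = pd.read_csv(csv_path, encoding=encoding)

    # The title is illustrative only.
    profile = ProfileReport(df, title="Local tax taxation and non-taxation status")
    profile.to_file(html_path)


# Example call (hypothetical file name, downloaded from the dataset URL first):
# build_report("local_tax_status.csv")
```

Timings will differ from the 2.26 seconds recorded above depending on the machine and library versions.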

Variables

시도명
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 1.7%
Missing: 0
Missing (%): 0.0%
Memory size: 596.0 B

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Figure: Histogram of lengths of the category

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 대구광역시
2nd row: 대구광역시
3rd row: 대구광역시
4th row: 대구광역시
5th row: 대구광역시

Common Values

Value  Count  Frequency (%)
대구광역시  58  100.0%

시군구명
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 1.7%
Missing: 0
Missing (%): 0.0%
Memory size: 596.0 B

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Figure: Histogram of lengths of the category

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 서구
2nd row: 서구
3rd row: 서구
4th row: 서구
5th row: 서구

Common Values

Value  Count  Frequency (%)
서구  58  100.0%

자치단체코드
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 1.7%
Missing: 0
Missing (%): 0.0%
Memory size: 596.0 B

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Figure: Histogram of lengths of the category

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 27170
2nd row: 27170
3rd row: 27170
4th row: 27170
5th row: 27170

Common Values

Value  Count  Frequency (%)
27170  58  100.0%

과세년도
Categorical

Distinct: 5
Distinct (%): 8.6%
Missing: 0
Missing (%): 0.0%
Memory size: 596.0 B

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Figure: Histogram of lengths of the category

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2018
2nd row: 2018
3rd row: 2018
4th row: 2018
5th row: 2018

Common Values

Value  Count  Frequency (%)
2018   12     20.7%
2019   12     20.7%
2021   12     20.7%
2022   12     20.7%
2020   10     17.2%
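
The frequency percentages for 과세년도 follow directly from the row counts. The sketch below rebuilds them with the standard library; the function name `frequency_table` is hypothetical and the counts are copied from the Common Values listing for this variable.

```python
from collections import Counter

# Row counts per tax year as reported for 과세년도.
year_counts = Counter({"2018": 12, "2019": 12, "2021": 12, "2022": 12, "2020": 10})
total = sum(year_counts.values())  # 58 observations


def frequency_table(counts: Counter) -> list[tuple[str, int, float]]:
    """Return (value, count, percentage) rows, most frequent first."""
    n = sum(counts.values())
    return [(value, count, round(100 * count / n, 1))
            for value, count in counts.most_common()]


for value, count, pct in frequency_table(year_counts):
    print(f"{value}  {count}  {pct}%")

# Share of distinct values among all rows.
distinct_pct = round(100 * len(year_counts) / total, 1)
print(f"Distinct: {len(year_counts)} ({distinct_pct}%)")
```

The 2020 count of 10 rather than 12 means two tax items have no row for that year.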

세목명
Categorical

HIGH CORRELATION 

Distinct: 12
Distinct (%): 20.7%
Missing: 0
Missing (%): 0.0%
Memory size: 596.0 B

Length

Max length: 7
Median length: 5
Mean length: 4.2241379
Min length: 3

Figure: Histogram of lengths of the category

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 취득세
2nd row: 주민세
3rd row: 재산세
4th row: 자동차세
5th row: 레저세

Common Values

Value  Count  Frequency (%)
취득세  5  8.6%
주민세  5  8.6%
재산세  5  8.6%
자동차세  5  8.6%
레저세  5  8.6%
지방소비세  5  8.6%
등록면허세  5  8.6%
지역자원시설세  5  8.6%
지방소득세  5  8.6%
교육세  5  8.6%
Other values (2)  8  13.8%

과세건수
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 45
Distinct (%): 77.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 72190.776
Minimum: 0
Maximum: 358307
Zeros: 14
Zeros (%): 24.1%
Negative: 0
Negative (%): 0.0%
Memory size: 654.0 B

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 5.5
Median: 65983
Q3: 91313.5
95-th percentile: 328656.05
Maximum: 358307
Range: 358307
Interquartile range (IQR): 91308

Descriptive statistics

Standard deviation: 91290.227
Coefficient of variation (CV): 1.2645691
Kurtosis: 4.1686231
Mean: 72190.776
Median Absolute Deviation (MAD): 42839.5
Skewness: 2.1069469
Sum: 4187065
Variance: 8.3339055 × 10^9
Monotonicity: Not monotonic
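
Several of the 과세건수 statistics are derived from others, so they can be cross-checked against each other. The sketch below recomputes the mean, range, IQR, coefficient of variation, and variance from the reported sum, extremes, quartiles, and standard deviation. It only checks internal consistency; it does not recompute anything from the raw data.

```python
import math

# Values copied from the 과세건수 summary in this section.
n_rows = 58
total = 4187065
minimum, maximum = 0, 358307
q1, q3 = 5.5, 91313.5
std = 91290.227

mean = total / n_rows          # sum divided by number of observations
value_range = maximum - minimum
iqr = q3 - q1                  # interquartile range
cv = std / mean                # coefficient of variation
variance = std ** 2

print(f"mean     = {mean:.3f}")
print(f"range    = {value_range}")
print(f"IQR      = {iqr}")
print(f"CV       = {cv:.4f}")
print(f"variance = {variance:.4e}")

# The recomputed CV agrees with the reported 1.2645691 up to rounding of std.
assert math.isclose(cv, 1.2645691, rel_tol=1e-5)
```

A CV above 1 together with a skewness of about 2.1 is consistent with a few large tax items dominating the counts while 14 rows are zero.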
Figure: Histogram with fixed size bins (bins=45)

Value  Count  Frequency (%)
0  14  24.1%
22067  1  1.7%
18945  1  1.7%
21290  1  1.7%
81232  1  1.7%
91222  1  1.7%
107746  1  1.7%
7  1  1.7%
67480  1  1.7%
69658  1  1.7%
Other values (35)  35  60.3%

Value  Count  Frequency (%)
0  14  24.1%
5  1  1.7%
7  1  1.7%
9  1  1.7%
45  1  1.7%
18945  1  1.7%
21290  1  1.7%
22067  1  1.7%
24941  1  1.7%
25442  1  1.7%

Value  Count  Frequency (%)
358307  1  1.7%
350605  1  1.7%
332334  1  1.7%
328007  1  1.7%
324042  1  1.7%
118038  1  1.7%
112702  1  1.7%
110264  1  1.7%
107746  1  1.7%
105040  1  1.7%

과세금액
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 45
Distinct (%): 77.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.223041 × 10^10
Minimum: 0
Maximum: 6.0016258 × 10^10
Zeros: 14
Zeros (%): 24.1%
Negative: 0
Negative (%): 0.0%
Memory size: 654.0 B

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 8.02952 × 10^8
Median: 5.2293525 × 10^9
Q3: 1.8748484 × 10^10
95-th percentile: 4.2322479 × 10^10
Maximum: 6.0016258 × 10^10
Range: 6.0016258 × 10^10
Interquartile range (IQR): 1.7945532 × 10^10

Descriptive statistics

Standard deviation: 1.4773914 × 10^10
Coefficient of variation (CV): 1.2079656
Kurtosis: 1.7267012
Mean: 1.223041 × 10^10
Median Absolute Deviation (MAD): 5.2293525 × 10^9
Skewness: 1.5112347
Sum: 7.0936376 × 10^11
Variance: 2.1826852 × 10^20
Monotonicity: Not monotonic
Figure: Histogram with fixed size bins (bins=45)

Value  Count  Frequency (%)
0  14  24.1%
39834709000  1  1.7%
54149421000  1  1.7%
60016258000  1  1.7%
4549270000  1  1.7%
31638316000  1  1.7%
12236327000  1  1.7%
5294284000  1  1.7%
5244744000  1  1.7%
3117967000  1  1.7%
Other values (35)  35  60.3%

Value  Count  Frequency (%)
0  14  24.1%
95389000  1  1.7%
2925641000  1  1.7%
2998662000  1  1.7%
3005089000  1  1.7%
3117967000  1  1.7%
3227182000  1  1.7%
3736627000  1  1.7%
3947449000  1  1.7%
4003689000  1  1.7%

Value  Count  Frequency (%)
60016258000  1  1.7%
54149421000  1  1.7%
45425507000  1  1.7%
41774886000  1  1.7%
39834709000  1  1.7%
33974519000  1  1.7%
31638316000  1  1.7%
29672275000  1  1.7%
28502500000  1  1.7%
28025706000  1  1.7%

비과세건수
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 35
Distinct (%): 60.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 12200.328
Minimum: 0
Maximum: 59043
Zeros: 23
Zeros (%): 39.7%
Negative: 0
Negative (%): 0.0%
Memory size: 654.0 B

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 564
Q3: 28149.25
95-th percentile: 53610.55
Maximum: 59043
Range: 59043
Interquartile range (IQR): 28149.25

Descriptive statistics

Standard deviation: 18131.942
Coefficient of variation (CV): 1.4861849
Kurtosis: 0.32158838
Mean: 12200.328
Median Absolute Deviation (MAD): 564
Skewness: 1.2526685
Sum: 707619
Variance: 3.2876732 × 10^8
Monotonicity: Not monotonic
Figure: Histogram with fixed size bins (bins=35)

Value  Count  Frequency (%)
0  23  39.7%
13  2  3.4%
595  1  1.7%
19  1  1.7%
32281  1  1.7%
21201  1  1.7%
32671  1  1.7%
53569  1  1.7%
1581  1  1.7%
32741  1  1.7%
Other values (25)  25  43.1%

Value  Count  Frequency (%)
0  23  39.7%
11  1  1.7%
12  1  1.7%
13  2  3.4%
19  1  1.7%
557  1  1.7%
571  1  1.7%
595  1  1.7%
656  1  1.7%
1008  1  1.7%

Value  Count  Frequency (%)
59043  1  1.7%
55980  1  1.7%
53846  1  1.7%
53569  1  1.7%
52082  1  1.7%
34616  1  1.7%
33250  1  1.7%
32741  1  1.7%
32671  1  1.7%
32281  1  1.7%

비과세금액
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 33
Distinct (%): 56.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.2717546 × 10^9
Minimum: 0
Maximum: 3.9781043 × 10^10
Zeros: 23
Zeros (%): 39.7%
Negative: 0
Negative (%): 0.0%
Memory size: 654.0 B

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 24168000
Q3: 1.9171092 × 10^9
95-th percentile: 3.5677165 × 10^10
Maximum: 3.9781043 × 10^10
Range: 3.9781043 × 10^10
Interquartile range (IQR): 1.9171092 × 10^9

Descriptive statistics

Standard deviation: 1.0406411 × 10^10
Coefficient of variation (CV): 2.4360975
Kurtosis: 6.3771645
Mean: 4.2717546 × 10^9
Median Absolute Deviation (MAD): 24168000
Skewness: 2.7632869
Sum: 2.4776177 × 10^11
Variance: 1.0829338 × 10^20
Monotonicity: Not monotonic
Figure: Histogram with fixed size bins (bins=33)

Value  Count  Frequency (%)
0  23  39.7%
2000  4  6.9%
33491403000  1  1.7%
236702000  1  1.7%
3000  1  1.7%
39781043000  1  1.7%
1476061000  1  1.7%
9544876000  1  1.7%
1971878000  1  1.7%
37041000  1  1.7%
Other values (23)  23  39.7%

Value  Count  Frequency (%)
0  23  39.7%
2000  4  6.9%
3000  1  1.7%
23279000  1  1.7%
25057000  1  1.7%
29147000  1  1.7%
37041000  1  1.7%
51099000  1  1.7%
222844000  1  1.7%
222928000  1  1.7%

Value  Count  Frequency (%)
39781043000  1  1.7%
38401356000  1  1.7%
36637764000  1  1.7%
35507648000  1  1.7%
33491403000  1  1.7%
10178307000  1  1.7%
9544876000  1  1.7%
8927434000  1  1.7%
8648840000  1  1.7%
8278943000  1  1.7%

Interactions

Figures: 16 interaction plots (images not reproduced here)

Correlations

            과세년도  세목명  과세건수  과세금액  비과세건수  비과세금액
과세년도    1.000     0.000   0.000     0.000     0.000       0.000
세목명      0.000     1.000   0.957     0.858     0.959       0.967
과세건수    0.000     0.957   1.000     0.793     0.709       0.336
과세금액    0.000     0.858   0.793     1.000     0.768       0.918
비과세건수  0.000     0.959   0.709     0.768     1.000       0.690
비과세금액  0.000     0.967   0.336     0.918     0.690       1.000

            과세년도  세목명
과세년도    1.000     0.000
세목명      0.000     1.000

            과세건수  과세금액  비과세건수  비과세금액  과세년도  세목명
과세건수    1.000     0.530     0.665       0.595       0.000     0.846
과세금액    0.530     1.000     0.606       0.639       0.000     0.570
비과세건수  0.665     0.606     1.000       0.962       0.000     0.668
비과세금액  0.595     0.639     0.962       1.000       0.000     0.707
과세년도    0.000     0.000     0.000       0.000       1.000     0.000
세목명      0.846     0.570     0.668       0.707       0.000     1.000
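
The High correlation alerts can be related back to these coefficients. The sketch below counts, for each field, how many other fields reach a cut-off in the last correlation table of this section. The cut-off of 0.5 is an assumption (the report does not state the value it used), and `high_correlation_partners` is a hypothetical helper, not a ydata-profiling function.

```python
FIELDS = ["과세건수", "과세금액", "비과세건수", "비과세금액", "과세년도", "세목명"]

# Last correlation table in this section, rows and columns in FIELDS order.
MATRIX = [
    [1.000, 0.530, 0.665, 0.595, 0.000, 0.846],
    [0.530, 1.000, 0.606, 0.639, 0.000, 0.570],
    [0.665, 0.606, 1.000, 0.962, 0.000, 0.668],
    [0.595, 0.639, 0.962, 1.000, 0.000, 0.707],
    [0.000, 0.000, 0.000, 0.000, 1.000, 0.000],
    [0.846, 0.570, 0.668, 0.707, 0.000, 1.000],
]

THRESHOLD = 0.5  # assumed cut-off; not stated anywhere in the report


def high_correlation_partners(fields, matrix, threshold):
    """Map each field to the other fields whose coefficient meets the threshold."""
    partners = {}
    for i, name in enumerate(fields):
        partners[name] = [
            other for j, other in enumerate(fields)
            if i != j and abs(matrix[i][j]) >= threshold
        ]
    return partners


for name, others in high_correlation_partners(FIELDS, MATRIX, THRESHOLD).items():
    print(f"{name}: {len(others)} highly correlated field(s) {others}")
```

Under this assumed cut-off, each of the five flagged fields has four partners, which matches the "and 3 other fields" wording of the alerts, while 과세년도 has none.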

Missing values

Figure: A simple visualization of nullity by column.
Figure: Nullity matrix, a data-dense display that lets you quickly pick out patterns in data completion.

Sample

    시도명      시군구명  자치단체코드  과세년도  세목명          과세건수  과세금액     비과세건수  비과세금액
0   대구광역시  서구      27170         2018      취득세          22067     39834709000  32741       33491403000
1   대구광역시  서구      27170         2018      주민세          92464     4699885000   16579       981134000
2   대구광역시  서구      27170         2018      재산세          92456     26321180000  30659       8278943000
3   대구광역시  서구      27170         2018      자동차세        118038    13095486000  59043       2345093000
4   대구광역시  서구      27170         2018      레저세          0         0            0           0
5   대구광역시  서구      27170         2018      담배소비세      0         0            0           0
6   대구광역시  서구      27170         2018      지방소비세      0         0            0           0
7   대구광역시  서구      27170         2018      등록면허세      69139     4003689000   1311        23279000
8   대구광역시  서구      27170         2018      도시계획세      0         0            0           0
9   대구광역시  서구      27170         2018      지역자원시설세  76234     2925641000   1008        222844000

    시도명      시군구명  자치단체코드  과세년도  세목명          과세건수  과세금액     비과세건수  비과세금액
48  대구광역시  서구      27170         2022      재산세          91457     33974519000  33250       10178307000
49  대구광역시  서구      27170         2022      자동차세        105040    11880451000  52082       2028502000
50  대구광역시  서구      27170         2022      레저세          45        95389000     0           0
51  대구광역시  서구      27170         2022      담배소비세      0         0            0           0
52  대구광역시  서구      27170         2022      지방소비세      9         11623250000  0           0
53  대구광역시  서구      27170         2022      등록면허세      67654     3736627000   5554        51099000
54  대구광역시  서구      27170         2022      도시계획세      0         0            0           0
55  대구광역시  서구      27170         2022      지역자원시설세  64486     3227182000   656         257972000
56  대구광역시  서구      27170         2022      지방소득세      64041     28502500000  0           0
57  대구광역시  서구      27170         2022      교육세          324042    11638748000  13          2000