Dataset statistics
Number of variables | 4 |
---|---|
Number of observations | 32 |
Missing cells | 0 |
Missing cells (%) | 0.0% |
Duplicate rows | 0 |
Duplicate rows (%) | 0.0% |
Total size in memory | 1.2 KiB |
Average record size in memory | 38.0 B |
Variable types
Categorical | 2 |
---|---|
Numeric | 2 |
Dataset
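Description | Usage records of partner vendors under the customized welfare program (맞춤형복지), broken down by year and by spending category. Spending categories include self-development (자기계발), health care (건강관리), and leisure (여가생활), among others. |
---|---|
Author | 공무원연금공단 (Government Employees Pension Service) |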
URL | https://www.data.go.kr/data/3062872/fileData.do |
Alerts
이용건수 is highly correlated with 판매액 | High correlation |
---|---|
판매액 is highly correlated with 이용건수 | High correlation |
이용건수 has unique values | Unique |
판매액 has unique values | Unique |
Reproduction
Analysis started | 2022-11-19 08:49:23.259081 |
---|---|
Analysis finished | 2022-11-19 08:49:23.954976 |
Duration | 0.7 seconds |
Software version | pandas-profiling v3.2.0 |
Download configuration | config.json |
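A report like this one can be regenerated roughly as follows. This is a minimal sketch, not the exact invocation used here: the function name `build_report`, the file paths, and the `encoding` default are hypothetical, and it assumes pandas and pandas-profiling v3.2.0 are installed and that the CSV has been downloaded from the URL above.

```python
def build_report(csv_path, html_path, encoding="utf-8"):
    """Profile a CSV file and write the result as an HTML report.

    csv_path, html_path and encoding are illustrative parameters; the
    encoding of the published file is an assumption and may need changing.
    """
    # Imported lazily so the function can be defined without the packages.
    import pandas as pd
    from pandas_profiling import ProfileReport

    df = pd.read_csv(csv_path, encoding=encoding)
    profile = ProfileReport(df, title="Dataset profile")
    profile.to_file(html_path)
    return profile
```

Calling `build_report("data.csv", "report.html")` would write the HTML file; both file names are placeholders.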
구분
Categorical
Distinct | 8 |
---|---|
Distinct (%) | 25.0% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 384.0 B |
Length
Max length | 5 |
---|---|
Median length | 5 |
Mean length | 5 |
Min length | 5 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | 2021년 |
---|---|
2nd row | 2021년 |
3rd row | 2021년 |
4th row | 2021년 |
5th row | 2020년 |
Common Values
Value | Count | Frequency (%) |
---|---|---|
2021년 | 4 | 12.5% |
2020년 | 4 | 12.5% |
2019년 | 4 | 12.5% |
2018년 | 4 | 12.5% |
2017년 | 4 | 12.5% |
2016년 | 4 | 12.5% |
2015년 | 4 | 12.5% |
2014년 | 4 | 12.5% |
소비항목
Categorical
Distinct | 5 |
---|---|
Distinct (%) | 15.6% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 384.0 B |
Length
Max length | 4 |
---|---|
Median length | 4 |
Mean length | 4 |
Min length | 4 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | 자기계발 |
---|---|
2nd row | 건강관리 |
3rd row | 여가생활 |
4th row | 가정친화 |
5th row | 자기계발 |
Common Values
Value | Count | Frequency (%) |
---|---|---|
자기계발 | 8 | 25.0% |
건강관리 | 8 | 25.0% |
가정친화 | 8 | 25.0% |
여가활동 | 5 | 15.6% |
여가생활 | 3 | 9.4% |
이용건수
Numeric
Distinct | 32 |
---|---|
Distinct (%) | 100.0% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 167452.2188 |
Minimum | 138 |
---|---|
Maximum | 890368 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 416.0 B |
Quantile statistics
Minimum | 138 |
---|---|
5-th percentile | 1864.4 |
Q1 | 8919.5 |
median | 19059 |
Q3 | 147501 |
95-th percentile | 748002 |
Maximum | 890368 |
Range | 890230 |
Interquartile range (IQR) | 138581.5 |
Descriptive statistics
Standard deviation | 280498.5441 |
---|---|
Coefficient of variation (CV) | 1.675096014 |
Kurtosis | 0.9016869299 |
Mean | 167452.2188 |
Median Absolute Deviation (MAD) | 12223 |
Skewness | 1.530044541 |
Sum | 5358471 |
Variance | 7.867943324 × 10¹⁰ |
Monotonicity | Not monotonic |
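Several of the figures above follow from others by simple arithmetic. The short sketch below recomputes them from the values printed in this report for the numeric variable with minimum 138 and maximum 890368; the variable names are illustrative, and the inputs are the rounded values shown, so the last digits can differ slightly.

```python
# Values copied from the statistics tables above.
n = 32                      # number of observations
total = 5358471             # Sum
mean = 167452.2188          # Mean (rounded in the report)
std = 280498.5441           # Standard deviation (rounded in the report)
q1, q3 = 8919.5, 147501     # Q1 and Q3
minimum, maximum = 138, 890368

derived = {
    "mean": total / n,              # Sum divided by the number of observations
    "range": maximum - minimum,     # Maximum minus Minimum
    "iqr": q3 - q1,                 # Q3 minus Q1
    "cv": std / mean,               # Coefficient of variation
    "variance": std ** 2,           # Square of the standard deviation
}

for name, value in derived.items():
    print(f"{name}: {value}")
```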
Common Values
Value | Count | Frequency (%) |
---|---|---|
138 | 1 | 3.1% |
608949 | 1 | 3.1% |
459442 | 1 | 3.1% |
31045 | 1 | 3.1% |
890368 | 1 | 3.1% |
4870 | 1 | 3.1% |
10880 | 1 | 3.1% |
858244 | 1 | 3.1% |
1640 | 1 | 3.1% |
5872 | 1 | 3.1% |
Other values (22) | 22 | 68.8% |
Minimum 10 values
Value | Count | Frequency (%) |
---|---|---|
138 | 1 | 3.1% |
1640 | 1 | 3.1% |
2048 | 1 | 3.1% |
4278 | 1 | 3.1% |
4870 | 1 | 3.1% |
5872 | 1 | 3.1% |
6599 | 1 | 3.1% |
8552 | 1 | 3.1% |
9042 | 1 | 3.1% |
10880 | 1 | 3.1% |
Maximum 10 values
Value | Count | Frequency (%) |
---|---|---|
890368 | 1 | 3.1% |
858244 | 1 | 3.1% |
657804 | 1 | 3.1% |
608949 | 1 | 3.1% |
567298 | 1 | 3.1% |
546013 | 1 | 3.1% |
459442 | 1 | 3.1% |
403581 | 1 | 3.1% |
62141 | 1 | 3.1% |
31045 | 1 | 3.1% |
판매액
Numeric
Distinct | 32 |
---|---|
Distinct (%) | 100.0% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 12457682.44 |
Minimum | 39801 |
---|---|
Maximum | 66802101 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 416.0 B |
Quantile statistics
Minimum | 39801 |
---|---|
5-th percentile | 230173.75 |
Q1 | 2660618 |
median | 3924835 |
Q3 | 10557159.5 |
95-th percentile | 52096875.6 |
Maximum | 66802101 |
Range | 66762300 |
Interquartile range (IQR) | 7896541.5 |
Descriptive statistics
Standard deviation | 18331620.62 |
---|---|
Coefficient of variation (CV) | 1.471511311 |
Kurtosis | 2.163073746 |
Mean | 12457682.44 |
Median Absolute Deviation (MAD) | 1666380.5 |
Skewness | 1.766791928 |
Sum | 398645838 |
Variance | 3.360483145 × 10¹⁴ |
Monotonicity | Not monotonic |
Common Values
Value | Count | Frequency (%) |
---|---|---|
39801 | 1 | 3.1% |
36870538 | 1 | 3.1% |
26620218 | 1 | 3.1% |
3889103 | 1 | 3.1% |
66802101 | 1 | 3.1% |
216982 | 1 | 3.1% |
3708046 | 1 | 3.1% |
58806968 | 1 | 3.1% |
2137013 | 1 | 3.1% |
240967 | 1 | 3.1% |
Other values (22) | 22 | 68.8% |
Minimum 10 values
Value | Count | Frequency (%) |
---|---|---|
39801 | 1 | 3.1% |
216982 | 1 | 3.1% |
240967 | 1 | 3.1% |
313895 | 1 | 3.1% |
612139 | 1 | 3.1% |
832183 | 1 | 3.1% |
1218097 | 1 | 3.1% |
2137013 | 1 | 3.1% |
2835153 | 1 | 3.1% |
2843290 | 1 | 3.1% |
Maximum 10 values
Value | Count | Frequency (%) |
---|---|---|
66802101 | 1 | 3.1% |
58806968 | 1 | 3.1% |
46606800 | 1 | 3.1% |
36870538 | 1 | 3.1% |
35722325 | 1 | 3.1% |
31695210 | 1 | 3.1% |
26620218 | 1 | 3.1% |
25819316 | 1 | 3.1% |
5469774 | 1 | 3.1% |
5341609 | 1 | 3.1% |
Pearson's r
Pearson's correlation coefficient (r) measures the linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, which implies that for a linear function the angle to the x-axis does not affect r.
To calculate r for two variables X and Y, divide the covariance of X and Y by the product of their standard deviations.
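The calculation can be sketched in plain Python as follows. The function name `pearson_r` is illustrative, and the sample values are only the ten rows listed under "First rows" in this report, not all 32 observations, so the result will not match the coefficient computed for the full dataset.

```python
import math

# 이용건수 and 판매액 from the ten rows shown under "First rows".
usage = [138, 12442, 31045, 890368, 4870, 10880, 858244, 1640, 5872, 9042]
sales = [39801, 3946677, 3889103, 66802101, 216982,
         3708046, 58806968, 2137013, 240967, 2845511]

def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # The 1/(n-1) factors of the covariance and of both standard
    # deviations cancel, so plain sums of products are enough.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

print(pearson_r(usage, sales))
```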
Spearman's ρ
Spearman's rank correlation coefficient (ρ) measures the monotonic correlation between two variables, and is therefore better than Pearson's r at catching nonlinear monotonic correlations. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.
To calculate ρ for two variables X and Y, divide the covariance of the rank variables of X and Y by the product of their standard deviations.
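A minimal sketch of that recipe: replace each value by its rank, then apply the Pearson formula to the ranks. The helper names are illustrative, and averaging the ranks of tied values is an assumption about tie handling that the text above does not specify.

```python
import math

def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson_r(x, y):
    """Covariance divided by the product of the standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def spearman_rho(x, y):
    """Pearson correlation of the rank variables of x and y."""
    return pearson_r(average_ranks(x), average_ranks(y))

# A monotonic but nonlinear relationship still gives a perfect rank correlation.
print(spearman_rho([1, 2, 3, 4], [1, 8, 27, 64]))
```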
Kendall's τ
Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.
To calculate τ for two variables X and Y, count the concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
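The pair-counting definition can be written directly, as in the sketch below. It implements the formula exactly as stated (often called tau-a), with the illustrative name `kendall_tau_a`; library implementations frequently use a tie-adjusted variant instead, so results can differ when the data contain ties.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1   # both variables move in the same direction
        elif sign < 0:
            discordant += 1   # the variables move in opposite directions
        # Pairs tied in x or y count as neither concordant nor discordant.
    n = len(x)
    total_pairs = n * (n - 1) / 2
    return (concordant - discordant) / total_pairs

# Two concordant pairs and one discordant pair out of three.
print(kendall_tau_a([1, 2, 3], [1, 3, 2]))
```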
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.
Cramér's V (φc)
Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been proved to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
First rows
 | 구분 | 소비항목 | 이용건수 | 판매액 |
---|---|---|---|---|
0 | 2021년 | 자기계발 | 138 | 39801 |
1 | 2021년 | 건강관리 | 12442 | 3946677 |
2 | 2021년 | 여가생활 | 31045 | 3889103 |
3 | 2021년 | 가정친화 | 890368 | 66802101 |
4 | 2020년 | 자기계발 | 4870 | 216982 |
5 | 2020년 | 건강관리 | 10880 | 3708046 |
6 | 2020년 | 여가생활 | 858244 | 58806968 |
7 | 2020년 | 가정친화 | 1640 | 2137013 |
8 | 2019년 | 자기계발 | 5872 | 240967 |
9 | 2019년 | 건강관리 | 9042 | 2845511 |
Last rows
 | 구분 | 소비항목 | 이용건수 | 판매액 |
---|---|---|---|---|
22 | 2016년 | 여가활동 | 18809 | 4952600 |
23 | 2016년 | 가정친화 | 546013 | 31695210 |
24 | 2015년 | 자기계발 | 23545 | 832183 |
25 | 2015년 | 건강관리 | 19709 | 3902993 |
26 | 2015년 | 여가활동 | 11656 | 4371353 |
27 | 2015년 | 가정친화 | 403581 | 25819316 |
28 | 2014년 | 자기계발 | 62141 | 2843290 |
29 | 2014년 | 건강관리 | 19309 | 4403357 |
30 | 2014년 | 여가활동 | 13931 | 4145821 |
31 | 2014년 | 가정친화 | 459442 | 26620218 |