Overview

Dataset statistics

Number of variables: 13
Number of observations: 1446
Missing cells: 1435
Missing cells (%): 7.6%
Duplicate rows: 1
Duplicate rows (%): 0.1%
Total size in memory: 155.5 KiB
Average record size in memory: 110.1 B
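The percentages and the average record size above follow from the raw counts. The sketch below redoes that arithmetic using only figures reported in this section; the variable names are illustrative, and the conversion assumes 1 KiB = 1024 bytes.

```python
# Figures copied from the "Dataset statistics" block of this report.
n_variables = 13
n_observations = 1446
missing_cells = 1435
duplicate_rows = 1
total_size_kib = 155.5

# Every observation contributes one cell per variable.
total_cells = n_variables * n_observations

missing_pct = 100 * missing_cells / total_cells
duplicate_pct = 100 * duplicate_rows / n_observations
# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"total cells:     {total_cells}")
print(f"missing cells:   {missing_pct:.1f}%")
print(f"duplicate rows:  {duplicate_pct:.1f}%")
print(f"avg record size: {avg_record_bytes:.1f} B")
```

Rounded to one decimal place, the three derived values agree with the reported 7.6%, 0.1% and 110.1 B.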

Variable types

Categorical: 3
Text: 3
Numeric: 6
Boolean: 1

Dataset

Description: Status of health functional food import businesses (건강기능식품수입업 현황)
Author: Ministry of the Interior and Safety (행정안전부)
URL: https://data.gg.go.kr/portal/data/service/selectServicePage.do?&infId=HFRH1RQ3S4BKY6CW5N6X13353646&infSeq=1

Alerts

다중이용업소여부 has constant value "" (Constant)
Dataset has 1 (0.1%) duplicate rows (Duplicates)
위생업종명 is highly overall correlated with 인허가일자 and 7 other fields (High correlation)
시군명 is highly overall correlated with 소재지우편번호 and 3 other fields (High correlation)
영업상태명 is highly overall correlated with 폐업일자 and 1 other fields (High correlation)
인허가일자 is highly overall correlated with 폐업일자 and 2 other fields (High correlation)
폐업일자 is highly overall correlated with 인허가일자 and 3 other fields (High correlation)
년도 is highly overall correlated with 인허가일자 and 2 other fields (High correlation)
소재지우편번호 is highly overall correlated with WGS84경도 and 2 other fields (High correlation)
WGS84위도 is highly overall correlated with 시군명 and 1 other fields (High correlation)
WGS84경도 is highly overall correlated with 소재지우편번호 and 2 other fields (High correlation)
위생업종명 is highly imbalanced (87.4%) (Imbalance)
폐업일자 has 830 (57.4%) missing values (Missing)
년도 has 25 (1.7%) missing values (Missing)
다중이용업소여부 has 25 (1.7%) missing values (Missing)
소재지도로명주소 has 163 (11.3%) missing values (Missing)
소재지우편번호 has 121 (8.4%) missing values (Missing)
WGS84위도 has 135 (9.3%) missing values (Missing)
WGS84경도 has 135 (9.3%) missing values (Missing)
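The per-column missing counts in these alerts can be cross-checked against the overall "Missing cells" figure of 1435. The sketch below sums them. The alerts account for 1434 cells; the remaining one is taken from the address variable whose summary block reports a single missing value, assumed here to be 소재지지번주소 on the basis of its sample values.

```python
# Missing counts per column as reported in this profile.
missing_by_column = {
    "폐업일자": 830,
    "년도": 25,
    "다중이용업소여부": 25,
    "소재지도로명주소": 163,
    "소재지우편번호": 121,
    "WGS84위도": 135,
    "WGS84경도": 135,
    # Assumption: the unlabeled text variable with "Missing 1" is 소재지지번주소.
    "소재지지번주소": 1,
}
n_observations = 1446

total_missing = sum(missing_by_column.values())
missing_pct = {
    column: round(100 * count / n_observations, 1)
    for column, count in missing_by_column.items()
}

print("total missing cells:", total_missing)
for column, pct in missing_pct.items():
    print(f"{column}: {pct}%")
```

Note that the 830 missing 폐업일자 values equal the number of establishments whose status is 운영중, so most of the missingness is structural rather than a data-quality defect.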

Reproduction

Analysis started: 2023-12-10 21:20:47.400960
Analysis finished: 2023-12-10 21:20:52.948437
Duration: 5.55 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

시군명
Categorical

HIGH CORRELATION 

Distinct30
Distinct (%)2.1%
Missing0
Missing (%)0.0%
Memory size11.4 KiB
성남시
294 
고양시
219 
안양시
127 
수원시
109 
용인시
91 
Other values (25)
606 

Length

Max length4
Median length3
Mean length3.0373444
Min length3

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row가평군
2nd row가평군
3rd row가평군
4th row가평군
5th row고양시

Common Values

ValueCountFrequency (%)
성남시 294
20.3%
고양시 219
15.1%
안양시 127
 
8.8%
수원시 109
 
7.5%
용인시 91
 
6.3%
부천시 87
 
6.0%
안산시 53
 
3.7%
화성시 49
 
3.4%
김포시 42
 
2.9%
군포시 31
 
2.1%
Other values (20) 344
23.8%

Length

Histogram of lengths of the category

사업장명
Text

Distinct: 1276
Distinct (%): 88.2%
Missing: 0
Missing (%): 0.0%
Memory size: 11.4 KiB

Length

Max length40
Median length27
Mean length7.5345781
Min length2

Characters and Unicode

Total characters10895
Distinct characters522
Distinct categories9 ?
Distinct scripts3 ?
Distinct blocks2 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1127 ?
Unique (%)77.9%

Sample

1st row(주)나눔생명과학연구소
2nd row더블케이
3rd row히말
4th row나눔생명과학연구소
5th row월간암
ValueCountFrequency (%)
주식회사 58
 
3.5%
10
 
0.6%
korea 7
 
0.4%
인터내셔널 6
 
0.4%
글로벌 5
 
0.3%
주)씨알 4
 
0.2%
코리아 4
 
0.2%
인터내셔날 4
 
0.2%
주)건강사랑 4
 
0.2%
시카고헬스코리아 3
 
0.2%
Other values (1346) 1531
93.6%

Most occurring characters

ValueCountFrequency (%)
808
 
7.4%
) 762
 
7.0%
( 754
 
6.9%
509
 
4.7%
334
 
3.1%
263
 
2.4%
234
 
2.1%
228
 
2.1%
190
 
1.7%
190
 
1.7%
Other values (512) 6623
60.8%

Most occurring categories

ValueCountFrequency (%)
Other Letter 8530
78.3%
Close Punctuation 762
 
7.0%
Open Punctuation 754
 
6.9%
Uppercase Letter 372
 
3.4%
Lowercase Letter 242
 
2.2%
Space Separator 190
 
1.7%
Other Punctuation 33
 
0.3%
Decimal Number 8
 
0.1%
Dash Punctuation 4
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
808
 
9.5%
509
 
6.0%
334
 
3.9%
263
 
3.1%
234
 
2.7%
228
 
2.7%
190
 
2.2%
147
 
1.7%
134
 
1.6%
130
 
1.5%
Other values (456) 5553
65.1%
Uppercase Letter
ValueCountFrequency (%)
N 33
 
8.9%
A 33
 
8.9%
E 27
 
7.3%
K 24
 
6.5%
I 23
 
6.2%
H 21
 
5.6%
L 19
 
5.1%
B 19
 
5.1%
T 19
 
5.1%
O 19
 
5.1%
Other values (15) 135
36.3%
Lowercase Letter
ValueCountFrequency (%)
e 38
15.7%
a 32
13.2%
r 25
10.3%
o 21
8.7%
i 20
8.3%
t 18
7.4%
n 15
 
6.2%
l 15
 
6.2%
s 10
 
4.1%
h 8
 
3.3%
Other values (11) 40
16.5%
Other Punctuation
ValueCountFrequency (%)
. 17
51.5%
& 13
39.4%
, 3
 
9.1%
Decimal Number
ValueCountFrequency (%)
2 5
62.5%
5 2
 
25.0%
1 1
 
12.5%
Close Punctuation
ValueCountFrequency (%)
) 762
100.0%
Open Punctuation
ValueCountFrequency (%)
( 754
100.0%
Space Separator
ValueCountFrequency (%)
190
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 4
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 8530
78.3%
Common 1751
 
16.1%
Latin 614
 
5.6%

Most frequent character per script

Hangul
ValueCountFrequency (%)
808
 
9.5%
509
 
6.0%
334
 
3.9%
263
 
3.1%
234
 
2.7%
228
 
2.7%
190
 
2.2%
147
 
1.7%
134
 
1.6%
130
 
1.5%
Other values (456) 5553
65.1%
Latin
ValueCountFrequency (%)
e 38
 
6.2%
N 33
 
5.4%
A 33
 
5.4%
a 32
 
5.2%
E 27
 
4.4%
r 25
 
4.1%
K 24
 
3.9%
I 23
 
3.7%
o 21
 
3.4%
H 21
 
3.4%
Other values (36) 337
54.9%
Common
ValueCountFrequency (%)
) 762
43.5%
( 754
43.1%
190
 
10.9%
. 17
 
1.0%
& 13
 
0.7%
2 5
 
0.3%
- 4
 
0.2%
, 3
 
0.2%
5 2
 
0.1%
1 1
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
Hangul 8530
78.3%
ASCII 2365
 
21.7%

Most frequent character per block

Hangul
ValueCountFrequency (%)
808
 
9.5%
509
 
6.0%
334
 
3.9%
263
 
3.1%
234
 
2.7%
228
 
2.7%
190
 
2.2%
147
 
1.7%
134
 
1.6%
130
 
1.5%
Other values (456) 5553
65.1%
ASCII
ValueCountFrequency (%)
) 762
32.2%
( 754
31.9%
190
 
8.0%
e 38
 
1.6%
N 33
 
1.4%
A 33
 
1.4%
a 32
 
1.4%
E 27
 
1.1%
r 25
 
1.1%
K 24
 
1.0%
Other values (46) 447
18.9%

인허가일자
Real number (ℝ)

HIGH CORRELATION 

Distinct1038
Distinct (%)71.8%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean20087036
Minimum19971101
Maximum20160203
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size12.8 KiB

Quantile statistics

Minimum19971101
5-th percentile20040306
Q120050526
median20081108
Q320121219
95-th percentile20150717
Maximum20160203
Range189102
Interquartile range (IQR)70692.75

Descriptive statistics

Standard deviation: 41910.871
Coefficient of variation (CV): 0.0020864637
Kurtosis: -0.86210985
Mean: 20087036
Median Absolute Deviation (MAD): 39205.5
Skewness: -0.073887842
Sum: 2.9045854 × 10^10
Variance: 1.7565211 × 10^9
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
19971101 11
 
0.8%
20040730 8
 
0.6%
20040723 7
 
0.5%
20040622 5
 
0.3%
20041029 5
 
0.3%
20090401 4
 
0.3%
20040714 4
 
0.3%
20040614 4
 
0.3%
20140707 4
 
0.3%
20050421 4
 
0.3%
Other values (1028) 1390
96.1%
ValueCountFrequency (%)
19971101 11
0.8%
19980403 1
 
0.1%
19980404 1
 
0.1%
19980616 1
 
0.1%
19980806 1
 
0.1%
19981125 1
 
0.1%
19981210 1
 
0.1%
19981214 2
 
0.1%
19990323 1
 
0.1%
19990615 1
 
0.1%
ValueCountFrequency (%)
20160203 1
0.1%
20160201 2
0.1%
20160129 1
0.1%
20160128 2
0.1%
20160121 1
0.1%
20160115 1
0.1%
20160114 2
0.1%
20160113 1
0.1%
20160107 2
0.1%
20160106 1
0.1%
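인허가일자 is profiled as a real number, but values such as 19971101 and 20160203 look like dates encoded as YYYYMMDD integers. If that reading is right (the report does not say so), the mean, range and variance above are not meaningful as date statistics. The sketch below shows one way to convert such integers with the standard library; `parse_yyyymmdd` is a hypothetical helper, not part of the profiling tool.

```python
from datetime import date, datetime


def parse_yyyymmdd(value):
    """Convert an integer such as 19971101 to a date; return None if it is not a valid date."""
    try:
        return datetime.strptime(str(int(value)), "%Y%m%d").date()
    except (TypeError, ValueError):
        return None


first = parse_yyyymmdd(19971101)  # reported minimum
last = parse_yyyymmdd(20160203)   # reported maximum

# Span in calendar days versus the numeric "Range" shown in the report.
span_days = (last - first).days
numeric_range = 20160203 - 19971101

print(first, last)
print("calendar span in days:", span_days)
print("numeric range:", numeric_range)
```

The numeric range reproduces the reported 189102, which is a difference between encoded integers rather than a number of days. The same conversion would apply to 폐업일자.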

영업상태명
Categorical

HIGH CORRELATION 

Distinct2
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Memory size11.4 KiB
운영중
830 
폐업 등
616 

Length

Max length4
Median length3
Mean length3.4260028
Min length3

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row운영중
2nd row운영중
3rd row운영중
4th row폐업 등
5th row운영중

Common Values

ValueCountFrequency (%)
운영중 830
57.4%
폐업 등 616
42.6%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
운영중 830
40.3%
폐업 616
29.9%
등 616
29.9%

폐업일자
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct393
Distinct (%)63.8%
Missing830
Missing (%)57.4%
Infinite0
Infinite (%)0.0%
Mean20096406
Minimum20040712
Maximum20160225
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size12.8 KiB

Quantile statistics

Minimum20040712
5-th percentile20041231
Q120060620
median20100609
Q320131104
95-th percentile20151110
Maximum20160225
Range119513
Interquartile range (IQR)70484

Descriptive statistics

Standard deviation: 40206.673
Coefficient of variation (CV): 0.0020006897
Kurtosis: -1.4876394
Mean: 20096406
Median Absolute Deviation (MAD): 39807
Skewness: -0.062426433
Sum: 1.2379386 × 10^10
Variance: 1.6165765 × 10^9
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
20041231 108
 
7.5%
20060728 20
 
1.4%
20060906 10
 
0.7%
20051231 7
 
0.5%
20151230 4
 
0.3%
20090608 4
 
0.3%
20070615 4
 
0.3%
20070501 4
 
0.3%
20070515 3
 
0.2%
20110307 3
 
0.2%
Other values (383) 449
31.1%
(Missing) 830
57.4%
ValueCountFrequency (%)
20040712 1
 
0.1%
20040827 1
 
0.1%
20040922 1
 
0.1%
20041203 1
 
0.1%
20041209 1
 
0.1%
20041213 1
 
0.1%
20041220 1
 
0.1%
20041221 1
 
0.1%
20041231 108
7.5%
20050107 1
 
0.1%
ValueCountFrequency (%)
20160225 1
 
0.1%
20160201 2
0.1%
20160125 1
 
0.1%
20160121 1
 
0.1%
20160119 1
 
0.1%
20160115 1
 
0.1%
20160113 1
 
0.1%
20160108 1
 
0.1%
20151231 1
 
0.1%
20151230 4
0.3%

년도
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct6
Distinct (%)0.4%
Missing25
Missing (%)1.7%
Infinite0
Infinite (%)0.0%
Mean2011.8142
Minimum2011
Maximum2016
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size12.8 KiB

Quantile statistics

Minimum2011
5-th percentile2011
Q12011
median2011
Q32012
95-th percentile2015
Maximum2016
Range5
Interquartile range (IQR)1

Descriptive statistics

Standard deviation1.38069
Coefficient of variation (CV)0.000686291
Kurtosis0.61039495
Mean2011.8142
Median Absolute Deviation (MAD)0
Skewness1.4403943
Sum2858788
Variance1.9063048
MonotonicityNot monotonic
Histogram with fixed size bins (bins=6)
ValueCountFrequency (%)
2011 985
68.1%
2015 118
 
8.2%
2013 113
 
7.8%
2014 105
 
7.3%
2012 89
 
6.2%
2016 11
 
0.8%
(Missing) 25
 
1.7%
ValueCountFrequency (%)
2011 985
68.1%
2012 89
 
6.2%
2013 113
 
7.8%
2014 105
 
7.3%
2015 118
 
8.2%
2016 11
 
0.8%
ValueCountFrequency (%)
2016 11
 
0.8%
2015 118
 
8.2%
2014 105
 
7.3%
2013 113
 
7.8%
2012 89
 
6.2%
2011 985
68.1%

다중이용업소여부
Boolean

CONSTANT  MISSING 

Distinct1
Distinct (%)0.1%
Missing25
Missing (%)1.7%
Memory size3.0 KiB
False
1421 
(Missing)
 
25
ValueCountFrequency (%)
False 1421
98.3%
(Missing) 25
 
1.7%

위생업종명
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct2
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Memory size11.4 KiB
건강기능식품수입업
1421 
<NA>
 
25

Length

Max length9
Median length9
Mean length8.9135546
Min length4

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row건강기능식품수입업
2nd row건강기능식품수입업
3rd row건강기능식품수입업
4th row건강기능식품수입업
5th row건강기능식품수입업

Common Values

ValueCountFrequency (%)
건강기능식품수입업 1421
98.3%
<NA> 25
 
1.7%
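The 87.4% imbalance alert for 위생업종명 is consistent with a score of one minus the normalized Shannon entropy of the class counts. That formula is an assumption about how the score is derived, since the report itself does not state it. A minimal sketch, with `imbalance_score` as an illustrative helper name:

```python
import math


def imbalance_score(counts):
    """Return 1 - H/H_max, where H is the Shannon entropy of the class counts."""
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    if len(probs) < 2:
        # A single class has no entropy to normalize against.
        return 0.0
    entropy = -sum(p * math.log2(p) for p in probs)
    return 1 - entropy / math.log2(len(probs))


# Counts reported above: 1421 rows of 건강기능식품수입업 and 25 rows of "<NA>".
score = imbalance_score([1421, 25])
print(f"imbalance: {100 * score:.1f}%")
```

A perfectly balanced two-class column scores 0 under this definition, and the score approaches 1 as one class dominates.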

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
건강기능식품수입업 1421
98.3%
na 25
 
1.7%

소재지도로명주소
Text

Distinct: 1238
Distinct (%): 96.5%
Missing: 163
Missing (%): 11.3%
Memory size: 11.4 KiB

Length

Max length60
Median length45
Mean length31.790335
Min length13

Characters and Unicode

Total characters40787
Distinct characters439
Distinct categories12 ?
Distinct scripts3 ?
Distinct blocks4 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1198 ?
Unique (%)93.4%

Sample

1st row경기도 가평군 설악면 유명로 1691-14
2nd row경기도 가평군 청평면 구청평로 24-14
3rd row경기도 가평군 청평면 경춘로 277-24
4th row경기도 고양시 일산동구 무궁화로 42-38, 406호 (장항동, 범진빌딩)
5th row경기도 고양시 덕양구 무원로36번길 4 (행신동, 지하1층)
ValueCountFrequency (%)
경기도 1283
 
15.3%
성남시 264
 
3.2%
고양시 200
 
2.4%
분당구 195
 
2.3%
일산동구 118
 
1.4%
안양시 102
 
1.2%
수원시 98
 
1.2%
동안구 80
 
1.0%
부천시 80
 
1.0%
용인시 79
 
0.9%
Other values (2558) 5863
70.1%

Most occurring characters

ValueCountFrequency (%)
7108
 
17.4%
1 1526
 
3.7%
1371
 
3.4%
1345
 
3.3%
1326
 
3.3%
1315
 
3.2%
1310
 
3.2%
1252
 
3.1%
, 1111
 
2.7%
2 1062
 
2.6%
Other values (429) 22061
54.1%

Most occurring categories

ValueCountFrequency (%)
Other Letter 23160
56.8%
Space Separator 7108
 
17.4%
Decimal Number 7104
 
17.4%
Other Punctuation 1112
 
2.7%
Open Punctuation 926
 
2.3%
Close Punctuation 926
 
2.3%
Dash Punctuation 322
 
0.8%
Uppercase Letter 115
 
0.3%
Lowercase Letter 8
 
< 0.1%
Math Symbol 3
 
< 0.1%
Other values (2) 3
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
1371
 
5.9%
1345
 
5.8%
1326
 
5.7%
1315
 
5.7%
1310
 
5.7%
1252
 
5.4%
861
 
3.7%
636
 
2.7%
520
 
2.2%
513
 
2.2%
Other values (391) 12711
54.9%
Uppercase Letter
ValueCountFrequency (%)
B 41
35.7%
A 20
17.4%
C 17
14.8%
K 7
 
6.1%
T 7
 
6.1%
S 5
 
4.3%
I 5
 
4.3%
E 4
 
3.5%
D 4
 
3.5%
F 2
 
1.7%
Other values (3) 3
 
2.6%
Decimal Number
ValueCountFrequency (%)
1 1526
21.5%
2 1062
14.9%
0 804
11.3%
3 801
11.3%
4 678
9.5%
5 569
 
8.0%
6 481
 
6.8%
7 435
 
6.1%
8 417
 
5.9%
9 331
 
4.7%
Lowercase Letter
ValueCountFrequency (%)
c 3
37.5%
a 1
 
12.5%
k 1
 
12.5%
s 1
 
12.5%
w 1
 
12.5%
j 1
 
12.5%
Other Punctuation
ValueCountFrequency (%)
, 1111
99.9%
/ 1
 
0.1%
Space Separator
ValueCountFrequency (%)
7108
100.0%
Open Punctuation
ValueCountFrequency (%)
( 926
100.0%
Close Punctuation
ValueCountFrequency (%)
) 926
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 322
100.0%
Math Symbol
ValueCountFrequency (%)
~ 3
100.0%
Letter Number
ValueCountFrequency (%)
2
100.0%
Other Symbol
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 23160
56.8%
Common 17502
42.9%
Latin 125
 
0.3%

Most frequent character per script

Hangul
ValueCountFrequency (%)
1371
 
5.9%
1345
 
5.8%
1326
 
5.7%
1315
 
5.7%
1310
 
5.7%
1252
 
5.4%
861
 
3.7%
636
 
2.7%
520
 
2.2%
513
 
2.2%
Other values (391) 12711
54.9%
Latin
ValueCountFrequency (%)
B 41
32.8%
A 20
16.0%
C 17
13.6%
K 7
 
5.6%
T 7
 
5.6%
S 5
 
4.0%
I 5
 
4.0%
E 4
 
3.2%
D 4
 
3.2%
c 3
 
2.4%
Other values (10) 12
 
9.6%
Common
ValueCountFrequency (%)
7108
40.6%
1 1526
 
8.7%
, 1111
 
6.3%
2 1062
 
6.1%
( 926
 
5.3%
) 926
 
5.3%
0 804
 
4.6%
3 801
 
4.6%
4 678
 
3.9%
5 569
 
3.3%
Other values (8) 1991
 
11.4%

Most occurring blocks

ValueCountFrequency (%)
Hangul 23160
56.8%
ASCII 17624
43.2%
Number Forms 2
 
< 0.1%
Enclosed Alphanum 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
7108
40.3%
1 1526
 
8.7%
, 1111
 
6.3%
2 1062
 
6.0%
( 926
 
5.3%
) 926
 
5.3%
0 804
 
4.6%
3 801
 
4.5%
4 678
 
3.8%
5 569
 
3.2%
Other values (26) 2113
 
12.0%
Hangul
ValueCountFrequency (%)
1371
 
5.9%
1345
 
5.8%
1326
 
5.7%
1315
 
5.7%
1310
 
5.7%
1252
 
5.4%
861
 
3.7%
636
 
2.7%
520
 
2.2%
513
 
2.2%
Other values (391) 12711
54.9%
Number Forms
ValueCountFrequency (%)
2
100.0%
Enclosed Alphanum
ValueCountFrequency (%)
1
100.0%

소재지지번주소
Text

Distinct: 1318
Distinct (%): 91.2%
Missing: 1
Missing (%): 0.1%
Memory size: 11.4 KiB

Length

Max length52
Median length42
Mean length28.42699
Min length7

Characters and Unicode

Total characters41077
Distinct characters423
Distinct categories12 ?
Distinct scripts3 ?
Distinct blocks4 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1260 ?
Unique (%)87.2%

Sample

1st row경기도 가평군 설악면 선촌리 375-5번지
2nd row경기도 가평군 청평면 청평리 669-37번지
3rd row경기도 가평군 청평면 대성리 321-5번지
4th row경기도 가평군
5th row경기도 고양시 일산동구 장항동 775번지 범진빌딩 406호
ValueCountFrequency (%)
경기도 1444
 
16.7%
성남시 294
 
3.4%
고양시 219
 
2.5%
분당구 196
 
2.3%
안양시 125
 
1.4%
일산동구 113
 
1.3%
수원시 109
 
1.3%
1층 98
 
1.1%
용인시 91
 
1.1%
2층 88
 
1.0%
Other values (2634) 5884
67.9%

Most occurring characters

ValueCountFrequency (%)
7397
 
18.0%
1 1762
 
4.3%
1582
 
3.9%
1503
 
3.7%
1488
 
3.6%
1483
 
3.6%
1469
 
3.6%
1454
 
3.5%
1320
 
3.2%
2 1145
 
2.8%
Other values (413) 20474
49.8%

Most occurring categories

ValueCountFrequency (%)
Other Letter 24188
58.9%
Decimal Number 8165
 
19.9%
Space Separator 7397
 
18.0%
Dash Punctuation 1014
 
2.5%
Uppercase Letter 160
 
0.4%
Open Punctuation 50
 
0.1%
Close Punctuation 49
 
0.1%
Other Punctuation 40
 
0.1%
Lowercase Letter 8
 
< 0.1%
Math Symbol 3
 
< 0.1%
Other values (2) 3
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
1582
 
6.5%
1503
 
6.2%
1488
 
6.2%
1483
 
6.1%
1469
 
6.1%
1454
 
6.0%
1320
 
5.5%
906
 
3.7%
733
 
3.0%
566
 
2.3%
Other values (368) 11684
48.3%
Uppercase Letter
ValueCountFrequency (%)
B 57
35.6%
A 23
14.4%
C 21
 
13.1%
T 9
 
5.6%
I 9
 
5.6%
D 9
 
5.6%
K 8
 
5.0%
S 6
 
3.8%
E 5
 
3.1%
G 3
 
1.9%
Other values (7) 10
 
6.2%
Decimal Number
ValueCountFrequency (%)
1 1762
21.6%
2 1145
14.0%
0 954
11.7%
3 898
11.0%
4 741
9.1%
5 714
8.7%
6 573
 
7.0%
7 527
 
6.5%
8 439
 
5.4%
9 412
 
5.0%
Lowercase Letter
ValueCountFrequency (%)
c 2
25.0%
e 1
12.5%
j 1
12.5%
s 1
12.5%
k 1
12.5%
a 1
12.5%
w 1
12.5%
Other Punctuation
ValueCountFrequency (%)
, 36
90.0%
/ 2
 
5.0%
@ 1
 
2.5%
& 1
 
2.5%
Space Separator
ValueCountFrequency (%)
7397
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1014
100.0%
Open Punctuation
ValueCountFrequency (%)
( 50
100.0%
Close Punctuation
ValueCountFrequency (%)
) 49
100.0%
Math Symbol
ValueCountFrequency (%)
~ 3
100.0%
Letter Number
ValueCountFrequency (%)
2
100.0%
Other Symbol
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 24188
58.9%
Common 16719
40.7%
Latin 170
 
0.4%

Most frequent character per script

Hangul
ValueCountFrequency (%)
1582
 
6.5%
1503
 
6.2%
1488
 
6.2%
1483
 
6.1%
1469
 
6.1%
1454
 
6.0%
1320
 
5.5%
906
 
3.7%
733
 
3.0%
566
 
2.3%
Other values (368) 11684
48.3%
Latin
ValueCountFrequency (%)
B 57
33.5%
A 23
13.5%
C 21
 
12.4%
T 9
 
5.3%
I 9
 
5.3%
D 9
 
5.3%
K 8
 
4.7%
S 6
 
3.5%
E 5
 
2.9%
G 3
 
1.8%
Other values (15) 20
 
11.8%
Common
ValueCountFrequency (%)
7397
44.2%
1 1762
 
10.5%
2 1145
 
6.8%
- 1014
 
6.1%
0 954
 
5.7%
3 898
 
5.4%
4 741
 
4.4%
5 714
 
4.3%
6 573
 
3.4%
7 527
 
3.2%
Other values (10) 994
 
5.9%

Most occurring blocks

ValueCountFrequency (%)
Hangul 24188
58.9%
ASCII 16886
41.1%
Number Forms 2
 
< 0.1%
Enclosed Alphanum 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
7397
43.8%
1 1762
 
10.4%
2 1145
 
6.8%
- 1014
 
6.0%
0 954
 
5.6%
3 898
 
5.3%
4 741
 
4.4%
5 714
 
4.2%
6 573
 
3.4%
7 527
 
3.1%
Other values (33) 1161
 
6.9%
Hangul
ValueCountFrequency (%)
1582
 
6.5%
1503
 
6.2%
1488
 
6.2%
1483
 
6.1%
1469
 
6.1%
1454
 
6.0%
1320
 
5.5%
906
 
3.7%
733
 
3.0%
566
 
2.3%
Other values (368) 11684
48.3%
Number Forms
ValueCountFrequency (%)
2
100.0%
Enclosed Alphanum
ValueCountFrequency (%)
1
100.0%

소재지우편번호
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct707
Distinct (%)53.4%
Missing121
Missing (%)8.4%
Infinite0
Infinite (%)0.0%
Mean403764.39
Minimum10055
Maximum487922
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size12.8 KiB

Quantile statistics

Minimum10055
5-th percentile14594
Q1413815
median441849
Q3463804
95-th percentile472861
Maximum487922
Range477867
Interquartile range (IQR)49989

Descriptive statistics

Standard deviation: 125706.28
Coefficient of variation (CV): 0.31133574
Kurtosis: 5.522549
Mean: 403764.39
Median Absolute Deviation (MAD): 21976
Skewness: -2.6846909
Sum: 5.3498782 × 10^8
Variance: 1.580207 × 10^10
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
410837 54
 
3.7%
410835 31
 
2.1%
463824 22
 
1.5%
463825 22
 
1.5%
463847 18
 
1.2%
412827 15
 
1.0%
431815 14
 
1.0%
462807 13
 
0.9%
463741 10
 
0.7%
463828 10
 
0.7%
Other values (697) 1116
77.2%
(Missing) 121
 
8.4%
ValueCountFrequency (%)
10055 1
0.1%
10219 1
0.1%
10399 1
0.1%
10401 1
0.1%
10414 2
0.1%
10445 1
0.1%
12731 1
0.1%
12772 2
0.1%
12801 1
0.1%
13135 1
0.1%
ValueCountFrequency (%)
487922 1
0.1%
487833 1
0.1%
487826 2
0.1%
487822 1
0.1%
487816 1
0.1%
487803 1
0.1%
487050 1
0.1%
487030 1
0.1%
483100 1
0.1%
483030 1
0.1%
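The smallest values of 소재지우편번호 have five digits (for example 10055) while the largest have six (for example 487922). This likely reflects a mix of South Korea's older six-digit postal codes and the five-digit codes introduced in 2015; that is an interpretation, not something the report states. It also suggests that the numeric statistics above say little about the codes themselves. The sketch below groups codes by digit count; `postal_code_format` is a hypothetical helper, and the input values are copied from the quantile block.

```python
from collections import Counter


def postal_code_format(value):
    """Classify a numerically stored postal code by its number of digits."""
    if value is None:
        return "missing"
    digits = str(int(value))
    if len(digits) == 5:
        return "5-digit"
    if len(digits) == 6:
        return "6-digit"
    return "other"


# Minimum, 5th percentile, Q1, median and maximum from the report, plus one missing value.
codes = [10055, 14594, 413815, 441849, 487922, None]
format_counts = Counter(postal_code_format(code) for code in codes)
print(dict(format_counts))
```

Because the column is stored as a number, any code with a leading zero would lose it; treating the field as text would avoid that.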

WGS84위도
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct1063
Distinct (%)81.1%
Missing135
Missing (%)9.3%
Infinite0
Infinite (%)0.0%
Mean37.440266
Minimum36.938533
Maximum38.040758
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size12.8 KiB

Quantile statistics

Minimum36.938533
5-th percentile37.144309
Q137.334135
median37.401119
Q337.60703
95-th percentile37.733969
Maximum38.040758
Range1.1022254
Interquartile range (IQR)0.27289548

Descriptive statistics

Standard deviation0.17893198
Coefficient of variation (CV)0.0047791322
Kurtosis-0.1519932
Mean37.440266
Median Absolute Deviation (MAD)0.10211344
Skewness0.12442556
Sum49084.189
Variance0.032016655
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
37.3889401776 16
 
1.1%
37.366142335 12
 
0.8%
37.3443549674 11
 
0.8%
37.640363152 8
 
0.6%
37.3795300021 6
 
0.4%
37.3945555957 6
 
0.4%
37.340506942 5
 
0.3%
37.6399272494 5
 
0.3%
37.370051815 5
 
0.3%
37.4319270019 4
 
0.3%
Other values (1053) 1233
85.3%
(Missing) 135
 
9.3%
ValueCountFrequency (%)
36.9385325442 1
0.1%
36.9448447964 1
0.1%
36.9584835375 1
0.1%
36.9694743993 1
0.1%
36.979235186 1
0.1%
36.9892555319 1
0.1%
36.9893331167 1
0.1%
36.9903076154 1
0.1%
36.9910653124 1
0.1%
36.9919555069 1
0.1%
ValueCountFrequency (%)
38.0407579057 1
0.1%
37.9459864374 1
0.1%
37.9176978791 1
0.1%
37.8984375157 1
0.1%
37.8962602159 1
0.1%
37.8926900343 1
0.1%
37.879075276 1
0.1%
37.8774641574 1
0.1%
37.8705975218 1
0.1%
37.8570785916 1
0.1%

WGS84경도
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct1063
Distinct (%)81.1%
Missing135
Missing (%)9.3%
Infinite0
Infinite (%)0.0%
Mean126.98481
Minimum126.56106
Maximum127.63325
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size12.8 KiB

Quantile statistics

Minimum126.56106
5-th percentile126.74593
Q1126.81759
median127.00033
Q3127.11743
95-th percentile127.22129
Maximum127.63325
Range1.0721942
Interquartile range (IQR)0.29984004

Descriptive statistics

Standard deviation0.17631972
Coefficient of variation (CV)0.0013885103
Kurtosis-0.16477015
Mean126.98481
Median Absolute Deviation (MAD)0.13072268
Skewness0.21421931
Sum166477.08
Variance0.031088643
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
127.122552825 16
 
1.1%
127.1053883207 12
 
0.8%
127.10510721 11
 
0.8%
126.7867096575 8
 
0.6%
127.1142414695 6
 
0.4%
126.9598855251 6
 
0.4%
127.1080851942 5
 
0.3%
126.7861697718 5
 
0.3%
126.9530364977 5
 
0.3%
127.1774287775 4
 
0.3%
Other values (1053) 1233
85.3%
(Missing) 135
 
9.3%
ValueCountFrequency (%)
126.5610598959 1
0.1%
126.5746807386 2
0.1%
126.578023211 1
0.1%
126.5881963706 2
0.1%
126.5887889164 1
0.1%
126.5890984365 1
0.1%
126.5897002858 1
0.1%
126.5924003108 1
0.1%
126.6084104948 1
0.1%
126.6263124124 1
0.1%
ValueCountFrequency (%)
127.6332540954 1
0.1%
127.6056599911 1
0.1%
127.6016773811 1
0.1%
127.5573407149 1
0.1%
127.5562184664 1
0.1%
127.5559884317 1
0.1%
127.5380760033 1
0.1%
127.5348579944 1
0.1%
127.4971426938 1
0.1%
127.493686935 1
0.1%

Interactions

[36 interaction plots; figures not reproduced]

Correlations

Variable | 시군명 | 인허가일자 | 영업상태명 | 폐업일자 | 년도 | 소재지우편번호 | WGS84위도 | WGS84경도
시군명 | 1.000 | 0.096 | 0.216 | 0.226 | 0.000 | 0.992 | 0.966 | 0.965
인허가일자 | 0.096 | 1.000 | 0.423 | 0.793 | 1.000 | 0.000 | 0.085 | 0.077
영업상태명 | 0.216 | 0.423 | 1.000 | NaN | 0.249 | 0.042 | 0.124 | 0.068
폐업일자 | 0.226 | 0.793 | NaN | 1.000 | 0.639 | 0.103 | 0.153 | 0.175
년도 | 0.000 | 1.000 | 0.249 | 0.639 | 1.000 | 0.000 | 0.068 | 0.059
소재지우편번호 | 0.992 | 0.000 | 0.042 | 0.103 | 0.000 | 1.000 | 0.637 | 0.760
WGS84위도 | 0.966 | 0.085 | 0.124 | 0.153 | 0.068 | 0.637 | 1.000 | 0.700
WGS84경도 | 0.965 | 0.077 | 0.068 | 0.175 | 0.059 | 0.760 | 0.700 | 1.000

Variable | 위생업종명 | 시군명 | 영업상태명
위생업종명 | 1.000 | 1.000 | 1.000
시군명 | 1.000 | 1.000 | 0.170
영업상태명 | 1.000 | 0.170 | 1.000

Variable | 인허가일자 | 폐업일자 | 년도 | 소재지우편번호 | WGS84위도 | WGS84경도 | 시군명 | 영업상태명 | 위생업종명
인허가일자 | 1.000 | 0.767 | 0.816 | -0.005 | -0.010 | -0.006 | 0.040 | 0.338 | 1.000
폐업일자 | 0.767 | 1.000 | 0.572 | 0.063 | -0.005 | 0.009 | 0.072 | 1.000 | 1.000
년도 | 0.816 | 0.572 | 1.000 | -0.001 | -0.005 | -0.000 | 0.000 | 0.286 | 1.000
소재지우편번호 | -0.005 | 0.063 | -0.001 | 1.000 | -0.232 | 0.827 | 0.889 | 0.073 | 1.000
WGS84위도 | -0.010 | -0.005 | -0.005 | -0.232 | 1.000 | -0.330 | 0.716 | 0.095 | 1.000
WGS84경도 | -0.006 | 0.009 | -0.000 | 0.827 | -0.330 | 1.000 | 0.712 | 0.052 | 1.000
시군명 | 0.040 | 0.072 | 0.000 | 0.889 | 0.716 | 0.712 | 1.000 | 0.170 | 1.000
영업상태명 | 0.338 | 1.000 | 0.286 | 0.073 | 0.095 | 0.052 | 0.170 | 1.000 | 1.000
위생업종명 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
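The three-variable matrix above covers only the categorical columns, which suggests a categorical association measure such as Cramér's V. The report text does not name the measure, so treat this as an assumption. The sketch below computes the plain (uncorrected) Cramér's V for a contingency table; the two example tables are hypothetical and are not taken from this dataset.

```python
import math


def cramers_v(table):
    """Uncorrected Cramér's V for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    k = min(len(row_totals), len(col_totals)) - 1
    if n == 0 or k <= 0:
        return 0.0
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                chi2 += (observed - expected) ** 2 / expected
    return math.sqrt(chi2 / (n * k))


# Hypothetical tables: one perfectly associated, one independent.
perfect = cramers_v([[10, 0], [0, 10]])
independent = cramers_v([[5, 5], [5, 5]])
print(perfect, independent)
```

Profiling tools may apply a bias correction to this statistic, so values computed this way need not match the matrix exactly.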

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.

Sample

First rows

Index | 시군명 | 사업장명 | 인허가일자 | 영업상태명 | 폐업일자 | 년도 | 다중이용업소여부 | 위생업종명 | 소재지도로명주소 | 소재지지번주소 | 소재지우편번호 | WGS84위도 | WGS84경도
0 | 가평군 | (주)나눔생명과학연구소 | 20040913 | 운영중 | <NA> | 2011 | N | 건강기능식품수입업 | 경기도 가평군 설악면 유명로 1691-14 | 경기도 가평군 설악면 선촌리 375-5번지 | 477854 | 37.677328 | 127.483817
1 | 가평군 | 더블케이 | 20140605 | 운영중 | <NA> | 2014 | N | 건강기능식품수입업 | 경기도 가평군 청평면 구청평로 24-14 | 경기도 가평군 청평면 청평리 669-37번지 | 477813 | 37.730079 | 127.411368
2 | 가평군 | 히말 | 20131113 | 운영중 | <NA> | 2013 | N | 건강기능식품수입업 | 경기도 가평군 청평면 경춘로 277-24 | 경기도 가평군 청평면 대성리 321-5번지 | 477812 | 37.700781 | 127.386254
3 | 가평군 | 나눔생명과학연구소 | 20020415 | 폐업 등 | 20041231 | 2011 | N | 건강기능식품수입업 | <NA> | 경기도 가평군 | <NA> | <NA> | <NA>
4 | 고양시 | 월간암 | 20120215 | 운영중 | <NA> | 2012 | N | 건강기능식품수입업 | 경기도 고양시 일산동구 무궁화로 42-38, 406호 (장항동, 범진빌딩) | 경기도 고양시 일산동구 장항동 775번지 범진빌딩 406호 | 410837 | 37.661831 | 126.770189
5 | 고양시 | 네츄럴 하우스 | 20120322 | 운영중 | <NA> | 2012 | N | 건강기능식품수입업 | 경기도 고양시 덕양구 무원로36번길 4 (행신동, 지하1층) | 경기도 고양시 덕양구 행신동 719-1번지 지하1층 | 412825 | 37.615836 | 126.832652
6 | 고양시 | 유에스일공일 | 20120504 | 운영중 | <NA> | 2012 | N | 건강기능식품수입업 | 경기도 고양시 일산동구 호수로 662, 204-1호 (장항동, 삼성라끄빌) | 경기도 고양시 일산동구 장항동 751번지 삼성라끄빌 204-1호 | 410837 | 37.660759 | 126.76588
7 | 고양시 | (주)건강과 이웃 | 20120313 | 운영중 | <NA> | 2012 | N | 건강기능식품수입업 | 경기도 고양시 일산서구 중앙로 1456, 417호 (주엽동, 서현프라자) | 경기도 고양시 일산서구 주엽동 18-2번지 서현프라자 417호 | 411838 | 37.671558 | 126.758982
8 | 고양시 | (주)오스트레일리언메이드 | 20111107 | 운영중 | <NA> | 2011 | N | 건강기능식품수입업 | 경기도 고양시 일산동구 동국로 81-8 (식사동, 1층일부) | 경기도 고양시 일산동구 식사동 1008-2번지 1층일부 | 410821 | 37.679936 | 126.802548
9 | 고양시 | 복문트레이딩(주) | 20050526 | 운영중 | <NA> | 2011 | N | 건강기능식품수입업 | 경기도 고양시 일산동구 중앙로1275번길 38-10, 336호 (장항동, 우림로데오스위트) | 경기도 고양시 일산동구 장항동 771-1번지 우림로데오스위트 336호 | 410837 | 37.659943 | 126.770945

Last rows

Index | 시군명 | 사업장명 | 인허가일자 | 영업상태명 | 폐업일자 | 년도 | 다중이용업소여부 | 위생업종명 | 소재지도로명주소 | 소재지지번주소 | 소재지우편번호 | WGS84위도 | WGS84경도
1436 | 화성시 | (주)대호양행 | 19981210 | 폐업 등 | 20041231 | 2011 | N | 건강기능식품수입업 | <NA> | 경기도 화성시 팔탄면 덕우리 | 445918 | <NA> | <NA>
1437 | 화성시 | 명문제약(주) | 20050429 | 폐업 등 | 20041231 | 2011 | N | 건강기능식품수입업 | <NA> | 경기도 화성시 향남면 상신리 | <NA> | <NA> | <NA>
1438 | 화성시 | (주)동구제약 | 19971101 | 폐업 등 | 20041231 | 2011 | N | 건강기능식품수입업 | <NA> | 경기도 화성시 | <NA> | <NA> | <NA>
1439 | 화성시 | (주)씨에이치케미칼 | 20101006 | 폐업 등 | 20120806 | 2011 | N | 건강기능식품수입업 | 경기도 화성시 팔탄면 서해로 1409-15, 201호 (영재빌딩 2층 201호) | 경기도 화성시 팔탄면 창곡리 437-13번지 영재빌딩 2층 201호 201호 | 445949 | 37.187408 | 126.880662
1440 | 화성시 | 일진제약(주) | 20020202 | 폐업 등 | 20041231 | 2011 | N | 건강기능식품수입업 | <NA> | 경기도 화성시 향남면 상신리 | <NA> | <NA> | <NA>
1441 | 화성시 | (주)나라원 | 20130709 | 폐업 등 | 20141128 | 2013 | N | 건강기능식품수입업 | 경기도 화성시 양감면 초록로 761-21 (A동) | 경기도 화성시 양감면 송산리 238-6번지 A동 | 445933 | 37.119334 | 126.982165
1442 | 화성시 | 에스케이내추럴팜(주) | 20041020 | 폐업 등 | 20041231 | 2011 | N | 건강기능식품수입업 | <NA> | 경기도 화성시 | <NA> | <NA> | <NA>
1443 | 화성시 | 비알푸드 | 20080401 | 폐업 등 | 20121025 | 2011 | N | 건강기능식품수입업 | 경기도 화성시 동탄중심상가2길 26-21 (반송동,로드프라자 501호) | 경기도 화성시 반송동 87-3번지 로드프라자 501호 | 445160 | 37.20673 | 127.074
1444 | 화성시 | (주)생보통상 | 19980806 | 폐업 등 | 20041231 | 2011 | N | 건강기능식품수입업 | <NA> | 경기도 화성시 태안읍 진안리 | <NA> | <NA> | <NA>
1445 | 화성시 | (주)씨티씨바이오 | 20050112 | 폐업 등 | 20041231 | 2011 | N | 건강기능식품수입업 | <NA> | 경기도 화성시 팔탄면 노하리 | 445909 | <NA> | <NA>
1445화성시(주)씨티씨바이오20050112폐업 등200412312011N건강기능식품수입업<NA>경기도 화성시 팔탄면 노하리445909<NA><NA>

Duplicate rows

Most frequently occurring

Index | 시군명 | 사업장명 | 인허가일자 | 영업상태명 | 폐업일자 | 년도 | 다중이용업소여부 | 위생업종명 | 소재지도로명주소 | 소재지지번주소 | 소재지우편번호 | WGS84위도 | WGS84경도 | # duplicates
0 | 안산시 | 신풍제약(주) | 20050524 | 폐업 등 | 20041231 | 2011 | N | 건강기능식품수입업 | <NA> | 경기도 안산시 | <NA> | <NA> | <NA> | 2
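A duplicate row here means a record whose values match another record in every column. The sketch below shows one way to find such rows with the standard library. `find_duplicates` is an illustrative helper, and the rows are abbreviated to five columns (시군명, 사업장명, 인허가일자, 영업상태명, 폐업일자) with values copied from this report.

```python
from collections import Counter


def find_duplicates(rows):
    """Return each row that occurs more than once, mapped to its number of occurrences."""
    counts = Counter(tuple(row) for row in rows)
    return {row: n for row, n in counts.items() if n > 1}


# Abbreviated example rows; the first record appears twice.
rows = [
    ("안산시", "신풍제약(주)", 20050524, "폐업 등", 20041231),
    ("화성시", "비알푸드", 20080401, "폐업 등", 20121025),
    ("안산시", "신풍제약(주)", 20050524, "폐업 등", 20041231),
]

duplicates = find_duplicates(rows)
for row, n in duplicates.items():
    print(n, row)
```

Comparing whole rows as tuples mirrors the "# duplicates" column above, which counts how many times the repeated record occurs.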