Overview

Dataset statistics

Number of variables: 4
Number of observations: 426
Missing cells: 1
Missing cells (%): 0.1%
Duplicate rows: 2
Duplicate rows (%): 0.5%
Total size in memory: 13.9 KiB
Average record size in memory: 33.3 B
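
The two percentages above follow from the counts in the same table. A minimal sketch of that arithmetic, using only figures reported here (the variable names are illustrative and not part of the report):

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 4
n_observations = 426
missing_cells = 1
duplicate_rows = 2

# Every observation has one cell per variable.
total_cells = n_variables * n_observations

# Share of empty cells and share of duplicated rows, in percent.
missing_pct = 100 * missing_cells / total_cells
duplicate_pct = 100 * duplicate_rows / n_observations

print(f"Missing cells (%):  {missing_pct:.1f}%")
print(f"Duplicate rows (%): {duplicate_pct:.1f}%")
```

Rounded to one decimal place, both values agree with the table.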

Variable types

Categorical: 2
Text: 1
Numeric: 1

Dataset

Description: 자치구명 (district name), 법정동명 (legal dong name), 업태명 (business type), 업소수 (number of establishments)
Author: 구로구 (Guro-gu)
URL: https://data.seoul.go.kr/dataList/OA-2298/S/1/datasetView.do

Alerts

Constant: 자치구명 has constant value "구로구"
Duplicates: Dataset has 2 (0.5%) duplicate rows
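
Both alerts can be reproduced with simple checks. The sketch below is illustrative only: it uses a handful of rows copied from the Sample and Duplicate rows sections of this report rather than the full dataset, the helper names are hypothetical, and it is not how ydata-profiling implements its alerts.

```python
from collections import Counter

# Rows copied from this report; columns are (자치구명, 법정동명, 업태명, 업소수).
rows = [
    ("구로구", "신도림동", "한식", 121),
    ("구로구", "신도림동", "중국식", 14),
    ("구로구", "구로동", "전통찻집", 1),
    ("구로구", "구로동", "전통찻집", 1),  # repeated row, as listed under "Duplicate rows"
]

def constant_columns(rows):
    """Return the indices of columns that hold a single distinct value."""
    return [i for i, col in enumerate(zip(*rows)) if len(set(col)) == 1]

def duplicated_rows(rows):
    """Return every row that occurs more than once, with its occurrence count."""
    return {row: n for row, n in Counter(rows).items() if n > 1}

print(constant_columns(rows))   # column 0 is 자치구명
print(duplicated_rows(rows))
```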

Reproduction

Analysis started: 2024-05-18 07:54:15.776077
Analysis finished: 2024-05-18 07:54:17.054570
Duration: 1.28 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
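
A report of this kind is typically generated with a few lines of Python. The sketch below is an assumption about how it might be reproduced, not a record of how this report was built: the function name, file paths, encoding and title are hypothetical, and the actual settings live in the config.json mentioned above.

```python
def build_report(csv_path, html_path, encoding="utf-8"):
    """Profile a CSV file and write the result as an HTML report.

    csv_path, html_path and encoding are hypothetical; this report does not
    record how the source data was loaded.
    """
    # Imported lazily so that defining the function does not require
    # the third-party packages to be installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path, encoding=encoding)
    report = ProfileReport(
        df,
        title="Profiling report",  # assumed title
        # Dataset metadata as shown in the "Dataset" panel of this report.
        dataset={
            "description": "자치구명,법정동명,업태명,업소수",
            "author": "구로구",
            "url": "https://data.seoul.go.kr/dataList/OA-2298/S/1/datasetView.do",
        },
    )
    report.to_file(html_path)
    return report
```

Calling `build_report("data.csv", "report.html")` with a local copy of the dataset would write the HTML file; both file names are placeholders.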

Variables

자치구명
Categorical

CONSTANT

Distinct: 1
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 3.5 KiB

구로구: 426

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 구로구
2nd row: 구로구
3rd row: 구로구
4th row: 구로구
5th row: 구로구

Common Values

Value  Count  Frequency (%)
구로구  426  100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

법정동명
Categorical

Distinct: 14
Distinct (%): 3.3%
Missing: 0
Missing (%): 0.0%
Memory size: 3.5 KiB

구로동: 68
고척동: 53
개봉동: 51
신도림동: 49
오류동: 49
Other values (9): 156

Length

Max length: 4
Median length: 3
Mean length: 3.0516432
Min length: 2

Unique

Unique: 4
Unique (%): 0.9%

Sample

1st row: 신정동
2nd row: <NA>
3rd row: 신도림동
4th row: 신도림동
5th row: 신도림동

Common Values

Value  Count  Frequency (%)
구로동  68  16.0%
고척동  53  12.4%
개봉동  51  12.0%
신도림동  49  11.5%
오류동  49  11.5%
가리봉동  34  8.0%
천왕동  33  7.7%
항동  33  7.7%
궁동  29  6.8%
온수동  23  5.4%
Other values (4)  4  0.9%

Length

[Figure: Histogram of lengths of the category]

업태명
Text

Distinct: 72
Distinct (%): 16.9%
Missing: 1
Missing (%): 0.2%
Memory size: 3.5 KiB

Length

Max length: 15
Median length: 12
Mean length: 5.52
Min length: 2

Characters and Unicode

Total characters: 2346
Distinct characters: 154
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
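
As an illustration of those character properties, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below counts categories for two 업태명 values that appear in this report's sample rows; the helper name is hypothetical, and this is not necessarily how the profiler computes its tables.

```python
import unicodedata
from collections import Counter

def category_counts(values):
    """Count characters by Unicode general category code (e.g. 'Lo', 'Ps')."""
    return Counter(unicodedata.category(ch) for text in values for ch in text)

# Two 업태명 values copied from the sample rows of this report.
values = ["전자상거래(통신판매업)", "정종/대포집/소주방"]
counts = category_counts(values)

# Lo = Other Letter, Ps = Open Punctuation, Pe = Close Punctuation,
# Po = Other Punctuation, Zs = Space Separator.
for code, n in counts.most_common():
    print(code, n)
```

Hangul syllables fall under Other Letter, which is why that category dominates the tables that follow.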

Unique

Unique: 7
Unique (%): 1.6%

Sample

1st row: 식품자동판매기영업
2nd row: 건강기능식품수입업
3rd row: 한식
4th row: 중국식
5th row: 경양식

Value  Count  Frequency (%)
기타  29  6.3%
패스트푸드  16  3.5%
식품제조가공업  12  2.6%
식품자동판매기영업  12  2.6%
한식  11  2.4%
호프/통닭  10  2.2%
커피숍  10  2.2%
전자상거래(통신판매업  10  2.2%
분식  10  2.2%
영업장판매  10  2.2%
Other values (64)  332  71.9%

Most occurring characters

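Value  Count  Frequency (%)
(not preserved)  162  6.9%
(not preserved)  141  6.0%
(not preserved)  107  4.6%
(not preserved)  104  4.4%
(not preserved)  75  3.2%
(not preserved)  69  2.9%
(not preserved)  49  2.1%
(not preserved)  46  2.0%
)  44  1.9%
(  44  1.9%
Other values (144)  1505  64.2%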

Most occurring categories

Value  Count  Frequency (%)
Other Letter  2193  93.5%
Close Punctuation  44  1.9%
Open Punctuation  44  1.9%
Space Separator  37  1.6%
Other Punctuation  28  1.2%

Most frequent character per category

Other Letter

Value  Count  Frequency (%)
(not preserved)  162  7.4%
(not preserved)  141  6.4%
(not preserved)  107  4.9%
(not preserved)  104  4.7%
(not preserved)  75  3.4%
(not preserved)  69  3.1%
(not preserved)  49  2.2%
(not preserved)  46  2.1%
(not preserved)  40  1.8%
(not preserved)  40  1.8%
Other values (138)  1360  62.0%

Other Punctuation

Value  Count  Frequency (%)
/  22  78.6%
,  5  17.9%
.  1  3.6%

Close Punctuation

Value  Count  Frequency (%)
)  44  100.0%

Open Punctuation

Value  Count  Frequency (%)
(  44  100.0%

Space Separator

Value  Count  Frequency (%)
(space)  37  100.0%

Most occurring scripts

Value  Count  Frequency (%)
Hangul  2193  93.5%
Common  153  6.5%

Most frequent character per script

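Hangul

Value  Count  Frequency (%)
(not preserved)  162  7.4%
(not preserved)  141  6.4%
(not preserved)  107  4.9%
(not preserved)  104  4.7%
(not preserved)  75  3.4%
(not preserved)  69  3.1%
(not preserved)  49  2.2%
(not preserved)  46  2.1%
(not preserved)  40  1.8%
(not preserved)  40  1.8%
Other values (138)  1360  62.0%

Common

Value  Count  Frequency (%)
)  44  28.8%
(  44  28.8%
(space)  37  24.2%
/  22  14.4%
,  5  3.3%
.  1  0.7%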

Most occurring blocks

Value  Count  Frequency (%)
Hangul  2193  93.5%
ASCII  153  6.5%

Most frequent character per block

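Hangul

Value  Count  Frequency (%)
(not preserved)  162  7.4%
(not preserved)  141  6.4%
(not preserved)  107  4.9%
(not preserved)  104  4.7%
(not preserved)  75  3.4%
(not preserved)  69  3.1%
(not preserved)  49  2.2%
(not preserved)  46  2.1%
(not preserved)  40  1.8%
(not preserved)  40  1.8%
Other values (138)  1360  62.0%

ASCII

Value  Count  Frequency (%)
)  44  28.8%
(  44  28.8%
(space)  37  24.2%
/  22  14.4%
,  5  3.3%
.  1  0.7%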

업소수
Real number (ℝ)

Distinct: 75
Distinct (%): 17.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 19.680751
Minimum: 1
Maximum: 972
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.9 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 2
median: 4
Q3: 14
95-th percentile: 73.75
Maximum: 972
Range: 971
Interquartile range (IQR): 12

Descriptive statistics

Standard deviation: 61.299315
Coefficient of variation (CV): 3.1146837
Kurtosis: 141.63782
Mean: 19.680751
Median Absolute Deviation (MAD): 3
Skewness: 10.188251
Sum: 8384
Variance: 3757.6061
Monotonicity: Not monotonic
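
Several of these statistics are derived from one another, which gives a quick consistency check. A minimal sketch using only the values reported for 업소수 (the variable names are illustrative):

```python
import math

# Values copied from the statistics reported for 업소수.
mean, std, variance = 19.680751, 61.299315, 3757.6061
total, n = 8384, 426
minimum, maximum, q1, q3 = 1, 972, 2, 14

cv = std / mean                  # coefficient of variation
value_range = maximum - minimum  # range
iqr = q3 - q1                    # interquartile range

print(f"CV       = {cv:.7f}")
print(f"std**2   = {std ** 2:.4f}")
print(f"sum / n  = {total / n:.6f}")
print(f"range    = {value_range}, IQR = {iqr}")

# The derived values agree with the reported ones up to rounding.
assert math.isclose(cv, 3.1146837, rel_tol=1e-5)
assert math.isclose(std ** 2, variance, rel_tol=1e-5)
assert math.isclose(total / n, mean, rel_tol=1e-5)
```

A standard deviation roughly three times the mean, together with a median of 4 against a maximum of 972, reflects the strong right skew reported above.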

[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value  Count  Frequency (%)
1  99  23.2%
2  58  13.6%
3  38  8.9%
4  25  5.9%
7  18  4.2%
5  15  3.5%
6  14  3.3%
8  14  3.3%
9  10  2.3%
10  10  2.3%
Other values (65)  125  29.3%

Minimum 10 values

Value  Count  Frequency (%)
1  99  23.2%
2  58  13.6%
3  38  8.9%
4  25  5.9%
5  15  3.5%
6  14  3.3%
7  18  4.2%
8  14  3.3%
9  10  2.3%
10  10  2.3%

Maximum 10 values

Value  Count  Frequency (%)
972  1  0.2%
364  1  0.2%
325  1  0.2%
230  1  0.2%
223  1  0.2%
217  1  0.2%
207  1  0.2%
198  1  0.2%
195  1  0.2%
190  1  0.2%

Interactions

[Figure: interactions plot]

Correlations

[Figure: correlation heatmap]

          법정동명  업태명  업소수
법정동명  1.000  0.000  0.040
업태명  0.000  1.000  0.000
업소수  0.040  0.000  1.000

[Figure: correlation heatmap]

          업소수  법정동명
업소수  1.000  0.020
법정동명  0.020  1.000

Missing values

[Figure: nullity count chart] A simple visualization of nullity by column.
[Figure: nullity matrix] Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

     자치구명  법정동명  업태명  업소수
0  구로구  신정동  식품자동판매기영업  1
1  구로구  <NA>  건강기능식품수입업  1
2  구로구  신도림동  한식  121
3  구로구  신도림동  중국식  14
4  구로구  신도림동  경양식  9
5  구로구  신도림동  일식  17
6  구로구  신도림동  분식  18
7  구로구  신도림동  뷔페식  14
8  구로구  신도림동  정종/대포집/소주방  1
9  구로구  신도림동  전통찻집  1

Last rows

     자치구명  법정동명  업태명  업소수
416  구로구  항동  유통전문판매업  3
417  구로구  항동  기타식품판매업  1
418  구로구  항동  위탁급식영업  1
419  구로구  항동  제과점영업  3
420  구로구  항동  집단급식소 식품판매업  1
421  구로구  항동  영업장판매  4
422  구로구  항동  방문판매  2
423  구로구  항동  전자상거래(통신판매업)  17
424  구로구  옥길동  식품자동판매기영업  1
425  구로구  역곡동  한식  1

Duplicate rows

Most frequently occurring

     자치구명  법정동명  업태명  업소수  # duplicates
0  구로구  구로동  전통찻집  1  2
1  구로구  궁동  패스트푸드  1  2