Overview

Dataset statistics

Number of variables: 4
Number of observations: 327
Missing cells: 5
Missing cells (%): 0.4%
Duplicate rows: 2
Duplicate rows (%): 0.6%
Total size in memory: 10.7 KiB
Average record size in memory: 33.4 B
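
The percentages above follow from the raw counts. The sketch below is a rough check, assuming the missing-cell share is taken over all cells (rows × columns) and the duplicate share over all rows; the variable names are illustrative and not part of the report.

```python
# Recompute the derived dataset statistics from the raw counts in the report.
n_variables = 4
n_observations = 327
missing_cells = 5
duplicate_rows = 2

total_cells = n_variables * n_observations           # rows x columns
missing_pct = 100 * missing_cells / total_cells      # share of cells that are missing
duplicate_pct = 100 * duplicate_rows / n_observations

# 10.7 KiB is itself a rounded figure, so this only approximates the reported 33.4 B.
approx_record_bytes = 10.7 * 1024 / n_observations

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Duplicate rows: {duplicate_pct:.1f}%")
print(f"Average record size: about {approx_record_bytes:.1f} B")
```

The small gap between the recomputed record size and the reported 33.4 B is consistent with the total size having been rounded to one decimal place before display.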

Variable types

Categorical: 2
Text: 1
Numeric: 1

Dataset

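Description: 자치구명 (district name), 법정동명 (legal dong name), 업태명 (business type), 업소수 (number of establishments)
Author: 중랑구 (Jungnang-gu)
URL: https://data.seoul.go.kr/dataList/OA-10298/S/1/datasetView.do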

Alerts

자치구명 has constant value "중랑구" (Constant)
Dataset has 2 (0.6%) duplicate rows (Duplicates)
업태명 has 5 (1.5%) missing values (Missing)

Reproduction

Analysis started: 2024-05-18 07:15:18.972237
Analysis finished: 2024-05-18 07:15:20.027685
Duration: 1.06 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

자치구명
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 2.7 KiB

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 중랑구
2nd row: 중랑구
3rd row: 중랑구
4th row: 중랑구
5th row: 중랑구

Common Values

Value    Count    Frequency (%)
중랑구   327      100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values, 중랑구 327 (100.0%)]

법정동명
Categorical

Distinct: 6
Distinct (%): 1.8%
Missing: 0
Missing (%): 0.0%
Memory size: 2.7 KiB

Length

Max length: 3
Median length: 3
Mean length: 2.8318043
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 면목동
2nd row: 면목동
3rd row: 면목동
4th row: 면목동
5th row: 면목동

Common Values

Value    Count    Frequency (%)
면목동   63       19.3%
상봉동   57       17.4%
묵동     55       16.8%
망우동   53       16.2%
신내동   50       15.3%
중화동   49       15.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values, with the same counts as the Common Values table]

업태명
Text

MISSING 

Distinct: 73
Distinct (%): 22.7%
Missing: 5
Missing (%): 1.5%
Memory size: 2.7 KiB

Length

Max length: 15
Median length: 12
Mean length: 5.6335404
Min length: 2

Characters and Unicode

Total characters: 1814
Distinct characters: 152
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
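
As a minimal sketch of how such character properties can be read, the example below uses Python's standard `unicodedata` module to count general categories in a few 업태명 values taken from the sample rows of this report. The helper `category_counts` and the `CATEGORY_NAMES` mapping are illustrative names, not part of ydata-profiling, and the standard library does not expose the script or block properties.

```python
import unicodedata
from collections import Counter

# Long names for the five general categories that appear in this report.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Po": "Other Punctuation",
    "Zs": "Space Separator",
}

def category_counts(text: str) -> Counter:
    """Count the Unicode general category code of every character in text."""
    return Counter(unicodedata.category(ch) for ch in text)

# Values copied from the sample rows of the report.
samples = ["전자상거래(통신판매업)", "정종/대포집/소주방", "집단급식소 식품판매업"]

for value in samples:
    named = {CATEGORY_NAMES.get(code, code): n
             for code, n in category_counts(value).items()}
    print(value, named)
```

Hangul syllables fall under Other Letter, which is why that category dominates the tables for this column.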

Unique

Unique: 9
Unique (%): 2.8%

Sample

1st row: 한식
2nd row: 중국식
3rd row: 경양식
4th row: 일식
5th row: 분식

Words

Value                Count    Frequency (%)
기타                 23       6.5%
패스트푸드           12       3.4%
식품제조가공업       11       3.1%
식품자동판매기영업   6        1.7%
식품소분업           6        1.7%
학교                 6        1.7%
병원                 6        1.7%
사회복지시설         6        1.7%
중국식               6        1.7%
집단급식소           6        1.7%
Other values (65)    264      75.0%

Most occurring characters

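Value                 Count    Frequency (%)
(not preserved)       126      6.9%
(not preserved)       107      5.9%
(not preserved)       81       4.5%
(not preserved)       79       4.4%
(not preserved)       62       3.4%
(not preserved)       56       3.1%
(not preserved)       35       1.9%
)                     32       1.8%
(not preserved)       32       1.8%
(                     32       1.8%
Other values (142)    1172     64.6%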

Most occurring categories

Value                Count    Frequency (%)
Other Letter         1697     93.6%
Close Punctuation    32       1.8%
Open Punctuation     32       1.8%
Space Separator      30       1.7%
Other Punctuation    23       1.3%

Most frequent character per category

Other Letter

Value                 Count    Frequency (%)
(not preserved)       126      7.4%
(not preserved)       107      6.3%
(not preserved)       81       4.8%
(not preserved)       79       4.7%
(not preserved)       62       3.7%
(not preserved)       56       3.3%
(not preserved)       35       2.1%
(not preserved)       32       1.9%
(not preserved)       30       1.8%
(not preserved)       30       1.8%
Other values (137)    1059     62.4%

Other Punctuation

Value    Count    Frequency (%)
/        18       78.3%
,        5        21.7%

Close Punctuation

Value    Count    Frequency (%)
)        32       100.0%

Open Punctuation

Value    Count    Frequency (%)
(        32       100.0%

Space Separator

Value      Count    Frequency (%)
(space)    30       100.0%

Most occurring scripts

Value     Count    Frequency (%)
Hangul    1697     93.6%
Common    117      6.4%

Most frequent character per script

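Hangul

Value                 Count    Frequency (%)
(not preserved)       126      7.4%
(not preserved)       107      6.3%
(not preserved)       81       4.8%
(not preserved)       79       4.7%
(not preserved)       62       3.7%
(not preserved)       56       3.3%
(not preserved)       35       2.1%
(not preserved)       32       1.9%
(not preserved)       30       1.8%
(not preserved)       30       1.8%
Other values (137)    1059     62.4%

Common

Value      Count    Frequency (%)
)          32       27.4%
(          32       27.4%
(space)    30       25.6%
/          18       15.4%
,          5        4.3%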

Most occurring blocks

Value     Count    Frequency (%)
Hangul    1697     93.6%
ASCII     117      6.4%

Most frequent character per block

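Hangul

Value                 Count    Frequency (%)
(not preserved)       126      7.4%
(not preserved)       107      6.3%
(not preserved)       81       4.8%
(not preserved)       79       4.7%
(not preserved)       62       3.7%
(not preserved)       56       3.3%
(not preserved)       35       2.1%
(not preserved)       32       1.9%
(not preserved)       30       1.8%
(not preserved)       30       1.8%
Other values (137)    1059     62.4%

ASCII

Value      Count    Frequency (%)
)          32       27.4%
(          32       27.4%
(space)    30       25.6%
/          18       15.4%
,          5        4.3%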

업소수
Real number (ℝ)

Distinct: 73
Distinct (%): 22.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 22.134557
Minimum: 1
Maximum: 657
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.0 KiB

Quantile statistics

Minimum: 1
5th percentile: 1
Q1: 2
Median: 6
Q3: 19.5
95th percentile: 89.8
Maximum: 657
Range: 656
Interquartile range (IQR): 17.5
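
The range and IQR are simple differences of the quantiles listed above. The sketch below recomputes them and, as an added illustration, derives Tukey's 1.5 × IQR fences; the fences are a common outlier convention assumed here and are not something the report computes.

```python
# Quantile figures as reported for 업소수.
minimum, q1, median, q3, maximum = 1, 2, 6, 19.5, 657

value_range = maximum - minimum   # reported as 656
iqr = q3 - q1                     # reported as 17.5

# Tukey's rule (an assumption for illustration, not part of the report):
# values beyond 1.5 * IQR from the quartiles are often treated as outliers.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

print("range:", value_range)
print("IQR:", iqr)
print("fences:", lower_fence, upper_fence)
```

Under that convention the upper fence sits well below the reported 95th percentile of 89.8, so the long right tail of 업소수 would be flagged.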

Descriptive statistics

Standard deviation: 51.494597
Coefficient of variation (CV): 2.3264345
Kurtosis: 74.47656
Mean: 22.134557
Median Absolute Deviation (MAD): 5
Skewness: 7.1793501
Sum: 7238
Variance: 2651.6935
Monotonicity: Not monotonic
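
Several of these figures are tied together by standard definitions: the mean is the sum divided by the number of observations, the variance is the square of the standard deviation, and the coefficient of variation is the standard deviation divided by the mean. A short check using only the numbers printed in the report (variable names are illustrative):

```python
# Figures copied from the report for 업소수.
n_observations = 327
total = 7238
std = 51.494597

mean = total / n_observations   # reported as 22.134557
variance = std ** 2             # reported as 2651.6935
cv = std / mean                 # reported as 2.3264345

print(f"mean     = {mean:.6f}")
print(f"variance = {variance:.4f}")
print(f"CV       = {cv:.7f}")
```

A CV above 2, together with a median of 6 against a mean of about 22, indicates a strongly right-skewed distribution, in line with the reported skewness.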
[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value                Count    Frequency (%)
1                    60       18.3%
2                    39       11.9%
3                    27       8.3%
6                    17       5.2%
4                    15       4.6%
13                   10       3.1%
5                    10       3.1%
12                   10       3.1%
7                    9        2.8%
8                    9        2.8%
Other values (63)    121      37.0%

Minimum 10 values

Value    Count    Frequency (%)
1        60       18.3%
2        39       11.9%
3        27       8.3%
4        15       4.6%
5        10       3.1%
6        17       5.2%
7        9        2.8%
8        9        2.8%
9        7        2.1%
10       6        1.8%

Maximum 10 values

Value    Count    Frequency (%)
657      1        0.3%
275      1        0.3%
228      1        0.3%
223      1        0.3%
203      1        0.3%
199      1        0.3%
187      1        0.3%
179      1        0.3%
167      1        0.3%
130      1        0.3%

Interactions

[Figure: interactions plot]

Correlations

[Figure: correlation heatmap]

            법정동명    업태명    업소수
법정동명    1.000       0.000     0.125
업태명      0.000       1.000     0.000
업소수      0.125       0.000     1.000

[Figure: correlation heatmap]

            업소수    법정동명
업소수      1.000     0.045
법정동명    0.045     1.000

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: Nullity matrix, a data-dense display which lets you quickly visually pick out patterns in data completion.]
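
Nullity by column is a count of missing entries per variable. A minimal sketch of that count, using three rows copied from the sample at the end of this report, with `None` standing in for `<NA>`; the names `columns`, `rows` and `missing_per_column` are illustrative.

```python
columns = ["자치구명", "법정동명", "업태명", "업소수"]

# Rows 321 to 323 of the report's sample; None represents <NA>.
rows = [
    ("중랑구", "신내동", "전자상거래(통신판매업)", 77),
    ("중랑구", "신내동", None, 1),
    ("중랑구", "신내동", "다단계판매", 1),
]

# Count the missing entries in each column.
missing_per_column = {
    name: sum(row[i] is None for row in rows)
    for i, name in enumerate(columns)
}

for name, count in missing_per_column.items():
    print(name, count)
```

On the full dataset the same count would be expected to give 5 for 업태명 and 0 for the other columns, matching the Missing figures reported per variable.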

Sample

First rows

#      자치구명    법정동명    업태명                          업소수
0      중랑구      면목동      한식                            657
1      중랑구      면목동      중국식                          34
2      중랑구      면목동      경양식                          29
3      중랑구      면목동      일식                            40
4      중랑구      면목동      분식                            108
5      중랑구      면목동      뷔페식                          2
6      중랑구      면목동      정종/대포집/소주방              87
7      중랑구      면목동      출장조리                        2
8      중랑구      면목동      패스트푸드                      9
9      중랑구      면목동      호프/통닭                       199

Last rows

#      자치구명    법정동명    업태명                          업소수
317    중랑구      신내동      제과점영업                      16
318    중랑구      신내동      집단급식소 식품판매업           8
319    중랑구      신내동      영업장판매                      39
320    중랑구      신내동      방문판매                        4
321    중랑구      신내동      전자상거래(통신판매업)          77
322    중랑구      신내동      <NA>                            1
323    중랑구      신내동      다단계판매                      1
324    중랑구      신내동      도매업(유통)                    1
325    중랑구      신내동      기타 건강기능식품일반판매업     2
326    중랑구      신내동      건강기능식품유통전문판매업      6

Duplicate rows

Most frequently occurring

#    자치구명    법정동명    업태명        업소수    # duplicates
0    중랑구      묵동        <NA>          1         2
1    중랑구      중화동      패스트푸드    6         2
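
A duplicate row here is one whose values match another row in every column. A minimal sketch of that check using only the standard library: the rows are copied from this report, the two duplicated rows are entered twice as the table indicates, and `None` stands in for `<NA>`. The names `rows`, `counts` and `duplicates` are illustrative.

```python
from collections import Counter

# (자치구명, 법정동명, 업태명, 업소수)
rows = [
    ("중랑구", "면목동", "한식", 657),
    ("중랑구", "면목동", "중국식", 34),
    ("중랑구", "묵동", None, 1),
    ("중랑구", "묵동", None, 1),
    ("중랑구", "중화동", "패스트푸드", 6),
    ("중랑구", "중화동", "패스트푸드", 6),
]

# Count identical tuples and keep those that occur more than once.
counts = Counter(rows)
duplicates = {row: n for row, n in counts.items() if n > 1}

for row, n in duplicates.items():
    print(row, "occurs", n, "times")
```

Note that this treats two missing 업태명 entries as equal, which is what the table above implies for the 묵동 row.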