Overview

Dataset statistics

Number of variables: 4
Number of observations: 621
Missing cells: 6
Missing cells (%): 0.2%
Duplicate rows: 2
Duplicate rows (%): 0.3%
Total size in memory: 20.1 KiB
Average record size in memory: 33.2 B
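
The two percentages above can be recomputed from the raw counts. This is a minimal sketch, assuming missing cells are measured against all cells (variables times observations) and duplicate rows against all rows; the variable names are illustrative and not part of the report.

```python
# Counts copied from the Dataset statistics table.
n_variables = 4
n_observations = 621
missing_cells = 6
duplicate_rows = 2

# Assumption: the missing-cell share is relative to every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: the duplicate share is relative to the number of rows.
duplicate_pct = 100 * duplicate_rows / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Duplicate rows (%): {duplicate_pct:.1f}%")
```

Under these assumptions both values round to the figures reported above.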

Variable types

Categorical: 2
Text: 1
Numeric: 1

Dataset

Description: 자치구명, 법정동명, 업태명, 업소수 (district name, legal dong name, business type name, number of establishments)
Author: 성동구
URL: https://data.seoul.go.kr/dataList/OA-10760/S/1/datasetView.do

Alerts

자치구명 has constant value "성동구" (Constant)
Dataset has 2 (0.3%) duplicate rows (Duplicates)
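
Both alerts can be reproduced on a small excerpt. The sketch below is an illustration only, not the profiler's implementation: the `rows` list is a hypothetical excerpt assembled from rows shown in the Sample and Duplicate rows sections of this report, and a missing 업태명 is represented as `None`.

```python
from collections import Counter

# Hypothetical excerpt: (자치구명, 법정동명, 업태명, 업소수) tuples taken from this report.
rows = [
    ("성동구", "상왕십리동", "한식", 19),
    ("성동구", "상왕십리동", "중국식", 2),
    ("성동구", "금호동3가", None, 1),
    ("성동구", "금호동3가", None, 1),
    ("성동구", "용답동", "패스트푸드", 2),
    ("성동구", "용답동", "패스트푸드", 2),
]

# Constant check: a column is constant when it holds exactly one distinct value.
districts = {row[0] for row in rows}
is_constant = len(districts) == 1

# Duplicate check: keep every row that occurs more than once, with its count.
counts = Counter(rows)
duplicates = {row: n for row, n in counts.items() if n > 1}

print(is_constant)
for row, n in duplicates.items():
    print(row, n)
```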

Reproduction

Analysis started: 2024-05-18 09:31:18.695063
Analysis finished: 2024-05-18 09:31:19.836887
Duration: 1.14 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
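
The reported duration is the difference between the two timestamps. A small sketch using only the standard library, with the timestamps copied from the lines above:

```python
from datetime import datetime

# Timestamps as printed in the Reproduction section.
started = datetime.fromisoformat("2024-05-18 09:31:18.695063")
finished = datetime.fromisoformat("2024-05-18 09:31:19.836887")

# Elapsed wall-clock time in seconds.
duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```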

Variables

자치구명
Categorical

CONSTANT

Distinct: 1
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 5.0 KiB

성동구  621

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 성동구
2nd row: 성동구
3rd row: 성동구
4th row: 성동구
5th row: 성동구

Common Values

Value  Count  Frequency (%)
성동구  621  100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
성동구  621  100.0%

법정동명
Categorical

Distinct: 17
Distinct (%): 2.7%
Missing: 0
Missing (%): 0.0%
Memory size: 5.0 KiB

성수동2가  54
행당동  53
성수동1가  52
마장동  46
용답동  41
Other values (12)  375

Length

Max length: 5
Median length: 3
Mean length: 3.9758454
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 상왕십리동
2nd row: 상왕십리동
3rd row: 상왕십리동
4th row: 상왕십리동
5th row: 상왕십리동

Common Values

Value  Count  Frequency (%)
성수동2가  54  8.7%
행당동  53  8.5%
성수동1가  52  8.4%
마장동  46  7.4%
용답동  41  6.6%
하왕십리동  40  6.4%
옥수동  39  6.3%
금호동4가  37  6.0%
금호동1가  35  5.6%
금호동3가  34  5.5%
Other values (7)  190  30.6%
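
The frequencies in this table are each count divided by the number of observations, and the "Other values" row is what remains after the ten most common values. A minimal sketch, assuming the denominator is all 621 rows (the column has no missing values); the dictionary simply restates the counts listed above.

```python
# Counts copied from the Common Values table for 법정동명.
n_observations = 621
top_counts = {
    "성수동2가": 54, "행당동": 53, "성수동1가": 52, "마장동": 46, "용답동": 41,
    "하왕십리동": 40, "옥수동": 39, "금호동4가": 37, "금호동1가": 35, "금호동3가": 34,
}

# Frequency (%) is the count divided by the number of observations.
frequencies = {
    name: f"{100 * count / n_observations:.1f}%"
    for name, count in top_counts.items()
}

# Rows outside the ten most common values are lumped into "Other values".
other_count = n_observations - sum(top_counts.values())
other_frequency = f"{100 * other_count / n_observations:.1f}%"

for name, share in frequencies.items():
    print(name, top_counts[name], share)
print("Other values", other_count, other_frequency)
```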

Length

Histogram of lengths of the category

업태명
Text

Distinct: 69
Distinct (%): 11.2%
Missing: 6
Missing (%): 1.0%
Memory size: 5.0 KiB

Length

Max length: 15
Median length: 12
Mean length: 5.6764228
Min length: 2

Characters and Unicode

Total characters: 3491
Distinct characters: 149
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
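
As a rough illustration of these character properties, the general category of each code point can be read with Python's standard `unicodedata` module. This is a sketch, not the report's own implementation. The three strings are 업태명 values that appear in the Sample section, and the codes `Lo`, `Ps`, `Pe`, `Po` and `Zs` stand for Other Letter, Open Punctuation, Close Punctuation, Other Punctuation and Space Separator.

```python
import unicodedata
from collections import Counter

# A few 업태명 values taken from the Sample section of this report.
values = ["호프/통닭", "통닭(치킨)", "식품등 수입판매업"]

# Count the Unicode general category of every character in every value.
category_counts = Counter(
    unicodedata.category(ch) for value in values for ch in value
)

for category, count in category_counts.most_common():
    print(category, count)
```

Hangul syllables fall under `Lo`, which is consistent with Other Letter dominating the category table for this variable.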

Unique

Unique: 10
Unique (%): 1.6%

Sample

1st row: 한식
2nd row: 중국식
3rd row: 경양식
4th row: 일식
5th row: 분식

Value  Count  Frequency (%)
기타  54  8.0%
패스트푸드  21  3.1%
즉석판매제조가공업  17  2.5%
식품자동판매기영업  17  2.5%
일반조리판매  17  2.5%
영업장판매  17  2.5%
제과점영업  17  2.5%
커피숍  17  2.5%
한식  17  2.5%
휴게음식점  17  2.5%
Other values (60)  462  68.6%

Most occurring characters

Value  Count  Frequency (%)
[missing]  241  6.9%
[missing]  203  5.8%
[missing]  164  4.7%
[missing]  161  4.6%
[missing]  117  3.4%
[missing]  99  2.8%
[missing]  80  2.3%
[missing]  77  2.2%
(  73  2.1%
)  73  2.1%
Other values (139)  2203  63.1%

Most occurring categories

Value  Count  Frequency (%)
Other Letter  3234  92.6%
Open Punctuation  73  2.1%
Close Punctuation  73  2.1%
Space Separator  58  1.7%
Other Punctuation  53  1.5%

Most frequent character per category

Other Letter
Value  Count  Frequency (%)
[missing]  241  7.5%
[missing]  203  6.3%
[missing]  164  5.1%
[missing]  161  5.0%
[missing]  117  3.6%
[missing]  99  3.1%
[missing]  80  2.5%
[missing]  77  2.4%
[missing]  67  2.1%
[missing]  60  1.9%
Other values (133)  1965  60.8%

Other Punctuation
Value  Count  Frequency (%)
/  39  73.6%
,  11  20.8%
.  3  5.7%

Open Punctuation
Value  Count  Frequency (%)
(  73  100.0%

Close Punctuation
Value  Count  Frequency (%)
)  73  100.0%

Space Separator
Value  Count  Frequency (%)
(space)  58  100.0%

Most occurring scripts

Value  Count  Frequency (%)
Hangul  3234  92.6%
Common  257  7.4%

Most frequent character per script

Hangul
Value  Count  Frequency (%)
[missing]  241  7.5%
[missing]  203  6.3%
[missing]  164  5.1%
[missing]  161  5.0%
[missing]  117  3.6%
[missing]  99  3.1%
[missing]  80  2.5%
[missing]  77  2.4%
[missing]  67  2.1%
[missing]  60  1.9%
Other values (133)  1965  60.8%

Common
Value  Count  Frequency (%)
(  73  28.4%
)  73  28.4%
(space)  58  22.6%
/  39  15.2%
,  11  4.3%
.  3  1.2%

Most occurring blocks

Value  Count  Frequency (%)
Hangul  3234  92.6%
ASCII  257  7.4%

Most frequent character per block

Hangul
Value  Count  Frequency (%)
[missing]  241  7.5%
[missing]  203  6.3%
[missing]  164  5.1%
[missing]  161  5.0%
[missing]  117  3.6%
[missing]  99  3.1%
[missing]  80  2.5%
[missing]  77  2.4%
[missing]  67  2.1%
[missing]  60  1.9%
Other values (133)  1965  60.8%

ASCII
Value  Count  Frequency (%)
(  73  28.4%
)  73  28.4%
(space)  58  22.6%
/  39  15.2%
,  11  4.3%
.  3  1.2%

업소수
Real number (ℝ)

Distinct: 72
Distinct (%): 11.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 12.470209
Minimum: 1
Maximum: 388
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 5.6 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
Median: 3
Q3: 9
95-th percentile: 47
Maximum: 388
Range: 387
Interquartile range (IQR): 8
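
The last two rows are derived from the quantiles listed before them. A short sketch of that arithmetic, with the values copied from this table and illustrative variable names:

```python
# Quantiles copied from the Quantile statistics table for 업소수.
minimum, q1, median, q3, maximum = 1, 1, 3, 9, 388

# Range is the distance between the extremes.
value_range = maximum - minimum

# The interquartile range spans the middle half of the observations.
iqr = q3 - q1

print(value_range, iqr)
```

The range is far wider than the IQR, which fits the strong right skew reported for this variable.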

Descriptive statistics

Standard deviation: 33.896967
Coefficient of variation (CV): 2.7182356
Kurtosis: 71.07358
Mean: 12.470209
Median Absolute Deviation (MAD): 2
Skewness: 7.5584455
Sum: 7744
Variance: 1149.0044
Monotonicity: Not monotonic
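
Several of these figures are tied together by simple identities: the mean is the sum divided by the number of observations, the coefficient of variation is the standard deviation divided by the mean, and the variance is the squared standard deviation. The sketch below checks the reported values against those identities. It starts from the rounded numbers printed in the report, so small differences in the last digits are expected.

```python
# Values copied from the report for 업소수.
total = 7744
n_observations = 621
std = 33.896967

# Mean: sum divided by the number of observations.
mean = total / n_observations

# Coefficient of variation: standard deviation relative to the mean.
cv = std / mean

# Variance: the squared standard deviation.
variance = std ** 2

print(f"mean={mean:.6f} cv={cv:.6f} variance={variance:.3f}")
```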
Histogram with fixed size bins (bins=50)
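
The report does not say how the 50 bins are laid out. Assuming equal-width bins that span the minimum to the maximum, with the maximum included in the last bin (the convention used by `numpy.histogram`), the bin width and the bin of a given value can be sketched as follows. The helper `bin_index` is hypothetical and exists only for this illustration.

```python
# Minimum, maximum and bin count taken from the report for 업소수.
minimum, maximum, bins = 1, 388, 50

# Assumption: equal-width bins covering [minimum, maximum].
width = (maximum - minimum) / bins


def bin_index(value):
    """Return the zero-based bin a value falls into (hypothetical helper)."""
    # The maximum sits on the final edge, so clamp it into the last bin.
    return min(int((value - minimum) / width), bins - 1)


print(width)
print(bin_index(1), bin_index(9), bin_index(388))
```

With a width of about 7.74, every value from 1 to 8 lands in the first bin, so under this assumption most of the 621 rows fall into the first bar.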

Common values

Value  Count  Frequency (%)
1  158  25.4%
2  96  15.5%
3  64  10.3%
5  35  5.6%
4  34  5.5%
6  25  4.0%
7  24  3.9%
8  20  3.2%
11  15  2.4%
10  10  1.6%
Other values (62)  140  22.5%

Minimum 10 values

Value  Count  Frequency (%)
1  158  25.4%
2  96  15.5%
3  64  10.3%
4  34  5.5%
5  35  5.6%
6  25  4.0%
7  24  3.9%
8  20  3.2%
9  10  1.6%
10  10  1.6%

Maximum 10 values

Value  Count  Frequency (%)
388  1  0.2%
385  1  0.2%
363  1  0.2%
242  1  0.2%
215  1  0.2%
142  1  0.2%
126  1  0.2%
117  1  0.2%
114  1  0.2%
112  1  0.2%

Interactions


Correlations

          법정동명  업태명  업소수
법정동명  1.000     0.000   0.207
업태명    0.000     1.000   0.000
업소수    0.207     0.000   1.000

          업소수  법정동명
업소수    1.000   0.094
법정동명  0.094   1.000

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

First rows

     자치구명  법정동명  업태명  업소수
0  성동구  상왕십리동  한식  19
1  성동구  상왕십리동  중국식  2
2  성동구  상왕십리동  경양식  1
3  성동구  상왕십리동  일식  5
4  성동구  상왕십리동  분식  3
5  성동구  상왕십리동  호프/통닭  4
6  성동구  상왕십리동  통닭(치킨)  2
7  성동구  상왕십리동  회집  1
8  성동구  상왕십리동  기타  19
9  성동구  상왕십리동  일반조리판매  2

Last rows

     자치구명  법정동명  업태명  업소수
611  성동구  용답동  식품등 수입판매업  9
612  성동구  용답동  식품자동판매기영업  12
613  성동구  용답동  유통전문판매업  6
614  성동구  용답동  위탁급식영업  5
615  성동구  용답동  제과점영업  1
616  성동구  용답동  건강기능식품수입업  1
617  성동구  용답동  영업장판매  16
618  성동구  용답동  방문판매  2
619  성동구  용답동  전자상거래(통신판매업)  25
620  성동구  용답동  건강기능식품유통전문판매업  7

Duplicate rows

Most frequently occurring

   자치구명  법정동명  업태명  업소수  # duplicates
0  성동구  금호동3가  <NA>  1  2
1  성동구  용답동  패스트푸드  2  2