Overview

Dataset statistics

Number of variables: 4
Number of observations: 923
Missing cells: 11
Missing cells (%): 0.3%
Duplicate rows: 1
Duplicate rows (%): 0.1%
Total size in memory: 29.9 KiB
Average record size in memory: 33.1 B
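
The two percentages above can be checked by hand. The sketch below assumes that the missing-cell percentage is taken over all cells (variables times observations) and the duplicate percentage over all rows; that reading is consistent with the reported figures but is not stated in the report itself.

```python
# Figures copied from the dataset statistics above.
n_vars = 4
n_obs = 923
missing_cells = 11
duplicate_rows = 1

# Assumption: missing cells are measured against every cell in the table.
total_cells = n_vars * n_obs
missing_pct = 100 * missing_cells / total_cells

# Assumption: duplicate rows are measured against the number of rows.
duplicate_pct = 100 * duplicate_rows / n_obs

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Duplicate rows: {duplicate_pct:.1f}%")
```

The same 11 missing values are 1.2% when measured against the 923 rows of a single column, which is why the per-variable figure for 업태명 differs from the dataset-level 0.3%.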

Variable types

Categorical: 2
Text: 1
Numeric: 1

Dataset

Description: 자치구명, 법정동명, 업태명, 업소수 (district name, legal-dong name, business type name, number of establishments)
Author: 마포구 (Mapo-gu)
URL: https://data.seoul.go.kr/dataList/OA-11376/S/1/datasetView.do

Alerts

자치구명 has constant value "마포구" (Constant)
Dataset has 1 (0.1%) duplicate rows (Duplicates)
업태명 has 11 (1.2%) missing values (Missing)

Reproduction

Analysis started: 2024-05-18 02:41:16.659317
Analysis finished: 2024-05-18 02:41:17.878746
Duration: 1.22 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
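
The reported duration follows directly from the two timestamps above. A minimal check, assuming both timestamps are in the same time zone:

```python
from datetime import datetime

# Timestamps copied from the Reproduction section.
started = datetime.fromisoformat("2024-05-18 02:41:16.659317")
finished = datetime.fromisoformat("2024-05-18 02:41:17.878746")

# Elapsed wall-clock time in seconds.
duration_seconds = (finished - started).total_seconds()
print(f"Duration: {duration_seconds:.2f} seconds")
```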

Variables

자치구명 (district name)
Categorical

CONSTANT

Distinct: 1
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 7.3 KiB

Value | Count
마포구 | 923

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 마포구
2nd row: 마포구
3rd row: 마포구
4th row: 마포구
5th row: 마포구

Common Values

Value | Count | Frequency (%)
마포구 | 923 | 100.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values; 마포구 accounts for all 923 rows (100.0%)]

법정동명 (legal-dong name)
Categorical

Distinct: 27
Distinct (%): 2.9%
Missing: 0
Missing (%): 0.0%
Memory size: 7.3 KiB

Value | Count
서교동 | 60
상암동 | 55
공덕동 | 54
도화동 | 54
성산동 | 54
Other values (22) | 646

Length

Max length: 4
Median length: 3
Mean length: 3.056338
Min length: 2

Unique

Unique: 1
Unique (%): 0.1%

Sample

1st row: 효창동
2nd row: 아현동
3rd row: 아현동
4th row: 아현동
5th row: 아현동

Common Values

Value | Count | Frequency (%)
서교동 | 60 | 6.5%
상암동 | 55 | 6.0%
공덕동 | 54 | 5.9%
도화동 | 54 | 5.9%
성산동 | 54 | 5.9%
합정동 | 51 | 5.5%
동교동 | 49 | 5.3%
망원동 | 49 | 5.3%
신수동 | 42 | 4.6%
대흥동 | 41 | 4.4%
Other values (17) | 414 | 44.9%
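
The "Other values" row and the frequency column can be reproduced from the ten listed counts. The sketch below assumes each frequency is the count divided by the 923 observations, rounded to one decimal place; the helper name `pct` is illustrative, not part of the report.

```python
# Top-10 counts copied from the Common Values table for 법정동명.
n_obs = 923
top_counts = {
    "서교동": 60, "상암동": 55, "공덕동": 54, "도화동": 54, "성산동": 54,
    "합정동": 51, "동교동": 49, "망원동": 49, "신수동": 42, "대흥동": 41,
}

# Everything not listed individually is pooled into "Other values".
other_count = n_obs - sum(top_counts.values())


def pct(count):
    """Share of all observations, as a percentage with one decimal."""
    return round(100 * count / n_obs, 1)


for name, count in top_counts.items():
    print(f"{name} | {count} | {pct(count)}%")
print(f"Other values | {other_count} | {pct(other_count)}%")
```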

Length

[Figure: histogram of lengths of the category]

업태명 (business type name)
Text

MISSING

Distinct: 75
Distinct (%): 8.2%
Missing: 11
Missing (%): 1.2%
Memory size: 7.3 KiB

Length

Max length: 15
Median length: 12
Mean length: 5.8059211
Min length: 2

Characters and Unicode

Total characters: 5295
Distinct characters: 150
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
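
As a rough illustration of the character properties mentioned above, the following sketch counts Unicode general categories with Python's standard `unicodedata` module. The three input strings are 업태명 values taken from the sample rows of this report; the mapping from two-letter category codes to the long names used in the report covers only the five categories that appear here. It is not the profiling tool's own implementation.

```python
import unicodedata
from collections import Counter

# Long names for the general-category codes that occur in this report.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Zs": "Space Separator",
    "Po": "Other Punctuation",
}

# A few 업태명 values from the sample rows (illustrative subset only).
samples = ["호프/통닭", "통닭(치킨)", "기타(복합 등)"]


def category_counts(values):
    """Count characters per Unicode general category across all values."""
    counts = Counter()
    for value in values:
        for ch in value:
            code = unicodedata.category(ch)
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts


counts = category_counts(samples)
for name, count in counts.most_common():
    print(f"{name} | {count}")
```

Hangul syllables fall under Other Letter, which is why that category dominates the tables for this variable.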

Unique

Unique: 7
Unique (%): 0.8%

Sample

1st row: 커피숍
2nd row: 한식
3rd row: 중국식
4th row: 경양식
5th row: 일식

Value | Count | Frequency (%)
기타 | 84 | 8.3%
패스트푸드 | 31 | 3.1%
식품제조가공업 | 27 | 2.7%
식품자동판매기영업 | 25 | 2.5%
한식 | 25 | 2.5%
커피숍 | 23 | 2.3%
편의점 | 23 | 2.3%
유통전문판매업 | 23 | 2.3%
즉석판매제조가공업 | 23 | 2.3%
전자상거래(통신판매업 | 22 | 2.2%
Other values (66) | 710 | 69.9%

Most occurring characters

Value | Count | Frequency (%)
[?] | 376 | 7.1%
[?] | 323 | 6.1%
[?] | 249 | 4.7%
[?] | 236 | 4.5%
[?] | 195 | 3.7%
[?] | 171 | 3.2%
[?] | 124 | 2.3%
[?] | 116 | 2.2%
[?] | 113 | 2.1%
( | 107 | 2.0%
Other values (140) | 3285 | 62.0%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 4917 | 92.9%
Open Punctuation | 107 | 2.0%
Close Punctuation | 107 | 2.0%
Space Separator | 104 | 2.0%
Other Punctuation | 60 | 1.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
[?] | 376 | 7.6%
[?] | 323 | 6.6%
[?] | 249 | 5.1%
[?] | 236 | 4.8%
[?] | 195 | 4.0%
[?] | 171 | 3.5%
[?] | 124 | 2.5%
[?] | 116 | 2.4%
[?] | 113 | 2.3%
[?] | 90 | 1.8%
Other values (135) | 2924 | 59.5%

Other Punctuation
Value | Count | Frequency (%)
/ | 42 | 70.0%
, | 18 | 30.0%

Open Punctuation
Value | Count | Frequency (%)
( | 107 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 107 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 104 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 4917 | 92.9%
Common | 378 | 7.1%

Most frequent character per script

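Hangul
Value | Count | Frequency (%)
[?] | 376 | 7.6%
[?] | 323 | 6.6%
[?] | 249 | 5.1%
[?] | 236 | 4.8%
[?] | 195 | 4.0%
[?] | 171 | 3.5%
[?] | 124 | 2.5%
[?] | 116 | 2.4%
[?] | 113 | 2.3%
[?] | 90 | 1.8%
Other values (135) | 2924 | 59.5%

Common
Value | Count | Frequency (%)
( | 107 | 28.3%
) | 107 | 28.3%
(space) | 104 | 27.5%
/ | 42 | 11.1%
, | 18 | 4.8%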

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 4917 | 92.9%
ASCII | 378 | 7.1%

Most frequent character per block

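Hangul
Value | Count | Frequency (%)
[?] | 376 | 7.6%
[?] | 323 | 6.6%
[?] | 249 | 5.1%
[?] | 236 | 4.8%
[?] | 195 | 4.0%
[?] | 171 | 3.5%
[?] | 124 | 2.5%
[?] | 116 | 2.4%
[?] | 113 | 2.3%
[?] | 90 | 1.8%
Other values (135) | 2924 | 59.5%

ASCII
Value | Count | Frequency (%)
( | 107 | 28.3%
) | 107 | 28.3%
(space) | 104 | 27.5%
/ | 42 | 11.1%
, | 18 | 4.8%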

업소수 (number of establishments)
Real number (ℝ)

Distinct: 100
Distinct (%): 10.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 15.901408
Minimum: 1
Maximum: 566
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 8.2 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
Median: 3
Q3: 13
95-th percentile: 71
Maximum: 566
Range: 565
Interquartile range (IQR): 12

Descriptive statistics

Standard deviation: 40.353052
Coefficient of variation (CV): 2.537703
Kurtosis: 69.985955
Mean: 15.901408
Median Absolute Deviation (MAD): 2
Skewness: 7.0306974
Sum: 14677
Variance: 1628.3688
Monotonicity: Not monotonic
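
Several of the statistics above are derived from one another, so they can be cross-checked without access to the raw data. The sketch below uses the standard definitions (CV as standard deviation over mean, variance as the squared standard deviation, IQR as Q3 minus Q1); it is a consistency check on the reported figures, not the profiling tool's code.

```python
# Figures copied from the quantile and descriptive statistics for 업소수.
n_obs = 923
total = 14677
mean = 15.901408
std = 40.353052
q1, q3 = 1, 13
minimum, maximum = 1, 566

cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance is the squared standard deviation
iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # range
mean_from_sum = total / n_obs    # mean recomputed from the sum

print(f"CV: {cv:.6f}")
print(f"Variance: {variance:.4f}")
print(f"IQR: {iqr}, Range: {value_range}")
print(f"Mean from sum: {mean_from_sum:.6f}")
```

A CV above 2 together with a median of 3 and a maximum of 566 points to a strongly right-skewed count distribution, in line with the reported skewness of about 7.03.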
[Figure: histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
1 | 256 | 27.7%
2 | 129 | 14.0%
3 | 83 | 9.0%
4 | 50 | 5.4%
5 | 38 | 4.1%
6 | 31 | 3.4%
8 | 23 | 2.5%
7 | 19 | 2.1%
10 | 16 | 1.7%
9 | 16 | 1.7%
Other values (90) | 262 | 28.4%

Minimum 10 values

Value | Count | Frequency (%)
1 | 256 | 27.7%
2 | 129 | 14.0%
3 | 83 | 9.0%
4 | 50 | 5.4%
5 | 38 | 4.1%
6 | 31 | 3.4%
7 | 19 | 2.1%
8 | 23 | 2.5%
9 | 16 | 1.7%
10 | 16 | 1.7%

Maximum 10 values

Value | Count | Frequency (%)
566 | 1 | 0.1%
492 | 1 | 0.1%
359 | 1 | 0.1%
294 | 1 | 0.1%
244 | 1 | 0.1%
220 | 1 | 0.1%
202 | 1 | 0.1%
197 | 1 | 0.1%
196 | 1 | 0.1%
193 | 1 | 0.1%

Interactions

[Figure: interactions plot]

Correlations

[Figure: correlation heatmap]

Variable | 법정동명 | 업태명 | 업소수
법정동명 | 1.000 | 0.000 | 0.000
업태명 | 0.000 | 1.000 | 0.000
업소수 | 0.000 | 0.000 | 1.000

[Figure: correlation heatmap]

Variable | 업소수 | 법정동명
업소수 | 1.000 | 0.000
법정동명 | 0.000 | 1.000

Missing values

[Figure: nullity by column]
A simple visualization of nullity by column.

[Figure: nullity matrix]
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

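First rows

Index | 자치구명 | 법정동명 | 업태명 | 업소수
0 | 마포구 | 효창동 | 커피숍 | 1
1 | 마포구 | 아현동 | 한식 | 62
2 | 마포구 | 아현동 | 중국식 | 5
3 | 마포구 | 아현동 | 경양식 | 7
4 | 마포구 | 아현동 | 일식 | 9
5 | 마포구 | 아현동 | 분식 | 13
6 | 마포구 | 아현동 | 호프/통닭 | 14
7 | 마포구 | 아현동 | 통닭(치킨) | 2
8 | 마포구 | 아현동 | 회집 | 1
9 | 마포구 | 아현동 | 까페 | 3

Last rows

Index | 자치구명 | 법정동명 | 업태명 | 업소수
913 | 마포구 | 상암동 | 영업장판매 | 39
914 | 마포구 | 상암동 | 방문판매 | 3
915 | 마포구 | 상암동 | 전자상거래(통신판매업) | 55
916 | 마포구 | 상암동 | <NA> | 2
917 | 마포구 | 상암동 | 다단계판매 | 1
918 | 마포구 | 상암동 | 도매업(유통) | 1
919 | 마포구 | 상암동 | 자동판매기판매 | 1
920 | 마포구 | 상암동 | 기타(복합 등) | 8
921 | 마포구 | 상암동 | 기타 건강기능식품일반판매업 | 2
922 | 마포구 | 상암동 | 건강기능식품유통전문판매업 | 10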

Duplicate rows

Most frequently occurring

Index | 자치구명 | 법정동명 | 업태명 | 업소수 | # duplicates
0 | 마포구 | 용강동 | 패스트푸드 | 1 | 2
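
One way to read the table above is that the row occurs twice in the data, so one of the two copies is counted as a duplicate in the dataset statistics. The sketch below illustrates that interpretation with a tiny hand-built list of rows; it is an assumption about how the counts relate, and the three rows are an illustrative subset, not the full dataset.

```python
from collections import Counter

# Illustrative subset: the duplicated row appears twice, plus one unrelated row.
rows = [
    ("마포구", "용강동", "패스트푸드", 1),
    ("마포구", "아현동", "한식", 62),
    ("마포구", "용강동", "패스트푸드", 1),
]

# Count identical rows; tuples are hashable, so Counter can key on whole rows.
counts = Counter(rows)

# Rows that occur more than once, with their number of occurrences.
duplicates = {row: n for row, n in counts.items() if n > 1}

# Redundant copies: every occurrence beyond the first.
extra_rows = sum(n - 1 for n in duplicates.values())

for row, n in duplicates.items():
    print(row, "occurs", n, "times")
print("Redundant rows:", extra_rows)
```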