Overview

Dataset statistics

Number of variables: 6
Number of observations: 201
Missing cells: 1
Missing cells (%): 0.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 9.7 KiB
Average record size in memory: 49.7 B
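
As a quick sanity check on the figures above, the missing-cell percentage can be recomputed from the variable and observation counts. This is only a sketch that assumes the percentage is missing cells divided by total cells (rows times columns); the variable names are illustrative.

```python
# Figures copied from the dataset statistics above.
n_variables = 6
n_observations = 201
missing_cells = 1

# Assumption: the percentage is taken over every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

print(total_cells)            # 1206
print(f"{missing_pct:.1f}%")  # 0.1%
```

The unrounded value is roughly 0.083%, which is why a single missing cell still shows up as 0.1%.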

Variable types

Numeric: 1
Text: 3
Categorical: 1
DateTime: 1

Dataset

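Description: Status of businesses that generate large volumes of food waste (business name, location, phone number, collection type)
Author: Seongbuk-gu, Seoul (서울특별시 성북구)
URL: https://www.data.go.kr/data/15034345/fileData.do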

Alerts

데이터기준일자 has constant value "" (Constant)
수거형태 is highly imbalanced (56.5%) (Imbalance)
연번 has unique values (Unique)
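
The 56.5% imbalance figure can be reproduced from the class counts reported for 수거형태 (183 and 18). The sketch below assumes the score is one minus the Shannon entropy of the class proportions, normalised by the maximum entropy for that number of classes. This formula matches the reported value, but it is an assumption about how the tool computes it; `imbalance_score` is an illustrative name.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k): 0 for balanced classes, near 1 when one class dominates.

    H is the Shannon entropy (base 2) of the class proportions and k is the
    number of classes. Assumed formula, not confirmed by the report itself.
    """
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # assumed convention when there is only one class
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)

score = imbalance_score([183, 18])
print(f"{score:.1%}")  # 56.5%
```

A perfectly balanced two-class column would score 0 under this definition.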

Reproduction

Analysis started: 2023-12-12 15:20:17.573202
Analysis finished: 2023-12-12 15:20:18.270575
Duration: 0.7 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
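
The reported duration follows directly from the two timestamps. A minimal check with the standard library:

```python
from datetime import datetime

# Timestamps copied from the reproduction details above.
started = datetime.fromisoformat("2023-12-12 15:20:17.573202")
finished = datetime.fromisoformat("2023-12-12 15:20:18.270575")

duration = (finished - started).total_seconds()
print(f"{duration:.1f} seconds")  # 0.7 seconds
```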

Variables

연번 (serial number)
Real number (ℝ)

UNIQUE 

Distinct: 201
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 101
Minimum: 1
Maximum: 201
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.9 KiB

Quantile statistics

Minimum: 1
5-th percentile: 11
Q1: 51
Median: 101
Q3: 151
95-th percentile: 191
Maximum: 201
Range: 200
Interquartile range (IQR): 100

Descriptive statistics

Standard deviation: 58.167861
Coefficient of variation (CV): 0.57591941
Kurtosis: -1.2
Mean: 101
Median Absolute Deviation (MAD): 50
Skewness: 0
Sum: 20301
Variance: 3383.5
Monotonicity: Strictly increasing
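
Because 연번 takes every integer from 1 to 201 exactly once (201 distinct values, minimum 1, maximum 201, strictly increasing), the quantile and descriptive statistics above can be checked with the standard library alone. This sketch assumes percentiles use linear interpolation between closest ranks and that the variance is the sample variance (divisor n - 1); `percentile` is an illustrative helper, not part of any library.

```python
import statistics

values = list(range(1, 202))  # 1..201, one value per row

def percentile(sorted_values, p):
    """Percentile by linear interpolation between closest ranks (assumed method)."""
    pos = p * (len(sorted_values) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (pos - lo) * (sorted_values[hi] - sorted_values[lo])

mean = statistics.mean(values)
median = statistics.median(values)
variance = statistics.variance(values)   # sample variance, divisor n - 1
std = statistics.stdev(values)
cv = std / mean                          # coefficient of variation
mad = statistics.median([abs(v - median) for v in values])
strictly_increasing = all(a < b for a, b in zip(values, values[1:]))

print(sum(values), mean, median, variance)   # 20301 101 101 3383.5
print(round(std, 6), round(cv, 8), mad)
print(percentile(values, 0.25), percentile(values, 0.75))
```

The interquartile range of 100 is simply Q3 minus Q1 (151 - 51), and the range of 200 is the maximum minus the minimum.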
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
1 | 1 | 0.5%
139 | 1 | 0.5%
129 | 1 | 0.5%
130 | 1 | 0.5%
131 | 1 | 0.5%
132 | 1 | 0.5%
133 | 1 | 0.5%
134 | 1 | 0.5%
135 | 1 | 0.5%
136 | 1 | 0.5%
Other values (191) | 191 | 95.0%

Value | Count | Frequency (%)
1 | 1 | 0.5%
2 | 1 | 0.5%
3 | 1 | 0.5%
4 | 1 | 0.5%
5 | 1 | 0.5%
6 | 1 | 0.5%
7 | 1 | 0.5%
8 | 1 | 0.5%
9 | 1 | 0.5%
10 | 1 | 0.5%

Value | Count | Frequency (%)
201 | 1 | 0.5%
200 | 1 | 0.5%
199 | 1 | 0.5%
198 | 1 | 0.5%
197 | 1 | 0.5%
196 | 1 | 0.5%
195 | 1 | 0.5%
194 | 1 | 0.5%
193 | 1 | 0.5%
192 | 1 | 0.5%

상호 (business name)
Text

Distinct: 199
Distinct (%): 99.0%
Missing: 0
Missing (%): 0.0%
Memory size: 1.7 KiB

Length

Max length: 24
Median length: 18
Mean length: 8.9303483
Min length: 2

Characters and Unicode

Total characters: 1795
Distinct characters: 319
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
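
To illustrate how such character properties are obtained, the sketch below uses Python's standard `unicodedata` module to count the general categories in one business name from this dataset. The mapping from two-letter codes to the long names used in this report is written out by hand and covers only the codes needed here; whether the profiling tool uses the same module internally is not stated in the report.

```python
import unicodedata
from collections import Counter

# Hand-written subset of Unicode general category names.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Lu": "Uppercase Letter",
    "Ll": "Lowercase Letter",
    "Nd": "Decimal Number",
    "Zs": "Space Separator",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Po": "Other Punctuation",
    "Pd": "Dash Punctuation",
    "Pc": "Connector Punctuation",
    "Sm": "Math Symbol",
    "So": "Other Symbol",
}

def category_counts(text):
    """Count the characters of `text` by Unicode general category code."""
    return Counter(unicodedata.category(ch) for ch in text)

sample = "(주)BKR버거킹안암오거리점"  # one 상호 value from the dataset
counts = category_counts(sample)
for code, n in counts.most_common():
    print(CATEGORY_NAMES.get(code, code), n)
```

Hangul syllables fall under Other Letter, which is why that category dominates the Korean text columns.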

Unique

Unique: 197
Unique (%): 98.0%

Sample

1st row: (의)참예원의료재단 성북참노인전문병원
2nd row: (주)BKR버거킹안암오거리점
3rd row: (주)갈비원(갈비정원)
4th row: (주)델리후레쉬 고대교우회관점
5th row: (주)델리후레쉬 고려대점
Value | Count | Frequency (%)
스타벅스 | 7 | 2.6%
주식회사 | 4 | 1.5%
장어세상 | 3 | 1.1%
주)델리후레쉬 | 3 | 1.1%
주)아라마크 | 2 | 0.7%
성신여대점 | 2 | 0.7%
정릉점 | 2 | 0.7%
종암점 | 2 | 0.7%
미아점 | 2 | 0.7%
국민대학교 | 2 | 0.7%
Other values (235) | 238 | 89.1%

Most occurring characters

ValueCountFrequency (%)
80
 
4.5%
73
 
4.1%
67
 
3.7%
( 44
 
2.5%
) 44
 
2.5%
42
 
2.3%
38
 
2.1%
37
 
2.1%
32
 
1.8%
32
 
1.8%
Other values (309) 1306
72.8%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 1597 | 89.0%
Space Separator | 67 | 3.7%
Open Punctuation | 46 | 2.6%
Close Punctuation | 46 | 2.6%
Uppercase Letter | 21 | 1.2%
Lowercase Letter | 14 | 0.8%
Decimal Number | 2 | 0.1%
Other Punctuation | 1 | 0.1%
Other Symbol | 1 | 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
80
 
5.0%
73
 
4.6%
42
 
2.6%
38
 
2.4%
37
 
2.3%
32
 
2.0%
32
 
2.0%
31
 
1.9%
30
 
1.9%
28
 
1.8%
Other values (281) 1174
73.5%
Uppercase Letter
ValueCountFrequency (%)
S 3
14.3%
F 3
14.3%
K 3
14.3%
C 2
9.5%
L 2
9.5%
E 2
9.5%
G 1
 
4.8%
T 1
 
4.8%
H 1
 
4.8%
D 1
 
4.8%
Other values (2) 2
9.5%
Lowercase Letter
ValueCountFrequency (%)
e 5
35.7%
r 2
 
14.3%
f 2
 
14.3%
v 1
 
7.1%
a 1
 
7.1%
t 1
 
7.1%
k 1
 
7.1%
s 1
 
7.1%
Open Punctuation
ValueCountFrequency (%)
( 44
95.7%
[ 2
 
4.3%
Close Punctuation
ValueCountFrequency (%)
) 44
95.7%
] 2
 
4.3%
Space Separator
ValueCountFrequency (%)
67
100.0%
Decimal Number
ValueCountFrequency (%)
1 2
100.0%
Other Punctuation
ValueCountFrequency (%)
& 1
100.0%
Other Symbol
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 1598 | 89.0%
Common | 162 | 9.0%
Latin | 35 | 1.9%

Most frequent character per script

Hangul
ValueCountFrequency (%)
80
 
5.0%
73
 
4.6%
42
 
2.6%
38
 
2.4%
37
 
2.3%
32
 
2.0%
32
 
2.0%
31
 
1.9%
30
 
1.9%
28
 
1.8%
Other values (282) 1175
73.5%
Latin
ValueCountFrequency (%)
e 5
14.3%
S 3
 
8.6%
F 3
 
8.6%
K 3
 
8.6%
C 2
 
5.7%
r 2
 
5.7%
L 2
 
5.7%
E 2
 
5.7%
f 2
 
5.7%
v 1
 
2.9%
Other values (10) 10
28.6%
Common
ValueCountFrequency (%)
67
41.4%
( 44
27.2%
) 44
27.2%
1 2
 
1.2%
] 2
 
1.2%
[ 2
 
1.2%
& 1
 
0.6%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 1597 | 89.0%
ASCII | 197 | 11.0%
None | 1 | 0.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
80
 
5.0%
73
 
4.6%
42
 
2.6%
38
 
2.4%
37
 
2.3%
32
 
2.0%
32
 
2.0%
31
 
1.9%
30
 
1.9%
28
 
1.8%
Other values (281) 1174
73.5%
ASCII
ValueCountFrequency (%)
67
34.0%
( 44
22.3%
) 44
22.3%
e 5
 
2.5%
S 3
 
1.5%
F 3
 
1.5%
K 3
 
1.5%
1 2
 
1.0%
C 2
 
1.0%
r 2
 
1.0%
Other values (17) 22
 
11.2%
None
ValueCountFrequency (%)
1
100.0%

전화번호 (phone number)
Text

Distinct: 191
Distinct (%): 95.5%
Missing: 1
Missing (%): 0.5%
Memory size: 1.7 KiB

Length

Max length: 14
Median length: 11
Mean length: 11.24
Min length: 9

Characters and Unicode

Total characters: 2248
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 183
Unique (%): 91.5%

Sample

1st row: 02-912-2114
2nd row: 02-922-0332
3rd row: 02-927-9229
4th row: 02-3290-1811
5th row: 02-3290-1811
Value | Count | Frequency (%)
02-943-2495 | 3 | 1.5%
02-3290-1811 | 2 | 1.0%
02-929-3920 | 2 | 1.0%
02-905-1100 | 2 | 1.0%
02-940-7028 | 2 | 1.0%
02-962-1472 | 2 | 1.0%
02-914-5075 | 2 | 1.0%
02-910-4959 | 2 | 1.0%
02-762-7187 | 1 | 0.5%
02-927-7187 | 1 | 0.5%
Other values (181) | 181 | 90.5%
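
The frequency table above also shows how "Distinct" and "Unique" differ for this column. Distinct counts every different phone number, while Unique counts only those that occur exactly once. The sketch below rebuilds the counts from the table; the 181 numbers behind "Other values" are not listed in the report, so placeholder keys (hypothetical, named `other-0` and so on) stand in for them. Percentages are assumed to be relative to the 200 non-missing rows.

```python
from collections import Counter

# Counts copied from the frequency table above.
listed = {
    "02-943-2495": 3,
    "02-3290-1811": 2,
    "02-929-3920": 2,
    "02-905-1100": 2,
    "02-940-7028": 2,
    "02-962-1472": 2,
    "02-914-5075": 2,
    "02-910-4959": 2,
    "02-762-7187": 1,
    "02-927-7187": 1,
}
# Placeholder keys for the 181 unlisted values, each occurring once.
other = {f"other-{i}": 1 for i in range(181)}
freq = Counter({**listed, **other})

n_present = sum(freq.values())                     # non-missing rows
distinct = len(freq)                               # different values
unique = sum(1 for c in freq.values() if c == 1)   # values seen exactly once

print(n_present, distinct, unique)       # 200 191 183
print(f"{distinct / n_present:.1%}")     # 95.5%
print(f"{unique / n_present:.1%}")       # 91.5%
```

The eight repeated numbers account for the gap between the two figures: 191 distinct values minus 8 repeated ones leaves 183 unique values.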

Most occurring characters

ValueCountFrequency (%)
- 399
17.7%
2 389
17.3%
0 345
15.3%
9 250
11.1%
1 164
7.3%
7 142
 
6.3%
3 130
 
5.8%
6 117
 
5.2%
4 109
 
4.8%
5 107
 
4.8%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 1849 | 82.3%
Dash Punctuation | 399 | 17.7%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 389
21.0%
0 345
18.7%
9 250
13.5%
1 164
8.9%
7 142
 
7.7%
3 130
 
7.0%
6 117
 
6.3%
4 109
 
5.9%
5 107
 
5.8%
8 96
 
5.2%
Dash Punctuation
ValueCountFrequency (%)
- 399
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 2248
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
- 399
17.7%
2 389
17.3%
0 345
15.3%
9 250
11.1%
1 164
7.3%
7 142
 
6.3%
3 130
 
5.8%
6 117
 
5.2%
4 109
 
4.8%
5 107
 
4.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2248
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
- 399
17.7%
2 389
17.3%
0 345
15.3%
9 250
11.1%
1 164
7.3%
7 142
 
6.3%
3 130
 
5.8%
6 117
 
5.2%
4 109
 
4.8%
5 107
 
4.8%

사업장도로명주소 (business road-name address)
Text

Distinct: 190
Distinct (%): 94.5%
Missing: 0
Missing (%): 0.0%
Memory size: 1.7 KiB

Length

Max length: 55
Median length: 45
Mean length: 28.99005
Min length: 21

Characters and Unicode

Total characters: 5827
Distinct characters: 184
Distinct categories: 11
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 179
Unique (%): 89.1%

Sample

1st row: 서울특별시 성북구 북악산로1길 71 (정릉동)
2nd row: 서울특별시 성북구 안암로 73 (안암동5가)
3rd row: 서울특별시 성북구 동소문로 81 (동소문동6가)
4th row: 서울특별시 성북구 종암로 13_ 고려대학교교우회 (종암동)
5th row: 서울특별시 성북구 안암로 145 (안암동5가_ 고려대학교 학생식당 1층)
Value | Count | Frequency (%)
서울특별시 | 201 | 17.7%
성북구 | 200 | 17.7%
정릉동 | 29 | 2.6%
안암동5가 | 27 | 2.4%
하월곡동 | 22 | 1.9%
종암동 | 22 | 1.9%
종암로 | 19 | 1.7%
동소문로 | 17 | 1.5%
정릉로 | 15 | 1.3%
돈암동 | 14 | 1.2%
Other values (329) | 567 | 50.0%
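
A word-frequency table like the one above can be approximated by splitting each address on whitespace and counting the tokens. The report does not say which tokenizer it uses, so the sketch below assumes whitespace splitting followed by stripping surrounding parentheses, underscores and commas. It runs on the five sample addresses only; `tokens` is an illustrative helper.

```python
from collections import Counter

# The five sample rows shown for this variable.
samples = [
    "서울특별시 성북구 북악산로1길 71 (정릉동)",
    "서울특별시 성북구 안암로 73 (안암동5가)",
    "서울특별시 성북구 동소문로 81 (동소문동6가)",
    "서울특별시 성북구 종암로 13_ 고려대학교교우회 (종암동)",
    "서울특별시 성북구 안암로 145 (안암동5가_ 고려대학교 학생식당 1층)",
]

def tokens(address):
    """Yield whitespace-separated words with surrounding punctuation removed."""
    for raw in address.split():
        word = raw.strip("()_,")  # assumed punctuation set
        if word:
            yield word

counts = Counter(t for address in samples for t in tokens(address))
for word, n in counts.most_common(4):
    print(word, n)
```

On the full column the same approach would put 서울특별시 and 성북구 at the top, since nearly every address starts with them.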

Most occurring characters

ValueCountFrequency (%)
932
 
16.0%
252
 
4.3%
236
 
4.1%
235
 
4.0%
218
 
3.7%
207
 
3.6%
204
 
3.5%
202
 
3.5%
( 202
 
3.5%
) 202
 
3.5%
Other values (174) 2937
50.4%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 3568 | 61.2%
Space Separator | 932 | 16.0%
Decimal Number | 788 | 13.5%
Open Punctuation | 202 | 3.5%
Close Punctuation | 202 | 3.5%
Connector Punctuation | 89 | 1.5%
Other Punctuation | 17 | 0.3%
Dash Punctuation | 14 | 0.2%
Uppercase Letter | 9 | 0.2%
Lowercase Letter | 4 | 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
252
 
7.1%
236
 
6.6%
235
 
6.6%
218
 
6.1%
207
 
5.8%
204
 
5.7%
202
 
5.7%
201
 
5.6%
201
 
5.6%
196
 
5.5%
Other values (146) 1416
39.7%
Decimal Number
ValueCountFrequency (%)
1 178
22.6%
2 116
14.7%
4 82
10.4%
5 80
10.2%
3 72
9.1%
6 72
9.1%
0 64
 
8.1%
7 55
 
7.0%
8 40
 
5.1%
9 29
 
3.7%
Uppercase Letter
ValueCountFrequency (%)
B 2
22.2%
K 2
22.2%
H 2
22.2%
S 1
11.1%
J 1
11.1%
D 1
11.1%
Lowercase Letter
ValueCountFrequency (%)
e 1
25.0%
s 1
25.0%
k 1
25.0%
b 1
25.0%
Other Punctuation
ValueCountFrequency (%)
, 16
94.1%
/ 1
 
5.9%
Space Separator
ValueCountFrequency (%)
932
100.0%
Open Punctuation
ValueCountFrequency (%)
( 202
100.0%
Close Punctuation
ValueCountFrequency (%)
) 202
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 89
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 14
100.0%
Math Symbol
ValueCountFrequency (%)
~ 2
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 3568 | 61.2%
Common | 2246 | 38.5%
Latin | 13 | 0.2%

Most frequent character per script

Hangul
ValueCountFrequency (%)
252
 
7.1%
236
 
6.6%
235
 
6.6%
218
 
6.1%
207
 
5.8%
204
 
5.7%
202
 
5.7%
201
 
5.6%
201
 
5.6%
196
 
5.5%
Other values (146) 1416
39.7%
Common
ValueCountFrequency (%)
932
41.5%
( 202
 
9.0%
) 202
 
9.0%
1 178
 
7.9%
2 116
 
5.2%
_ 89
 
4.0%
4 82
 
3.7%
5 80
 
3.6%
3 72
 
3.2%
6 72
 
3.2%
Other values (8) 221
 
9.8%
Latin
ValueCountFrequency (%)
B 2
15.4%
K 2
15.4%
H 2
15.4%
e 1
7.7%
s 1
7.7%
k 1
7.7%
b 1
7.7%
S 1
7.7%
J 1
7.7%
D 1
7.7%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 3568 | 61.2%
ASCII | 2259 | 38.8%

Most frequent character per block

ASCII
ValueCountFrequency (%)
932
41.3%
( 202
 
8.9%
) 202
 
8.9%
1 178
 
7.9%
2 116
 
5.1%
_ 89
 
3.9%
4 82
 
3.6%
5 80
 
3.5%
3 72
 
3.2%
6 72
 
3.2%
Other values (18) 234
 
10.4%
Hangul
ValueCountFrequency (%)
252
 
7.1%
236
 
6.6%
235
 
6.6%
218
 
6.1%
207
 
5.8%
204
 
5.7%
202
 
5.7%
201
 
5.6%
201
 
5.6%
196
 
5.5%
Other values (146) 1416
39.7%

수거형태 (collection type)
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 1.7 KiB
위탁처리 (outsourced disposal): 183
자가처리 (self-disposal): 18

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 위탁처리
2nd row: 위탁처리
3rd row: 위탁처리
4th row: 위탁처리
5th row: 위탁처리

Common Values

Value | Count | Frequency (%)
위탁처리 | 183 | 91.0%
자가처리 | 18 | 9.0%

Length

Histogram of lengths of the category

데이터기준일자 (data reference date)
Date

CONSTANT 

Distinct: 1
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 1.7 KiB
Minimum: 2021-12-10 00:00:00
Maximum: 2021-12-10 00:00:00
Histogram with fixed size bins (bins=1)

Interactions


Correlations

Variable | 연번 | 수거형태
연번 | 1.000 | 0.222
수거형태 | 0.222 | 1.000

Variable | 연번 | 수거형태
연번 | 1.000 | 0.166
수거형태 | 0.166 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

(index) | 연번 | 상호 | 전화번호 | 사업장도로명주소 | 수거형태 | 데이터기준일자
0 | 1 | (의)참예원의료재단 성북참노인전문병원 | 02-912-2114 | 서울특별시 성북구 북악산로1길 71 (정릉동) | 위탁처리 | 2021-12-10
1 | 2 | (주)BKR버거킹안암오거리점 | 02-922-0332 | 서울특별시 성북구 안암로 73 (안암동5가) | 위탁처리 | 2021-12-10
2 | 3 | (주)갈비원(갈비정원) | 02-927-9229 | 서울특별시 성북구 동소문로 81 (동소문동6가) | 위탁처리 | 2021-12-10
3 | 4 | (주)델리후레쉬 고대교우회관점 | 02-3290-1811 | 서울특별시 성북구 종암로 13_ 고려대학교교우회 (종암동) | 위탁처리 | 2021-12-10
4 | 5 | (주)델리후레쉬 고려대점 | 02-3290-1811 | 서울특별시 성북구 안암로 145 (안암동5가_ 고려대학교 학생식당 1층) | 위탁처리 | 2021-12-10
5 | 6 | (주)델리후레쉬 학생회관점 | 02-922-0730 | 서울특별시 성북구 안암로 145 (안암동5가_ 고려대학교 학생회관 2층 스넥식당) | 위탁처리 | 2021-12-10
6 | 7 | (주)디에스와이컴퍼니 알루엣 | 02-919-9505 | 서울특별시 성북구 서경로 60_ 제상가동 제비108호 (정릉동_ 길음뉴타운11단지롯데캐슬골든힐스) | 위탁처리 | 2021-12-10
7 | 8 | (주)산들푸드 한성대 1호점 | 02-766-1977 | 서울특별시 성북구 삼선교로16길 116_ 한성대학교 (삼선동2가) | 위탁처리 | 2021-12-10
8 | 9 | (주)산들푸드성신여대수정캠퍼스점 | 02-921-4566 | 서울특별시 성북구 보문로34다길 2_ 성신여자대학교 난향관 3층 (돈암동) | 위탁처리 | 2021-12-10
9 | 10 | 성북동면옥 | 02-762-3450 | 서울특별시 성북구 대사관로 40, 지상1,2,3층 (성북동) | 위탁처리 | 2021-12-10

(index) | 연번 | 상호 | 전화번호 | 사업장도로명주소 | 수거형태 | 데이터기준일자
191 | 192 | 화로구이조선 | 02-953-8805 | 서울특별시 성북구 종암로 129 (종암동) | 위탁처리 | 2021-12-10
192 | 193 | 피버(fever) | 02-6225-2188 | 서울특별시 성북구 화랑로 265, 1,2층 101,102,202,203호 (장위동, H하우스 장위) | 위탁처리 | 2021-12-10
193 | 194 | 장어세상 | 02-910-6007 | 서울특별시 성북구 오패산로3길 5, 1층 (하월곡동) | 위탁처리 | 2021-12-10
194 | 195 | SELF 장어세상 | 02-943-9827 | 서울특별시 성북구 정릉로 240 (정릉동) | 위탁처리 | 2021-12-10
195 | 196 | 경성진갈비 종암점 | 02-929-1233 | 서울특별시 성북구 종암로 129 (종암동, 청한상가2층(218호 외 16, 1층 149)) | 위탁처리 | 2021-12-10
196 | 197 | 김영희강남동태찜정릉점 | 02-914-5075 | 서울특별시 성북구 정릉로 263 (정릉동) | 위탁처리 | 2021-12-10
197 | 198 | 오거리술집상구비어 | 0507-1314-7739 | 서울특별시 성북구 고려대로24길 60 (안암동5가) | 위탁처리 | 2021-12-10
198 | 199 | 카페그레테(Cafe Grete) | 02-941-1633 | 서울특별시 성북구 보국문로 87-6, 1,2층 (정릉동) | 위탁처리 | 2021-12-10
199 | 200 | ㈜야단법석 한상 | 02-962-1472 | 서울특별시 성북구 화랑로 248 (석관동, 1301~1304 장위뉴타워) | 위탁처리 | 2021-12-10
200 | 201 | 호랭이술집 | 02-929-3920 | 서울특별시 성북구 고려대로26길 14, 지하1층 B101호 (안암동5가) | 위탁처리 | 2021-12-10