Overview

Dataset statistics

Number of variables: 2
Number of observations: 363
Missing cells: 254
Missing cells (%): 35.0%
Duplicate rows: 1
Duplicate rows (%): 0.3%
Total size in memory: 5.8 KiB
Average record size in memory: 16.4 B
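
The derived figures above can be reproduced from the raw counts. The following is a minimal Python sketch of that arithmetic, assuming the missing-cell percentage is missing cells divided by rows times columns and that 1 KiB is 1024 bytes. The variable names are illustrative and are not taken from the profiling tool.

```python
# Figures copied from the dataset statistics above.
n_variables = 2
n_observations = 363
missing_cells = 254
duplicate_rows = 1
total_size_kib = 5.8

# Assumption: percentages are taken over all cells / all rows.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells
duplicate_pct = 100 * duplicate_rows / n_observations

# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Duplicate rows (%): {duplicate_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

Rounded to one decimal place, these come out to 35.0%, 0.3%, and 16.4 B, which agrees with the reported values.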

Variable types

Text: 2

Dataset

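Description: Provides the business names and contact numbers (partial, excluding mobile phone numbers) of mail-order businesses registered in Osan-si, Gyeonggi-do, for the purpose of selling food.
Author: Osan-si, Gyeonggi-do (경기도 오산시)
URL: https://www.data.go.kr/data/15085719/fileData.do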

Alerts

Duplicates: Dataset has 1 (0.3%) duplicate rows
Missing: 연락처 (contact number) has 254 (70.0%) missing values

Reproduction

Analysis started: 2023-12-12 13:18:57.975103
Analysis finished: 2023-12-12 13:18:58.703812
Duration: 0.73 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

업체명 (business name)
Text

Distinct: 361
Distinct (%): 99.4%
Missing: 0
Missing (%): 0.0%
Memory size: 3.0 KiB

Length

Max length: 33
Median length: 21
Mean length: 6.9807163
Min length: 1

Characters and Unicode

Total characters: 2534
Distinct characters: 464
Distinct categories: 11
Distinct scripts: 4
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
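
As an illustration of how such character properties can be read, here is a minimal Python sketch that counts Unicode general categories with the standard library's `unicodedata` module. It is not the profiling tool's own code. The two input strings are taken from this report's sample rows, and the helper name `category_counts` is hypothetical.

```python
import unicodedata
from collections import Counter

# Two business names taken from the sample rows of this report.
names = ["대원축산", "다모어엠(Damore M)"]


def category_counts(texts):
    """Count Unicode general categories (two-letter codes) over all characters."""
    counts = Counter()
    for text in texts:
        # Normalize to NFC so each Hangul syllable is a single code point.
        for ch in unicodedata.normalize("NFC", text):
            counts[unicodedata.category(ch)] += 1
    return counts


counts = category_counts(names)
for code, n in counts.most_common():
    print(code, n)
```

The two-letter codes correspond to the category names used in this report: `Lo` is Other Letter (Hangul syllables fall here), `Lu` Uppercase Letter, `Ll` Lowercase Letter, `Zs` Space Separator, `Ps` Open Punctuation, and `Pe` Close Punctuation.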

Unique

Unique: 359
Unique (%): 98.9%

Sample

1st row: 삼우인터내셔널
2nd row: 대원축산
3rd row: 주식회사 에스제이우솔(SJ Woosol Co. Ltd.)
4th row: 갓팩토리
5th row: 다모어엠(Damore M)

Value | Count | Frequency (%)
주식회사 | 36 | 7.5%
오산점 | 4 | 0.8%
| 4 | 0.8%
농업회사법인 | 3 | 0.6%
포트오브모카 | 2 | 0.4%
food | 2 | 0.4%
유한회사 | 2 | 0.4%
| 2 | 0.4%
벽돌집 | 2 | 0.4%
system | 2 | 0.4%
Other values (423) | 423 | 87.8%

Most occurring characters

Value | Count | Frequency (%)
(space) | 120 | 4.7%
| 74 | 2.9%
| 65 | 2.6%
( | 61 | 2.4%
) | 61 | 2.4%
| 54 | 2.1%
| 53 | 2.1%
| 50 | 2.0%
| 47 | 1.9%
| 32 | 1.3%
Other values (454) | 1917 | 75.7%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 1983 | 78.3%
Uppercase Letter | 143 | 5.6%
Lowercase Letter | 139 | 5.5%
Space Separator | 120 | 4.7%
Open Punctuation | 61 | 2.4%
Close Punctuation | 61 | 2.4%
Other Punctuation | 12 | 0.5%
Decimal Number | 11 | 0.4%
Dash Punctuation | 2 | 0.1%
Other Symbol | 1 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
| 74 | 3.7%
| 65 | 3.3%
| 54 | 2.7%
| 53 | 2.7%
| 50 | 2.5%
| 47 | 2.4%
| 32 | 1.6%
| 31 | 1.6%
| 30 | 1.5%
| 26 | 1.3%
Other values (394) | 1521 | 76.7%

Uppercase Letter
Value | Count | Frequency (%)
E | 13 | 9.1%
S | 11 | 7.7%
A | 11 | 7.7%
N | 10 | 7.0%
F | 9 | 6.3%
C | 9 | 6.3%
L | 8 | 5.6%
O | 8 | 5.6%
D | 7 | 4.9%
R | 6 | 4.2%
Other values (13) | 51 | 35.7%

Lowercase Letter
Value | Count | Frequency (%)
o | 19 | 13.7%
e | 15 | 10.8%
n | 12 | 8.6%
i | 11 | 7.9%
a | 11 | 7.9%
m | 11 | 7.9%
s | 9 | 6.5%
l | 8 | 5.8%
t | 8 | 5.8%
r | 6 | 4.3%
Other values (11) | 29 | 20.9%

Decimal Number
Value | Count | Frequency (%)
0 | 3 | 27.3%
2 | 3 | 27.3%
9 | 1 | 9.1%
4 | 1 | 9.1%
5 | 1 | 9.1%
8 | 1 | 9.1%
7 | 1 | 9.1%

Other Punctuation
Value | Count | Frequency (%)
. | 9 | 75.0%
& | 2 | 16.7%
' | 1 | 8.3%

Space Separator
Value | Count | Frequency (%)
(space) | 120 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 61 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 61 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 2 | 100.0%

Other Symbol
Value | Count | Frequency (%)
| 1 | 100.0%

Connector Punctuation
Value | Count | Frequency (%)
_ | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 1982 | 78.2%
Latin | 282 | 11.1%
Common | 268 | 10.6%
Han | 2 | 0.1%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
| 74 | 3.7%
| 65 | 3.3%
| 54 | 2.7%
| 53 | 2.7%
| 50 | 2.5%
| 47 | 2.4%
| 32 | 1.6%
| 31 | 1.6%
| 30 | 1.5%
| 26 | 1.3%
Other values (393) | 1520 | 76.7%

Latin
Value | Count | Frequency (%)
o | 19 | 6.7%
e | 15 | 5.3%
E | 13 | 4.6%
n | 12 | 4.3%
i | 11 | 3.9%
S | 11 | 3.9%
A | 11 | 3.9%
a | 11 | 3.9%
m | 11 | 3.9%
N | 10 | 3.5%
Other values (34) | 158 | 56.0%

Common
Value | Count | Frequency (%)
(space) | 120 | 44.8%
( | 61 | 22.8%
) | 61 | 22.8%
. | 9 | 3.4%
0 | 3 | 1.1%
2 | 3 | 1.1%
- | 2 | 0.7%
& | 2 | 0.7%
9 | 1 | 0.4%
' | 1 | 0.4%
Other values (5) | 5 | 1.9%

Han
Value | Count | Frequency (%)
| 1 | 50.0%
| 1 | 50.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 1981 | 78.2%
ASCII | 550 | 21.7%
CJK | 2 | 0.1%
None | 1 | < 0.1%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 120 | 21.8%
( | 61 | 11.1%
) | 61 | 11.1%
o | 19 | 3.5%
e | 15 | 2.7%
E | 13 | 2.4%
n | 12 | 2.2%
i | 11 | 2.0%
S | 11 | 2.0%
A | 11 | 2.0%
Other values (49) | 216 | 39.3%

Hangul
Value | Count | Frequency (%)
| 74 | 3.7%
| 65 | 3.3%
| 54 | 2.7%
| 53 | 2.7%
| 50 | 2.5%
| 47 | 2.4%
| 32 | 1.6%
| 31 | 1.6%
| 30 | 1.5%
| 26 | 1.3%
Other values (392) | 1519 | 76.7%

CJK
Value | Count | Frequency (%)
| 1 | 50.0%
| 1 | 50.0%

None
Value | Count | Frequency (%)
| 1 | 100.0%

연락처 (contact number)
Text

MISSING 

Distinct: 109
Distinct (%): 100.0%
Missing: 254
Missing (%): 70.0%
Memory size: 3.0 KiB

Length

Max length: 13
Median length: 12
Mean length: 12.174312
Min length: 11

Characters and Unicode

Total characters: 1327
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 109
Unique (%): 100.0%

Sample

1st row: 031-374-9213
2nd row: 031-373-4301
3rd row: 031-378-2288
4th row: 031-376-4941
5th row: 031-374-0124

Value | Count | Frequency (%)
031-377-8795 | 1 | 0.9%
031-377-5866 | 1 | 0.9%
070-7620-2339 | 1 | 0.9%
0505-314-1751 | 1 | 0.9%
031-378-8745 | 1 | 0.9%
070-4412-8852 | 1 | 0.9%
031-374-5004 | 1 | 0.9%
031-375-6130 | 1 | 0.9%
031-393-0441 | 1 | 0.9%
070-7526-7796 | 1 | 0.9%
Other values (99) | 99 | 90.8%

Most occurring characters

Value | Count | Frequency (%)
3 | 234 | 17.6%
- | 218 | 16.4%
0 | 179 | 13.5%
1 | 147 | 11.1%
7 | 147 | 11.1%
5 | 77 | 5.8%
6 | 75 | 5.7%
2 | 73 | 5.5%
8 | 71 | 5.4%
4 | 59 | 4.4%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 1109 | 83.6%
Dash Punctuation | 218 | 16.4%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
3 | 234 | 21.1%
0 | 179 | 16.1%
1 | 147 | 13.3%
7 | 147 | 13.3%
5 | 77 | 6.9%
6 | 75 | 6.8%
2 | 73 | 6.6%
8 | 71 | 6.4%
4 | 59 | 5.3%
9 | 47 | 4.2%

Dash Punctuation
Value | Count | Frequency (%)
- | 218 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 1327 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
3 | 234 | 17.6%
- | 218 | 16.4%
0 | 179 | 13.5%
1 | 147 | 11.1%
7 | 147 | 11.1%
5 | 77 | 5.8%
6 | 75 | 5.7%
2 | 73 | 5.5%
8 | 71 | 5.4%
4 | 59 | 4.4%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 1327 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
3 | 234 | 17.6%
- | 218 | 16.4%
0 | 179 | 13.5%
1 | 147 | 11.1%
7 | 147 | 11.1%
5 | 77 | 5.8%
6 | 75 | 5.7%
2 | 73 | 5.5%
8 | 71 | 5.4%
4 | 59 | 4.4%

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
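
To make the idea concrete, the following plain-Python sketch builds a tiny nullity matrix and per-column missing counts from the first five sample rows of this report. It is only an illustration of the concept, not the plotting code behind the figures. `None` stands in for `<NA>`, and the text rendering with `#` and `.` is an assumption made for the example.

```python
# First five rows of the report's sample; None stands in for <NA>.
columns = ("업체명", "연락처")
rows = [
    ("삼우인터내셔널", None),
    ("대원축산", "031-374-9213"),
    ("주식회사 에스제이우솔(SJ Woosol Co. Ltd.)", None),
    ("갓팩토리", None),
    ("다모어엠(Damore M)", None),
]

# Nullity matrix: one row per record, True where the cell holds a value.
matrix = [[value is not None for value in row] for row in rows]

# Per-column missing counts, the quantity a nullity bar chart summarizes.
missing = {
    col: sum(not present[i] for present in matrix)
    for i, col in enumerate(columns)
}

# Text rendering: '#' marks a present cell, '.' a missing one.
for present in matrix:
    print("".join("#" if p else "." for p in present))
print(missing)
```

In this five-row excerpt only the second record has a contact number, so the 연락처 column shows four missing cells and the 업체명 column none.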

Sample

First rows

| 업체명 | 연락처
0 | 삼우인터내셔널 | <NA>
1 | 대원축산 | 031-374-9213
2 | 주식회사 에스제이우솔(SJ Woosol Co. Ltd.) | <NA>
3 | 갓팩토리 | <NA>
4 | 다모어엠(Damore M) | <NA>
5 | 제이더블유(JW)유통 | <NA>
6 | 성초 | <NA>
7 | 거듭나다 | <NA>
8 | 티에스케이인터내셔널 주식회사 | <NA>
9 | 주식회사 텐바이오 | <NA>

Last rows

| 업체명 | 연락처
353 | 인더로우 | <NA>
354 | 웰빙나라 | <NA>
355 | 허브비밀 | <NA>
356 | 해피월드 | 031-378-4371
357 | 그린약국 | 031-378-9054
358 | 행복을짓는남매약국 | 031-378-6858
359 | 청아람 Food System | 031-302-4425
360 | 백세식품 | 031-373-9052
361 | 세건홍삼전문점 | 031-378-3435
362 | 헬스보충제 | 031-377-6180

Duplicate rows

Most frequently occurring

| 업체명 | 연락처 | # duplicates
0 | 벽돌집 | <NA> | 2
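
A duplicate-row table of this kind can be approximated by counting identical records. The sketch below uses `collections.Counter` on a small hand-made list. The row order and the choice of non-duplicate rows are illustrative and not taken from the dataset. Treating two missing contact values (`None` for `<NA>`) as equal is an assumption of this example.

```python
from collections import Counter

# Illustrative records as (업체명, 연락처) tuples; None stands in for <NA>.
rows = [
    ("벽돌집", None),
    ("대원축산", "031-374-9213"),
    ("벽돌집", None),
    ("해피월드", "031-378-4371"),
]

# Count how many times each full record occurs.
counts = Counter(rows)

# Keep only records that occur more than once.
duplicates = {row: n for row, n in counts.items() if n > 1}

for (name, contact), n in duplicates.items():
    print(name, contact, n)
```

Only the 벽돌집 record appears twice in this list, matching the single entry in the table above.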