Overview

Dataset statistics

Number of variables: 5
Number of observations: 990
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 38.8 KiB
Average record size in memory: 40.1 B
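As a quick arithmetic check, the average record size is consistent with the total memory size divided by the number of observations. The sketch below assumes 1 KiB = 1024 bytes and uses the rounded figures reported above, so the result is only approximate.

```python
# Rounded figures taken from the dataset statistics of this report.
total_size_kib = 38.8
n_observations = 990

# Assumption: 1 KiB = 1024 bytes.
total_size_bytes = total_size_kib * 1024
avg_record_bytes = total_size_bytes / n_observations

print(f"{avg_record_bytes:.1f} B")
```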

Variable types

Text: 3
DateTime: 1
Categorical: 1

Dataset

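Description: Data on the designation status of tobacco retailers located in Gunsan-si, Jeollabuk-do. It provides the fields business name (업소명), lot-number address (업소지번주소), road-name address (업소도로명주소), designation date (지정일자), and corporation type (법인구분).
URL: https://www.data.go.kr/data/15117014/fileData.do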

Alerts

법인구분 (corporation type) is highly imbalanced (59.9%)
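The 59.9% figure is consistent with a score defined as one minus the normalised Shannon entropy of the class counts. That definition is an assumption made here for illustration; the report itself does not state the formula. The counts 911 (개인) and 79 (법인) are the ones listed for 법인구분 in this report, and `imbalance_score` is a hypothetical helper name.

```python
import math


def imbalance_score(counts):
    """Return 1 - (Shannon entropy / maximum possible entropy) for class counts.

    0.0 means perfectly balanced classes; values near 1.0 mean one class dominates.
    """
    total = sum(counts)
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0
    probs = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in probs)
    return 1.0 - entropy / math.log2(n_classes)


# Counts for 개인 and 법인 as listed in this report.
score = imbalance_score([911, 79])
print(f"{score:.1%}")
```

A perfectly balanced column such as `[50, 50]` scores 0.0 under this definition.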

Reproduction

Analysis started: 2023-12-12 15:54:34.783991
Analysis finished: 2023-12-12 15:54:35.596402
Duration: 0.81 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

업소명
Text

Distinct: 961
Distinct (%): 97.1%
Missing: 0
Missing (%): 0.0%
Memory size: 7.9 KiB

Length

Max length: 24
Median length: 19
Mean length: 7.6606061
Min length: 1
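Length statistics of this kind can be sketched by measuring each value in characters and summarising the resulting list. The example below uses the five sample rows of this variable, not the full column, so its numbers are not expected to match the figures above; normalising to NFC first is an assumption made so that each Hangul syllable counts as one character.

```python
import unicodedata
from statistics import mean, median

# The five sample rows shown for this variable in the report.
names = [
    "씨유 군산엠플레이스점",
    "지에스25 미장주공점",
    "서해그린슈퍼",
    "세븐일레븐 군산영화점",
    "씨유 군산장미점",
]

# Assumption: lengths are counted in code points after NFC normalisation.
lengths = [len(unicodedata.normalize("NFC", name)) for name in names]

summary = {
    "max": max(lengths),
    "median": median(lengths),
    "mean": mean(lengths),
    "min": min(lengths),
}
print(summary)
```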

Characters and Unicode

Total characters: 7584
Distinct characters: 475
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
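As a minimal sketch of how such character properties can be read, Python's standard `unicodedata` module exposes the general category of each code point. The two-letter codes map onto the category names used in this report (`Lo` = Other Letter, `Nd` = Decimal Number, `Zs` = Space Separator). The helper name `category_counts` is hypothetical, and the input is one of the sample rows of this variable.

```python
import unicodedata
from collections import Counter


def category_counts(text):
    """Count Unicode general categories of the characters in text."""
    # Assumption: normalise to NFC so each Hangul syllable is one code point.
    text = unicodedata.normalize("NFC", text)
    return Counter(unicodedata.category(ch) for ch in text)


counts = category_counts("지에스25 미장주공점")
print(counts)
```

Scripts and blocks are not exposed by `unicodedata`, so they would need a separate data source.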

Unique

Unique: 936
Unique (%): 94.5%
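The difference between the Distinct and Unique figures can be illustrated with a short sketch. It assumes that "distinct" counts the different values present and "unique" counts the values that occur exactly once; the report does not spell these definitions out, and the input list is made up for illustration.

```python
from collections import Counter


def distinct_and_unique(values):
    """Return (number of different values, number of values occurring exactly once)."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for c in counts.values() if c == 1)
    return distinct, unique


# Hypothetical input: one repeated value and two values that occur once.
values = ["씨유", "씨유", "세븐일레븐", "이마트24"]
distinct, unique = distinct_and_unique(values)
print(distinct, unique)
```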

Sample

1st row: 씨유 군산엠플레이스점
2nd row: 지에스25 미장주공점
3rd row: 서해그린슈퍼
4th row: 세븐일레븐 군산영화점
5th row: 씨유 군산장미점
Value | Count | Frequency (%)
씨유 | 58 | 4.4%
세븐일레븐 | 55 | 4.1%
지에스25 | 29 | 2.2%
이마트24 | 23 | 1.7%
gs25 | 19 | 1.4%
지에스(gs)25 | 18 | 1.4%
유한회사 | 13 | 1.0%
주)코리아세븐 | 9 | 0.7%
편의점 | 6 | 0.5%
주식회사 | 6 | 0.5%
Other values (1006) | 1090 | 82.2%
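A word-frequency table like the one for this variable can be approximated by splitting each value on whitespace and counting the tokens. The exact tokenisation used by the report is not stated, so lower-casing and whitespace splitting are assumptions here, and only the five sample rows are used as input.

```python
from collections import Counter


def word_counts(rows):
    """Count whitespace-separated, lower-cased tokens across all rows."""
    counter = Counter()
    for row in rows:
        counter.update(row.lower().split())
    return counter


# The five sample rows shown for this variable in the report.
rows = [
    "씨유 군산엠플레이스점",
    "지에스25 미장주공점",
    "서해그린슈퍼",
    "세븐일레븐 군산영화점",
    "씨유 군산장미점",
]
counts = word_counts(rows)
total = sum(counts.values())
for word, n in counts.most_common(3):
    print(f"{word} | {n} | {n / total:.1%}")
```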

Most occurring characters

ValueCountFrequency (%)
387
 
5.1%
336
 
4.4%
324
 
4.3%
291
 
3.8%
237
 
3.1%
234
 
3.1%
2 160
 
2.1%
143
 
1.9%
141
 
1.9%
138
 
1.8%
Other values (465) 5193
68.5%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 6439 | 84.9%
Decimal Number | 338 | 4.5%
Space Separator | 336 | 4.4%
Uppercase Letter | 208 | 2.7%
Close Punctuation | 119 | 1.6%
Open Punctuation | 115 | 1.5%
Lowercase Letter | 24 | 0.3%
Other Punctuation | 3 | < 0.1%
Dash Punctuation | 2 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
387
 
6.0%
324
 
5.0%
291
 
4.5%
237
 
3.7%
234
 
3.6%
143
 
2.2%
141
 
2.2%
138
 
2.1%
135
 
2.1%
131
 
2.0%
Other values (413) 4278
66.4%
Uppercase Letter
ValueCountFrequency (%)
S 76
36.5%
G 74
35.6%
K 9
 
4.3%
C 7
 
3.4%
U 6
 
2.9%
E 6
 
2.9%
L 4
 
1.9%
R 3
 
1.4%
N 3
 
1.4%
A 3
 
1.4%
Other values (13) 17
 
8.2%
Lowercase Letter
ValueCountFrequency (%)
l 5
20.8%
a 4
16.7%
o 3
12.5%
d 2
 
8.3%
e 2
 
8.3%
y 1
 
4.2%
x 1
 
4.2%
s 1
 
4.2%
r 1
 
4.2%
k 1
 
4.2%
Other values (3) 3
12.5%
Decimal Number
ValueCountFrequency (%)
2 160
47.3%
5 119
35.2%
4 38
 
11.2%
1 7
 
2.1%
8 4
 
1.2%
6 3
 
0.9%
3 2
 
0.6%
9 2
 
0.6%
7 2
 
0.6%
0 1
 
0.3%
Other Punctuation
ValueCountFrequency (%)
& 2
66.7%
' 1
33.3%
Space Separator
ValueCountFrequency (%)
336
100.0%
Close Punctuation
ValueCountFrequency (%)
) 119
100.0%
Open Punctuation
ValueCountFrequency (%)
( 115
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 2
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 6439 | 84.9%
Common | 913 | 12.0%
Latin | 232 | 3.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
387
 
6.0%
324
 
5.0%
291
 
4.5%
237
 
3.7%
234
 
3.6%
143
 
2.2%
141
 
2.2%
138
 
2.1%
135
 
2.1%
131
 
2.0%
Other values (413) 4278
66.4%
Latin
ValueCountFrequency (%)
S 76
32.8%
G 74
31.9%
K 9
 
3.9%
C 7
 
3.0%
U 6
 
2.6%
E 6
 
2.6%
l 5
 
2.2%
L 4
 
1.7%
a 4
 
1.7%
R 3
 
1.3%
Other values (26) 38
16.4%
Common
ValueCountFrequency (%)
336
36.8%
2 160
17.5%
5 119
 
13.0%
) 119
 
13.0%
( 115
 
12.6%
4 38
 
4.2%
1 7
 
0.8%
8 4
 
0.4%
6 3
 
0.3%
3 2
 
0.2%
Other values (6) 10
 
1.1%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 6439 | 84.9%
ASCII | 1145 | 15.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
387
 
6.0%
324
 
5.0%
291
 
4.5%
237
 
3.7%
234
 
3.6%
143
 
2.2%
141
 
2.2%
138
 
2.1%
135
 
2.1%
131
 
2.0%
Other values (413) 4278
66.4%
ASCII
ValueCountFrequency (%)
336
29.3%
2 160
14.0%
5 119
 
10.4%
) 119
 
10.4%
( 115
 
10.0%
S 76
 
6.6%
G 74
 
6.5%
4 38
 
3.3%
K 9
 
0.8%
C 7
 
0.6%
Other values (42) 92
 
8.0%
업소지번주소
Text

Distinct: 978
Distinct (%): 98.8%
Missing: 0
Missing (%): 0.0%
Memory size: 7.9 KiB

Length

Max length: 51
Median length: 42
Mean length: 22.749495
Min length: 15

Characters and Unicode

Total characters: 22522
Distinct characters: 324
Distinct categories: 10
Distinct scripts: 4
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 967
Unique (%): 97.7%

Sample

1st row: 전라북도 군산시 지곡동 136-1
2nd row: 전라북도 군산시 미장동 476 군산미장휴먼시아
3rd row: 전라북도 군산시 소룡동 857 그린서해맨션아파트
4th row: 전라북도 군산시 영화동 12-6 FamillyMart
5th row: 전라북도 군산시 장미동 5-4 오성슈퍼
Value | Count | Frequency (%)
군산시 | 991 | 20.5%
전라북도 | 990 | 20.5%
나운동 | 121 | 2.5%
소룡동 | 71 | 1.5%
조촌동 | 64 | 1.3%
수송동 | 64 | 1.3%
1호 | 53 | 1.1%
오식도동 | 53 | 1.1%
미룡동 | 43 | 0.9%
경암동 | 42 | 0.9%
Other values (1286) | 2338 | 48.4%

Most occurring characters

ValueCountFrequency (%)
4492
19.9%
1144
 
5.1%
1119
 
5.0%
1037
 
4.6%
1015
 
4.5%
1015
 
4.5%
995
 
4.4%
993
 
4.4%
1 903
 
4.0%
871
 
3.9%
Other values (314) 8938
39.7%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 13175 | 58.5%
Space Separator | 4492 | 19.9%
Decimal Number | 4297 | 19.1%
Dash Punctuation | 465 | 2.1%
Uppercase Letter | 41 | 0.2%
Lowercase Letter | 17 | 0.1%
Close Punctuation | 13 | 0.1%
Open Punctuation | 13 | 0.1%
Other Punctuation | 8 | < 0.1%
Math Symbol | 1 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
1144
 
8.7%
1119
 
8.5%
1037
 
7.9%
1015
 
7.7%
1015
 
7.7%
995
 
7.6%
993
 
7.5%
871
 
6.6%
624
 
4.7%
482
 
3.7%
Other values (266) 3880
29.4%
Uppercase Letter
ValueCountFrequency (%)
B 6
14.6%
S 5
12.2%
C 3
 
7.3%
A 3
 
7.3%
G 3
 
7.3%
K 2
 
4.9%
U 2
 
4.9%
O 2
 
4.9%
W 2
 
4.9%
F 2
 
4.9%
Other values (9) 11
26.8%
Lowercase Letter
ValueCountFrequency (%)
a 3
17.6%
y 3
17.6%
e 2
11.8%
l 2
11.8%
h 1
 
5.9%
u 1
 
5.9%
c 1
 
5.9%
m 1
 
5.9%
i 1
 
5.9%
r 1
 
5.9%
Decimal Number
ValueCountFrequency (%)
1 903
21.0%
3 465
10.8%
5 443
10.3%
2 423
9.8%
8 392
9.1%
4 367
8.5%
6 367
8.5%
0 324
 
7.5%
7 311
 
7.2%
9 302
 
7.0%
Other Punctuation
ValueCountFrequency (%)
@ 5
62.5%
/ 2
 
25.0%
: 1
 
12.5%
Space Separator
ValueCountFrequency (%)
4492
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 465
100.0%
Close Punctuation
ValueCountFrequency (%)
) 13
100.0%
Open Punctuation
ValueCountFrequency (%)
( 13
100.0%
Math Symbol
ValueCountFrequency (%)
~ 1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 13173 | 58.5%
Common | 9289 | 41.2%
Latin | 58 | 0.3%
Han | 2 | < 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
1144
 
8.7%
1119
 
8.5%
1037
 
7.9%
1015
 
7.7%
1015
 
7.7%
995
 
7.6%
993
 
7.5%
871
 
6.6%
624
 
4.7%
482
 
3.7%
Other values (264) 3878
29.4%
Latin
ValueCountFrequency (%)
B 6
 
10.3%
S 5
 
8.6%
C 3
 
5.2%
A 3
 
5.2%
G 3
 
5.2%
a 3
 
5.2%
y 3
 
5.2%
e 2
 
3.4%
K 2
 
3.4%
U 2
 
3.4%
Other values (20) 26
44.8%
Common
ValueCountFrequency (%)
4492
48.4%
1 903
 
9.7%
3 465
 
5.0%
- 465
 
5.0%
5 443
 
4.8%
2 423
 
4.6%
8 392
 
4.2%
4 367
 
4.0%
6 367
 
4.0%
0 324
 
3.5%
Other values (8) 648
 
7.0%
Han
ValueCountFrequency (%)
1
50.0%
1
50.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 13173 | 58.5%
ASCII | 9347 | 41.5%
CJK | 2 | < 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
4492
48.1%
1 903
 
9.7%
3 465
 
5.0%
- 465
 
5.0%
5 443
 
4.7%
2 423
 
4.5%
8 392
 
4.2%
4 367
 
3.9%
6 367
 
3.9%
0 324
 
3.5%
Other values (38) 706
 
7.6%
Hangul
ValueCountFrequency (%)
1144
 
8.7%
1119
 
8.5%
1037
 
7.9%
1015
 
7.7%
1015
 
7.7%
995
 
7.6%
993
 
7.5%
871
 
6.6%
624
 
4.7%
482
 
3.7%
Other values (264) 3878
29.4%
CJK
ValueCountFrequency (%)
1
50.0%
1
50.0%
업소도로명주소
Text

Distinct: 973
Distinct (%): 98.3%
Missing: 0
Missing (%): 0.0%
Memory size: 7.9 KiB

Length

Max length: 61
Median length: 53
Mean length: 25.577778
Min length: 14

Characters and Unicode

Total characters: 25322
Distinct characters: 293
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 957
Unique (%): 96.7%

Sample

1st row: 전라북도 군산시 계산로 51 1층 106호 136-7 (지곡동)
2nd row: 전라북도 군산시 경포천로 34 분산상가동 102 103호 (미장동 군산미장휴먼시아)
3rd row: 전라북도 군산시 풍전3길 26 1층 101호 (소룡동)
4th row: 전라북도 군산시 구영5길 129 1층 (영화동)
5th row: 전라북도 군산시 장미2길 13 1층 (장미동)
Value | Count | Frequency (%)
군산시 | 991 | 17.7%
전라북도 | 990 | 17.7%
1층 | 141 | 2.5%
나운동 | 133 | 2.4%
소룡동 | 65 | 1.2%
조촌동 | 64 | 1.1%
수송동 | 61 | 1.1%
101호 | 54 | 1.0%
오식도동 | 52 | 0.9%
미룡동 | 43 | 0.8%
Other values (1067) | 3012 | 53.7%

Most occurring characters

ValueCountFrequency (%)
5078
20.1%
1134
 
4.5%
1 1129
 
4.5%
1120
 
4.4%
1060
 
4.2%
1023
 
4.0%
1020
 
4.0%
1004
 
4.0%
995
 
3.9%
970
 
3.8%
Other values (283) 10789
42.6%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 14666 | 57.9%
Space Separator | 5078 | 20.1%
Decimal Number | 3786 | 15.0%
Close Punctuation | 812 | 3.2%
Open Punctuation | 812 | 3.2%
Dash Punctuation | 140 | 0.6%
Uppercase Letter | 12 | < 0.1%
Other Punctuation | 11 | < 0.1%
Math Symbol | 3 | < 0.1%
Lowercase Letter | 2 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
1134
 
7.7%
1120
 
7.6%
1060
 
7.2%
1023
 
7.0%
1020
 
7.0%
1004
 
6.8%
995
 
6.8%
970
 
6.6%
584
 
4.0%
426
 
2.9%
Other values (256) 5330
36.3%
Decimal Number
ValueCountFrequency (%)
1 1129
29.8%
2 552
14.6%
3 414
 
10.9%
0 389
 
10.3%
4 302
 
8.0%
5 268
 
7.1%
6 203
 
5.4%
7 195
 
5.2%
9 176
 
4.6%
8 158
 
4.2%
Uppercase Letter
ValueCountFrequency (%)
A 3
25.0%
D 2
16.7%
B 2
16.7%
S 2
16.7%
G 2
16.7%
E 1
 
8.3%
Other Punctuation
ValueCountFrequency (%)
, 5
45.5%
. 4
36.4%
@ 1
 
9.1%
/ 1
 
9.1%
Lowercase Letter
ValueCountFrequency (%)
e 1
50.0%
p 1
50.0%
Space Separator
ValueCountFrequency (%)
5078
100.0%
Close Punctuation
ValueCountFrequency (%)
) 812
100.0%
Open Punctuation
ValueCountFrequency (%)
( 812
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 140
100.0%
Math Symbol
ValueCountFrequency (%)
~ 3
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 14666 | 57.9%
Common | 10642 | 42.0%
Latin | 14 | 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
1134
 
7.7%
1120
 
7.6%
1060
 
7.2%
1023
 
7.0%
1020
 
7.0%
1004
 
6.8%
995
 
6.8%
970
 
6.6%
584
 
4.0%
426
 
2.9%
Other values (256) 5330
36.3%
Common
ValueCountFrequency (%)
5078
47.7%
1 1129
 
10.6%
) 812
 
7.6%
( 812
 
7.6%
2 552
 
5.2%
3 414
 
3.9%
0 389
 
3.7%
4 302
 
2.8%
5 268
 
2.5%
6 203
 
1.9%
Other values (9) 683
 
6.4%
Latin
ValueCountFrequency (%)
A 3
21.4%
D 2
14.3%
B 2
14.3%
S 2
14.3%
G 2
14.3%
e 1
 
7.1%
p 1
 
7.1%
E 1
 
7.1%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 14666 | 57.9%
ASCII | 10656 | 42.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
5078
47.7%
1 1129
 
10.6%
) 812
 
7.6%
( 812
 
7.6%
2 552
 
5.2%
3 414
 
3.9%
0 389
 
3.7%
4 302
 
2.8%
5 268
 
2.5%
6 203
 
1.9%
Other values (17) 697
 
6.5%
Hangul
ValueCountFrequency (%)
1134
 
7.7%
1120
 
7.6%
1060
 
7.2%
1023
 
7.0%
1020
 
7.0%
1004
 
6.8%
995
 
6.8%
970
 
6.6%
584
 
4.0%
426
 
2.9%
Other values (256) 5330
36.3%
지정일자
DateTime

Distinct: 842
Distinct (%): 85.1%
Missing: 0
Missing (%): 0.0%
Memory size: 7.9 KiB
Minimum: 1979-06-30 00:00:00
Maximum: 2023-07-19 00:00:00
Histogram with fixed size bins (bins=50)
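Fixed-size binning of dates can be sketched as follows. This is a simplified illustration that bins by whole days between the reported minimum and maximum; how the report computes its bin edges is not stated, so the day granularity and the hypothetical `bin_index` helper are assumptions.

```python
from datetime import date


def bin_index(d, start, end, bins=50):
    """Return the 0-based index of the equal-width bin that contains date d."""
    span_days = (end - start).days
    if span_days == 0:
        return 0
    idx = (d - start).days * bins // span_days
    # The maximum date falls on the right edge, so clamp it into the last bin.
    return min(idx, bins - 1)


# Minimum and maximum designation dates reported for this variable.
start = date(1979, 6, 30)
end = date(2023, 7, 19)

first_bin = bin_index(start, start, end)
last_bin = bin_index(end, start, end)
mid_bin = bin_index(date(2000, 1, 1), start, end)
print(first_bin, mid_bin, last_bin)
```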

법인구분
Categorical

IMBALANCE

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 7.9 KiB
개인: 911
법인: 79

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 개인
2nd row: 개인
3rd row: 개인
4th row: 개인
5th row: 개인

Common Values

Value | Count | Frequency (%)
개인 | 911 | 92.0%
법인 | 79 | 8.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
개인 | 911 | 92.0%
법인 | 79 | 8.0%

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
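The per-column nullity counts behind such plots can be sketched with a few lines of plain Python. Treating `None` and the empty string as missing is an assumption, `missing_per_column` is a hypothetical helper, and the second row below is made up to show a non-zero count (the dataset itself reports no missing cells).

```python
def missing_per_column(rows, columns):
    """Count, for each column, the rows whose value is None or an empty string."""
    return {
        col: sum(1 for row in rows if row.get(col) in (None, ""))
        for col in columns
    }


rows = [
    # First sample row of the dataset (two of its five columns).
    {"업소명": "씨유 군산엠플레이스점", "지정일자": "2023-07-19"},
    # Hypothetical row with a missing designation date, for illustration only.
    {"업소명": "서해그린슈퍼", "지정일자": None},
]
missing = missing_per_column(rows, ["업소명", "지정일자"])
print(missing)
```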

Sample

First rows

Index | 업소명 | 업소지번주소 | 업소도로명주소 | 지정일자 | 법인구분
0 | 씨유 군산엠플레이스점 | 전라북도 군산시 지곡동 136-1 | 전라북도 군산시 계산로 51 1층 106호 136-7 (지곡동) | 2023-07-19 | 개인
1 | 지에스25 미장주공점 | 전라북도 군산시 미장동 476 군산미장휴먼시아 | 전라북도 군산시 경포천로 34 분산상가동 102 103호 (미장동 군산미장휴먼시아) | 2023-07-14 | 개인
2 | 서해그린슈퍼 | 전라북도 군산시 소룡동 857 그린서해맨션아파트 | 전라북도 군산시 풍전3길 26 1층 101호 (소룡동) | 2023-07-14 | 개인
3 | 세븐일레븐 군산영화점 | 전라북도 군산시 영화동 12-6 FamillyMart | 전라북도 군산시 구영5길 129 1층 (영화동) | 2023-07-06 | 개인
4 | 씨유 군산장미점 | 전라북도 군산시 장미동 5-4 오성슈퍼 | 전라북도 군산시 장미2길 13 1층 (장미동) | 2023-07-04 | 개인
5 | 쉽고빠른마켓 | 전라북도 군산시 미룡동 59-10 | 전라북도 군산시 미룡로 73 1층 2호 (미룡동) | 2023-07-03 | 개인
6 | 지에스25 군산더샵점 | 전라북도 군산시 조촌동 3976 더샵디오션시티 | 전라북도 군산시 궁포1로 79 상가동 431동 101호 102호 (조촌동 더샵디오션시티) | 2023-06-27 | 개인
7 | 수 식자재마트 | 전라북도 군산시 내흥동 1002-4 | 전라북도 군산시 선사2길 7 101~112호 (내흥동) | 2023-06-21 | 개인
8 | 씨유 은파호수공원점 | 전라북도 군산시 지곡동 379-1 | 전라북도 군산시 계산2길 93 (지곡동) | 2023-06-19 | 개인
9 | 지에스25 군산클래스점 | 전라북도 군산시 내흥동 925 | 전라북도 군산시 정자로 13 상가2동 1층 101 102호 (내흥동) | 2023-06-13 | 개인

Last rows

Index | 업소명 | 업소지번주소 | 업소도로명주소 | 지정일자 | 법인구분
980 | 풍성상회 | 전라북도 군산시 회현면 학당리 567-3호 | 전라북도 군산시 회현면 회미로 540 | 1979-06-30 | 개인
981 | 회현농업협동조합 | 전라북도 군산시 회현면 대정리 65-1호 | 전라북도 군산시 회현면 광지산길 6 | 1998-07-15 | 개인
982 | 옥구농업협동조합 | 전라북도 군산시 옥구읍 선제리 509-7호 | 전라북도 군산시 옥구읍 옥구로 37 | 1998-10-29 | 법인
983 | 옥일상회 | 전라북도 군산시 옥구읍 선제리 513-2호 | 전라북도 군산시 옥구읍 예기길 4 | 2000-08-14 | 개인
984 | 창원슈퍼 | 전라북도 군산시 송풍동 956-4 창원슈퍼 | 전라북도 군산시 청소년회관로 71 창원슈퍼 (송풍동) | 1998-10-29 | 개인
985 | 대형상회 | 전라북도 군산시 대야면 산월리 294호 | 전라북도 군산시 석화들길 75 | 1998-01-01 | 개인
986 | 군산시청청우회 | 전라북도 군산시 조촌동 888호 | 전라북도 군산시 시청로 17 (조촌동) | 1996-12-30 | 개인
987 | 만자슈퍼 | 전라북도 군산시 대야면 지경리 850호 | 전라북도 군산시 대야면 만자1길 19 | 1988-02-15 | 개인
988 | 가축병원 | 전라북도 군산시 대야면 지경리 731-2호 | 전라북도 군산시 대야면 우덕2길 5 | 1980-12-24 | 개인
989 | 구내매점 | 전라북도 군산시 경암동 614-2번지 터미널내(전주 익산방면) | 전라북도 군산시 해망로 18 | 2000-09-21 | 개인