Overview

Dataset statistics

Number of variables: 15
Number of observations: 1957
Missing cells: 4279
Missing cells (%): 14.6%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 231.4 KiB
Average record size in memory: 121.1 B
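
The derived figures in this block follow from the raw counts. The sketch below recomputes them in Python; the variable names are illustrative and the inputs are copied from the statistics above, so it is a consistency check rather than the profiler's own code.

```python
# Figures copied from the "Dataset statistics" block of this report.
n_variables = 15
n_observations = 1957
missing_cells = 4279
total_size_kib = 231.4

# A "cell" is one value of one variable in one observation.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Average record size: total memory (KiB -> bytes) divided by row count.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"average record size: {avg_record_bytes:.1f} B")
```

Both results round to the reported 14.6% and 121.1 B.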

Variable types

Numeric: 1
Categorical: 2
Text: 12

Dataset

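Description: 고유번호 (unique ID), 언어 (language), 상호명 (business name), 콘텐츠URL (content URL), 주소 (address), 신주소 (new address), 전화번호 (phone number), 팩스번호 (fax number), 웹사이트 (website), 운영시간 (opening hours), 운영요일 (operating days), 휴무일 (closed days), 교통정보 (transport information), 태그 (tags), 장애인편의시설 (accessibility facilities)
Author: 서울관광재단 (Seoul Tourism Organization)
URL: https://data.seoul.go.kr/dataList/OA-21050/S/1/datasetView.do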

Alerts

고유번호 is highly overall correlated with 팩스번호 (High correlation)
팩스번호 is highly overall correlated with 고유번호 (High correlation)
팩스번호 is highly imbalanced (90.4%) (Imbalance)
전화번호 has 113 (5.8%) missing values (Missing)
웹사이트 has 614 (31.4%) missing values (Missing)
운영시간 has 259 (13.2%) missing values (Missing)
운영요일 has 919 (47.0%) missing values (Missing)
휴무일 has 555 (28.4%) missing values (Missing)
교통정보 has 82 (4.2%) missing values (Missing)
장애인편의시설 has 1737 (88.8%) missing values (Missing)
고유번호 has unique values (Unique)
콘텐츠URL has unique values (Unique)
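
The missing-value alerts can be cross-checked against the overview. A minimal sketch, assuming the counts listed in the alerts above (the dictionary name is illustrative):

```python
# Missing counts per column, copied from the "Missing" alerts above.
n_observations = 1957
missing_by_column = {
    "전화번호": 113,
    "웹사이트": 614,
    "운영시간": 259,
    "운영요일": 919,
    "휴무일": 555,
    "교통정보": 82,
    "장애인편의시설": 1737,
}

# Share of rows with a missing value, rounded as in the report.
missing_pct = {
    column: round(100 * count / n_observations, 1)
    for column, count in missing_by_column.items()
}
total_missing = sum(missing_by_column.values())

for column, pct in missing_pct.items():
    print(f"{column}: {pct}%")
print(f"total missing cells: {total_missing}")
```

The seven counts sum to 4279, the same number the overview reports as missing cells, so the alerted columns account for every missing cell in the dataset.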

Reproduction

Analysis started: 2024-05-18 04:37:12.316645
Analysis finished: 2024-05-18 04:37:26.687967
Duration: 14.37 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
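
The reported duration is the difference between the two timestamps. A small sketch using only the standard library, with the timestamps copied from above:

```python
from datetime import datetime

# Timestamps copied from the "Reproduction" block.
started = datetime.fromisoformat("2024-05-18 04:37:12.316645")
finished = datetime.fromisoformat("2024-05-18 04:37:26.687967")

duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```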

Variables

고유번호
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 1957
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 23132.124
Minimum: 36
Maximum: 45594
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 17.3 KiB

Quantile statistics

Minimum: 36
5-th percentile: 1307.6
Q1: 6254
median: 24735
Q3: 34919
95-th percentile: 44071.4
Maximum: 45594
Range: 45558
Interquartile range (IQR): 28665

Descriptive statistics

Standard deviation: 14936.926
Coefficient of variation (CV): 0.6457222
Kurtosis: -1.2344201
Mean: 23132.124
Median Absolute Deviation (MAD): 13425
Skewness: -0.11786277
Sum: 45269567
Variance: 2.2311176 × 10^8
Monotonicity: Not monotonic
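
Several of these statistics are simple functions of the others. The sketch below recomputes the mean, range, IQR, coefficient of variation, and variance from the reported sum, extremes, quartiles, and standard deviation; the inputs are the rounded values printed above, so small rounding differences are expected.

```python
# Inputs copied from the statistics for 고유번호.
n = 1957
total = 45269567
minimum, maximum = 36, 45594
q1, q3 = 6254, 34919
std = 14936.926  # already rounded in the report

mean = total / n                # arithmetic mean
value_range = maximum - minimum
iqr = q3 - q1                   # interquartile range
cv = std / mean                 # coefficient of variation
variance = std ** 2

print(f"mean={mean:.3f} range={value_range} iqr={iqr}")
print(f"cv={cv:.7f} variance={variance:.7e}")
```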
Histogram with fixed size bins (bins=50)
Common values

Value                Count  Frequency (%)
45520                1      0.1%
2731                 1      0.1%
2248                 1      0.1%
22070                1      0.1%
27131                1      0.1%
6995                 1      0.1%
24806                1      0.1%
28356                1      0.1%
4104                 1      0.1%
4105                 1      0.1%
Other values (1947)  1947   99.5%

Minimum 10 values

Value  Count  Frequency (%)
36     1      0.1%
37     1      0.1%
72     1      0.1%
73     1      0.1%
74     1      0.1%
75     1      0.1%
76     1      0.1%
77     1      0.1%
78     1      0.1%
79     1      0.1%

Maximum 10 values

Value  Count  Frequency (%)
45594  1      0.1%
45576  1      0.1%
45575  1      0.1%
45574  1      0.1%
45573  1      0.1%
45568  1      0.1%
45567  1      0.1%
45564  1      0.1%
45563  1      0.1%
45562  1      0.1%

언어
Categorical

Distinct: 5
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 15.4 KiB
zh-TW: 399
en: 396
ja: 395
zh-CN: 386
ko: 381

Length

Max length: 5
Median length: 2
Mean length: 3.2033725
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: en
2nd row: en
3rd row: en
4th row: en
5th row: en

Common Values

Value  Count  Frequency (%)
zh-TW  399    20.4%
en     396    20.2%
ja     395    20.2%
zh-CN  386    19.7%
ko     381    19.5%
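
The frequency column and the mean length reported for 언어 can both be reproduced from the five counts. A minimal sketch with the counts copied from the table above:

```python
# Language codes and their row counts, copied from "Common Values".
counts = {"zh-TW": 399, "en": 396, "ja": 395, "zh-CN": 386, "ko": 381}

total = sum(counts.values())

# Percentage share of each language code, rounded as in the report.
shares = {code: round(100 * n / total, 1) for code, n in counts.items()}

# Mean string length, weighting each code's length by its count.
mean_length = sum(len(code) * n for code, n in counts.items()) / total

print(total, shares)
print(f"mean length: {mean_length:.7f}")
```

The counts sum to the 1957 observations, and the weighted mean matches the reported mean length because only the two Chinese codes have five characters.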

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
zh-tw 399
20.4%
en 396
20.2%
ja 395
20.2%
zh-cn 386
19.7%
ko 381
19.5%

상호명
Text

Distinct: 1871
Distinct (%): 95.6%
Missing: 0
Missing (%): 0.0%
Memory size: 15.4 KiB

Length

Max length: 74
Median length: 58
Mean length: 11.170669
Min length: 2

Characters and Unicode

Total characters: 21861
Distinct characters: 1234
Distinct categories: 12
Distinct scripts: 6
Distinct blocks: 9
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
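
As an illustration of how such character properties can be tallied, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. It is not the profiler's implementation; the `CATEGORY_NAMES` mapping and the function name are assumptions made for this example, and the mapping covers only a few categories.

```python
import unicodedata
from collections import Counter

# Long names for a few general-category codes (illustrative, not exhaustive).
CATEGORY_NAMES = {
    "Lu": "Uppercase Letter",
    "Ll": "Lowercase Letter",
    "Lo": "Other Letter",
    "Zs": "Space Separator",
    "Po": "Other Punctuation",
    "Pd": "Dash Punctuation",
    "Nd": "Decimal Number",
}


def category_counts(values):
    """Count characters per Unicode general category across all strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            code = unicodedata.category(ch)
            # Fall back to the two-letter code for unmapped categories.
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts


print(category_counts(["Baekyang Laundry"]))
print(category_counts(["서울 1-1"]))
```

Hangul syllables and Han ideographs both fall under "Other Letter", which is why that category dominates multilingual columns like this one.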

Unique

Unique: 1795
Unique (%): 91.7%

Sample

1st row: Baekyang Laundry
2nd row: Changsin-Sungin Quarry Observatory
3rd row: Changsin-dong's Cliff Village
4th row: Choong Ang High School
5th row: Choong Ang Store

Value                Count  Frequency (%)
museum               80     2.6%
seoul                39     1.3%
of                   38     1.2%
center               36     1.2%
art                  30     1.0%
gallery              28     0.9%
information          20     0.7%
tourist              18     0.6%
national             18     0.6%
korea                13     0.4%
Other values (2192)  2739   89.5%

Most occurring characters

ValueCountFrequency (%)
1151
 
5.3%
? 1138
 
5.2%
e 826
 
3.8%
o 750
 
3.4%
a 744
 
3.4%
n 741
 
3.4%
r 493
 
2.3%
u 485
 
2.2%
i 461
 
2.1%
419
 
1.9%
Other values (1224) 14653
67.0%

Most occurring categories

ValueCountFrequency (%)
Other Letter 9544
43.7%
Lowercase Letter 7345
33.6%
Uppercase Letter 1707
 
7.8%
Other Punctuation 1227
 
5.6%
Space Separator 1152
 
5.3%
Close Punctuation 352
 
1.6%
Open Punctuation 352
 
1.6%
Decimal Number 142
 
0.6%
Dash Punctuation 29
 
0.1%
Control 6
 
< 0.1%
Other values (2) 5
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
419
 
4.4%
239
 
2.5%
162
 
1.7%
161
 
1.7%
136
 
1.4%
96
 
1.0%
93
 
1.0%
89
 
0.9%
84
 
0.9%
84
 
0.9%
Other values (1139) 7981
83.6%
Lowercase Letter
ValueCountFrequency (%)
e 826
11.2%
o 750
10.2%
a 744
10.1%
n 741
10.1%
r 493
 
6.7%
u 485
 
6.6%
i 461
 
6.3%
l 394
 
5.4%
t 373
 
5.1%
g 336
 
4.6%
Other values (16) 1742
23.7%
Uppercase Letter
ValueCountFrequency (%)
S 223
13.1%
M 173
 
10.1%
C 151
 
8.8%
A 126
 
7.4%
H 101
 
5.9%
G 96
 
5.6%
T 91
 
5.3%
B 77
 
4.5%
P 74
 
4.3%
I 66
 
3.9%
Other values (16) 529
31.0%
Other Punctuation
ValueCountFrequency (%)
? 1138
92.7%
· 33
 
2.7%
' 16
 
1.3%
& 15
 
1.2%
. 13
 
1.1%
3
 
0.2%
: 2
 
0.2%
2
 
0.2%
, 2
 
0.2%
/ 1
 
0.1%
Other values (2) 2
 
0.2%
Decimal Number
ValueCountFrequency (%)
1 29
20.4%
7 18
12.7%
3 18
12.7%
8 17
12.0%
2 15
10.6%
6 15
10.6%
9 15
10.6%
4 9
 
6.3%
0 6
 
4.2%
Close Punctuation
ValueCountFrequency (%)
) 334
94.9%
16
 
4.5%
2
 
0.6%
Open Punctuation
ValueCountFrequency (%)
( 334
94.9%
16
 
4.5%
2
 
0.6%
Space Separator
ValueCountFrequency (%)
1151
99.9%
  1
 
0.1%
Dash Punctuation
ValueCountFrequency (%)
- 29
100.0%
Control
ValueCountFrequency (%)
6
100.0%
Final Punctuation
ValueCountFrequency (%)
4
100.0%
Other Symbol
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 9052
41.4%
Han 4968
22.7%
Common 3265
 
14.9%
Hangul 2392
 
10.9%
Katakana 2147
 
9.8%
Hiragana 37
 
0.2%

Most frequent character per script

Han
ValueCountFrequency (%)
239
 
4.8%
162
 
3.3%
161
 
3.2%
93
 
1.9%
84
 
1.7%
67
 
1.3%
62
 
1.2%
62
 
1.2%
60
 
1.2%
55
 
1.1%
Other values (655) 3923
79.0%
Hangul
ValueCountFrequency (%)
136
 
5.7%
63
 
2.6%
52
 
2.2%
51
 
2.1%
51
 
2.1%
45
 
1.9%
44
 
1.8%
36
 
1.5%
34
 
1.4%
32
 
1.3%
Other values (377) 1848
77.3%
Katakana
ValueCountFrequency (%)
419
 
19.5%
96
 
4.5%
89
 
4.1%
84
 
3.9%
84
 
3.9%
80
 
3.7%
80
 
3.7%
70
 
3.3%
48
 
2.2%
47
 
2.2%
Other values (66) 1050
48.9%
Latin
ValueCountFrequency (%)
e 826
 
9.1%
o 750
 
8.3%
a 744
 
8.2%
n 741
 
8.2%
r 493
 
5.4%
u 485
 
5.4%
i 461
 
5.1%
l 394
 
4.4%
t 373
 
4.1%
g 336
 
3.7%
Other values (42) 3449
38.1%
Common
ValueCountFrequency (%)
1151
35.3%
? 1138
34.9%
) 334
 
10.2%
( 334
 
10.2%
· 33
 
1.0%
1 29
 
0.9%
- 29
 
0.9%
7 18
 
0.6%
3 18
 
0.6%
8 17
 
0.5%
Other values (23) 164
 
5.0%
Hiragana
ValueCountFrequency (%)
7
18.9%
6
16.2%
2
 
5.4%
2
 
5.4%
2
 
5.4%
2
 
5.4%
2
 
5.4%
1
 
2.7%
1
 
2.7%
1
 
2.7%
Other values (11) 11
29.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 12235
56.0%
CJK 4965
22.7%
Hangul 2392
 
10.9%
Katakana 2147
 
9.8%
None 77
 
0.4%
Hiragana 37
 
0.2%
Punctuation 4
 
< 0.1%
CJK Compat Ideographs 3
 
< 0.1%
Box Drawing 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1151
 
9.4%
? 1138
 
9.3%
e 826
 
6.8%
o 750
 
6.1%
a 744
 
6.1%
n 741
 
6.1%
r 493
 
4.0%
u 485
 
4.0%
i 461
 
3.8%
l 394
 
3.2%
Other values (63) 5052
41.3%
Katakana
ValueCountFrequency (%)
419
 
19.5%
96
 
4.5%
89
 
4.1%
84
 
3.9%
84
 
3.9%
80
 
3.7%
80
 
3.7%
70
 
3.3%
48
 
2.2%
47
 
2.2%
Other values (66) 1050
48.9%
CJK
ValueCountFrequency (%)
239
 
4.8%
162
 
3.3%
161
 
3.2%
93
 
1.9%
84
 
1.7%
67
 
1.3%
62
 
1.2%
62
 
1.2%
60
 
1.2%
55
 
1.1%
Other values (653) 3920
79.0%
Hangul
ValueCountFrequency (%)
136
 
5.7%
63
 
2.6%
52
 
2.2%
51
 
2.1%
51
 
2.1%
45
 
1.9%
44
 
1.8%
36
 
1.5%
34
 
1.4%
32
 
1.3%
Other values (377) 1848
77.3%
None
ValueCountFrequency (%)
· 33
42.9%
16
20.8%
16
20.8%
3
 
3.9%
2
 
2.6%
2
 
2.6%
2
 
2.6%
1
 
1.3%
1
 
1.3%
  1
 
1.3%
Hiragana
ValueCountFrequency (%)
7
18.9%
6
16.2%
2
 
5.4%
2
 
5.4%
2
 
5.4%
2
 
5.4%
2
 
5.4%
1
 
2.7%
1
 
2.7%
1
 
2.7%
Other values (11) 11
29.7%
Punctuation
ValueCountFrequency (%)
4
100.0%
CJK Compat Ideographs
ValueCountFrequency (%)
2
66.7%
1
33.3%
Box Drawing
ValueCountFrequency (%)
1
100.0%

콘텐츠URL
Text

UNIQUE 

Distinct: 1957
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 15.4 KiB

Length

Max length: 204
Median length: 190
Mean length: 136.51303
Min length: 124

Characters and Unicode

Total characters: 267156
Distinct characters: 1069
Distinct categories: 14
Distinct scripts: 7
Distinct blocks: 9
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1957
Unique (%): 100.0%

Sample

1st row: https://english.visitseoul.net/attractions/Baekyang-2024/ENP8onuvv?utm_source=seoulopendata&utm_medium=attractions&utm_content=ENP8onuvv
2nd row: https://english.visitseoul.net/attractions/2024-Chaeseokjangjeonmangdae/ENPauov7d?utm_source=seoulopendata&utm_medium=attractions&utm_content=ENPauov7d
3rd row: https://english.visitseoul.net/attractions/2024-changsincliff/ENPgvo4y2?utm_source=seoulopendata&utm_medium=attractions&utm_content=ENPgvo4y2
4th row: https://english.visitseoul.net/attractions/ChoongAngHighSchool/ENPgcblme?utm_source=seoulopendata&utm_medium=attractions&utm_content=ENPgcblme
5th row: https://english.visitseoul.net/attractions/ChoongAngStore/ENPl7gype?utm_source=seoulopendata&utm_medium=attractions&utm_content=ENPl7gype
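
The sample URLs share a visible pattern: a language subdomain, an `/attractions/<slug>/<id>` path, and UTM query parameters whose `utm_content` repeats the id. The sketch below splits a URL along those lines with the standard library. The function and key names are hypothetical, and the pattern is inferred from the samples only, not from any documented URL scheme.

```python
from urllib.parse import parse_qs, urlsplit


def describe_content_url(url):
    """Split a content URL into the parts visible in the sample rows."""
    parts = urlsplit(url)
    query = parse_qs(parts.query)
    segments = [s for s in parts.path.split("/") if s]
    return {
        # First label of the host, e.g. "english" or "tchinese".
        "site": parts.hostname.split(".")[0],
        # Middle path segment, when the path has the three-part shape.
        "slug": segments[1] if len(segments) > 2 else None,
        # Last path segment, which the samples repeat in utm_content.
        "content_id": segments[-1] if segments else None,
        "utm_content": query.get("utm_content", [None])[0],
    }


sample = (
    "https://english.visitseoul.net/attractions/Baekyang-2024/ENP8onuvv"
    "?utm_source=seoulopendata&utm_medium=attractions&utm_content=ENP8onuvv"
)
print(describe_content_url(sample))
```

Rows whose path contains a literal `?` in place of a mis-encoded character would be split at that point instead, so such rows need repair before parsing.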
ValueCountFrequency (%)
https://english.visitseoul.net/attractions/baekyang-2024/enp8onuvv?utm_source=seoulopendata&utm_medium=attractions&utm_content=enp8onuvv 1
 
0.1%
https://chinese.visitseoul.net/attractions/2023043/cnpw01l8t?utm_source=seoulopendata&utm_medium=attractions&utm_content=cnpw01l8t 1
 
0.1%
https://tchinese.visitseoul.net/attractions/死六臣公園/tcp004513?utm_source=seoulopendata&utm_medium=attractions&utm_content=tcp004513 1
 
0.1%
https://chinese.visitseoul.net/attractions/文化理容院1/cnp026953?utm_source=seoulopendata&utm_medium=attractions&utm_content=cnp026953 1
 
0.1%
https://tchinese.visitseoul.net/attractions/孫基禎紀念館/tcp006995?utm_source=seoulopendata&utm_medium=attractions&utm_content=tcp006995 1
 
0.1%
https://chinese.visitseoul.net/attractions/?路西服店/cnp024708?utm_source=seoulopendata&utm_medium=attractions&utm_content=cnp024708 1
 
0.1%
https://japanese.visitseoul.net/attractions/hongik-bookstore-jp/jpp028332?utm_source=seoulopendata&utm_medium=attractions&utm_content=jpp028332 1
 
0.1%
https://chinese.visitseoul.net/attractions/弘智?和?春台城(홍지문탕춘대성)/cnp004101?utm_source=seoulopendata&utm_medium=attractions&utm_content=cnp004101 1
 
0.1%
https://tchinese.visitseoul.net/attractions/弘智門和蕩春臺城/tcp004101?utm_source=seoulopendata&utm_medium=attractions&utm_content=tcp004101 1
 
0.1%
https://tchinese.visitseoul.net/attractions/2023015/tcpqc1t0c?utm_source=seoulopendata&utm_medium=attractions&utm_content=tcpqc1t0c 1
 
0.1%
Other values (1947) 1947
99.5%

Most occurring characters

ValueCountFrequency (%)
t 32539
 
12.2%
e 18433
 
6.9%
s 16021
 
6.0%
o 15520
 
5.8%
n 15148
 
5.7%
u 14677
 
5.5%
a 14238
 
5.3%
i 11861
 
4.4%
m 10397
 
3.9%
/ 9787
 
3.7%
Other values (1059) 108535
40.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 189300
70.9%
Decimal Number 22533
 
8.4%
Other Punctuation 22275
 
8.3%
Uppercase Letter 13660
 
5.1%
Math Symbol 5871
 
2.2%
Connector Punctuation 5871
 
2.2%
Other Letter 5577
 
2.1%
Dash Punctuation 2007
 
0.8%
Open Punctuation 26
 
< 0.1%
Close Punctuation 25
 
< 0.1%
Other values (4) 11
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
144
 
2.6%
141
 
2.5%
115
 
2.1%
114
 
2.0%
67
 
1.2%
59
 
1.1%
58
 
1.0%
57
 
1.0%
57
 
1.0%
53
 
1.0%
Other values (970) 4712
84.5%
Lowercase Letter
ValueCountFrequency (%)
t 32539
17.2%
e 18433
9.7%
s 16021
8.5%
o 15520
8.2%
n 15148
8.0%
u 14677
7.8%
a 14238
7.5%
i 11861
 
6.3%
m 10397
 
5.5%
c 9100
 
4.8%
Other values (21) 31366
16.6%
Uppercase Letter
ValueCountFrequency (%)
P 4832
35.4%
C 1749
 
12.8%
N 1641
 
12.0%
T 911
 
6.7%
J 874
 
6.4%
E 845
 
6.2%
K 829
 
6.1%
O 802
 
5.9%
S 249
 
1.8%
M 175
 
1.3%
Other values (16) 753
 
5.5%
Decimal Number
ValueCountFrequency (%)
0 6612
29.3%
2 3289
14.6%
1 2195
 
9.7%
3 2119
 
9.4%
5 1593
 
7.1%
7 1400
 
6.2%
4 1373
 
6.1%
6 1363
 
6.0%
9 1323
 
5.9%
8 1266
 
5.6%
Other Punctuation
ValueCountFrequency (%)
/ 9787
43.9%
& 3914
 
17.6%
. 3914
 
17.6%
? 2696
 
12.1%
: 1957
 
8.8%
· 5
 
< 0.1%
2
 
< 0.1%
Open Punctuation
ValueCountFrequency (%)
24
92.3%
1
 
3.8%
( 1
 
3.8%
Close Punctuation
ValueCountFrequency (%)
23
92.0%
1
 
4.0%
) 1
 
4.0%
Dash Punctuation
ValueCountFrequency (%)
- 2005
99.9%
2
 
0.1%
Final Punctuation
ValueCountFrequency (%)
3
75.0%
1
 
25.0%
Math Symbol
ValueCountFrequency (%)
= 5871
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 5871
100.0%
Format
ValueCountFrequency (%)
­ 5
100.0%
Initial Punctuation
ValueCountFrequency (%)
1
100.0%
Other Symbol
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 202953
76.0%
Common 58619
 
21.9%
Han 3116
 
1.2%
Hangul 2202
 
0.8%
Katakana 251
 
0.1%
Hiragana 8
 
< 0.1%
Cyrillic 7
 
< 0.1%

Most frequent character per script

Han
ValueCountFrequency (%)
141
 
4.5%
115
 
3.7%
114
 
3.7%
67
 
2.2%
59
 
1.9%
47
 
1.5%
41
 
1.3%
39
 
1.3%
38
 
1.2%
36
 
1.2%
Other values (573) 2419
77.6%
Hangul
ValueCountFrequency (%)
144
 
6.5%
58
 
2.6%
57
 
2.6%
57
 
2.6%
53
 
2.4%
48
 
2.2%
37
 
1.7%
36
 
1.6%
33
 
1.5%
32
 
1.5%
Other values (328) 1647
74.8%
Katakana
ValueCountFrequency (%)
32
 
12.7%
20
 
8.0%
16
 
6.4%
16
 
6.4%
10
 
4.0%
9
 
3.6%
9
 
3.6%
9
 
3.6%
9
 
3.6%
9
 
3.6%
Other values (42) 112
44.6%
Latin
ValueCountFrequency (%)
t 32539
16.0%
e 18433
9.1%
s 16021
 
7.9%
o 15520
 
7.6%
n 15148
 
7.5%
u 14677
 
7.2%
a 14238
 
7.0%
i 11861
 
5.8%
m 10397
 
5.1%
c 9100
 
4.5%
Other values (41) 45019
22.2%
Common
ValueCountFrequency (%)
/ 9787
16.7%
0 6612
11.3%
= 5871
10.0%
_ 5871
10.0%
& 3914
 
6.7%
. 3914
 
6.7%
2 3289
 
5.6%
? 2696
 
4.6%
1 2195
 
3.7%
3 2119
 
3.6%
Other values (22) 12351
21.1%
Hiragana
ValueCountFrequency (%)
2
25.0%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
Cyrillic
ValueCountFrequency (%)
т 2
28.6%
р 1
14.3%
е 1
14.3%
у 1
14.3%
с 1
14.3%
Э 1
14.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 261503
97.9%
CJK 3116
 
1.2%
Hangul 2202
 
0.8%
Katakana 251
 
0.1%
None 61
 
< 0.1%
Hiragana 8
 
< 0.1%
Punctuation 7
 
< 0.1%
Cyrillic 7
 
< 0.1%
Box Drawing 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
t 32539
 
12.4%
e 18433
 
7.0%
s 16021
 
6.1%
o 15520
 
5.9%
n 15148
 
5.8%
u 14677
 
5.6%
a 14238
 
5.4%
i 11861
 
4.5%
m 10397
 
4.0%
/ 9787
 
3.7%
Other values (61) 102882
39.3%
Hangul
ValueCountFrequency (%)
144
 
6.5%
58
 
2.6%
57
 
2.6%
57
 
2.6%
53
 
2.4%
48
 
2.2%
37
 
1.7%
36
 
1.6%
33
 
1.5%
32
 
1.5%
Other values (328) 1647
74.8%
CJK
ValueCountFrequency (%)
141
 
4.5%
115
 
3.7%
114
 
3.7%
67
 
2.2%
59
 
1.9%
47
 
1.5%
41
 
1.3%
39
 
1.3%
38
 
1.2%
36
 
1.2%
Other values (573) 2419
77.6%
Katakana
ValueCountFrequency (%)
32
 
12.7%
20
 
8.0%
16
 
6.4%
16
 
6.4%
10
 
4.0%
9
 
3.6%
9
 
3.6%
9
 
3.6%
9
 
3.6%
9
 
3.6%
Other values (42) 112
44.6%
None
ValueCountFrequency (%)
24
39.3%
23
37.7%
­ 5
 
8.2%
· 5
 
8.2%
2
 
3.3%
1
 
1.6%
1
 
1.6%
Punctuation
ValueCountFrequency (%)
3
42.9%
2
28.6%
1
 
14.3%
1
 
14.3%
Hiragana
ValueCountFrequency (%)
2
25.0%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
Cyrillic
ValueCountFrequency (%)
т 2
28.6%
р 1
14.3%
е 1
14.3%
у 1
14.3%
с 1
14.3%
Э 1
14.3%
Box Drawing
ValueCountFrequency (%)
1
100.0%

주소
Text

Distinct: 1324
Distinct (%): 67.7%
Missing: 0
Missing (%): 0.0%
Memory size: 15.4 KiB

Length

Max length: 79
Median length: 65
Mean length: 19.777721
Min length: 2

Characters and Unicode

Total characters: 38705
Distinct characters: 685
Distinct categories: 12
Distinct scripts: 6
Distinct blocks: 7
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1271
Unique (%): 64.9%

Sample

1st row: 140-24, Gye-dong, Jongno-gu, Seoul, Korea
2nd row: 서울 종로구 창신동 23-322
3rd row: 서울 종로구 창신동 23-322
4th row: 1, Gye-dong, Jongno-gu, Seoul, Korea
5th row: 2-105, Gye-dong, Jongno-gu, Seoul, Korea

Value                Count  Frequency (%)
서울                 331    6.6%
seoul                277    5.5%
종로구               129    2.6%
jongno-gu            114    2.3%
중구                 65     1.3%
jung-gu              44     0.9%
korea                43     0.9%
1-1                  29     0.6%
100-120              29     0.6%
110-062              27     0.5%
Other values (1784)  3905   78.2%

Most occurring characters

ValueCountFrequency (%)
6125
 
15.8%
1 2974
 
7.7%
- 2442
 
6.3%
0 2014
 
5.2%
2 1268
 
3.3%
? 1185
 
3.1%
o 1174
 
3.0%
g 1001
 
2.6%
n 969
 
2.5%
3 845
 
2.2%
Other values (675) 18708
48.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 10758
27.8%
Other Letter 10054
26.0%
Lowercase Letter 6192
16.0%
Space Separator 6126
15.8%
Dash Punctuation 2442
 
6.3%
Other Punctuation 1996
 
5.2%
Uppercase Letter 954
 
2.5%
Close Punctuation 88
 
0.2%
Open Punctuation 88
 
0.2%
Control 4
 
< 0.1%
Other values (2) 3
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
639
 
6.4%
625
 
6.2%
495
 
4.9%
443
 
4.4%
421
 
4.2%
365
 
3.6%
362
 
3.6%
332
 
3.3%
277
 
2.8%
270
 
2.7%
Other values (604) 5825
57.9%
Lowercase Letter
ValueCountFrequency (%)
o 1174
19.0%
g 1001
16.2%
n 969
15.6%
u 720
11.6%
e 580
9.4%
a 370
 
6.0%
l 331
 
5.3%
d 273
 
4.4%
i 114
 
1.8%
r 92
 
1.5%
Other values (13) 568
9.2%
Uppercase Letter
ValueCountFrequency (%)
S 403
42.2%
J 180
18.9%
G 71
 
7.4%
Y 45
 
4.7%
K 44
 
4.6%
B 31
 
3.2%
H 29
 
3.0%
M 24
 
2.5%
D 21
 
2.2%
N 18
 
1.9%
Other values (13) 88
 
9.2%
Decimal Number
ValueCountFrequency (%)
1 2974
27.6%
0 2014
18.7%
2 1268
11.8%
3 845
 
7.9%
8 726
 
6.7%
4 714
 
6.6%
5 681
 
6.3%
7 628
 
5.8%
6 522
 
4.9%
9 386
 
3.6%
Other Punctuation
ValueCountFrequency (%)
? 1185
59.4%
, 807
40.4%
· 2
 
0.1%
. 1
 
0.1%
1
 
0.1%
Space Separator
ValueCountFrequency (%)
6125
> 99.9%
  1
 
< 0.1%
Close Punctuation
ValueCountFrequency (%)
) 84
95.5%
4
 
4.5%
Open Punctuation
ValueCountFrequency (%)
( 84
95.5%
4
 
4.5%
Dash Punctuation
ValueCountFrequency (%)
- 2442
100.0%
Control
ValueCountFrequency (%)
4
100.0%
Math Symbol
ValueCountFrequency (%)
~ 2
100.0%
Format
ValueCountFrequency (%)
­ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 21505
55.6%
Latin 7146
 
18.5%
Han 5719
 
14.8%
Hangul 3332
 
8.6%
Katakana 1000
 
2.6%
Hiragana 3
 
< 0.1%

Most frequent character per script

Han
ValueCountFrequency (%)
639
 
11.2%
625
 
10.9%
495
 
8.7%
443
 
7.7%
267
 
4.7%
261
 
4.6%
169
 
3.0%
147
 
2.6%
137
 
2.4%
117
 
2.0%
Other values (333) 2419
42.3%
Hangul
ValueCountFrequency (%)
421
 
12.6%
365
 
11.0%
362
 
10.9%
332
 
10.0%
186
 
5.6%
143
 
4.3%
83
 
2.5%
68
 
2.0%
57
 
1.7%
42
 
1.3%
Other values (206) 1273
38.2%
Katakana
ValueCountFrequency (%)
277
27.7%
270
27.0%
269
26.9%
42
 
4.2%
11
 
1.1%
8
 
0.8%
7
 
0.7%
7
 
0.7%
7
 
0.7%
6
 
0.6%
Other values (42) 96
 
9.6%
Latin
ValueCountFrequency (%)
o 1174
16.4%
g 1001
14.0%
n 969
13.6%
u 720
10.1%
e 580
8.1%
S 403
 
5.6%
a 370
 
5.2%
l 331
 
4.6%
d 273
 
3.8%
J 180
 
2.5%
Other values (36) 1145
16.0%
Common
ValueCountFrequency (%)
6125
28.5%
1 2974
13.8%
- 2442
 
11.4%
0 2014
 
9.4%
2 1268
 
5.9%
? 1185
 
5.5%
3 845
 
3.9%
, 807
 
3.8%
8 726
 
3.4%
4 714
 
3.3%
Other values (15) 2405
 
11.2%
Hiragana
ValueCountFrequency (%)
1
33.3%
1
33.3%
1
33.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 28638
74.0%
CJK 5704
 
14.7%
Hangul 3332
 
8.6%
Katakana 1000
 
2.6%
CJK Compat Ideographs 15
 
< 0.1%
None 13
 
< 0.1%
Hiragana 3
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
6125
21.4%
1 2974
 
10.4%
- 2442
 
8.5%
0 2014
 
7.0%
2 1268
 
4.4%
? 1185
 
4.1%
o 1174
 
4.1%
g 1001
 
3.5%
n 969
 
3.4%
3 845
 
3.0%
Other values (55) 8641
30.2%
CJK
ValueCountFrequency (%)
639
 
11.2%
625
 
11.0%
495
 
8.7%
443
 
7.8%
267
 
4.7%
261
 
4.6%
169
 
3.0%
147
 
2.6%
137
 
2.4%
117
 
2.1%
Other values (326) 2404
42.1%
Hangul
ValueCountFrequency (%)
421
 
12.6%
365
 
11.0%
362
 
10.9%
332
 
10.0%
186
 
5.6%
143
 
4.3%
83
 
2.5%
68
 
2.0%
57
 
1.7%
42
 
1.3%
Other values (206) 1273
38.2%
Katakana
ValueCountFrequency (%)
277
27.7%
270
27.0%
269
26.9%
42
 
4.2%
11
 
1.1%
8
 
0.8%
7
 
0.7%
7
 
0.7%
7
 
0.7%
6
 
0.6%
Other values (42) 96
 
9.6%
CJK Compat Ideographs
ValueCountFrequency (%)
7
46.7%
2
 
13.3%
2
 
13.3%
1
 
6.7%
1
 
6.7%
1
 
6.7%
1
 
6.7%
None
ValueCountFrequency (%)
4
30.8%
4
30.8%
· 2
15.4%
  1
 
7.7%
1
 
7.7%
­ 1
 
7.7%
Hiragana
ValueCountFrequency (%)
1
33.3%
1
33.3%
1
33.3%

신주소
Text

Distinct: 1913
Distinct (%): 97.8%
Missing: 0
Missing (%): 0.0%
Memory size: 15.4 KiB

Length

Max length: 102
Median length: 72
Mean length: 31.643332
Min length: 2

Characters and Unicode

Total characters: 61926
Distinct characters: 1003
Distinct categories: 12
Distinct scripts: 6
Distinct blocks: 8
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1882
Unique (%): 96.2%

Sample

1st row: 03057 54 Gyedong-gil, Jongno-gu, Seoul
2nd row: 03091 51 Naksan 5-gil, Jongno-gu, Seoul
3rd row: 03091 23-322 Changsin-dong, Dongdaemun-gu, Seoul
4th row: 03051 164 Changdeokgung-gil, Jongno-gu, Seoul
5th row: 03051 162 Changdeokgung-gil, Jongno-gu, Seoul

Value                Count  Frequency (%)
seoul                397    5.1%
서울                 329    4.2%
jongno-gu            139    1.8%
종로구               137    1.8%
jung-gu              60     0.8%
중구                 59     0.8%
서울특별시           43     0.6%
03056                31     0.4%
03145                29     0.4%
yongsan-gu           29     0.4%
Other values (3142)  6542   83.9%

Most occurring characters

ValueCountFrequency (%)
7761
 
12.5%
0 3276
 
5.3%
1 2537
 
4.1%
? 2102
 
3.4%
3 1890
 
3.1%
o 1666
 
2.7%
2 1509
 
2.4%
4 1443
 
2.3%
5 1384
 
2.2%
- 1382
 
2.2%
Other values (993) 36976
59.7%

Most occurring categories

ValueCountFrequency (%)
Other Letter 20273
32.7%
Decimal Number 15482
25.0%
Lowercase Letter 9534
15.4%
Space Separator 7761
 
12.5%
Other Punctuation 3593
 
5.8%
Uppercase Letter 1556
 
2.5%
Dash Punctuation 1382
 
2.2%
Open Punctuation 1169
 
1.9%
Close Punctuation 1168
 
1.9%
Math Symbol 5
 
< 0.1%
Other values (2) 3
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
1117
 
5.5%
867
 
4.3%
789
 
3.9%
577
 
2.8%
573
 
2.8%
488
 
2.4%
468
 
2.3%
446
 
2.2%
444
 
2.2%
405
 
2.0%
Other values (912) 14099
69.5%
Lowercase Letter
ValueCountFrequency (%)
o 1666
17.5%
g 1348
14.1%
n 1178
12.4%
u 1107
11.6%
e 921
9.7%
l 686
7.2%
a 614
 
6.4%
r 430
 
4.5%
i 323
 
3.4%
d 208
 
2.2%
Other values (15) 1053
11.0%
Uppercase Letter
ValueCountFrequency (%)
S 575
37.0%
J 229
 
14.7%
Y 84
 
5.4%
G 82
 
5.3%
B 75
 
4.8%
D 61
 
3.9%
I 54
 
3.5%
H 46
 
3.0%
C 40
 
2.6%
M 36
 
2.3%
Other values (14) 274
17.6%
Decimal Number
ValueCountFrequency (%)
0 3276
21.2%
1 2537
16.4%
3 1890
12.2%
2 1509
9.7%
4 1443
9.3%
5 1384
8.9%
6 1008
 
6.5%
7 1003
 
6.5%
8 783
 
5.1%
9 648
 
4.2%
Other Punctuation
ValueCountFrequency (%)
? 2102
58.5%
, 1298
36.1%
97
 
2.7%
55
 
1.5%
. 15
 
0.4%
# 14
 
0.4%
· 7
 
0.2%
& 3
 
0.1%
' 1
 
< 0.1%
: 1
 
< 0.1%
Close Punctuation
ValueCountFrequency (%)
) 1130
96.7%
37
 
3.2%
1
 
0.1%
Open Punctuation
ValueCountFrequency (%)
( 1127
96.4%
41
 
3.5%
1
 
0.1%
Space Separator
ValueCountFrequency (%)
7761
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1382
100.0%
Math Symbol
ValueCountFrequency (%)
~ 5
100.0%
Control
ValueCountFrequency (%)
2
100.0%
Final Punctuation
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 30563
49.4%
Latin 11090
 
17.9%
Han 9474
 
15.3%
Hangul 5716
 
9.2%
Katakana 5069
 
8.2%
Hiragana 14
 
< 0.1%

Most frequent character per script

Han
ValueCountFrequency (%)
1117
 
11.8%
789
 
8.3%
577
 
6.1%
446
 
4.7%
444
 
4.7%
391
 
4.1%
294
 
3.1%
276
 
2.9%
216
 
2.3%
169
 
1.8%
Other values (511) 4755
50.2%
Hangul
ValueCountFrequency (%)
488
 
8.5%
468
 
8.2%
405
 
7.1%
388
 
6.8%
300
 
5.2%
181
 
3.2%
172
 
3.0%
132
 
2.3%
87
 
1.5%
78
 
1.4%
Other values (313) 3017
52.8%
Katakana
ValueCountFrequency (%)
867
17.1%
573
 
11.3%
394
 
7.8%
366
 
7.2%
313
 
6.2%
290
 
5.7%
278
 
5.5%
196
 
3.9%
126
 
2.5%
105
 
2.1%
Other values (62) 1561
30.8%
Latin
ValueCountFrequency (%)
o 1666
15.0%
g 1348
12.2%
n 1178
10.6%
u 1107
10.0%
e 921
8.3%
l 686
 
6.2%
a 614
 
5.5%
S 575
 
5.2%
r 430
 
3.9%
i 323
 
2.9%
Other values (39) 2242
20.2%
Common
ValueCountFrequency (%)
7761
25.4%
0 3276
10.7%
1 2537
 
8.3%
? 2102
 
6.9%
3 1890
 
6.2%
2 1509
 
4.9%
4 1443
 
4.7%
5 1384
 
4.5%
- 1382
 
4.5%
, 1298
 
4.2%
Other values (22) 5981
19.6%
Hiragana
ValueCountFrequency (%)
9
64.3%
1
 
7.1%
1
 
7.1%
1
 
7.1%
1
 
7.1%
1
 
7.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 41412
66.9%
CJK 9458
 
15.3%
Hangul 5716
 
9.2%
Katakana 5069
 
8.2%
None 240
 
0.4%
CJK Compat Ideographs 16
 
< 0.1%
Hiragana 14
 
< 0.1%
Punctuation 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
7761
18.7%
0 3276
 
7.9%
1 2537
 
6.1%
? 2102
 
5.1%
3 1890
 
4.6%
o 1666
 
4.0%
2 1509
 
3.6%
4 1443
 
3.5%
5 1384
 
3.3%
- 1382
 
3.3%
Other values (62) 16462
39.8%
CJK
ValueCountFrequency (%)
1117
 
11.8%
789
 
8.3%
577
 
6.1%
446
 
4.7%
444
 
4.7%
391
 
4.1%
294
 
3.1%
276
 
2.9%
216
 
2.3%
169
 
1.8%
Other values (504) 4739
50.1%
Katakana
ValueCountFrequency (%)
867
17.1%
573
 
11.3%
394
 
7.8%
366
 
7.2%
313
 
6.2%
290
 
5.7%
278
 
5.5%
196
 
3.9%
126
 
2.5%
105
 
2.1%
Other values (62) 1561
30.8%
Hangul
ValueCountFrequency (%)
488
 
8.5%
468
 
8.2%
405
 
7.1%
388
 
6.8%
300
 
5.2%
181
 
3.2%
172
 
3.0%
132
 
2.3%
87
 
1.5%
78
 
1.4%
Other values (313) 3017
52.8%
None
ValueCountFrequency (%)
97
40.4%
55
22.9%
41
17.1%
37
 
15.4%
· 7
 
2.9%
1
 
0.4%
1
 
0.4%
1
 
0.4%
Hiragana
ValueCountFrequency (%)
9
64.3%
1
 
7.1%
1
 
7.1%
1
 
7.1%
1
 
7.1%
1
 
7.1%
CJK Compat Ideographs
ValueCountFrequency (%)
9
56.2%
2
 
12.5%
1
 
6.2%
1
 
6.2%
1
 
6.2%
1
 
6.2%
1
 
6.2%
Punctuation
ValueCountFrequency (%)
1
100.0%

전화번호
Text

MISSING 

Distinct: 769
Distinct (%): 41.7%
Missing: 113
Missing (%): 5.8%
Memory size: 15.4 KiB

Length

Max length: 87
Median length: 67
Mean length: 14.065076
Min length: 6

Characters and Unicode

Total characters: 25936
Distinct characters: 115
Distinct categories: 12
Distinct scripts: 5
Distinct blocks: 5
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 398
Unique (%): 21.6%

Sample

1st row: +82-2-762-1261
2nd row: +82-507-1330-5416
3rd row: +82-2-742-1321
4th row: +82-2-2261-0501
5th row: +82-2-3780-0578

Value               Count  Frequency (%)
82-2-120            38     2.0%
02-120              12     0.6%
82-2-724-0274       9      0.5%
82-2-793-8249       8      0.4%
82-2-970-4500       8      0.4%
82-2-3780-0578      8      0.4%
82-2-2077-9000      8      0.4%
82-2)2133-5695      8      0.4%
82-2-731-0412       7      0.4%
82-2-762-4868       7      0.4%
Other values (764)  1746   93.9%
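
The values above mix several notations for the same kind of number: an international prefix (`82-2-120`), a domestic prefix (`02-120`), and a stray parenthesis (`82-2)2133-5695`). One possible clean-up, sketched below, reduces each value to digits with a leading country code. The function name and rules are assumptions for illustration; free-text values and number ranges written with `~` would need separate handling.

```python
import re


def normalize_phone(raw, country_code="82"):
    """Reduce a phone string to digits, prefixed with the country code."""
    digits = re.sub(r"\D", "", raw)  # drop "+", "-", ")", spaces, etc.
    if digits.startswith(country_code):
        return digits
    if digits.startswith("0"):
        # Domestic trunk prefix: replace the leading 0 with the country code.
        return country_code + digits[1:]
    return digits


for raw in ["+82-2-762-1261", "02-120", "82-2)2133-5695"]:
    print(raw, "->", normalize_phone(raw))
```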

Most occurring characters

ValueCountFrequency (%)
- 5132
19.8%
2 4851
18.7%
8 2490
9.6%
0 2228
8.6%
3 1630
 
6.3%
7 1627
 
6.3%
+ 1494
 
5.8%
1 1483
 
5.7%
4 1269
 
4.9%
6 1147
 
4.4%
Other values (105) 2585
10.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 18940
73.0%
Dash Punctuation 5132
 
19.8%
Math Symbol 1530
 
5.9%
Other Letter 103
 
0.4%
Lowercase Letter 81
 
0.3%
Other Punctuation 53
 
0.2%
Space Separator 27
 
0.1%
Close Punctuation 26
 
0.1%
Open Punctuation 16
 
0.1%
Uppercase Letter 14
 
0.1%
Other values (2) 14
 
0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
6
 
5.8%
6
 
5.8%
5
 
4.9%
5
 
4.9%
4
 
3.9%
4
 
3.9%
3
 
2.9%
3
 
2.9%
3
 
2.9%
3
 
2.9%
Other values (48) 61
59.2%
Lowercase Letter
ValueCountFrequency (%)
s 11
13.6%
e 10
12.3%
u 9
11.1%
r 8
9.9%
t 8
9.9%
i 6
7.4%
o 6
7.4%
n 5
6.2%
a 5
6.2%
m 4
 
4.9%
Other values (7) 9
11.1%
Decimal Number
ValueCountFrequency (%)
2 4851
25.6%
8 2490
13.1%
0 2228
11.8%
3 1630
 
8.6%
7 1627
 
8.6%
1 1483
 
7.8%
4 1269
 
6.7%
6 1147
 
6.1%
9 1118
 
5.9%
5 1097
 
5.8%
Other Punctuation
ValueCountFrequency (%)
? 15
28.3%
/ 14
26.4%
, 7
13.2%
6
 
11.3%
: 4
 
7.5%
3
 
5.7%
2
 
3.8%
. 1
 
1.9%
1
 
1.9%
Uppercase Letter
ValueCountFrequency (%)
C 3
21.4%
M 3
21.4%
T 2
14.3%
D 1
 
7.1%
I 1
 
7.1%
S 1
 
7.1%
N 1
 
7.1%
H 1
 
7.1%
A 1
 
7.1%
Math Symbol
ValueCountFrequency (%)
+ 1494
97.6%
~ 34
 
2.2%
< 1
 
0.1%
> 1
 
0.1%
Close Punctuation
ValueCountFrequency (%)
) 22
84.6%
4
 
15.4%
Open Punctuation
ValueCountFrequency (%)
( 12
75.0%
4
 
25.0%
Dash Punctuation
ValueCountFrequency (%)
- 5132
100.0%
Space Separator
ValueCountFrequency (%)
27
100.0%
Control
ValueCountFrequency (%)
12
100.0%
Format
ValueCountFrequency (%)
­ 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 25738
99.2%
Latin 95
 
0.4%
Han 69
 
0.3%
Hangul 31
 
0.1%
Katakana 3
 
< 0.1%

Most frequent character per script

Common
ValueCountFrequency (%)
- 5132
19.9%
2 4851
18.8%
8 2490
9.7%
0 2228
8.7%
3 1630
 
6.3%
7 1627
 
6.3%
+ 1494
 
5.8%
1 1483
 
5.8%
4 1269
 
4.9%
6 1147
 
4.5%
Other values (21) 2387
9.3%
Han
ValueCountFrequency (%)
6
 
8.7%
6
 
8.7%
5
 
7.2%
5
 
7.2%
4
 
5.8%
3
 
4.3%
3
 
4.3%
3
 
4.3%
3
 
4.3%
3
 
4.3%
Other values (21) 28
40.6%
Latin
ValueCountFrequency (%)
s 11
11.6%
e 10
10.5%
u 9
 
9.5%
r 8
 
8.4%
t 8
 
8.4%
i 6
 
6.3%
o 6
 
6.3%
n 5
 
5.3%
a 5
 
5.3%
m 4
 
4.2%
Other values (16) 23
24.2%
Hangul
ValueCountFrequency (%)
4
 
12.9%
2
 
6.5%
2
 
6.5%
2
 
6.5%
2
 
6.5%
1
 
3.2%
1
 
3.2%
1
 
3.2%
1
 
3.2%
1
 
3.2%
Other values (14) 14
45.2%
Katakana
ValueCountFrequency (%)
1
33.3%
1
33.3%
1
33.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 25811
99.5%
CJK 69
 
0.3%
Hangul 31
 
0.1%
None 22
 
0.1%
Katakana 3
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
- 5132
19.9%
2 4851
18.8%
8 2490
9.6%
0 2228
8.6%
3 1630
 
6.3%
7 1627
 
6.3%
+ 1494
 
5.8%
1 1483
 
5.7%
4 1269
 
4.9%
6 1147
 
4.4%
Other values (40) 2460
9.5%
None
ValueCountFrequency (%)
6
27.3%
4
18.2%
4
18.2%
3
13.6%
­ 2
 
9.1%
2
 
9.1%
1
 
4.5%
CJK
ValueCountFrequency (%)
6
 
8.7%
6
 
8.7%
5
 
7.2%
5
 
7.2%
4
 
5.8%
3
 
4.3%
3
 
4.3%
3
 
4.3%
3
 
4.3%
3
 
4.3%
Other values (21) 28
40.6%
Hangul
ValueCountFrequency (%)
4
 
12.9%
2
 
6.5%
2
 
6.5%
2
 
6.5%
2
 
6.5%
1
 
3.2%
1
 
3.2%
1
 
3.2%
1
 
3.2%
1
 
3.2%
Other values (14) 14
45.2%
Katakana
ValueCountFrequency (%)
1
33.3%
1
33.3%
1
33.3%

팩스번호
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct48
Distinct (%)2.5%
Missing0
Missing (%)0.0%
Memory size15.4 KiB
<NA>
1862 
+82-2-2147-3874
 
4
+82-2-996-0456
 
4
+82-2-957-2569
 
4
+82-2-753-4254
 
4
Other values (43)
 
79

Length

Max length18
Median length4
Mean length4.4772611
Min length4

Unique

Unique25 ?
Unique (%)1.3%

Sample

1st row<NA>
2nd row<NA>
3rd row<NA>
4th row<NA>
5th row<NA>

Common Values

Value Count Frequency (%)
<NA> 1862 95.1%
+82-2-2147-3874 4 0.2%
+82-2-996-0456 4 0.2%
+82-2-957-2569 4 0.2%
+82-2-753-4254 4 0.2%
+82-2-969-9245 4 0.2%
+82-2-788-3664 4 0.2%
+82-2-2660-2488 4 0.2%
+82-2-766-8643 4 0.2%
+82-2-732-9928 4 0.2%
Other values (38) 59 3.0%
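
The common values above list `<NA>` 1862 times (95.1%), while the summary for this variable reports 0 missing cells and a minimum length of 4. One plausible reading is that missing fax numbers are stored as the literal string `<NA>` and are therefore counted as a category rather than as missing. The sketch below is only an illustration of how such a sentinel could be counted as missing; the `SENTINELS` set and the `missing_share` helper are hypothetical names, not part of the report or of ydata-profiling, and the 95 non-missing entries are placeholders.

```python
# Assumption: these strings stand in for missing values in the column.
SENTINELS = {"<NA>", ""}


def missing_share(values, sentinels=SENTINELS):
    """Return the fraction of entries that are None or a sentinel string."""
    values = list(values)
    if not values:
        return 0.0
    missing = sum(1 for v in values if v is None or v.strip() in sentinels)
    return missing / len(values)


# Counts taken from the report: 1862 "<NA>" entries out of 1957 rows.
# The remaining 95 entries are filled with one placeholder number.
fax = ["<NA>"] * 1862 + ["+82-2-2147-3874"] * 95
share = missing_share(fax)
print(f"{share:.1%}")
```

Under that reading, the effective missing rate of the column would match the 95.1% share shown for `<NA>`.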

Length

Histogram of lengths of the category
Value Count Frequency (%)
na 1862 94.8%
9470 5 0.3%
82-2-996-0456 4 0.2%
82-2-957-2569 4 0.2%
82-2-753-4254 4 0.2%
82-2-969-9245 4 0.2%
82-2-788-3664 4 0.2%
82-2-2660-2488 4 0.2%
82-2-766-8643 4 0.2%
82-2-732-9928 4 0.2%
Other values (39) 66 3.4%

웹사이트 (website)
Text

MISSING 

Distinct 491
Distinct (%) 36.6%
Missing 614
Missing (%) 31.4%
Memory size 15.4 KiB

Length

Max length 233
Median length 90
Mean length 35.12137
Min length 14

Characters and Unicode

Total characters 47168
Distinct characters 76
Distinct categories 10
Distinct scripts 3
Distinct blocks 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
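
As a rough illustration of the kind of per-character analysis described here, the following sketch counts Unicode general categories with Python's standard `unicodedata` module. The `category_counts` helper and the two sample URLs (taken from the values listed for this variable) are illustrative only; this is not the code ydata-profiling itself runs.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories over all characters in the given strings."""
    counts = Counter()
    for text in values:
        # unicodedata.category returns two-letter codes, e.g.
        # "Ll" = Lowercase Letter, "Po" = Other Punctuation, "Nd" = Decimal Number.
        counts.update(unicodedata.category(ch) for ch in text)
    return counts


sample = ["https://sema.seoul.go.kr/", "http://sewoon.org"]
counts = category_counts(sample)
for code, n in counts.most_common():
    print(code, n)
```

The two-letter codes correspond to the category names used in the tables of this report (Lowercase Letter, Other Punctuation, and so on).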

Unique

Unique 208
Unique (%) 15.5%

Sample

1st row https://www.hanokmaeul.or.kr/
2nd row https://sema.seoul.go.kr/en/index
3rd row https://sema.seoul.go.kr/en/index
4th row https://lib.seoul.go.kr/rwww/html/ko/seoulArchRoom.jsp
5th row https://sema.seoul.go.kr/
Value Count Frequency (%)
http://www.sta.or.kr 40 3.0%
http://www.mmca.go.kr 9 0.7%
http://www.deoksugung.go.kr 8 0.6%
http://dmvillage.info 7 0.5%
http://www.museum.seoul.kr 6 0.4%
http://sewoon.org 6 0.4%
https://sema.seoul.go.kr/en/index 6 0.4%
http://plaza.seoul.go.kr/gwanghwamun 6 0.4%
http://www.pkmgallery.com 5 0.4%
https://culture.gangseo.seoul.kr 5 0.4%
Other values (478) 1257 92.8%

Most occurring characters

ValueCountFrequency (%)
/ 4193
 
8.9%
. 3695
 
7.8%
t 3667
 
7.8%
o 3074
 
6.5%
w 2696
 
5.7%
e 2388
 
5.1%
r 2157
 
4.6%
s 1971
 
4.2%
a 1923
 
4.1%
h 1899
 
4.0%
Other values (66) 19505
41.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 35014
74.2%
Other Punctuation 9425
 
20.0%
Decimal Number 1303
 
2.8%
Uppercase Letter 712
 
1.5%
Math Symbol 313
 
0.7%
Connector Punctuation 285
 
0.6%
Dash Punctuation 48
 
0.1%
Space Separator 35
 
0.1%
Other Letter 25
 
0.1%
Control 8
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t 3667
 
10.5%
o 3074
 
8.8%
w 2696
 
7.7%
e 2388
 
6.8%
r 2157
 
6.2%
s 1971
 
5.6%
a 1923
 
5.5%
h 1899
 
5.4%
n 1816
 
5.2%
p 1777
 
5.1%
Other values (16) 11646
33.3%
Uppercase Letter
ValueCountFrequency (%)
I 121
17.0%
H 86
12.1%
R 64
9.0%
P 54
 
7.6%
C 53
 
7.4%
V 47
 
6.6%
T 46
 
6.5%
N 42
 
5.9%
G 29
 
4.1%
A 26
 
3.7%
Other values (11) 144
20.2%
Decimal Number
ValueCountFrequency (%)
0 413
31.7%
1 271
20.8%
2 166
12.7%
3 86
 
6.6%
6 84
 
6.4%
5 80
 
6.1%
4 67
 
5.1%
7 52
 
4.0%
9 46
 
3.5%
8 38
 
2.9%
Other Punctuation
ValueCountFrequency (%)
/ 4193
44.5%
. 3695
39.2%
: 1199
 
12.7%
? 183
 
1.9%
& 129
 
1.4%
# 17
 
0.2%
, 6
 
0.1%
; 2
 
< 0.1%
' 1
 
< 0.1%
Other Letter
ValueCountFrequency (%)
5
20.0%
5
20.0%
5
20.0%
5
20.0%
5
20.0%
Math Symbol
ValueCountFrequency (%)
= 313
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 285
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 48
100.0%
Space Separator
ValueCountFrequency (%)
35
100.0%
Control
ValueCountFrequency (%)
8
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 35726
75.7%
Common 11417
 
24.2%
Hangul 25
 
0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
t 3667
 
10.3%
o 3074
 
8.6%
w 2696
 
7.5%
e 2388
 
6.7%
r 2157
 
6.0%
s 1971
 
5.5%
a 1923
 
5.4%
h 1899
 
5.3%
n 1816
 
5.1%
p 1777
 
5.0%
Other values (37) 12358
34.6%
Common
ValueCountFrequency (%)
/ 4193
36.7%
. 3695
32.4%
: 1199
 
10.5%
0 413
 
3.6%
= 313
 
2.7%
_ 285
 
2.5%
1 271
 
2.4%
? 183
 
1.6%
2 166
 
1.5%
& 129
 
1.1%
Other values (14) 570
 
5.0%
Hangul
ValueCountFrequency (%)
5
20.0%
5
20.0%
5
20.0%
5
20.0%
5
20.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 47143
99.9%
Hangul 25
 
0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
/ 4193
 
8.9%
. 3695
 
7.8%
t 3667
 
7.8%
o 3074
 
6.5%
w 2696
 
5.7%
e 2388
 
5.1%
r 2157
 
4.6%
s 1971
 
4.2%
a 1923
 
4.1%
h 1899
 
4.0%
Other values (61) 19480
41.3%
Hangul
ValueCountFrequency (%)
5
20.0%
5
20.0%
5
20.0%
5
20.0%
5
20.0%

운영시간 (operating hours)
Text

MISSING 

Distinct 1247
Distinct (%) 73.4%
Missing 259
Missing (%) 13.2%
Memory size 15.4 KiB

Length

Max length 396
Median length 202
Mean length 34.687279
Min length 2

Characters and Unicode

Total characters 58899
Distinct characters 865
Distinct categories 14
Distinct scripts 6
Distinct blocks 9
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 1107
Unique (%) 65.2%

Sample

1st row Tue-Fri 10:00 - 20:00 / Sat-Sun 10:00 - 22:00
2nd row Summer season (April to October) 09:00 - 21:00 KST Winter season (November to March) 09:00 - 20:00 KST * Traditional garden open 24/7
3rd row 성당사무실 화 ~ 금 | 09:00 ~ 20:30 토 요 일 | 09:00 ~ 20:00 일 요 일 | 09:00 ~ 21:00
4th row Tuesday - Friday 10:00 - 20:00 KST Sat, Holiday 10:00 - 19:00 KST
5th row 週二 - 週五 10:00 - 20:00 週六、公休日 10:00 - 19:00
ValueCountFrequency (%)
1313
 
15.2%
kst 543
 
6.3%
10:00 353
 
4.1%
18:00 289
 
3.3%
09:00 243
 
2.8%
17:00 143
 
1.7%
19:00 114
 
1.3%
daily 101
 
1.2%
10:00~18:00 90
 
1.0%
20:00 77
 
0.9%
Other values (2043) 5385
62.2%

Most occurring characters

ValueCountFrequency (%)
0 11279
19.1%
7347
 
12.5%
: 4995
 
8.5%
1 4192
 
7.1%
~ 1739
 
3.0%
9 1184
 
2.0%
? 1151
 
2.0%
2 1138
 
1.9%
- 1034
 
1.8%
3 1006
 
1.7%
Other values (855) 23834
40.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 20942
35.6%
Other Letter 9691
16.5%
Space Separator 7347
 
12.5%
Other Punctuation 7346
 
12.5%
Lowercase Letter 6197
 
10.5%
Uppercase Letter 2662
 
4.5%
Math Symbol 1791
 
3.0%
Dash Punctuation 1035
 
1.8%
Close Punctuation 944
 
1.6%
Open Punctuation 929
 
1.6%
Other values (4) 15
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
471
 
4.9%
461
 
4.8%
260
 
2.7%
220
 
2.3%
213
 
2.2%
207
 
2.1%
196
 
2.0%
193
 
2.0%
157
 
1.6%
154
 
1.6%
Other values (759) 7159
73.9%
Lowercase Letter
ValueCountFrequency (%)
e 769
12.4%
a 652
10.5%
r 518
 
8.4%
s 459
 
7.4%
t 457
 
7.4%
o 414
 
6.7%
n 399
 
6.4%
i 391
 
6.3%
y 385
 
6.2%
u 291
 
4.7%
Other values (16) 1462
23.6%
Uppercase Letter
ValueCountFrequency (%)
S 707
26.6%
T 640
24.0%
K 585
22.0%
M 128
 
4.8%
D 112
 
4.2%
F 72
 
2.7%
W 64
 
2.4%
L 60
 
2.3%
O 45
 
1.7%
N 41
 
1.5%
Other values (13) 208
 
7.8%
Other Punctuation
ValueCountFrequency (%)
: 4995
68.0%
? 1151
 
15.7%
, 242
 
3.3%
203
 
2.8%
* 184
 
2.5%
/ 114
 
1.6%
93
 
1.3%
90
 
1.2%
78
 
1.1%
. 54
 
0.7%
Other values (5) 142
 
1.9%
Decimal Number
ValueCountFrequency (%)
0 11279
53.9%
1 4192
 
20.0%
9 1184
 
5.7%
2 1138
 
5.4%
3 1006
 
4.8%
8 944
 
4.5%
7 622
 
3.0%
6 269
 
1.3%
4 182
 
0.9%
5 125
 
0.6%
Math Symbol
ValueCountFrequency (%)
~ 1739
97.1%
> 19
 
1.1%
< 19
 
1.1%
| 8
 
0.4%
3
 
0.2%
+ 3
 
0.2%
Close Punctuation
ValueCountFrequency (%)
) 798
84.5%
119
 
12.6%
] 26
 
2.8%
1
 
0.1%
Open Punctuation
ValueCountFrequency (%)
( 789
84.9%
113
 
12.2%
[ 26
 
2.8%
1
 
0.1%
Dash Punctuation
ValueCountFrequency (%)
- 1034
99.9%
1
 
0.1%
Space Separator
ValueCountFrequency (%)
7347
100.0%
Control
ValueCountFrequency (%)
9
100.0%
Format
ValueCountFrequency (%)
­ 2
100.0%
Initial Punctuation
ValueCountFrequency (%)
2
100.0%
Final Punctuation
ValueCountFrequency (%)
2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 40349
68.5%
Latin 8859
 
15.0%
Han 6054
 
10.3%
Hangul 2783
 
4.7%
Hiragana 614
 
1.0%
Katakana 240
 
0.4%

Most frequent character per script

Han
ValueCountFrequency (%)
471
 
7.8%
461
 
7.6%
260
 
4.3%
213
 
3.5%
207
 
3.4%
196
 
3.2%
157
 
2.6%
154
 
2.5%
137
 
2.3%
126
 
2.1%
Other values (412) 3672
60.7%
Hangul
ValueCountFrequency (%)
220
 
7.9%
193
 
6.9%
140
 
5.0%
97
 
3.5%
95
 
3.4%
81
 
2.9%
68
 
2.4%
67
 
2.4%
64
 
2.3%
55
 
2.0%
Other values (242) 1703
61.2%
Katakana
ValueCountFrequency (%)
22
 
9.2%
21
 
8.8%
14
 
5.8%
13
 
5.4%
12
 
5.0%
11
 
4.6%
11
 
4.6%
10
 
4.2%
9
 
3.8%
8
 
3.3%
Other values (42) 109
45.4%
Latin
ValueCountFrequency (%)
e 769
 
8.7%
S 707
 
8.0%
a 652
 
7.4%
T 640
 
7.2%
K 585
 
6.6%
r 518
 
5.8%
s 459
 
5.2%
t 457
 
5.2%
o 414
 
4.7%
n 399
 
4.5%
Other values (39) 3259
36.8%
Common
ValueCountFrequency (%)
0 11279
28.0%
7347
18.2%
: 4995
12.4%
1 4192
 
10.4%
~ 1739
 
4.3%
9 1184
 
2.9%
? 1151
 
2.9%
2 1138
 
2.8%
- 1034
 
2.6%
3 1006
 
2.5%
Other values (37) 5284
13.1%
Hiragana
ValueCountFrequency (%)
88
14.3%
85
13.8%
76
12.4%
48
 
7.8%
29
 
4.7%
23
 
3.7%
21
 
3.4%
20
 
3.3%
19
 
3.1%
17
 
2.8%
Other values (33) 188
30.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 48409
82.2%
CJK 6053
 
10.3%
Hangul 2783
 
4.7%
None 698
 
1.2%
Hiragana 614
 
1.0%
Katakana 240
 
0.4%
Punctuation 98
 
0.2%
Math Operators 3
 
< 0.1%
CJK Compat Ideographs 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 11279
23.3%
7347
15.2%
: 4995
 
10.3%
1 4192
 
8.7%
~ 1739
 
3.6%
9 1184
 
2.4%
? 1151
 
2.4%
2 1138
 
2.4%
- 1034
 
2.1%
3 1006
 
2.1%
Other values (69) 13344
27.6%
CJK
ValueCountFrequency (%)
471
 
7.8%
461
 
7.6%
260
 
4.3%
213
 
3.5%
207
 
3.4%
196
 
3.2%
157
 
2.6%
154
 
2.5%
137
 
2.3%
126
 
2.1%
Other values (411) 3671
60.6%
Hangul
ValueCountFrequency (%)
220
 
7.9%
193
 
6.9%
140
 
5.0%
97
 
3.5%
95
 
3.4%
81
 
2.9%
68
 
2.4%
67
 
2.4%
64
 
2.3%
55
 
2.0%
Other values (242) 1703
61.2%
None
ValueCountFrequency (%)
203
29.1%
119
17.0%
113
16.2%
90
12.9%
78
 
11.2%
38
 
5.4%
33
 
4.7%
· 19
 
2.7%
­ 2
 
0.3%
1
 
0.1%
Other values (2) 2
 
0.3%
Punctuation
ValueCountFrequency (%)
93
94.9%
2
 
2.0%
2
 
2.0%
1
 
1.0%
Hiragana
ValueCountFrequency (%)
88
14.3%
85
13.8%
76
12.4%
48
 
7.8%
29
 
4.7%
23
 
3.7%
21
 
3.4%
20
 
3.3%
19
 
3.1%
17
 
2.8%
Other values (33) 188
30.6%
Katakana
ValueCountFrequency (%)
22
 
9.2%
21
 
8.8%
14
 
5.8%
13
 
5.4%
12
 
5.0%
11
 
4.6%
11
 
4.6%
10
 
4.2%
9
 
3.8%
8
 
3.3%
Other values (42) 109
45.4%
Math Operators
ValueCountFrequency (%)
3
100.0%
CJK Compat Ideographs
ValueCountFrequency (%)
1
100.0%

운영요일 (operating days)
Text

MISSING 

Distinct 225
Distinct (%) 21.7%
Missing 919
Missing (%) 47.0%
Memory size 15.4 KiB

Length

Max length 60
Median length 58
Mean length 5.6868979
Min length 2

Characters and Unicode

Total characters 5903
Distinct characters 249
Distinct categories 11
Distinct scripts 6
Distinct blocks 7
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 137
Unique (%) 13.2%

Sample

1st row Tue-Sun
2nd row Tues - Sun
3rd row 週二 - 週日
4th row Tuesday, Wednesday, Thursday, Friday, Saturday, Sunday
5th row Daily
ValueCountFrequency (%)
화~일 78
 
5.8%
每天 73
 
5.4%
매일 70
 
5.2%
69
 
5.1%
68
 
5.0%
daily 62
 
4.6%
週二~週日 61
 
4.5%
每日 60
 
4.5%
周二~周日 39
 
2.9%
週一~週六 26
 
1.9%
Other values (225) 742
55.0%

Most occurring characters

ValueCountFrequency (%)
475
 
8.0%
~ 426
 
7.2%
346
 
5.9%
323
 
5.5%
258
 
4.4%
220
 
3.7%
a 204
 
3.5%
189
 
3.2%
y 179
 
3.0%
? 178
 
3.0%
Other values (239) 3105
52.6%

Most occurring categories

ValueCountFrequency (%)
Other Letter 3124
52.9%
Lowercase Letter 1245
 
21.1%
Math Symbol 426
 
7.2%
Space Separator 346
 
5.9%
Other Punctuation 290
 
4.9%
Uppercase Letter 250
 
4.2%
Decimal Number 134
 
2.3%
Dash Punctuation 68
 
1.2%
Open Punctuation 9
 
0.2%
Close Punctuation 9
 
0.2%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
475
15.2%
323
 
10.3%
258
 
8.3%
220
 
7.0%
189
 
6.0%
138
 
4.4%
132
 
4.2%
110
 
3.5%
93
 
3.0%
92
 
2.9%
Other values (176) 1094
35.0%
Lowercase Letter
ValueCountFrequency (%)
a 204
16.4%
y 179
14.4%
d 125
10.0%
u 113
9.1%
n 94
7.6%
e 85
6.8%
i 77
 
6.2%
l 70
 
5.6%
r 67
 
5.4%
o 65
 
5.2%
Other values (12) 166
13.3%
Uppercase Letter
ValueCountFrequency (%)
S 68
27.2%
D 60
24.0%
T 45
18.0%
M 37
14.8%
F 11
 
4.4%
W 9
 
3.6%
K 6
 
2.4%
O 6
 
2.4%
E 4
 
1.6%
B 1
 
0.4%
Other values (3) 3
 
1.2%
Other Punctuation
ValueCountFrequency (%)
? 178
61.4%
: 32
 
11.0%
, 31
 
10.7%
28
 
9.7%
6
 
2.1%
5
 
1.7%
3
 
1.0%
3
 
1.0%
* 2
 
0.7%
. 1
 
0.3%
Decimal Number
ValueCountFrequency (%)
0 74
55.2%
1 30
22.4%
2 10
 
7.5%
8 8
 
6.0%
9 4
 
3.0%
4 3
 
2.2%
3 3
 
2.2%
7 1
 
0.7%
5 1
 
0.7%
Open Punctuation
ValueCountFrequency (%)
( 5
55.6%
4
44.4%
Close Punctuation
ValueCountFrequency (%)
) 5
55.6%
4
44.4%
Math Symbol
ValueCountFrequency (%)
~ 426
100.0%
Space Separator
ValueCountFrequency (%)
346
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 68
100.0%
Control
ValueCountFrequency (%)
2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Han 2449
41.5%
Latin 1495
25.3%
Common 1284
21.8%
Hangul 639
 
10.8%
Hiragana 24
 
0.4%
Katakana 12
 
0.2%

Most frequent character per script

Han
ValueCountFrequency (%)
475
19.4%
323
13.2%
258
10.5%
189
 
7.7%
138
 
5.6%
132
 
5.4%
93
 
3.8%
92
 
3.8%
91
 
3.7%
87
 
3.6%
Other values (103) 571
23.3%
Hangul
ValueCountFrequency (%)
220
34.4%
110
17.2%
74
 
11.6%
70
 
11.0%
50
 
7.8%
22
 
3.4%
22
 
3.4%
12
 
1.9%
12
 
1.9%
3
 
0.5%
Other values (40) 44
 
6.9%
Latin
ValueCountFrequency (%)
a 204
13.6%
y 179
12.0%
d 125
 
8.4%
u 113
 
7.6%
n 94
 
6.3%
e 85
 
5.7%
i 77
 
5.2%
l 70
 
4.7%
S 68
 
4.5%
r 67
 
4.5%
Other values (25) 413
27.6%
Common
ValueCountFrequency (%)
~ 426
33.2%
346
26.9%
? 178
13.9%
0 74
 
5.8%
- 68
 
5.3%
: 32
 
2.5%
, 31
 
2.4%
1 30
 
2.3%
28
 
2.2%
2 10
 
0.8%
Other values (18) 61
 
4.8%
Hiragana
ValueCountFrequency (%)
4
16.7%
2
 
8.3%
2
 
8.3%
2
 
8.3%
2
 
8.3%
1
 
4.2%
1
 
4.2%
1
 
4.2%
1
 
4.2%
1
 
4.2%
Other values (7) 7
29.2%
Katakana
ValueCountFrequency (%)
2
16.7%
2
16.7%
2
16.7%
2
16.7%
2
16.7%
2
16.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2725
46.2%
CJK 2449
41.5%
Hangul 639
 
10.8%
None 51
 
0.9%
Hiragana 24
 
0.4%
Katakana 12
 
0.2%
Punctuation 3
 
0.1%

Most frequent character per block

CJK
ValueCountFrequency (%)
475
19.4%
323
13.2%
258
10.5%
189
 
7.7%
138
 
5.6%
132
 
5.4%
93
 
3.8%
92
 
3.8%
91
 
3.7%
87
 
3.6%
Other values (103) 571
23.3%
ASCII
ValueCountFrequency (%)
~ 426
15.6%
346
 
12.7%
a 204
 
7.5%
y 179
 
6.6%
? 178
 
6.5%
d 125
 
4.6%
u 113
 
4.1%
n 94
 
3.4%
e 85
 
3.1%
i 77
 
2.8%
Other values (45) 898
33.0%
Hangul
ValueCountFrequency (%)
220
34.4%
110
17.2%
74
 
11.6%
70
 
11.0%
50
 
7.8%
22
 
3.4%
22
 
3.4%
12
 
1.9%
12
 
1.9%
3
 
0.5%
Other values (40) 44
 
6.9%
None
ValueCountFrequency (%)
28
54.9%
6
 
11.8%
5
 
9.8%
4
 
7.8%
4
 
7.8%
3
 
5.9%
· 1
 
2.0%
Hiragana
ValueCountFrequency (%)
4
16.7%
2
 
8.3%
2
 
8.3%
2
 
8.3%
2
 
8.3%
1
 
4.2%
1
 
4.2%
1
 
4.2%
1
 
4.2%
1
 
4.2%
Other values (7) 7
29.2%
Punctuation
ValueCountFrequency (%)
3
100.0%
Katakana
ValueCountFrequency (%)
2
16.7%
2
16.7%
2
16.7%
2
16.7%
2
16.7%
2
16.7%

휴무일 (closed days)
Text

MISSING 

Distinct 666
Distinct (%) 47.5%
Missing 555
Missing (%) 28.4%
Memory size 15.4 KiB

Length

Max length 258
Median length 124
Mean length 13.586305
Min length 1

Characters and Unicode

Total characters 19048
Distinct characters 528
Distinct categories 12
Distinct scripts 6
Distinct blocks 7
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 549
Unique (%) 39.2%

Sample

1st row Mondays
2nd row Closed Mondays
3rd row 설날, 추석 당일 (성당사무실: 월요일 휴무)
4th row Mondays
5th row 週一
ValueCountFrequency (%)
closed 142
 
4.4%
118
 
3.7%
mondays 102
 
3.2%
월요일 95
 
3.0%
new 81
 
2.5%
なし 59
 
1.8%
56
 
1.8%
54
 
1.7%
없음 52
 
1.6%
day 52
 
1.6%
Other values (778) 2385
74.6%

Most occurring characters

ValueCountFrequency (%)
1885
 
9.9%
819
 
4.3%
? 672
 
3.5%
a 636
 
3.3%
e 584
 
3.1%
s 551
 
2.9%
o 511
 
2.7%
d 503
 
2.6%
503
 
2.6%
n 460
 
2.4%
Other values (518) 11924
62.6%

Most occurring categories

ValueCountFrequency (%)
Other Letter 8073
42.4%
Lowercase Letter 5206
27.3%
Space Separator 1885
 
9.9%
Other Punctuation 1821
 
9.6%
Uppercase Letter 769
 
4.0%
Decimal Number 750
 
3.9%
Close Punctuation 196
 
1.0%
Open Punctuation 194
 
1.0%
Dash Punctuation 143
 
0.8%
Math Symbol 6
 
< 0.1%
Other values (2) 5
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
819
 
10.1%
422
 
5.2%
399
 
4.9%
293
 
3.6%
288
 
3.6%
224
 
2.8%
214
 
2.7%
210
 
2.6%
208
 
2.6%
192
 
2.4%
Other values (438) 4804
59.5%
Lowercase Letter
ValueCountFrequency (%)
a 636
12.2%
e 584
11.2%
s 551
10.6%
o 511
9.8%
d 503
9.7%
n 460
8.8%
y 369
7.1%
l 289
 
5.6%
r 256
 
4.9%
u 240
 
4.6%
Other values (13) 807
15.5%
Uppercase Letter
ValueCountFrequency (%)
C 191
24.8%
M 138
17.9%
N 87
11.3%
Y 81
10.5%
S 62
 
8.1%
L 48
 
6.2%
D 46
 
6.0%
O 21
 
2.7%
J 19
 
2.5%
H 18
 
2.3%
Other values (12) 58
 
7.5%
Other Punctuation
ValueCountFrequency (%)
? 672
36.9%
503
27.6%
, 305
16.7%
& 81
 
4.4%
77
 
4.2%
' 44
 
2.4%
. 30
 
1.6%
27
 
1.5%
/ 21
 
1.2%
: 16
 
0.9%
Other values (5) 45
 
2.5%
Decimal Number
ValueCountFrequency (%)
1 458
61.1%
5 82
 
10.9%
8 63
 
8.4%
3 51
 
6.8%
2 42
 
5.6%
0 27
 
3.6%
4 14
 
1.9%
6 7
 
0.9%
7 2
 
0.3%
9 2
 
0.3%
Close Punctuation
ValueCountFrequency (%)
) 182
92.9%
14
 
7.1%
Open Punctuation
ValueCountFrequency (%)
( 180
92.8%
14
 
7.2%
Space Separator
ValueCountFrequency (%)
1885
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 143
100.0%
Math Symbol
ValueCountFrequency (%)
~ 6
100.0%
Final Punctuation
ValueCountFrequency (%)
4
100.0%
Control
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 5975
31.4%
Han 5454
28.6%
Common 5000
26.2%
Hangul 2216
 
11.6%
Hiragana 370
 
1.9%
Katakana 33
 
0.2%

Most frequent character per script

Han
ValueCountFrequency (%)
819
 
15.0%
399
 
7.3%
293
 
5.4%
288
 
5.3%
224
 
4.1%
214
 
3.9%
210
 
3.9%
192
 
3.5%
162
 
3.0%
124
 
2.3%
Other values (231) 2529
46.4%
Hangul
ValueCountFrequency (%)
422
19.0%
208
 
9.4%
190
 
8.6%
160
 
7.2%
64
 
2.9%
59
 
2.7%
59
 
2.7%
56
 
2.5%
55
 
2.5%
52
 
2.3%
Other values (154) 891
40.2%
Latin
ValueCountFrequency (%)
a 636
10.6%
e 584
 
9.8%
s 551
 
9.2%
o 511
 
8.6%
d 503
 
8.4%
n 460
 
7.7%
y 369
 
6.2%
l 289
 
4.8%
r 256
 
4.3%
u 240
 
4.0%
Other values (35) 1576
26.4%
Common
ValueCountFrequency (%)
1885
37.7%
? 672
 
13.4%
503
 
10.1%
1 458
 
9.2%
, 305
 
6.1%
) 182
 
3.6%
( 180
 
3.6%
- 143
 
2.9%
5 82
 
1.6%
& 81
 
1.6%
Other values (25) 509
 
10.2%
Hiragana
ValueCountFrequency (%)
68
18.4%
64
17.3%
63
17.0%
36
9.7%
31
8.4%
21
 
5.7%
15
 
4.1%
13
 
3.5%
8
 
2.2%
6
 
1.6%
Other values (19) 45
12.2%
Katakana
ValueCountFrequency (%)
4
12.1%
4
12.1%
3
9.1%
3
9.1%
3
9.1%
3
9.1%
3
9.1%
2
6.1%
2
6.1%
2
6.1%
Other values (4) 4
12.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 10304
54.1%
CJK 5454
28.6%
Hangul 2216
 
11.6%
None 640
 
3.4%
Hiragana 370
 
1.9%
Katakana 33
 
0.2%
Punctuation 31
 
0.2%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1885
18.3%
? 672
 
6.5%
a 636
 
6.2%
e 584
 
5.7%
s 551
 
5.3%
o 511
 
5.0%
d 503
 
4.9%
n 460
 
4.5%
1 458
 
4.4%
y 369
 
3.6%
Other values (59) 3675
35.7%
CJK
ValueCountFrequency (%)
819
 
15.0%
399
 
7.3%
293
 
5.4%
288
 
5.3%
224
 
4.1%
214
 
3.9%
210
 
3.9%
192
 
3.5%
162
 
3.0%
124
 
2.3%
Other values (231) 2529
46.4%
None
ValueCountFrequency (%)
503
78.6%
77
 
12.0%
14
 
2.2%
14
 
2.2%
11
 
1.7%
11
 
1.7%
· 7
 
1.1%
2
 
0.3%
1
 
0.2%
Hangul
ValueCountFrequency (%)
422
19.0%
208
 
9.4%
190
 
8.6%
160
 
7.2%
64
 
2.9%
59
 
2.7%
59
 
2.7%
56
 
2.5%
55
 
2.5%
52
 
2.3%
Other values (154) 891
40.2%
Hiragana
ValueCountFrequency (%)
68
18.4%
64
17.3%
63
17.0%
36
9.7%
31
8.4%
21
 
5.7%
15
 
4.1%
13
 
3.5%
8
 
2.2%
6
 
1.6%
Other values (19) 45
12.2%
Punctuation
ValueCountFrequency (%)
27
87.1%
4
 
12.9%
Katakana
ValueCountFrequency (%)
4
12.1%
4
12.1%
3
9.1%
3
9.1%
3
9.1%
3
9.1%
3
9.1%
2
6.1%
2
6.1%
2
6.1%
Other values (4) 4
12.1%

교통정보 (transportation information)
Text

MISSING 

Distinct 1805
Distinct (%) 96.3%
Missing 82
Missing (%) 4.2%
Memory size 15.4 KiB

Length

Max length 402
Median length 184
Mean length 38.439467
Min length 7

Characters and Unicode

Total characters 72074
Distinct characters 968
Distinct categories 13
Distinct scripts 6
Distinct blocks 8
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 1754
Unique (%) 93.5%

Sample

1st row Subway Line 3, Anguk Station, Exit 3
2nd row Subway Line 6, Changsin Station, Exit 1
3rd row Subway Line 6, Changsin Station, Exit 1
4th row Subway Line 3, Anguk Station, Exit 3
5th row Subway Line 3, Anguk Station, Exit 3
ValueCountFrequency (%)
station 478
 
4.7%
line 443
 
4.4%
exit 429
 
4.2%
subway 362
 
3.6%
출구 323
 
3.2%
3 287
 
2.8%
258
 
2.5%
on 224
 
2.2%
2 209
 
2.1%
1 207
 
2.0%
Other values (2594) 6945
68.3%

Most occurring characters

ValueCountFrequency (%)
9209
 
12.8%
? 4604
 
6.4%
n 2174
 
3.0%
1 2139
 
3.0%
t 1979
 
2.7%
i 1925
 
2.7%
o 1827
 
2.5%
3 1580
 
2.2%
1562
 
2.2%
1509
 
2.1%
Other values (958) 43566
60.4%

Most occurring categories

ValueCountFrequency (%)
Other Letter 24584
34.1%
Lowercase Letter 17338
24.1%
Decimal Number 10042
13.9%
Space Separator 9209
 
12.8%
Other Punctuation 6504
 
9.0%
Uppercase Letter 2654
 
3.7%
Open Punctuation 727
 
1.0%
Close Punctuation 726
 
1.0%
Dash Punctuation 183
 
0.3%
Math Symbol 89
 
0.1%
Other values (3) 18
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
1562
 
6.4%
1509
 
6.1%
1088
 
4.4%
968
 
3.9%
967
 
3.9%
637
 
2.6%
597
 
2.4%
533
 
2.2%
532
 
2.2%
502
 
2.0%
Other values (861) 15689
63.8%
Lowercase Letter
ValueCountFrequency (%)
n 2174
12.5%
t 1979
11.4%
i 1925
11.1%
o 1827
10.5%
a 1441
 
8.3%
e 1286
 
7.4%
m 930
 
5.4%
u 785
 
4.5%
g 608
 
3.5%
s 591
 
3.4%
Other values (16) 3792
21.9%
Uppercase Letter
ValueCountFrequency (%)
S 923
34.8%
L 470
17.7%
E 468
17.6%
G 118
 
4.4%
A 98
 
3.7%
H 92
 
3.5%
J 71
 
2.7%
C 66
 
2.5%
T 52
 
2.0%
U 39
 
1.5%
Other values (14) 257
 
9.7%
Other Punctuation
ValueCountFrequency (%)
? 4604
70.8%
, 897
 
13.8%
340
 
5.2%
/ 144
 
2.2%
* 141
 
2.2%
130
 
2.0%
& 89
 
1.4%
· 58
 
0.9%
. 58
 
0.9%
' 19
 
0.3%
Other values (4) 24
 
0.4%
Decimal Number
ValueCountFrequency (%)
1 2139
21.3%
3 1580
15.7%
2 1476
14.7%
5 1252
12.5%
4 959
9.5%
0 940
9.4%
6 634
 
6.3%
7 553
 
5.5%
8 262
 
2.6%
9 243
 
2.4%
Other values (4) 4
 
< 0.1%
Close Punctuation
ValueCountFrequency (%)
) 371
51.1%
340
46.8%
11
 
1.5%
] 4
 
0.6%
Open Punctuation
ValueCountFrequency (%)
( 371
51.0%
341
46.9%
11
 
1.5%
[ 4
 
0.6%
Math Symbol
ValueCountFrequency (%)
64
71.9%
~ 16
 
18.0%
> 5
 
5.6%
< 4
 
4.5%
Final Punctuation
ValueCountFrequency (%)
4
57.1%
3
42.9%
Initial Punctuation
ValueCountFrequency (%)
4
57.1%
3
42.9%
Space Separator
ValueCountFrequency (%)
9209
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 183
100.0%
Control
ValueCountFrequency (%)
4
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 27498
38.2%
Latin 19992
27.7%
Han 14513
20.1%
Hangul 6242
 
8.7%
Katakana 3024
 
4.2%
Hiragana 805
 
1.1%

Most frequent character per script

Han
ValueCountFrequency (%)
1562
 
10.8%
1509
 
10.4%
1088
 
7.5%
968
 
6.7%
967
 
6.7%
597
 
4.1%
473
 
3.3%
445
 
3.1%
389
 
2.7%
295
 
2.0%
Other values (491) 6220
42.9%
Hangul
ValueCountFrequency (%)
533
 
8.5%
532
 
8.5%
502
 
8.0%
484
 
7.8%
464
 
7.4%
369
 
5.9%
187
 
3.0%
156
 
2.5%
154
 
2.5%
149
 
2.4%
Other values (251) 2712
43.4%
Katakana
ValueCountFrequency (%)
637
21.1%
177
 
5.9%
172
 
5.7%
144
 
4.8%
121
 
4.0%
110
 
3.6%
90
 
3.0%
85
 
2.8%
80
 
2.6%
77
 
2.5%
Other values (63) 1331
44.0%
Latin
ValueCountFrequency (%)
n 2174
 
10.9%
t 1979
 
9.9%
i 1925
 
9.6%
o 1827
 
9.1%
a 1441
 
7.2%
e 1286
 
6.4%
m 930
 
4.7%
S 923
 
4.6%
u 785
 
3.9%
g 608
 
3.0%
Other values (40) 6114
30.6%
Common
ValueCountFrequency (%)
9209
33.5%
? 4604
16.7%
1 2139
 
7.8%
3 1580
 
5.7%
2 1476
 
5.4%
5 1252
 
4.6%
4 959
 
3.5%
0 940
 
3.4%
, 897
 
3.3%
6 634
 
2.3%
Other values (37) 3808
13.8%
Hiragana
ValueCountFrequency (%)
230
28.6%
227
28.2%
52
 
6.5%
50
 
6.2%
32
 
4.0%
29
 
3.6%
29
 
3.6%
19
 
2.4%
18
 
2.2%
16
 
2.0%
Other values (26) 103
12.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 46156
64.0%
CJK 14513
 
20.1%
Hangul 6242
 
8.7%
Katakana 3024
 
4.2%
None 1255
 
1.7%
Hiragana 805
 
1.1%
Arrows 64
 
0.1%
Punctuation 15
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
9209
20.0%
? 4604
 
10.0%
n 2174
 
4.7%
1 2139
 
4.6%
t 1979
 
4.3%
i 1925
 
4.2%
o 1827
 
4.0%
3 1580
 
3.4%
2 1476
 
3.2%
a 1441
 
3.1%
Other values (68) 17802
38.6%
CJK
ValueCountFrequency (%)
1562
 
10.8%
1509
 
10.4%
1088
 
7.5%
968
 
6.7%
967
 
6.7%
597
 
4.1%
473
 
3.3%
445
 
3.1%
389
 
2.7%
295
 
2.0%
Other values (491) 6220
42.9%
Katakana
ValueCountFrequency (%)
637
21.1%
177
 
5.9%
172
 
5.7%
144
 
4.8%
121
 
4.0%
110
 
3.6%
90
 
3.0%
85
 
2.8%
80
 
2.6%
77
 
2.5%
Other values (63) 1331
44.0%
Hangul
ValueCountFrequency (%)
533
 
8.5%
532
 
8.5%
502
 
8.0%
484
 
7.8%
464
 
7.4%
369
 
5.9%
187
 
3.0%
156
 
2.5%
154
 
2.5%
149
 
2.4%
Other values (251) 2712
43.4%
None
ValueCountFrequency (%)
341
27.2%
340
27.1%
340
27.1%
130
 
10.4%
· 58
 
4.6%
18
 
1.4%
11
 
0.9%
11
 
0.9%
2
 
0.2%
1
 
0.1%
Other values (3) 3
 
0.2%
Hiragana
ValueCountFrequency (%)
230
28.6%
227
28.2%
52
 
6.5%
50
 
6.2%
32
 
4.0%
29
 
3.6%
29
 
3.6%
19
 
2.4%
18
 
2.2%
16
 
2.0%
Other values (26) 103
12.8%
Arrows
ValueCountFrequency (%)
64
100.0%
Punctuation
ValueCountFrequency (%)
4
26.7%
4
26.7%
3
20.0%
3
20.0%
1
 
6.7%

태그 (tags)
Text

Distinct 1954
Distinct (%) 99.8%
Missing 0
Missing (%) 0.0%
Memory size 15.4 KiB

Length

Max length 243
Median length 174
Mean length 56.226367
Min length 3

Characters and Unicode

Total characters 110035
Distinct characters 1938
Distinct categories 11
Distinct scripts 6
Distinct blocks 8
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 1951
Unique (%) 99.7%

Sample

1st row Jongro,BaekyangLaundry
2nd row observatory, Changsin-dong, cafe,Quarry observatory
3rd row Changsin-dong Cliff Village, modern history of Korea, Changsin-dong,places to visit in Seoul
4th row ChoongAngHighSchool, Jongno, History, WinterSonata
5th row ChoongAngStore, Jongno,Hallyu
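
The sample rows suggest that each cell holds a comma-separated list of tags with inconsistent spacing after the commas. The word table that follows appears to split on whitespace instead, which can cut multi-word tags apart. A minimal sketch of comma-based splitting, assuming the comma is the only delimiter, is given below; `split_tags` is a hypothetical helper, not part of the report or of ydata-profiling.

```python
from collections import Counter


def split_tags(cell):
    """Split one comma-separated tag cell into trimmed, non-empty tags."""
    return [tag.strip() for tag in cell.split(",") if tag.strip()]


# Three of the sample rows shown for this variable.
rows = [
    "Jongro,BaekyangLaundry",
    "observatory, Changsin-dong, cafe,Quarry observatory",
    "ChoongAngStore, Jongno,Hallyu",
]
tag_counts = Counter(tag for row in rows for tag in split_tags(row))
print(tag_counts.most_common(3))
```

Splitting this way keeps a multi-word tag such as "Quarry observatory" as one unit.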
ValueCountFrequency (%)
204
 
2.9%
老店 131
 
1.9%
oraegage 58
 
0.8%
オレガゲ(老 51
 
0.7%
오래가게 47
 
0.7%
information 38
 
0.5%
전시 31
 
0.4%
박물관 26
 
0.4%
광화문 22
 
0.3%
station 22
 
0.3%
Other values (4697) 6408
91.0%

Most occurring characters

ValueCountFrequency (%)
, 13895
 
12.6%
? 5980
 
5.4%
5266
 
4.8%
e 3697
 
3.4%
o 3260
 
3.0%
a 3175
 
2.9%
n 3096
 
2.8%
i 2579
 
2.3%
t 2418
 
2.2%
r 2271
 
2.1%
Other values (1928) 64398
58.5%

Most occurring categories

ValueCountFrequency (%)
Other Letter 44121
40.1%
Lowercase Letter 33143
30.1%
Other Punctuation 20173
18.3%
Uppercase Letter 6499
 
5.9%
Space Separator 5266
 
4.8%
Decimal Number 368
 
0.3%
Open Punctuation 169
 
0.2%
Close Punctuation 169
 
0.2%
Dash Punctuation 123
 
0.1%
Final Punctuation 3
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
684
 
1.6%
594
 
1.3%
536
 
1.2%
534
 
1.2%
486
 
1.1%
471
 
1.1%
453
 
1.0%
421
 
1.0%
416
 
0.9%
411
 
0.9%
Other values (1846) 39115
88.7%
Lowercase Letter
ValueCountFrequency (%)
e 3697
11.2%
o 3260
9.8%
a 3175
9.6%
n 3096
9.3%
i 2579
 
7.8%
t 2418
 
7.3%
r 2271
 
6.9%
u 2102
 
6.3%
l 1627
 
4.9%
g 1550
 
4.7%
Other values (16) 7368
22.2%
Uppercase Letter
ValueCountFrequency (%)
S 1099
16.9%
C 561
 
8.6%
M 465
 
7.2%
H 459
 
7.1%
G 442
 
6.8%
A 391
 
6.0%
T 376
 
5.8%
P 301
 
4.6%
O 270
 
4.2%
D 265
 
4.1%
Other values (16) 1870
28.8%
Other Punctuation
ValueCountFrequency (%)
, 13895
68.9%
? 5980
29.6%
· 96
 
0.5%
81
 
0.4%
# 39
 
0.2%
28
 
0.1%
. 27
 
0.1%
' 13
 
0.1%
& 10
 
< 0.1%
/ 3
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
3 80
21.7%
1 60
16.3%
0 44
12.0%
6 36
9.8%
4 33
9.0%
2 31
 
8.4%
8 28
 
7.6%
7 23
 
6.2%
9 19
 
5.2%
5 14
 
3.8%
Open Punctuation
ValueCountFrequency (%)
( 153
90.5%
16
 
9.5%
Close Punctuation
ValueCountFrequency (%)
) 153
90.5%
16
 
9.5%
Final Punctuation
ValueCountFrequency (%)
2
66.7%
1
33.3%
Space Separator
ValueCountFrequency (%)
5266
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 123
100.0%
Initial Punctuation
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 39642
36.0%
Common 26272
23.9%
Han 24937
22.7%
Hangul 12833
 
11.7%
Katakana 5880
 
5.3%
Hiragana 471
 
0.4%

Most frequent character per script

Han
ValueCountFrequency (%)
684
 
2.7%
594
 
2.4%
536
 
2.1%
471
 
1.9%
453
 
1.8%
421
 
1.7%
411
 
1.6%
405
 
1.6%
377
 
1.5%
353
 
1.4%
Other values (1152) 20232
81.1%
Hangul
ValueCountFrequency (%)
416
 
3.2%
392
 
3.1%
333
 
2.6%
317
 
2.5%
313
 
2.4%
251
 
2.0%
247
 
1.9%
220
 
1.7%
220
 
1.7%
218
 
1.7%
Other values (554) 9906
77.2%
Katakana
ValueCountFrequency (%)
534
 
9.1%
486
 
8.3%
406
 
6.9%
401
 
6.8%
200
 
3.4%
192
 
3.3%
162
 
2.8%
159
 
2.7%
150
 
2.6%
148
 
2.5%
Other values (69) 3042
51.7%
Latin
ValueCountFrequency (%)
e 3697
 
9.3%
o 3260
 
8.2%
a 3175
 
8.0%
n 3096
 
7.8%
i 2579
 
6.5%
t 2418
 
6.1%
r 2271
 
5.7%
u 2102
 
5.3%
l 1627
 
4.1%
g 1550
 
3.9%
Other values (42) 13867
35.0%
Hiragana
ValueCountFrequency (%)
104
22.1%
46
 
9.8%
30
 
6.4%
29
 
6.2%
26
 
5.5%
19
 
4.0%
19
 
4.0%
18
 
3.8%
13
 
2.8%
12
 
2.5%
Other values (41) 155
32.9%
Common
ValueCountFrequency (%)
, 13895
52.9%
? 5980
22.8%
5266
 
20.0%
( 153
 
0.6%
) 153
 
0.6%
- 123
 
0.5%
· 96
 
0.4%
81
 
0.3%
3 80
 
0.3%
1 60
 
0.2%
Other values (20) 385
 
1.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 65673
59.7%
CJK 24932
 
22.7%
Hangul 12833
 
11.7%
Katakana 5880
 
5.3%
Hiragana 471
 
0.4%
None 237
 
0.2%
CJK Compat Ideographs 5
 
< 0.1%
Punctuation 4
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
, 13895
21.2%
? 5980
 
9.1%
5266
 
8.0%
e 3697
 
5.6%
o 3260
 
5.0%
a 3175
 
4.8%
n 3096
 
4.7%
i 2579
 
3.9%
t 2418
 
3.7%
r 2271
 
3.5%
Other values (64) 20036
30.5%
CJK
ValueCountFrequency (%)
684
 
2.7%
594
 
2.4%
536
 
2.1%
471
 
1.9%
453
 
1.8%
421
 
1.7%
411
 
1.6%
405
 
1.6%
377
 
1.5%
353
 
1.4%
Other values (1148) 20227
81.1%
Katakana
ValueCountFrequency (%)
534
 
9.1%
486
 
8.3%
406
 
6.9%
401
 
6.8%
200
 
3.4%
192
 
3.3%
162
 
2.8%
159
 
2.7%
150
 
2.6%
148
 
2.5%
Other values (69) 3042
51.7%
Hangul
ValueCountFrequency (%)
416
 
3.2%
392
 
3.1%
333
 
2.6%
317
 
2.5%
313
 
2.4%
251
 
2.0%
247
 
1.9%
220
 
1.7%
220
 
1.7%
218
 
1.7%
Other values (554) 9906
77.2%
Hiragana
ValueCountFrequency (%)
104
22.1%
46
 
9.8%
30
 
6.4%
29
 
6.2%
26
 
5.5%
19
 
4.0%
19
 
4.0%
18
 
3.8%
13
 
2.8%
12
 
2.5%
Other values (41) 155
32.9%
None
ValueCountFrequency (%)
· 96
40.5%
81
34.2%
28
 
11.8%
16
 
6.8%
16
 
6.8%
Punctuation
ValueCountFrequency (%)
2
50.0%
1
25.0%
1
25.0%
CJK Compat Ideographs
ValueCountFrequency (%)
2
40.0%
1
20.0%
1
20.0%
1
20.0%

장애인편의시설 (accessibility facilities)
Text

MISSING 

Distinct162
Distinct (%)73.6%
Missing1737
Missing (%)88.8%
Memory size15.4 KiB
2024-05-18T13:37:55.374236image/svg+xmlMatplotlib v3.7.2, https://matplotlib.org/

Length

Max length137
Median length61
Mean length44.681818
Min length4

Characters and Unicode

Total characters      9830
Distinct characters   125
Distinct categories   7
Distinct scripts      6
Distinct blocks       5
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
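
As a rough illustration of how such per-category counts can be produced, the sketch below uses Python's standard `unicodedata` module. The helper name `category_counts` and the name mapping are assumptions made for this example; this is not necessarily how ydata-profiling computes its figures, and scripts and blocks are not covered because the standard library does not expose them.

```python
import unicodedata
from collections import Counter

# Long names for the general-category codes that appear in this report.
# Any other code is reported under its two-letter abbreviation.
CATEGORY_NAMES = {
    "Lu": "Uppercase Letter",
    "Ll": "Lowercase Letter",
    "Lo": "Other Letter",
    "Po": "Other Punctuation",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Zs": "Space Separator",
}


def category_counts(values):
    """Count characters per Unicode general category, skipping missing values."""
    counts = Counter()
    for value in values:
        if value is None:
            continue
        for ch in value:
            code = unicodedata.category(ch)
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts


# One of the English facility strings from the sample rows, shortened.
counts = category_counts(["Accessible Restrooms,Accessible Pathways"])
for name, n in counts.most_common():
    print(name, n)
```

Hangul syllables, kana and Han ideographs all fall under "Other Letter", which is why that category dominates a multilingual column like this one.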

Unique

Unique          121
Unique (%)      55.0%
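
The percentages reported for this variable are consistent with "distinct" and "unique" being measured against the 220 non-missing rows (1957 - 1737), and "missing" against all 1957 rows. The sketch below shows that reading; `text_summary` is a hypothetical helper written for this example, not a ydata-profiling function, and it assumes at least one non-missing value.

```python
from collections import Counter


def text_summary(values):
    """Summarise a text column; None marks a missing cell."""
    present = [v for v in values if v is not None]
    freq = Counter(present)
    n_total = len(values)
    n_present = len(present)
    distinct = len(freq)                                # different values seen
    unique = sum(1 for c in freq.values() if c == 1)    # values seen exactly once
    return {
        "missing": n_total - n_present,
        "missing_pct": 100 * (n_total - n_present) / n_total,
        "distinct": distinct,
        "distinct_pct": 100 * distinct / n_present,
        "unique": unique,
        "unique_pct": 100 * unique / n_present,
        "mean_length": sum(len(v) for v in present) / n_present,
    }


# Cross-check of the figures reported above for this variable.
n_present = 1957 - 1737                      # 220 non-missing rows
print(round(100 * 1737 / 1957, 1))           # 88.8  (Missing %)
print(round(100 * 162 / n_present, 1))       # 73.6  (Distinct %)
print(round(100 * 121 / n_present, 1))       # 55.0  (Unique %)
print(round(9830 / n_present, 6))            # 44.681818  (Mean length)
```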

Sample

1st row   Accessible Restrooms,Accessible Pathways,Elevators,Accessible Information Centers & Services(Wheelchair rentals, etc.)
2nd row   Accessible Restrooms,Accessible Pathways,Accessible Information Centers & Services(Wheelchair rentals, etc.),Elevators
3rd row   Accessible Restrooms,Accessible Pathways
4th row   エレベ?タ?,バリアフリ?トイレ,アクセシビリティ,?の不自由な方のための案?所(車椅子レンタルなど)
5th row   Accessible Restrooms,Accessible Information Centers & Services(Wheelchair rentals, etc.)
ValueCountFrequency (%)
accessible 41
 
6.0%
안내(휠체어 33
 
4.9%
대여 33
 
4.9%
33
 
4.9%
전용 33
 
4.9%
restrooms,accessible 31
 
4.6%
services(wheelchair 28
 
4.1%
centers 28
 
4.1%
information 28
 
4.1%
28
 
4.1%
Other values (145) 363
53.5%

Most occurring characters

Value               Count  Frequency (%)
,                   582    5.9%
e                   524    5.3%
?                   491    5.0%
(space)             459    4.7%
s                   451    4.6%
c                   326    3.3%
i                   231    2.3%
r                   224    2.3%
t                   200    2.0%
l                   198    2.0%
Other values (115)  6144   62.5%

Most occurring categories

Value              Count  Frequency (%)
Other Letter       4496   45.7%
Lowercase Letter   3095   31.5%
Other Punctuation  1129   11.5%
Space Separator    459    4.7%
Uppercase Letter   347    3.5%
Open Punctuation   152    1.5%
Close Punctuation  152    1.5%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
189
 
4.2%
150
 
3.3%
139
 
3.1%
131
 
2.9%
131
 
2.9%
111
 
2.5%
111
 
2.5%
104
 
2.3%
103
 
2.3%
103
 
2.3%
Other values (81) 3224
71.7%
Lowercase Letter
Value             Count  Frequency (%)
e                 524    16.9%
s                 451    14.6%
c                 326    10.5%
i                 231    7.5%
r                 224    7.2%
t                 200    6.5%
l                 198    6.4%
a                 191    6.2%
o                 151    4.9%
n                 138    4.5%
Other values (9)  461    14.9%
Uppercase Letter
Value  Count  Frequency (%)
A      121    34.9%
P      56     16.1%
R      37     10.7%
W      28     8.1%
S      28     8.1%
C      28     8.1%
I      28     8.1%
E      21     6.1%
Other Punctuation
Value  Count  Frequency (%)
,      582    51.6%
?      491    43.5%
&      28     2.5%
.      28     2.5%
Space Separator
Value    Count  Frequency (%)
(space)  459    100.0%
Open Punctuation
Value  Count  Frequency (%)
(      152    100.0%
Close Punctuation
Value  Count  Frequency (%)
)      152    100.0%

Most occurring scripts

Value     Count  Frequency (%)
Latin     3442   35.0%
Han       2038   20.7%
Common    1892   19.2%
Hangul    1370   13.9%
Katakana  732    7.4%
Hiragana  356    3.6%

Most frequent character per script

Han
ValueCountFrequency (%)
139
 
6.8%
131
 
6.4%
131
 
6.4%
103
 
5.1%
91
 
4.5%
77
 
3.8%
72
 
3.5%
72
 
3.5%
68
 
3.3%
63
 
3.1%
Other values (29) 1091
53.5%
Hangul
ValueCountFrequency (%)
189
 
13.8%
111
 
8.1%
111
 
8.1%
103
 
7.5%
45
 
3.3%
45
 
3.3%
41
 
3.0%
41
 
3.0%
41
 
3.0%
41
 
3.0%
Other values (19) 602
43.9%
Latin
Value              Count  Frequency (%)
e                  524    15.2%
s                  451    13.1%
c                  326    9.5%
i                  231    6.7%
r                  224    6.5%
t                  200    5.8%
l                  198    5.8%
a                  191    5.5%
o                  151    4.4%
n                  138    4.0%
Other values (17)  808    23.5%
Katakana
ValueCountFrequency (%)
104
14.2%
83
 
11.3%
68
 
9.3%
47
 
6.4%
36
 
4.9%
36
 
4.9%
36
 
4.9%
36
 
4.9%
32
 
4.4%
32
 
4.4%
Other values (8) 222
30.3%
Common
Value    Count  Frequency (%)
,        582    30.8%
?        491    26.0%
(space)  459    24.3%
(        152    8.0%
)        152    8.0%
&        28     1.5%
.        28     1.5%
Hiragana
ValueCountFrequency (%)
150
42.1%
78
21.9%
50
 
14.0%
50
 
14.0%
28
 
7.9%

Most occurring blocks

Value     Count  Frequency (%)
ASCII     5334   54.3%
CJK       2038   20.7%
Hangul    1370   13.9%
Katakana  732    7.4%
Hiragana  356    3.6%

Most frequent character per block

ASCII
Value              Count  Frequency (%)
,                  582    10.9%
e                  524    9.8%
?                  491    9.2%
(space)            459    8.6%
s                  451    8.5%
c                  326    6.1%
i                  231    4.3%
r                  224    4.2%
t                  200    3.7%
l                  198    3.7%
Other values (24)  1648   30.9%
Hangul
ValueCountFrequency (%)
189
 
13.8%
111
 
8.1%
111
 
8.1%
103
 
7.5%
45
 
3.3%
45
 
3.3%
41
 
3.0%
41
 
3.0%
41
 
3.0%
41
 
3.0%
Other values (19) 602
43.9%
Hiragana
ValueCountFrequency (%)
150
42.1%
78
21.9%
50
 
14.0%
50
 
14.0%
28
 
7.9%
CJK
ValueCountFrequency (%)
139
 
6.8%
131
 
6.4%
131
 
6.4%
103
 
5.1%
91
 
4.5%
77
 
3.8%
72
 
3.5%
72
 
3.5%
68
 
3.3%
63
 
3.1%
Other values (29) 1091
53.5%
Katakana
ValueCountFrequency (%)
104
14.2%
83
 
11.3%
68
 
9.3%
47
 
6.4%
36
 
4.9%
36
 
4.9%
36
 
4.9%
36
 
4.9%
32
 
4.4%
32
 
4.4%
Other values (8) 222
30.3%

Interactions

[Figure: interactions plot; image not included]

Correlations

            고유번호   언어     팩스번호
고유번호    1.000      0.124    0.993
언어        0.124      1.000    0.000
팩스번호    0.993      0.000    1.000

            팩스번호   언어
팩스번호    1.000      0.000
언어        0.000      1.000

            고유번호   언어     팩스번호
고유번호    1.000      0.052    0.695
언어        0.052      1.000    0.000
팩스번호    0.695      0.000    1.000

Missing values

A simple visualization of nullity by column.
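
The numbers behind such a per-column chart are non-missing counts. A minimal sketch, assuming rows are held as dictionaries with `None` for a missing cell; `non_missing_counts` and the two toy rows are illustrative only and are not taken from the dataset:

```python
def non_missing_counts(rows, columns):
    """Return, for each column, how many rows have a value present."""
    return {
        col: sum(1 for row in rows if row.get(col) is not None)
        for col in columns
    }


# Toy rows using two of the report's column names.
rows = [
    {"전화번호": "02-000-0000", "웹사이트": None},
    {"전화번호": None, "웹사이트": None},
    {"전화번호": "02-111-1111", "웹사이트": "example.org"},
]
counts = non_missing_counts(rows, ["전화번호", "웹사이트"])
for col, n in counts.items():
    print(col, n, f"{100 * n / len(rows):.1f}% present")
```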
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
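
One common way to quantify nullity correlation is the Pearson correlation between the presence indicators (1 = value present, 0 = missing) of two columns. The sketch below assumes that definition; `nullity_correlation` is a hypothetical helper, and the report's heatmap may be computed differently in detail.

```python
from math import sqrt


def nullity_correlation(col_a, col_b):
    """Pearson correlation of presence indicators; None marks a missing cell.

    Returns None when either column is fully present or fully missing,
    because the correlation is undefined for a constant indicator.
    """
    a = [0.0 if v is None else 1.0 for v in col_a]
    b = [0.0 if v is None else 1.0 for v in col_b]
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return None
    return cov / sqrt(var_a * var_b)


# Toy columns: opening hours and opening days missing in the same rows.
hours = ["09:00-18:00", None, "10:00-20:00", None]
days = ["Tue-Sun", None, "Mon-Fri", None]
print(nullity_correlation(hours, days))
```

A value near 1 means the two columns tend to be filled in or left empty together; a value near -1 means one tends to be present exactly when the other is missing.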

Sample

,고유번호,언어,상호명,콘텐츠URL,주소,신주소,전화번호,팩스번호,웹사이트,운영시간,운영요일,휴무일,교통정보,태그,장애인편의시설
0,45520,en,Baekyang Laundry,https://english.visitseoul.net/attractions/Baekyang-2024/ENP8onuvv?utm_source=seoulopendata&utm_medium=attractions&utm_content=ENP8onuvv,"140-24, Gye-dong, Jongno-gu, Seoul, Korea","03057 54 Gyedong-gil, Jongno-gu, Seoul",+82-2-762-1261,<NA>,<NA>,<NA>,<NA>,<NA>,"Subway Line 3, Anguk Station, Exit 3","Jongro,BaekyangLaundry",<NA>
1,45492,en,Changsin-Sungin Quarry Observatory,https://english.visitseoul.net/attractions/2024-Chaeseokjangjeonmangdae/ENPauov7d?utm_source=seoulopendata&utm_medium=attractions&utm_content=ENPauov7d,서울 종로구 창신동 23-322,"03091 51 Naksan 5-gil, Jongno-gu, Seoul",+82-507-1330-5416,<NA>,<NA>,Tue-Fri 10:00 - 20:00 / Sat-Sun 10:00 - 22:00,Tue-Sun,Mondays,"Subway Line 6, Changsin Station, Exit 1","observatory, Changsin-dong, cafe,Quarry observatory",<NA>
2,45482,en,Changsin-dong's Cliff Village,https://english.visitseoul.net/attractions/2024-changsincliff/ENPgvo4y2?utm_source=seoulopendata&utm_medium=attractions&utm_content=ENPgvo4y2,서울 종로구 창신동 23-322,"03091 23-322 Changsin-dong, Dongdaemun-gu, Seoul",<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,"Subway Line 6, Changsin Station, Exit 1","Changsin-dong Cliff Village, modern history of Korea, Changsin-dong,places to visit in Seoul",<NA>
3,45530,en,Choong Ang High School,https://english.visitseoul.net/attractions/ChoongAngHighSchool/ENPgcblme?utm_source=seoulopendata&utm_medium=attractions&utm_content=ENPgcblme,"1, Gye-dong, Jongno-gu, Seoul, Korea","03051 164 Changdeokgung-gil, Jongno-gu, Seoul",+82-2-742-1321,<NA>,<NA>,<NA>,<NA>,<NA>,"Subway Line 3, Anguk Station, Exit 3","ChoongAngHighSchool, Jongno, History, WinterSonata",<NA>
4,45535,en,Choong Ang Store,https://english.visitseoul.net/attractions/ChoongAngStore/ENPl7gype?utm_source=seoulopendata&utm_medium=attractions&utm_content=ENPl7gype,"2-105, Gye-dong, Jongno-gu, Seoul, Korea","03051 162 Changdeokgung-gil, Jongno-gu, Seoul",<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,"Subway Line 3, Anguk Station, Exit 3","ChoongAngStore, Jongno,Hallyu",<NA>
5,45563,en,Myeongdong Jaemi-ro,https://english.visitseoul.net/attractions/2024-jaemiro/ENPhvgb2b?utm_source=seoulopendata&utm_medium=attractions&utm_content=ENPhvgb2b,서울 중구 남산동2가 30-4,"04631 24, Toegye-ro 20-gil, Jung-gu, Seoul, Republic of Korea",<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,"Subway Line 4, Myeongdong Station, Exit 3","Jaemiro,KoreanComics,Animation, Manwha,MyeongdongStation,Myeongdong",<NA>
6,24685,en,Namsangol Hanok Village,https://english.visitseoul.net/attractions/Namsangol-Hanok-Village-en/ENP024684?utm_source=seoulopendata&utm_medium=attractions&utm_content=ENP024684,"100-272 84-1 Pildong 2(i)-ga, Jung-gu, Seoul","04626 28 Toegye-ro 34-gil, Jung-gu, Seoul",+82-2-2261-0501,<NA>,https://www.hanokmaeul.or.kr/,Summer season (April to October) 09:00 - 21:00 KST Winter season (November to March) 09:00 - 20:00 KST * Traditional garden open 24/7,Tues - Sun,Closed Mondays,"- Chungmuro Station (subway lines 3, 4) exits 3, 4","Myeong-dong,City Tour Bus Station,Sky Bus,Seoul City Tour Bus,Chungmu-ro Station,Namsangol Hanok Village,Tiger Bus",<NA>
7,16406,en,Night Views at Banpodaegyo Bridge,https://english.visitseoul.net/attractions/Night-Views-at-Banpodaegyo-Bridge/ENP016325?utm_source=seoulopendata&utm_medium=attractions&utm_content=ENP016325,"649, Banpo-dong, Seocho-gu, Seoul (North end of Banpodaegyo Bridge Bridge)","2085-14, Olympic-daero, Seocho-gu, Seoul",+82-2-3780-0578,<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,"SeoulNightViewSpots,BanpoNightView,HangangDate,SeoulTravel,BanpoBridgeMoonlightRainbowFountain,SeoulAtNight,NightView,SeoulScenery,Park,DateNight,RainbowFountain,BanpoHangangPark,HangangParkBanpodaegyoBridge,BanpodaegyoBridge,Seoul,Walking",<NA>
8,15338,ko,1898 명동성당,https://korean.visitseoul.net/attractions/1898-명동성당/KOP015338?utm_source=seoulopendata&utm_medium=attractions&utm_content=KOP015338,100-809 서울 중구 명동2가 1-1,"04537 서울 중구 명동길 74 (명동2가, 명동성당)",02-774-1784,<NA>,<NA>,성당사무실 화 ~ 금 | 09:00 ~ 20:30 토 요 일 | 09:00 ~ 20:00 일 요 일 | 09:00 ~ 21:00,<NA>,"설날, 추석 당일 (성당사무실: 월요일 휴무)",2호선 을지로입구역 5번 출구 3호선 을지로3가역 12번 출구 4호선 명동역 9번 출구,"명동대성당, 고딕,명동, 1898광장,복합문화공간,명동나들이, 성당",<NA>
9,23961,en,SeMA Bunker,https://english.visitseoul.net/attractions/SeMA-Bunker/ENP023532?utm_source=seoulopendata&utm_medium=attractions&utm_content=ENP023532,"150-010 B-101, Uisadangdaero, Yeongdeungpo-gu, Seoul","07327 101, Uisadang-daero, Yeongdeungpo-gu, Seoul",+82-2-2124-8941,<NA>,https://sema.seoul.go.kr/en/index,"Tuesday - Friday 10:00 - 20:00 KST Sat, Holiday 10:00 - 19:00 KST",<NA>,Mondays,"Subway Lines 5 & 9, Yeouido Station, Exit 3","SeMABunker,Seoul20,YeouidoStation,SeMA,SeoulMuseumOfArt,Gallery,Yeouido,Culture,UndergroundBunker,History,Art",<NA>

,고유번호,언어,상호명,콘텐츠URL,주소,신주소,전화번호,팩스번호,웹사이트,운영시간,운영요일,휴무일,교통정보,태그,장애인편의시설
1947,3266,ko,창의문,https://korean.visitseoul.net/attractions/창의문/KOP003266?utm_source=seoulopendata&utm_medium=attractions&utm_content=KOP003266,110-030 서울시 종로구 부암동 277-11,03020 서울특별시 종로구 창의문로 118 (부암동),02-730-9924,<NA>,<NA>,<NA>,<NA>,<NA>,3호선 경복궁 3번 출구 도보 30분,"자하문, 한양도성, 성곽길, 창의문, 광화문, 성곽, 북악산, 북문, 경복궁역,유적지",<NA>
1948,28053,ko,코끼리분식,https://korean.visitseoul.net/attractions/2023029/KOP028053?utm_source=seoulopendata&utm_medium=attractions&utm_content=KOP028053,서울 마포구 도화동 345-4,04172 서울 마포구 도화2길 3 (도화동),02-717-9061,<NA>,https://www.instagram.com/koggiri_mapo_line5/,09:30~21:00(라스트오더 20:30),매일,"매달 1, 3번째 월요일 정기휴무",5호선 마포역 3번 출구에서 270m,"오래가게, 분식, 마포떡볶이골목,즉석떡볶이, 공덕, 볶음밥",<NA>
1949,5242,ko,헌인릉,https://korean.visitseoul.net/attractions/헌인릉/KOP005240?utm_source=seoulopendata&utm_medium=attractions&utm_content=KOP005240,137-180 서울 서초구 내곡동 산 13-1,06795 서울 서초구 헌인릉길 36-10 (헌인릉),02-445-0347,<NA>,http://royaltombs.cha.go.kr/html/HtmlPage.do?pg=/new/html/portal_01_09_01.jsp&mn=RT_01_09,"2월 ~ 5월, 9월 ~ 10월 09:00 ~ 18:00 6월 ~ 8월 09:00 ~ 18:30 11월 ~ 1월 09:00 ~ 17:30 (매표는 관람종료 1시간 전까지만 가능)",화~일,월요일,"3호선 양재역 7번 출구 버스 환승(407, 408, 440, 462, 471) 2호선 강남역 3번 출구 버스 환승(407, 408, 440, 462, 471) ※ 하차지점 : 헌인릉 버스정류장","관광코스, 세계문화유산,강남, 헌릉,서울명소, 문화재청,헌인릉, 조선왕릉, 인릉, 유네스코, 서울문화재",<NA>
1950,43442,ko,효성한의원,https://korean.visitseoul.net/attractions/2023021/KOPvsi1e5?utm_source=seoulopendata&utm_medium=attractions&utm_content=KOPvsi1e5,서울 동대문구 제기동 1140-55,"02569 서울 동대문구 약령중앙로 5 (제기동, 대산빌딩)",02-961-5544,<NA>,blog.naver.com/cjswldls60,10:00~18:00,"월~목, 토, 일",금요일,1호선 제기동역 2번 출구에서 113m,"오래가게, 제기동, 한방치료, 한의학, 서울약령시, 한약재",<NA>
1951,19657,ko,효자베이커리,https://korean.visitseoul.net/attractions/효자베이커리/KOP019657?utm_source=seoulopendata&utm_medium=attractions&utm_content=KOP019657,,03036 서울 종로구 필운대로 54,02-736-7629,<NA>,<NA>,화~일 8:00 ~ 20:20 *빵 소진시 일찍 닫습니다,화~일,월요일,3호선 경복궁역 2번 출구에서 697m,"발효빵,오래가게, 빵집, 빵, 도넛, 베이글,고로케",<NA>
1952,28330,ko,훼드라,https://korean.visitseoul.net/attractions/훼드라/KOP028330?utm_source=seoulopendata&utm_medium=attractions&utm_content=KOP028330,,03789 서울 서대문구 연세로5길 32,02-323-3201,<NA>,<NA>,12:00~02:00,매일,없음,2호선 신촌역 1번 출구에서 172m,"오래가게, 연세대, 신촌, 라면, 해장, 분식, 매운라면",<NA>
1953,43491,ko,휘가로,https://korean.visitseoul.net/attractions/2023014/KOPuel8q4?utm_source=seoulopendata&utm_medium=attractions&utm_content=KOPuel8q4,서울 관악구 신림동 241-24,08814 서울 관악구 신림로11길 20 (신림동),02-889-1722,<NA>,<NA>,16:00~손님들이 원하는 시간까지,매일,없음,신림선 서울대벤처타운역 2번 출구에서 614m,"호프, 관악구, 서울대, 맥주,오래가게, 녹두거리",<NA>
1954,11115,ko,흑석동 효사길,https://korean.visitseoul.net/attractions/흑석동-효사길/KOP011112?utm_source=seoulopendata&utm_medium=attractions&utm_content=KOP011112,156-860 서울 동작구 흑석동 173-191 173-191,"06910 서울 동작구 흑석동 효사 4, 5길",<NA>,<NA>,<NA>,<NA>,<NA>,<NA>,9호선 흑석역 4번출구,"서울골목투어,효사길전망대,흑석동효사길,골목산책,보고싶다촬영지,서울아름다운야경명소,서울여행,서울낭만명소,서울시티투어,서울가볼만한곳,서울야경명소,서울골목길",<NA>
1955,1999,ko,흥인지문(동대문),https://korean.visitseoul.net/attractions/흥인지문동대문/KOP001999?utm_source=seoulopendata&utm_medium=attractions&utm_content=KOP001999,110-126 서울 종로구 종로6가 69,"03119 서울 종로구 종로 288 (종로6가, 흥인지문)",02-731-0412,<NA>,<NA>,09:00 ~ 18:00,화~일,,4호선 동대문역 6번 출구 1호선 동대문역 6번 출구 2호선 동대문역사문화공원역 1번 출구,"서울동대문볼거리,서울문화재,성곽, 동대문, 역사,서울명소,흥인지문,서울데이트, 문화재,서울사대문,동대문야시장,서울여행,동대문명소",<NA>
1956,43455,ko,힐스트링,https://korean.visitseoul.net/attractions/2023044/KOPht2ct9?utm_source=seoulopendata&utm_medium=attractions&utm_content=KOPht2ct9,,06711 서울 서초구 반포대로1길 8 성호빌딩 1층,0507-1407-8195,<NA>,hillstring.modoo.at,09:30~18:00,월~금,"주말, 공휴일",3호선 남부터미널역 5번 출구에서 757m,"오래가게, 현악기제작, 악기수리, 서초, 예술의전당, 바이올린, 명장",<NA>