Overview

Dataset statistics

Number of variables: 5
Number of observations: 100
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 4.0 KiB
Average record size in memory: 41.3 B

Variable types

Text: 2
Categorical: 3

Alerts

ctprvn_nm is highly overall correlated with gugun_nm (High correlation)
gugun_nm is highly overall correlated with ctprvn_nm (High correlation)
ldgs_nm has unique values (Unique)

Reproduction

Analysis started: 2023-12-10 10:19:17.907875
Analysis finished: 2023-12-10 10:19:18.864807
Duration: 0.96 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

ldgs_nm
Text

UNIQUE 

Distinct: 100
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 19
Median length: 15
Mean length: 9.04
Min length: 3

Characters and Unicode

Total characters: 904
Distinct characters: 189
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
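The category counts reported for text variables can be approximated with Python's standard `unicodedata` module. The following is a minimal sketch, not the profiler's own implementation: the function name `category_counts` and the NFC normalization step are assumptions made for illustration, and the input is the five sample rows of ldgs_nm shown in this report.

```python
import unicodedata
from collections import Counter

# Long names for the two-letter general category codes that appear in this report.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Lu": "Uppercase Letter",
    "Ll": "Lowercase Letter",
    "Nd": "Decimal Number",
    "Zs": "Space Separator",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Po": "Other Punctuation",
    "Pd": "Dash Punctuation",
    "Sm": "Math Symbol",
}


def category_counts(values):
    """Count characters per Unicode general category across all strings."""
    counts = Counter()
    for value in values:
        # Assumption: normalize to NFC so each Hangul syllable is one code point.
        for ch in unicodedata.normalize("NFC", value):
            counts[unicodedata.category(ch)] += 1
    return counts


# The five sample rows of ldgs_nm from this report.
sample = [
    "아르반 호텔",
    "명지 프렌치코드",
    "부산 비즈니스 호텔",
    "스탠포드 인 부산",
    "크라운 하버호텔 부산",
]

counts = category_counts(sample)
for code, count in counts.most_common():
    print(CATEGORY_NAMES.get(code, code), count)
```

On these five rows only two categories occur, Other Letter (the Hangul syllables) and Space Separator, which matches the ordering of the two most frequent categories in the full column.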

Unique

Unique: 100
Unique (%): 100.0%

Sample

1st row: 아르반 호텔
2nd row: 명지 프렌치코드
3rd row: 부산 비즈니스 호텔
4th row: 스탠포드 인 부산
5th row: 크라운 하버호텔 부산
Value | Count | Frequency (%)
호텔 | 34 | 14.8%
부산 | 9 | 3.9%
서울 | 7 | 3.1%
제주 | 6 | 2.6%
여수 | 5 | 2.2%
명동 | 5 | 2.2%
해운대 | 4 | 1.7%
울산 | 3 | 1.3%
수안보 | 3 | 1.3%
바이 | 3 | 1.3%
Other values (136) | 150 | 65.5%

Most occurring characters

Value | Count | Frequency (%)
(space) | 129 | 14.3%
[?] | 80 | 8.8%
[?] | 79 | 8.7%
[?] | 29 | 3.2%
[?] | 20 | 2.2%
[?] | 19 | 2.1%
[?] | 17 | 1.9%
[?] | 17 | 1.9%
[?] | 16 | 1.8%
[?] | 14 | 1.5%
Other values (179) | 484 | 53.5%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 740 | 81.9%
Space Separator | 129 | 14.3%
Uppercase Letter | 16 | 1.8%
Lowercase Letter | 7 | 0.8%
Decimal Number | 5 | 0.6%
Open Punctuation | 2 | 0.2%
Close Punctuation | 2 | 0.2%
Other Punctuation | 2 | 0.2%
Dash Punctuation | 1 | 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
[?] | 80 | 10.8%
[?] | 79 | 10.7%
[?] | 29 | 3.9%
[?] | 20 | 2.7%
[?] | 19 | 2.6%
[?] | 17 | 2.3%
[?] | 17 | 2.3%
[?] | 16 | 2.2%
[?] | 14 | 1.9%
[?] | 13 | 1.8%
Other values (155) | 436 | 58.9%

Uppercase Letter
Value | Count | Frequency (%)
E | 4 | 25.0%
H | 3 | 18.8%
T | 2 | 12.5%
U | 1 | 6.2%
N | 1 | 6.2%
V | 1 | 6.2%
A | 1 | 6.2%
X | 1 | 6.2%
W | 1 | 6.2%
S | 1 | 6.2%

Lowercase Letter
Value | Count | Frequency (%)
s | 2 | 28.6%
c | 1 | 14.3%
j | 1 | 14.3%
z | 1 | 14.3%
a | 1 | 14.3%
t | 1 | 14.3%

Decimal Number
Value | Count | Frequency (%)
1 | 3 | 60.0%
0 | 1 | 20.0%
3 | 1 | 20.0%

Space Separator
Value | Count | Frequency (%)
(space) | 129 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 2 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 2 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
& | 2 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 740 | 81.9%
Common | 141 | 15.6%
Latin | 23 | 2.5%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
[?] | 80 | 10.8%
[?] | 79 | 10.7%
[?] | 29 | 3.9%
[?] | 20 | 2.7%
[?] | 19 | 2.6%
[?] | 17 | 2.3%
[?] | 17 | 2.3%
[?] | 16 | 2.2%
[?] | 14 | 1.9%
[?] | 13 | 1.8%
Other values (155) | 436 | 58.9%

Latin
Value | Count | Frequency (%)
E | 4 | 17.4%
H | 3 | 13.0%
s | 2 | 8.7%
T | 2 | 8.7%
U | 1 | 4.3%
N | 1 | 4.3%
V | 1 | 4.3%
A | 1 | 4.3%
c | 1 | 4.3%
j | 1 | 4.3%
Other values (6) | 6 | 26.1%

Common
Value | Count | Frequency (%)
(space) | 129 | 91.5%
1 | 3 | 2.1%
( | 2 | 1.4%
) | 2 | 1.4%
& | 2 | 1.4%
0 | 1 | 0.7%
- | 1 | 0.7%
3 | 1 | 0.7%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 740 | 81.9%
ASCII | 164 | 18.1%

Most frequent character per block

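ASCII
Value | Count | Frequency (%)
(space) | 129 | 78.7%
E | 4 | 2.4%
H | 3 | 1.8%
1 | 3 | 1.8%
s | 2 | 1.2%
( | 2 | 1.2%
) | 2 | 1.2%
& | 2 | 1.2%
T | 2 | 1.2%
U | 1 | 0.6%
Other values (14) | 14 | 8.5%

Hangul
Value | Count | Frequency (%)
[?] | 80 | 10.8%
[?] | 79 | 10.7%
[?] | 29 | 3.9%
[?] | 20 | 2.7%
[?] | 19 | 2.6%
[?] | 17 | 2.3%
[?] | 17 | 2.3%
[?] | 16 | 2.2%
[?] | 14 | 1.9%
[?] | 13 | 1.8%
Other values (155) | 436 | 58.9%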

ctprvn_nm
Categorical

HIGH CORRELATION 

Distinct: 6
Distinct (%): 6.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 부산
2nd row: 부산
3rd row: 부산
4th row: 부산
5th row: 부산

Common Values

Value | Count | Frequency (%)
부산 | 45 | 45.0%
충북 | 15 | 15.0%
서울 | 14 | 14.0%
울산 | 12 | 12.0%
제주 | 8 | 8.0%
전남 | 6 | 6.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Plot of the ctprvn_nm value counts; the plotted values match the Common Values table.

gugun_nm
Categorical

HIGH CORRELATION 

Distinct: 22
Distinct (%): 22.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 4
Median length: 3
Mean length: 2.93
Min length: 2

Unique

Unique: 10
Unique (%): 10.0%

Sample

1st row: 부산진구
2nd row: 강서구
3rd row: 부산진구
4th row: 중구
5th row: 중구

Common Values

Value | Count | Frequency (%)
수영구 | 20 | 20.0%
중구 | 15 | 15.0%
남구 | 9 | 9.0%
해운대구 | 9 | 9.0%
서귀포시 | 8 | 8.0%
청주시 | 6 | 6.0%
여수시 | 6 | 6.0%
충주시 | 5 | 5.0%
연제구 | 4 | 4.0%
강서구 | 3 | 3.0%
Other values (12) | 15 | 15.0%

Length

Histogram of lengths of the category

ldgs_addr
Text

Distinct: 97
Distinct (%): 97.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 25
Median length: 22
Mean length: 16.84
Min length: 11

Characters and Unicode

Total characters: 1684
Distinct characters: 111
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 96
Unique (%): 96.0%

Sample

1st row: 부산 부산진구 부전동 517-60
2nd row: 부산 강서구 명지동 3239-15
3rd row: 부산 부산진구 부전동 512-4
4th row: 부산 중구 남포동5가 56-1
5th row: 부산 중구 중앙동4가 83-1
Value | Count | Frequency (%)
부산 | 45 | 10.8%
수영구 | 20 | 4.8%
중구 | 15 | 3.6%
충북 | 15 | 3.6%
서울 | 14 | 3.3%
울산 | 12 | 2.9%
민락동 | 10 | 2.4%
해운대구 | 9 | 2.2%
광안동 | 9 | 2.2%
남구 | 9 | 2.2%
Other values (180) | 260 | 62.2%

Most occurring characters

Value | Count | Frequency (%)
(space) | 318 | 18.9%
1 | 112 | 6.7%
[?] | 87 | 5.2%
- | 82 | 4.9%
[?] | 77 | 4.6%
[?] | 74 | 4.4%
2 | 60 | 3.6%
[?] | 51 | 3.0%
3 | 39 | 2.3%
4 | 39 | 2.3%
Other values (101) | 745 | 44.2%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 848 | 50.4%
Decimal Number | 435 | 25.8%
Space Separator | 318 | 18.9%
Dash Punctuation | 82 | 4.9%
Math Symbol | 1 | 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
[?] | 87 | 10.3%
[?] | 77 | 9.1%
[?] | 74 | 8.7%
[?] | 51 | 6.0%
[?] | 35 | 4.1%
[?] | 28 | 3.3%
[?] | 27 | 3.2%
[?] | 26 | 3.1%
[?] | 22 | 2.6%
[?] | 22 | 2.6%
Other values (88) | 399 | 47.1%

Decimal Number
Value | Count | Frequency (%)
1 | 112 | 25.7%
2 | 60 | 13.8%
3 | 39 | 9.0%
4 | 39 | 9.0%
0 | 38 | 8.7%
5 | 37 | 8.5%
7 | 31 | 7.1%
9 | 31 | 7.1%
6 | 25 | 5.7%
8 | 23 | 5.3%

Space Separator
Value | Count | Frequency (%)
(space) | 318 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 82 | 100.0%

Math Symbol
Value | Count | Frequency (%)
~ | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 848 | 50.4%
Common | 836 | 49.6%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
[?] | 87 | 10.3%
[?] | 77 | 9.1%
[?] | 74 | 8.7%
[?] | 51 | 6.0%
[?] | 35 | 4.1%
[?] | 28 | 3.3%
[?] | 27 | 3.2%
[?] | 26 | 3.1%
[?] | 22 | 2.6%
[?] | 22 | 2.6%
Other values (88) | 399 | 47.1%

Common
Value | Count | Frequency (%)
(space) | 318 | 38.0%
1 | 112 | 13.4%
- | 82 | 9.8%
2 | 60 | 7.2%
3 | 39 | 4.7%
4 | 39 | 4.7%
0 | 38 | 4.5%
5 | 37 | 4.4%
7 | 31 | 3.7%
9 | 31 | 3.7%
Other values (3) | 49 | 5.9%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 848 | 50.4%
ASCII | 836 | 49.6%

Most frequent character per block

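ASCII
Value | Count | Frequency (%)
(space) | 318 | 38.0%
1 | 112 | 13.4%
- | 82 | 9.8%
2 | 60 | 7.2%
3 | 39 | 4.7%
4 | 39 | 4.7%
0 | 38 | 4.5%
5 | 37 | 4.4%
7 | 31 | 3.7%
9 | 31 | 3.7%
Other values (3) | 49 | 5.9%

Hangul
Value | Count | Frequency (%)
[?] | 87 | 10.3%
[?] | 77 | 9.1%
[?] | 74 | 8.7%
[?] | 51 | 6.0%
[?] | 35 | 4.1%
[?] | 28 | 3.3%
[?] | 27 | 3.2%
[?] | 26 | 3.1%
[?] | 22 | 2.6%
[?] | 22 | 2.6%
Other values (88) | 399 | 47.1%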

utiliiza_ota_nm
Categorical

Distinct: 15
Distinct (%): 15.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 96
Median length: 88
Mean length: 70.78
Min length: 32

Unique

Unique: 6
Unique (%): 6.0%

Sample

1st row: ['AGODA', 'BOOKING', 'DAILY', 'EXPEDIA', 'GOODCHOICE', 'HOTELS', 'YANOLJA']
2nd row: ['AGODA', 'GOODCHOICE', 'INTERPARK', 'YANOLJA']
3rd row: ['AGODA', 'BOOKING', 'DAILY', 'EXPEDIA', 'GOODCHOICE', 'HOTELS', 'INTERPARK', 'YANOLJA']
4th row: ['AGODA', 'BOOKING', 'DAILY', 'EXPEDIA', 'GOODCHOICE', 'HOTELS', 'INTERPARK', 'YANOLJA']
5th row: ['AGODA', 'BOOKING', 'DAILY', 'EXPEDIA', 'GOODCHOICE', 'HOTELS', 'INTERPARK', 'YANOLJA']

Common Values

Value | Count | Frequency (%)
['AGODA', 'BOOKING', 'DAILY', 'EXPEDIA', 'GOODCHOICE', 'HOTELS', 'INTERPARK', 'YANOLJA'] | 27 | 27.0%
['AGODA', 'BOOKING', 'DAILY', 'EXPEDIA', 'GOODCHOICE', 'HOTELS'] | 25 | 25.0%
['AGODA', 'BOOKING', 'DAILY', 'EXPEDIA', 'GOODCHOICE', 'HOTELS', 'YANOLJA'] | 10 | 10.0%
['AGODA', 'BOOKING', 'DAILY', 'EXPEDIA', 'GOODCHOICE', 'INTERPARK', 'YANOLJA'] | 8 | 8.0%
['AGODA', 'BOOKING', 'DAILY', 'EXPEDIA', 'GOODCHOICE', 'YANOLJA'] | 6 | 6.0%
['AGODA', 'BOOKING', 'DAILY', 'GOODCHOICE'] | 6 | 6.0%
['AGODA', 'BOOKING', 'DAILY', 'EXPEDIA', 'GOODCHOICE'] | 6 | 6.0%
['AGODA', 'BOOKING', 'EXPEDIA', 'GOODCHOICE', 'HOTELS', 'INTERPARK', 'YANOLJA'] | 4 | 4.0%
['AGODA', 'DAILY', 'GOODCHOICE'] | 2 | 2.0%
['AGODA', 'GOODCHOICE', 'INTERPARK', 'YANOLJA'] | 1 | 1.0%
Other values (5) | 5 | 5.0%

Length

Histogram of lengths of the category
Value | Count | Frequency (%)
agoda | 100 | 15.4%
goodchoice | 100 | 15.4%
booking | 95 | 14.6%
daily | 93 | 14.3%
expedia | 91 | 14.0%
hotels | 69 | 10.6%
yanolja | 58 | 8.9%
interpark | 43 | 6.6%
trip | 1 | 0.2%
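Each utiliiza_ota_nm cell appears to hold a Python-style list written out as a string, so per-platform counts like the ones above require parsing the cell before counting. The following is a minimal sketch under that assumption, using only the standard library. The function name `ota_token_counts` is hypothetical, the lowercasing simply mirrors how the names are shown in the table, and the input is the first three sample rows from this report.

```python
import ast
from collections import Counter


def ota_token_counts(cells):
    """Count how many rows list each platform name.

    Assumes every cell is the string form of a Python list of strings.
    """
    counts = Counter()
    for cell in cells:
        # literal_eval parses the list literal without executing arbitrary code.
        for name in ast.literal_eval(cell):
            counts[name.lower()] += 1
    return counts


# First three sample rows of utiliiza_ota_nm from this report.
rows = [
    "['AGODA', 'BOOKING', 'DAILY', 'EXPEDIA', 'GOODCHOICE', 'HOTELS', 'YANOLJA']",
    "['AGODA', 'GOODCHOICE', 'INTERPARK', 'YANOLJA']",
    "['AGODA', 'BOOKING', 'DAILY', 'EXPEDIA', 'GOODCHOICE', 'HOTELS', 'INTERPARK', 'YANOLJA']",
]

counts = ota_token_counts(rows)
for name, count in counts.most_common():
    print(name, count)
```

Parsing the lists first also makes it possible to treat the column as a set-valued attribute, instead of as 15 distinct opaque strings.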

Correlations

 | ldgs_nm | ctprvn_nm | gugun_nm | ldgs_addr | utiliiza_ota_nm
ldgs_nm | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
ctprvn_nm | 1.000 | 1.000 | 0.987 | 1.000 | 0.638
gugun_nm | 1.000 | 0.987 | 1.000 | 1.000 | 0.415
ldgs_addr | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
utiliiza_ota_nm | 1.000 | 0.638 | 0.415 | 1.000 | 1.000

 | utiliiza_ota_nm | gugun_nm | ctprvn_nm
utiliiza_ota_nm | 1.000 | 0.127 | 0.341
gugun_nm | 0.127 | 1.000 | 0.859
ctprvn_nm | 0.341 | 0.859 | 1.000

 | ctprvn_nm | gugun_nm | utiliiza_ota_nm
ctprvn_nm | 1.000 | 0.859 | 0.341
gugun_nm | 0.859 | 1.000 | 0.127
utiliiza_ota_nm | 0.341 | 0.127 | 1.000
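
The report does not name the association measure behind these matrices. One common choice for a pair of categorical columns is Cramér's V, which rescales the chi-squared statistic of the contingency table to the range 0 to 1. The following is a minimal sketch of the uncorrected statistic, offered as an illustration and not as the profiler's method; the function name `cramers_v` and the toy inputs are made up for the example.

```python
from collections import Counter
from math import sqrt


def cramers_v(xs, ys):
    """Uncorrected Cramér's V between two equally long categorical sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))  # observed cell counts
    row = Counter(xs)             # marginal counts of the first variable
    col = Counter(ys)             # marginal counts of the second variable

    chi2 = 0.0
    for r, r_count in row.items():
        for c, c_count in col.items():
            expected = r_count * c_count / n
            observed = joint.get((r, c), 0)
            chi2 += (observed - expected) ** 2 / expected

    k = min(len(row), len(col)) - 1
    if k == 0:
        # A constant column carries no association information.
        return float("nan")
    return sqrt(chi2 / (n * k))


# Hypothetical toy data: the second column is fully determined by the first.
dependent = cramers_v(["A", "A", "B", "B"], ["x", "x", "y", "y"])
# Hypothetical toy data: the two columns are independent.
independent = cramers_v(["A", "A", "B", "B"], ["x", "y", "x", "y"])
print(dependent, independent)
```

A district name that occurs in only one province pushes such a statistic toward 1, which is consistent with the high-correlation alerts for ctprvn_nm and gugun_nm.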

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

 | ldgs_nm | ctprvn_nm | gugun_nm | ldgs_addr | utiliiza_ota_nm
0 | 아르반 호텔 | 부산 | 부산진구 | 부산 부산진구 부전동 517-60 | ['AGODA', 'BOOKING', 'DAILY', 'EXPEDIA', 'GOODCHOICE', 'HOTELS', 'YANOLJA']
1 | 명지 프렌치코드 | 부산 | 강서구 | 부산 강서구 명지동 3239-15 | ['AGODA', 'GOODCHOICE', 'INTERPARK', 'YANOLJA']
2 | 부산 비즈니스 호텔 | 부산 | 부산진구 | 부산 부산진구 부전동 512-4 | ['AGODA', 'BOOKING', 'DAILY', 'EXPEDIA', 'GOODCHOICE', 'HOTELS', 'INTERPARK', 'YANOLJA']
3 | 스탠포드 인 부산 | 부산 | 중구 | 부산 중구 남포동5가 56-1 | ['AGODA', 'BOOKING', 'DAILY', 'EXPEDIA', 'GOODCHOICE', 'HOTELS', 'INTERPARK', 'YANOLJA']
4 | 크라운 하버호텔 부산 | 부산 | 중구 | 부산 중구 중앙동4가 83-1 | ['AGODA', 'BOOKING', 'DAILY', 'EXPEDIA', 'GOODCHOICE', 'HOTELS', 'INTERPARK', 'YANOLJA']
5 | 라마다 앙코르 바이 윈덤 부산역 | 부산 | 동구 | 부산 동구 초량동 1204-1 | ['AGODA', 'BOOKING', 'DAILY', 'EXPEDIA', 'GOODCHOICE', 'HOTELS', 'YANOLJA']
6 | 센텀 프리미어 호텔 | 부산 | 해운대구 | 부산 해운대구 우동 1521 | ['AGODA', 'BOOKING', 'DAILY', 'EXPEDIA', 'GOODCHOICE', 'HOTELS', 'INTERPARK', 'YANOLJA']
7 | 명지 오션시티호텔 | 부산 | 강서구 | 부산 강서구 명지동 3237-4 | ['AGODA', 'BOOKING', 'EXPEDIA', 'GOODCHOICE', 'HOTELS', 'INTERPARK', 'YANOLJA']
8 | 마리안느호텔 | 부산 | 해운대구 | 부산 해운대구 중동 1400-24 | ['AGODA', 'BOOKING', 'DAILY', 'EXPEDIA', 'GOODCHOICE', 'HOTELS', 'YANOLJA']
9 | 베스트웨스턴 해운대 호텔 | 부산 | 해운대구 | 부산 해운대구 중동 1391-42 | ['AGODA', 'BOOKING', 'DAILY', 'EXPEDIA', 'GOODCHOICE', 'HOTELS', 'INTERPARK', 'YANOLJA']

Last rows

 | ldgs_nm | ctprvn_nm | gugun_nm | ldgs_addr | utiliiza_ota_nm
90 | 더포인트호텔 광안리점 | 부산 | 수영구 | 부산 수영구 민락동 181-154 | ['AGODA', 'BOOKING', 'DAILY', 'EXPEDIA', 'GOODCHOICE', 'HOTELS']
91 | 호텔 센트럴베이 | 부산 | 수영구 | 부산 수영구 광안동 197-2 | ['AGODA', 'BOOKING', 'DAILY', 'EXPEDIA', 'GOODCHOICE', 'INTERPARK', 'YANOLJA']
92 | 브라운도트호텔 수영점 | 부산 | 수영구 | 부산 수영구 수영동 450-5 | ['AGODA', 'BOOKING', 'DAILY', 'GOODCHOICE']
93 | 오션투헤븐 | 부산 | 수영구 | 부산 수영구 민락동 181-154 | ['AGODA', 'BOOKING', 'DAILY', 'EXPEDIA', 'GOODCHOICE', 'HOTELS']
94 | 제이스테이 | 부산 | 수영구 | 부산 수영구 광안동 203-18 4~6층 | ['AGODA', 'BOOKING', 'DAILY', 'EXPEDIA', 'GOODCHOICE']
95 | 벡스코호스텔 | 부산 | 수영구 | 부산 수영구 민락동 181-212 | ['AGODA', 'BOOKING', 'EXPEDIA', 'GOODCHOICE']
96 | 호텔미라주 | 부산 | 수영구 | 부산 수영구 광안동 203-4 | ['AGODA', 'DAILY', 'GOODCHOICE']
97 | 피코블루 | 부산 | 수영구 | 부산 수영구 민락동 178-15 | ['AGODA', 'DAILY', 'GOODCHOICE']
98 | 오션스테이호텔 | 부산 | 수영구 | 부산 수영구 민락동 181-154 | ['AGODA', 'BOOKING', 'DAILY', 'EXPEDIA', 'GOODCHOICE', 'HOTELS']
99 | 라움103 | 부산 | 수영구 | 부산 수영구 민락동 110-51 | ['AGODA', 'BOOKING', 'DAILY', 'GOODCHOICE']