Overview

Dataset statistics

Number of variables: 5
Number of observations: 640
Missing cells: 424
Missing cells (%): 13.2%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 25.1 KiB
Average record size in memory: 40.2 B
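
The percentage and per-record figures above follow from the raw counts. A minimal sketch of that arithmetic, assuming that missing cells are measured against every cell (observations × variables) and that 1 KiB = 1024 B; the variable names are illustrative only:

```python
# Figures copied from the dataset statistics above.
n_variables = 5
n_observations = 640
missing_cells = 424
total_size_kib = 25.1

# Share of missing cells, relative to every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Average bytes per row, assuming 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"missing cells: {missing_pct:.2f}%")              # 13.25%
print(f"average record size: {avg_record_bytes:.1f} B")  # 40.2 B
```

The report shows the first value to one decimal place, as 13.2%.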

Variable types

Text: 3
Categorical: 2

Dataset

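Description: Provides information on overseas companies to promote the content industry, supporting networking between Korean content companies and overseas content companies.
Author: 한국콘텐츠진흥원 (Korea Creative Content Agency)
URL: https://www.data.go.kr/data/15015302/fileData.do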

Alerts

장르 is highly overall correlated with 기업형태 (High correlation)
기업형태 is highly overall correlated with 장르 (High correlation)
장르 is highly imbalanced (54.1%) (Imbalance)
홈페이지 has 424 (66.2%) missing values (Missing)
기업명 has unique values (Unique)
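
The 54.1% imbalance figure for 장르 can be reproduced from the value counts reported for that variable (463, 152, 14, 8, 3) if one assumes the score is one minus the normalized Shannon entropy of the class distribution. That definition is an assumption here, although it matches the reported number; `imbalance_score` is an illustrative helper, not a library function.

```python
import math

def imbalance_score(counts):
    """Return 1 - H/H_max: 0 for uniform classes, approaching 1 when one class dominates."""
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    if len(probs) < 2:
        return 0.0  # a single class has nothing to compare against; treated as 0 here
    entropy = -sum(p * math.log2(p) for p in probs)
    return 1 - entropy / math.log2(len(probs))

genre_counts = [463, 152, 14, 8, 3]  # 방송, <NA>, 애니, 캐릭터, 게임
score = imbalance_score(genre_counts)
print(f"{score:.1%}")  # 54.1%
```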

Reproduction

Analysis started: 2023-12-12 20:08:25.600580
Analysis finished: 2023-12-12 20:08:26.253183
Duration: 0.65 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
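
A report like this one is typically produced with a few lines of ydata-profiling. The sketch below is an assumption about how that might look, not the configuration actually used: the CSV file name, the encoding, and the report title are hypothetical, and the real settings live in the config.json mentioned above.

```python
def build_report(csv_path, output_path="report.html"):
    """Profile a CSV file and write an HTML report (sketch; paths are hypothetical)."""
    # Imported inside the function so the module loads even without these packages.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumption: the file is CP949-encoded, which is common for Korean public
    # datasets; switch to "utf-8" if that is how the file was saved.
    df = pd.read_csv(csv_path, encoding="cp949")
    profile = ProfileReport(df, title="Overseas content companies")
    profile.to_file(output_path)
    return profile

# Example call (the file name is a placeholder):
# build_report("overseas_companies.csv")
```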

Variables

국가
Text

Distinct: 85
Distinct (%): 13.3%
Missing: 0
Missing (%): 0.0%
Memory size: 5.1 KiB

Length

Max length: 39
Median length: 28
Mean length: 4.028125
Min length: 2

Characters and Unicode

Total characters: 2578
Distinct characters: 106
Distinct categories: 4
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
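
As a rough illustration of those character properties, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below tallies categories over a few values that appear in this column. The script guess, taken from the first word of the character name, is a crude stand-in assumed for illustration and is not how the report derives scripts.

```python
import unicodedata
from collections import Counter

values = ["AFGHANISTAN", "Algeria", "러시아"]  # values that appear in the 국가 column

# General category per character: Lu = Uppercase Letter, Ll = Lowercase Letter,
# Lo = Other Letter (Hangul syllables fall here).
categories = Counter(unicodedata.category(ch) for v in values for ch in v)

# Crude script guess: first word of the Unicode character name (e.g. LATIN, HANGUL).
scripts = Counter(unicodedata.name(ch).split()[0] for v in values for ch in v)

print(dict(categories))
print(dict(scripts))
```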

Unique

Unique: 31
Unique (%): 4.8%

Sample

1st row: AFGHANISTAN
2nd row: AFGHANISTAN
3rd row: ALBANIA
4th row: Algeria
5th row: Algeria

Value | Count | Frequency (%)
러시아 | 60 | 9.1%
중국 | 53 | 8.0%
미국 | 48 | 7.3%
브라질 | 47 | 7.1%
uae | 39 | 5.9%
터키 | 38 | 5.7%
독일 | 34 | 5.1%
스페인 | 28 | 4.2%
이탈리아 | 26 | 3.9%
캐나다 | 25 | 3.8%
Other values (76) | 263 | 39.8%

Most occurring characters

Value | Count | Frequency (%)
A | 197 | 7.6%
E | 131 | 5.1%
| 125 | 4.8%
| 109 | 4.2%
N | 101 | 3.9%
I | 91 | 3.5%
R | 84 | 3.3%
O | 75 | 2.9%
| 71 | 2.8%
U | 69 | 2.7%
Other values (96) | 1525 | 59.2%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 1248 | 48.4%
Uppercase Letter | 1146 | 44.5%
Lowercase Letter | 163 | 6.3%
Space Separator | 21 | 0.8%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
| 125 | 10.0%
| 109 | 8.7%
| 71 | 5.7%
| 60 | 4.8%
| 59 | 4.7%
| 53 | 4.2%
| 49 | 3.9%
| 47 | 3.8%
| 47 | 3.8%
| 44 | 3.5%
Other values (51) | 584 | 46.8%

Uppercase Letter
Value | Count | Frequency (%)
A | 197 | 17.2%
E | 131 | 11.4%
N | 101 | 8.8%
I | 91 | 7.9%
R | 84 | 7.3%
O | 75 | 6.5%
U | 69 | 6.0%
L | 58 | 5.1%
S | 47 | 4.1%
G | 43 | 3.8%
Other values (15) | 250 | 21.8%

Lowercase Letter
Value | Count | Frequency (%)
a | 29 | 17.8%
n | 23 | 14.1%
t | 13 | 8.0%
o | 12 | 7.4%
e | 12 | 7.4%
r | 10 | 6.1%
g | 10 | 6.1%
b | 8 | 4.9%
p | 7 | 4.3%
y | 7 | 4.3%
Other values (9) | 32 | 19.6%

Space Separator
Value | Count | Frequency (%)
(space) | 21 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 1309 | 50.8%
Hangul | 1248 | 48.4%
Common | 21 | 0.8%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
| 125 | 10.0%
| 109 | 8.7%
| 71 | 5.7%
| 60 | 4.8%
| 59 | 4.7%
| 53 | 4.2%
| 49 | 3.9%
| 47 | 3.8%
| 47 | 3.8%
| 44 | 3.5%
Other values (51) | 584 | 46.8%

Latin
Value | Count | Frequency (%)
A | 197 | 15.0%
E | 131 | 10.0%
N | 101 | 7.7%
I | 91 | 7.0%
R | 84 | 6.4%
O | 75 | 5.7%
U | 69 | 5.3%
L | 58 | 4.4%
S | 47 | 3.6%
G | 43 | 3.3%
Other values (34) | 413 | 31.6%

Common
Value | Count | Frequency (%)
(space) | 21 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 1330 | 51.6%
Hangul | 1248 | 48.4%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
A | 197 | 14.8%
E | 131 | 9.8%
N | 101 | 7.6%
I | 91 | 6.8%
R | 84 | 6.3%
O | 75 | 5.6%
U | 69 | 5.2%
L | 58 | 4.4%
S | 47 | 3.5%
G | 43 | 3.2%
Other values (35) | 434 | 32.6%

Hangul
Value | Count | Frequency (%)
| 125 | 10.0%
| 109 | 8.7%
| 71 | 5.7%
| 60 | 4.8%
| 59 | 4.7%
| 53 | 4.2%
| 49 | 3.9%
| 47 | 3.8%
| 47 | 3.8%
| 44 | 3.5%
Other values (51) | 584 | 46.8%

장르
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 5
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Memory size: 5.1 KiB

Value | Count
방송 | 463
<NA> | 152
애니 | 14
캐릭터 | 8
게임 | 3

Length

Max length: 4
Median length: 2
Mean length: 2.4875
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 방송
2nd row: 방송
3rd row: 방송
4th row: <NA>
5th row: <NA>

Common Values

Value | Count | Frequency (%)
방송 | 463 | 72.3%
<NA> | 152 | 23.8%
애니 | 14 | 2.2%
캐릭터 | 8 | 1.2%
게임 | 3 | 0.5%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
방송 | 463 | 72.3%
na | 152 | 23.8%
애니 | 14 | 2.2%
캐릭터 | 8 | 1.2%
게임 | 3 | 0.5%

기업형태
Categorical

HIGH CORRELATION 

Distinct: 28
Distinct (%): 4.4%
Missing: 0
Missing (%): 0.0%
Memory size: 5.1 KiB

Value | Count
케이블/위성 채널 | 162
<NA> | 134
배급사 | 114
제작사 | 38
제작사/배급사 | 35
Other values (23) | 157

Length

Max length: 9
Median length: 8
Mean length: 5.2625
Min length: 2

Unique

Unique: 5
Unique (%): 0.8%

Sample

1st row: 케이블/위성 채널
2nd row: 뉴미디어 플랫폼
3rd row: 퍼블리셔
4th row: <NA>
5th row: <NA>

Common Values

Value | Count | Frequency (%)
케이블/위성 채널 | 162 | 25.3%
<NA> | 134 | 20.9%
배급사 | 114 | 17.8%
제작사 | 38 | 5.9%
제작사/배급사 | 35 | 5.5%
뉴미디어 플랫폼 | 28 | 4.4%
지상파채널 | 15 | 2.3%
미디어 | 15 | 2.3%
배급 | 14 | 2.2%
공공기관/조직 | 9 | 1.4%
Other values (18) | 76 | 11.9%

Length

Histogram of lengths of the category
Value | Count | Frequency (%)
채널 | 172 | 20.7%
케이블/위성 | 162 | 19.5%
na | 134 | 16.1%
배급사 | 114 | 13.7%
제작사 | 38 | 4.6%
제작사/배급사 | 35 | 4.2%
뉴미디어 | 28 | 3.4%
플랫폼 | 28 | 3.4%
배급 | 17 | 2.0%
지상파채널 | 15 | 1.8%
Other values (16) | 87 | 10.5%

기업명
Text

UNIQUE 

Distinct: 640
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 5.1 KiB

Length

Max length: 74
Median length: 46
Mean length: 20.30625
Min length: 3

Characters and Unicode

Total characters: 12996
Distinct characters: 198
Distinct categories: 10
Distinct scripts: 5
Distinct blocks: 5
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 640
Unique (%): 100.0%

Sample

1st row: KHURSHID TV
2nd row: TOLO TV
3rd row: TRING TV SH.A
4th row: Echorouk TV
5th row: EL KHABAR TV

Value | Count | Frequency (%)
media | 77 | 4.0%
tv | 71 | 3.7%
| 37 | 1.9%
entertainment | 32 | 1.7%
ltd | 27 | 1.4%
television | 26 | 1.4%
inc | 26 | 1.4%
group | 23 | 1.2%
international | 23 | 1.2%
gmbh | 23 | 1.2%
Other values (1001) | 1554 | 81.0%

Most occurring characters

Value | Count | Frequency (%)
(space) | 1344 | 10.3%
I | 861 | 6.6%
A | 858 | 6.6%
E | 790 | 6.1%
T | 784 | 6.0%
N | 731 | 5.6%
O | 624 | 4.8%
R | 514 | 4.0%
S | 487 | 3.7%
L | 426 | 3.3%
Other values (188) | 5577 | 42.9%

Most occurring categories

Value | Count | Frequency (%)
Uppercase Letter | 8904 | 68.5%
Lowercase Letter | 2239 | 17.2%
Space Separator | 1344 | 10.3%
Other Punctuation | 213 | 1.6%
Other Letter | 191 | 1.5%
Decimal Number | 35 | 0.3%
Open Punctuation | 34 | 0.3%
Close Punctuation | 34 | 0.3%
Math Symbol | 1 | < 0.1%
Other Number | 1 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
| 9 | 4.7%
| 7 | 3.7%
| 6 | 3.1%
| 6 | 3.1%
| 5 | 2.6%
| 5 | 2.6%
| 4 | 2.1%
| 4 | 2.1%
| 4 | 2.1%
| 4 | 2.1%
Other values (97) | 137 | 71.7%

Lowercase Letter
Value | Count | Frequency (%)
i | 244 | 10.9%
a | 227 | 10.1%
o | 204 | 9.1%
n | 193 | 8.6%
e | 184 | 8.2%
t | 167 | 7.5%
r | 155 | 6.9%
d | 110 | 4.9%
s | 108 | 4.8%
u | 104 | 4.6%
Other values (31) | 543 | 24.3%

Uppercase Letter
Value | Count | Frequency (%)
I | 861 | 9.7%
A | 858 | 9.6%
E | 790 | 8.9%
T | 784 | 8.8%
N | 731 | 8.2%
O | 624 | 7.0%
R | 514 | 5.8%
S | 487 | 5.5%
L | 426 | 4.8%
D | 414 | 4.6%
Other values (16) | 2415 | 27.1%

Decimal Number
Value | Count | Frequency (%)
2 | 7 | 20.0%
1 | 6 | 17.1%
0 | 5 | 14.3%
3 | 5 | 14.3%
7 | 3 | 8.6%
8 | 3 | 8.6%
4 | 3 | 8.6%
6 | 1 | 2.9%
9 | 1 | 2.9%
5 | 1 | 2.9%

Other Punctuation
Value | Count | Frequency (%)
. | 134 | 62.9%
& | 30 | 14.1%
, | 28 | 13.1%
/ | 15 | 7.0%
" | 2 | 0.9%
: | 1 | 0.5%
! | 1 | 0.5%
' | 1 | 0.5%
* | 1 | 0.5%

Space Separator
Value | Count | Frequency (%)
(space) | 1344 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 34 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 34 | 100.0%

Math Symbol
Value | Count | Frequency (%)
+ | 1 | 100.0%

Other Number
Value | Count | Frequency (%)
½ | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 11107 | 85.5%
Common | 1662 | 12.8%
Hangul | 142 | 1.1%
Han | 49 | 0.4%
Cyrillic | 36 | 0.3%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
| 9 | 6.3%
| 7 | 4.9%
| 6 | 4.2%
| 6 | 4.2%
| 5 | 3.5%
| 5 | 3.5%
| 4 | 2.8%
| 4 | 2.8%
| 4 | 2.8%
| 4 | 2.8%
Other values (66) | 88 | 62.0%

Latin
Value | Count | Frequency (%)
I | 861 | 7.8%
A | 858 | 7.7%
E | 790 | 7.1%
T | 784 | 7.1%
N | 731 | 6.6%
O | 624 | 5.6%
R | 514 | 4.6%
S | 487 | 4.4%
L | 426 | 3.8%
D | 414 | 3.7%
Other values (42) | 4618 | 41.6%

Han
Value | Count | Frequency (%)
| 4 | 8.2%
| 4 | 8.2%
| 4 | 8.2%
| 4 | 8.2%
| 2 | 4.1%
| 2 | 4.1%
| 2 | 4.1%
| 2 | 4.1%
| 2 | 4.1%
| 2 | 4.1%
Other values (21) | 21 | 42.9%

Common
Value | Count | Frequency (%)
(space) | 1344 | 80.9%
. | 134 | 8.1%
( | 34 | 2.0%
) | 34 | 2.0%
& | 30 | 1.8%
, | 28 | 1.7%
/ | 15 | 0.9%
2 | 7 | 0.4%
1 | 6 | 0.4%
0 | 5 | 0.3%
Other values (14) | 25 | 1.5%

Cyrillic
Value | Count | Frequency (%)
а | 5 | 13.9%
с | 4 | 11.1%
и | 4 | 11.1%
т | 4 | 11.1%
к | 3 | 8.3%
м | 3 | 8.3%
я | 2 | 5.6%
у | 2 | 5.6%
н | 2 | 5.6%
з | 2 | 5.6%
Other values (5) | 5 | 13.9%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 12768 | 98.2%
Hangul | 142 | 1.1%
CJK | 49 | 0.4%
Cyrillic | 36 | 0.3%
None | 1 | < 0.1%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 1344 | 10.5%
I | 861 | 6.7%
A | 858 | 6.7%
E | 790 | 6.2%
T | 784 | 6.1%
N | 731 | 5.7%
O | 624 | 4.9%
R | 514 | 4.0%
S | 487 | 3.8%
L | 426 | 3.3%
Other values (65) | 5349 | 41.9%

Hangul
Value | Count | Frequency (%)
| 9 | 6.3%
| 7 | 4.9%
| 6 | 4.2%
| 6 | 4.2%
| 5 | 3.5%
| 5 | 3.5%
| 4 | 2.8%
| 4 | 2.8%
| 4 | 2.8%
| 4 | 2.8%
Other values (66) | 88 | 62.0%

Cyrillic
Value | Count | Frequency (%)
а | 5 | 13.9%
с | 4 | 11.1%
и | 4 | 11.1%
т | 4 | 11.1%
к | 3 | 8.3%
м | 3 | 8.3%
я | 2 | 5.6%
у | 2 | 5.6%
н | 2 | 5.6%
з | 2 | 5.6%
Other values (5) | 5 | 13.9%

CJK
Value | Count | Frequency (%)
| 4 | 8.2%
| 4 | 8.2%
| 4 | 8.2%
| 4 | 8.2%
| 2 | 4.1%
| 2 | 4.1%
| 2 | 4.1%
| 2 | 4.1%
| 2 | 4.1%
| 2 | 4.1%
Other values (21) | 21 | 42.9%

None
Value | Count | Frequency (%)
½ | 1 | 100.0%

홈페이지
Text

MISSING 

Distinct: 216
Distinct (%): 100.0%
Missing: 424
Missing (%): 66.2%
Memory size: 5.1 KiB

Length

Max length: 44
Median length: 31
Mean length: 18.722222
Min length: 3

Characters and Unicode

Total characters: 4044
Distinct characters: 54
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 216
Unique (%): 100.0%

Sample

1st row: www.khurshid.tv
2nd row: www.tolo.tv
3rd row: www.tring.tv
4th row: www.entv.dz
5th row: www.iaa.bh

Value | Count | Frequency (%)
www.communicationnalingala.net | 1 | 0.5%
www.toonmaxmedia.comwww.toonmax.com | 1 | 0.5%
www.jamojoy.com | 1 | 0.5%
www.kingsoft.com | 1 | 0.5%
http://www.katoprod.com | 1 | 0.5%
www.prisa.com | 1 | 0.5%
www.mtg.com | 1 | 0.5%
www.alnaharlive.net | 1 | 0.5%
www.emsorg.com | 1 | 0.5%
www.rotana.net | 1 | 0.5%
Other values (206) | 206 | 95.4%

Most occurring characters

Value | Count | Frequency (%)
w | 616 | 15.2%
. | 491 | 12.1%
o | 295 | 7.3%
t | 264 | 6.5%
a | 237 | 5.9%
m | 231 | 5.7%
c | 221 | 5.5%
r | 172 | 4.3%
i | 169 | 4.2%
e | 164 | 4.1%
Other values (44) | 1184 | 29.3%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 3354 | 82.9%
Other Punctuation | 636 | 15.7%
Uppercase Letter | 26 | 0.6%
Decimal Number | 21 | 0.5%
Space Separator | 5 | 0.1%
Math Symbol | 2 | < 0.1%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
w | 616 | 18.4%
o | 295 | 8.8%
t | 264 | 7.9%
a | 237 | 7.1%
m | 231 | 6.9%
c | 221 | 6.6%
r | 172 | 5.1%
i | 169 | 5.0%
e | 164 | 4.9%
n | 152 | 4.5%
Other values (16) | 833 | 24.8%

Uppercase Letter
Value | Count | Frequency (%)
N | 3 | 11.5%
W | 3 | 11.5%
O | 3 | 11.5%
M | 3 | 11.5%
E | 3 | 11.5%
C | 2 | 7.7%
P | 1 | 3.8%
V | 1 | 3.8%
S | 1 | 3.8%
R | 1 | 3.8%
Other values (5) | 5 | 19.2%

Decimal Number
Value | Count | Frequency (%)
0 | 6 | 28.6%
2 | 5 | 23.8%
1 | 3 | 14.3%
3 | 2 | 9.5%
7 | 2 | 9.5%
5 | 1 | 4.8%
8 | 1 | 4.8%
9 | 1 | 4.8%

Other Punctuation
Value | Count | Frequency (%)
. | 491 | 77.2%
/ | 112 | 17.6%
: | 33 | 5.2%

Space Separator
Value | Count | Frequency (%)
(space) | 5 | 100.0%

Math Symbol
Value | Count | Frequency (%)
= | 2 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 3380 | 83.6%
Common | 664 | 16.4%

Most frequent character per script

Latin
Value | Count | Frequency (%)
w | 616 | 18.2%
o | 295 | 8.7%
t | 264 | 7.8%
a | 237 | 7.0%
m | 231 | 6.8%
c | 221 | 6.5%
r | 172 | 5.1%
i | 169 | 5.0%
e | 164 | 4.9%
n | 152 | 4.5%
Other values (31) | 859 | 25.4%

Common
Value | Count | Frequency (%)
. | 491 | 73.9%
/ | 112 | 16.9%
: | 33 | 5.0%
0 | 6 | 0.9%
2 | 5 | 0.8%
(space) | 5 | 0.8%
1 | 3 | 0.5%
3 | 2 | 0.3%
7 | 2 | 0.3%
= | 2 | 0.3%
Other values (3) | 3 | 0.5%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 4044 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
w | 616 | 15.2%
. | 491 | 12.1%
o | 295 | 7.3%
t | 264 | 6.5%
a | 237 | 5.9%
m | 231 | 5.7%
c | 221 | 5.5%
r | 172 | 4.3%
i | 169 | 4.2%
e | 164 | 4.1%
Other values (44) | 1184 | 29.3%

Correlations

| 국가 | 장르 | 기업형태
국가 | 1.000 | 0.000 | 0.815
장르 | 0.000 | 1.000 | 0.913
기업형태 | 0.815 | 0.913 | 1.000

| 기업형태 | 장르
기업형태 | 1.000 | 0.733
장르 | 0.733 | 1.000

| 장르 | 기업형태
장르 | 1.000 | 0.733
기업형태 | 0.733 | 1.000
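
To make the idea of correlation between two categorical columns concrete, the sketch below computes Cramér's V, a common chi-squared-based association measure, on the ten (장르, 기업형태) pairs from the first rows of the sample table. Whether the matrices above use exactly this formula (for example, with or without bias correction) is an assumption, and ten rows are far too few to reproduce the reported values; `cramers_v` is an illustrative helper.

```python
import math
from collections import Counter

def cramers_v(pairs):
    """Uncorrected Cramér's V for a list of (x, y) category pairs."""
    n = len(pairs)
    joint = Counter(pairs)
    row_totals = Counter(x for x, _ in pairs)
    col_totals = Counter(y for _, y in pairs)
    # chi2 = n * (sum_ij O_ij^2 / (R_i * C_j) - 1); empty cells contribute nothing.
    chi2 = n * (sum(o * o / (row_totals[x] * col_totals[y])
                    for (x, y), o in joint.items()) - 1)
    k = min(len(row_totals), len(col_totals)) - 1
    return math.sqrt(chi2 / (n * k)) if k > 0 else 0.0

# (장르, 기업형태) for rows 0-9 of the sample; "<NA>" is kept as its own category.
pairs = [
    ("방송", "케이블/위성 채널"), ("방송", "뉴미디어 플랫폼"), ("방송", "퍼블리셔"),
    ("<NA>", "<NA>"), ("<NA>", "<NA>"), ("방송", "공공기관/조직"),
    ("방송", "케이블/위성 채널"), ("방송", "케이블/위성 채널"),
    ("방송", "라이선싱"), ("방송", "케이블/위성 채널"),
]
print(round(cramers_v(pairs), 3))  # 1.0
```

In these ten rows 장르 is fully determined by 기업형태, so V reaches 1; the rows where both columns are `<NA>` contribute to that.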

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
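
A text-only approximation of these two views can be built from per-column missing counts and a row-by-row presence grid. The sketch below uses the ten first rows of the sample table, with `None` standing in for `<NA>`; it only illustrates the idea and does not reproduce the plots in the report.

```python
columns = ["국가", "장르", "기업형태", "기업명", "홈페이지"]

# Rows 0-9 of the sample table; None marks a missing value.
rows = [
    ("AFGHANISTAN", "방송", "케이블/위성 채널", "KHURSHID TV", "www.khurshid.tv"),
    ("AFGHANISTAN", "방송", "뉴미디어 플랫폼", "TOLO TV", "www.tolo.tv"),
    ("ALBANIA", "방송", "퍼블리셔", "TRING TV SH.A", "www.tring.tv"),
    ("Algeria", None, None, "Echorouk TV", None),
    ("Algeria", None, None, "EL KHABAR TV", None),
    ("ALGERIA", "방송", "공공기관/조직",
     "EPTA ETABLISSEMENT PUBLIC DE TELEVISION ALGERIENNE", "www.entv.dz"),
    ("AUSTRIA", "방송", "케이블/위성 채널", "SERVUS TV", None),
    ("AUSTRIA", "방송", "케이블/위성 채널", "TELEKOM AUSTRIA GROUP", None),
    ("AUSTRIA", "방송", "라이선싱", "PULS 4 TV GMBH & CO", None),
    ("AUSTRIA", "방송", "케이블/위성 채널", "ORF AUSTRIAN BROADCASTING CORPORATION", None),
]

# Nullity by column: how many values are missing in each column.
missing_per_column = {
    name: sum(row[i] is None for row in rows) for i, name in enumerate(columns)
}

# Nullity matrix: one line per row, '#' = present, '.' = missing.
nullity_matrix = ["".join("." if v is None else "#" for v in row) for row in rows]

print(missing_per_column)
print("\n".join(nullity_matrix))
```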

Sample

| 국가 | 장르 | 기업형태 | 기업명 | 홈페이지
0 | AFGHANISTAN | 방송 | 케이블/위성 채널 | KHURSHID TV | www.khurshid.tv
1 | AFGHANISTAN | 방송 | 뉴미디어 플랫폼 | TOLO TV | www.tolo.tv
2 | ALBANIA | 방송 | 퍼블리셔 | TRING TV SH.A | www.tring.tv
3 | Algeria | <NA> | <NA> | Echorouk TV | <NA>
4 | Algeria | <NA> | <NA> | EL KHABAR TV | <NA>
5 | ALGERIA | 방송 | 공공기관/조직 | EPTA ETABLISSEMENT PUBLIC DE TELEVISION ALGERIENNE | www.entv.dz
6 | AUSTRIA | 방송 | 케이블/위성 채널 | SERVUS TV | <NA>
7 | AUSTRIA | 방송 | 케이블/위성 채널 | TELEKOM AUSTRIA GROUP | <NA>
8 | AUSTRIA | 방송 | 라이선싱 | PULS 4 TV GMBH & CO | <NA>
9 | AUSTRIA | 방송 | 케이블/위성 채널 | ORF AUSTRIAN BROADCASTING CORPORATION | <NA>

| 국가 | 장르 | 기업형태 | 기업명 | 홈페이지
630 | 터키 | 방송 | 뉴미디어 플랫폼 | VODAFONE TURKEY | www.vodafone.com.tr
631 | 터키 | 방송 | 케이블/위성 채널 | WORLD TRAVEL CHANNEL | www.worldtravelchannel.com
632 | 터키 | 방송 | 케이블/위성 채널 | YABAN TV | www.yabantv.com
633 | 터키 | 방송 | 뉴미디어 플랫폼 | YOUTUBE TURKEY | www.google.com.tr
634 | 파라과이 | <NA> | <NA> | Septimo Arte SA | https://www.facebook.com/MediagroupPYfref=ts
635 | 파라과이 | <NA> | <NA> | Mediagroup SRL | www.mediagroup.com.py
636 | 파라과이 | <NA> | <NA> | T J L S.A. | <NA>
637 | 파라과이 | <NA> | <NA> | Shopping China | www.shoppingchina.com.py
638 | 프랑스 | <NA> | <NA> | MOBIBASE France | <NA>
639 | 프랑스 | 방송 | 케이블/위성 채널 | ARQIVA FRANCE | www.arqiva.com/