Overview

Dataset statistics

Number of variables: 4
Number of observations: 515
Missing cells: 251
Missing cells (%): 12.2%
Duplicate rows: 3
Duplicate rows (%): 0.6%
Total size in memory: 16.7 KiB
Average record size in memory: 33.3 B
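
The two percentages follow directly from the counts. A minimal sketch of that arithmetic, assuming (as the figures suggest) that the missing-cell percentage is taken over all cells (variables × observations) and the duplicate percentage over all rows; the variable names are illustrative only:

```python
# Counts copied from the dataset statistics above.
n_variables = 4
n_observations = 515
missing_cells = 251
duplicate_rows = 3

total_cells = n_variables * n_observations  # every cell in the table

# Percentages rounded to one decimal place, as in the report.
missing_pct = round(100 * missing_cells / total_cells, 1)
duplicate_pct = round(100 * duplicate_rows / n_observations, 1)

print(f"Missing cells (%): {missing_pct}")
print(f"Duplicate rows (%): {duplicate_pct}")
```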

Variable types

Text: 3
Numeric: 1

Dataset

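Description: Data on the learning common codes used by the National Lifelong Learning Portal 늘배움 (Neulbaeum). It provides the learning common code (학습공통코드), its name (학습공통코드명), its description (학습공통코드설명), and related information.
Author: 국가평생교육진흥원 (National Institute for Lifelong Education)
URL: https://www.data.go.kr/data/15091739/fileData.do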

Alerts

Dataset has 3 (0.6%) duplicate rows [Duplicates]
부모코드 has 250 (48.5%) missing values [Missing]

Reproduction

Analysis started: 2023-12-12 16:21:56.340189
Analysis finished: 2023-12-12 16:21:57.408473
Duration: 1.07 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
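
The reported duration is the difference between the two timestamps. A small check with Python's standard `datetime` module; the variable names are illustrative:

```python
from datetime import datetime

# Timestamps copied from the Reproduction section above.
started = datetime.fromisoformat("2023-12-12 16:21:56.340189")
finished = datetime.fromisoformat("2023-12-12 16:21:57.408473")

# Elapsed wall-clock time in seconds.
duration = (finished - started).total_seconds()
print(f"Duration: {duration:.2f} seconds")
```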

Variables

학습공통코드
Text

Distinct: 454
Distinct (%): 88.3%
Missing: 1
Missing (%): 0.2%
Memory size: 4.2 KiB

Length

Max length: 6
Median length: 5
Mean length: 2.9902724
Min length: 1
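
The mean length is consistent with the total character count reported for this variable (1537) divided by the number of non-missing values (515 observations minus 1 missing). A quick check, assuming that definition:

```python
# Figures copied from this variable's statistics.
total_characters = 1537
n_observations = 515
n_missing = 1

# Average number of characters per non-missing value.
mean_length = total_characters / (n_observations - n_missing)
print(round(mean_length, 7))
```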

Characters and Unicode

Total characters: 1537
Distinct characters: 37
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
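
As a rough illustration of how such per-category counts can be derived, the sketch below tallies Unicode general categories with Python's standard `unicodedata` module. The sample strings are hypothetical stand-ins, not values taken from the dataset, and this is not necessarily how ydata-profiling computes its figures.

```python
import unicodedata
from collections import Counter

# Hypothetical code-like strings, used only for illustration.
samples = ["4373", "A-1", "KR"]

# Tally the general category of every character:
# 'Nd' = Decimal Number, 'Lu' = Uppercase Letter, 'Pd' = Dash Punctuation.
category_counts = Counter(
    unicodedata.category(ch) for text in samples for ch in text
)

for category, count in category_counts.most_common():
    print(category, count)
```

Hangul syllables fall under 'Lo' (Other Letter), which is why that category dominates the name and description variables.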

Unique

Unique: 429
Unique (%): 83.5%

Sample

1st row: 4373
2nd row: 4374
3rd row: 4375
4th row: 4376
5th row: 4377

Value | Count | Frequency (%)
10 | 9 | 1.8%
20 | 9 | 1.8%
30 | 8 | 1.6%
40 | 5 | 1.0%
50 | 4 | 0.8%
100 | 4 | 0.8%
200 | 4 | 0.8%
60 | 4 | 0.8%
80 | 3 | 0.6%
300 | 3 | 0.6%
Other values (444) | 461 | 89.7%

Most occurring characters

Value | Count | Frequency (%)
1 | 267 | 17.4%
4 | 219 | 14.2%
2 | 166 | 10.8%
0 | 132 | 8.6%
7 | 109 | 7.1%
3 | 95 | 6.2%
8 | 80 | 5.2%
6 | 71 | 4.6%
5 | 65 | 4.2%
9 | 41 | 2.7%
Other values (27) | 292 | 19.0%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 1245 | 81.0%
Uppercase Letter | 290 | 18.9%
Dash Punctuation | 2 | 0.1%

Most frequent character per category

Uppercase Letter
Value | Count | Frequency (%)
S | 27 | 9.3%
A | 23 | 7.9%
T | 21 | 7.2%
O | 19 | 6.6%
R | 16 | 5.5%
K | 16 | 5.5%
M | 16 | 5.5%
I | 15 | 5.2%
L | 14 | 4.8%
N | 14 | 4.8%
Other values (16) | 109 | 37.6%

Decimal Number
Value | Count | Frequency (%)
1 | 267 | 21.4%
4 | 219 | 17.6%
2 | 166 | 13.3%
0 | 132 | 10.6%
7 | 109 | 8.8%
3 | 95 | 7.6%
8 | 80 | 6.4%
6 | 71 | 5.7%
5 | 65 | 5.2%
9 | 41 | 3.3%

Dash Punctuation
Value | Count | Frequency (%)
- | 2 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 1247 | 81.1%
Latin | 290 | 18.9%

Most frequent character per script

Latin
Value | Count | Frequency (%)
S | 27 | 9.3%
A | 23 | 7.9%
T | 21 | 7.2%
O | 19 | 6.6%
R | 16 | 5.5%
K | 16 | 5.5%
M | 16 | 5.5%
I | 15 | 5.2%
L | 14 | 4.8%
N | 14 | 4.8%
Other values (16) | 109 | 37.6%

Common
Value | Count | Frequency (%)
1 | 267 | 21.4%
4 | 219 | 17.6%
2 | 166 | 13.3%
0 | 132 | 10.6%
7 | 109 | 8.7%
3 | 95 | 7.6%
8 | 80 | 6.4%
6 | 71 | 5.7%
5 | 65 | 5.2%
9 | 41 | 3.3%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 1537 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
1 | 267 | 17.4%
4 | 219 | 14.2%
2 | 166 | 10.8%
0 | 132 | 8.6%
7 | 109 | 7.1%
3 | 95 | 6.2%
8 | 80 | 5.2%
6 | 71 | 4.6%
5 | 65 | 4.2%
9 | 41 | 2.7%
Other values (27) | 292 | 19.0%

학습공통코드명
Text

Distinct: 478
Distinct (%): 92.8%
Missing: 0
Missing (%): 0.0%
Memory size: 4.2 KiB

Length

Max length: 28
Median length: 27
Mean length: 5.1495146
Min length: 2

Characters and Unicode

Total characters: 2652
Distinct characters: 287
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 462
Unique (%): 89.7%

Sample

1st row: 옥천군
2nd row: 영동군
3rd row: 진천군
4th row: 괴산군
5th row: 음성군

Value | Count | Frequency (%)
프로그램 | 19 | 3.2%
연령 | 9 | 1.5%
관련 | 7 | 1.2%
시설 | 7 | 1.2%
동구 | 6 | 1.0%
(unrecovered) | 6 | 1.0%
서구 | 5 | 0.8%
중구 | 5 | 0.8%
제주도 | 4 | 0.7%
남구 | 4 | 0.7%
Other values (488) | 517 | 87.8%

Most occurring characters

Value | Count | Frequency (%)
a | 155 | 5.8%
i | 106 | 4.0%
(unrecovered) | 105 | 4.0%
n | 102 | 3.8%
(unrecovered) | 83 | 3.1%
e | 76 | 2.9%
(space) | 74 | 2.8%
(unrecovered) | 73 | 2.8%
r | 59 | 2.2%
s | 52 | 2.0%
Other values (277) | 1767 | 66.6%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 1388 | 52.3%
Lowercase Letter | 905 | 34.1%
Uppercase Letter | 207 | 7.8%
Space Separator | 74 | 2.8%
Dash Punctuation | 18 | 0.7%
Other Punctuation | 18 | 0.7%
Decimal Number | 14 | 0.5%
Open Punctuation | 13 | 0.5%
Close Punctuation | 13 | 0.5%
Math Symbol | 2 | 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(unrecovered) | 105 | 7.6%
(unrecovered) | 83 | 6.0%
(unrecovered) | 73 | 5.3%
(unrecovered) | 32 | 2.3%
(unrecovered) | 26 | 1.9%
(unrecovered) | 24 | 1.7%
(unrecovered) | 22 | 1.6%
(unrecovered) | 21 | 1.5%
(unrecovered) | 21 | 1.5%
(unrecovered) | 21 | 1.5%
Other values (211) | 960 | 69.2%

Uppercase Letter
Value | Count | Frequency (%)
S | 21 | 10.1%
B | 19 | 9.2%
T | 17 | 8.2%
A | 14 | 6.8%
N | 12 | 5.8%
C | 12 | 5.8%
G | 11 | 5.3%
O | 9 | 4.3%
K | 9 | 4.3%
Y | 9 | 4.3%
Other values (16) | 74 | 35.7%

Lowercase Letter
Value | Count | Frequency (%)
a | 155 | 17.1%
i | 106 | 11.7%
n | 102 | 11.3%
e | 76 | 8.4%
r | 59 | 6.5%
s | 52 | 5.7%
o | 49 | 5.4%
l | 42 | 4.6%
h | 38 | 4.2%
u | 37 | 4.1%
Other values (15) | 189 | 20.9%

Decimal Number
Value | Count | Frequency (%)
0 | 6 | 42.9%
4 | 2 | 14.3%
8 | 1 | 7.1%
5 | 1 | 7.1%
3 | 1 | 7.1%
2 | 1 | 7.1%
6 | 1 | 7.1%
1 | 1 | 7.1%

Other Punctuation
Value | Count | Frequency (%)
, | 12 | 66.7%
/ | 6 | 33.3%

Space Separator
Value | Count | Frequency (%)
(space) | 74 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 18 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 13 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 13 | 100.0%

Math Symbol
Value | Count | Frequency (%)
~ | 2 | 100.0%

Most occurring scripts

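Value | Count | Frequency (%)
Hangul | 1388 | 52.3%
Latin | 1112 | 41.9%
Common | 152 | 5.7%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
(unrecovered) | 105 | 7.6%
(unrecovered) | 83 | 6.0%
(unrecovered) | 73 | 5.3%
(unrecovered) | 32 | 2.3%
(unrecovered) | 26 | 1.9%
(unrecovered) | 24 | 1.7%
(unrecovered) | 22 | 1.6%
(unrecovered) | 21 | 1.5%
(unrecovered) | 21 | 1.5%
(unrecovered) | 21 | 1.5%
Other values (211) | 960 | 69.2%

Latin
Value | Count | Frequency (%)
a | 155 | 13.9%
i | 106 | 9.5%
n | 102 | 9.2%
e | 76 | 6.8%
r | 59 | 5.3%
s | 52 | 4.7%
o | 49 | 4.4%
l | 42 | 3.8%
h | 38 | 3.4%
u | 37 | 3.3%
Other values (41) | 396 | 35.6%

Common
Value | Count | Frequency (%)
(space) | 74 | 48.7%
- | 18 | 11.8%
( | 13 | 8.6%
) | 13 | 8.6%
, | 12 | 7.9%
/ | 6 | 3.9%
0 | 6 | 3.9%
4 | 2 | 1.3%
~ | 2 | 1.3%
8 | 1 | 0.7%
Other values (5) | 5 | 3.3%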

Most occurring blocks

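Value | Count | Frequency (%)
Hangul | 1388 | 52.3%
ASCII | 1264 | 47.7%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
a | 155 | 12.3%
i | 106 | 8.4%
n | 102 | 8.1%
e | 76 | 6.0%
(space) | 74 | 5.9%
r | 59 | 4.7%
s | 52 | 4.1%
o | 49 | 3.9%
l | 42 | 3.3%
h | 38 | 3.0%
Other values (56) | 511 | 40.4%

Hangul
Value | Count | Frequency (%)
(unrecovered) | 105 | 7.6%
(unrecovered) | 83 | 6.0%
(unrecovered) | 73 | 5.3%
(unrecovered) | 32 | 2.3%
(unrecovered) | 26 | 1.9%
(unrecovered) | 24 | 1.7%
(unrecovered) | 22 | 1.6%
(unrecovered) | 21 | 1.5%
(unrecovered) | 21 | 1.5%
(unrecovered) | 21 | 1.5%
Other values (211) | 960 | 69.2%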

학습공통코드설명
Text

Distinct: 507
Distinct (%): 98.4%
Missing: 0
Missing (%): 0.0%
Memory size: 4.2 KiB

Length

Max length: 28
Median length: 24
Mean length: 7.4058252
Min length: 2

Characters and Unicode

Total characters: 3814
Distinct characters: 290
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 499
Unique (%): 96.9%

Sample

1st row: 충청북도 옥천군
2nd row: 충청북도 영동군
3rd row: 충청북도 진천군
4th row: 충청북도 괴산군
5th row: 충청북도 음성군

Value | Count | Frequency (%)
경기도 | 36 | 4.4%
서울특별시 | 25 | 3.1%
경상북도 | 23 | 2.8%
전라남도 | 22 | 2.7%
프로그램 | 19 | 2.3%
강원도 | 18 | 2.2%
경상남도 | 17 | 2.1%
부산광역시 | 15 | 1.8%
전라북도 | 14 | 1.7%
충청남도 | 12 | 1.5%
Other values (513) | 614 | 75.3%

Most occurring characters

Value | Count | Frequency (%)
(space) | 300 | 7.9%
(unrecovered) | 177 | 4.6%
(unrecovered) | 166 | 4.4%
a | 155 | 4.1%
i | 106 | 2.8%
n | 102 | 2.7%
(unrecovered) | 90 | 2.4%
(unrecovered) | 83 | 2.2%
(unrecovered) | 81 | 2.1%
e | 76 | 2.0%
Other values (280) | 2478 | 65.0%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 2302 | 60.4%
Lowercase Letter | 905 | 23.7%
Space Separator | 300 | 7.9%
Uppercase Letter | 207 | 5.4%
Decimal Number | 36 | 0.9%
Dash Punctuation | 18 | 0.5%
Other Punctuation | 18 | 0.5%
Close Punctuation | 13 | 0.3%
Open Punctuation | 13 | 0.3%
Math Symbol | 2 | 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(unrecovered) | 177 | 7.7%
(unrecovered) | 166 | 7.2%
(unrecovered) | 90 | 3.9%
(unrecovered) | 83 | 3.6%
(unrecovered) | 81 | 3.5%
(unrecovered) | 67 | 2.9%
(unrecovered) | 59 | 2.6%
(unrecovered) | 57 | 2.5%
(unrecovered) | 48 | 2.1%
(unrecovered) | 47 | 2.0%
Other values (212) | 1427 | 62.0%

Uppercase Letter
Value | Count | Frequency (%)
S | 21 | 10.1%
B | 19 | 9.2%
T | 17 | 8.2%
A | 14 | 6.8%
N | 12 | 5.8%
C | 12 | 5.8%
G | 11 | 5.3%
K | 9 | 4.3%
Y | 9 | 4.3%
M | 9 | 4.3%
Other values (16) | 74 | 35.7%

Lowercase Letter
Value | Count | Frequency (%)
a | 155 | 17.1%
i | 106 | 11.7%
n | 102 | 11.3%
e | 76 | 8.4%
r | 59 | 6.5%
s | 52 | 5.7%
o | 49 | 5.4%
l | 42 | 4.6%
h | 38 | 4.2%
u | 37 | 4.1%
Other values (15) | 189 | 20.9%

Decimal Number
Value | Count | Frequency (%)
0 | 15 | 41.7%
5 | 6 | 16.7%
4 | 3 | 8.3%
6 | 2 | 5.6%
3 | 2 | 5.6%
1 | 2 | 5.6%
2 | 2 | 5.6%
8 | 2 | 5.6%
9 | 1 | 2.8%
7 | 1 | 2.8%

Other Punctuation
Value | Count | Frequency (%)
, | 12 | 66.7%
/ | 6 | 33.3%

Space Separator
Value | Count | Frequency (%)
(space) | 300 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 18 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 13 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 13 | 100.0%

Math Symbol
Value | Count | Frequency (%)
~ | 2 | 100.0%

Most occurring scripts

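Value | Count | Frequency (%)
Hangul | 2302 | 60.4%
Latin | 1112 | 29.2%
Common | 400 | 10.5%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
(unrecovered) | 177 | 7.7%
(unrecovered) | 166 | 7.2%
(unrecovered) | 90 | 3.9%
(unrecovered) | 83 | 3.6%
(unrecovered) | 81 | 3.5%
(unrecovered) | 67 | 2.9%
(unrecovered) | 59 | 2.6%
(unrecovered) | 57 | 2.5%
(unrecovered) | 48 | 2.1%
(unrecovered) | 47 | 2.0%
Other values (212) | 1427 | 62.0%

Latin
Value | Count | Frequency (%)
a | 155 | 13.9%
i | 106 | 9.5%
n | 102 | 9.2%
e | 76 | 6.8%
r | 59 | 5.3%
s | 52 | 4.7%
o | 49 | 4.4%
l | 42 | 3.8%
h | 38 | 3.4%
u | 37 | 3.3%
Other values (41) | 396 | 35.6%

Common
Value | Count | Frequency (%)
(space) | 300 | 75.0%
- | 18 | 4.5%
0 | 15 | 3.8%
) | 13 | 3.2%
( | 13 | 3.2%
, | 12 | 3.0%
5 | 6 | 1.5%
/ | 6 | 1.5%
4 | 3 | 0.8%
6 | 2 | 0.5%
Other values (7) | 12 | 3.0%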

Most occurring blocks

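Value | Count | Frequency (%)
Hangul | 2302 | 60.4%
ASCII | 1512 | 39.6%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 300 | 19.8%
a | 155 | 10.3%
i | 106 | 7.0%
n | 102 | 6.7%
e | 76 | 5.0%
r | 59 | 3.9%
s | 52 | 3.4%
o | 49 | 3.2%
l | 42 | 2.8%
h | 38 | 2.5%
Other values (58) | 533 | 35.3%

Hangul
Value | Count | Frequency (%)
(unrecovered) | 177 | 7.7%
(unrecovered) | 166 | 7.2%
(unrecovered) | 90 | 3.9%
(unrecovered) | 83 | 3.6%
(unrecovered) | 81 | 3.5%
(unrecovered) | 67 | 2.9%
(unrecovered) | 59 | 2.6%
(unrecovered) | 57 | 2.5%
(unrecovered) | 48 | 2.1%
(unrecovered) | 47 | 2.0%
Other values (212) | 1427 | 62.0%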

부모코드
Real number (ℝ)

MISSING 

Distinct: 22
Distinct (%): 8.3%
Missing: 250
Missing (%): 48.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 34.811321
Minimum: 1
Maximum: 60
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.7 KiB

Quantile statistics

Minimum: 1
5-th percentile: 2
Q1: 27
median: 41
Q3: 46
95-th percentile: 48
Maximum: 60
Range: 59
Interquartile range (IQR): 19

Descriptive statistics

Standard deviation: 14.527999
Coefficient of variation (CV): 0.41733547
Kurtosis: -0.13894509
Mean: 34.811321
Median Absolute Deviation (MAD): 6
Skewness: -1.0087721
Sum: 9225
Variance: 211.06275
Monotonicity: Not monotonic
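
Several of these figures are tied together by simple identities: the mean is the sum over the non-missing count, the variance is the squared standard deviation, the CV is the standard deviation divided by the mean, and the range and IQR are differences of quantiles. A small consistency check using the reported values; the non-missing count of 265 is inferred as 515 observations minus 250 missing, and the variable names are illustrative:

```python
# Values copied from the statistics for 부모코드.
total = 9225               # Sum
n_non_missing = 515 - 250  # observations minus missing values
std_dev = 14.527999        # Standard deviation
q1, q3 = 27, 46
minimum, maximum = 1, 60

mean = total / n_non_missing
cv = std_dev / mean            # coefficient of variation
variance = std_dev ** 2
iqr = q3 - q1                  # interquartile range
value_range = maximum - minimum

print(f"mean={mean:.6f} cv={cv:.8f} variance={variance:.5f}")
print(f"iqr={iqr} range={value_range}")
```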
Histogram with fixed size bins (bins=22)
Value | Count | Frequency (%)
41 | 36 | 7.0%
11 | 25 | 4.9%
47 | 23 | 4.5%
46 | 22 | 4.3%
42 | 18 | 3.5%
48 | 17 | 3.3%
26 | 15 | 2.9%
45 | 14 | 2.7%
44 | 11 | 2.1%
2 | 11 | 2.1%
Other values (12) | 73 | 14.2%
(Missing) | 250 | 48.5%

Value | Count | Frequency (%)
1 | 6 | 1.2%
2 | 11 | 2.1%
10 | 3 | 0.6%
11 | 25 | 4.9%
20 | 3 | 0.6%
26 | 15 | 2.9%
27 | 8 | 1.6%
28 | 10 | 1.9%
29 | 5 | 1.0%
30 | 9 | 1.7%

Value | Count | Frequency (%)
60 | 3 | 0.6%
50 | 5 | 1.0%
48 | 17 | 3.3%
47 | 23 | 4.5%
46 | 22 | 4.3%
45 | 14 | 2.7%
44 | 11 | 2.1%
43 | 11 | 2.1%
42 | 18 | 3.5%
41 | 36 | 7.0%

Interactions


Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
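
To make the idea of nullity correlation concrete, the sketch below computes the Pearson correlation between the missing-value indicators of two columns. The two columns are hypothetical toy data, not taken from this dataset, and the helper `pearson` is written out by hand purely for illustration.

```python
import math

def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical columns; None marks a missing value.
col_a = [1, None, 3, None, 5, 6]
col_b = ["x", None, "y", None, "z", None]

# 1 where the value is missing, 0 where it is present.
null_a = [int(v is None) for v in col_a]
null_b = [int(v is None) for v in col_b]

nullity_corr = pearson(null_a, null_b)
print(round(nullity_corr, 3))
```

A value near 1 means the two columns tend to be missing in the same rows, a value near -1 means one tends to be present when the other is missing, and a value near 0 means the missingness patterns are unrelated.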

Sample

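(index) | 학습공통코드 | 학습공통코드명 | 학습공통코드설명 | 부모코드
0 | 4373 | 옥천군 | 충청북도 옥천군 | 43
1 | 4374 | 영동군 | 충청북도 영동군 | 43
2 | 4375 | 진천군 | 충청북도 진천군 | 43
3 | 4376 | 괴산군 | 충청북도 괴산군 | 43
4 | 4377 | 음성군 | 충청북도 음성군 | 43
5 | 4380 | 단양군 | 충청남도 단양군 | 43
6 | 4413 | 천안시 | 충청남도 천안시 | 44
7 | 4415 | 공주시 | 충청남도 공주시 | 44
8 | 4418 | 보령시 | 충청남도 보령시 | 44
9 | 4420 | 아산시 | 충청남도 아산시 | 44

(index) | 학습공통코드 | 학습공통코드명 | 학습공통코드설명 | 부모코드
505 | 36 | 세종 | 시도 | <NA>
506 | 114 | 외국어 자격증 | 외국어 자격증 | <NA>
507 | 115 | 직무능력향상교육 | 직무능력향상교육 | <NA>
508 | 116 | 4차산업혁명 | 4차산업혁명 | <NA>
509 | 117 | 컴퓨터 | 컴퓨터 | <NA>
510 | 118 | 종교교육 | 종교교육 | <NA>
511 | 119 | 가정생활 | 가정생활 | <NA>
512 | 120 | 미술 | 미술 | <NA>
513 | 121 | 지도자 | 지도자 | <NA>
514 | 122 | 환경생태 | 환경생태 | <NA>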

Duplicate rows

Most frequently occurring

(index) | 학습공통코드 | 학습공통코드명 | 학습공통코드설명 | 부모코드 | # duplicates
0 | 10 | 직영 | 직영 | <NA> | 2
1 | 20 | 위탁 | 위탁 | <NA> | 2
2 | 30 | 병행 | 병행 | <NA> | 2
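
The duplicate figures can be reproduced by tallying identical rows. A minimal sketch using the three rows listed above, each entered twice to match the reported "# duplicates" value, with `None` standing in for `<NA>`; the variable names are illustrative:

```python
from collections import Counter

# Columns: (학습공통코드, 학습공통코드명, 학습공통코드설명, 부모코드)
rows = [
    ("10", "직영", "직영", None),
    ("20", "위탁", "위탁", None),
    ("30", "병행", "병행", None),
] * 2  # each row occurs twice

row_counts = Counter(rows)

# Rows that occur more than once, with their occurrence counts.
duplicated = {row: n for row, n in row_counts.items() if n > 1}

# Two ways to count duplicates: distinct duplicated rows, or surplus copies.
n_distinct_duplicated = len(duplicated)
n_surplus_copies = sum(n - 1 for n in duplicated.values())
print(n_distinct_duplicated, n_surplus_copies)
```

Because every duplicated row occurs exactly twice, both counts equal 3 here, so this sketch does not settle which of the two definitions the overview figure uses.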