Overview

Dataset statistics

Number of variables: 7
Number of observations: 199
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 11.4 KiB
Average record size in memory: 58.7 B
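
The average record size follows from the two figures above it: total memory divided by the number of observations. A minimal sketch of that arithmetic, assuming 1 KiB = 1024 bytes (the function and variable names are illustrative, not part of the report):

```python
# Figures copied from the Dataset statistics above.
total_size_kib = 11.4   # "Total size in memory", in KiB
n_observations = 199    # "Number of observations"

BYTES_PER_KIB = 1024    # assumption: KiB is the binary unit


def average_record_size(total_kib: float, n_rows: int) -> float:
    """Return the average in-memory size of one row, in bytes."""
    return total_kib * BYTES_PER_KIB / n_rows


avg_bytes = average_record_size(total_size_kib, n_observations)
print(f"{avg_bytes:.1f} B")  # 58.7 B
```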

Variable types

Categorical: 4
Text: 1
Numeric: 2

Alerts

Constant: 2020-01-01 has constant value "2020-01-01"
High correlation: [20-29] is highly overall correlated with 1 and 1 other fields
High correlation: 1 is highly overall correlated with [20-29]
High correlation: 1.1 is highly overall correlated with 11.111111
High correlation: 11.111111 is highly overall correlated with 1.1 and 1 other fields
High correlation: 09 is highly overall correlated with 11.111111 and 1 other fields
Imbalance: 1 is highly imbalanced (76.5%)
Imbalance: [20-29] is highly imbalanced (87.8%)
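
The two imbalance percentages can be reproduced from the value counts listed under Common Values for each variable. The sketch below assumes the score is one minus the Shannon entropy of the class distribution divided by its maximum possible entropy. This formula is an assumption that happens to match the reported figures; the report itself does not state how the score is computed.

```python
import math


def imbalance_score(counts):
    """Return 1 - (Shannon entropy / maximum entropy) for class counts.

    0.0 means perfectly balanced classes; values near 1.0 mean that one
    class dominates.
    """
    total = sum(counts)
    n_classes = len(counts)
    if n_classes < 2:
        # Assumption: with a single class there is nothing to normalise.
        return 0.0
    entropy = -sum((c / total) * math.log(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log(n_classes)


# Counts taken from the Common Values tables of this report.
counts_1 = [174, 17, 1, 1, 1, 1, 1, 1, 1, 1]   # variable "1"
counts_20_29 = [194, 3, 2]                      # variable "[20-29]"

print(f"{imbalance_score(counts_1):.1%}")       # 76.5%
print(f"{imbalance_score(counts_20_29):.1%}")   # 87.8%
```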

Reproduction

Analysis started: 2023-12-10 06:13:07.689101
Analysis finished: 2023-12-10 06:13:08.943473
Duration: 1.25 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
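
The reported duration is the difference between the two timestamps. A small sketch of that check using only the standard library (the variable names are illustrative):

```python
from datetime import datetime

# Timestamps copied from the Reproduction section.
started = datetime.fromisoformat("2023-12-10 06:13:07.689101")
finished = datetime.fromisoformat("2023-12-10 06:13:08.943473")

# Elapsed wall-clock time in seconds.
duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")  # 1.25 seconds
```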

Variables

2020-01-01
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 1.7 KiB

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2020-01-01
2nd row: 2020-01-01
3rd row: 2020-01-01
4th row: 2020-01-01
5th row: 2020-01-01

Common Values

Value  Count  Frequency (%)
2020-01-01  199  100.0%

Length

Histogram of lengths of the category

치즈케이크
Text

Distinct: 145
Distinct (%): 72.9%
Missing: 0
Missing (%): 0.0%
Memory size: 1.7 KiB

Length

Max length: 14
Median length: 12
Mean length: 5.6683417
Min length: 2

Characters and Unicode

Total characters: 1128
Distinct characters: 248
Distinct categories: 7
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
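
As an illustration of those character properties, the sketch below counts characters by Unicode general category with Python's standard `unicodedata` module. The long category names are the ones used in this report. The example string is hypothetical, and the standard library does not expose script or block properties, so only categories are shown.

```python
import unicodedata
from collections import Counter

# Long names for the general-category codes that appear in this report.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Zs": "Space Separator",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Nd": "Decimal Number",
    "Po": "Other Punctuation",
    "Ll": "Lowercase Letter",
}


def category_counts(text):
    """Count the characters of `text` by Unicode general category."""
    codes = (unicodedata.category(ch) for ch in text)
    return Counter(CATEGORY_NAMES.get(code, code) for code in codes)


# Hypothetical value, chosen only to exercise several categories.
example = "쌀밥 (5%)"
print(category_counts(example))
```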

Unique

Unique: 119
Unique (%): 59.8%
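
The report distinguishes Distinct (145) from Unique (119). A reading consistent with these numbers is that distinct counts the different values, while unique counts the values that occur exactly once. The sketch below illustrates that reading on a small hypothetical list; the helper name is an assumption, not part of the profiling tool.

```python
from collections import Counter


def distinct_and_unique(values):
    """Return (number of different values, number of values seen exactly once)."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for c in counts.values() if c == 1)
    return distinct, unique


# Hypothetical column: two values repeat, two occur once.
values = ["쌀밥", "쌀밥", "사과", "떡국", "떡국", "무생채"]
print(distinct_and_unique(values))  # (4, 2)

# The percentages in the report are relative to the 199 observations.
print(round(100 * 145 / 199, 1), round(100 * 119 / 199, 1))  # 72.9 59.8
```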

Sample

1st row: 무생채
2nd row: 채소샐러드
3rd row: 쌀밥
4th row: 잡곡밥
5th row: 고사리나물

Value  Count  Frequency (%)
쌀밥  10  4.4%
배추김치  7  3.1%
떡국  6  2.6%
사과  6  2.6%
떡만둣국  5  2.2%
김치볶음밥  4  1.8%
김구이  3  1.3%
플레인  3  1.3%
후렌치파이  3  1.3%
잡곡밥  3  1.3%
Other values (158)  177  78.0%

Most occurring characters

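Value  Count  Frequency (%)
[space]  227  20.1%
[character not preserved]  31  2.7%
[character not preserved]  27  2.4%
[character not preserved]  26  2.3%
[character not preserved]  23  2.0%
[character not preserved]  21  1.9%
(  18  1.6%
[character not preserved]  17  1.5%
)  16  1.4%
[character not preserved]  15  1.3%
Other values (238)  707  62.7%

Most occurring categories

Value  Count  Frequency (%)
Other Letter  850  75.4%
Space Separator  227  20.1%
Open Punctuation  18  1.6%
Close Punctuation  16  1.4%
Decimal Number  11  1.0%
Other Punctuation  4  0.4%
Lowercase Letter  2  0.2%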

Most frequent character per category

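Other Letter
Value  Count  Frequency (%)
[character not preserved]  31  3.6%
[character not preserved]  27  3.2%
[character not preserved]  26  3.1%
[character not preserved]  23  2.7%
[character not preserved]  21  2.5%
[character not preserved]  17  2.0%
[character not preserved]  15  1.8%
[character not preserved]  14  1.6%
[character not preserved]  14  1.6%
[character not preserved]  13  1.5%
Other values (226)  649  76.4%

Decimal Number
Value  Count  Frequency (%)
0  4  36.4%
9  3  27.3%
5  3  27.3%
8  1  9.1%

Other Punctuation
Value  Count  Frequency (%)
%  2  50.0%
&  1  25.0%
.  1  25.0%

Lowercase Letter
Value  Count  Frequency (%)
l  1  50.0%
m  1  50.0%

Space Separator
Value  Count  Frequency (%)
[space]  227  100.0%

Open Punctuation
Value  Count  Frequency (%)
(  18  100.0%

Close Punctuation
Value  Count  Frequency (%)
)  16  100.0%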

Most occurring scripts

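Value  Count  Frequency (%)
Hangul  850  75.4%
Common  276  24.5%
Latin  2  0.2%

Most frequent character per script

Hangul
Value  Count  Frequency (%)
[character not preserved]  31  3.6%
[character not preserved]  27  3.2%
[character not preserved]  26  3.1%
[character not preserved]  23  2.7%
[character not preserved]  21  2.5%
[character not preserved]  17  2.0%
[character not preserved]  15  1.8%
[character not preserved]  14  1.6%
[character not preserved]  14  1.6%
[character not preserved]  13  1.5%
Other values (226)  649  76.4%

Common
Value  Count  Frequency (%)
[space]  227  82.2%
(  18  6.5%
)  16  5.8%
0  4  1.4%
9  3  1.1%
5  3  1.1%
%  2  0.7%
&  1  0.4%
8  1  0.4%
.  1  0.4%

Latin
Value  Count  Frequency (%)
l  1  50.0%
m  1  50.0%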

Most occurring blocks

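Value  Count  Frequency (%)
Hangul  850  75.4%
ASCII  278  24.6%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
[space]  227  81.7%
(  18  6.5%
)  16  5.8%
0  4  1.4%
9  3  1.1%
5  3  1.1%
%  2  0.7%
&  1  0.4%
l  1  0.4%
m  1  0.4%
Other values (2)  2  0.7%

Hangul
Value  Count  Frequency (%)
[character not preserved]  31  3.6%
[character not preserved]  27  3.2%
[character not preserved]  26  3.1%
[character not preserved]  23  2.7%
[character not preserved]  21  2.5%
[character not preserved]  17  2.0%
[character not preserved]  15  1.8%
[character not preserved]  14  1.6%
[character not preserved]  14  1.6%
[character not preserved]  13  1.5%
Other values (226)  649  76.4%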

1
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 10
Distinct (%): 5.0%
Missing: 0
Missing (%): 0.0%
Memory size: 1.7 KiB

Length

Max length: 5
Median length: 2
Mean length: 2.0552764
Min length: 2

Unique

Unique: 8
Unique (%): 4.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value  Count  Frequency (%)
1  174  87.4%
2  17  8.5%
바닐라맛  1  0.5%
튀김)  1  0.5%
5  1  0.5%
3  1  0.5%
4  1  0.5%
마른것  1  0.5%
삶은것)  1  0.5%
냉동  1  0.5%

Length

Histogram of lengths of the category

1.1
Real number (ℝ)

HIGH CORRELATION 

Distinct: 31
Distinct (%): 15.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 8.8190955
Minimum: 1
Maximum: 31
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.9 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 3
Median: 7
Q3: 13
95-th percentile: 23
Maximum: 31
Range: 30
Interquartile range (IQR): 10

Descriptive statistics

Standard deviation: 7.0723132
Coefficient of variation (CV): 0.8019318
Kurtosis: 0.3792087
Mean: 8.8190955
Median Absolute Deviation (MAD): 4
Skewness: 1.0224888
Sum: 1755
Variance: 50.017613
Monotonicity: Not monotonic
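
Several of these figures are derived from the others. The sketch below recomputes them from the values reported for this variable, using the standard definitions (CV = standard deviation / mean, variance = standard deviation squared, range = maximum - minimum, IQR = Q3 - Q1). The variable names are illustrative, and small differences in the last digit come from rounding in the report.

```python
# Values copied from the statistics of variable "1.1".
std = 7.0723132
mean = 8.8190955
minimum, maximum = 1, 31
q1, q3 = 3, 13
n_observations = 199

cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance is the squared standard deviation
value_range = maximum - minimum  # range
iqr = q3 - q1                    # interquartile range
total = mean * n_observations    # sum of all values

print(f"CV={cv:.7f} variance={variance:.4f} range={value_range} "
      f"IQR={iqr} sum={total:.0f}")
```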

Histogram with fixed size bins (bins=31)

Common values

Value  Count  Frequency (%)
1  23  11.6%
2  16  8.0%
3  16  8.0%
4  14  7.0%
6  14  7.0%
5  13  6.5%
7  12  6.0%
8  11  5.5%
9  8  4.0%
10  7  3.5%
Other values (21)  65  32.7%

Minimum 10 values

Value  Count  Frequency (%)
1  23  11.6%
2  16  8.0%
3  16  8.0%
4  14  7.0%
5  13  6.5%
6  14  7.0%
7  12  6.0%
8  11  5.5%
9  8  4.0%
10  7  3.5%

Maximum 10 values

Value  Count  Frequency (%)
31  1  0.5%
30  1  0.5%
29  1  0.5%
28  1  0.5%
27  1  0.5%
26  1  0.5%
25  1  0.5%
24  2  1.0%
23  2  1.0%
22  2  1.0%

11.111111
Real number (ℝ)

HIGH CORRELATION 

Distinct: 26
Distinct (%): 13.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 8.9805733
Minimum: 2.777778
Maximum: 100
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.9 KiB

Quantile statistics

Minimum: 2.777778
5-th percentile: 2.777778
Q1: 4.347826
Median: 5.263158
Q3: 10
95-th percentile: 25
Maximum: 100
Range: 97.222222
Interquartile range (IQR): 5.652174

Descriptive statistics

Standard deviation: 11.356271
Coefficient of variation (CV): 1.2645375
Kurtosis: 41.886994
Mean: 8.9805733
Median Absolute Deviation (MAD): 2.48538
Skewness: 5.8154191
Sum: 1787.1341
Variance: 128.9649
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=26)

Common values

Value  Count  Frequency (%)
2.777778  28  14.1%
10.0  26  13.1%
3.448276  20  10.1%
4.545455  20  10.1%
4.347826  15  7.5%
7.142857  15  7.5%
5.263158  15  7.5%
9.090909  9  4.5%
11.111111  8  4.0%
25.0  7  3.5%
Other values (16)  36  18.1%

Minimum 10 values

Value  Count  Frequency (%)
2.777778  28  14.1%
3.448276  20  10.1%
4.0  1  0.5%
4.347826  15  7.5%
4.545455  20  10.1%
5.0  1  0.5%
5.263158  15  7.5%
5.555556  2  1.0%
6.896552  3  1.5%
7.142857  15  7.5%

Maximum 10 values

Value  Count  Frequency (%)
100.0  2  1.0%
50.0  2  1.0%
25.0  7  3.5%
20.0  3  1.5%
19.0  1  0.5%
18.181818  1  0.5%
16.666667  6  3.0%
13.0  1  0.5%
12.5  6  3.0%
11.111111  8  4.0%

[20-29]
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 3
Distinct (%): 1.5%
Missing: 0
Missing (%): 0.0%
Memory size: 1.7 KiB

Length

Max length: 10
Median length: 8
Mean length: 8.040201
Min length: 8

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: [20-29]
2nd row: [20-29]
3rd row: [20-29]
4th row: [20-29]
5th row: [20-29]

Common Values

Value  Count  Frequency (%)
[20-29]  194  97.5%
10.000000  3  1.5%
4.347826  2  1.0%

Length

Histogram of lengths of the category

09
Categorical

HIGH CORRELATION 

Distinct: 20
Distinct (%): 10.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.7 KiB

Length

Max length: 8
Median length: 3
Mean length: 3.1256281
Min length: 3

Unique

Unique: 3
Unique (%): 1.5%

Sample

1st row: 09
2nd row: 09
3rd row: 09
4th row: 09
5th row: 09

Common Values

Value  Count  Frequency (%)
11  31  15.6%
18  24  12.1%
12  21  10.6%
19  18  9.0%
13  17  8.5%
20  14  7.0%
17  9  4.5%
21  9  4.5%
08  8  4.0%
15  7  3.5%
Other values (10)  41  20.6%

Length

Histogram of lengths of the category

Interactions

[Figures: four interaction plots]

Correlations

           1          1.1        11.111111  [20-29]    09
1          1.000      0.000      0.725      1.000      0.598
1.1        0.000      1.000      0.237      0.000      0.000
11.111111  0.725      0.237      1.000      0.311      0.973
[20-29]    1.000      0.000      0.311      1.000      0.836
09         0.598      0.000      0.973      0.836      1.000

           [20-29]    09         1
[20-29]    1.000      0.639      0.982
09         0.639      1.000      0.223
1          0.982      0.223      1.000

           1.1        11.111111  1          [20-29]    09
1.1        1.000      -0.716     0.000      0.000      0.000
11.111111  -0.716     1.000      0.379      0.246      0.757
1          0.000      0.379      1.000      0.982      0.223
[20-29]    0.000      0.246      0.982      1.000      0.639
09         0.000      0.757      0.223      0.639      1.000

Missing values

A simple visualization of nullity by column.

Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

Row  2020-01-01  치즈케이크  1  1.1  11.111111  [20-29]  09
0  2020-01-01  무생채  1  2  11.111111  [20-29]  09
1  2020-01-01  채소샐러드  1  3  11.111111  [20-29]  09
2  2020-01-01  쌀밥  1  4  11.111111  [20-29]  09
3  2020-01-01  잡곡밥  1  5  11.111111  [20-29]  09
4  2020-01-01  고사리나물  1  6  11.111111  [20-29]  09
5  2020-01-01  요거트플러스 플레인  1  7  11.111111  [20-29]  09
6  2020-01-01  고등어구이  1  8  11.111111  [20-29]  09
7  2020-01-01  사과  2  1  20.0  [20-29]  15
8  2020-01-01  캐이크  1  2  10.0  [20-29]  15
9  2020-01-01  아포가토 아이스디저트  1  3  10.0  [20-29]  15

Last rows

Row  2020-01-01  치즈케이크  1  1.1  11.111111  [20-29]  09
189  2020-01-01  우유  1  10  10.0  [20-29]  17
190  2020-01-01  떡만둣국  2  1  20.0  [20-29]  08
191  2020-01-01  배추김치  2  2  20.0  [20-29]  08
192  2020-01-01  떡국  1  3  10.0  [20-29]  08
193  2020-01-01  콩나물무침  1  4  10.0  [20-29]  08
194  2020-01-01  오징어채볶음  1  5  10.0  [20-29]  08
195  2020-01-01  사과  1  6  10.0  [20-29]  08
196  2020-01-01  자차이  1  7  10.0  [20-29]  08
197  2020-01-01  김치볶음밥  1  8  10.0  [20-29]  08
198  2020-01-01  쌀밥  1  1  7.142857  [20-29]  07