Overview

Dataset statistics

Number of variables: 7
Number of observations: 199
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 11.4 KiB
Average record size in memory: 58.7 B
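
The average record size appears to be the total in-memory size divided by the number of observations. The sketch below checks that reading against the figures above, assuming 1 KiB = 1024 bytes and using the rounded total (the variable names are illustrative, not part of the report):

```python
# Figures copied from the dataset statistics above.
total_size_kib = 11.4   # reported total size in memory (already rounded)
n_observations = 199    # reported number of observations

# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"{avg_record_bytes:.1f} B")
```

Because the total is itself rounded to one decimal, the check only holds to the precision shown in the report.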

Variable types

DateTime: 1
Text: 1
Categorical: 3
Numeric: 2

Alerts

2020-01-01 has constant value "" (Constant)
1 is highly overall correlated with 12.121212 (High correlation)
12.121212 is highly overall correlated with 1 (High correlation)
[40-49] is highly overall correlated with [25-27] (High correlation)
[25-27] is highly overall correlated with [40-49] (High correlation)
4 is highly imbalanced (66.6%) (Imbalance)
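
One formula that reproduces the 66.6% imbalance figure is one minus the normalized Shannon entropy of the class counts. The sketch below assumes that definition and takes the counts from the Common Values table for variable 4 in this report; `imbalance_score` is an illustrative helper name, not a function from the profiling library:

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k): H is the base-2 entropy, k the number of classes."""
    total = sum(counts)
    k = len(counts)
    if k <= 1:
        return 0.0  # a single class has no meaningful imbalance score
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)


# Counts of the six distinct values of variable 4 (168 + 20 + 6 + 2 + 2 + 1 = 199).
counts_var_4 = [168, 20, 6, 2, 2, 1]
score = imbalance_score(counts_var_4)

print(f"{score:.1%}")
```

A score of 0 means all classes are equally frequent; a score near 1 means one class dominates.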

Reproduction

Analysis started: 2023-12-10 06:16:08.866593
Analysis finished: 2023-12-10 06:16:10.184779
Duration: 1.32 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
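
The reported duration can be cross-checked from the two timestamps. A minimal sketch using only the standard library (the format string is an assumption matching the timestamps as printed above):

```python
from datetime import datetime

# Timestamp layout as printed in the Reproduction block.
FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2023-12-10 06:16:08.866593", FMT)
finished = datetime.strptime("2023-12-10 06:16:10.184779", FMT)

duration_s = (finished - started).total_seconds()

print(f"{duration_s:.2f} s")
```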

Variables

2020-01-01
Date

CONSTANT 

Distinct: 1
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 1.7 KiB
Minimum: 2020-01-01 00:00:00
Maximum: 2020-01-01 00:00:00
Histogram with fixed size bins (bins=1)

배추김치
Text

Distinct: 146
Distinct (%): 73.4%
Missing: 0
Missing (%): 0.0%
Memory size: 1.7 KiB

Length

Max length: 24
Median length: 12
Mean length: 5.4271357
Min length: 2

Characters and Unicode

Total characters: 1080
Distinct characters: 244
Distinct categories: 7
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 118
Unique (%): 59.3%
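
The percentages and the mean length for this text variable follow from the raw counts and the 199 observations. A small sketch of that arithmetic, with illustrative variable names and values copied from the statistics above:

```python
n_rows = 199         # number of observations
distinct = 146       # distinct values
unique = 118         # values that occur exactly once
total_chars = 1080   # total characters across all rows

distinct_pct = 100 * distinct / n_rows
unique_pct = 100 * unique / n_rows
mean_length = total_chars / n_rows

print(f"Distinct (%): {distinct_pct:.1f}%")
print(f"Unique (%):   {unique_pct:.1f}%")
print(f"Mean length:  {mean_length:.7f}")
```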

Sample

1st row 아메리카노
2nd row 떡국
3rd row 깍두기
4th row 감자조림
5th row 토마토케첩
Value               Count  Frequency (%)
쌀밥                10     4.5%
배추김치            9      4.1%
떡국                5      2.3%
쌈장                4      1.8%
서울우유            4      1.8%
아메리카노          3      1.4%
떡만둣국            3      1.4%
우유                3      1.4%
바나나              3      1.4%
뚝배기불고기        3      1.4%
Other values (152)  175    78.8%

Most occurring characters

Value               Count  Frequency (%)
(space)             222    20.6%
                    30     2.8%
                    21     1.9%
                    21     1.9%
                    20     1.9%
                    16     1.5%
                    15     1.4%
                    15     1.4%
                    14     1.3%
                    14     1.3%
Other values (234)  692    64.1%

Most occurring categories

Value              Count  Frequency (%)
Other Letter       824    76.3%
Space Separator    222    20.6%
Open Punctuation   14     1.3%
Close Punctuation  11     1.0%
Decimal Number     6      0.6%
Lowercase Letter   2      0.2%
Other Punctuation  1      0.1%

Most frequent character per category

Other Letter
Value               Count  Frequency (%)
                    30     3.6%
                    21     2.5%
                    21     2.5%
                    20     2.4%
                    16     1.9%
                    15     1.8%
                    15     1.8%
                    14     1.7%
                    14     1.7%
                    14     1.7%
Other values (223)  644    78.2%

Decimal Number
Value  Count  Frequency (%)
0      2      33.3%
1      1      16.7%
5      1      16.7%
7      1      16.7%
2      1      16.7%

Lowercase Letter
Value  Count  Frequency (%)
l      1      50.0%
m      1      50.0%

Space Separator
Value    Count  Frequency (%)
(space)  222    100.0%

Open Punctuation
Value  Count  Frequency (%)
(      14     100.0%

Close Punctuation
Value  Count  Frequency (%)
)      11     100.0%

Other Punctuation
Value  Count  Frequency (%)
%      1      100.0%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  824    76.3%
Common  254    23.5%
Latin   2      0.2%

Most frequent character per script

Hangul
Value               Count  Frequency (%)
                    30     3.6%
                    21     2.5%
                    21     2.5%
                    20     2.4%
                    16     1.9%
                    15     1.8%
                    15     1.8%
                    14     1.7%
                    14     1.7%
                    14     1.7%
Other values (223)  644    78.2%

Common
Value    Count  Frequency (%)
(space)  222    87.4%
(        14     5.5%
)        11     4.3%
0        2      0.8%
1        1      0.4%
%        1      0.4%
5        1      0.4%
7        1      0.4%
2        1      0.4%

Latin
Value  Count  Frequency (%)
l      1      50.0%
m      1      50.0%

Most occurring blocks

Value   Count  Frequency (%)
Hangul  824    76.3%
ASCII   256    23.7%

Most frequent character per block

ASCII
Value    Count  Frequency (%)
(space)  222    86.7%
(        14     5.5%
)        11     4.3%
0        2      0.8%
1        1      0.4%
%        1      0.4%
l        1      0.4%
m        1      0.4%
5        1      0.4%
7        1      0.4%

Hangul
Value               Count  Frequency (%)
                    30     3.6%
                    21     2.5%
                    21     2.5%
                    20     2.4%
                    16     1.9%
                    15     1.8%
                    15     1.8%
                    14     1.7%
                    14     1.7%
                    14     1.7%
Other values (223)  644    78.2%

4
Categorical

IMBALANCE 

Distinct: 6
Distinct (%): 3.0%
Missing: 0
Missing (%): 0.0%
Memory size: 1.7 KiB

Value    Count
1        168
2        20
3        6
4        2
삶은것)  2

Length

Max length: 5
Median length: 2
Mean length: 2.040201
Min length: 2

Unique

Unique: 1
Unique (%): 0.5%

Sample

1st row 2
2nd row 2
3rd row 2
4th row 1
5th row 1

Common Values

Value    Count  Frequency (%)
1        168    84.4%
2        20     10.1%
3        6      3.0%
4        2      1.0%
삶은것)  2      1.0%
튀김)    1      0.5%

Length

Histogram of lengths of the category

Common Values (Plot)

Value   Count  Frequency (%)
1       168    84.4%
2       20     10.1%
3       6      3.0%
4       2      1.0%
삶은것  2      1.0%
튀김    1      0.5%

1
Real number (ℝ)

HIGH CORRELATION 

Distinct: 30
Distinct (%): 15.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 9.4422111
Minimum: 1
Maximum: 30
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.9 KiB

Quantile statistics

Minimum: 1
5th percentile: 1
Q1: 4
Median: 8
Q3: 13.5
95th percentile: 24
Maximum: 30
Range: 29
Interquartile range (IQR): 9.5

Descriptive statistics

Standard deviation: 7.0757394
Coefficient of variation (CV): 0.74937315
Kurtosis: 0.041637885
Mean: 9.4422111
Median Absolute Deviation (MAD): 5
Skewness: 0.89156146
Sum: 1879
Variance: 50.066088
Monotonicity: Not monotonic
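
Several of these statistics are derived from one another, which gives a quick consistency check. The sketch below uses the standard definitions (CV = standard deviation / mean, variance = standard deviation squared, IQR = Q3 - Q1, range = maximum - minimum) with the values reported for variable 1; the variable names are illustrative:

```python
# Values copied from the statistics reported for variable 1.
mean, std = 9.4422111, 7.0757394
total, n = 1879, 199
minimum, q1, q3, maximum = 1, 4, 13.5, 30

cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance from the standard deviation
iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # range
mean_from_sum = total / n        # mean recomputed from sum and row count

print(f"CV={cv:.8f} variance={variance:.6f} IQR={iqr} range={value_range}")
print(f"mean from sum: {mean_from_sum:.7f}")
```

Small differences in the last digit are expected because the inputs are already rounded.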
Histogram with fixed size bins (bins=30)
Common values

Value              Count  Frequency (%)
1                  17     8.5%
2                  15     7.5%
4                  15     7.5%
3                  14     7.0%
5                  14     7.0%
6                  11     5.5%
7                  11     5.5%
8                  10     5.0%
10                 10     5.0%
9                  9      4.5%
Other values (20)  73     36.7%

Minimum 10 values

Value  Count  Frequency (%)
1      17     8.5%
2      15     7.5%
3      14     7.0%
4      15     7.5%
5      14     7.0%
6      11     5.5%
7      11     5.5%
8      10     5.0%
9      9      4.5%
10     10     5.0%

Maximum 10 values

Value  Count  Frequency (%)
30     1      0.5%
29     1      0.5%
28     1      0.5%
27     2      1.0%
26     2      1.0%
25     2      1.0%
24     2      1.0%
23     2      1.0%
22     3      1.5%
21     3      1.5%

12.121212
Real number (ℝ)

HIGH CORRELATION 

Distinct: 25
Distinct (%): 12.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 7.2569814
Minimum: 2.857143
Maximum: 30
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.9 KiB

Quantile statistics

Minimum: 2.857143
5th percentile: 2.857143
Q1: 3.030303
Median: 5.714286
Q3: 8.333333
95th percentile: 20
Maximum: 30
Range: 27.142857
Interquartile range (IQR): 5.30303

Descriptive statistics

Standard deviation: 5.5301561
Coefficient of variation (CV): 0.76204634
Kurtosis: 3.2651564
Mean: 7.2569814
Median Absolute Deviation (MAD): 2.683983
Skewness: 1.8911826
Sum: 1444.1393
Variance: 30.582627
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=25)
Common values

Value              Count  Frequency (%)
2.857143           40     20.1%
3.030303           23     11.6%
5.0                16     8.0%
4.761905           16     8.0%
10.0               16     8.0%
8.333333           14     7.0%
20.0               11     5.5%
7.142857           11     5.5%
5.882353           10     5.0%
6.25               10     5.0%
Other values (15)  32     16.1%

Minimum 10 values

Value     Count  Frequency (%)
2.857143  40     20.1%
3.0       1      0.5%
3.030303  23     11.6%
4.761905  16     8.0%
5.0       16     8.0%
5.714286  6      3.0%
5.882353  10     5.0%
6.060606  3      1.5%
6.25      10     5.0%
7.142857  11     5.5%

Maximum 10 values

Value      Count  Frequency (%)
30.0       1      0.5%
25.0       4      2.0%
23.529412  1      0.5%
23.0       1      0.5%
20.0       11     5.5%
16.666667  2      1.0%
14.285714  2      1.0%
12.5       3      1.5%
11.764706  1      0.5%
11.428571  1      0.5%

[40-49]
Categorical

HIGH CORRELATION 

Distinct: 7
Distinct (%): 3.5%
Missing: 0
Missing (%): 0.0%
Memory size: 1.7 KiB

Value             Count
[40-49]           98
[0-19]            72
[50-59]           11
[60-99]           10
[20-29]           5
Other values (2)  3

Length

Max length: 10
Median length: 8
Mean length: 7.6582915
Min length: 7

Unique

Unique: 1
Unique (%): 0.5%

Sample

1st row [40-49]
2nd row [40-49]
3rd row [40-49]
4th row [40-49]
5th row [40-49]

Common Values

Value      Count  Frequency (%)
[40-49]    98     49.2%
[0-19]     72     36.2%
[50-59]    11     5.5%
[60-99]    10     5.0%
[20-29]    5      2.5%
2.857143   2      1.0%
10.000000  1      0.5%

Length

Histogram of lengths of the category

Common Values (Plot)

Value      Count  Frequency (%)
40-49      98     49.2%
0-19       72     36.2%
50-59      11     5.5%
60-99      10     5.0%
20-29      5      2.5%
2.857143   2      1.0%
10.000000  1      0.5%

[25-27]
Categorical

HIGH CORRELATION 

Distinct: 8
Distinct (%): 4.0%
Missing: 0
Missing (%): 0.0%
Memory size: 1.7 KiB

Value             Count
[25-27]           59
[27-29]           53
[29-31]           39
[31-33]           25
[35-100]          15
Other values (3)

Length

Max length: 9
Median length: 8
Mean length: 8.0653266
Min length: 7

Unique

Unique: 1
Unique (%): 0.5%

Sample

1st row [25-27]
2nd row [25-27]
3rd row [25-27]
4th row [25-27]
5th row [25-27]

Common Values

Value     Count  Frequency (%)
[25-27]   59     29.6%
[27-29]   53     26.6%
[29-31]   39     19.6%
[31-33]   25     12.6%
[35-100]  15     7.5%
[33-35]   5      2.5%
[0-19]    2      1.0%
[50-59]   1      0.5%

Length

Histogram of lengths of the category

Common Values (Plot)

Value   Count  Frequency (%)
25-27   59     29.6%
27-29   53     26.6%
29-31   39     19.6%
31-33   25     12.6%
35-100  15     7.5%
33-35   5      2.5%
0-19    2      1.0%
50-59   1      0.5%

Interactions

[Figure: four interaction plots]

Correlations

           4      1      12.121212  [40-49]  [25-27]
4          1.000  0.314  0.703      0.674    0.710
1          0.314  1.000  0.556      0.000    0.000
12.121212  0.703  0.556  1.000      0.656    0.739
[40-49]    0.674  0.000  0.656      1.000    0.807
[25-27]    0.710  0.000  0.739      0.807    1.000

         4      [40-49]  [25-27]
4        1.000  0.482    0.494
[40-49]  0.482  1.000    0.602
[25-27]  0.494  0.602    1.000

           1       12.121212  4      [40-49]  [25-27]
1          1.000   -0.752     0.168  0.000    0.000
12.121212  -0.752  1.000      0.463  0.404    0.469
4          0.168   0.463      1.000  0.482    0.494
[40-49]    0.000   0.404      0.482  1.000    0.602
[25-27]    0.000   0.469      0.494  0.602    1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

     2020-01-01  배추김치              4  1   12.121212  [40-49]  [25-27]
0    2020-01-01  아메리카노            2  2   6.060606   [40-49]  [25-27]
1    2020-01-01  떡국                  2  3   6.060606   [40-49]  [25-27]
2    2020-01-01  깍두기                2  4   6.060606   [40-49]  [25-27]
3    2020-01-01  감자조림              1  5   3.030303   [40-49]  [25-27]
4    2020-01-01  토마토케첩            1  6   3.030303   [40-49]  [25-27]
5    2020-01-01  홈런볼 초코           1  7   3.030303   [40-49]  [25-27]
6    2020-01-01  락토핏 생유산균 골드  1  8   3.030303   [40-49]  [25-27]
7    2020-01-01  쌀밥                  1  9   3.030303   [40-49]  [25-27]
8    2020-01-01  빈대떡                1  10  3.030303   [40-49]  [25-27]
9    2020-01-01  할라피뇨              1  11  3.030303   [40-49]  [25-27]

Last rows

     2020-01-01  배추김치              4  1   12.121212  [40-49]  [25-27]
189  2020-01-01  물김치                1  6   10.0       [60-99]  [25-27]
190  2020-01-01  물냉면                1  7   10.0       [60-99]  [25-27]
191  2020-01-01  치킨버거              1  8   10.0       [60-99]  [25-27]
192  2020-01-01  돼지고기보쌈(사태)    1  9   10.0       [60-99]  [25-27]
193  2020-01-01  옥수수통조림(가당)    1  10  10.0       [60-99]  [25-27]
194  2020-01-01  배추김치              2  1   16.666667  [20-29]  [25-27]
195  2020-01-01  코다리조림            1  2   8.333333   [20-29]  [25-27]
196  2020-01-01  청포도                1  3   8.333333   [20-29]  [25-27]
197  2020-01-01  쌀밥                  1  4   8.333333   [20-29]  [25-27]
198  2020-01-01  오뚜기 미역국 라면    1  5   8.333333   [20-29]  [25-27]