Overview

Dataset statistics

Number of variables: 8
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 1082
Duplicate rows (%): 10.8%
Total size in memory: 712.9 KiB
Average record size in memory: 73.0 B
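
The percentage and per-record figures above follow directly from the raw counts. A minimal sketch of that arithmetic, assuming 1 KiB = 1024 bytes and rounding to one decimal place (the variable names are illustrative, not taken from the profiler):

```python
# Figures copied from the dataset statistics above.
n_observations = 10_000
duplicate_rows = 1_082
total_size_kib = 712.9

# Share of duplicate rows as a percentage of all observations.
duplicate_pct = duplicate_rows / n_observations * 100

# Average record size: total bytes divided by the number of rows.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Duplicate rows: {duplicate_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```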

Variable types

DateTime: 1
Text: 4
Categorical: 2
Numeric: 1

Dataset

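Description: 거래일 (transaction date), 품목 (item), 품종 (variety), 단위 (unit), 등급 (grade), 가격 (price), 출하지 (shipping origin), 친환경구분(일반) (eco-friendly classification, general)
Author: 서울시농수산식품공사 (Seoul Agro-Fisheries & Food Corporation)
URL: https://data.seoul.go.kr/dataList/OA-20950/S/1/datasetView.do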

Alerts

친환경구분(일반) has constant value "일반" (Constant)
Dataset has 1082 (10.8%) duplicate rows (Duplicates)
등급 is highly imbalanced (86.5%) (Imbalance)

Reproduction

Analysis started: 2024-05-18 02:23:16.875586
Analysis finished: 2024-05-18 02:23:18.870864
Duration: 2 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

거래일 (transaction date)
DateTime

Distinct: 48
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Minimum: 2024-03-11 00:00:00
Maximum: 2024-05-17 00:00:00
Histogram with fixed size bins (bins=48)

품목 (item)
Text

Distinct: 132
Distinct (%): 1.3%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 8
Median length: 7
Mean length: 3.0744
Min length: 1

Characters and Unicode

Total characters: 30744
Distinct characters: 180
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
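
The category names used in this section (Other Letter, Space Separator, Open Punctuation, and so on) correspond to Unicode general categories. As a rough illustration, and not the profiler's own code, Python's standard `unicodedata` module can tally them; the sample strings are values that appear in this report, and `category_counts` is a hypothetical helper. Scripts and blocks are not exposed by the standard library, so only categories are shown.

```python
import unicodedata
from collections import Counter

# Two-letter codes returned by unicodedata.category, mapped to the
# long names that the report prints.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Zs": "Space Separator",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Nd": "Decimal Number",
    "Po": "Other Punctuation",
}

def category_counts(values):
    """Count Unicode general categories over every character in `values`."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts

sample = ["방풍나물", "채소류 기타(상장예외)", "3.5키로"]
counts = category_counts(sample)
for code, n in counts.most_common():
    print(CATEGORY_NAMES.get(code, code), n)
```

Hangul syllables fall under "Other Letter" because Hangul has no upper or lower case, which is why that category dominates the text columns here.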

Unique

Unique: 22
Unique (%): 0.2%

Sample

1st row: 방풍나물
2nd row: 세발나물
3rd row: 방풍나물
4th row: 세발나물
5th row: 우엉
Value                Count   Frequency (%)
기타                 1053    9.4%
채소류               981     8.8%
콩나물               920     8.2%
마늘                 794     7.1%
숙주나물             693     6.2%
두부                 675     6.0%
베이비               662     5.9%
고사리               522     4.7%
취나물               357     3.2%
무순                 356     3.2%
Other values (128)   4183    37.4%

Most occurring characters

ValueCountFrequency (%)
2157
 
7.0%
2152
 
7.0%
1510
 
4.9%
1196
 
3.9%
1118
 
3.6%
1060
 
3.4%
1056
 
3.4%
1036
 
3.4%
986
 
3.2%
981
 
3.2%
Other values (170) 17492
56.9%

Most occurring categories

ValueCountFrequency (%)
Other Letter 29460
95.8%
Space Separator 1196
 
3.9%
Open Punctuation 44
 
0.1%
Close Punctuation 44
 
0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
2157
 
7.3%
2152
 
7.3%
1510
 
5.1%
1118
 
3.8%
1060
 
3.6%
1056
 
3.6%
1036
 
3.5%
986
 
3.3%
981
 
3.3%
937
 
3.2%
Other values (167) 16467
55.9%
Space Separator
ValueCountFrequency (%)
1196
100.0%
Open Punctuation
ValueCountFrequency (%)
( 44
100.0%
Close Punctuation
ValueCountFrequency (%)
) 44
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 29460
95.8%
Common 1284
 
4.2%

Most frequent character per script

Hangul
ValueCountFrequency (%)
2157
 
7.3%
2152
 
7.3%
1510
 
5.1%
1118
 
3.8%
1060
 
3.6%
1056
 
3.6%
1036
 
3.5%
986
 
3.3%
981
 
3.3%
937
 
3.2%
Other values (167) 16467
55.9%
Common
ValueCountFrequency (%)
1196
93.1%
( 44
 
3.4%
) 44
 
3.4%

Most occurring blocks

ValueCountFrequency (%)
Hangul 29460
95.8%
ASCII 1284
 
4.2%

Most frequent character per block

Hangul
ValueCountFrequency (%)
2157
 
7.3%
2152
 
7.3%
1510
 
5.1%
1118
 
3.8%
1060
 
3.6%
1056
 
3.6%
1036
 
3.5%
986
 
3.3%
981
 
3.3%
937
 
3.2%
Other values (167) 16467
55.9%
ASCII
ValueCountFrequency (%)
1196
93.1%
( 44
 
3.4%
) 44
 
3.4%

품종 (variety)
Text

Distinct: 224
Distinct (%): 2.2%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 12
Median length: 9
Mean length: 5.6611
Min length: 2

Characters and Unicode

Total characters: 56611
Distinct characters: 219
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 50
Unique (%): 0.5%

Sample

1st row: 방풍나물
2nd row: 세발나물
3rd row: 방풍나물
4th row: 세발나물
5th row: 우엉채
Value                Count   Frequency (%)
수입                 3561    21.9%
기타(상장예외        988     6.1%
채소류               981     6.0%
콩나물               920     5.7%
숙주나물             693     4.3%
베이비               662     4.1%
깐마늘               501     3.1%
고사리               500     3.1%
대서                 452     2.8%
무순                 356     2.2%
Other values (203)   6614    40.8%

Most occurring characters

ValueCountFrequency (%)
6228
 
11.0%
3603
 
6.4%
3567
 
6.3%
2297
 
4.1%
2152
 
3.8%
1350
 
2.4%
( 1307
 
2.3%
) 1307
 
2.3%
1302
 
2.3%
1226
 
2.2%
Other values (209) 32272
57.0%

Most occurring categories

ValueCountFrequency (%)
Other Letter 47769
84.4%
Space Separator 6228
 
11.0%
Open Punctuation 1307
 
2.3%
Close Punctuation 1307
 
2.3%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
3603
 
7.5%
3567
 
7.5%
2297
 
4.8%
2152
 
4.5%
1350
 
2.8%
1302
 
2.7%
1226
 
2.6%
1120
 
2.3%
1071
 
2.2%
1067
 
2.2%
Other values (206) 29014
60.7%
Space Separator
ValueCountFrequency (%)
6228
100.0%
Open Punctuation
ValueCountFrequency (%)
( 1307
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1307
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 47769
84.4%
Common 8842
 
15.6%

Most frequent character per script

Hangul
ValueCountFrequency (%)
3603
 
7.5%
3567
 
7.5%
2297
 
4.8%
2152
 
4.5%
1350
 
2.8%
1302
 
2.7%
1226
 
2.6%
1120
 
2.3%
1071
 
2.2%
1067
 
2.2%
Other values (206) 29014
60.7%
Common
ValueCountFrequency (%)
6228
70.4%
( 1307
 
14.8%
) 1307
 
14.8%

Most occurring blocks

ValueCountFrequency (%)
Hangul 47769
84.4%
ASCII 8842
 
15.6%

Most frequent character per block

ASCII
ValueCountFrequency (%)
6228
70.4%
( 1307
 
14.8%
) 1307
 
14.8%
Hangul
ValueCountFrequency (%)
3603
 
7.5%
3567
 
7.5%
2297
 
4.8%
2152
 
4.5%
1350
 
2.8%
1302
 
2.7%
1226
 
2.6%
1120
 
2.3%
1071
 
2.2%
1067
 
2.2%
Other values (206) 29014
60.7%

단위 (unit)
Text

Distinct: 128
Distinct (%): 1.3%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 6
Median length: 5
Mean length: 3.9012
Min length: 3

Characters and Unicode

Total characters: 39012
Distinct characters: 15
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 41
Unique (%): 0.4%

Sample

1st row: 2키로
2nd row: 4키로
3rd row: 10키로
4th row: 4키로
5th row: 2키로
Value                Count   Frequency (%)
10키로               1640    16.4%
3.5키로              1269    12.7%
4키로                1144    11.4%
1키로                1037    10.4%
20키로               694     6.9%
500그람              664     6.6%
2키로                482     4.8%
50그람               455     4.5%
5키로                434     4.3%
12키로               349     3.5%
Other values (112)   1832    18.3%
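
Every listed value is a number followed by 키로 (colloquial "kilo", i.e. kilogram) or 그람 (gram), so the column could be normalised to a single numeric weight before comparing prices across package sizes. The following is only a sketch under that assumption; `unit_to_kg` and its lookup table are hypothetical names, and any suffix outside these two is returned as `None` rather than guessed.

```python
import re

# Assumed divisors that convert each suffix to kilograms.
UNIT_DIVISOR = {"키로": 1, "그람": 1000}

# A number (optionally with a decimal part) followed by one known suffix.
_UNIT_PATTERN = re.compile(r"^(\d+(?:\.\d+)?)(키로|그람)$")

def unit_to_kg(text):
    """Convert e.g. '3.5키로' or '500그람' to kilograms; None if unparseable."""
    match = _UNIT_PATTERN.match(text.strip())
    if match is None:
        return None
    amount, suffix = match.groups()
    return float(amount) / UNIT_DIVISOR[suffix]

for raw in ["10키로", "3.5키로", "500그람", "50그람"]:
    print(raw, "->", unit_to_kg(raw), "kg")
```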

Most occurring characters

Value              Count   Frequency (%)
키                 8364    21.4%
로                 8364    21.4%
0                  4749    12.2%
1                  3748    9.6%
5                  3300    8.5%
2                  1711    4.4%
그                 1636    4.2%
람                 1636    4.2%
.                  1632    4.2%
3                  1512    3.9%
Other values (5)   2360    6.0%

Most occurring categories

Value               Count   Frequency (%)
Other Letter        20000   51.3%
Decimal Number      17380   44.6%
Other Punctuation   1632    4.2%

Most frequent character per category

Decimal Number
Value   Count   Frequency (%)
0       4749    27.3%
1       3748    21.6%
5       3300    19.0%
2       1711    9.8%
3       1512    8.7%
4       1333    7.7%
6       497     2.9%
8       187     1.1%
7       176     1.0%
9       167     1.0%
Other Letter
Value   Count   Frequency (%)
키      8364    41.8%
로      8364    41.8%
그      1636    8.2%
람      1636    8.2%
Other Punctuation
Value   Count   Frequency (%)
.       1632    100.0%

Most occurring scripts

Value    Count   Frequency (%)
Hangul   20000   51.3%
Common   19012   48.7%

Most frequent character per script

Common
Value   Count   Frequency (%)
0       4749    25.0%
1       3748    19.7%
5       3300    17.4%
2       1711    9.0%
.       1632    8.6%
3       1512    8.0%
4       1333    7.0%
6       497     2.6%
8       187     1.0%
7       176     0.9%
Hangul
Value   Count   Frequency (%)
키      8364    41.8%
로      8364    41.8%
그      1636    8.2%
람      1636    8.2%

Most occurring blocks

Value    Count   Frequency (%)
Hangul   20000   51.3%
ASCII    19012   48.7%

Most frequent character per block

Hangul
Value   Count   Frequency (%)
키      8364    41.8%
로      8364    41.8%
그      1636    8.2%
람      1636    8.2%
ASCII
Value   Count   Frequency (%)
0       4749    25.0%
1       3748    19.7%
5       3300    17.4%
2       1711    9.0%
.       1632    8.6%
3       1512    8.0%
4       1333    7.0%
6       497     2.6%
8       187     1.0%
7       176     0.9%

등급 (grade)
Categorical

IMBALANCE 

Distinct: 8
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value              Count
기타               9492
(label missing)    137
(label missing)    128
(label missing)    110
등외               105
Other values (3)   28

Length

Max length: 2
Median length: 2
Mean length: 1.9597
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 기타
2nd row: 기타
3rd row: 기타
4th row: 기타
5th row: 기타

Common Values

Value              Count   Frequency (%)
기타               9492    94.9%
(label missing)    137     1.4%
(label missing)    128     1.3%
(label missing)    110     1.1%
등외               105     1.1%
(label missing)    19      0.2%
(label missing)    5       0.1%
(label missing)    4       < 0.1%
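
The 86.5% in the imbalance alert is consistent with an entropy-based score, 1 - H / log2(k), where H is the Shannon entropy of the class frequencies and k the number of classes. Treat this as an assumption about how the figure is derived rather than a statement of the profiler's implementation; the sketch below simply shows that the formula reproduces the reported value from the eight counts listed above (`imbalance_score` is a hypothetical helper).

```python
import math

# Class counts for 등급, copied from the common-values table above.
grade_counts = [9492, 137, 128, 110, 105, 19, 5, 4]

def imbalance_score(counts):
    """Return 1 - H/log2(k): 0 for a uniform column, near 1 when one class dominates."""
    k = len(counts)
    if k <= 1:
        return 0.0
    total = sum(counts)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)

score = imbalance_score(grade_counts)
print(f"Imbalance: {score:.1%}")
```

A perfectly balanced column scores 0, so a value this close to 1 reflects that a single class (기타) accounts for almost 95% of the rows.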

Length

Histogram of lengths of the category


가격 (price)
Real number (ℝ)

Distinct: 529
Distinct (%): 5.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 29684.876
Minimum: 320
Maximum: 2300000
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 320
5-th percentile: 750
Q1: 4300
Median: 12000
Q3: 31000
95-th percentile: 125000
Maximum: 2300000
Range: 2299680
Interquartile range (IQR): 26700

Descriptive statistics

Standard deviation: 54430.539
Coefficient of variation (CV): 1.8336118
Kurtosis: 323.44948
Mean: 29684.876
Median Absolute Deviation (MAD): 9300
Skewness: 10.509322
Sum: 2.9684876 × 10^8
Variance: 2.9626835 × 10^9
Monotonicity: Not monotonic
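
Several of these statistics are derived from others in the same section, which makes them easy to cross-check. A small sketch using the standard definitions (CV = standard deviation / mean, variance = standard deviation squared, IQR = Q3 - Q1, range = maximum - minimum, sum = mean × n), with inputs copied from the report; the variable names are illustrative:

```python
# Inputs copied from the 가격 statistics in this report.
n = 10_000
mean = 29684.876
std = 54430.539
q1, q3 = 4300, 31000
minimum, maximum = 320, 2_300_000

cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance from the standard deviation
iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # full range
total = mean * n                 # sum of all prices

print(f"CV={cv:.7f}  variance={variance:.7e}  IQR={iqr}  "
      f"range={value_range}  sum={total:.7e}")
```

A CV well above 1, together with the large skewness and kurtosis, indicates a long right tail: most prices sit far below the mean, with a few very large values pulling it up.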
Histogram with fixed size bins (bins=50)

Common values

Value                Count   Frequency (%)
2700                 491     4.9%
4600                 266     2.7%
18000                244     2.4%
10700                242     2.4%
4300                 218     2.2%
17000                194     1.9%
7500                 183     1.8%
20000                165     1.7%
4500                 162     1.6%
19000                156     1.6%
Other values (519)   7679    76.8%

Minimum 10 values

Value   Count   Frequency (%)
320     18      0.2%
350     4       < 0.1%
360     26      0.3%
370     129     1.3%
380     1       < 0.1%
390     5       0.1%
400     15      0.1%
450     1       < 0.1%
480     9       0.1%
490     2       < 0.1%

Maximum 10 values

Value     Count   Frequency (%)
2300000   1       < 0.1%
781000    1       < 0.1%
780000    1       < 0.1%
650000    1       < 0.1%
570000    1       < 0.1%
550000    4       < 0.1%
516000    1       < 0.1%
512000    1       < 0.1%
470000    1       < 0.1%
450000    1       < 0.1%

출하지 (shipping origin)
Text

Distinct: 153
Distinct (%): 1.5%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 10
Median length: 9
Mean length: 4.6982
Min length: 2

Characters and Unicode

Total characters: 46982
Distinct characters: 159
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 25
Unique (%): 0.2%

Sample

1st row: 충청남도 보령시
2nd row: 전라남도 해남군
3rd row: 전라남도 여수시
4th row: 전라남도 해남군
5th row: 경상북도 안동시
Value                Count   Frequency (%)
중국                 3655    25.0%
경기도               1811    12.4%
전라남도             984     6.7%
광주시               688     4.7%
경상남도             421     2.9%
페루                 413     2.8%
충청남도             345     2.4%
경상북도             331     2.3%
제주자치도           315     2.2%
태국                 306     2.1%
Other values (157)   5357    36.6%

Most occurring characters

ValueCountFrequency (%)
4837
 
10.3%
4626
 
9.8%
4059
 
8.6%
3655
 
7.8%
3477
 
7.4%
2574
 
5.5%
2194
 
4.7%
1845
 
3.9%
1550
 
3.3%
1385
 
2.9%
Other values (149) 16780
35.7%

Most occurring categories

ValueCountFrequency (%)
Other Letter 42352
90.1%
Space Separator 4626
 
9.8%
Close Punctuation 2
 
< 0.1%
Open Punctuation 2
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
4837
 
11.4%
4059
 
9.6%
3655
 
8.6%
3477
 
8.2%
2574
 
6.1%
2194
 
5.2%
1845
 
4.4%
1550
 
3.7%
1385
 
3.3%
1034
 
2.4%
Other values (146) 15742
37.2%
Space Separator
ValueCountFrequency (%)
4626
100.0%
Close Punctuation
ValueCountFrequency (%)
) 2
100.0%
Open Punctuation
ValueCountFrequency (%)
( 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 42352
90.1%
Common 4630
 
9.9%

Most frequent character per script

Hangul
ValueCountFrequency (%)
4837
 
11.4%
4059
 
9.6%
3655
 
8.6%
3477
 
8.2%
2574
 
6.1%
2194
 
5.2%
1845
 
4.4%
1550
 
3.7%
1385
 
3.3%
1034
 
2.4%
Other values (146) 15742
37.2%
Common
ValueCountFrequency (%)
4626
99.9%
) 2
 
< 0.1%
( 2
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
Hangul 42352
90.1%
ASCII 4630
 
9.9%

Most frequent character per block

Hangul
ValueCountFrequency (%)
4837
 
11.4%
4059
 
9.6%
3655
 
8.6%
3477
 
8.2%
2574
 
6.1%
2194
 
5.2%
1845
 
4.4%
1550
 
3.7%
1385
 
3.3%
1034
 
2.4%
Other values (146) 15742
37.2%
ASCII
ValueCountFrequency (%)
4626
99.9%
) 2
 
< 0.1%
( 2
 
< 0.1%

친환경구분(일반) (eco-friendly classification, general)
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value   Count
일반    10000

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 일반
2nd row: 일반
3rd row: 일반
4th row: 일반
5th row: 일반

Common Values

Value   Count   Frequency (%)
일반    10000   100.0%

Length

Histogram of lengths of the category


Interactions


Correlations

         거래일   등급    가격
거래일   1.000    0.000   0.000
등급     0.000    1.000   0.000
가격     0.000    0.000   1.000

         가격    등급
가격     1.000   0.000
등급     0.000   1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

Index   거래일       품목          품종                    단위      등급   가격     출하지                친환경구분(일반)
18600   2024-05-03   방풍나물      방풍나물                2키로     기타   7000     충청남도 보령시       일반
71927   2024-03-28   세발나물      세발나물                4키로     기타   12000    전라남도 해남군       일반
99708   2024-03-11   방풍나물      방풍나물                10키로    기타   50000    전라남도 여수시       일반
97680   2024-03-12   세발나물      세발나물                4키로     기타   14000    전라남도 해남군       일반
52604   2024-04-11   우엉          우엉채                  2키로     기타   28000    경상북도 안동시       일반
58451   2024-04-05   두부          포장두부                12키로    기타   17000    인도                  일반
41896   2024-04-17   채소류 기타   채소류 기타(상장예외)   10키로    기타   21000    제주자치도 서귀포시   일반
21505   2024-05-02   마늘          깐마늘 대서             20키로    기타   127000   경상북도 의성군       일반
97646   2024-03-12   콩나물        콩나물 수입             3.5키로   기타   2700     중국                  일반
23545   2024-04-30   토란대        토란대 수입             1키로     기타   1100     미얀마                일반

Last rows

Index   거래일       품목          품종                    단위      등급              가격     출하지            친환경구분(일반)
37351   2024-04-19   꽃게          꽃게 냉동수             5.4키로   기타              27000    중국              일반
30552   2024-04-24   고사리        고사리 수입             10키로    기타              21000    중국              일반
63613   2024-04-03   생강          생강 원강               20키로    기타              115000   충청남도 서산시   일반
3057    2024-05-16   콩나물        콩나물 수입             4키로     기타              2700     중국              일반
54166   2024-04-09   숙주나물      숙주나물 수입           3.5키로   기타              4500     페루              일반
69067   2024-03-29   채소류 기타   채소류 기타(상장예외)   20그람    기타              2140     경기도 광주시     일반
43679   2024-04-17   기타 건어류   기타 건어류             10키로    기타              163800   중국              일반
79069   2024-03-25   숙주나물      숙주나물 수입           3.5키로   기타              4500     중국              일반
34759   2024-04-23   마늘          잎마늘                  20키로    기타              44000    전라남도 목포시   일반
36981   2024-04-22   마늘          깐마늘 대서             20키로    (label missing)   135000   경상남도 창녕군   일반

Duplicate rows

Most frequently occurring

Index   거래일       품목     품종          단위      등급   가격   출하지   친환경구분(일반)   # duplicates
237     2024-03-22   콩나물   콩나물 수입   3.5키로   기타   2700   중국     일반               17
339     2024-03-28   콩나물   콩나물 수입   3.5키로   기타   2700   중국     일반               17
216     2024-03-21   콩나물   콩나물 수입   3.5키로   기타   2700   중국     일반               14
61      2024-03-13   콩나물   콩나물 수입   3.5키로   기타   2700   중국     일반               13
81      2024-03-14   콩나물   콩나물 수입   3.5키로   기타   2700   중국     일반               13
139     2024-03-18   콩나물   콩나물 수입   3.5키로   기타   2700   중국     일반               11
170     2024-03-19   콩나물   콩나물 수입   3.5키로   기타   2700   중국     일반               11
315     2024-03-27   콩나물   콩나물 수입   3.5키로   기타   2700   중국     일반               11
357     2024-03-29   콩나물   콩나물 수입   3.5키로   기타   2700   중국     일반               11
624     2024-04-16   콩나물   콩나물 수입   3.5키로   기타   2600   중국     일반               10
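
A table like this can be approximated by grouping identical rows and counting each group. The report does not state exactly how the profiler counts duplicates, so the sketch below assumes that two rows are duplicates when all eight fields match. The rows are illustrative, built from values seen in this report, and `duplicate_groups` is a hypothetical helper.

```python
from collections import Counter

# Illustrative rows in column order:
# 거래일, 품목, 품종, 단위, 등급, 가격, 출하지, 친환경구분(일반)
rows = [
    ("2024-03-22", "콩나물", "콩나물 수입", "3.5키로", "기타", 2700, "중국", "일반"),
    ("2024-03-22", "콩나물", "콩나물 수입", "3.5키로", "기타", 2700, "중국", "일반"),
    ("2024-03-22", "콩나물", "콩나물 수입", "3.5키로", "기타", 2700, "중국", "일반"),
    ("2024-04-16", "콩나물", "콩나물 수입", "3.5키로", "기타", 2600, "중국", "일반"),
    ("2024-04-16", "콩나물", "콩나물 수입", "3.5키로", "기타", 2600, "중국", "일반"),
    ("2024-05-03", "방풍나물", "방풍나물", "2키로", "기타", 7000, "충청남도 보령시", "일반"),
]

def duplicate_groups(rows):
    """Return (row, count) pairs for rows seen more than once, largest group first."""
    counts = Counter(rows)
    groups = [(row, n) for row, n in counts.items() if n > 1]
    groups.sort(key=lambda item: item[1], reverse=True)
    return groups

groups = duplicate_groups(rows)
for row, n in groups:
    print(n, row)
```

Because the source rows carry no transaction identifier, identical rows may well be separate sales at the same price on the same day, so it is worth confirming that before dropping them.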