Overview

Dataset statistics

Number of variables: 10
Number of observations: 1000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 83.1 KiB
Average record size in memory: 85.1 B

Variable types

Text: 2
Categorical: 6
Boolean: 1
Numeric: 1

Dataset

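Description: Public open data related to the work of the Securitized Assets Department of the Korea Housing Finance Corporation (source data, from the databases used in that department's work, that can be disclosed publicly)
Author: Korea Housing Finance Corporation (한국주택금융공사)
URL: https://www.data.go.kr/data/15073243/fileData.do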

Alerts

Constant: PINT_RCPT_AMT has constant value "0"
High correlation: RPT_OFFER_SEQ is highly overall correlated with TREAT_ORG_CD and 2 other fields
High correlation: RCPT_DY is highly overall correlated with TREAT_ORG_CD and 2 other fields
High correlation: TREAT_ORG_CD is highly overall correlated with RCPT_DY and 4 other fields
High correlation: ONLIN_YN is highly overall correlated with TELGRM_SEQ and 3 other fields
High correlation: WRT_BASIS_DY is highly overall correlated with RCPT_DY and 4 other fields
High correlation: REG_DT is highly overall correlated with RCPT_DY and 4 other fields
High correlation: TELGRM_SEQ is highly overall correlated with ONLIN_YN
Imbalance: RCPT_DY is highly imbalanced (98.9%)
Imbalance: RPT_OFFER_SEQ is highly imbalanced (98.6%)
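
The two imbalance percentages can be reproduced from the value counts reported for each variable (999/1 for RCPT_DY, 998/1/1 for RPT_OFFER_SEQ). The sketch below assumes the score is one minus the Shannon entropy normalized by its maximum; the report itself does not state the formula, and the function name `imbalance_score` is illustrative only.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy (in bits)
    of the class frequencies and k is the number of distinct classes.
    Assumption: this normalized-entropy form is what the report uses."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # assumed convention: one class has no imbalance to measure
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)

# Value counts taken from the RCPT_DY and RPT_OFFER_SEQ sections of this report.
rcpt_dy = imbalance_score([999, 1])
rpt_offer_seq = imbalance_score([998, 1, 1])
print(f"RCPT_DY: {rcpt_dy:.1%}, RPT_OFFER_SEQ: {rpt_offer_seq:.1%}")
```

A score near 100% means almost all rows share one value, which is the case for both columns here.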

Reproduction

Analysis started: 2023-12-12 07:40:14.130293
Analysis finished: 2023-12-12 07:40:14.896588
Duration: 0.77 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
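
As a quick consistency check, the reported duration follows from the two timestamps above. This is a minimal sketch using only the standard library; the variable names are illustrative.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"

# Timestamps copied from the Reproduction section of this report.
started = datetime.strptime("2023-12-12 07:40:14.130293", FMT)
finished = datetime.strptime("2023-12-12 07:40:14.896588", FMT)

duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```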

Variables

LIQD_PLAN_CD
Text

Distinct: 369
Distinct (%): 36.9%
Missing: 0
Missing (%): 0.0%
Memory size: 7.9 KiB

Length

Max length: 14
Median length: 14
Mean length: 14
Min length: 14

Characters and Unicode

Total characters: 14000
Distinct characters: 20
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
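
To illustrate how such character properties can be tallied, the sketch below counts Unicode general categories for one sample value of this variable (KHFCMB2015S-24) with Python's standard `unicodedata` module. The helper name `category_counts` is illustrative, and this is not necessarily how the profiling tool computes its tables.

```python
import unicodedata
from collections import Counter

def category_counts(text):
    """Count characters by Unicode general category, e.g. 'Lu' (Uppercase
    Letter), 'Nd' (Decimal Number), 'Pd' (Dash Punctuation)."""
    return Counter(unicodedata.category(ch) for ch in text)

counts = category_counts("KHFCMB2015S-24")
print(dict(counts))
```

The three categories found this way (Lu, Nd, Pd) correspond to the three distinct categories reported for the variable.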

Unique

Unique: 124
Unique (%): 12.4%

Sample

1st row: KHFCMB2015S-24
2nd row: KHFCMB2020S-32
3rd row: KHFCMB2020S-32
4th row: KHFCMB2020S-27
5th row: KHFCMB2020S-27

Value               Count  Frequency (%)
khfcmb2018s-14      22     2.2%
khfcmb2017s-04      18     1.8%
khfcmb2015l-05      11     1.1%
khfcmb2018s-09      8      0.8%
khfcmb2013s-40      8      0.8%
khfcmb2017s-23      8      0.8%
khfcmb2017s-14      7      0.7%
khfcmb2015s-17      7      0.7%
khfcmb2016s-16      7      0.7%
khfcmb2016s-26      7      0.7%
Other values (359)  897    89.7%

Most occurring characters

Value              Count  Frequency (%)
0                  1728   12.3%
2                  1482   10.6%
1                  1307   9.3%
B                  1009   7.2%
K                  1000   7.1%
H                  1000   7.1%
F                  1000   7.1%
C                  1000   7.1%
M                  1000   7.1%
-                  1000   7.1%
Other values (10)  2474   17.7%

Most occurring categories

Value             Count  Frequency (%)
Uppercase Letter  7008   50.1%
Decimal Number    5992   42.8%
Dash Punctuation  1000   7.1%

Most frequent character per category

Decimal Number

Value  Count  Frequency (%)
0      1728   28.8%
2      1482   24.7%
1      1307   21.8%
7      235    3.9%
6      230    3.8%
5      210    3.5%
8      208    3.5%
3      208    3.5%
4      199    3.3%
9      185    3.1%

Uppercase Letter

Value  Count  Frequency (%)
B      1009   14.4%
K      1000   14.3%
H      1000   14.3%
F      1000   14.3%
C      1000   14.3%
M      1000   14.3%
S      803    11.5%
L      188    2.7%
A      8      0.1%

Dash Punctuation

Value  Count  Frequency (%)
-      1000   100.0%

Most occurring scripts

Value   Count  Frequency (%)
Latin   7008   50.1%
Common  6992   49.9%

Most frequent character per script

Common

Value  Count  Frequency (%)
0      1728   24.7%
2      1482   21.2%
1      1307   18.7%
-      1000   14.3%
7      235    3.4%
6      230    3.3%
5      210    3.0%
8      208    3.0%
3      208    3.0%
4      199    2.8%

Latin

Value  Count  Frequency (%)
B      1009   14.4%
K      1000   14.3%
H      1000   14.3%
F      1000   14.3%
C      1000   14.3%
M      1000   14.3%
S      803    11.5%
L      188    2.7%
A      8      0.1%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  14000  100.0%

Most frequent character per block

ASCII

Value              Count  Frequency (%)
0                  1728   12.3%
2                  1482   10.6%
1                  1307   9.3%
B                  1009   7.2%
K                  1000   7.1%
H                  1000   7.1%
F                  1000   7.1%
C                  1000   7.1%
M                  1000   7.1%
-                  1000   7.1%
Other values (10)  2474   17.7%

HOLD_CD
Text

Distinct: 840
Distinct (%): 84.0%
Missing: 0
Missing (%): 0.0%
Memory size: 7.9 KiB

Length

Max length: 14
Median length: 14
Mean length: 14
Min length: 14

Characters and Unicode

Total characters: 14000
Distinct characters: 14
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 753
Unique (%): 75.3%

Sample

1st row: B007-2015-0032
2nd row: I001-2020-0018
3rd row: I001-2020-0017
4th row: I001-2020-0016
5th row: I001-2020-0015

Value               Count  Frequency (%)
b023-2010-0006      8      0.8%
b023-2009-0003      6      0.6%
b023-2010-0008      6      0.6%
b023-2011-0003      6      0.6%
b023-2009-0001      6      0.6%
b023-2015-0010      5      0.5%
b023-2008-0003      5      0.5%
b023-2009-0008      5      0.5%
b023-2016-0028      5      0.5%
b023-2008-0004      4      0.4%
Other values (830)  944    94.4%

Most occurring characters

Value             Count  Frequency (%)
0                 5025   35.9%
2                 2056   14.7%
-                 2000   14.3%
1                 1479   10.6%
3                 974    7.0%
B                 713    5.1%
5                 288    2.1%
I                 286    2.0%
6                 269    1.9%
4                 265    1.9%
Other values (4)  645    4.6%

Most occurring categories

Value             Count  Frequency (%)
Decimal Number    11000  78.6%
Dash Punctuation  2000   14.3%
Uppercase Letter  1000   7.1%

Most frequent character per category

Decimal Number

Value  Count  Frequency (%)
0      5025   45.7%
2      2056   18.7%
1      1479   13.4%
3      974    8.9%
5      288    2.6%
6      269    2.4%
4      265    2.4%
7      234    2.1%
8      215    2.0%
9      195    1.8%

Uppercase Letter

Value  Count  Frequency (%)
B      713    71.3%
I      286    28.6%
F      1      0.1%

Dash Punctuation

Value  Count  Frequency (%)
-      2000   100.0%

Most occurring scripts

Value   Count  Frequency (%)
Common  13000  92.9%
Latin   1000   7.1%

Most frequent character per script

Common

Value  Count  Frequency (%)
0      5025   38.7%
2      2056   15.8%
-      2000   15.4%
1      1479   11.4%
3      974    7.5%
5      288    2.2%
6      269    2.1%
4      265    2.0%
7      234    1.8%
8      215    1.7%

Latin

Value  Count  Frequency (%)
B      713    71.3%
I      286    28.6%
F      1      0.1%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  14000  100.0%

Most frequent character per block

ASCII

Value             Count  Frequency (%)
0                 5025   35.9%
2                 2056   14.7%
-                 2000   14.3%
1                 1479   10.6%
3                 974    7.0%
B                 713    5.1%
5                 288    2.1%
I                 286    2.0%
6                 269    1.9%
4                 265    1.9%
Other values (4)  645    4.6%

RCPT_DY
Categorical

HIGH CORRELATION, IMBALANCE

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 7.9 KiB

Value     Count
20201026  999
20201027  1

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Unique

Unique: 1
Unique (%): 0.1%

Sample

1st row: 20201026
2nd row: 20201026
3rd row: 20201026
4th row: 20201026
5th row: 20201026

Common Values

Value     Count  Frequency (%)
20201026  999    99.9%
20201027  1      0.1%

Length

Figure: Histogram of lengths of the category

Common Values (Plot)

Value     Count  Frequency (%)
20201026  999    99.9%
20201027  1      0.1%

TREAT_ORG_CD
Categorical

HIGH CORRELATION

Distinct: 9
Distinct (%): 0.9%
Missing: 0
Missing (%): 0.0%
Memory size: 7.9 KiB

Value             Count
B023              535
I001              267
B032              124
B031              52
I002              16
Other values (4)  6

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 3
Unique (%): 0.3%

Sample

1st row: B007
2nd row: I001
3rd row: I001
4th row: I001
5th row: I001

Common Values

Value  Count  Frequency (%)
B023   535    53.5%
I001   267    26.7%
B032   124    12.4%
B031   52     5.2%
I002   16     1.6%
I004   3      0.3%
B007   1      0.1%
F003   1      0.1%
B004   1      0.1%

Length

Figure: Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
b023   535    53.5%
i001   267    26.7%
b032   124    12.4%
b031   52     5.2%
i002   16     1.6%
i004   3      0.3%
b007   1      0.1%
f003   1      0.1%
b004   1      0.1%

RPT_OFFER_SEQ
Categorical

HIGH CORRELATION, IMBALANCE

Distinct: 3
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 7.9 KiB

Value  Count
1      998
2      1
6      1

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 2
Unique (%): 0.2%

Sample

1st row: 2
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value  Count  Frequency (%)
1      998    99.8%
2      1      0.1%
6      1      0.1%

Length

Figure: Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
1      998    99.8%
2      1      0.1%
6      1      0.1%

ONLIN_YN
Boolean

HIGH CORRELATION

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 KiB

Value  Count  Frequency (%)
True   714    71.4%
False  286    28.6%

TELGRM_SEQ
Real number (ℝ)

HIGH CORRELATION

Distinct: 535
Distinct (%): 53.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 155.649
Minimum: 1
Maximum: 535
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 8.9 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
Median: 91.5
Q3: 285.25
95-th percentile: 485.05
Maximum: 535
Range: 534
Interquartile range (IQR): 284.25

Descriptive statistics

Standard deviation: 167.63656
Coefficient of variation (CV): 1.0770166
Kurtosis: -0.72829093
Mean: 155.649
Median Absolute Deviation (MAD): 90.5
Skewness: 0.80865677
Sum: 155649
Variance: 28102.016
Monotonicity: Not monotonic
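
Several of the TELGRM_SEQ figures are derived from one another, which makes them easy to cross-check. The sketch below recomputes the mean, variance, coefficient of variation, IQR and range from the other reported values, using the standard textbook definitions; the variable names are illustrative.

```python
# Values copied from the TELGRM_SEQ statistics in this report.
n = 1000                   # number of observations
total = 155649             # Sum
std = 167.63656            # Standard deviation
q1, q3 = 1, 285.25         # Q1 and Q3
minimum, maximum = 1, 535  # Minimum and Maximum

mean = total / n                 # Mean = Sum / n
variance = std ** 2              # Variance = (Standard deviation)^2
cv = std / mean                  # Coefficient of variation = std / mean
iqr = q3 - q1                    # Interquartile range = Q3 - Q1
value_range = maximum - minimum  # Range = Maximum - Minimum

print(mean, round(variance, 3), round(cv, 7), iqr, value_range)
```

A CV slightly above 1 together with Q1 equal to the minimum reflects the large share of rows (291 of 1000) whose TELGRM_SEQ is 1.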
Figure: Histogram with fixed size bins (bins=50)

Common values

Value               Count  Frequency (%)
1                   291    29.1%
27                  3      0.3%
29                  3      0.3%
30                  3      0.3%
31                  3      0.3%
32                  3      0.3%
33                  3      0.3%
34                  3      0.3%
35                  3      0.3%
36                  3      0.3%
Other values (525)  682    68.2%

Minimum 10 values

Value  Count  Frequency (%)
1      291    29.1%
2      2      0.2%
3      2      0.2%
4      2      0.2%
5      2      0.2%
6      2      0.2%
7      2      0.2%
8      2      0.2%
9      2      0.2%
10     2      0.2%

Maximum 10 values

Value  Count  Frequency (%)
535    1      0.1%
534    1      0.1%
533    1      0.1%
532    1      0.1%
531    1      0.1%
530    1      0.1%
529    1      0.1%
528    1      0.1%
527    1      0.1%
526    1      0.1%

WRT_BASIS_DY
Categorical

HIGH CORRELATION

Distinct: 5
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 7.9 KiB

Value     Count
20201026  536
19000101  286
20201023  176
20201024  1
20201025  1

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Unique

Unique: 2
Unique (%): 0.2%

Sample

1st row: 20201024
2nd row: 19000101
3rd row: 19000101
4th row: 19000101
5th row: 19000101

Common Values

Value     Count  Frequency (%)
20201026  536    53.6%
19000101  286    28.6%
20201023  176    17.6%
20201024  1      0.1%
20201025  1      0.1%
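
The WRT_BASIS_DY values are eight-digit YYYYMMDD strings, and 19000101 stands apart from the other dates, which all fall in late October 2020. The sketch below parses such values and treats 19000101 as a "no date" placeholder. That reading is an assumption, not something the report states, and `PLACEHOLDER` and `parse_basis_day` are illustrative names.

```python
from datetime import date, datetime

# Assumption: 19000101 is a sentinel meaning "no basis date recorded".
PLACEHOLDER = "19000101"

def parse_basis_day(raw):
    """Parse a YYYYMMDD string into a date, mapping the assumed
    placeholder value to None."""
    if raw == PLACEHOLDER:
        return None
    return datetime.strptime(raw, "%Y%m%d").date()

print(parse_basis_day("20201026"))
print(parse_basis_day("19000101"))
```

If the placeholder reading is wrong for this dataset, the function can simply drop the sentinel check and return 1900-01-01 as an ordinary date.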

Length

Figure: Histogram of lengths of the category

Common Values (Plot)

Value     Count  Frequency (%)
20201026  536    53.6%
19000101  286    28.6%
20201023  176    17.6%
20201024  1      0.1%
20201025  1      0.1%

PINT_RCPT_AMT
Categorical

CONSTANT

Distinct: 1
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 7.9 KiB

Value  Count
0      1000

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value  Count  Frequency (%)
0      1000   100.0%

Length

Figure: Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
0      1000   100.0%

REG_DT
Categorical

HIGH CORRELATION

Distinct: 6
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Memory size: 7.9 KiB

Value                Count
2020/10/26 10:50:14  535
2020/10/26 10:57:32  286
2020/10/26 10:16:35  176
2020/10/26 15:40:55  1
2020/10/26 10:50:03  1

Length

Max length: 19
Median length: 19
Mean length: 19
Min length: 19

Unique

Unique: 3
Unique (%): 0.3%

Sample

1st row: 2020/10/26 15:40:55
2nd row: 2020/10/26 10:57:32
3rd row: 2020/10/26 10:57:32
4th row: 2020/10/26 10:57:32
5th row: 2020/10/26 10:57:32

Common Values

Value                Count  Frequency (%)
2020/10/26 10:50:14  535    53.5%
2020/10/26 10:57:32  286    28.6%
2020/10/26 10:16:35  176    17.6%
2020/10/26 15:40:55  1      0.1%
2020/10/26 10:50:03  1      0.1%
2020/10/26 10:30:23  1      0.1%

Length

Figure: Histogram of lengths of the category

Common Values (Plot)

Value       Count  Frequency (%)
2020/10/26  1000   50.0%
10:50:14    535    26.8%
10:57:32    286    14.3%
10:16:35    176    8.8%
15:40:55    1      < 0.1%
10:50:03    1      < 0.1%
10:30:23    1      < 0.1%

Interactions

Figure: interactions plot (image not preserved)

Correlations

               RCPT_DY  TREAT_ORG_CD  RPT_OFFER_SEQ  ONLIN_YN  TELGRM_SEQ  WRT_BASIS_DY  REG_DT
RCPT_DY        1.000    1.000         0.000          0.000     0.000       1.000         1.000
TREAT_ORG_CD   1.000    1.000         1.000          1.000     0.619       1.000         1.000
RPT_OFFER_SEQ  0.000    1.000         1.000          0.000     0.000       0.717         1.000
ONLIN_YN       0.000    1.000         0.000          1.000     0.895       1.000         1.000
TELGRM_SEQ     0.000    0.619         0.000          0.895     1.000       0.792         0.634
WRT_BASIS_DY   1.000    1.000         0.717          1.000     0.792       1.000         1.000
REG_DT         1.000    1.000         1.000          1.000     0.634       1.000         1.000

               RPT_OFFER_SEQ  RCPT_DY  TREAT_ORG_CD  ONLIN_YN  WRT_BASIS_DY  REG_DT
RPT_OFFER_SEQ  1.000          0.000    0.997         0.000     0.705         0.998
RCPT_DY        0.000          1.000    0.996         0.000     0.998         0.998
TREAT_ORG_CD   0.997          0.996    1.000         0.996     0.998         0.998
ONLIN_YN       0.000          0.000    0.996         1.000     0.998         0.998
WRT_BASIS_DY   0.705          0.998    0.998         0.998     1.000         0.999
REG_DT         0.998          0.998    0.998         0.998     0.999         1.000

               TELGRM_SEQ  RCPT_DY  TREAT_ORG_CD  RPT_OFFER_SEQ  ONLIN_YN  WRT_BASIS_DY  REG_DT
TELGRM_SEQ     1.000       0.000    0.340         0.000          0.729     0.447         0.399
RCPT_DY        0.000       1.000    0.996         0.000          0.000     0.998         0.998
TREAT_ORG_CD   0.340       0.996    1.000         0.997          0.996     0.998         0.998
RPT_OFFER_SEQ  0.000       0.000    0.997         1.000          0.000     0.705         0.998
ONLIN_YN       0.729       0.000    0.996         0.000          1.000     0.998         0.998
WRT_BASIS_DY   0.447       0.998    0.998         0.705          0.998     1.000         0.999
REG_DT         0.399       0.998    0.998         0.998          0.998     0.999         1.000
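
The report does not say how these coefficients were computed. One common measure of association between two categorical columns, with values between 0 and 1 like those above, is Cramér's V. The sketch below is a plain, uncorrected implementation over a contingency table and is offered only as an illustration of that measure; it is not claimed to reproduce the matrices in this report. The function name and the toy tables are hypothetical.

```python
import math

def cramers_v(table):
    """Uncorrected Cramér's V for a contingency table given as a list of
    rows of counts: sqrt(chi2 / (n * min(rows - 1, cols - 1)))."""
    rows, cols = len(table), len(table[0])
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(rows)) for j in range(cols)]
    chi2 = 0.0
    for i in range(rows):
        for j in range(cols):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                chi2 += (table[i][j] - expected) ** 2 / expected
    denom = n * min(rows - 1, cols - 1)
    return math.sqrt(chi2 / denom) if denom > 0 else 0.0

# Toy tables (not taken from the dataset): one column fully determines
# the other in the first, and the two are independent in the second.
print(cramers_v([[10, 0], [0, 10]]))
print(cramers_v([[5, 5], [5, 5]]))
```

Values close to 1 between columns such as TREAT_ORG_CD and REG_DT would be expected under any such measure when one column almost fully determines the other.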

Missing values

Figure: A simple visualization of nullity by column.
Figure: Nullity matrix, a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

Index  LIQD_PLAN_CD    HOLD_CD         RCPT_DY   TREAT_ORG_CD  RPT_OFFER_SEQ  ONLIN_YN  TELGRM_SEQ  WRT_BASIS_DY  PINT_RCPT_AMT  REG_DT
0      KHFCMB2015S-24  B007-2015-0032  20201026  B007          2              Y         1           20201024      0              2020/10/26 15:40:55
1      KHFCMB2020S-32  I001-2020-0018  20201026  I001          1              N         1           19000101      0              2020/10/26 10:57:32
2      KHFCMB2020S-32  I001-2020-0017  20201026  I001          1              N         1           19000101      0              2020/10/26 10:57:32
3      KHFCMB2020S-27  I001-2020-0016  20201026  I001          1              N         1           19000101      0              2020/10/26 10:57:32
4      KHFCMB2020S-27  I001-2020-0015  20201026  I001          1              N         1           19000101      0              2020/10/26 10:57:32
5      KHFCMB2020S-25  I001-2020-0014  20201026  I001          1              N         1           19000101      0              2020/10/26 10:57:32
6      KHFCMB2020S-22  I001-2020-0013  20201026  I001          1              N         1           19000101      0              2020/10/26 10:57:32
7      KHFCMB2020S-22  I001-2020-0012  20201026  I001          1              N         1           19000101      0              2020/10/26 10:57:32
8      KHFCMB2020S-20  I001-2020-0011  20201026  I001          1              N         1           19000101      0              2020/10/26 10:57:32
9      KHFCMB2020S-16  I001-2020-0010  20201026  I001          1              N         1           19000101      0              2020/10/26 10:57:32

Last rows

Index  LIQD_PLAN_CD    HOLD_CD         RCPT_DY   TREAT_ORG_CD  RPT_OFFER_SEQ  ONLIN_YN  TELGRM_SEQ  WRT_BASIS_DY  PINT_RCPT_AMT  REG_DT
990    KHFCMB2016S-13  B032-2016-0026  20201026  B032          1              Y         33          20201023      0              2020/10/26 10:16:35
991    KHFCMB2016S-12  B032-2016-0023  20201026  B032          1              Y         32          20201023      0              2020/10/26 10:16:35
992    KHFCMB2016S-12  B032-2016-0022  20201026  B032          1              Y         31          20201023      0              2020/10/26 10:16:35
993    KHFCMB2016S-11  B032-2016-0019  20201026  B032          1              Y         30          20201023      0              2020/10/26 10:16:35
994    KHFCMB2016S-11  B032-2016-0018  20201026  B032          1              Y         29          20201023      0              2020/10/26 10:16:35
995    KHFCMB2016S-09  B032-2016-0015  20201026  B032          1              Y         28          20201023      0              2020/10/26 10:16:35
996    KHFCMB2016S-09  B032-2016-0014  20201026  B032          1              Y         27          20201023      0              2020/10/26 10:16:35
997    KHFCMB2016S-09  B032-2016-0013  20201026  B032          1              Y         26          20201023      0              2020/10/26 10:16:35
998    KHFCMB2016S-07  B032-2016-0010  20201026  B032          1              Y         25          20201023      0              2020/10/26 10:16:35
999    KHFCMB2016S-07  B032-2016-0009  20201026  B032          1              Y         24          20201023      0              2020/10/26 10:16:35
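
In the sample rows, HOLD_CD appears to be built from three hyphen-separated parts, the first of which matches TREAT_ORG_CD in the same row, while ONLIN_YN holds Y or N and REG_DT holds a timestamp in YYYY/MM/DD HH:MM:SS form. The sketch below splits these fields for one row. The interpretation of the HOLD_CD parts as organization code, year and sequence number is an assumption based only on these rows, and `parse_record` and its output keys are illustrative names.

```python
from datetime import datetime

def parse_record(hold_cd, onlin_yn, reg_dt):
    """Split HOLD_CD into its three hyphen-separated parts and convert
    ONLIN_YN and REG_DT to native Python types.
    Assumption: the parts are organization code, year, sequence number."""
    org_code, year, seq = hold_cd.split("-")
    return {
        "org_code": org_code,
        "year": int(year),
        "seq": int(seq),
        "online": onlin_yn == "Y",
        "registered": datetime.strptime(reg_dt, "%Y/%m/%d %H:%M:%S"),
    }

# Field values taken from the first sample row (index 0).
record = parse_record("B007-2015-0032", "Y", "2020/10/26 15:40:55")
print(record)
```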