Overview

Dataset statistics

Number of variables: 9
Number of observations: 83
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 6.0 KiB
Average record size in memory: 73.5 B

Variable types

Categorical: 7
Text: 2

Dataset

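Description: Information on winners and reserve (waitlisted) applicants for youth purchased rental housing (청년매입임대주택). The fields are 자치구 (district), 구분 (supply type), 주택형 (housing type), 주소지 (address), 성별 (gender), 당첨자 커트라인 신청순위 (winner cutoff application rank), 당첨자 커트라인 가점총점 (winner cutoff total points), 예비자 커트라인 신청순위 (reserve cutoff application rank), and 예비자 커트라인 가점총점 (reserve cutoff total points).
Author: 서울주택도시공사 (Seoul Housing & Communities Corporation)
URL: https://www.data.go.kr/data/15102879/fileData.do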

Alerts

High correlation: 구분 is highly overall correlated with 자치구 and 1 other field
High correlation: 자치구 is highly overall correlated with 구분 and 1 other field
High correlation: 성별 is highly overall correlated with 구분
High correlation: 예비자 커트라인 순위 is highly overall correlated with 자치구
Imbalance: 성별 is highly imbalanced (50.7%)
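
The 50.7% figure in the imbalance alert can be checked by hand. A minimal sketch, assuming the score is one minus the normalized Shannon entropy of the class counts (this definition reproduces the reported value, but the report itself does not state it); the function name `imbalance_score` is hypothetical:

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy in bits of the
    class distribution and k is the number of classes (assumed definition)."""
    total = sum(counts)
    k = len(counts)
    if k <= 1:
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)


# Value counts for 성별 as listed in this report: 성별무관, 남자, 여자
score = imbalance_score([70, 7, 6])
print(f"{score:.1%}")
```

A perfectly balanced column scores 0 under this definition, and a column dominated by one value approaches 1.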

Reproduction

Analysis started: 2024-04-20 18:13:44.754397
Analysis finished: 2024-04-20 18:13:46.427350
Duration: 1.67 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
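
A report like this one can be regenerated roughly as follows. This is a sketch under stated assumptions: the CSV path, the `cp949` encoding, the report title and the helper name `build_report` are all hypothetical, and the original `config.json` settings are not reproduced here.

```python
from pathlib import Path


def build_report(csv_path, out_path="report.html", encoding="cp949"):
    """Profile a CSV file and write an HTML report (illustrative helper)."""
    # Imported lazily so this module loads even when the third-party
    # packages (pandas, ydata-profiling) are not installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # The encoding is an assumption; adjust it to match the downloaded file.
    df = pd.read_csv(csv_path, encoding=encoding)
    profile = ProfileReport(df, title="Youth purchased rental housing cutoffs")
    profile.to_file(Path(out_path))
    return profile
```

Calling `build_report("winners.csv")` with a locally downloaded copy of the dataset would write `report.html` next to the script.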

Variables

구분
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): 2.4%
Missing: 0
Missing (%): 0.0%
Memory size: 792.0 B

신규공급: 54
재공급: 29

Length

Max length: 4
Median length: 4
Mean length: 3.6506024
Min length: 3
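
The length statistics follow directly from the value counts. A small sketch, assuming lengths are counted in characters (Unicode code points), which matches the figures reported for 구분:

```python
from statistics import mean, median

# Value counts for 구분 as listed in this report.
value_counts = {"신규공급": 54, "재공급": 29}

# Expand the counts into one length per observation.
lengths = [len(value) for value, count in value_counts.items() for _ in range(count)]

stats = {
    "max": max(lengths),
    "median": median(lengths),
    "mean": mean(lengths),
    "min": min(lengths),
}
print(stats)
```

The mean works out to (54 × 4 + 29 × 3) / 83 = 303 / 83, which rounds to 3.6506024.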

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 재공급
2nd row: 신규공급
3rd row: 신규공급
4th row: 신규공급
5th row: 신규공급

Common Values

Value | Count | Frequency (%)
신규공급 | 54 | 65.1%
재공급 | 29 | 34.9%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

자치구
Categorical

HIGH CORRELATION

Distinct: 19
Distinct (%): 22.9%
Missing: 0
Missing (%): 0.0%
Memory size: 792.0 B

강동구: 16
구로구: 11
금천구: 8
양천구: 7
중랑구: 5
Other values (14): 36

Length

Max length: 4
Median length: 3
Mean length: 3.1445783
Min length: 3

Unique

Unique: 4
Unique (%): 4.8%

Sample

1st row: 강남구
2nd row: 강남구
3rd row: 강남구
4th row: 강남구
5th row: 강동구

Common Values

Value | Count | Frequency (%)
강동구 | 16 | 19.3%
구로구 | 11 | 13.3%
금천구 | 8 | 9.6%
양천구 | 7 | 8.4%
중랑구 | 5 | 6.0%
성북구 | 5 | 6.0%
강남구 | 4 | 4.8%
동대문구 | 4 | 4.8%
서대문구 | 4 | 4.8%
영등포구 | 4 | 4.8%
Other values (9) | 15 | 18.1%

Length

[Figure: Histogram of lengths of the category]

주택명
Text

Distinct: 79
Distinct (%): 95.2%
Missing: 0
Missing (%): 0.0%
Memory size: 792.0 B

Length

Max length: 18
Median length: 15
Mean length: 11
Min length: 4

Characters and Unicode

Total characters: 913
Distinct characters: 114
Distinct categories: 7
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
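
As an illustration of those character properties, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below counts categories for one sample value from this column; the long names in `CATEGORY_NAMES` are the standard Unicode names for the two-letter codes, and the helper name `category_counts` is hypothetical.

```python
import unicodedata
from collections import Counter

# Two-letter general category codes and the long names used in this report.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Nd": "Decimal Number",
    "Zs": "Space Separator",
    "Lu": "Uppercase Letter",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Po": "Other Punctuation",
    "Pd": "Dash Punctuation",
}


def category_counts(text):
    """Count the Unicode general category of every character in text."""
    return Counter(unicodedata.category(ch) for ch in text)


counts = category_counts("백년빌 [26B]")
for code, n in counts.most_common():
    print(CATEGORY_NAMES.get(code, code), n)
```

Summing such counts over all 83 values is one way to arrive at a per-category table like the one in this section.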

Unique

Unique: 76
Unique (%): 91.6%

Sample

1st row: 백년빌
2nd row: 백년빌 [26B]
3rd row: 백년빌 [33A]
4th row: 백년빌
5th row: 와이디하우스 [27B]

Value | Count | Frequency (%)
27b | 7 | 4.8%
28b | 6 | 4.1%
에코리움빌 | 5 | 3.4%
백년빌 | 5 | 3.4%
도원휴먼빌 | 5 | 3.4%
b동 | 4 | 2.7%
20b | 4 | 2.7%
22b | 3 | 2.1%
신우휴먼빌암사2차 | 3 | 2.1%
강동서영스윗홈6 | 3 | 2.1%
Other values (66) | 101 | 69.2%

Most occurring characters

Value | Count | Frequency (%)
(space) | 145 | 15.9%
] | 49 | 5.4%
[ | 49 | 5.4%
2 | 46 | 5.0%
(Hangul, not preserved) | 41 | 4.5%
B | 37 | 4.1%
(Hangul, not preserved) | 26 | 2.8%
(Hangul, not preserved) | 23 | 2.5%
1 | 22 | 2.4%
3 | 22 | 2.4%
Other values (104) | 453 | 49.6%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 451 | 49.4%
Decimal Number | 160 | 17.5%
Space Separator | 145 | 15.9%
Uppercase Letter | 55 | 6.0%
Close Punctuation | 49 | 5.4%
Open Punctuation | 49 | 5.4%
Other Punctuation | 4 | 0.4%

Most frequent character per category

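Other Letter
Value | Count | Frequency (%)
(Hangul, not preserved) | 41 | 9.1%
(Hangul, not preserved) | 26 | 5.8%
(Hangul, not preserved) | 23 | 5.1%
(Hangul, not preserved) | 19 | 4.2%
(Hangul, not preserved) | 18 | 4.0%
(Hangul, not preserved) | 17 | 3.8%
(Hangul, not preserved) | 14 | 3.1%
(Hangul, not preserved) | 14 | 3.1%
(Hangul, not preserved) | 12 | 2.7%
(Hangul, not preserved) | 11 | 2.4%
Other values (87) | 256 | 56.8%

Decimal Number
Value | Count | Frequency (%)
2 | 46 | 28.7%
1 | 22 | 13.8%
3 | 22 | 13.8%
0 | 14 | 8.8%
7 | 14 | 8.8%
4 | 12 | 7.5%
8 | 11 | 6.9%
6 | 8 | 5.0%
5 | 7 | 4.4%
9 | 4 | 2.5%

Uppercase Letter
Value | Count | Frequency (%)
B | 37 | 67.3%
A | 17 | 30.9%
C | 1 | 1.8%

Space Separator
Value | Count | Frequency (%)
(space) | 145 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
] | 49 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
[ | 49 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
, | 4 | 100.0%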

Most occurring scripts

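Value | Count | Frequency (%)
Hangul | 451 | 49.4%
Common | 407 | 44.6%
Latin | 55 | 6.0%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
(Hangul, not preserved) | 41 | 9.1%
(Hangul, not preserved) | 26 | 5.8%
(Hangul, not preserved) | 23 | 5.1%
(Hangul, not preserved) | 19 | 4.2%
(Hangul, not preserved) | 18 | 4.0%
(Hangul, not preserved) | 17 | 3.8%
(Hangul, not preserved) | 14 | 3.1%
(Hangul, not preserved) | 14 | 3.1%
(Hangul, not preserved) | 12 | 2.7%
(Hangul, not preserved) | 11 | 2.4%
Other values (87) | 256 | 56.8%

Common
Value | Count | Frequency (%)
(space) | 145 | 35.6%
] | 49 | 12.0%
[ | 49 | 12.0%
2 | 46 | 11.3%
1 | 22 | 5.4%
3 | 22 | 5.4%
0 | 14 | 3.4%
7 | 14 | 3.4%
4 | 12 | 2.9%
8 | 11 | 2.7%
Other values (4) | 23 | 5.7%

Latin
Value | Count | Frequency (%)
B | 37 | 67.3%
A | 17 | 30.9%
C | 1 | 1.8%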

Most occurring blocks

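Value | Count | Frequency (%)
ASCII | 462 | 50.6%
Hangul | 451 | 49.4%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 145 | 31.4%
] | 49 | 10.6%
[ | 49 | 10.6%
2 | 46 | 10.0%
B | 37 | 8.0%
1 | 22 | 4.8%
3 | 22 | 4.8%
A | 17 | 3.7%
0 | 14 | 3.0%
7 | 14 | 3.0%
Other values (7) | 47 | 10.2%

Hangul
Value | Count | Frequency (%)
(Hangul, not preserved) | 41 | 9.1%
(Hangul, not preserved) | 26 | 5.8%
(Hangul, not preserved) | 23 | 5.1%
(Hangul, not preserved) | 19 | 4.2%
(Hangul, not preserved) | 18 | 4.0%
(Hangul, not preserved) | 17 | 3.8%
(Hangul, not preserved) | 14 | 3.1%
(Hangul, not preserved) | 14 | 3.1%
(Hangul, not preserved) | 12 | 2.7%
(Hangul, not preserved) | 11 | 2.4%
Other values (87) | 256 | 56.8%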

주소지
Text

Distinct: 52
Distinct (%): 62.7%
Missing: 0
Missing (%): 0.0%
Memory size: 792.0 B

Length

Max length: 27
Median length: 25
Mean length: 22.204819
Min length: 9

Characters and Unicode

Total characters: 1843
Distinct characters: 112
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 26
Unique (%): 31.3%

Sample

1st row: 개포로24길 13, 개포동 1216-7
2nd row: 논현로57길 39, 도곡동 547-2
3rd row: 논현로57길 39, 도곡동 547-2
4th row: 도곡로18길 10, 도곡동 547-11
5th row: 구천면로35길 27, 천호동 391-21,26

Value | Count | Frequency (%)
천호동 | 8 | 2.4%
신월동 | 7 | 2.1%
양재대로91길 | 5 | 1.5%
27 | 5 | 1.5%
성내동 | 5 | 1.5%
고척동 | 4 | 1.2%
독산동 | 4 | 1.2%
진황도로23길 | 4 | 1.2%
시흥동 | 4 | 1.2%
22 | 4 | 1.2%
Other values (167) | 278 | 84.8%

Most occurring characters

Value | Count | Frequency (%)
(space) | 245 | 13.3%
1 | 143 | 7.8%
2 | 125 | 6.8%
3 | 103 | 5.6%
- | 101 | 5.5%
, | 100 | 5.4%
(Hangul, not preserved) | 90 | 4.9%
(Hangul, not preserved) | 79 | 4.3%
(Hangul, not preserved) | 75 | 4.1%
4 | 73 | 4.0%
Other values (102) | 709 | 38.5%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 760 | 41.2%
Other Letter | 637 | 34.6%
Space Separator | 245 | 13.3%
Dash Punctuation | 101 | 5.5%
Other Punctuation | 100 | 5.4%

Most frequent character per category

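Other Letter
Value | Count | Frequency (%)
(Hangul, not preserved) | 90 | 14.1%
(Hangul, not preserved) | 79 | 12.4%
(Hangul, not preserved) | 75 | 11.8%
(Hangul, not preserved) | 16 | 2.5%
(Hangul, not preserved) | 15 | 2.4%
(Hangul, not preserved) | 14 | 2.2%
(Hangul, not preserved) | 12 | 1.9%
(Hangul, not preserved) | 11 | 1.7%
(Hangul, not preserved) | 10 | 1.6%
(Hangul, not preserved) | 9 | 1.4%
Other values (89) | 306 | 48.0%

Decimal Number
Value | Count | Frequency (%)
1 | 143 | 18.8%
2 | 125 | 16.4%
3 | 103 | 13.6%
4 | 73 | 9.6%
6 | 64 | 8.4%
5 | 61 | 8.0%
8 | 54 | 7.1%
7 | 48 | 6.3%
0 | 45 | 5.9%
9 | 44 | 5.8%

Space Separator
Value | Count | Frequency (%)
(space) | 245 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 101 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
, | 100 | 100.0%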

Most occurring scripts

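Value | Count | Frequency (%)
Common | 1206 | 65.4%
Hangul | 637 | 34.6%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
(Hangul, not preserved) | 90 | 14.1%
(Hangul, not preserved) | 79 | 12.4%
(Hangul, not preserved) | 75 | 11.8%
(Hangul, not preserved) | 16 | 2.5%
(Hangul, not preserved) | 15 | 2.4%
(Hangul, not preserved) | 14 | 2.2%
(Hangul, not preserved) | 12 | 1.9%
(Hangul, not preserved) | 11 | 1.7%
(Hangul, not preserved) | 10 | 1.6%
(Hangul, not preserved) | 9 | 1.4%
Other values (89) | 306 | 48.0%

Common
Value | Count | Frequency (%)
(space) | 245 | 20.3%
1 | 143 | 11.9%
2 | 125 | 10.4%
3 | 103 | 8.5%
- | 101 | 8.4%
, | 100 | 8.3%
4 | 73 | 6.1%
6 | 64 | 5.3%
5 | 61 | 5.1%
8 | 54 | 4.5%
Other values (3) | 137 | 11.4%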

Most occurring blocks

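Value | Count | Frequency (%)
ASCII | 1206 | 65.4%
Hangul | 637 | 34.6%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 245 | 20.3%
1 | 143 | 11.9%
2 | 125 | 10.4%
3 | 103 | 8.5%
- | 101 | 8.4%
, | 100 | 8.3%
4 | 73 | 6.1%
6 | 64 | 5.3%
5 | 61 | 5.1%
8 | 54 | 4.5%
Other values (3) | 137 | 11.4%

Hangul
Value | Count | Frequency (%)
(Hangul, not preserved) | 90 | 14.1%
(Hangul, not preserved) | 79 | 12.4%
(Hangul, not preserved) | 75 | 11.8%
(Hangul, not preserved) | 16 | 2.5%
(Hangul, not preserved) | 15 | 2.4%
(Hangul, not preserved) | 14 | 2.2%
(Hangul, not preserved) | 12 | 1.9%
(Hangul, not preserved) | 11 | 1.7%
(Hangul, not preserved) | 10 | 1.6%
(Hangul, not preserved) | 9 | 1.4%
Other values (89) | 306 | 48.0%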

성별
Categorical

HIGH CORRELATION, IMBALANCE

Distinct: 3
Distinct (%): 3.6%
Missing: 0
Missing (%): 0.0%
Memory size: 792.0 B

성별무관: 70
남자: 7
여자: 6

Length

Max length: 4
Median length: 4
Mean length: 3.686747
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 성별무관
2nd row: 성별무관
3rd row: 성별무관
4th row: 성별무관
5th row: 성별무관

Common Values

Value | Count | Frequency (%)
성별무관 | 70 | 84.3%
남자 | 7 | 8.4%
여자 | 6 | 7.2%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

당첨자 커트라인 순위
Categorical

Distinct: 4
Distinct (%): 4.8%
Missing: 0
Missing (%): 0.0%
Memory size: 792.0 B

2순위: 41
1순위: 33
3순위: 7
<NA>: 2

Length

Max length: 4
Median length: 3
Mean length: 3.0240964
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1순위
2nd row: 1순위
3rd row: 1순위
4th row: 1순위
5th row: 2순위

Common Values

Value | Count | Frequency (%)
2순위 | 41 | 49.4%
1순위 | 33 | 39.8%
3순위 | 7 | 8.4%
<NA> | 2 | 2.4%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

당첨자 커트라인 가점총점
Categorical

Distinct: 11
Distinct (%): 13.3%
Missing: 0
Missing (%): 0.0%
Memory size: 792.0 B

5점: 20
7점: 15
2점: 12
4점: 10
3점: 8
Other values (6): 18

Length

Max length: 4
Median length: 2
Mean length: 2.0722892
Min length: 2

Unique

Unique: 1
Unique (%): 1.2%

Sample

1st row: 5점
2nd row: 4점
3rd row: 8점
4th row: 3점
5th row: 4점

Common Values

Value | Count | Frequency (%)
5점 | 20 | 24.1%
7점 | 15 | 18.1%
2점 | 12 | 14.5%
4점 | 10 | 12.0%
3점 | 8 | 9.6%
6점 | 6 | 7.2%
8점 | 5 | 6.0%
0점 | 2 | 2.4%
10점 | 2 | 2.4%
<NA> | 2 | 2.4%
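
The score columns are profiled as Categorical because their values are strings such as 5점, and the `<NA>` entries are counted as ordinary values (Missing is 0). A minimal sketch of converting such values to integers before numeric analysis, assuming every valid entry is digits followed by 점 and that anything else should be treated as missing; `parse_points` is a hypothetical helper, not part of the dataset or the profiling tool:

```python
import re


def parse_points(value):
    """Convert a string like '5점' to the integer 5; return None otherwise."""
    match = re.fullmatch(r"(\d+)점", value.strip())
    return int(match.group(1)) if match else None


raw = ["5점", "10점", "0점", "<NA>"]
parsed = [parse_points(v) for v in raw]
print(parsed)
```

The same approach with the suffix 순위 instead of 점 would apply to the rank columns.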

Length

[Figure: Histogram of lengths of the category]

예비자 커트라인 순위
Categorical

HIGH CORRELATION

Distinct: 4
Distinct (%): 4.8%
Missing: 0
Missing (%): 0.0%
Memory size: 792.0 B

2순위: 42
3순위: 20
1순위: 14
<NA>: 7

Length

Max length: 4
Median length: 3
Mean length: 3.0843373
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1순위
2nd row: 2순위
3rd row: 1순위
4th row: 2순위
5th row: 2순위

Common Values

Value | Count | Frequency (%)
2순위 | 42 | 50.6%
3순위 | 20 | 24.1%
1순위 | 14 | 16.9%
<NA> | 7 | 8.4%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

예비자 커트라인 가점총점
Categorical

Distinct: 9
Distinct (%): 10.8%
Missing: 0
Missing (%): 0.0%
Memory size: 792.0 B

5점: 23
2점: 18
3점: 10
4점: 9
7점: 7
Other values (4): 16

Length

Max length: 4
Median length: 2
Mean length: 2.1686747
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0점
2nd row: 8점
3rd row: 5점
4th row: 8점
5th row: 3점

Common Values

Value | Count | Frequency (%)
5점 | 23 | 27.7%
2점 | 18 | 21.7%
3점 | 10 | 12.0%
4점 | 9 | 10.8%
7점 | 7 | 8.4%
<NA> | 7 | 8.4%
6점 | 4 | 4.8%
8점 | 3 | 3.6%
0점 | 2 | 2.4%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

Correlations

[Figure: correlation heatmap]

Variable | 구분 | 자치구 | 주택명 | 주소지 | 성별 | 당첨자 커트라인 순위 | 당첨자 커트라인 가점총점 | 예비자 커트라인 순위 | 예비자 커트라인 가점총점
구분 | 1.000 | 0.835 | 0.761 | 1.000 | 0.361 | 0.290 | 0.176 | 0.245 | 0.168
자치구 | 0.835 | 1.000 | 0.960 | 1.000 | 0.689 | 0.660 | 0.361 | 0.763 | 0.362
주택명 | 0.761 | 0.960 | 1.000 | 0.000 | 1.000 | 0.975 | 0.947 | 0.000 | 0.000
주소지 | 1.000 | 1.000 | 0.000 | 1.000 | 0.841 | 0.760 | 0.626 | 0.861 | 0.640
성별 | 0.361 | 0.689 | 1.000 | 0.841 | 1.000 | 0.480 | 0.609 | 0.431 | 0.238
당첨자 커트라인 순위 | 0.290 | 0.660 | 0.975 | 0.760 | 0.480 | 1.000 | 0.000 | 0.826 | 0.543
당첨자 커트라인 가점총점 | 0.176 | 0.361 | 0.947 | 0.626 | 0.609 | 0.000 | 1.000 | 0.580 | 0.000
예비자 커트라인 순위 | 0.245 | 0.763 | 0.000 | 0.861 | 0.431 | 0.826 | 0.580 | 1.000 | 0.461
예비자 커트라인 가점총점 | 0.168 | 0.362 | 0.000 | 0.640 | 0.238 | 0.543 | 0.000 | 0.461 | 1.000

[Figure: correlation heatmap]

Variable | 당첨자 커트라인 순위 | 당첨자 커트라인 가점총점 | 예비자 커트라인 순위 | 성별 | 구분 | 자치구 | 예비자 커트라인 가점총점
당첨자 커트라인 순위 | 1.000 | 0.000 | 0.498 | 0.190 | 0.466 | 0.403 | 0.393
당첨자 커트라인 가점총점 | 0.000 | 1.000 | 0.401 | 0.431 | 0.122 | 0.122 | 0.000
예비자 커트라인 순위 | 0.498 | 0.401 | 1.000 | 0.162 | 0.397 | 0.505 | 0.317
성별 | 0.190 | 0.431 | 0.162 | 1.000 | 0.570 | 0.433 | 0.145
구분 | 0.466 | 0.122 | 0.397 | 0.570 | 1.000 | 0.689 | 0.116
자치구 | 0.403 | 0.122 | 0.505 | 0.433 | 0.689 | 1.000 | 0.135
예비자 커트라인 가점총점 | 0.393 | 0.000 | 0.317 | 0.145 | 0.116 | 0.135 | 1.000

[Figure: correlation heatmap]

Variable | 구분 | 자치구 | 성별 | 당첨자 커트라인 순위 | 당첨자 커트라인 가점총점 | 예비자 커트라인 순위 | 예비자 커트라인 가점총점
구분 | 1.000 | 0.689 | 0.570 | 0.466 | 0.122 | 0.397 | 0.116
자치구 | 0.689 | 1.000 | 0.433 | 0.403 | 0.122 | 0.505 | 0.135
성별 | 0.570 | 0.433 | 1.000 | 0.190 | 0.431 | 0.162 | 0.145
당첨자 커트라인 순위 | 0.466 | 0.403 | 0.190 | 1.000 | 0.000 | 0.498 | 0.393
당첨자 커트라인 가점총점 | 0.122 | 0.122 | 0.431 | 0.000 | 1.000 | 0.401 | 0.000
예비자 커트라인 순위 | 0.397 | 0.505 | 0.162 | 0.498 | 0.401 | 1.000 | 0.317
예비자 커트라인 가점총점 | 0.116 | 0.135 | 0.145 | 0.393 | 0.000 | 0.317 | 1.000
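
The matrices above report pairwise association strengths between 0 and 1 for categorical columns. One standard measure of that kind is Cramér's V. The sketch below implements the plain, uncorrected statistic in pure Python; whether the report used exactly this formula (rather than, for example, a bias-corrected variant) is an assumption, and the function name `cramers_v` is hypothetical.

```python
import math
from collections import Counter


def cramers_v(xs, ys):
    """Uncorrected Cramér's V between two equal-length categorical sequences."""
    n = len(xs)
    row = Counter(xs)             # marginal counts of the first variable
    col = Counter(ys)             # marginal counts of the second variable
    joint = Counter(zip(xs, ys))  # observed counts per pair of categories
    k = min(len(row), len(col)) - 1
    if n == 0 or k == 0:
        return 0.0
    chi2 = 0.0
    for r, r_count in row.items():
        for c, c_count in col.items():
            expected = r_count * c_count / n
            observed = joint.get((r, c), 0)
            chi2 += (observed - expected) ** 2 / expected
    return math.sqrt(chi2 / (n * k))


# Toy data: a perfectly associated pair and an independent pair.
perfect = cramers_v(["a"] * 5 + ["b"] * 5, ["x"] * 5 + ["y"] * 5)
independent = cramers_v(["a", "a", "b", "b"], ["x", "y", "x", "y"])
print(perfect, independent)
```

With only 83 rows and columns that have many distinct values, any such statistic tends to be inflated, which is worth keeping in mind when reading the values close to 1.000 above.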

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

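First rows

Row | 구분 | 자치구 | 주택명 | 주소지 | 성별 | 당첨자 커트라인 순위 | 당첨자 커트라인 가점총점 | 예비자 커트라인 순위 | 예비자 커트라인 가점총점
0 | 재공급 | 강남구 | 백년빌 | 개포로24길 13, 개포동 1216-7 | 성별무관 | 1순위 | 5점 | 1순위 | 0점
1 | 신규공급 | 강남구 | 백년빌 [26B] | 논현로57길 39, 도곡동 547-2 | 성별무관 | 1순위 | 4점 | 2순위 | 8점
2 | 신규공급 | 강남구 | 백년빌 [33A] | 논현로57길 39, 도곡동 547-2 | 성별무관 | 1순위 | 8점 | 1순위 | 5점
3 | 신규공급 | 강남구 | 백년빌 | 도곡로18길 10, 도곡동 547-11 | 성별무관 | 1순위 | 3점 | 2순위 | 8점
4 | 신규공급 | 강동구 | 와이디하우스 [27B] | 구천면로35길 27, 천호동 391-21,26 | 성별무관 | 2순위 | 4점 | 2순위 | 3점
5 | 신규공급 | 강동구 | 와이디하우스 [38A] | 구천면로35길 27, 천호동 391-21,26 | 성별무관 | 2순위 | 5점 | 2순위 | 5점
6 | 재공급 | 강동구 | 해담빌 | 구천면로40길 7, 천호동 237-27 | 남자 | 2순위 | 5점 | 2순위 | 2점
7 | 신규공급 | 강동구 | 신우휴먼빌암사2차 [27B] | 암사길 83-13, 암사동 433-85,88 | 성별무관 | 2순위 | 4점 | 3순위 | 6점
8 | 신규공급 | 강동구 | 신우휴먼빌암사2차 [35A] | 암사길 83-13, 암사동 433-85,88 | 성별무관 | 2순위 | 5점 | 2순위 | 2점
9 | 신규공급 | 강동구 | 신우휴먼빌암사2차 [45A] | 암사길 83-13, 암사동 433-85,88 | 성별무관 | 1순위 | 0점 | 2순위 | 5점

Last rows

Row | 구분 | 자치구 | 주택명 | 주소지 | 성별 | 당첨자 커트라인 순위 | 당첨자 커트라인 가점총점 | 예비자 커트라인 순위 | 예비자 커트라인 가점총점
73 | 재공급 | 영등포구 | 양평에스하임2 가동 | 양평로28마길 24, 양평동6가 65 | 성별무관 | 1순위 | 5점 | 2순위 | 8점
74 | 재공급 | 영등포구 | 양평에스하임2 나동 | 양평로28마길 24, 양평동6가 65 | 성별무관 | 1순위 | 7점 | 1순위 | 5점
75 | 신규공급 | 은평구 | 구산유앤아이하우스3 [20B] | 연서로11길 29-10, 구산동 24-13 | 성별무관 | 2순위 | 5점 | 2순위 | 5점
76 | 신규공급 | 은평구 | 구산유앤아이하우스3 [28B] | 연서로11길 29-10, 구산동 24-13 | 성별무관 | 2순위 | 7점 | 2순위 | 5점
77 | 재공급 | 종로구 | 다원빌 | 지봉로12길 11, 숭인동 56-37 | 여자 | 1순위 | 7점 | 1순위 | 5점
78 | 신규공급 | 중랑구 | 도원휴먼빌 [18B] | 동일로123길 76, 중화동 315-31 | 성별무관 | 2순위 | 7점 | 2순위 | 3점
79 | 신규공급 | 중랑구 | 도원휴먼빌 [24B] | 동일로123길 76, 중화동 315-31 | 성별무관 | 1순위 | 2점 | 2순위 | 5점
80 | 신규공급 | 중랑구 | 도원휴먼빌 [19B] | 동일로169가길 27, 묵동 235-32 | 성별무관 | 2순위 | 5점 | 2순위 | 5점
81 | 신규공급 | 중랑구 | 도원휴먼빌 [28B] | 동일로169가길 27, 묵동 235-32 | 성별무관 | 1순위 | 2점 | 2순위 | 5점
82 | 신규공급 | 중랑구 | 도원휴먼빌 [37A] | 동일로169가길 27, 묵동 235-32 | 성별무관 | 1순위 | 6점 | 1순위 | 2점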