Dataset statistics
Number of variables | 7 |
---|---|
Number of observations | 10000 |
Missing cells | 9378 |
Missing cells (%) | 13.4% |
Duplicate rows | 0 |
Duplicate rows (%) | 0.0% |
Total size in memory | 654.3 KiB |
Average record size in memory | 67.0 B |
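The derived figures in this table follow from the raw counts. A minimal Python sketch of the arithmetic, assuming 1 KiB = 1024 bytes; the variable names are illustrative and not part of the report:

```python
# Raw figures copied from the "Dataset statistics" table.
n_variables = 7
n_observations = 10_000
missing_cells = 9_378
total_size_kib = 654.3

# Share of missing cells over all cells (variables x observations).
total_cells = n_variables * n_observations
missing_pct = round(100 * missing_cells / total_cells, 1)

# Average bytes per record, assuming 1 KiB = 1024 bytes.
avg_record_bytes = round(total_size_kib * 1024 / n_observations, 1)

print(missing_pct, avg_record_bytes)
```

Both results agree with the reported 13.4% and 67.0 B.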
Variable types
Text | 3 |
---|---|
Categorical | 4 |
Dataset
Description | File download |
---|---|
Author | Seoul Metropolitan Government (서울특별시) |
URL | https://data.seoul.go.kr/dataList/OA-15658/S/1/datasetView.do |
작업_일자 has constant value "20111227" | Constant |
지역지구구역_구분_코드 is highly overall correlated with 지역지구구역_코드 | High correlation |
지역지구구역_코드 is highly overall correlated with 지역지구구역_구분_코드 and 1 other fields | High correlation |
대표_여부 is highly overall correlated with 지역지구구역_코드 | High correlation |
지역지구구역_코드 is highly imbalanced (87.8%) | Imbalance |
대표_여부 is highly imbalanced (98.0%) | Imbalance |
기타_지역지구구역 has 9378 (93.8%) missing values | Missing |
관리_지역지구구역 has unique values | Unique |
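The imbalance percentage for 대표_여부 can be reproduced from its value counts (9968, 31, 1) as one minus the ratio of Shannon entropy to the maximum possible entropy. That this is the formula the report uses is an assumption inferred from the matching result. The function name below is illustrative:

```python
from math import log

def imbalance_score(counts):
    """Return 1 - (Shannon entropy / maximum entropy) for a list of class counts."""
    total = sum(counts)
    entropy = -sum((c / total) * log(c / total) for c in counts if c > 0)
    return 1 - entropy / log(len(counts))

# Value counts of 대표_여부 as listed in its "Common Values" table.
score = imbalance_score([9968, 31, 1])
print(f"{score:.1%}")
```

A score near 100% means almost all rows share one value; a perfectly balanced variable scores 0%.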
Reproduction
Analysis started | 2024-05-17 21:49:00.304391 |
---|---|
Analysis finished | 2024-05-17 21:49:02.032788 |
Duration | 1.73 seconds |
Software version | ydata-profiling v4.5.1 |
Download configuration | config.json |
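The reported duration is the difference between the two timestamps above. A small sketch using only the Python standard library:

```python
from datetime import datetime

# Timestamps copied from the "Reproduction" table.
started = datetime.fromisoformat("2024-05-17 21:49:00.304391")
finished = datetime.fromisoformat("2024-05-17 21:49:02.032788")

duration_s = (finished - started).total_seconds()
print(f"{duration_s:.2f} seconds")
```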
관리_지역지구구역
Text
UNIQUE
Distinct | 10000 |
---|---|
Distinct (%) | 100.0% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 156.2 KiB |
Length
Max length | 15 |
---|---|
Median length | 11 |
Mean length | 10.5179 |
Min length | 7 |
Characters and Unicode
Total characters | 105179 |
---|---|
Distinct characters | 11 |
Distinct categories | 2 |
Distinct scripts | 1 |
Distinct blocks | 1 |
Unique
Unique | 10000 |
---|---|
Unique (%) | 100.0% |
Sample
1st row | 11290-21252 |
---|---|
2nd row | 11290-8383 |
3rd row | 11290-31656 |
4th row | 11380-471 |
5th row | 11290-22249 |
Value | Count | Frequency (%) |
11290-21252 | 1 | < 0.1% |
11290-12520 | 1 | < 0.1% |
11290-29825 | 1 | < 0.1% |
11410-2545 | 1 | < 0.1% |
11305-3533 | 1 | < 0.1% |
11290-17940 | 1 | < 0.1% |
11290-25991 | 1 | < 0.1% |
11350-3762 | 1 | < 0.1% |
11290-32011 | 1 | < 0.1% |
11290-24158 | 1 | < 0.1% |
Other values (9990) | 9990 | 99.9% |
Most occurring characters
Value | Count | Frequency (%) |
1 | 27951 | 26.6% |
0 | 14148 | 13.5% |
2 | 13448 | 12.8% |
- | 10000 | 9.5% |
9 | 8773 | 8.3% |
3 | 8646 | 8.2% |
5 | 5570 | 5.3% |
4 | 4643 | 4.4% |
6 | 4584 | 4.4% |
8 | 3808 | 3.6% |
Most occurring categories
Value | Count | Frequency (%) |
Decimal Number | 95179 | 90.5% |
Dash Punctuation | 10000 | 9.5% |
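The category names in this table correspond to Unicode general categories: "Decimal Number" is `Nd` and "Dash Punctuation" is `Pd`. A short sketch that classifies the five sample values listed above with the standard `unicodedata` module (only those five rows are used here, not the full column):

```python
import unicodedata
from collections import Counter

# The five sample rows shown for 관리_지역지구구역.
sample = ["11290-21252", "11290-8383", "11290-31656", "11380-471", "11290-22249"]

# Count characters by Unicode general category (Nd = decimal digit, Pd = dash).
category_counts = Counter(
    unicodedata.category(ch) for value in sample for ch in value
)
print(dict(category_counts))
```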
Most frequent character per category
Decimal Number
Value | Count | Frequency (%) |
1 | 27951 | 29.4% |
0 | 14148 | 14.9% |
2 | 13448 | 14.1% |
9 | 8773 | 9.2% |
3 | 8646 | 9.1% |
5 | 5570 | 5.9% |
4 | 4643 | 4.9% |
6 | 4584 | 4.8% |
8 | 3808 | 4.0% |
7 | 3608 | 3.8% |
Dash Punctuation
Value | Count | Frequency (%) |
- | 10000 | 100.0% |
Most occurring scripts
Value | Count | Frequency (%) |
Common | 105179 | 100.0% |
Most frequent character per script
Common
Value | Count | Frequency (%) |
1 | 27951 | 26.6% |
0 | 14148 | 13.5% |
2 | 13448 | 12.8% |
- | 10000 | 9.5% |
9 | 8773 | 8.3% |
3 | 8646 | 8.2% |
5 | 5570 | 5.3% |
4 | 4643 | 4.4% |
6 | 4584 | 4.4% |
8 | 3808 | 3.6% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 105179 | 100.0% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
1 | 27951 | 26.6% |
0 | 14148 | 13.5% |
2 | 13448 | 12.8% |
- | 10000 | 9.5% |
9 | 8773 | 8.3% |
3 | 8646 | 8.2% |
5 | 5570 | 5.3% |
4 | 4643 | 4.4% |
6 | 4584 | 4.4% |
8 | 3808 | 3.6% |
관리_폐쇄말소대장
Text
Distinct | 8514 |
---|---|
Distinct (%) | 85.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 156.2 KiB |
Length
Max length | 15 |
---|---|
Median length | 10 |
Mean length | 10.2617 |
Min length | 7 |
Characters and Unicode
Total characters | 102617 |
---|---|
Distinct characters | 11 |
Distinct categories | 2 |
Distinct scripts | 1 |
Distinct blocks | 1 |
Unique
Unique | 7098 |
---|---|
Unique (%) | 71.0% |
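"Distinct" and "Unique" measure different things: distinct counts how many different values occur, while unique counts the values that occur exactly once. A minimal sketch on hypothetical keys (the list below is made up for illustration), followed by the arithmetic for the figures reported here:

```python
from collections import Counter

# Hypothetical keys, not taken from the dataset.
values = ["A-1", "A-1", "A-2", "A-3", "A-3", "A-3", "A-4"]
counts = Counter(values)

distinct = len(counts)                               # different values present
unique = sum(1 for c in counts.values() if c == 1)   # values seen exactly once
print(distinct, unique)

# Applied to the reported figures for 관리_폐쇄말소대장:
repeated_values = 8514 - 7098     # distinct values that occur more than once
rows_with_repeats = 10000 - 7098  # rows holding one of those repeated values
print(repeated_values, rows_with_repeats)
```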
Sample
1st row | 11290-10094 |
---|---|
2nd row | 11290-4357 |
3rd row | 11290-14169 |
4th row | 11380-4819 |
5th row | 11290-10426 |
Value | Count | Frequency (%) |
11290-10751 | 3 | < 0.1% |
11290-14469 | 3 | < 0.1% |
11290-9477 | 3 | < 0.1% |
11230-2499 | 3 | < 0.1% |
11290-6602 | 3 | < 0.1% |
11230-3705 | 3 | < 0.1% |
11290-7204 | 3 | < 0.1% |
11320-1436 | 3 | < 0.1% |
11290-12113 | 3 | < 0.1% |
11410-957 | 3 | < 0.1% |
Other values (8504) | 9970 | 99.7% |
Most occurring characters
Value | Count | Frequency (%) |
1 | 28429 | 27.7% |
0 | 13811 | 13.5% |
2 | 12067 | 11.8% |
- | 10000 | 9.7% |
9 | 8406 | 8.2% |
3 | 7144 | 7.0% |
5 | 5739 | 5.6% |
4 | 4955 | 4.8% |
6 | 4716 | 4.6% |
8 | 3723 | 3.6% |
Most occurring categories
Value | Count | Frequency (%) |
Decimal Number | 92617 | 90.3% |
Dash Punctuation | 10000 | 9.7% |
Most frequent character per category
Decimal Number
Value | Count | Frequency (%) |
1 | 28429 | 30.7% |
0 | 13811 | 14.9% |
2 | 12067 | 13.0% |
9 | 8406 | 9.1% |
3 | 7144 | 7.7% |
5 | 5739 | 6.2% |
4 | 4955 | 5.3% |
6 | 4716 | 5.1% |
8 | 3723 | 4.0% |
7 | 3627 | 3.9% |
Dash Punctuation
Value | Count | Frequency (%) |
- | 10000 | 100.0% |
Most occurring scripts
Value | Count | Frequency (%) |
Common | 102617 | 100.0% |
Most frequent character per script
Common
Value | Count | Frequency (%) |
1 | 28429 | 27.7% |
0 | 13811 | 13.5% |
2 | 12067 | 11.8% |
- | 10000 | 9.7% |
9 | 8406 | 8.2% |
3 | 7144 | 7.0% |
5 | 5739 | 5.6% |
4 | 4955 | 4.8% |
6 | 4716 | 4.6% |
8 | 3723 | 3.6% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 102617 | 100.0% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
1 | 28429 | 27.7% |
0 | 13811 | 13.5% |
2 | 12067 | 11.8% |
- | 10000 | 9.7% |
9 | 8406 | 8.2% |
3 | 7144 | 7.0% |
5 | 5739 | 5.6% |
4 | 4955 | 4.8% |
6 | 4716 | 4.6% |
8 | 3723 | 3.6% |
지역지구구역_구분_코드
Categorical
HIGH CORRELATION
Distinct | 3 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 156.2 KiB |
1 | 3510 |
---|---|
2 | 3332 |
3 | 3158 |
Length
Max length | 1 |
---|---|
Median length | 1 |
Mean length | 1 |
Min length | 1 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | 3 |
---|---|
2nd row | 3 |
3rd row | 2 |
4th row | 2 |
5th row | 3 |
Common Values
Value | Count | Frequency (%) |
1 | 3510 | 35.1% |
2 | 3332 | 33.3% |
3 | 3158 | 31.6% |
Common Values (Plot)
Value | Count | Frequency (%) |
1 | 3510 | 35.1% |
2 | 3332 | 33.3% |
3 | 3158 | 31.6% |
지역지구구역_코드
Categorical
HIGH CORRELATION, IMBALANCE
Distinct | 34 |
---|---|
Distinct (%) | 0.3% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 156.2 KiB |
<NA> | 9246 |
---|---|
1020 | 298 |
260 | 119 |
070 | 84 |
1022 | 61 |
Other values (29) | 192 |
Length
Max length | 4 |
---|---|
Median length | 4 |
Mean length | 3.9711 |
Min length | 2 |
Unique
Unique | 9 |
---|---|
Unique (%) | 0.1% |
Sample
1st row | <NA> |
---|---|
2nd row | <NA> |
3rd row | <NA> |
4th row | 260 |
5th row | <NA> |
Common Values
Value | Count | Frequency (%) |
<NA> | 9246 | 92.5% |
1020 | 298 | 3.0% |
260 | 119 | 1.2% |
070 | 84 | 0.8% |
1022 | 61 | 0.6% |
1330 | 50 | 0.5% |
103 | 22 | 0.2% |
1023 | 18 | 0.2% |
1030 | 14 | 0.1% |
1021 | 12 | 0.1% |
Other values (24) | 76 | 0.8% |
Common Values (Plot)
Value | Count | Frequency (%) |
<NA> | 9246 | 92.5% |
1020 | 298 | 3.0% |
260 | 119 | 1.2% |
070 | 84 | 0.8% |
1022 | 61 | 0.6% |
1330 | 50 | 0.5% |
103 | 22 | 0.2% |
1023 | 18 | 0.2% |
1030 | 14 | 0.1% |
1021 | 12 | 0.1% |
Other values (24) | 76 | 0.8% |
대표_여부
Categorical
HIGH CORRELATION, IMBALANCE
Distinct | 3 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 156.2 KiB |
1 | 9968 |
---|---|
0 | 31 |
<NA> | 1 |
Length
Max length | 4 |
---|---|
Median length | 1 |
Mean length | 1.0003 |
Min length | 1 |
Unique
Unique | 1 |
---|---|
Unique (%) | < 0.1% |
Sample
1st row | 1 |
---|---|
2nd row | 1 |
3rd row | 1 |
4th row | 1 |
5th row | 1 |
Common Values
Value | Count | Frequency (%) |
1 | 9968 | 99.7% |
0 | 31 | 0.3% |
<NA> | 1 | < 0.1% |
Common Values (Plot)
Value | Count | Frequency (%) |
1 | 9968 | 99.7% |
0 | 31 | 0.3% |
<NA> | 1 | < 0.1% |
기타_지역지구구역
Text
MISSING
Distinct | 55 |
---|---|
Distinct (%) | 8.8% |
Missing | 9378 |
Missing (%) | 93.8% |
Memory size | 156.2 KiB |
Value | Count | Frequency (%) |
일반주거지역 | 258 | 41.1% |
주차장정비지구 | 101 | 16.1% |
자연녹지지역 | 50 | 8.0% |
일반주거 | 44 | 7.0% |
개발제한구역 | 32 | 5.1% |
주차장정비 | 18 | 2.9% |
제2종일반주거지역 | 13 | 2.1% |
2종일반주거지역 | 9 | 1.4% |
도시지역 | 9 | 1.4% |
준주거지역 | 9 | 1.4% |
Other values (46) | 85 | 13.5% |
Most occurring characters
Value | Count | Frequency (%) |
지 | 564 | 14.6% |
주 | 475 | 12.3% |
역 | 414 | 10.7% |
일 | 359 | 9.3% |
거 | 353 | 9.1% |
반 | 347 | 9.0% |
구 | 171 | 4.4% |
비 | 122 | 3.1% |
정 | 122 | 3.1% |
차 | 122 | 3.1% |
Other values (53) | 826 | 21.3% |
Most occurring categories
Value | Count | Frequency (%) |
Other Letter | 3799 | 98.0% |
Decimal Number | 52 | 1.3% |
Other Punctuation | 16 | 0.4% |
Space Separator | 6 | 0.2% |
Close Punctuation | 1 | < 0.1% |
Open Punctuation | 1 | < 0.1% |
Most frequent character per category
Other Letter
Value | Count | Frequency (%) |
지 | 564 | 14.8% |
주 | 475 | 12.5% |
역 | 414 | 10.9% |
일 | 359 | 9.4% |
거 | 353 | 9.3% |
반 | 347 | 9.1% |
구 | 171 | 4.5% |
비 | 122 | 3.2% |
정 | 122 | 3.2% |
차 | 122 | 3.2% |
Other values (45) | 750 | 19.7% |
Decimal Number
Value | Count | Frequency (%) |
2 | 26 | 50.0% |
4 | 14 | 26.9% |
3 | 6 | 11.5% |
1 | 6 | 11.5% |
Other Punctuation
Value | Count | Frequency (%) |
, | 16 | 100.0% |
Space Separator
Value | Count | Frequency (%) |
(space) | 6 | 100.0% |
Close Punctuation
Value | Count | Frequency (%) |
) | 1 | 100.0% |
Open Punctuation
Value | Count | Frequency (%) |
( | 1 | 100.0% |
Most occurring scripts
Value | Count | Frequency (%) |
Hangul | 3799 | 98.0% |
Common | 76 | 2.0% |
Most frequent character per script
Hangul
Value | Count | Frequency (%) |
지 | 564 | 14.8% |
주 | 475 | 12.5% |
역 | 414 | 10.9% |
일 | 359 | 9.4% |
거 | 353 | 9.3% |
반 | 347 | 9.1% |
구 | 171 | 4.5% |
비 | 122 | 3.2% |
정 | 122 | 3.2% |
차 | 122 | 3.2% |
Other values (45) | 750 | 19.7% |
Common
Value | Count | Frequency (%) |
2 | 26 | 34.2% |
, | 16 | 21.1% |
4 | 14 | 18.4% |
(space) | 6 | 7.9% |
3 | 6 | 7.9% |
1 | 6 | 7.9% |
) | 1 | 1.3% |
( | 1 | 1.3% |
Most occurring blocks
Value | Count | Frequency (%) |
Hangul | 3799 | 98.0% |
ASCII | 76 | 2.0% |
Most frequent character per block
Hangul
Value | Count | Frequency (%) |
지 | 564 | 14.8% |
주 | 475 | 12.5% |
역 | 414 | 10.9% |
일 | 359 | 9.4% |
거 | 353 | 9.3% |
반 | 347 | 9.1% |
구 | 171 | 4.5% |
비 | 122 | 3.2% |
정 | 122 | 3.2% |
차 | 122 | 3.2% |
Other values (45) | 750 | 19.7% |
ASCII
Value | Count | Frequency (%) |
2 | 26 | 34.2% |
, | 16 | 21.1% |
4 | 14 | 18.4% |
(space) | 6 | 7.9% |
3 | 6 | 7.9% |
1 | 6 | 7.9% |
) | 1 | 1.3% |
( | 1 | 1.3% |
작업_일자
Categorical
CONSTANT
Distinct | 1 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 156.2 KiB |
20111227 | 10000 |
---|---|
Length
Max length | 8 |
---|---|
Median length | 8 |
Mean length | 8 |
Min length | 8 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | 20111227 |
---|---|
2nd row | 20111227 |
3rd row | 20111227 |
4th row | 20111227 |
5th row | 20111227 |
Common Values
Value | Count | Frequency (%) |
20111227 | 10000 | 100.0% |
Common Values (Plot)
Value | Count | Frequency (%) |
20111227 | 10000 | 100.0% |
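The constant value 20111227 has the shape of a compact calendar date. Assuming it encodes year, month and day as YYYYMMDD (the report itself only treats it as a categorical string), it can be parsed like this:

```python
from datetime import datetime

# Assumption: 작업_일자 is a date written as YYYYMMDD.
work_date = datetime.strptime("20111227", "%Y%m%d").date()
print(work_date.isoformat())
```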
Correlations
Variable | 지역지구구역_구분_코드 | 지역지구구역_코드 | 대표_여부 | 기타_지역지구구역 |
---|---|---|---|---|
지역지구구역_구분_코드 | 1.000 | 0.988 | 0.005 | 1.000 |
지역지구구역_코드 | 0.988 | 1.000 | 0.738 | 0.998 |
대표_여부 | 0.005 | 0.738 | 1.000 | 0.780 |
기타_지역지구구역 | 1.000 | 0.998 | 0.780 | 1.000 |
Variable | 지역지구구역_코드 | 지역지구구역_구분_코드 | 대표_여부 |
---|---|---|---|
지역지구구역_코드 | 1.000 | 0.867 | 0.629 |
지역지구구역_구분_코드 | 0.867 | 1.000 | 0.009 |
대표_여부 | 0.629 | 0.009 | 1.000 |
Variable | 지역지구구역_구분_코드 | 지역지구구역_코드 | 대표_여부 |
---|---|---|---|
지역지구구역_구분_코드 | 1.000 | 0.867 | 0.009 |
지역지구구역_코드 | 0.867 | 1.000 | 0.629 |
대표_여부 | 0.009 | 0.629 | 1.000 |
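The report does not say which coefficient produced these matrices. One common measure of association between two categorical variables is Cramér's V, and the sketch below shows the plain, uncorrected form on hypothetical contingency tables. Treat it as an illustration of the idea, not as the computation behind the numbers above, which may use a different or bias-corrected statistic:

```python
from math import sqrt

def cramers_v(table):
    """Uncorrected Cramér's V for a contingency table given as a list of rows.

    Assumes every row and column total is greater than zero.
    """
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]

    # Pearson chi-squared statistic: sum of (observed - expected)^2 / expected.
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected

    k = min(len(row_totals), len(col_totals)) - 1
    return sqrt(chi2 / (n * k))

# Hypothetical 2x2 tables, not taken from the dataset.
perfect = cramers_v([[10, 0], [0, 10]])      # each row maps to one column
independent = cramers_v([[5, 5], [5, 5]])    # no association at all
print(perfect, independent)
```

Values run from 0 (no association) to 1 (one variable fully determines the other).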
First rows
Index | 관리_지역지구구역 | 관리_폐쇄말소대장 | 지역지구구역_구분_코드 | 지역지구구역_코드 | 대표_여부 | 기타_지역지구구역 | 작업_일자 |
---|---|---|---|---|---|---|---|
33237 | 11290-21252 | 11290-10094 | 3 | <NA> | 1 | <NA> | 20111227 |
21520 | 11290-8383 | 11290-4357 | 3 | <NA> | 1 | <NA> | 20111227 |
29404 | 11290-31656 | 11290-14169 | 2 | <NA> | 1 | <NA> | 20111227 |
58948 | 11380-471 | 11380-4819 | 2 | 260 | 1 | 주차장정비지구 | 20111227 |
23883 | 11290-22249 | 11290-10426 | 3 | <NA> | 1 | <NA> | 20111227 |
48496 | 11305-5656 | 11305-3170 | 3 | <NA> | 1 | <NA> | 20111227 |
29163 | 11290-40562 | 11290-17532 | 2 | <NA> | 1 | <NA> | 20111227 |
162 | 11290-26171 | 11290-12074 | 1 | <NA> | 1 | <NA> | 20111227 |
13352 | 11290-24160 | 11290-11095 | 3 | <NA> | 1 | <NA> | 20111227 |
39788 | 11320-4773 | 11320-1911 | 1 | <NA> | 1 | <NA> | 20111227 |
Last rows
Index | 관리_지역지구구역 | 관리_폐쇄말소대장 | 지역지구구역_구분_코드 | 지역지구구역_코드 | 대표_여부 | 기타_지역지구구역 | 작업_일자 |
---|---|---|---|---|---|---|---|
33305 | 11170-3131 | 11170-3579 | 2 | <NA> | 1 | <NA> | 20111227 |
39438 | 11260-490 | 11260-496 | 2 | <NA> | 1 | <NA> | 20111227 |
891 | 11230-2598 | 11230-1712 | 1 | <NA> | 1 | <NA> | 20111227 |
61390 | 11380-1210 | 11380-6704 | 1 | 1020 | 1 | 일반주거지역 | 20111227 |
53257 | 11305-9078 | 11305-4691 | 2 | <NA> | 1 | <NA> | 20111227 |
4292 | 11110-5588 | 11110-2509 | 1 | <NA> | 1 | <NA> | 20111227 |
23396 | 11290-8596 | 11290-4428 | 3 | <NA> | 1 | <NA> | 20111227 |
4142 | 11230-11052 | 11230-4871 | 1 | <NA> | 1 | <NA> | 20111227 |
8500 | 11215-5254 | 11215-2355 | 1 | <NA> | 1 | <NA> | 20111227 |
17205 | 11290-22595 | 11290-10542 | 1 | <NA> | 1 | <NA> | 20111227 |