Overview

Dataset statistics

Number of variables: 4
Number of observations: 183
Missing cells: 370
Missing cells (%): 50.5%
Duplicate rows: 11
Duplicate rows (%): 6.0%
Total size in memory: 6.2 KiB
Average record size in memory: 34.7 B
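
The percentages in this table follow from the raw counts. A minimal sketch of that arithmetic, assuming the total size of 6.2 KiB is itself a rounded figure (so the last result is only approximate):

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 4
n_observations = 183
missing_cells = 370
duplicate_rows = 11
total_size_kib = 6.2  # rounded value as reported

total_cells = n_variables * n_observations              # every cell in the table
missing_pct = 100 * missing_cells / total_cells         # share of empty cells
duplicate_pct = 100 * duplicate_rows / n_observations   # share of duplicate rows
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"duplicate rows: {duplicate_pct:.1f}%")
print(f"average record size: {avg_record_bytes:.1f} B")
```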

Variable types

Unsupported: 1
Numeric: 1
Categorical: 1
Text: 1

Dataset

Description: File download
Author: Seoul Metropolitan Government (서울특별시)
URL: https://data.seoul.go.kr/dataList/OA-2743/F/1/datasetView.do

Alerts

Dataset has 11 (6.0%) duplicate rows. [Duplicates]
2017년 서울시민 안전의식에 대한 여론조사 is highly overall correlated with Unnamed: 2. [High correlation]
Unnamed: 2 is highly overall correlated with 2017년 서울시민 안전의식에 대한 여론조사. [High correlation]
Unnamed: 0 has 183 (100.0%) missing values. [Missing]
2017년 서울시민 안전의식에 대한 여론조사 has 150 (82.0%) missing values. [Missing]
Unnamed: 3 has 37 (20.2%) missing values. [Missing]
Unnamed: 0 is an unsupported type; check whether it needs cleaning or further analysis. [Unsupported]

Reproduction

Analysis started: 2023-12-11 05:25:29.602751
Analysis finished: 2023-12-11 05:25:30.384556
Duration: 0.78 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

Unnamed: 0
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 183
Missing (%): 100.0%
Memory size: 1.7 KiB

2017년 서울시민 안전의식에 대한 여론조사 (2017 Public Opinion Survey on Seoul Citizens' Safety Awareness)
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 33
Distinct (%): 100.0%
Missing: 150
Missing (%): 82.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 17
Minimum: 1
Maximum: 33
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.7 KiB

Quantile statistics

Minimum: 1
5th percentile: 2.6
Q1: 9
Median: 17
Q3: 25
95th percentile: 31.4
Maximum: 33
Range: 32
Interquartile range (IQR): 16

Descriptive statistics

Standard deviation: 9.6695398
Coefficient of variation (CV): 0.56879646
Kurtosis: -1.2
Mean: 17
Median Absolute Deviation (MAD): 8
Skewness: 0
Sum: 561
Variance: 93.5
Monotonicity: Strictly increasing
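
The quantile and descriptive statistics above are consistent with the column holding the integers 1 to 33 once each (33 distinct values, minimum 1, maximum 33, sum 561, strictly increasing). A minimal sketch that recomputes them with the standard library under that assumption; the `inclusive` quantile method is chosen here because it interpolates linearly between data points, and is not confirmed to be what the report uses internally:

```python
import statistics

# Assumption: the 33 non-missing values are exactly 1, 2, ..., 33.
values = list(range(1, 34))

mean = statistics.mean(values)
median = statistics.median(values)
variance = statistics.variance(values)   # sample variance (divides by n - 1)
std = statistics.stdev(values)           # sample standard deviation
cv = std / mean                          # coefficient of variation

# Quartiles and the 5th / 95th percentiles via linear interpolation.
q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
twentieths = statistics.quantiles(values, n=20, method="inclusive")
p5, p95 = twentieths[0], twentieths[-1]

# Median absolute deviation: median distance from the median.
mad = statistics.median(abs(v - median) for v in values)

print(mean, median, variance, round(std, 7), round(cv, 8))
print(p5, q1, q3, p95, q3 - q1, mad, sum(values))
```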
Histogram with fixed size bins (bins=33)
Common values
Value | Count | Frequency (%)
26 | 1 | 0.5%
20 | 1 | 0.5%
21 | 1 | 0.5%
22 | 1 | 0.5%
23 | 1 | 0.5%
24 | 1 | 0.5%
25 | 1 | 0.5%
27 | 1 | 0.5%
2 | 1 | 0.5%
28 | 1 | 0.5%
Other values (23) | 23 | 12.6%
(Missing) | 150 | 82.0%
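Minimum 10 values
Value | Count | Frequency (%)
1 | 1 | 0.5%
2 | 1 | 0.5%
3 | 1 | 0.5%
4 | 1 | 0.5%
5 | 1 | 0.5%
6 | 1 | 0.5%
7 | 1 | 0.5%
8 | 1 | 0.5%
9 | 1 | 0.5%
10 | 1 | 0.5%
Maximum 10 values
Value | Count | Frequency (%)
33 | 1 | 0.5%
32 | 1 | 0.5%
31 | 1 | 0.5%
30 | 1 | 0.5%
29 | 1 | 0.5%
28 | 1 | 0.5%
27 | 1 | 0.5%
26 | 1 | 0.5%
25 | 1 | 0.5%
24 | 1 | 0.5%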

Unnamed: 2
Categorical

HIGH CORRELATION 

Distinct: 43
Distinct (%): 23.5%
Missing: 0
Missing (%): 0.0%
Memory size: 1.6 KiB
Category | Count
1 | 33
2 | 32
3 | 30
4 | 29
5 | 10
Other values (38) | 49

Length

Max length: 69
Median length: 1
Mean length: 8.6448087
Min length: 1

Unique

Unique: 34
Unique (%): 18.6%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: 귀하께서는 평소 일상생활에서 안전성과 편리성 중 어떤 요소를 더 우선에 두고 생활하십니까? [In your everyday life, which do you usually prioritize: safety or convenience?]

Common Values

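Value | Count | Frequency (%)
1 | 33 | 18.0%
2 | 32 | 17.5%
3 | 30 | 16.4%
4 | 29 | 15.8%
5 | 10 | 5.5%
99 | 5 | 2.7%
<NA> | 4 | 2.2%
6 | 4 | 2.2%
7 | 2 | 1.1%
평소 일상생활에서 시설물 안전사고(대형건축물, 공사장, 한강교량 등)로부터 얼마나 안전하다고 생각하십니까? [How safe do you think you usually are in everyday life from facility accidents (large buildings, construction sites, Han River bridges, etc.)?] | 1 | 0.5%
Other values (33) | 33 | 18.0%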

Length

Histogram of lengths of the category
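Value | Count | Frequency (%)
1 | 33 | 7.4%
2 | 32 | 7.2%
3 | 30 | 6.7%
4 | 29 | 6.5%
생각하십니까 [do you think] | 15 | 3.4%
평소 [usually] | 14 | 3.1%
귀하께서는 [you (honorific)] | 11 | 2.5%
일상생활에서 [in everyday life] | 11 | 2.5%
얼마나 [how much] | 11 | 2.5%
5 | 10 | 2.2%
Other values (174) | 249 | 56.0%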

Unnamed: 3
Text

MISSING 

Distinct: 105
Distinct (%): 71.9%
Missing: 37
Missing (%): 20.2%
Memory size: 1.6 KiB

Length

Max length: 45
Median length: 24
Mean length: 9.7123288
Min length: 2

Characters and Unicode

Total characters: 1418
Distinct characters: 206
Distinct categories: 8
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
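
As an illustration of those character properties, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below counts categories for one value taken from the report's sample rows; the two-letter codes correspond to the category names used in this report (`Lo` = Other Letter, `Zs` = Space Separator, `Po` = Other Punctuation, `Ps` / `Pe` = Open / Close Punctuation, `Nd` = Decimal Number, `Sm` = Math Symbol, `Cc` = Control). The helper name `category_counts` is made up for this example.

```python
import unicodedata
from collections import Counter


def category_counts(text: str) -> Counter:
    """Count the characters of `text` by Unicode general category."""
    return Counter(unicodedata.category(ch) for ch in text)


# One value of Unnamed: 3, copied from the sample rows of the report.
sample = "도심권 (종로구, 중구, 용산구)"
counts = category_counts(sample)

for code, n in counts.most_common():
    print(code, n)
```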

Unique

Unique: 94
Unique (%): 64.4%
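
"Distinct" counts the different values that occur, while "Unique" counts the values that occur exactly once. The percentages reported for this variable are consistent with dividing by the number of non-missing values (183 - 37 = 146) rather than by all 183 rows. A minimal sketch under that assumption; `distinct_and_unique` is a hypothetical helper and the mini-column is invented for illustration:

```python
from collections import Counter


def distinct_and_unique(values):
    """Return (distinct, unique, distinct %, unique %) over non-missing values."""
    present = [v for v in values if v is not None]
    counts = Counter(present)
    distinct = len(counts)                               # different values seen
    unique = sum(1 for c in counts.values() if c == 1)   # values seen exactly once
    n = len(present)
    return distinct, unique, 100 * distinct / n, 100 * unique / n


# Hypothetical mini-column; None stands for a missing cell.
column = ["매우 안전하다", "매우 안전하다", "기타", None, "학생"]
print(distinct_and_unique(column))

# Cross-check against the figures reported for Unnamed: 3.
non_missing = 183 - 37
print(round(100 * 105 / non_missing, 1), round(100 * 94 / non_missing, 1))
```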

Sample

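1st row: 안전성이 최우선이다 [Safety is the top priority]
2nd row: 안전성이 다소 우선이다 [Safety takes some priority]
3rd row: 안전성과 편리성이 비슷하다 [Safety and convenience are about equal]
4th row: 편리성이 다소 우선이다 [Convenience takes some priority]
5th row: 편리성이 최우선이다 [Convenience is the top priority]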
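Value | Count | Frequency (%)
매우 [very] | 26 | 6.6%
편이다 [tends to be] | 23 | 5.9%
대체로 [generally] | 20 | 5.1%
있다 [there is / have] | 9 | 2.3%
안전하다 [is safe] | 9 | 2.3%
위험하다 [is dangerous] | 9 | 2.3%
안전한 [safe] | 9 | 2.3%
위험한 [dangerous] | 9 | 2.3%
이상 [or more] | 5 | 1.3%
(value lost) | 5 | 1.3%
Other values (182) | 267 | 68.3%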

Most occurring characters

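Value | Count | Frequency (%)
(space) | 252 | 17.8%
(glyph lost) | 95 | 6.7%
(glyph lost) | 43 | 3.0%
(glyph lost) | 39 | 2.8%
(glyph lost) | 38 | 2.7%
(glyph lost) | 37 | 2.6%
(glyph lost) | 34 | 2.4%
(glyph lost) | 32 | 2.3%
(glyph lost) | 30 | 2.1%
(glyph lost) | 29 | 2.0%
Other values (196) | 789 | 55.6%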

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 1093 | 77.1%
Space Separator | 252 | 17.8%
Other Punctuation | 31 | 2.2%
Decimal Number | 20 | 1.4%
Close Punctuation | 8 | 0.6%
Open Punctuation | 8 | 0.6%
Math Symbol | 5 | 0.4%
Control | 1 | 0.1%

Most frequent character per category

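Other Letter
Value | Count | Frequency (%)
(glyph lost) | 95 | 8.7%
(glyph lost) | 43 | 3.9%
(glyph lost) | 39 | 3.6%
(glyph lost) | 38 | 3.5%
(glyph lost) | 37 | 3.4%
(glyph lost) | 34 | 3.1%
(glyph lost) | 32 | 2.9%
(glyph lost) | 30 | 2.7%
(glyph lost) | 29 | 2.7%
(glyph lost) | 27 | 2.5%
Other values (182) | 689 | 63.0%

Decimal Number
Value | Count | Frequency (%)
1 | 7 | 35.0%
3 | 3 | 15.0%
2 | 3 | 15.0%
6 | 3 | 15.0%
5 | 3 | 15.0%
4 | 1 | 5.0%

Other Punctuation
Value | Count | Frequency (%)
, | 21 | 67.7%
/ | 8 | 25.8%
· | 2 | 6.5%

Space Separator
Value | Count | Frequency (%)
(space) | 252 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 8 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 8 | 100.0%

Math Symbol
Value | Count | Frequency (%)
~ | 5 | 100.0%

Control
Value | Count | Frequency (%)
(control character) | 1 | 100.0%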

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 1093 | 77.1%
Common | 325 | 22.9%

Most frequent character per script

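Hangul
Value | Count | Frequency (%)
(glyph lost) | 95 | 8.7%
(glyph lost) | 43 | 3.9%
(glyph lost) | 39 | 3.6%
(glyph lost) | 38 | 3.5%
(glyph lost) | 37 | 3.4%
(glyph lost) | 34 | 3.1%
(glyph lost) | 32 | 2.9%
(glyph lost) | 30 | 2.7%
(glyph lost) | 29 | 2.7%
(glyph lost) | 27 | 2.5%
Other values (182) | 689 | 63.0%

Common
Value | Count | Frequency (%)
(space) | 252 | 77.5%
, | 21 | 6.5%
/ | 8 | 2.5%
) | 8 | 2.5%
( | 8 | 2.5%
1 | 7 | 2.2%
~ | 5 | 1.5%
3 | 3 | 0.9%
2 | 3 | 0.9%
6 | 3 | 0.9%
Other values (4) | 7 | 2.2%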

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 1093 | 77.1%
ASCII | 323 | 22.8%
None | 2 | 0.1%

Most frequent character per block

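ASCII
Value | Count | Frequency (%)
(space) | 252 | 78.0%
, | 21 | 6.5%
/ | 8 | 2.5%
) | 8 | 2.5%
( | 8 | 2.5%
1 | 7 | 2.2%
~ | 5 | 1.5%
3 | 3 | 0.9%
2 | 3 | 0.9%
6 | 3 | 0.9%
Other values (3) | 5 | 1.5%

Hangul
Value | Count | Frequency (%)
(glyph lost) | 95 | 8.7%
(glyph lost) | 43 | 3.9%
(glyph lost) | 39 | 3.6%
(glyph lost) | 38 | 3.5%
(glyph lost) | 37 | 3.4%
(glyph lost) | 34 | 3.1%
(glyph lost) | 32 | 2.9%
(glyph lost) | 30 | 2.7%
(glyph lost) | 29 | 2.7%
(glyph lost) | 27 | 2.5%
Other values (182) | 689 | 63.0%

None
Value | Count | Frequency (%)
· | 2 | 100.0%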

Interactions


Correlations

 | 2017년 서울시민 안전의식에 대한 여론조사 | Unnamed: 2
2017년 서울시민 안전의식에 대한 여론조사 | 1.000 | 1.000
Unnamed: 2 | 1.000 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
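
Nullity correlation can be read as an ordinary Pearson correlation computed on presence indicators (1 where a cell is filled, 0 where it is missing). The sketch below applies that idea to two columns from the first ten sample rows of this report; `nullity_correlation` is a hypothetical helper, and the choice to return `None` for a column that is always missing or always present is an assumption of this example, since such a column has no variance to correlate.

```python
from math import sqrt


def nullity_correlation(a, b):
    """Pearson correlation between the presence indicators of two columns."""
    xa = [0 if v is None else 1 for v in a]
    xb = [0 if v is None else 1 for v in b]
    n = len(xa)
    ma, mb = sum(xa) / n, sum(xb) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(xa, xb))
    va = sum((x - ma) ** 2 for x in xa)
    vb = sum((y - mb) ** 2 for y in xb)
    if va == 0 or vb == 0:
        return None  # constant presence pattern: correlation is undefined
    return cov / sqrt(va * vb)


# First ten sample rows; None stands for <NA>.
question_no = [None, None, None, None, 1, None, None, None, None, None]
answer_text = [None] * 5 + [
    "안전성이 최우선이다",
    "안전성이 다소 우선이다",
    "안전성과 편리성이 비슷하다",
    "편리성이 다소 우선이다",
    "편리성이 최우선이다",
]
all_missing = [None] * 10  # behaves like Unnamed: 0

print(nullity_correlation(question_no, answer_text))
print(nullity_correlation(all_missing, answer_text))
```

In these rows the question number is filled only where the answer text is empty, so the coefficient is negative.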

Sample

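First rows
Index | Unnamed: 0 | 2017년 서울시민 안전의식에 대한 여론조사 | Unnamed: 2 | Unnamed: 3
0 | <NA> | <NA> | <NA> | <NA>
1 | <NA> | <NA> | <NA> | <NA>
2 | <NA> | <NA> | <NA> | <NA>
3 | <NA> | <NA> | <NA> | <NA>
4 | <NA> | 1 | 귀하께서는 평소 일상생활에서 안전성과 편리성 중 어떤 요소를 더 우선에 두고 생활하십니까? [In your everyday life, which do you usually prioritize: safety or convenience?] | <NA>
5 | <NA> | <NA> | 1 | 안전성이 최우선이다 [Safety is the top priority]
6 | <NA> | <NA> | 2 | 안전성이 다소 우선이다 [Safety takes some priority]
7 | <NA> | <NA> | 3 | 안전성과 편리성이 비슷하다 [Safety and convenience are about equal]
8 | <NA> | <NA> | 4 | 편리성이 다소 우선이다 [Convenience takes some priority]
9 | <NA> | <NA> | 5 | 편리성이 최우선이다 [Convenience is the top priority]

Last rows
Index | Unnamed: 0 | 2017년 서울시민 안전의식에 대한 여론조사 | Unnamed: 2 | Unnamed: 3
173 | <NA> | <NA> | 6 | 학생 [Student]
174 | <NA> | <NA> | 7 | 무직/기타 [Unemployed/other]
175 | <NA> | <NA> | 8 | 응답하고 싶지 않음 [Prefer not to answer]
176 | <NA> | 33 | 귀하께서 거주하고 계신 지역은 다음 중 어디입니까? [Which of the following areas do you live in?] | <NA>
177 | <NA> | <NA> | 1 | 도심권 (종로구, 중구, 용산구) [Central area (Jongno-gu, Jung-gu, Yongsan-gu)]
178 | <NA> | <NA> | 2 | 서북권 (은평구, 서대문구, 마포구) [Northwest area (Eunpyeong-gu, Seodaemun-gu, Mapo-gu)]
179 | <NA> | <NA> | 3 | 동북권 (성동구, 광진구, 동대문구, 중랑구, 성북구, 강북구, 도봉구, 노원구) [Northeast area (Seongdong-gu, Gwangjin-gu, Dongdaemun-gu, Jungnang-gu, Seongbuk-gu, Gangbuk-gu, Dobong-gu, Nowon-gu)]
180 | <NA> | <NA> | 4 | 서남권 (양천구, 강서구, 구로구, 금천구, 영등포구, 동작구, 관악구) [Southwest area (Yangcheon-gu, Gangseo-gu, Guro-gu, Geumcheon-gu, Yeongdeungpo-gu, Dongjak-gu, Gwanak-gu)]
181 | <NA> | <NA> | 5 | 동남권 (서초구, 강남구, 송파구, 강동구) [Southeast area (Seocho-gu, Gangnam-gu, Songpa-gu, Gangdong-gu)]
182 | <NA> | <NA> | 6 | 서울외 (경기, 인천 등) [Outside Seoul (Gyeonggi, Incheon, etc.)]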

Duplicate rows

Most frequently occurring

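Index | 2017년 서울시민 안전의식에 대한 여론조사 | Unnamed: 2 | Unnamed: 3 | # duplicates
0 | <NA> | 1 | 매우 안전하다 [Very safe] | 9
6 | <NA> | 3 | 대체로 위험한 편이다 [Generally rather dangerous] | 9
7 | <NA> | 4 | 매우 위험하다 [Very dangerous] | 9
3 | <NA> | 2 | 대체로 안전한 편이다 [Generally rather safe] | 8
9 | <NA> | 99 | 기타 [Other] | 5
10 | <NA> | <NA> | <NA> | 4
1 | <NA> | 1 | 매우 잘 안다 [Know very well] | 2
2 | <NA> | 1 | 소화기가 있다 [Have a fire extinguisher] | 2
4 | <NA> | 2 | 대체로 알고 있는 편이다 [Generally know] | 2
5 | <NA> | 2 | 소화기가 없다 [Do not have a fire extinguisher] | 2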
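
A table like the one above can be produced by grouping identical rows and keeping the groups that occur more than once. The sketch below shows that grouping on a handful of invented rows built from values in the table; `duplicate_groups` is a hypothetical helper, and how the report itself arrives at its headline figure of 11 duplicate rows is not stated here.

```python
from collections import Counter


def duplicate_groups(rows):
    """Map each row that occurs more than once to its number of occurrences."""
    counts = Counter(rows)
    return {row: n for row, n in counts.items() if n > 1}


# Invented rows of (Unnamed: 2, Unnamed: 3), using values from the table.
rows = [
    ("1", "매우 안전하다"),
    ("2", "대체로 안전한 편이다"),
    ("1", "매우 안전하다"),
    ("99", "기타"),
    ("1", "매우 안전하다"),
    ("99", "기타"),
]

groups = duplicate_groups(rows)
# List the most frequently repeated rows first.
for row, n in sorted(groups.items(), key=lambda item: -item[1]):
    print(row, n)
```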