Overview

Dataset statistics

Number of variables: 6
Number of observations: 511
Missing cells: 478
Missing cells (%): 15.6%
Duplicate rows: 2
Duplicate rows (%): 0.4%
Total size in memory: 24.1 KiB
Average record size in memory: 48.3 B
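
The derived figures in this table follow from the raw counts. A minimal sketch of the arithmetic, assuming the percentages are taken over all cells (variables × observations) and over all rows, and that 1 KiB = 1024 B. The variable names are illustrative only; the per-column missing counts are the ones listed under Alerts.

```python
# Raw counts as reported in the overview.
n_variables = 6
n_observations = 511
duplicate_rows = 2
total_size_kib = 24.1

# Per-column missing counts, as listed in the Alerts section.
missing_by_column = {"위험발생위치": 311, "작업프로세스": 94, "사고원인": 73}
missing_cells = sum(missing_by_column.values())  # 478

# Share of missing cells over every cell in the table.
missing_pct = 100 * missing_cells / (n_variables * n_observations)

# Share of duplicate rows over all observations.
duplicate_pct = 100 * duplicate_rows / n_observations

# Average bytes per record, assuming 1 KiB = 1024 B.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"{missing_pct:.1f}% missing, {duplicate_pct:.1f}% duplicated, "
      f"{avg_record_bytes:.1f} B per record")
```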

Variable types

Text: 4
Categorical: 2

Dataset

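Description: Facility accident cases provided by the COSMIS system of the Korea Authority of Land and Infrastructure Safety (국토안전관리원), with the fields hazard object (위험발생객체), hazard location (위험발생위치), work process (작업프로세스), human damage (인적피해), property damage (물적피해), and accident cause (사고원인).
Author: 국토안전관리원 (Korea Authority of Land and Infrastructure Safety)
URL: https://www.data.go.kr/data/15053338/fileData.do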

Alerts

Dataset has 2 (0.4%) duplicate rows [Duplicates]
위험발생위치 has 311 (60.9%) missing values [Missing]
작업프로세스 has 94 (18.4%) missing values [Missing]
사고원인 has 73 (14.3%) missing values [Missing]

Reproduction

Analysis started: 2023-12-12 18:54:23.106458
Analysis finished: 2023-12-12 18:54:24.400944
Duration: 1.29 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
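
A report of this kind can be regenerated with ydata-profiling. The sketch below is an outline under stated assumptions, not the configuration actually used for this report: the function name `build_report`, the CSV path, the `cp949` encoding default, and the report title are all hypothetical, and the contents of `config.json` are not reproduced here.

```python
def build_report(csv_path, output_html, encoding="cp949"):
    """Profile a CSV file and write an HTML report.

    csv_path, output_html and the cp949 default are hypothetical; adjust them
    to the file actually downloaded from data.go.kr.
    """
    # Imported lazily so this module loads even where the packages are absent.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path, encoding=encoding)
    report = ProfileReport(df, title="Facility accident cases")
    report.to_file(output_html)
    return report
```

A call such as `build_report("accidents.csv", "report.html")` would then write the HTML file; both file names are placeholders.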

Variables

위험발생객체 (hazard object)
Text

Distinct: 146
Distinct (%): 28.6%
Missing: 0
Missing (%): 0.0%
Memory size: 4.1 KiB

Length

Max length: 18
Median length: 11
Mean length: 4.2876712
Min length: 2

Characters and Unicode

Total characters: 2191
Distinct characters: 196
Distinct categories: 3
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
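
As an illustration of those character properties, the sketch below tallies the Unicode general category of every character with Python's standard `unicodedata` module (`Lo` is Other Letter, `Lu` is Uppercase Letter, `Zs` is Space Separator). The standard library has no script lookup, so the first word of the character name is used as a rough stand-in for the script; that shortcut and the helper name `char_profile` are assumptions of this sketch, not the profiler's method.

```python
import unicodedata
from collections import Counter

def char_profile(values):
    """Count Unicode general categories and a rough script label per character."""
    categories = Counter()
    scripts = Counter()
    for text in values:
        for ch in text:
            categories[unicodedata.category(ch)] += 1
            # Rough stand-in for the script: first word of the character name,
            # e.g. "HANGUL SYLLABLE POM" -> "HANGUL". Not a real script lookup.
            scripts[unicodedata.name(ch, "UNKNOWN").split()[0]] += 1
    return categories, scripts

# The five sample rows of this variable shown in the report.
sample = ["ACS 폼", "ACS 폼", "PHC 말뚝", "PSC궤도빔", "PSC 빔"]
categories, scripts = char_profile(sample)
print(categories)
print(scripts)
```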

Unique

Unique: 92
Unique (%): 18.0%
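
In this report, Distinct counts the different values in a column and Unique counts the values that occur exactly once. The percentages appear to be taken over non-missing rows: for 위험발생위치, 64 distinct values over 511 - 311 = 200 non-missing rows gives 32.0%. A minimal sketch of that calculation, with a hypothetical helper name and `None` standing in for a missing value:

```python
from collections import Counter

def distinct_and_unique(values):
    """Return (distinct, unique, distinct %, unique %) over non-missing values."""
    present = [v for v in values if v is not None]
    counts = Counter(present)
    distinct = len(counts)                              # different values seen
    unique = sum(1 for c in counts.values() if c == 1)  # values seen exactly once
    n = len(present)  # assumes at least one non-missing value
    return distinct, unique, 100 * distinct / n, 100 * unique / n

column = ["고소", "고소", "상부", None, "호수주변", None]
print(distinct_and_unique(column))
```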

Sample

1st row: ACS 폼
2nd row: ACS 폼
3rd row: PHC 말뚝
4th row: PSC궤도빔
5th row: PSC 빔

Value | Count | Frequency (%)
굴착사면 | 64 | 11.1%
거푸집 | 44 | 7.6%
흙막이가시설 | 41 | 7.1%
슬래브 | 19 | 3.3%
이동식 | 19 | 3.3%
가설구조물 | 18 | 3.1%
터널 | 17 | 2.9%
거푸집동바리 | 15 | 2.6%
막장면 | 14 | 2.4%
타워크레인 | 12 | 2.1%
Other values (148) | 315 | 54.5%
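
The frequency table for this variable appears to count whitespace-separated tokens rather than whole cell values: its counts add up to 578 for 511 rows, and 64 / 578 gives the reported 11.1%. A minimal sketch of such a count, assuming plain whitespace splitting (the profiler's actual tokenisation may differ; `word_frequencies` is a hypothetical helper):

```python
from collections import Counter

def word_frequencies(values):
    """Count whitespace-separated tokens across all non-missing values."""
    counts = Counter()
    for text in values:
        if text is not None:
            counts.update(text.split())
    total = sum(counts.values())
    # (token, count, share in %) sorted by descending count
    return [(w, c, 100 * c / total) for w, c in counts.most_common()]

rows = ["ACS 폼", "ACS 폼", "PHC 말뚝", "PSC궤도빔", "PSC 빔"]
for word, count, pct in word_frequencies(rows):
    print(f"{word}\t{count}\t{pct:.1f}%")
```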

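Most occurring characters

Value | Count | Frequency (%)
[not captured] | 97 | 4.4%
[not captured] | 88 | 4.0%
[not captured] | 84 | 3.8%
[not captured] | 70 | 3.2%
[not captured] | 69 | 3.1%
[not captured] | 68 | 3.1%
[not captured] | 68 | 3.1%
[not captured] | 67 | 3.1%
[not captured] | 66 | 3.0%
[not captured] | 65 | 3.0%
Other values (186) | 1449 | 66.1%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 2106 | 96.1%
Space Separator | 67 | 3.1%
Uppercase Letter | 18 | 0.8%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
[not captured] | 97 | 4.6%
[not captured] | 88 | 4.2%
[not captured] | 84 | 4.0%
[not captured] | 70 | 3.3%
[not captured] | 69 | 3.3%
[not captured] | 68 | 3.2%
[not captured] | 68 | 3.2%
[not captured] | 66 | 3.1%
[not captured] | 65 | 3.1%
[not captured] | 59 | 2.8%
Other values (179) | 1372 | 65.1%

Uppercase Letter
Value | Count | Frequency (%)
C | 6 | 33.3%
S | 5 | 27.8%
P | 3 | 16.7%
A | 2 | 11.1%
R | 1 | 5.6%
H | 1 | 5.6%

Space Separator
Value | Count | Frequency (%)
(space) | 67 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 2106 | 96.1%
Common | 67 | 3.1%
Latin | 18 | 0.8%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
[not captured] | 97 | 4.6%
[not captured] | 88 | 4.2%
[not captured] | 84 | 4.0%
[not captured] | 70 | 3.3%
[not captured] | 69 | 3.3%
[not captured] | 68 | 3.2%
[not captured] | 68 | 3.2%
[not captured] | 66 | 3.1%
[not captured] | 65 | 3.1%
[not captured] | 59 | 2.8%
Other values (179) | 1372 | 65.1%

Latin
Value | Count | Frequency (%)
C | 6 | 33.3%
S | 5 | 27.8%
P | 3 | 16.7%
A | 2 | 11.1%
R | 1 | 5.6%
H | 1 | 5.6%

Common
Value | Count | Frequency (%)
(space) | 67 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 2106 | 96.1%
ASCII | 85 | 3.9%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
[not captured] | 97 | 4.6%
[not captured] | 88 | 4.2%
[not captured] | 84 | 4.0%
[not captured] | 70 | 3.3%
[not captured] | 69 | 3.3%
[not captured] | 68 | 3.2%
[not captured] | 68 | 3.2%
[not captured] | 66 | 3.1%
[not captured] | 65 | 3.1%
[not captured] | 59 | 2.8%
Other values (179) | 1372 | 65.1%

ASCII
Value | Count | Frequency (%)
(space) | 67 | 78.8%
C | 6 | 7.1%
S | 5 | 5.9%
P | 3 | 3.5%
A | 2 | 2.4%
R | 1 | 1.2%
H | 1 | 1.2%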

위험발생위치 (hazard location)
Text

MISSING

Distinct: 64
Distinct (%): 32.0%
Missing: 311
Missing (%): 60.9%
Memory size: 4.1 KiB

Length

Max length: 9
Median length: 2
Mean length: 2.93
Min length: 1

Characters and Unicode

Total characters: 586
Distinct characters: 119
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 46
Unique (%): 23.0%

Sample

1st row: 고소
2nd row: 고소
3rd row: 상부
4th row: 고소
5th row: 호수주변

Value | Count | Frequency (%)
고소 | 53 | 23.7%
벽체 | 30 | 13.4%
상부 | 19 | 8.5%
천단부 | 17 | 7.6%
하부 | 8 | 3.6%
대심도 | 6 | 2.7%
비계 | 5 | 2.2%
코너 | 4 | 1.8%
교대 | 3 | 1.3%
슬래브 | 3 | 1.3%
Other values (58) | 76 | 33.9%

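Most occurring characters

Value | Count | Frequency (%)
[not captured] | 58 | 9.9%
[not captured] | 53 | 9.0%
[not captured] | 53 | 9.0%
[not captured] | 33 | 5.6%
[not captured] | 30 | 5.1%
[not captured] | 24 | 4.1%
[not captured] | 20 | 3.4%
[not captured] | 20 | 3.4%
[not captured] | 19 | 3.2%
[not captured] | 14 | 2.4%
Other values (109) | 262 | 44.7%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 562 | 95.9%
Space Separator | 24 | 4.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
[not captured] | 58 | 10.3%
[not captured] | 53 | 9.4%
[not captured] | 53 | 9.4%
[not captured] | 33 | 5.9%
[not captured] | 30 | 5.3%
[not captured] | 20 | 3.6%
[not captured] | 20 | 3.6%
[not captured] | 19 | 3.4%
[not captured] | 14 | 2.5%
[not captured] | 11 | 2.0%
Other values (108) | 251 | 44.7%

Space Separator
Value | Count | Frequency (%)
(space) | 24 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 562 | 95.9%
Common | 24 | 4.1%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
[not captured] | 58 | 10.3%
[not captured] | 53 | 9.4%
[not captured] | 53 | 9.4%
[not captured] | 33 | 5.9%
[not captured] | 30 | 5.3%
[not captured] | 20 | 3.6%
[not captured] | 20 | 3.6%
[not captured] | 19 | 3.4%
[not captured] | 14 | 2.5%
[not captured] | 11 | 2.0%
Other values (108) | 251 | 44.7%

Common
Value | Count | Frequency (%)
(space) | 24 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 562 | 95.9%
ASCII | 24 | 4.1%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
[not captured] | 58 | 10.3%
[not captured] | 53 | 9.4%
[not captured] | 53 | 9.4%
[not captured] | 33 | 5.9%
[not captured] | 30 | 5.3%
[not captured] | 20 | 3.6%
[not captured] | 20 | 3.6%
[not captured] | 19 | 3.4%
[not captured] | 14 | 2.5%
[not captured] | 11 | 2.0%
Other values (108) | 251 | 44.7%

ASCII
Value | Count | Frequency (%)
(space) | 24 | 100.0%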

작업프로세스 (work process)
Text

MISSING

Distinct: 99
Distinct (%): 23.7%
Missing: 94
Missing (%): 18.4%
Memory size: 4.1 KiB

Length

Max length: 11
Median length: 2
Mean length: 2.6618705
Min length: 2

Characters and Unicode

Total characters: 1110
Distinct characters: 144
Distinct categories: 5
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 48
Unique (%): 11.5%

Sample

1st row: 인상
2nd row: 인상
3rd row: 인양
4th row: 설치
5th row: 거치

Value | Count | Frequency (%)
타설 | 50 | 10.9%
해체 | 45 | 9.8%
설치 | 37 | 8.1%
굴착 | 35 | 7.6%
인양 | 25 | 5.4%
정리 | 17 | 3.7%
이동 | 16 | 3.5%
[not captured] | 16 | 3.5%
운반 | 13 | 2.8%
철근조립 | 12 | 2.6%
Other values (94) | 193 | 42.0%

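Most occurring characters

Value | Count | Frequency (%)
[not captured] | 107 | 9.6%
[not captured] | 54 | 4.9%
[not captured] | 46 | 4.1%
[not captured] | 45 | 4.1%
[not captured] | 42 | 3.8%
[not captured] | 42 | 3.8%
[not captured] | 37 | 3.3%
[not captured] | 37 | 3.3%
[not captured] | 36 | 3.2%
[not captured] | 34 | 3.1%
Other values (134) | 630 | 56.8%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 1027 | 92.5%
Space Separator | 42 | 3.8%
Uppercase Letter | 37 | 3.3%
Lowercase Letter | 3 | 0.3%
Other Punctuation | 1 | 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
[not captured] | 107 | 10.4%
[not captured] | 54 | 5.3%
[not captured] | 46 | 4.5%
[not captured] | 45 | 4.4%
[not captured] | 42 | 4.1%
[not captured] | 37 | 3.6%
[not captured] | 37 | 3.6%
[not captured] | 36 | 3.5%
[not captured] | 34 | 3.3%
[not captured] | 29 | 2.8%
Other values (122) | 560 | 54.5%

Uppercase Letter
Value | Count | Frequency (%)
M | 11 | 29.7%
S | 10 | 27.0%
C | 6 | 16.2%
F | 6 | 16.2%
P | 2 | 5.4%
R | 1 | 2.7%
D | 1 | 2.7%

Lowercase Letter
Value | Count | Frequency (%)
i | 1 | 33.3%
l | 1 | 33.3%
e | 1 | 33.3%

Space Separator
Value | Count | Frequency (%)
(space) | 42 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
/ | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 1027 | 92.5%
Common | 43 | 3.9%
Latin | 40 | 3.6%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
[not captured] | 107 | 10.4%
[not captured] | 54 | 5.3%
[not captured] | 46 | 4.5%
[not captured] | 45 | 4.4%
[not captured] | 42 | 4.1%
[not captured] | 37 | 3.6%
[not captured] | 37 | 3.6%
[not captured] | 36 | 3.5%
[not captured] | 34 | 3.3%
[not captured] | 29 | 2.8%
Other values (122) | 560 | 54.5%

Latin
Value | Count | Frequency (%)
M | 11 | 27.5%
S | 10 | 25.0%
C | 6 | 15.0%
F | 6 | 15.0%
P | 2 | 5.0%
R | 1 | 2.5%
D | 1 | 2.5%
i | 1 | 2.5%
l | 1 | 2.5%
e | 1 | 2.5%

Common
Value | Count | Frequency (%)
(space) | 42 | 97.7%
/ | 1 | 2.3%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 1027 | 92.5%
ASCII | 83 | 7.5%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
[not captured] | 107 | 10.4%
[not captured] | 54 | 5.3%
[not captured] | 46 | 4.5%
[not captured] | 45 | 4.4%
[not captured] | 42 | 4.1%
[not captured] | 37 | 3.6%
[not captured] | 37 | 3.6%
[not captured] | 36 | 3.5%
[not captured] | 34 | 3.3%
[not captured] | 29 | 2.8%
Other values (122) | 560 | 54.5%

ASCII
Value | Count | Frequency (%)
(space) | 42 | 50.6%
M | 11 | 13.3%
S | 10 | 12.0%
C | 6 | 7.2%
F | 6 | 7.2%
P | 2 | 2.4%
R | 1 | 1.2%
/ | 1 | 1.2%
D | 1 | 1.2%
i | 1 | 1.2%
Other values (2) | 2 | 2.4%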

인적피해 (human damage)
Categorical

Distinct: 13
Distinct (%): 2.5%
Missing: 0
Missing (%): 0.0%
Memory size: 4.1 KiB

매몰: 153
추락: 137
<NA>: 83
충돌: 75
협착: 41
Other values (8): 22

Length

Max length: 8
Median length: 2
Mean length: 2.3483366
Min length: 2

Unique

Unique: 1
Unique (%): 0.2%

Sample

1st row: 추락
2nd row: 추락
3rd row: 충돌
4th row: 추락
5th row: 추락

Common Values

Value | Count | Frequency (%)
매몰 | 153 | 29.9%
추락 | 137 | 26.8%
<NA> | 83 | 16.2%
충돌 | 75 | 14.7%
협착 | 41 | 8.0%
전락 | 7 | 1.4%
익사 | 3 | 0.6%
폭발 | 3 | 0.6%
절단 | 2 | 0.4%
재해자 안구피해 | 2 | 0.4%
Other values (3) | 5 | 1.0%

Length

Histogram of lengths of the category

Value | Count | Frequency (%)
매몰 | 153 | 29.8%
추락 | 137 | 26.7%
na | 83 | 16.2%
충돌 | 75 | 14.6%
협착 | 41 | 8.0%
전락 | 7 | 1.4%
익사 | 3 | 0.6%
폭발 | 3 | 0.6%
절단 | 2 | 0.4%
재해자 | 2 | 0.4%
Other values (4) | 7 | 1.4%

물적피해 (property damage)
Categorical

Distinct: 16
Distinct (%): 3.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.1 KiB

붕괴: 221
낙하: 80
전도: 80
<NA>: 58
탈락: 24
Other values (11): 48

Length

Max length: 4
Median length: 2
Mean length: 2.2270059
Min length: 2

Unique

Unique: 2
Unique (%): 0.4%

Sample

1st row: 낙하
2nd row: 낙하
3rd row: 낙하
4th row: 전도
5th row: 낙하

Common Values

Value | Count | Frequency (%)
붕괴 | 221 | 43.2%
낙하 | 80 | 15.7%
전도 | 80 | 15.7%
<NA> | 58 | 11.4%
탈락 | 24 | 4.7%
낙석 | 12 | 2.3%
파손 | 11 | 2.2%
충돌 | 8 | 1.6%
균열 | 4 | 0.8%
파괴 | 3 | 0.6%
Other values (6) | 10 | 2.0%

Length

Histogram of lengths of the category

Value | Count | Frequency (%)
붕괴 | 221 | 43.2%
낙하 | 80 | 15.7%
전도 | 80 | 15.7%
na | 58 | 11.4%
탈락 | 24 | 4.7%
낙석 | 12 | 2.3%
파손 | 11 | 2.2%
충돌 | 8 | 1.6%
균열 | 4 | 0.8%
파괴 | 3 | 0.6%
Other values (6) | 10 | 2.0%

사고원인 (accident cause)
Text

MISSING

Distinct: 303
Distinct (%): 69.2%
Missing: 73
Missing (%): 14.3%
Memory size: 4.1 KiB

Length

Max length: 32
Median length: 25
Mean length: 9.5844749
Min length: 2

Characters and Unicode

Total characters: 4198
Distinct characters: 321
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 260
Unique (%): 59.4%

Sample

1st row: 탈락
2nd row: 근입심도부족(Anchor볼트)
3rd row: 와이어로프 이탈
4th row: 인력으로 안전작업대 설치
5th row: 인양로프 해체방법 미흡

Value | Count | Frequency (%)
미흡 | 64 | 5.6%
[not captured] | 52 | 4.6%
기울기 | 31 | 2.7%
굴착면 | 30 | 2.6%
강도 | 25 | 2.2%
미설치 | 24 | 2.1%
상부앵커 | 22 | 1.9%
탈락 | 16 | 1.4%
미준수 | 16 | 1.4%
설치 | 15 | 1.3%
Other values (490) | 846 | 74.1%

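Most occurring characters

Value | Count | Frequency (%)
[not captured] | 703 | 16.7%
[not captured] | 156 | 3.7%
[not captured] | 94 | 2.2%
[not captured] | 88 | 2.1%
[not captured] | 71 | 1.7%
[not captured] | 70 | 1.7%
[not captured] | 68 | 1.6%
[not captured] | 65 | 1.5%
[not captured] | 65 | 1.5%
[not captured] | 61 | 1.5%
Other values (311) | 2757 | 65.7%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 3444 | 82.0%
Space Separator | 703 | 16.7%
Other Punctuation | 21 | 0.5%
Close Punctuation | 9 | 0.2%
Open Punctuation | 9 | 0.2%
Lowercase Letter | 5 | 0.1%
Uppercase Letter | 4 | 0.1%
Decimal Number | 3 | 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
[not captured] | 156 | 4.5%
[not captured] | 94 | 2.7%
[not captured] | 88 | 2.6%
[not captured] | 71 | 2.1%
[not captured] | 70 | 2.0%
[not captured] | 68 | 2.0%
[not captured] | 65 | 1.9%
[not captured] | 65 | 1.9%
[not captured] | 61 | 1.8%
[not captured] | 61 | 1.8%
Other values (295) | 2645 | 76.8%

Lowercase Letter
Value | Count | Frequency (%)
r | 1 | 20.0%
o | 1 | 20.0%
h | 1 | 20.0%
c | 1 | 20.0%
n | 1 | 20.0%

Uppercase Letter
Value | Count | Frequency (%)
C | 1 | 25.0%
P | 1 | 25.0%
B | 1 | 25.0%
A | 1 | 25.0%

Other Punctuation
Value | Count | Frequency (%)
/ | 10 | 47.6%
, | 9 | 42.9%
. | 2 | 9.5%

Space Separator
Value | Count | Frequency (%)
(space) | 703 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 9 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 9 | 100.0%

Decimal Number
Value | Count | Frequency (%)
1 | 3 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 3444 | 82.0%
Common | 745 | 17.7%
Latin | 9 | 0.2%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
[not captured] | 156 | 4.5%
[not captured] | 94 | 2.7%
[not captured] | 88 | 2.6%
[not captured] | 71 | 2.1%
[not captured] | 70 | 2.0%
[not captured] | 68 | 2.0%
[not captured] | 65 | 1.9%
[not captured] | 65 | 1.9%
[not captured] | 61 | 1.8%
[not captured] | 61 | 1.8%
Other values (295) | 2645 | 76.8%

Latin
Value | Count | Frequency (%)
C | 1 | 11.1%
P | 1 | 11.1%
B | 1 | 11.1%
A | 1 | 11.1%
r | 1 | 11.1%
o | 1 | 11.1%
h | 1 | 11.1%
c | 1 | 11.1%
n | 1 | 11.1%

Common
Value | Count | Frequency (%)
(space) | 703 | 94.4%
/ | 10 | 1.3%
) | 9 | 1.2%
( | 9 | 1.2%
, | 9 | 1.2%
1 | 3 | 0.4%
. | 2 | 0.3%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 3444 | 82.0%
ASCII | 754 | 18.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 703 | 93.2%
/ | 10 | 1.3%
) | 9 | 1.2%
( | 9 | 1.2%
, | 9 | 1.2%
1 | 3 | 0.4%
. | 2 | 0.3%
C | 1 | 0.1%
P | 1 | 0.1%
B | 1 | 0.1%
Other values (6) | 6 | 0.8%

Hangul
Value | Count | Frequency (%)
[not captured] | 156 | 4.5%
[not captured] | 94 | 2.7%
[not captured] | 88 | 2.6%
[not captured] | 71 | 2.1%
[not captured] | 70 | 2.0%
[not captured] | 68 | 2.0%
[not captured] | 65 | 1.9%
[not captured] | 65 | 1.9%
[not captured] | 61 | 1.8%
[not captured] | 61 | 1.8%
Other values (295) | 2645 | 76.8%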

Correlations

Variable | 위험발생위치 | 작업프로세스 | 인적피해 | 물적피해
위험발생위치 | 1.000 | 0.931 | 0.975 | 0.928
작업프로세스 | 0.931 | 1.000 | 0.844 | 0.710
인적피해 | 0.975 | 0.844 | 1.000 | 0.711
물적피해 | 0.928 | 0.710 | 0.711 | 1.000

Variable | 물적피해 | 인적피해
물적피해 | 1.000 | 0.445
인적피해 | 0.445 | 1.000

Variable | 인적피해 | 물적피해
인적피해 | 1.000 | 0.445
물적피해 | 0.445 | 1.000
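
The report does not say which association measure produced each matrix. For two categorical columns such as 인적피해 and 물적피해, one common choice is Cramér's V, which rescales the chi-squared statistic of the contingency table to the range 0 to 1. A minimal standard-library sketch, offered as an illustration rather than as the profiler's implementation (no bias correction is applied, and `cramers_v` is a hypothetical helper):

```python
from collections import Counter
from math import sqrt

def cramers_v(xs, ys):
    """Cramér's V between two equally long sequences of category labels."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    row = Counter(xs)
    col = Counter(ys)
    # Chi-squared: sum over every cell of (observed - expected)^2 / expected.
    chi2 = 0.0
    for r, r_count in row.items():
        for c, c_count in col.items():
            expected = r_count * c_count / n
            chi2 += (joint.get((r, c), 0) - expected) ** 2 / expected
    k = min(len(row), len(col)) - 1
    # With a single category on either side the measure is undefined; return 0.
    return sqrt(chi2 / (n * k)) if k > 0 else 0.0

# Toy data: each injury type always pairs with the same damage type.
injury = ["추락", "추락", "매몰", "매몰"]
damage = ["낙하", "낙하", "붕괴", "붕괴"]
print(cramers_v(injury, damage))
```

Values near 1 mean that knowing one column almost determines the other; values near 0 mean the two columns look independent.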

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
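
Nullity correlation can be sketched as the Pearson correlation between the "is present" indicators of two columns: +1 when both columns are always filled or missing together, -1 when one is filled exactly where the other is missing. The code below assumes that definition and uses `None` for a missing value; `nullity_correlation` is a hypothetical helper, not a function of the profiler.

```python
from math import sqrt

def nullity_correlation(col_a, col_b):
    """Pearson correlation between the 'is present' indicators of two columns."""
    a = [0 if v is None else 1 for v in col_a]
    b = [0 if v is None else 1 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        # A column that is always (or never) present has no defined correlation.
        return None
    return cov / sqrt(var_a * var_b)

location = ["고소", None, "상부", None]
process = ["인상", None, "설치", None]
print(nullity_correlation(location, process))
```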

Sample

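Index | 위험발생객체 | 위험발생위치 | 작업프로세스 | 인적피해 | 물적피해 | 사고원인
0 | ACS 폼 | 고소 | 인상 | 추락 | 낙하 | 탈락
1 | ACS 폼 | 고소 | 인상 | 추락 | 낙하 | 근입심도부족(Anchor볼트)
2 | PHC 말뚝 | <NA> | 인양 | 충돌 | 낙하 | 와이어로프 이탈
3 | PSC궤도빔 | 상부 | 설치 | 추락 | 전도 | 인력으로 안전작업대 설치
4 | PSC 빔 | <NA> | 거치 | 추락 | 낙하 | 인양로프 해체방법 미흡
5 | RCS 발판 | 고소 | 인양 | 충돌 | <NA> | <NA>
6 | 가도지반 및 석축 | 호수주변 | 하역 | 익사 | 붕괴 | 운반시 지반상태 이상유무 확인미흡
7 | 가물막이 | <NA> | 해체 | 매몰 | 붕괴 | 보강재 미설치
8 | 가설철구조물 | <NA> | <NA> | 추락 | 붕괴 | 서포트 충돌
9 | 가시설 | <NA> | <NA> | 매몰 | 붕괴 | 시공계획서 및 시공상세도 미준수

Index | 위험발생객체 | 위험발생위치 | 작업프로세스 | 인적피해 | 물적피해 | 사고원인
501 | 흙막이가시설 | 코너 스트러트 | 굴착 | <NA> | 붕괴 | 지하수압 증가
502 | 흙막이가시설 | 코너 스트러트 | 굴착 | <NA> | 붕괴 | 지형조건과 설계해석조건 상이
503 | 거푸집 | <NA> | 해체 | 매몰 | 붕괴 | 버팀목 미설치
504 | 거푸집 | <NA> | 해체 | 협착 | 붕괴 | 해체작업 순서 미흡
505 | 거푸집 | <NA> | 해체 | 매몰 | 전도 | 우수유입
506 | 거푸집 | <NA> | 해체 | 충돌 | 탈락 | 거푸집 하단 미고정
507 | 건물 | <NA> | 보수 | 매몰 | 붕괴 | 작업하중
508 | 흙막이가시설 | 벽체 | 보수 및 버럭반출 | 매몰 | 붕괴 | 과잉굴착
509 | 흙막이가시설 | 벽체 | 보수 및 버럭반출 | 매몰 | 붕괴 | 흙막이가시설 설치시기
510 | 흙막이가시설 | 벽체 | 용접 | 매몰 | 붕괴 | 흙막이 지보공 조립 미흡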

Duplicate rows

Most frequently occurring

Index | 위험발생객체 | 위험발생위치 | 작업프로세스 | 인적피해 | 물적피해 | 사고원인 | # duplicates
0 | 거푸집동바리 | <NA> | 타설 | 매몰 | 균열 | <NA> | 4
1 | 성토사면 | 하단부 | 거적고정 | 전락 | <NA> | <NA> | 2
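
The overview reports 2 duplicate rows, which matches the number of distinct repeated rows listed here rather than the total number of repeated records (4 + 2). A minimal sketch of grouping repeated rows with their counts, using a hypothetical helper and a few illustrative rows (not the full dataset), with `None` for a missing value:

```python
from collections import Counter

def duplicate_groups(rows):
    """Return {row: count} for rows that occur more than once."""
    counts = Counter(tuple(r) for r in rows)
    return {row: c for row, c in counts.items() if c > 1}

# Illustrative rows only; the first two are identical.
rows = [
    ("거푸집동바리", None, "타설", "매몰", "균열", None),
    ("거푸집동바리", None, "타설", "매몰", "균열", None),
    ("성토사면", "하단부", "거적고정", "전락", None, None),
    ("가물막이", None, "해체", "매몰", "붕괴", "보강재 미설치"),
]
groups = duplicate_groups(rows)
print(len(groups), "distinct duplicated row(s)")
```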