Overview

Dataset statistics

Number of variables: 5
Number of observations: 647
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 47
Duplicate rows (%): 7.3%
Total size in memory: 25.4 KiB
Average record size in memory: 40.2 B
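
The percentage and per-record figures above can be re-derived from the raw counts. The following is a minimal sketch, assuming that the missing-cell percentage is taken over all cells (variables times observations) and that 1 KiB is 1024 bytes; the variable names are illustrative and not part of the report.

```python
# Values copied from the "Dataset statistics" block of the report.
n_variables = 5
n_observations = 647
missing_cells = 0
duplicate_rows = 47
total_size_kib = 25.4


def pct(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


total_cells = n_variables * n_observations           # assumption: every cell counts
missing_pct = pct(missing_cells, total_cells)
duplicate_pct = pct(duplicate_rows, n_observations)
avg_record_bytes = round(total_size_kib * 1024 / n_observations, 1)

print(missing_pct, duplicate_pct, avg_record_bytes)  # prints 0.0 7.3 40.2
```

The three printed values agree with the reported 0.0%, 7.3%, and 40.2 B.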

Variable types

Categorical: 2
Text: 2
DateTime: 1

Dataset

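Description: Status of gas business operators in Yongin-si, Gyeonggi-do. The dataset provides fields such as category (구분), business type (사업종류), corporate name (법인명), and business site location (사업소소재지). ※ Data reference date (데이터기준일자): 2023-07-04
URL: https://www.data.go.kr/data/15044240/fileData.do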

Alerts

데이터기준일자 has constant value "2023-07-04" (Constant)
Dataset has 47 (7.3%) duplicate rows (Duplicates)
구분 is highly overall correlated with 사업종류 (High correlation)
사업종류 is highly overall correlated with 구분 (High correlation)
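
The "Constant" alert can be illustrated with a small check for columns that hold a single distinct value. This is only a sketch of the idea, not the profiler's own implementation; `constant_columns` is a hypothetical helper, and the three rows are a reduced copy of records shown in the Sample section (only three of the five columns are kept).

```python
def constant_columns(rows):
    """Return the names of columns that hold exactly one distinct value."""
    if not rows:
        return []
    return [
        name
        for name in rows[0]
        if len({row[name] for row in rows}) == 1
    ]


# Reduced copies of rows 0, 1 and 640 from the Sample section.
rows = [
    {"구분": "고압가스", "사업종류": "제조", "데이터기준일자": "2023-07-04"},
    {"구분": "고압가스", "사업종류": "판매", "데이터기준일자": "2023-07-04"},
    {"구분": "액화석유가스", "사업종류": "판매사업", "데이터기준일자": "2023-07-04"},
]

print(constant_columns(rows))  # prints ['데이터기준일자']
```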

Reproduction

Analysis started: 2023-12-12 14:07:10.524146
Analysis finished: 2023-12-12 14:07:11.190383
Duration: 0.67 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
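
The reported duration follows directly from the two timestamps. A short sketch of that arithmetic, using only the values printed above:

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"

# Timestamps copied from the Reproduction block.
started = datetime.strptime("2023-12-12 14:07:10.524146", FMT)
finished = datetime.strptime("2023-12-12 14:07:11.190383", FMT)

duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.2f} seconds")  # prints 0.67 seconds
```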

Variables

구분 (category)
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 5.2 KiB

고압가스: 428
액화석유가스: 219

Length

Max length: 6
Median length: 4
Mean length: 4.6769706
Min length: 4
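
These length statistics can be recomputed from the two value counts reported for this variable (고압가스: 428, 액화석유가스: 219). The sketch below assumes that length is measured in Unicode code points of the precomposed (NFC) strings; `length_stats` is a hypothetical helper name.

```python
from statistics import median

# Value counts copied from the report for the variable 구분.
value_counts = {"고압가스": 428, "액화석유가스": 219}


def length_stats(value_counts):
    """Expand the counts into one length per record and summarise them."""
    lengths = []
    for value, count in value_counts.items():
        lengths.extend([len(value)] * count)
    return {
        "max": max(lengths),
        "median": median(lengths),
        "mean": sum(lengths) / len(lengths),
        "min": min(lengths),
    }


stats = length_stats(value_counts)
print(stats["max"], stats["median"], round(stats["mean"], 7), stats["min"])
```

The mean works out to (428 × 4 + 219 × 6) / 647 = 3026 / 647, which matches the reported 4.6769706.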

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 고압가스
2nd row: 고압가스
3rd row: 고압가스
4th row: 고압가스
5th row: 고압가스

Common Values

Value | Count | Frequency (%)
고압가스 (high-pressure gas) | 428 | 66.2%
액화석유가스 (liquefied petroleum gas) | 219 | 33.8%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: Common values plot]

사업종류 (business type)
Categorical

HIGH CORRELATION

Distinct: 9
Distinct (%): 1.4%
Missing: 0
Missing (%): 0.0%
Memory size: 5.2 KiB

제조: 313
판매사업: 72
저장소설치: 71
저장소: 59
판매: 56
Other values (4): 76

Length

Max length: 8
Median length: 2
Mean length: 3.0370943
Min length: 2

Unique

Unique: 1
Unique (%): 0.2%

Sample

1st row: 제조
2nd row: 판매
3rd row: 제조
4th row: 제조
5th row: 판매

Common Values

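Value | Count | Frequency (%)
제조 (manufacturing) | 313 | 48.4%
판매사업 (sales business) | 72 | 11.1%
저장소설치 (storage facility installation) | 71 | 11.0%
저장소 (storage facility) | 59 | 9.1%
판매 (sales) | 56 | 8.7%
충전사업 (filling business) | 38 | 5.9%
집단공급사업 (collective supply business) | 24 | 3.7%
가스용품제조사업 (gas appliance manufacturing business) | 13 | 2.0%
충전사업영업소 (filling business sales office) | 1 | 0.2%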
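
The frequency column and the Distinct and Unique figures for 사업종류 can all be derived from the counts. The sketch below assumes that "Unique" means values occurring exactly once and that percentages are rounded to one decimal place; `summarize` is a hypothetical helper name.

```python
# Counts copied from the Common Values table for 사업종류.
value_counts = {
    "제조": 313, "판매사업": 72, "저장소설치": 71, "저장소": 59, "판매": 56,
    "충전사업": 38, "집단공급사업": 24, "가스용품제조사업": 13, "충전사업영업소": 1,
}


def summarize(value_counts):
    """Return total rows, distinct values, once-only values and frequencies (%)."""
    total = sum(value_counts.values())
    frequencies = {
        value: round(100.0 * count / total, 1)
        for value, count in value_counts.items()
    }
    distinct = len(value_counts)
    unique = sum(1 for count in value_counts.values() if count == 1)
    return total, distinct, unique, frequencies


total, distinct, unique, frequencies = summarize(value_counts)
print(total, distinct, unique)  # prints 647 9 1
print(round(100.0 * distinct / total, 1), round(100.0 * unique / total, 1))  # prints 1.4 0.2
```

The counts sum to the 647 observations, and the derived percentages match the reported Distinct (%) of 1.4% and Unique (%) of 0.2%.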

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: Common values plot]

법인명(상호) (corporate name / trade name)
Text

Distinct: 435
Distinct (%): 67.2%
Missing: 0
Missing (%): 0.0%
Memory size: 5.2 KiB

Length

Max length: 26
Median length: 23
Mean length: 8.6846986
Min length: 2

Characters and Unicode

Total characters: 5619
Distinct characters: 343
Distinct categories: 11
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
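
As an illustration of how character properties yield the category counts in this section, the sketch below tallies Unicode general categories with Python's standard `unicodedata` module. It is not the profiler's implementation; the `CATEGORY_NAMES` mapping and `category_counts` helper are illustrative, and the input strings are the five sample values of this variable listed under Sample. The standard library does not expose scripts or blocks, so only categories are covered.

```python
import unicodedata
from collections import Counter

# Two-letter general-category codes mapped to the long names used in the report.
CATEGORY_NAMES = {
    "Lo": "Other Letter", "Lu": "Uppercase Letter", "Ll": "Lowercase Letter",
    "Nd": "Decimal Number", "Ps": "Open Punctuation", "Pe": "Close Punctuation",
    "Po": "Other Punctuation", "Pd": "Dash Punctuation", "Zs": "Space Separator",
    "So": "Other Symbol", "Sm": "Math Symbol",
}


def category_counts(values):
    """Count characters per Unicode general category across all strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            code = unicodedata.category(ch)
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts


sample = ["대신자산신탁(주)", "세광가스텍", "(주)한준에프알", "(주)디티앤씨알오", "(주)유니온가스(2)"]
counts = category_counts(sample)
for name, count in counts.most_common():
    print(name, count)
```

Hangul syllables fall under "Other Letter", which is why that category dominates both the sample and the full column.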

Unique

Unique: 324
Unique (%): 50.1%

Sample

1st row: 대신자산신탁(주)
2nd row: 세광가스텍
3rd row: (주)한준에프알
4th row: (주)디티앤씨알오
5th row: (주)유니온가스(2)

Value | Count | Frequency (%)
신한자산신탁(주 | 11 | 1.5%
주식회사 | 10 | 1.3%
주)맥서브 | 9 | 1.2%
삼성전자(주)기흥사업장 | 9 | 1.2%
청오디피케이(주 | 9 | 1.2%
주)하나자산신탁 | 9 | 1.2%
삼성전자(주 | 6 | 0.8%
대상㈜ | 6 | 0.8%
주)케이피텍 | 5 | 0.7%
경기종합가스 | 5 | 0.7%
Other values (470) | 663 | 89.4%

Most occurring characters

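Value | Count | Frequency (%)
( | 395 | 7.0%
) | 395 | 7.0%
[missing] | 364 | 6.5%
[missing] | 197 | 3.5%
[missing] | 106 | 1.9%
[missing] | 103 | 1.8%
[missing] | 103 | 1.8%
(space) | 95 | 1.7%
[missing] | 93 | 1.7%
[missing] | 86 | 1.5%
Other values (333) | 3682 | 65.5%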

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 4526 | 80.5%
Open Punctuation | 395 | 7.0%
Close Punctuation | 395 | 7.0%
Space Separator | 95 | 1.7%
Decimal Number | 79 | 1.4%
Uppercase Letter | 76 | 1.4%
Other Symbol | 34 | 0.6%
Other Punctuation | 8 | 0.1%
Lowercase Letter | 6 | 0.1%
Dash Punctuation | 3 | 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
[missing] | 364 | 8.0%
[missing] | 197 | 4.4%
[missing] | 106 | 2.3%
[missing] | 103 | 2.3%
[missing] | 103 | 2.3%
[missing] | 93 | 2.1%
[missing] | 86 | 1.9%
[missing] | 82 | 1.8%
[missing] | 78 | 1.7%
[missing] | 77 | 1.7%
Other values (300) | 3237 | 71.5%

Uppercase Letter
Value | Count | Frequency (%)
L | 19 | 25.0%
G | 16 | 21.1%
P | 15 | 19.7%
S | 7 | 9.2%
K | 6 | 7.9%
C | 4 | 5.3%
J | 2 | 2.6%
H | 2 | 2.6%
I | 2 | 2.6%
M | 1 | 1.3%
Other values (2) | 2 | 2.6%

Decimal Number
Value | Count | Frequency (%)
1 | 37 | 46.8%
2 | 13 | 16.5%
9 | 13 | 16.5%
3 | 6 | 7.6%
5 | 3 | 3.8%
8 | 3 | 3.8%
0 | 2 | 2.5%
6 | 1 | 1.3%
7 | 1 | 1.3%

Lowercase Letter
Value | Count | Frequency (%)
a | 2 | 33.3%
g | 2 | 33.3%
s | 2 | 33.3%

Other Punctuation
Value | Count | Frequency (%)
. | 7 | 87.5%
/ | 1 | 12.5%

Math Symbol
Value | Count | Frequency (%)
> | 1 | 50.0%
< | 1 | 50.0%

Open Punctuation
Value | Count | Frequency (%)
( | 395 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 395 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 95 | 100.0%

Other Symbol
Value | Count | Frequency (%)
[missing] | 34 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 3 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 4560 | 81.2%
Common | 977 | 17.4%
Latin | 82 | 1.5%

Most frequent character per script

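Hangul
Value | Count | Frequency (%)
[missing] | 364 | 8.0%
[missing] | 197 | 4.3%
[missing] | 106 | 2.3%
[missing] | 103 | 2.3%
[missing] | 103 | 2.3%
[missing] | 93 | 2.0%
[missing] | 86 | 1.9%
[missing] | 82 | 1.8%
[missing] | 78 | 1.7%
[missing] | 77 | 1.7%
Other values (301) | 3271 | 71.7%

Common
Value | Count | Frequency (%)
( | 395 | 40.4%
) | 395 | 40.4%
(space) | 95 | 9.7%
1 | 37 | 3.8%
2 | 13 | 1.3%
9 | 13 | 1.3%
. | 7 | 0.7%
3 | 6 | 0.6%
5 | 3 | 0.3%
- | 3 | 0.3%
Other values (7) | 10 | 1.0%

Latin
Value | Count | Frequency (%)
L | 19 | 23.2%
G | 16 | 19.5%
P | 15 | 18.3%
S | 7 | 8.5%
K | 6 | 7.3%
C | 4 | 4.9%
a | 2 | 2.4%
g | 2 | 2.4%
J | 2 | 2.4%
H | 2 | 2.4%
Other values (5) | 7 | 8.5%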

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 4526 | 80.5%
ASCII | 1059 | 18.8%
None | 34 | 0.6%

Most frequent character per block

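ASCII
Value | Count | Frequency (%)
( | 395 | 37.3%
) | 395 | 37.3%
(space) | 95 | 9.0%
1 | 37 | 3.5%
L | 19 | 1.8%
G | 16 | 1.5%
P | 15 | 1.4%
2 | 13 | 1.2%
9 | 13 | 1.2%
. | 7 | 0.7%
Other values (22) | 54 | 5.1%

Hangul
Value | Count | Frequency (%)
[missing] | 364 | 8.0%
[missing] | 197 | 4.4%
[missing] | 106 | 2.3%
[missing] | 103 | 2.3%
[missing] | 103 | 2.3%
[missing] | 93 | 2.1%
[missing] | 86 | 1.9%
[missing] | 82 | 1.8%
[missing] | 78 | 1.7%
[missing] | 77 | 1.7%
Other values (300) | 3237 | 71.5%

None
Value | Count | Frequency (%)
[missing] | 34 | 100.0%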

사업소소재지 (business site location)
Text

Distinct: 501
Distinct (%): 77.4%
Missing: 0
Missing (%): 0.0%
Memory size: 5.2 KiB

Length

Max length: 48
Median length: 44
Mean length: 27.57187
Min length: 16

Characters and Unicode

Total characters: 17839
Distinct characters: 274
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 425
Unique (%): 65.7%

Sample

1st row: 경기도 용인시 처인구 남사읍 북리 134-1 외 6필지
2nd row: 경기도 용인시 처인구 백암면 장평리 80-2 (장평리 150-4)
3rd row: 경기도 용인시 기흥구 공세동 263-6
4th row: 경기도 용인시 처인구 백령로20번길 28, D동 (유방동)
5th row: 경기도 용인시 처인구 남사읍 전궁리 28

Value | Count | Frequency (%)
경기도 | 647 | 16.6%
용인시 | 540 | 13.8%
처인구 | 346 | 8.9%
기흥구 | 164 | 4.2%
용인시처인구 | 78 | 2.0%
백암면 | 61 | 1.6%
양지면 | 59 | 1.5%
남사면 | 48 | 1.2%
남사읍 | 45 | 1.2%
농서동 | 41 | 1.1%
Other values (784) | 1870 | 48.0%

Most occurring characters

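Value | Count | Frequency (%)
(space) | 3568 | 20.0%
[missing] | 1093 | 6.1%
[missing] | 881 | 4.9%
[missing] | 680 | 3.8%
[missing] | 668 | 3.7%
[missing] | 664 | 3.7%
[missing] | 658 | 3.7%
[missing] | 651 | 3.6%
1 | 478 | 2.7%
[missing] | 431 | 2.4%
Other values (264) | 8067 | 45.2%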

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 10916 | 61.2%
Space Separator | 3568 | 20.0%
Decimal Number | 2468 | 13.8%
Dash Punctuation | 254 | 1.4%
Close Punctuation | 238 | 1.3%
Open Punctuation | 238 | 1.3%
Other Punctuation | 120 | 0.7%
Uppercase Letter | 37 | 0.2%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
[missing] | 1093 | 10.0%
[missing] | 881 | 8.1%
[missing] | 680 | 6.2%
[missing] | 668 | 6.1%
[missing] | 664 | 6.1%
[missing] | 658 | 6.0%
[missing] | 651 | 6.0%
[missing] | 431 | 3.9%
[missing] | 416 | 3.8%
[missing] | 392 | 3.6%
Other values (235) | 4382 | 40.1%

Uppercase Letter
Value | Count | Frequency (%)
C | 6 | 16.2%
S | 5 | 13.5%
L | 5 | 13.5%
T | 3 | 8.1%
A | 3 | 8.1%
E | 3 | 8.1%
K | 3 | 8.1%
G | 2 | 5.4%
P | 2 | 5.4%
I | 2 | 5.4%
Other values (3) | 3 | 8.1%

Decimal Number
Value | Count | Frequency (%)
1 | 478 | 19.4%
2 | 379 | 15.4%
4 | 281 | 11.4%
3 | 240 | 9.7%
5 | 233 | 9.4%
0 | 188 | 7.6%
6 | 186 | 7.5%
9 | 168 | 6.8%
7 | 164 | 6.6%
8 | 151 | 6.1%

Other Punctuation
Value | Count | Frequency (%)
, | 119 | 99.2%
. | 1 | 0.8%

Space Separator
Value | Count | Frequency (%)
(space) | 3568 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 254 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 238 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 238 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 10916 | 61.2%
Common | 6886 | 38.6%
Latin | 37 | 0.2%

Most frequent character per script

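Hangul
Value | Count | Frequency (%)
[missing] | 1093 | 10.0%
[missing] | 881 | 8.1%
[missing] | 680 | 6.2%
[missing] | 668 | 6.1%
[missing] | 664 | 6.1%
[missing] | 658 | 6.0%
[missing] | 651 | 6.0%
[missing] | 431 | 3.9%
[missing] | 416 | 3.8%
[missing] | 392 | 3.6%
Other values (235) | 4382 | 40.1%

Common
Value | Count | Frequency (%)
(space) | 3568 | 51.8%
1 | 478 | 6.9%
2 | 379 | 5.5%
4 | 281 | 4.1%
- | 254 | 3.7%
3 | 240 | 3.5%
) | 238 | 3.5%
( | 238 | 3.5%
5 | 233 | 3.4%
0 | 188 | 2.7%
Other values (6) | 789 | 11.5%

Latin
Value | Count | Frequency (%)
C | 6 | 16.2%
S | 5 | 13.5%
L | 5 | 13.5%
T | 3 | 8.1%
A | 3 | 8.1%
E | 3 | 8.1%
K | 3 | 8.1%
G | 2 | 5.4%
P | 2 | 5.4%
I | 2 | 5.4%
Other values (3) | 3 | 8.1%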

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 10916 | 61.2%
ASCII | 6923 | 38.8%

Most frequent character per block

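ASCII
Value | Count | Frequency (%)
(space) | 3568 | 51.5%
1 | 478 | 6.9%
2 | 379 | 5.5%
4 | 281 | 4.1%
- | 254 | 3.7%
3 | 240 | 3.5%
) | 238 | 3.4%
( | 238 | 3.4%
5 | 233 | 3.4%
0 | 188 | 2.7%
Other values (19) | 826 | 11.9%

Hangul
Value | Count | Frequency (%)
[missing] | 1093 | 10.0%
[missing] | 881 | 8.1%
[missing] | 680 | 6.2%
[missing] | 668 | 6.1%
[missing] | 664 | 6.1%
[missing] | 658 | 6.0%
[missing] | 651 | 6.0%
[missing] | 431 | 3.9%
[missing] | 416 | 3.8%
[missing] | 392 | 3.6%
Other values (235) | 4382 | 40.1%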

데이터기준일자 (data reference date)
Date

CONSTANT

Distinct: 1
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 5.2 KiB
Minimum: 2023-07-04 00:00:00
Maximum: 2023-07-04 00:00:00

[Figure: Histogram with fixed size bins (bins=1)]

Correlations

Variable | 구분 | 사업종류
구분 | 1.000 | 1.000
사업종류 | 1.000 | 1.000

Variable | 구분 | 사업종류
구분 | 1.000 | 0.995
사업종류 | 0.995 | 1.000

Variable | 구분 | 사업종류
구분 | 1.000 | 0.995
사업종류 | 0.995 | 1.000
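
The report text does not state which coefficient these matrices use. As a hedged illustration, the sketch below computes the uncorrected Cramér's V, a common association measure for two categorical variables; this choice is an assumption rather than a statement about the profiler. The function name `cramers_v` is illustrative, and the five (구분, 사업종류) pairs are taken from rows 0, 1, 2, 637, and 640 of the Sample section.

```python
from collections import Counter
from math import sqrt


def cramers_v(xs, ys):
    """Uncorrected Cramér's V for two equal-length categorical sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    chi2 = 0.0
    for x, row_total in row_totals.items():
        for y, col_total in col_totals.items():
            expected = row_total * col_total / n
            observed = joint.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(row_totals), len(col_totals)) - 1
    if k == 0:
        return 0.0  # a constant column carries no association
    return sqrt(chi2 / (n * k))


# (구분, 사업종류) pairs from rows 0, 1, 2, 637 and 640 of the Sample section.
gubun = ["고압가스", "고압가스", "고압가스", "액화석유가스", "액화석유가스"]
business = ["제조", "판매", "제조", "저장소설치", "판매사업"]
print(round(cramers_v(gubun, business), 3))
```

In this small excerpt every 사업종류 value appears under a single 구분 value, so the statistic reaches its maximum of 1. That is consistent with the near-1 coefficients and the High correlation alert.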

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: Nullity matrix, a data-dense display which lets you quickly visually pick out patterns in data completion.]

Sample

First rows

Row | 구분 | 사업종류 | 법인명(상호) | 사업소소재지 | 데이터기준일자
0 | 고압가스 | 제조 | 대신자산신탁(주) | 경기도 용인시 처인구 남사읍 북리 134-1 외 6필지 | 2023-07-04
1 | 고압가스 | 판매 | 세광가스텍 | 경기도 용인시 처인구 백암면 장평리 80-2 (장평리 150-4) | 2023-07-04
2 | 고압가스 | 제조 | (주)한준에프알 | 경기도 용인시 기흥구 공세동 263-6 | 2023-07-04
3 | 고압가스 | 제조 | (주)디티앤씨알오 | 경기도 용인시 처인구 백령로20번길 28, D동 (유방동) | 2023-07-04
4 | 고압가스 | 판매 | (주)유니온가스(2) | 경기도 용인시 처인구 남사읍 전궁리 28 | 2023-07-04
5 | 고압가스 | 판매 | (주)유니온가스(1) | 경기도 용인시 처인구 남사읍 전궁리 28-18 | 2023-07-04
6 | 고압가스 | 제조 | 신한자산신탁(주) | 경기도 용인시 처인구 남사읍 북리 45-1 | 2023-07-04
7 | 고압가스 | 제조 | 신한자산신탁(주) | 경기도 용인시 처인구 남사읍 북리 45-1 | 2023-07-04
8 | 고압가스 | 제조 | 신한자산신탁(주) | 경기도 용인시 처인구 남사읍 북리 45-1 | 2023-07-04
9 | 고압가스 | 제조 | 신한자산신탁(주) | 경기도 용인시 처인구 남사읍 북리 45-1 | 2023-07-04

Last rows

Row | 구분 | 사업종류 | 법인명(상호) | 사업소소재지 | 데이터기준일자
637 | 액화석유가스 | 저장소설치 | (주)녹십자 | 경기도 용인시 구성면 보정리 303호 | 2023-07-04
638 | 액화석유가스 | 저장소설치 | 두암산업 | 경기도 용인시처인구 고림동 649호 | 2023-07-04
639 | 액화석유가스 | 저장소설치 | (주)도루코 | 경기도 용인시처인구 고림동 931호 | 2023-07-04
640 | 액화석유가스 | 판매사업 | 엘지가스 | 경기도 용인시처인구 포곡읍 둔전리 192-8호 | 2023-07-04
641 | 액화석유가스 | 저장소설치 | 삼성전자(주)2단지 | 경기도 용인시기흥구 농서동 산24호 | 2023-07-04
642 | 액화석유가스 | 저장소설치 | 삼성전자2단지 | 경기도 용인시기흥구 농서동 산24호 | 2023-07-04
643 | 액화석유가스 | 저장소설치 | 삼성전자(주)1단지5기 | 경기도 용인시기흥구 농서동 산24호 | 2023-07-04
644 | 액화석유가스 | 저장소설치 | 삼성전자(주)2단 | 경기도 용인시기흥구 농서동 산24호 | 2023-07-04
645 | 액화석유가스 | 저장소설치 | 삼성전자(주) | 경기도 용인시기흥구 농서동 산24호 | 2023-07-04
646 | 액화석유가스 | 저장소설치 | (주)리바트 | 경기도 용인시처인구 남사면 북리 54번지 10 호 | 2023-07-04

Duplicate rows

Most frequently occurring

Row | 구분 | 사업종류 | 법인명(상호) | 사업소소재지 | 데이터기준일자 | # duplicates
27 | 고압가스 | 제조 | 신한자산신탁(주) | 경기도 용인시 처인구 남사읍 북리 45-1 | 2023-07-04 | 11
19 | 고압가스 | 제조 | (주)하나자산신탁 | 경기도 용인시 처인구 이동읍 덕성리 1261-1 | 2023-07-04 | 9
32 | 고압가스 | 제조 | 청오디피케이(주) | 경기도 용인시 처인구 백암면 가창리 435-12 외 1필지 | 2023-07-04 | 8
5 | 고압가스 | 제조 | (주)맥서브 | 경기도 용인시 처인구 양지면 중부대로 2465 | 2023-07-04 | 5
21 | 고압가스 | 제조 | 대상㈜ | 경기도 용인시 처인구 백암면 근곡로107번길 4 | 2023-07-04 | 5
17 | 고압가스 | 제조 | (주)코스트코 코리아(공세점) | 경기도 용인시 기흥구 탑실로 38 (공세동) | 2023-07-04 | 4
29 | 고압가스 | 제조 | 제일약품㈜ | 경기도 용인시 처인구 백암면 청강가창로 7, 제일약품 | 2023-07-04 | 4
35 | 고압가스 | 제조 | 현대자동차(주)환경기술연구소 | 경기도 용인시 기흥구 마북로240번길 17-5 (마북동) | 2023-07-04 | 4
36 | 액화석유가스 | 저장소설치 | (주)두암산업 | 경기도 용인시처인구 고림동 649-1호 | 2023-07-04 | 4
1 | 고압가스 | 저장소 | 삼성전자(주)기흥사업장 | 경기도 용인시 기흥구 삼성2로 95 (농서동) | 2023-07-04 | 3
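
A table like this can be approximated by grouping identical records and keeping the groups that occur more than once. The sketch below assumes that "# duplicates" is the size of each group, which the report does not spell out; `duplicate_groups` is a hypothetical helper, and the input is a six-record excerpt (rows 0, 4, and 6 to 9 from the Sample section), so its counts are smaller than in the full dataset.

```python
from collections import Counter


def duplicate_groups(rows):
    """Return (row, count) pairs for rows seen more than once, largest first."""
    counts = Counter(rows)
    groups = [(row, n) for row, n in counts.items() if n > 1]
    return sorted(groups, key=lambda item: item[1], reverse=True)


shinhan = ("고압가스", "제조", "신한자산신탁(주)", "경기도 용인시 처인구 남사읍 북리 45-1", "2023-07-04")
rows = [
    ("고압가스", "제조", "대신자산신탁(주)", "경기도 용인시 처인구 남사읍 북리 134-1 외 6필지", "2023-07-04"),
    ("고압가스", "판매", "(주)유니온가스(2)", "경기도 용인시 처인구 남사읍 전궁리 28", "2023-07-04"),
    shinhan, shinhan, shinhan, shinhan,  # rows 6-9 of the sample are identical
]

for row, n in duplicate_groups(rows):
    print(n, row[2])  # prints 4 신한자산신탁(주)
```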