KEEL-dataset - data set description

This section describes the main characteristics of the Pollution data set and its attributes:

General information

Pollution data set
Type:            Unsupervised
Origin:          Real world
Features:        16 (Real / Integer / Nominal: 16 / 0 / 0)
Instances:       60
Missing values?  No

Attribute description

Name    Type  Range                  Name    Type  Range
PREC    Real  [10.0, 60.0]           NONW    Real  [0.8, 38.5]
JANT    Real  [12.0, 67.0]           WWDRK   Real  [33.8, 59.7]
JULT    Real  [63.0, 85.0]           POOR    Real  [9.4, 26.4]
OVR65   Real  [5.6, 11.8]            HC      Real  [1.0, 648.0]
POPN    Real  [2.92, 3.53]           NOX     Real  [1.0, 319.0]
EDUC    Real  [9.0, 12.3]            SO@     Real  [1.0, 278.0]
HOUS    Real  [66.8, 90.7]           HUMID   Real  [38.0, 73.0]
DENS    Real  [1441.0, 9699.0]       MORT    Real  [790.733, 1113.156]

Additional information

This data set was proposed in McDonald, G.C. and Schwing, R.C. (1973) 'Instabilities of regression estimates relating air pollution to mortality', Technometrics, vol. 15, pp. 463-482. It contains 16 attributes describing 60 different pollution scenarios. The attributes are the following:

1) PREC (Real): Average annual precipitation in inches
2) JANT (Real): Average January temperature in degrees F
3) JULT (Real): Average July temperature in degrees F
4) OVR65 (Real): % of 1960 SMSA population aged 65 or older
5) POPN (Real): Average household size
6) EDUC (Real): Median school years completed by those over 22
7) HOUS (Real): % of housing units which are sound and with all facilities
8) DENS (Real): Population per sq. mile in urbanized areas, 1960
9) NONW (Real): % non-white population in urbanized areas, 1960
10) WWDRK (Real): % employed in white collar occupations
11) POOR (Real): % of families with income less than $3000
12) HC (Real): Relative hydrocarbon pollution potential
13) NOX (Real): Relative pollution potential of nitric oxides
14) SO@ (Real): Relative pollution potential of sulphur dioxide
15) HUMID (Real): Annual average % relative humidity at 1pm
16) MORT (Real): Total age-adjusted mortality rate per 100,000
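
The attribute names and ranges above can be checked against a local copy of the data. The following is a minimal sketch, assuming the KEEL-format file uses `@attribute NAME type [low, high]` header lines followed by an `@data` marker and comma-separated rows. The functions `parse_keel` and `out_of_range` are hypothetical helpers, and the two data rows in `SAMPLE` are placeholder values, not records from the data set.

```python
import re

# Assumed header syntax: "@attribute NAME type [low, high]".
# The range part is optional in this sketch.
ATTR_RE = re.compile(
    r"@attribute\s+(\S+)\s+(\w+)\s*(?:\[\s*([^,\]]+)\s*,\s*([^\]]+?)\s*\])?",
    re.IGNORECASE,
)


def parse_keel(text):
    """Return (attributes, rows) parsed from KEEL-style text.

    attributes: list of (name, type, (low, high)); bounds are None if absent.
    rows: list of lists of floats, one list per data line.
    Only numeric attributes are handled, which matches this data set.
    """
    attributes, rows = [], []
    in_data = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("%"):
            continue  # skip blank lines and comments
        if in_data:
            rows.append([float(v) for v in line.split(",")])
        elif line.lower().startswith("@data"):
            in_data = True
        elif line.lower().startswith("@attribute"):
            m = ATTR_RE.match(line)
            if m:
                name, kind, lo, hi = m.groups()
                bounds = (float(lo), float(hi)) if lo is not None else None
                attributes.append((name, kind.lower(), bounds))
    return attributes, rows


def out_of_range(attributes, rows):
    """List (row_index, attribute_name, value) for values outside their range."""
    problems = []
    for i, row in enumerate(rows):
        for (name, _kind, bounds), value in zip(attributes, row):
            if bounds is not None and not (bounds[0] <= value <= bounds[1]):
                problems.append((i, name, value))
    return problems


# Header lines use ranges from the attribute table; data rows are placeholders.
SAMPLE = """@relation pollution
@attribute PREC real [10.0, 60.0]
@attribute SO@ real [1.0, 278.0]
@attribute MORT real [790.733, 1113.156]
@data
30.0, 50.0, 900.0
40.0, 100.0, 1000.0
"""

attributes, rows = parse_keel(SAMPLE)
violations = out_of_range(attributes, rows)
```

With a real file, `parse_keel(open(path).read())` should yield 16 attributes and 60 rows if the file matches the description above; `path` stands for wherever the downloaded file was saved.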

In this section you can download some files related to the Pollution data set:

  • The complete data set, already formatted in KEEL format, can be downloaded from here.
  • The header file associated with this data set can be downloaded from here.
  • This is not a native data set from the KEEL project. It has been obtained from the Bilkent University Function Approximation Repository. The original page where the data set can be found is:

