KEEL: A software tool to assess evolutionary algorithms for Data Mining problems (regression, classification, clustering, pattern mining and so on)

KEEL-dataset - data set description

Automobile data set

Description

This section describes the main characteristics of the automobile data set and its attributes:

General information

Automobile data set
Type	Classification	Origin	Real world
Features	25	(Real / Integer / Nominal)	(15 / 0 / 10)
Classes	6	Missing values?	Yes
Total instances	205	Instances without missing values	150
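
The instance counts above can be checked against a downloaded copy with a few lines of Python. This is a minimal sketch rather than KEEL tooling: it assumes each instance has already been split into a list of string fields, and that missing values are marked with `?` or `<null>` (both markers are assumptions; check the actual file). The sample rows are made up for illustration.

```python
# Hypothetical missing-value markers; adjust them to match the actual file.
MISSING_MARKERS = {"?", "<null>"}


def is_complete(row):
    """Return True if no field in the row is a missing-value marker."""
    return all(field.strip() not in MISSING_MARKERS for field in row)


def summarize(rows):
    """Count total instances and instances without missing values."""
    complete = sum(1 for row in rows if is_complete(row))
    return {
        "total": len(rows),
        "complete": complete,
        "with_missing": len(rows) - complete,
    }


# Made-up rows for illustration only (not taken from the data set).
sample_rows = [
    ["164", "audi", "gas", "1"],
    ["?", "bmw", "gas", "0"],
    ["<null>", "audi", "diesel", "2"],
]
summary = summarize(sample_rows)
print(summary)
```

If the markers match the file, the full data set should report 205 total and 150 complete instances, leaving 55 instances with at least one missing value.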

Attribute description

Attribute	Domain	Attribute	Domain	Attribute	Domain
Normalized-losses	[65.0, 256.0]	Length	[141.1, 208.1]	Bore	[2.54, 3.94]
Make	{alfa-romero, audi, bmw ....}	Width	[60.3, 72.3]	Stroke	[2.07, 4.17]
Fuel-type	{diesel, gas}	Height	[47.8, 59.8]	Compression-ratio	[7.0, 23.0]
Aspiration	{std, turbo}	Curb-weight	[1488.0, 4066.0]	Horsepower	[48.0, 288.0]
Num-of-doors	{four, two}	Engine-type	{dohc, dohcv, l, ohc, ohcf, ohcv, rotor}	Peak-rpm	[4150.0, 6600.0]
Body-style	{hardtop, wagon, sedan, hatchback, convertible}	Num-of-cylinders	{eight, five, four, six, three, twelve, two}	City-mpg	[13.0, 49.0]
Drive-wheels	{4wd, fwd, rwd}	Engine-size	[61.0, 326.0]	Highway-mpg	[16.0, 54.0]
Engine-location	{front, rear}	Fuel-system	{1bbl, 2bbl, 4bbl, idi, mfi, mpfi, spdi, spfi}	Price	[5118.0, 45400.0]
Wheel-base	[86.6, 120.9]	Output	{-2, -1, 0, 1, 2, 3}
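
The domains in the table can be used as a simple sanity check on loaded records. The sketch below copies the documented domains into two dictionaries and reports values that fall outside them. The function name `violations` and the dictionary-based record layout are assumptions made for illustration, and Make is left out because its value list is truncated in the table.

```python
# Numeric domains copied from the attribute table, as (minimum, maximum).
NUMERIC_DOMAINS = {
    "Normalized-losses": (65.0, 256.0),
    "Wheel-base": (86.6, 120.9),
    "Length": (141.1, 208.1),
    "Width": (60.3, 72.3),
    "Height": (47.8, 59.8),
    "Curb-weight": (1488.0, 4066.0),
    "Engine-size": (61.0, 326.0),
    "Bore": (2.54, 3.94),
    "Stroke": (2.07, 4.17),
    "Compression-ratio": (7.0, 23.0),
    "Horsepower": (48.0, 288.0),
    "Peak-rpm": (4150.0, 6600.0),
    "City-mpg": (13.0, 49.0),
    "Highway-mpg": (16.0, 54.0),
    "Price": (5118.0, 45400.0),
}

# Nominal domains copied from the table. "Make" is omitted on purpose:
# its list of values is truncated in the table.
NOMINAL_DOMAINS = {
    "Fuel-type": {"diesel", "gas"},
    "Aspiration": {"std", "turbo"},
    "Num-of-doors": {"four", "two"},
    "Body-style": {"hardtop", "wagon", "sedan", "hatchback", "convertible"},
    "Drive-wheels": {"4wd", "fwd", "rwd"},
    "Engine-location": {"front", "rear"},
    "Engine-type": {"dohc", "dohcv", "l", "ohc", "ohcf", "ohcv", "rotor"},
    "Num-of-cylinders": {"eight", "five", "four", "six", "three",
                         "twelve", "two"},
    "Fuel-system": {"1bbl", "2bbl", "4bbl", "idi", "mfi", "mpfi",
                    "spdi", "spfi"},
    "Output": {"-2", "-1", "0", "1", "2", "3"},
}


def violations(record):
    """Return (attribute, value) pairs that fall outside the documented domain.

    Attributes without a documented domain (such as "Make") are skipped.
    """
    bad = []
    for name, value in record.items():
        if name in NUMERIC_DOMAINS:
            low, high = NUMERIC_DOMAINS[name]
            if not low <= float(value) <= high:
                bad.append((name, value))
        elif name in NOMINAL_DOMAINS:
            if str(value) not in NOMINAL_DOMAINS[name]:
                bad.append((name, value))
    return bad


# Made-up record for illustration; Horsepower is deliberately out of range.
record = {"Fuel-type": "gas", "Horsepower": 300.0, "Output": 3, "Make": "audi"}
print(violations(record))  # [('Horsepower', 300.0)]
```

Missing values would need to be filtered out before calling `violations`, since `float` cannot parse a missing-value marker.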

Additional information

This data set consists of three types of entities: (a) the specification of an auto in terms of various characteristics, (b) its assigned insurance risk rating, and (c) its normalized losses in use as compared to other cars. Symboling, the insurance risk rating (the Output attribute in the table above), corresponds to the degree to which the auto is more risky than its price indicates. Cars are initially assigned a risk factor symbol associated with their price. Then, if a car is more (or less) risky, this symbol is adjusted by moving it up (or down) the scale. A value of +3 indicates that the auto is risky, -2 that it is probably pretty safe.

Files and additional references

In this section you can download some files related to the automobile data set:

The complete data set, already formatted in KEEL format, can be downloaded from here.
A copy of the data set already partitioned by means of a 10-fold cross-validation procedure can be downloaded from here.
A copy of the data set already partitioned by means of a 5-fold cross-validation procedure can be downloaded from here.
The header file associated with this data set can be downloaded from here.
This is not a native data set from the KEEL project. It has been obtained from the UCI Machine Learning Repository. The original page where the data set can be found is: http://archive.ics.uci.edu/ml/datasets/Automobile.
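
To illustrate what a k-fold partition of this kind looks like, the following sketch splits instance indices into k folds and then uses each fold once as the test set. It is not the procedure used to build the downloadable partitions, which this page does not describe. Stratifying by class label is an assumption made here, and the function names and sample labels are hypothetical.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k, seed=0):
    """Split instance indices into k folds with similar class proportions.

    The indices of each class are shuffled and dealt to the folds in
    round-robin order, so fold sizes differ by at most one.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    folds = [[] for _ in range(k)]
    next_fold = 0
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        for index in indices:
            folds[next_fold].append(index)
            next_fold = (next_fold + 1) % k
    return folds


def train_test_splits(folds):
    """Yield (train, test) index lists, using each fold once as the test set."""
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# Made-up class labels for illustration; real labels would come from Output.
labels = [-2, -1, 0, 0, 1, 1, 1, 2, 3, 3] * 2
folds = stratified_folds(labels, k=5)
for train, test in train_test_splits(folds):
    print(len(train), len(test))
```

Under this scheme, 205 instances split into 10 folds would give folds of 20 or 21 instances each.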