KEEL: A software tool to assess evolutionary algorithms for Data Mining problems (regression, classification, clustering, pattern mining and so on)

KEEL-dataset - data set description

West East data set

Description

This section describes the main characteristics of the westEast data set and its attributes:

General information

West East data set
Type	Multi instance
Origin	Real world
Features	25 (Real / Integer / Nominal: 24 / 0 / 1)
Instances	213
Classes	2
Missing values?	No

Attribute description

Attribute	Domain	Attribute	Domain	Attribute	Domain
Train_id	{128, ..., 33}	C3=double	[0.0, 1.0]	L0=circle	[0.0, 1.0]
C0	[1.0, 4.0]	C3=not_double	[0.0, 1.0]	L0=diamond	[0.0, 1.0]
C1=bucket	[0.0, 1.0]	C4=arc	[0.0, 1.0]	L0=hexagon	[0.0, 1.0]
C1=ellipse	[0.0, 1.0]	C4=flat	[0.0, 1.0]	L0=rectangle	[0.0, 1.0]
C1=hexagon	[0.0, 1.0]	C4=jagged	[0.0, 1.0]	L0=triangle	[0.0, 1.0]
C1=rectangle	[0.0, 1.0]	C4=none	[0.0, 1.0]	L0=utriangle	[0.0, 1.0]
C1=u_shaped	[0.0, 1.0]	C4=peaked	[0.0, 1.0]	L1	[0.0, 3.0]
C2=long	[0.0, 1.0]	C5	[2.0, 3.0]	Train_list1_order	[0.0, 3.0]
C2=short	[0.0, 1.0]	Class	{0, 1}
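
The attribute names of the form NAME=value with domain [0.0, 1.0] look like binary indicator columns for nominal car properties (for example, C1 with the values bucket, ellipse, hexagon, rectangle and u_shaped). Assuming that reading is correct and that at most one indicator per group is set to 1.0, a minimal Python sketch for collapsing them back into one nominal value per group could look like this. The function name decode_indicators and the example car are hypothetical and are not taken from the data set:

```python
def decode_indicators(row):
    """Collapse 'NAME=value' indicator columns into one nominal value per NAME.

    Assumption: a column named 'NAME=value' holds 1.0 when the car has that
    value and 0.0 otherwise. Columns without '=' are copied unchanged.
    """
    decoded = {}
    for column, value in row.items():
        if "=" in column:
            name, category = column.split("=", 1)
            if value == 1.0:
                decoded[name] = category
        else:
            decoded[column] = value
    return decoded


# Hypothetical car with a subset of the columns, for illustration only.
example_car = {
    "C0": 2.0,
    "C1=bucket": 0.0, "C1=ellipse": 0.0, "C1=hexagon": 0.0,
    "C1=rectangle": 1.0, "C1=u_shaped": 0.0,
    "C2=long": 0.0, "C2=short": 1.0,
    "L0=circle": 1.0, "L0=diamond": 0.0,
}

decoded_car = decode_indicators(example_car)
print(decoded_car)
```

Groups in which no indicator is set are simply left out of the result, so a caller can detect them by checking for the missing key.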

Additional information

The well-known East-West Challenge was originally an ILP (inductive logic programming) problem. The task consists of predicting whether a train is eastbound or westbound. A train (bag) contains a variable number of cars (instances) that have different shapes and carry different loads (instance-level attributes). Because the standard MI assumption is asymmetric, and it is not clear whether an eastbound or a westbound train should be regarded as the positive example in the MI setting, we consider two MI versions of the data for our experiments. This data set contains westbound trains as positive examples.
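
The bag structure and the asymmetry mentioned above can be illustrated with a small Python sketch. It assumes that each row is one car, that Train_id identifies the bag, and that every car of a train carries the same Class value. The rows, the train identifiers t1 and t2, and the function names are hypothetical. The function standard_mi_label only illustrates the standard MI assumption on made-up instance labels; the data set itself provides labels at the bag level:

```python
from collections import defaultdict


def group_into_bags(rows):
    """Group instance rows (cars) into bags (trains) keyed by Train_id."""
    bags = defaultdict(list)
    labels = {}
    for row in rows:
        train_id = row["Train_id"]
        bags[train_id].append(row)
        # Assumption: all cars of one train share the same Class value.
        labels[train_id] = row["Class"]
    return dict(bags), labels


def standard_mi_label(instance_labels):
    """Standard MI assumption: a bag is positive (1) if at least one
    instance is positive, and negative (0) otherwise."""
    return 1 if any(label == 1 for label in instance_labels) else 0


# Hypothetical rows, not taken from the data set.
rows = [
    {"Train_id": "t1", "C0": 2.0, "Class": 1},
    {"Train_id": "t1", "C0": 3.0, "Class": 1},
    {"Train_id": "t1", "C0": 2.0, "Class": 1},
    {"Train_id": "t2", "C0": 4.0, "Class": 0},
    {"Train_id": "t2", "C0": 1.0, "Class": 0},
]

bags, bag_labels = group_into_bags(rows)
bag_sizes = {train_id: len(cars) for train_id, cars in bags.items()}

# Asymmetry: swapping which class counts as positive does not simply
# flip the bag label under the standard assumption.
hidden = [0, 1, 0]
flipped = [1 - label for label in hidden]
print(bag_sizes, bag_labels)
print(standard_mi_label(hidden), standard_mi_label(flipped))
```

In the last two lines both the original and the flipped instance labels yield a positive bag, which is why the choice of the positive class matters and why two MI versions of the data are considered.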

Files and additional references

In this section you can download some files related to the westEast data set:

The complete data set in KEEL format can be downloaded from here.
A copy of the data set already partitioned by means of a 10-fold cross-validation procedure can be downloaded from here.
The header file associated with this data set can be downloaded from here.
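
As a rough guide to reading such files, the following Python sketch parses text in the KEEL layout as it is commonly encountered: header lines beginning with @relation, @attribute, @inputs and @outputs, followed by an @data line and comma-separated rows. This layout is an assumption here and should be checked against the downloaded files. The function name parse_keel and the sample text are hypothetical and do not reproduce the actual westEast header:

```python
import re


def parse_keel(text):
    """Parse KEEL-style text into (attribute names, data rows).

    Assumption: '@attribute NAME ...' lines declare the columns in order,
    and every non-empty line after '@data' is one comma-separated row.
    Values are returned as strings; type conversion is left to the caller.
    """
    attributes = []
    rows = []
    in_data = False
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        if in_data:
            rows.append([field.strip() for field in line.split(",")])
        elif line.lower().startswith("@attribute"):
            rest = line.split(None, 1)[1]
            # The name ends at the first whitespace, '{' or '['.
            attributes.append(re.split(r"[\s{\[]", rest, maxsplit=1)[0])
        elif line.lower().startswith("@data"):
            in_data = True
    return attributes, rows


# Hypothetical sample in the assumed layout, for illustration only.
sample = """@relation westEast_sample
@attribute Train_id {t1, t2}
@attribute C0 real [1.0, 4.0]
@attribute Class {0, 1}
@inputs Train_id, C0
@outputs Class
@data
t1, 2.0, 1
t2, 3.0, 0
"""

attributes, data_rows = parse_keel(sample)
print(attributes)
print(data_rows)
```

Pairing each row with the attribute names, for example with dict(zip(attributes, row)), gives records that can then be grouped into bags by Train_id.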