KEEL - dataset     Classification data sets

This section lists the classification data sets available in the repository. Each one defines a supervised classification problem in which every example consists of nominal or numerical attributes and a nominal output attribute (its class).

Each data file has the following structure:

  • @relation: Name of the data set
  • @attribute: Description of an attribute (one for each attribute)
  • @inputs: List with the names of the input attributes
  • @output: Name of the output attribute
  • @data: Starting tag of the data

The rest of the file contains all the examples belonging to the data set, expressed in comma-separated values format.
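The structure above can be read with a few lines of Python. This is a minimal sketch, not an official KEEL reader: the function name `parse_keel` and the sample data are made up for illustration, attribute descriptions are kept as raw strings, and accepting `@outputs` alongside `@output` is an assumption about how some files spell the tag.

```python
from typing import Dict


def parse_keel(text: str) -> Dict[str, object]:
    """Split KEEL-format text into header fields and data rows."""
    parsed = {"relation": None, "attributes": [], "inputs": [],
              "output": None, "rows": []}
    in_data = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if in_data:
            # Everything after @data is one comma-separated example per line.
            parsed["rows"].append([v.strip() for v in line.split(",")])
            continue
        tag, _, rest = line.partition(" ")
        tag, rest = tag.lower(), rest.strip()
        if tag == "@relation":
            parsed["relation"] = rest
        elif tag == "@attribute":
            parsed["attributes"].append(rest)  # raw description, not interpreted
        elif tag == "@inputs":
            parsed["inputs"] = [name.strip() for name in rest.split(",")]
        elif tag in ("@output", "@outputs"):  # "@outputs" is an assumed variant
            parsed["output"] = rest
        elif tag == "@data":
            in_data = True
    return parsed


# Hypothetical two-example data set, used only to exercise the parser.
SAMPLE = """@relation toy
@attribute height real [0.0, 10.0]
@attribute colour {red, green}
@attribute class {yes, no}
@inputs height, colour
@output class
@data
1.5, red, yes
7.2, green, no
"""

if __name__ == "__main__":
    data = parse_keel(SAMPLE)
    print(data["relation"], data["inputs"], data["output"], len(data["rows"]))
```

Values are left as strings; converting them to numbers would require interpreting each attribute description, which this sketch deliberately avoids.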


We offer information about experimental studies using these data sets (result files, papers and more) in the Experimental studies in classification section of the repository.

Below you can find all the classification data sets available. For each data set, the table shows its name and its number of instances, attributes (with the number of Real/Integer/Nominal attributes in the data) and classes (the number of possible values of the output variable). The table also shows whether the data set has missing values; for data sets with missing values, it gives the number of instances without missing values, followed by the total number of instances in brackets.
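As an illustration of how the (R/I/N) counts relate to a data file, the sketch below tallies attribute types from `@attribute` descriptions. It assumes each description has the form `name real [min, max]`, `name integer [min, max]` or `name {value, ...}`; that syntax is not spelled out on this page, and the function name and sample lines are hypothetical.

```python
import re
from typing import List, Tuple

# Attribute name followed by a type keyword or the "{" that opens a nominal list.
_ATTRIBUTE = re.compile(r"\s*([^\s{]+)\s*(real|integer|\{)", re.IGNORECASE)


def count_attribute_types(descriptions: List[str], output_name: str) -> Tuple[int, int, int]:
    """Return (real, integer, nominal) counts over the input attributes."""
    real = integer = nominal = 0
    for description in descriptions:
        match = _ATTRIBUTE.match(description)
        if match is None or match.group(1) == output_name:
            continue  # skip unparsable lines and the class attribute
        kind = match.group(2).lower()
        if kind == "real":
            real += 1
        elif kind == "integer":
            integer += 1
        else:
            nominal += 1
    return real, integer, nominal


if __name__ == "__main__":
    # Hypothetical descriptions (the text after the @attribute tag).
    lines = [
        "length real [4.3, 7.9]",
        "petals integer [1, 5]",
        "colour {red, green}",
        "class {a, b}",
    ]
    r, i, n = count_attribute_types(lines, "class")
    print(f"({r}/{i}/{n})")
```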

Each data set can be downloaded from the table in KEEL format (inside a ZIP file). It is also available already partitioned, by means of a 10-fold / 5-fold cross-validation procedure and a 10-fold distribution optimally balanced stratified cross-validation (DOB-SCV). The latter validation procedure was proposed in:

J.G. Moreno-Torres, J.A. Sáez, F. Herrera, Study on the impact of partition-induced dataset shift on k-fold cross-validation, IEEE Transactions on Neural Networks and Learning Systems 23 (8) (2012) 1304-1313.
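To make the two partitioning schemes concrete, here is a rough sketch of each. Neither function is the code used to build the repository's partitions. `stratified_folds` is ordinary stratified k-fold assignment. `dob_scv_folds` follows the DOB-SCV idea as it is usually summarised (pick an unassigned example, then place it and its nearest unassigned same-class neighbours in different folds); treat that summary as an assumption and check it against the paper cited above. Function names, the Euclidean distance and the numeric-only inputs are all choices made for this example.

```python
import math
import random
from collections import defaultdict
from typing import List, Sequence


def _indices_by_class(labels: Sequence[str]) -> dict:
    groups = defaultdict(list)
    for index, label in enumerate(labels):
        groups[label].append(index)
    return groups


def stratified_folds(labels: Sequence[str], k: int = 10, seed: int = 0) -> List[List[int]]:
    """Deal each class's shuffled examples round-robin across k folds."""
    rng = random.Random(seed)
    groups = _indices_by_class(labels)
    folds: List[List[int]] = [[] for _ in range(k)]
    position = 0  # carried across classes so fold sizes stay balanced
    for label in sorted(groups):
        members = groups[label]
        rng.shuffle(members)
        for index in members:
            folds[position % k].append(index)
            position += 1
    return folds


def dob_scv_folds(points: Sequence[Sequence[float]], labels: Sequence[str],
                  k: int = 10, seed: int = 0) -> List[List[int]]:
    """Assumed DOB-SCV sketch: spread each example and its nearest
    same-class neighbours over different folds."""
    rng = random.Random(seed)
    groups = _indices_by_class(labels)
    folds: List[List[int]] = [[] for _ in range(k)]
    for label in sorted(groups):
        remaining = list(groups[label])
        while remaining:
            start = remaining.pop(rng.randrange(len(remaining)))
            # Order the unassigned same-class examples by distance to `start`.
            remaining.sort(key=lambda j: math.dist(points[start], points[j]))
            group = [start] + remaining[: k - 1]
            del remaining[: k - 1]
            for fold, index in zip(folds, group):
                fold.append(index)
    return folds


if __name__ == "__main__":
    labels = ["a"] * 50 + ["b"] * 100
    print([len(fold) for fold in stratified_folds(labels, k=10)])
    points = [[0.0], [1.0], [10.0], [11.0]]
    print(dob_scv_folds(points, ["a"] * 4, k=2))
```

Each fold serves once as the test set while the remaining folds form the training set. In the DOB-SCV sketch, a final group smaller than k always lands in the first folds, so fold sizes can differ slightly.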

For data sets with missing values, only the cleaned version (where instances with missing values are not included) is provided. A complete version including instances with missing values can be found in the description page of each data set or in the missing values section of KEEL-dataset. Finally, we provide a header file to give additional information about each data set and its attributes.
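The cleaned version described above amounts to dropping every instance that contains a missing value. The sketch below does this for already-parsed rows and formats the count the way the table does, for example `2 (3)`. The markers `?` and `<null>` are assumptions about how missing values are written in the data files, and the function name is hypothetical.

```python
from typing import List, Sequence, Tuple

# Assumed spellings of a missing value; adjust to match the actual files.
MISSING_MARKERS = ("?", "<null>")


def drop_incomplete(rows: Sequence[Sequence[str]],
                    markers: Sequence[str] = MISSING_MARKERS) -> Tuple[List[List[str]], str]:
    """Return the rows without missing values and a table-style count."""
    cleaned = [list(row) for row in rows
               if not any(value.strip() in markers for value in row)]
    if len(cleaned) == len(rows):
        summary = str(len(rows))
    else:
        # Instances without missing values, then the total in brackets.
        summary = f"{len(cleaned)} ({len(rows)})"
    return cleaned, summary


if __name__ == "__main__":
    rows = [["1.5", "red", "yes"], ["?", "green", "no"], ["7.2", "green", "no"]]
    cleaned, summary = drop_incomplete(rows)
    print(summary)
```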

By clicking on the column headers, you can sort the table by name (alphabetically), by the number of examples, attributes or classes, or by the presence of missing values. Clicking again sorts the rows in reverse order.

Name               #Attributes  (R/I/N)    #Examples        #Classes  Miss Val.  Data set  10-fcv  5-fcv  10-dobscv  Header
banana             2            (2/0/0)    5300             2         No         zip       zip     zip    zip        txt
haberman           3            (0/3/0)    306              2         No         zip       zip     zip    zip        txt
titanic            3            (3/0/0)    2201             2         No         zip       zip     zip    zip        txt
iris               4            (4/0/0)    150              3         No         zip       zip     zip    zip        txt
hayes-roth         4            (0/4/0)    160              3         No         zip       zip     zip    zip        txt
balance            4            (4/0/0)    625              3         No         zip       zip     zip    zip        txt
tae                5            (0/5/0)    151              3         No         zip       zip     zip    zip        txt
newthyroid         5            (4/1/0)    215              3         No         zip       zip     zip    zip        txt
mammographic       5            (0/5/0)    830 (961)        2         Yes        zip       zip     zip    zip        txt
phoneme            5            (5/0/0)    5404             2         No         zip       zip     zip    zip        txt
bupa               6            (1/5/0)    345              2         No         zip       zip     zip    zip        txt
monk-2             6            (0/6/0)    432              2         No         zip       zip     zip    zip        txt
car                6            (0/0/6)    1728             4         No         zip       zip     zip    zip        txt
kr-vs-k            6            (0/0/16)   28056            17        No         zip       zip     zip    zip        txt
appendicitis       7            (7/0/0)    106              2         No         zip       zip     zip    zip        txt
ecoli              7            (7/0/0)    336              8         No         zip       zip     zip    zip        txt
led7digit          7            (7/0/0)    500              10        No         zip       zip     zip    zip        txt
post-operative     8            (0/0/8)    87 (90)          3         Yes        zip       zip     zip    zip        txt
pima               8            (8/0/0)    768              2         No         zip       zip     zip    zip        txt
yeast              8            (8/0/0)    1484             10        No         zip       zip     zip    zip        txt
abalone            8            (7/0/1)    4174             28        No         zip       zip     zip    zip        txt
nursery            8            (0/0/8)    12690            5         No         zip       zip     zip    zip        txt
glass              9            (9/0/0)    214              7         No         zip       zip     zip    zip        txt
breast             9            (0/0/9)    277 (286)        2         Yes        zip       zip     zip    zip        txt
saheart            9            (5/3/1)    462              2         No         zip       zip     zip    zip        txt
wisconsin          9            (0/9/0)    683 (699)        2         Yes        zip       zip     zip    zip        txt
tic-tac-toe        9            (0/0/9)    958              2         No         zip       zip     zip    zip        txt
contraceptive      9            (0/9/0)    1473             3         No         zip       zip     zip    zip        txt
shuttle            9            (0/9/0)    58000            7         No         zip       zip     zip    zip        txt
page-blocks        10           (4/6/0)    5472             5         No         zip       zip     zip    zip        txt
magic              10           (10/0/0)   19020            2         No         zip       zip     zip    zip        txt
poker              10           (0/10/0)   1025010          10        No         zip       zip     zip    n/a        txt
flare              11           (0/0/11)   1066             6         No         zip       zip     zip    zip        txt
winequality-red    11           (11/0/0)   1599             11        No         zip       zip     zip    zip        txt
winequality-white  11           (11/0/0)   4898             11        No         zip       zip     zip    zip        txt
wine               13           (13/0/0)   178              3         No         zip       zip     zip    zip        txt
heart              13           (1/12/0)   270              2         No         zip       zip     zip    zip        txt
cleveland          13           (13/0/0)   297 (303)        5         Yes        zip       zip     zip    zip        txt
vowel              13           (10/3/0)   990              11        No         zip       zip     zip    zip        txt
marketing          13           (0/13/0)   6876 (8993)      9         Yes        zip       zip     zip    zip        txt
australian         14           (3/5/6)    690              2         No         zip       zip     zip    zip        txt
adult              14           (6/0/8)    45222 (48842)    2         Yes        zip       zip     zip    n/a        txt
crx                15           (3/3/9)    653 (690)        2         Yes        zip       zip     zip    zip        txt
zoo                16           (0/0/16)   101              7         No         zip       zip     zip    zip        txt
housevotes         16           (0/0/16)   232 (435)        2         Yes        zip       zip     zip    zip        txt
penbased           16           (0/16/0)   10992            10        No         zip       zip     zip    zip        txt
letter             16           (0/16/0)   20000            26        No         zip       zip     zip    zip        txt
lymphography       18           (0/3/15)   148              4         No         zip       zip     zip    zip        txt
vehicle            18           (0/18/0)   846              4         No         zip       zip     zip    zip        txt
hepatitis          19           (2/17/0)   80 (155)         2         Yes        zip       zip     zip    zip        txt
bands              19           (13/6/0)   365 (539)        2         Yes        zip       zip     zip    zip        txt
segment            19           (19/0/0)   2310             7         No         zip       zip     zip    zip        txt
german             20           (0/7/13)   1000             2         No         zip       zip     zip    zip        txt
ring               20           (20/0/0)   7400             2         No         zip       zip     zip    zip        txt
twonorm            20           (20/0/0)   7400             2         No         zip       zip     zip    zip        txt
thyroid            21           (6/15/0)   7200             3         No         zip       zip     zip    zip        txt
mushroom           22           (0/0/22)   5644 (8124)      2         Yes        zip       zip     zip    zip        txt
automobile         25           (15/0/10)  150 (205)        6         Yes        zip       zip     zip    zip        txt
fars               29           (5/0/24)   100968           8         No         zip       zip     zip    n/a        txt
wdbc               30           (30/0/0)   569              2         No         zip       zip     zip    zip        txt
ionosphere         33           (32/1/0)   351              2         No         zip       zip     zip    zip        txt
dermatology        34           (0/34/0)   358 (366)        6         Yes        zip       zip     zip    zip        txt
chess              36           (0/0/36)   3196             2         No         zip       zip     zip    zip        txt
satimage           36           (0/36/0)   6435             7         No         zip       zip     zip    zip        txt
texture            40           (40/0/0)   5500             11        No         zip       zip     zip    zip        txt
census             41           (1/12/28)  142521 (299284)  3         Yes        zip       zip     zip    n/a        txt
kddcup             41           (26/0/15)  494020           23        No         zip       zip     zip    n/a        txt
connect-4          42           (0/0/42)   67557            3         No         zip       zip     zip    n/a        txt
spectfheart        44           (0/44/0)   267              2         No         zip       zip     zip    zip        txt
spambase           57           (57/0/0)   4597             2         No         zip       zip     zip    zip        txt
sonar              60           (60/0/0)   208              2         No         zip       zip     zip    zip        txt
splice             60           (0/0/60)   3190             3         No         zip       zip     zip    zip        txt
optdigits          64           (0/64/0)   5620             10        No         zip       zip     zip    zip        txt
coil2000           85           (0/85/0)   9822             2         No         zip       zip     zip    zip        txt
movement_libras    90           (90/0/0)   360              15        No         zip       zip     zip    zip        txt

All data sets (SCV): zip
All data sets (DOB-SCV): zip

Collecting Data Sets

If you have some example data sets and you would like to share them with the rest of the research community by means of this page, please be so kind as to send your data to the Webmaster Team with the following information:

  • People responsible for the data (full name, affiliation, e-mail, web page, ...).
  • Training and test data sets considered, preferably in ASCII format.
  • A brief description of the application.
  • References where it is used.
  • Results obtained by the methods proposed by the authors or used for comparison.
  • Type of experiment developed.
  • Any additional useful information.

Collecting Results

If you have applied your methods to some of the problems presented here, we will be glad to show your results on this page. Please be so kind as to send the following information to the Webmaster Team:

  • Name of the application considered and type of experiment developed.
  • Results obtained by the methods proposed by the authors or used for comparison.
  • References where the results are shown.
  • Any additional useful information.

Contact Us

If you are interested in being informed of each update made to this page, or you would like to comment on it, please contact the Webmaster Team.

 Copyright 2004-2014, KEEL (Knowledge Extraction based on Evolutionary Learning)