KEEL-dataset: Standard Classification data sets

This section shows the standard classification data sets available in the repository. Each one defines a supervised classification problem in which every example is composed of nominal or numerical attributes and a nominal output attribute (its class).

Each data file has the following structure:

  • @relation: Name of the data set
  • @attribute: Description of an attribute (one for each attribute)
  • @inputs: List with the names of the input attributes
  • @output: Name of the output attribute
  • @data: Starting tag of the data

The rest of the file contains all the examples belonging to the data set, expressed in comma-separated values format.
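The structure above can be read with a few lines of Python. The following is a minimal sketch, not an official KEEL reader: the function name `parse_keel` and the sample data are illustrative, and the sketch assumes each tag sits on its own line and that values contain no embedded commas.

```python
def parse_keel(text):
    """Parse KEEL-format text into (header, rows).

    header holds the content of each tag; rows holds one list of
    string values per example found after the @data tag.
    """
    header = {"relation": None, "attributes": [], "inputs": [], "output": None}
    rows = []
    in_data = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if in_data:
            # After @data every line is one example in CSV form.
            rows.append([value.strip() for value in line.split(",")])
            continue
        tag, _, rest = line.partition(" ")
        tag, rest = tag.lower(), rest.strip()
        if tag == "@relation":
            header["relation"] = rest
        elif tag == "@attribute":
            header["attributes"].append(rest)
        elif tag == "@inputs":
            header["inputs"] = [name.strip() for name in rest.split(",")]
        elif tag == "@output":
            header["output"] = rest
        elif tag == "@data":
            in_data = True
    return header, rows


# Illustrative sample only; not a file from the repository.
SAMPLE = """@relation iris
@attribute sepalLength real [4.3, 7.9]
@attribute sepalWidth real [2.0, 4.4]
@attribute class {Iris-setosa, Iris-versicolor}
@inputs sepalLength, sepalWidth
@output class
@data
5.1, 3.5, Iris-setosa
7.0, 3.2, Iris-versicolor
"""

header, rows = parse_keel(SAMPLE)
print(header["relation"], header["inputs"], header["output"], len(rows))
```

Values are kept as strings here; converting them to numbers would rely on the types declared in the @attribute lines.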

We offer information about experimental studies using these data sets (result files, papers and more) in the Experimental studies in classification section of the repository.


Below you can find all the Standard Classification data sets available. For each data set, the table shows its name and its number of instances, attributes (detailing the number of Real/Integer/Nominal attributes in the data) and classes (number of possible values of the output variable). In addition, the table shows whether the data set has missing values (for data sets with missing values, the table shows the number of instances without missing values, with the total number of instances in brackets).
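As an illustration of how the "#Attributes (R/I/N)" figure relates to a data file, the sketch below counts the input attributes by type from the text of the @attribute lines. It assumes that numerical attributes are declared with the keyword `real` or `integer` and nominal ones with a value list in braces; the function names and the sample declarations are hypothetical.

```python
def attribute_type(declaration):
    """Classify the text after '@attribute' as 'real', 'integer' or 'nominal'."""
    if "{" in declaration:
        return "nominal"  # value list such as: class {0, 1}
    tokens = declaration.split()
    kind = tokens[1].split("[")[0].lower() if len(tokens) > 1 else ""
    if kind in ("real", "integer"):
        return kind
    raise ValueError(f"unrecognised attribute declaration: {declaration!r}")


def rin_summary(declarations, output_name):
    """Build the 'total (R/I/N)' string, leaving out the output attribute."""
    counts = {"real": 0, "integer": 0, "nominal": 0}
    for declaration in declarations:
        name = declaration.replace("{", " ").split()[0]
        if name == output_name:
            continue  # the class attribute is not an input
        counts[attribute_type(declaration)] += 1
    total = sum(counts.values())
    return f"{total} ({counts['real']}/{counts['integer']}/{counts['nominal']})"


# Hypothetical declarations, for illustration only.
declarations = [
    "sbp integer [101, 218]",
    "tobacco real [0.0, 31.2]",
    "famhist {Present, Absent}",
    "class {0, 1}",
]
print(rin_summary(declarations, "class"))  # prints: 3 (1/1/1)
```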

Each data set can be downloaded from the table in KEEL format (inside a ZIP file). Additionally, the data set can be obtained already partitioned by means of a 10-fold or 5-fold stratified cross-validation (SCV) procedure. Partitions produced by 10-fold or 5-fold distribution optimally balanced stratified cross-validation (DOB-SCV) are also available (except for data sets with a very large number of examples, since this partitioning scheme requires considerable computation). The latter validation procedure was proposed in:

J.G. Moreno-Torres, J.A. Sáez, F. Herrera, Study on the impact of partition-induced dataset shift on k-fold cross-validation, IEEE Transactions on Neural Networks and Learning Systems 23 (8) (2012) 1304-1313.
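To give an idea of how a DOB-SCV partition is built, here is a rough sketch of the procedure as we understand it: within each class, a random unassigned example goes to the first fold and its nearest unassigned neighbours of the same class go to the remaining folds, and this repeats until the class is exhausted. This is not the authors' implementation and not the code used to generate the partitions offered here. The function name `dob_scv`, the use of Euclidean distance over purely numerical attributes, and the fixed random seed are all assumptions made for illustration.

```python
import math
import random


def dob_scv(points, labels, k, seed=0):
    """Return a fold index (0..k-1) for every example.

    points: list of numeric tuples; labels: class label of each point.
    """
    rng = random.Random(seed)
    folds = [None] * len(points)
    for cls in sorted(set(labels)):
        unassigned = [i for i, label in enumerate(labels) if label == cls]
        while unassigned:
            # Pick a random example of this class and put it in the first fold.
            e0 = unassigned.pop(rng.randrange(len(unassigned)))
            folds[e0] = 0
            # Order the remaining examples of the class by distance to e0.
            unassigned.sort(key=lambda i: math.dist(points[e0], points[i]))
            # Send the nearest neighbours to the other folds, one per fold.
            for fold in range(1, k):
                if not unassigned:
                    break
                folds[unassigned.pop(0)] = fold
    return folds


# Toy data: 10 examples of class "a" and 5 of class "b" on a line.
points = [(float(i),) for i in range(15)]
labels = ["a"] * 10 + ["b"] * 5
print(dob_scv(points, labels, k=5))
```

Because neighbouring examples are spread over different folds, every fold receives examples from each region of each class, which is the property the scheme aims for. Re-sorting on every pass is what makes the cost grow quickly with the number of examples.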

For data sets with missing values, only the cleaned version (where instances with missing values are not included) is provided. A complete version including instances with missing values can be found in the description page of each data set or in the missing values section of KEEL-dataset. Finally, we provide a header file to give additional information about each data set and its attributes.
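The relation between the complete and the cleaned version can be sketched as follows. The markers treated as missing values (`<null>` and `?`) are an assumption and should be checked against the files themselves; the function names and the sample rows are illustrative.

```python
MISSING_TOKENS = {"<null>", "?"}  # assumed markers for a missing value


def drop_missing(rows):
    """Return only the rows that contain no missing-value marker."""
    return [
        row for row in rows
        if not any(value.strip() in MISSING_TOKENS for value in row)
    ]


def examples_cell(rows):
    """Format the count as in the table: 'clean (total)' or just 'total'."""
    clean = len(drop_missing(rows))
    return str(clean) if clean == len(rows) else f"{clean} ({len(rows)})"


# Illustrative rows only; not taken from any data set in the repository.
rows = [
    ["5", "1", "benign"],
    ["<null>", "3", "malignant"],
    ["8", "?", "benign"],
    ["2", "2", "benign"],
]
print(examples_cell(rows))  # prints: 2 (4)
```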

By clicking on the column headers, you can sort the table by name (alphabetically), by the number of examples, attributes or classes, or by the presence of missing values. Clicking again sorts the rows in reverse order.

Every data set in the table can be downloaded as a ZIP file (data set, 10-fcv partitions and 5-fcv partitions) together with a text header file. The DOB-SCV column shows whether the 10-dobscv and 5-dobscv partitions are also available.

Name               #Attributes (R/I/N)  #Examples        #Classes  Miss. val.  DOB-SCV
zoo                16 (0/0/16)          101              7         No          Yes
yeast              8 (8/0/0)            1484             10        No          Yes
wisconsin          9 (0/9/0)            683 (699)        2         Yes         Yes
winequality-white  11 (11/0/0)          4898             11        No          Yes
winequality-red    11 (11/0/0)          1599             11        No          Yes
wine               13 (13/0/0)          178              3         No          Yes
wdbc               30 (30/0/0)          569              2         No          Yes
vowel              13 (10/3/0)          990              11        No          Yes
vehicle            18 (0/18/0)          846              4         No          Yes
twonorm            20 (20/0/0)          7400             2         No          Yes
titanic            3 (3/0/0)            2201             2         No          Yes
tic-tac-toe        9 (0/0/9)            958              2         No          Yes
thyroid            21 (6/15/0)          7200             3         No          Yes
texture            40 (40/0/0)          5500             11        No          Yes
tae                5 (0/5/0)            151              3         No          Yes
splice             60 (0/0/60)          3190             3         No          Yes
spectfheart        44 (0/44/0)          267              2         No          Yes
spambase           57 (57/0/0)          4597             2         No          Yes
sonar              60 (60/0/0)          208              2         No          Yes
shuttle            9 (0/9/0)            58000            7         No          Yes
segment            19 (19/0/0)          2310             7         No          Yes
satimage           36 (0/36/0)          6435             7         No          Yes
saheart            9 (5/3/1)            462              2         No          Yes
ring               20 (20/0/0)          7400             2         No          Yes
post-operative     8 (0/0/8)            87 (90)          3         Yes         Yes
poker              10 (0/10/0)          1025010          10        No          No
pima               8 (8/0/0)            768              2         No          Yes
phoneme            5 (5/0/0)            5404             2         No          Yes
penbased           16 (0/16/0)          10992            10        No          Yes
page-blocks        10 (4/6/0)           5472             5         No          Yes
optdigits          64 (0/64/0)          5620             10        No          Yes
nursery            8 (0/0/8)            12690            5         No          Yes
newthyroid         5 (4/1/0)            215              3         No          Yes
mushroom           22 (0/0/22)          5644 (8124)      2         Yes         Yes
movement_libras    90 (90/0/0)          360              15        No          Yes
monk-2             6 (0/6/0)            432              2         No          Yes
marketing          13 (0/13/0)          6876 (8993)      9         Yes         Yes
mammographic       5 (0/5/0)            830 (961)        2         Yes         Yes
magic              10 (10/0/0)          19020            2         No          Yes
lymphography       18 (0/3/15)          148              4         No          Yes
letter             16 (0/16/0)          20000            26        No          Yes
led7digit          7 (7/0/0)            500              10        No          Yes
kr-vs-k            6 (0/0/6)            28056            17        No          Yes
kddcup             41 (26/0/15)         494020           23        No          No
iris               4 (4/0/0)            150              3         No          Yes
ionosphere         33 (32/1/0)          351              2         No          Yes
housevotes         16 (0/0/16)          232 (435)        2         Yes         Yes
hepatitis          19 (2/17/0)          80 (155)         2         Yes         Yes
heart              13 (1/12/0)          270              2         No          Yes
hayes-roth         4 (0/4/0)            160              3         No          Yes
haberman           3 (0/3/0)            306              2         No          Yes
glass              9 (9/0/0)            214              7         No          Yes
german             20 (0/7/13)          1000             2         No          Yes
flare              11 (0/0/11)          1066             6         No          Yes
fars               29 (5/0/24)          100968           8         No          No
ecoli              7 (7/0/0)            336              8         No          Yes
dermatology        34 (0/34/0)          358 (366)        6         Yes         Yes
crx                15 (3/3/9)           653 (690)        2         Yes         Yes
contraceptive      9 (0/9/0)            1473             3         No          Yes
connect-4          42 (0/0/42)          67557            3         No          No
coil2000           85 (0/85/0)          9822             2         No          Yes
cleveland          13 (13/0/0)          297 (303)        5         Yes         Yes
chess              36 (0/0/36)          3196             2         No          Yes
census             41 (1/12/28)         142521 (299284)  3         Yes         No
car                6 (0/0/6)            1728             4         No          Yes
bupa               6 (1/5/0)            345              2         No          Yes
breast             9 (0/0/9)            277 (286)        2         Yes         Yes
bands              19 (13/6/0)          365 (539)        2         Yes         Yes
banana             2 (2/0/0)            5300             2         No          Yes
balance            4 (4/0/0)            625              3         No          Yes
automobile         25 (15/0/10)         150 (205)        6         Yes         Yes
australian         14 (3/5/6)           690              2         No          Yes
appendicitis       7 (7/0/0)            106              2         No          Yes
adult              14 (6/0/8)           45222 (48842)    2         Yes         No
abalone            8 (7/0/1)            4174             28        No          Yes

All data sets (SCV): ZIP file
All data sets (DOB-SCV): ZIP file

Collecting Data Sets

If you have some example data sets and you would like to share them with the rest of the research community by means of this page, please be so kind as to send your data to the Webmaster Team with the following information:

  • People responsible for the data (full name, affiliation, e-mail, web page, ...).
  • Training and test data sets considered, preferably in ASCII format.
  • A brief description of the application.
  • References where it is used.
  • Results obtained by the methods proposed by the authors or used for comparison.
  • Type of experiment developed.
  • Any additional useful information.

Collecting Results

If you have applied your methods to some of the problems presented here, we will be glad to show your results on this page. Please be so kind as to send the following information to the Webmaster Team:

  • Name of the application considered and type of experiment developed.
  • Results obtained by the methods proposed by the authors or used for comparison.
  • References where the results are shown.
  • Any additional useful information.

Contact Us

If you are interested in being informed of each update made to this page, or you would like to comment on it, please contact the Webmaster Team.



 
 Copyright 2004-2018, KEEL (Knowledge Extraction based on Evolutionary Learning)