KEEL-dataset - data set description



This section describes the main characteristics of the corel5k data set and its attributes:

General information

Corel images data set
Type: Multi label
Origin: Real world
Features: 499 (Real / Integer / Nominal) (0 / 0 / 499)
Instances: 5000
Classes: 374
Missing values?: No

Additional information

This data set contains 5000 Corel images. There are 374 words in total in the vocabulary (labels) and each image has 4-5 keywords. Images are segmented using Normalized Cuts. Only regions larger than a threshold are used, and there are typically 5-10 regions for each image. Regions are then clustered into 499 blobs using k-means, which are used to describe the final image.
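The blob-based description above can be illustrated with a minimal sketch. It assumes that each of the 499 nominal features is a 0/1 flag saying whether a blob occurs in the image, and that blobs are numbered from 0; neither detail is stated on this page, and the function name `encode_image` and the example region list are hypothetical.

```python
NUM_BLOBS = 499  # number of k-means clusters stated in the description


def encode_image(region_blob_ids, num_blobs=NUM_BLOBS):
    """Return a 0/1 list with a 1 at every blob index present in the image.

    Assumption: presence/absence encoding with 0-based blob ids.
    """
    vector = [0] * num_blobs
    for blob_id in region_blob_ids:
        if not 0 <= blob_id < num_blobs:
            raise ValueError(f"blob id {blob_id} out of range")
        vector[blob_id] = 1
    return vector


# Hypothetical image with 6 regions, two of which fall into the same blob.
example = encode_image([3, 17, 17, 120, 250, 498])
print(len(example), sum(example))  # 499 features, 5 distinct blobs present
```

Under this assumption, two regions assigned to the same blob contribute a single active feature, so the number of active features is at most the number of regions in the image.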




In this section you can download some files related to the corel5k data set:

  • The complete data set, already formatted in KEEL format, can be downloaded from here.
  • A copy of the data set already partitioned by means of a 10-fold cross-validation procedure can be downloaded from here.
  • A copy of the data set already partitioned by means of a 5-fold cross-validation procedure can be downloaded from here.
  • The header file associated with this data set can be downloaded from here.
  • This is not a native data set from the KEEL project. It has been obtained from the Mulan repository. The original page where the data set can be found is: http://mulan.sourceforge.net/datasets.html.
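As a rough illustration of working with the downloaded files, the sketch below reads a small KEEL-style text and computes the average number of labels per instance. It assumes an ARFF-like layout with `@attribute`, `@inputs`, `@outputs` and `@data` lines and 0/1 label columns; the `SAMPLE` text, its attribute names and the helper functions `parse_keel` and `label_cardinality` are made up for the example and are not taken from the actual corel5k files.

```python
# Made-up miniature file; the real header may differ in names and layout.
SAMPLE = """\
@relation toy
@attribute Cluster1 {0,1}
@attribute Cluster2 {0,1}
@attribute sky {0,1}
@attribute water {0,1}
@inputs Cluster1, Cluster2
@outputs sky, water
@data
1, 0, 1, 0
0, 1, 1, 1
"""


def parse_keel(text):
    """Split KEEL-style text into attribute names, inputs, outputs and rows."""
    attributes, inputs, outputs, rows = [], [], [], []
    in_data = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if in_data:
            # Every line after @data is one comma-separated instance.
            rows.append([value.strip() for value in line.split(",")])
            continue
        keyword = line.split()[0].lower()
        rest = line[len(keyword):].strip()
        if keyword == "@attribute":
            # Keep only the attribute name, dropping the value domain.
            attributes.append(rest.split("{")[0].split()[0])
        elif keyword == "@inputs":
            inputs = [name.strip() for name in rest.split(",")]
        elif keyword == "@outputs":
            outputs = [name.strip() for name in rest.split(",")]
        elif keyword == "@data":
            in_data = True
    return {"attributes": attributes, "inputs": inputs,
            "outputs": outputs, "rows": rows}


def label_cardinality(dataset):
    """Average number of active (value "1") labels per instance."""
    columns = [dataset["attributes"].index(n) for n in dataset["outputs"]]
    active = sum(row[c] == "1" for row in dataset["rows"] for c in columns)
    return active / len(dataset["rows"])


dataset = parse_keel(SAMPLE)
print(label_cardinality(dataset))  # 1.5
```

If the real files follow the same layout, applying `label_cardinality` to the complete data set would be one way to check the "4-5 keywords per image" figure given in the description.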


 
 Copyright 2004-2018, KEEL (Knowledge Extraction based on Evolutionary Learning)