KEEL-dataset - data set description



This section describes the main characteristics of the letter data set and its attributes:

General information

Letter Recognition data set
Type             Classification
Origin           Real world
Features         16 (Real / Integer / Nominal: 0 / 16 / 0)
Instances        20000
Classes          26
Missing values?  No

Attribute description

Attribute  Domain      Attribute  Domain
X-box      [0, 15]     Y2bar      [0, 15]
Y-box      [0, 15]     Xybar      [0, 15]
Width      [0, 15]     X2ybr      [0, 15]
High       [0, 15]     Xy2br      [0, 15]
Onpix      [0, 15]     X-ege      [0, 15]
X-bar      [0, 15]     Xegvy      [0, 15]
Y-bar      [0, 15]     Y-ege      [0, 15]
X2bar      [0, 15]     Yegvx      [0, 15]
Class      {A, ..., Z}
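
The attribute domains above can be turned into a simple sanity check for a single record. The following is a minimal sketch, not part of the KEEL tooling: the names ATTRIBUTES, CLASSES and is_valid_record are hypothetical, and the attribute order (left column of the table first, then the right column) is an assumption.

```python
import string

# Attribute names as listed in the table; the order is an assumption.
ATTRIBUTES = [
    "X-box", "Y-box", "Width", "High", "Onpix", "X-bar", "Y-bar", "X2bar",
    "Y2bar", "Xybar", "X2ybr", "Xy2br", "X-ege", "Xegvy", "Y-ege", "Yegvx",
]

# The class domain {A, ..., Z}.
CLASSES = set(string.ascii_uppercase)


def is_valid_record(values, label):
    """Return True if values holds 16 integers in [0, 15] and label is A..Z."""
    return (
        len(values) == len(ATTRIBUTES)
        and all(isinstance(v, int) and 0 <= v <= 15 for v in values)
        and label in CLASSES
    )
```

A record passes only if every one of the 16 values is an integer in the stated domain and the class label is a capital letter.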

Additional information

The objective is to identify each of a large number of black-and-white rectangular pixel displays as one of the 26 capital letters in the English alphabet. The character images were based on 20 different fonts and each letter within these 20 fonts was randomly distorted to produce a file of 20,000 unique stimuli. Each stimulus was converted into 16 primitive numerical attributes (statistical moments and edge counts) which were then scaled to fit into a range of integer values from 0 through 15.
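
The description says the 16 attributes were scaled into integers from 0 through 15 but does not give the formula. As an illustration only, the sketch below assumes a plain linear (min-max) rescaling followed by rounding; the function name scale_to_0_15 is hypothetical, and the original authors may have used a different procedure.

```python
def scale_to_0_15(values):
    """Linearly rescale numbers to integers in [0, 15] (assumed min-max scheme)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant attribute carries no spread; map every value to 0.
        return [0 for _ in values]
    return [round(15 * (v - lo) / (hi - lo)) for v in values]
```

Under this assumption the smallest raw value maps to 0 and the largest to 15, for example scale_to_0_15([0, 1, 3]) gives [0, 5, 15].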




In this section you can download some files related to the letter data set:

  • The complete data set, already in KEEL format, can be downloaded from here (ZIP file).
  • A copy of the data set already partitioned by 10-fold cross-validation can be downloaded from here (ZIP file).
  • A copy of the data set already partitioned by 5-fold cross-validation can be downloaded from here (ZIP file).
  • The header file associated with this data set can be downloaded from here (text file).
  • This is not a native KEEL data set. It was obtained from the UCI Machine Learning Repository; the original page is http://archive.ics.uci.edu/ml/datasets/Letter+Recognition.
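
For readers who want to work with the downloaded files, the sketch below shows one way to read a KEEL-style file and to split its rows into k folds. Several things here are assumptions rather than facts from this page: the tiny SAMPLE text stands in for the real file and assumes the usual KEEL layout (header lines starting with "@", then "@data" followed by comma-separated rows); parse_keel and k_fold_indices are hypothetical helper names; and the page does not say how the KEEL partitions were built (for example, whether they are stratified), so plain shuffled assignment is used.

```python
import random

# Made-up miniature file in the assumed KEEL layout (not real letter data).
SAMPLE = """\
@relation letter
@attribute X-box integer [0, 15]
@attribute Class {A, B}
@inputs X-box
@outputs Class
@data
2, A
5, B
7, A
1, B
"""


def parse_keel(text):
    """Split KEEL-style text into header lines and comma-separated data rows."""
    header, rows, in_data = [], [], False
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if in_data:
            rows.append([field.strip() for field in line.split(",")])
        elif line.lower() == "@data":
            in_data = True
        else:
            header.append(line)
    return header, rows


def k_fold_indices(n, k, seed=0):
    """Shuffle range(n) and deal the indices into k near-equal folds."""
    order = list(range(n))
    random.Random(seed).shuffle(order)
    return [order[i::k] for i in range(k)]


header, rows = parse_keel(SAMPLE)
folds = k_fold_indices(len(rows), k=2)
```

Each fold serves once as the test set while the remaining folds form the training set; with k=10 or k=5 this mirrors the two partitioned copies listed above in spirit, though not necessarily row for row.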


 
 Copyright 2004-2018, KEEL (Knowledge Extraction based on Evolutionary Learning)