KEEL-dataset - data set description



This section describes the main characteristics of the page-blocks data set and its attributes:

General information

Page Blocks Classification data set
Type: Classification
Origin: Real world
Features: 10 (Real / Integer / Nominal: 4 / 6 / 0)
Instances: 5472
Classes: 5
Missing values?: No

Attribute description

Attribute   Domain
Height      [1, 804]
Lenght      [1, 553]
Area        [7, 143993]
Eccen       [0.0070, 537.0]
P_black     [0.052, 1.0]
P_and       [0.062, 1.0]
Mean_tr     [1.0, 4955.0]
Blackpix    [1, 33017]
Blackand    [7, 46133]
Wb_trans    [1, 3212]
Class       {1, 2, 3, 4, 5}
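
The domains listed above can serve as a quick sanity check when loading the data. The following is a minimal sketch, not part of the KEEL tooling: the names `DOMAINS`, `CLASSES` and `out_of_domain` are hypothetical, and it assumes each record has already been read into a dictionary keyed by attribute name. The integer/real split is inferred from how each domain is written in the table.

```python
# Domains copied from the attribute table: name -> (kind, low, high).
# Attribute names are spelled exactly as in the data set (including "Lenght").
DOMAINS = {
    "Height": (int, 1, 804),
    "Lenght": (int, 1, 553),
    "Area": (int, 7, 143993),
    "Eccen": (float, 0.0070, 537.0),
    "P_black": (float, 0.052, 1.0),
    "P_and": (float, 0.062, 1.0),
    "Mean_tr": (float, 1.0, 4955.0),
    "Blackpix": (int, 1, 33017),
    "Blackand": (int, 7, 46133),
    "Wb_trans": (int, 1, 3212),
}
CLASSES = {1, 2, 3, 4, 5}


def out_of_domain(record):
    """Return the names of the fields in `record` that violate the table."""
    bad = []
    for name, (kind, low, high) in DOMAINS.items():
        value = record.get(name)
        # Real attributes also accept whole numbers; booleans never count.
        ok = isinstance(value, kind) or (kind is float and isinstance(value, int))
        if not ok or isinstance(value, bool) or not (low <= value <= high):
            bad.append(name)
    if record.get("Class") not in CLASSES:
        bad.append("Class")
    return bad


# Illustrative record (values chosen to fall inside the domains, not taken
# from the data set).
example = {
    "Height": 5, "Lenght": 7, "Area": 35, "Eccen": 1.4, "P_black": 0.4,
    "P_and": 0.657, "Mean_tr": 2.33, "Blackpix": 14, "Blackand": 23,
    "Wb_trans": 6, "Class": 1,
}
```

Calling `out_of_domain(example)` returns an empty list when every field lies inside its domain; any missing, mistyped or out-of-range field is reported by name.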

Additional information

This database contains the blocks of a document's page layout that have been detected by a segmentation process.

The task is to determine the type of block: Text (1), Horizontal line (2), Graphic (3), Vertical line (4) or Picture (5).
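
For readable output, the numeric class labels can be mapped back to the block types named above. This is a small sketch; `BLOCK_TYPES` and `block_type` are hypothetical names introduced here for illustration.

```python
# Class label -> block type, as listed in the task description.
BLOCK_TYPES = {
    1: "Text",
    2: "Horizontal line",
    3: "Graphic",
    4: "Vertical line",
    5: "Picture",
}


def block_type(label):
    """Translate a class label (int or numeric string) into its block type."""
    return BLOCK_TYPES[int(label)]
```

A label outside 1 to 5 raises `KeyError`, which makes unexpected values visible instead of silently mapping them.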




In this section you can download some files related to the page-blocks data set:

  • The complete data set, already formatted in KEEL format, can be downloaded from here.
  • A copy of the data set already partitioned by means of a 10-fold cross-validation procedure can be downloaded from here.
  • A copy of the data set already partitioned by means of a 5-fold cross-validation procedure can be downloaded from here.
  • The header file associated with this data set can be downloaded from here.
  • This is not a native data set from the KEEL project. It has been obtained from the UCI Machine Learning Repository. The original page where the data set can be found is: http://archive.ics.uci.edu/ml/datasets/Page+Blocks+Classification.
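
Once downloaded, a KEEL-format file can be read with a few lines of code. The sketch below is not an official KEEL reader. It assumes the file consists of `@`-prefixed header lines (`@relation`, `@attribute`, `@inputs`, `@outputs`), followed by an `@data` line and comma-separated rows. The function name `parse_keel` is hypothetical, and `SAMPLE` is an abridged, illustrative file rather than an excerpt from the real data.

```python
def parse_keel(text):
    """Split KEEL-style text into (attribute names, data rows as strings)."""
    names, rows, in_data = [], [], False
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if in_data:
            rows.append([field.strip() for field in line.split(",")])
        elif line.lower().startswith("@attribute"):
            # The second token is the attribute name; a nominal domain such
            # as "{1,2}" may be glued to it, so cut at the first brace.
            names.append(line.split()[1].split("{")[0])
        elif line.lower() == "@data":
            in_data = True
        # Other header lines (@relation, @inputs, @outputs) are ignored here.
    return names, rows


# Abridged, illustrative sample: only three of the eleven attributes.
SAMPLE = """\
@relation page-blocks
@attribute Height integer [1, 804]
@attribute Eccen real [0.0070, 537.0]
@attribute Class {1, 2, 3, 4, 5}
@inputs Height, Eccen
@outputs Class
@data
5, 1.4, 1
"""

names, rows = parse_keel(SAMPLE)
```

Values are returned as strings so that the caller decides how to convert them, for example with `int` for integer attributes and `float` for real ones.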


 
 Copyright 2004-2018, KEEL (Knowledge Extraction based on Evolutionary Learning)