KEEL-dataset - data set description

This section describes the main characteristics of the house16H data set and its attributes:

General information

House-16H data set
Type: Unsupervised
Origin: Real world
Features: 17 (Real / Integer / Nominal: 10 / 7 / 0)
Instances: 22784
Missing values? No

Attribute description


Additional information

This database was designed on the basis of data provided by the US Census Bureau. The data were collected as part of the 1990 US census. These are mostly counts accumulated at different survey levels. For the purpose of this data set, the State-Place level was used. Data from all states were obtained. Most of the counts were converted into appropriate proportions.

The data sets in this family are all concerned with predicting the median house price in a region based on the demographic composition and the state of the housing market in that region. The number in a data set's name signifies its number of attributes. The letter that follows gives a very rough indication of the difficulty of the task. For Low task difficulty, the more correlated attributes were chosen, as indicated by a univariate smooth fit of each input on the target. For High task difficulty, the attributes were chosen to make the modelling more difficult, through higher variance or lower correlation of the inputs with the target.

In this section you can download some files related to the house16H data set:

  • The complete data set already formatted in KEEL format can be downloaded from here.
  • The header file associated to this data set can be downloaded from here.
  • This is not a native data set from the KEEL project. It has been obtained from the Bilkent University Function Approximation Repository. The original page where the data set can be found is:
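
Once downloaded and unzipped, a KEEL-format file can be read with a few lines of code. The following is a minimal sketch, not an official KEEL tool. It assumes the usual KEEL layout of `@relation`, `@attribute`, `@inputs`, `@outputs` and `@data` header lines followed by comma-separated rows, and it assumes every value is numeric, which matches the feature types listed above (no nominal attributes, no missing values). The function name `read_keel` and the toy attribute names `x1`, `x2` and `y` are hypothetical and are not taken from the house16H header.

```python
import io
import re

# Captures the attribute name that follows "@attribute" (assumed KEEL header syntax).
_ATTR = re.compile(r"@attribute\s+([^\s\[{]+)", re.IGNORECASE)


def read_keel(stream):
    """Parse a KEEL-style stream into (attribute names, output names, rows)."""
    attributes, outputs, rows = [], [], []
    in_data = False
    for raw in stream:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if in_data:
            # Assumption: all values are numeric, so float() is safe.
            rows.append([float(v) for v in line.split(",")])
            continue
        lowered = line.lower()
        match = _ATTR.match(line)
        if match:
            attributes.append(match.group(1))
        elif lowered.startswith("@outputs"):
            outputs = [n.strip() for n in line[len("@outputs"):].split(",")]
        elif lowered.startswith("@data"):
            in_data = True
    return attributes, outputs, rows


# Toy input with hypothetical attribute names, standing in for the real file.
SAMPLE = """@relation toy
@attribute x1 real [0.0, 1.0]
@attribute x2 integer [0, 10]
@attribute y real [0.0, 500000.0]
@inputs x1, x2
@outputs y
@data
0.25, 3, 120000.0
0.75, 7, 250000.0
"""

attributes, outputs, rows = read_keel(io.StringIO(SAMPLE))
print(attributes, outputs, len(rows))
```

To read the real data set, pass an open file handle for the unzipped `.dat` file instead of the `io.StringIO` wrapper. The name listed under `@outputs` identifies the target column.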

 Copyright 2004-2018, KEEL (Knowledge Extraction based on Evolutionary Learning)