KEEL-dataset - data set description

This section describes the main characteristics of the transactions30k data set and its attributes:

General information

Transactions (30000) data set
Features: 3 (Real / Integer / Nominal: 0 / 3 / 0)
Instances: 284284
Missing values? No

Attribute description

TransactionID: [0, 29999]
ItemID: [111, 444]
Quantity: [1, 11]

Additional information

A simulated data set modelling 30000 transactions where each instance represents the purchase of an item in a given transaction.

To create this data set, the number of purchased items in each transaction was first drawn at random from a uniform distribution over the range 1 to 19. The purchased items in each transaction were then selected from the 64 items according to an exponential distribution with the rate parameter set at 16. Their quantities were then assigned from an exponential distribution with the rate parameter set at 5. An item could not be generated twice in a transaction.
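
The generation procedure described above can be sketched roughly as follows. This is only an illustration, not the original generator, and it rests on several assumptions that the description does not confirm: the number of items per transaction is taken to be uniform on 1 to 19, the parameters 16 and 5 are read as the means of the exponential distributions, out-of-range item draws are rejected and redrawn, quantities are shifted to start at 1 and capped at 11, and the 64 item IDs are encoded as three digits in 1..4 (which happens to give IDs between 111 and 444). The function names are hypothetical.

```python
import random

N_ITEMS = 64            # number of distinct items (from the description)
MAX_ITEMS_PER_TXN = 19  # assumed upper bound of the uniform range
MAX_QUANTITY = 11       # upper bound of the Quantity attribute


def item_id(index):
    """Hypothetical encoding: index 0..63 as three base-4 digits, each shifted to 1..4."""
    return 100 * (index // 16 + 1) + 10 * (index // 4 % 4 + 1) + (index % 4 + 1)


def draw_item_index(rng):
    """Draw an item index in 0..63 from an exponential with mean 16 (assumed reading)."""
    while True:
        idx = int(rng.expovariate(1 / 16))  # expovariate takes the rate, i.e. 1 / mean
        if idx < N_ITEMS:                   # reject draws beyond the last item
            return idx


def generate_transaction(tid, rng):
    """Return a list of (TransactionID, ItemID, Quantity) rows for one transaction."""
    n_items = rng.randint(1, MAX_ITEMS_PER_TXN)  # inclusive on both ends
    chosen = set()
    while len(chosen) < n_items:                 # an item cannot appear twice
        chosen.add(draw_item_index(rng))
    rows = []
    for idx in sorted(chosen):
        # Assumed mapping: exponential with mean 5, shifted to start at 1, capped at 11
        quantity = min(MAX_QUANTITY, 1 + int(rng.expovariate(1 / 5)))
        rows.append((tid, item_id(idx), quantity))
    return rows


rng = random.Random(0)
rows = [row for tid in range(5) for row in generate_transaction(tid, rng)]
```

Running the same loop over 30000 transaction IDs would produce a data set of the same shape as the one described here, although not necessarily with the same number of instances.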

This data set was collected from the experimental study that appeared in: Tzung-Pei Hong, Chun-Hao Chen, Yeong-Chyi Lee, and Yu-Lung Wu, "Genetic-Fuzzy Data Mining With Divide-and-Conquer Strategy", IEEE Transactions on Evolutionary Computation, 12:2, 2008.

In this section you can download some files related to the transactions30k data set:

  • The complete data set, already formatted in KEEL format, can be downloaded from here.
  • The header file associated with this data set can be downloaded from here.
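
Once downloaded, the data can be read with a few lines of code. The sketch below assumes the usual KEEL layout, in which header lines start with `@` and the instances follow an `@data` line as comma-separated values; the sample lines and the function name `read_keel_rows` are made up for illustration and are not taken from the actual file.

```python
from collections import defaultdict


def read_keel_rows(lines):
    """Yield (TransactionID, ItemID, Quantity) tuples from KEEL-style lines."""
    in_data = False
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if not in_data:
            # Skip header lines until the @data marker is reached
            if line.lower() == "@data":
                in_data = True
            continue
        tid, item, quantity = (int(field) for field in line.split(","))
        yield tid, item, quantity


# Hypothetical sample in the assumed layout (not the real file contents)
sample = [
    "@relation transactions30k",
    "@attribute TransactionID integer [0, 29999]",
    "@attribute ItemID integer [111, 444]",
    "@attribute Quantity integer [1, 11]",
    "@data",
    "0, 111, 3",
    "0, 214, 1",
    "1, 111, 2",
]

# Group the instances into one basket (ItemID -> Quantity) per transaction
baskets = defaultdict(dict)
for tid, item, quantity in read_keel_rows(sample):
    baskets[tid][item] = quantity
```

For the real file, the same function could be applied to an open file object, for example `read_keel_rows(open(path))`, where `path` points to the extracted data file.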

 Copyright 2004-2018, KEEL (Knowledge Extraction based on Evolutionary Learning)