KEEL-dataset - data set description

This section describes the main characteristics of the transactions90k data set and its attributes:

General information

Transactions (90000) data set
Features: 3 (Real / Integer / Nominal: 0 / 3 / 0)
Instances: 855367
Missing values? No

Attribute description

TransactionID: [0, 89999]
ItemID: [111, 444]
Quantity: [1, 11]
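
The attribute domains above can be used to check records as they are read. The following is a minimal sketch, assuming the usual KEEL .dat layout in which header lines start with "@" and each data row is a comma-separated triple in the order TransactionID, ItemID, Quantity. The function names and the sample rows are hypothetical and are not taken from the actual file.

```python
from collections import defaultdict

# Attribute domains as listed in the attribute description.
DOMAINS = {
    "TransactionID": (0, 89999),
    "ItemID": (111, 444),
    "Quantity": (1, 11),
}


def parse_rows(lines):
    """Yield (transaction_id, item_id, quantity) tuples from data lines.

    Blank lines and header lines starting with '@' are skipped
    (assumed KEEL .dat layout). Each value is checked against its domain.
    """
    for line in lines:
        line = line.strip()
        if not line or line.startswith("@"):
            continue
        values = [int(v) for v in line.split(",")]
        if len(values) != len(DOMAINS):
            raise ValueError(f"expected {len(DOMAINS)} values, got: {line!r}")
        for (name, (low, high)), value in zip(DOMAINS.items(), values):
            if not low <= value <= high:
                raise ValueError(f"{name}={value} outside [{low}, {high}]")
        yield tuple(values)


def group_by_transaction(rows):
    """Collect rows into {transaction_id: {item_id: quantity}}."""
    baskets = defaultdict(dict)
    for transaction_id, item_id, quantity in rows:
        baskets[transaction_id][item_id] = quantity
    return dict(baskets)


# Illustrative rows only; these values are made up for the example.
sample = [
    "@data",
    "0, 111, 2",
    "0, 124, 1",
    "1, 111, 5",
]

baskets = group_by_transaction(parse_rows(sample))
print(baskets[0])  # {111: 2, 124: 1}
```

Grouping by TransactionID recovers one basket per transaction, which is the form most association rule mining code expects.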

Additional information

A simulated data set modelling 90000 transactions where each instance represents the purchase of an item in a given transaction.

To create this data set, the number of purchased items in each transaction was first drawn at random from a uniform distribution of the range of 119. The purchased items in each transaction were then selected from the 64 items according to an exponential distribution with the rate parameter set at 16. Their quantities were then assigned from an exponential distribution with the rate parameter set at 5. An item could not be generated twice in the same transaction.
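
The generation procedure can be sketched roughly as follows. This is an illustration under several assumptions, not the generator used in the original study. The upper bound of the uniform range is left as a parameter (`max_items`) and the lower bound is assumed to be 1. The stated values 16 and 5 are interpreted as the means of the exponential distributions, because a literal rate of 16 would place almost every draw below 1. Draws are truncated to integers, items are indexed 0 to 63 rather than by their ItemID codes, and quantities are shifted to start at 1 and capped at 11 to match the attribute domain. The description specifies none of these mapping details.

```python
import random


def simulate_transaction(rng, max_items, n_items=64,
                         item_mean=16.0, qty_mean=5.0, max_qty=11):
    """Return {item_index: quantity} for one simulated transaction.

    All mapping choices here (integer truncation, rejection of
    out-of-range or repeated items, quantity capping) are assumptions.
    """
    if not 1 <= max_items <= n_items:
        raise ValueError("max_items must be between 1 and n_items")

    # Number of distinct items in the transaction: uniform on 1..max_items.
    size = rng.randint(1, max_items)

    basket = {}
    while len(basket) < size:
        # expovariate takes a rate, so pass 1/mean.
        item = int(rng.expovariate(1.0 / item_mean))
        # Reject indices beyond the catalogue and items already chosen,
        # so that no item appears twice in a transaction.
        if item >= n_items or item in basket:
            continue
        quantity = min(max_qty, 1 + int(rng.expovariate(1.0 / qty_mean)))
        basket[item] = quantity
    return basket


rng = random.Random(0)  # fixed seed so the example is repeatable
transactions = [simulate_transaction(rng, max_items=10) for _ in range(5)]
for transaction_id, basket in enumerate(transactions):
    for item, quantity in sorted(basket.items()):
        print(transaction_id, item, quantity)
```

Each printed line corresponds to one instance of the data set: a transaction identifier, an item, and the purchased quantity.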

This data set was taken from the experimental study that appeared in (Tzung-Pei Hong, Chun-Hao Chen, Yeong-Chyi Lee, and Yu-Lung Wu, Genetic-Fuzzy Data Mining With Divide-and-Conquer Strategy, IEEE Transactions on Evolutionary Computation, 12:2, 2008).

