KEEL-dataset - data set description



This section describes the main characteristics of the transactions50k data set and its attributes:

General information

Transactions (50000) data set

Type                         Unsupervised
Origin                       Laboratory
Features                     3
(Real / Integer / Nominal)   (0 / 3 / 0)
Instances                    475649
Missing values?              No

Attribute description

Attribute        Domain
TransactionID    [0, 49999]
ItemID           [111, 444]
Quantity         [1, 11]

Additional information

A simulated data set modelling 50000 transactions where each instance represents the purchase of an item in a given transaction.

To create this data set, the number of purchased items in each transaction was drawn at random from a uniform distribution over the range 1–19. The purchased items in each transaction were then selected from the 64 items according to an exponential distribution with the rate parameter set at 16. Their quantities were then assigned from an exponential distribution with the rate parameter set at 5. No item could appear more than once in a transaction.
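
The generation procedure described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the generator used in the original study: the function name and its parameters are hypothetical, the values 16 and 5 are treated here as scale (mean) parameters because the description does not say how they map onto item indices or quantities, items are numbered 0 to 63 rather than with the ItemID codes, and quantities are shifted and capped to stay within [1, 11].

```python
import math
import random


def simulate_transactions(n_transactions=50000, n_items=64, max_items=19,
                          item_scale=16.0, qty_scale=5.0, max_qty=11, seed=0):
    """Return a list of (transaction_id, item_index, quantity) rows.

    All parameter interpretations are assumptions made for illustration.
    """
    rng = random.Random(seed)
    # Assumption: item k is picked with weight exp(-k / item_scale),
    # i.e. an exponentially decaying preference over the item indices.
    weights = [math.exp(-k / item_scale) for k in range(n_items)]
    rows = []
    for tid in range(n_transactions):
        # Number of distinct items in this transaction, uniform on 1..max_items.
        size = rng.randint(1, max_items)
        chosen = set()
        # Redraw until enough distinct items are found, so that no item
        # appears twice in the same transaction.
        while len(chosen) < size:
            chosen.add(rng.choices(range(n_items), weights=weights)[0])
        for item in sorted(chosen):
            # Assumption: quantity is 1 plus the integer part of an
            # exponential draw with mean qty_scale, capped at max_qty.
            qty = min(max_qty, 1 + int(rng.expovariate(1.0 / qty_scale)))
            rows.append((tid, item, qty))
    return rows
```

Calling simulate_transactions(n_transactions=1000) returns one row per purchased item. Since the transaction size is uniform on 1 to 19, the sketch yields 10 rows per transaction on average, so it should not be expected to reproduce the exact instance count reported above.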

This data set was taken from the experimental study that appeared in (Tzung-Pei Hong, Chun-Hao Chen, Yeong-Chyi Lee, and Yu-Lung Wu, Genetic-Fuzzy Data Mining With Divide-and-Conquer Strategy, IEEE Transactions on Evolutionary Computation, 12:2, 2008).




In this section you can download some files related to the transactions50k data set:

  • The complete data set, already formatted in KEEL format, can be downloaded from here (ZIP file).
  • The header file associated with this data set can be downloaded from here (text file).
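
Once downloaded, a KEEL-format file can be read with a few lines of code. The sketch below assumes the usual KEEL layout, in which header lines start with "@" and comma-separated data rows follow an "@data" line. The function name parse_keel is hypothetical, and the sample text is invented for illustration from the attribute table above; it is not copied from the actual file.

```python
def parse_keel(text):
    """Split KEEL-style text into header lines and integer data rows."""
    header, rows = [], []
    in_data = False
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.lower().startswith("@data"):
            in_data = True  # everything after this marker is data
        elif line.startswith("@"):
            header.append(line)  # e.g. @relation, @attribute
        elif in_data:
            # All three attributes of this data set are integers.
            rows.append(tuple(int(v) for v in line.split(",")))
    return header, rows


# Invented sample, for illustration only (not real rows from the data set).
sample = """@relation transactions50k
@attribute TransactionID integer [0, 49999]
@attribute ItemID integer [111, 444]
@attribute Quantity integer [1, 11]
@data
0, 111, 3
0, 214, 1
"""

header, rows = parse_keel(sample)
```

Each parsed row is a (TransactionID, ItemID, Quantity) tuple, so rows that share a TransactionID can be grouped to rebuild the individual transactions.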


 
 Copyright 2004-2018, KEEL (Knowledge Extraction based on Evolutionary Learning)