This section describes the main characteristics of the kddcup data set and its attributes:
General information
KDD Cup 1999 data set |
Type | Classification | Origin | Real world |
Features | 41 | (Real / Integer / Nominal) | (26 / 0 / 15) |
Instances | 494020 |
Classes | 23 |
Missing values? | No |
Attribute description
Attribute | Domain | Attribute | Domain | Attribute | Domain |
Atr-0 | [0.0, 58329.0] | Atr-14 | {0, 1, 2} | Atr-28 | [0.0, 1.0] |
Atr-1 | {icmp, tcp, udp} | Atr-15 | [0.0, 993.0] | Atr-29 | [0.0, 1.0] |
Atr-2 | {auth, ..., Z39_50} | Atr-16 | [0.0, 28.0] | Atr-30 | [0.0, 255.0] |
Atr-3 | {OTH, ... , SH} | Atr-17 | {0, 1, 2} | Atr-31 | [0.0, 255.0] |
Atr-4 | [0.0, 6.9337564E8] | Atr-18 | {0, 1, 2, 3, 4, 6, 8} | Atr-32 | [0.0, 1.0] |
Atr-5 | [0.0, 5155468.0] | Atr-19 | {0} | Atr-33 | [0.0, 1.0] |
Atr-6 | {0, 1} | Atr-20 | {0} | Atr-34 | [0.0, 1.0] |
Atr-7 | {0, 1, 3} | Atr-21 | {0, 1} | Atr-35 | [0.0, 1.0] |
Atr-8 | {0, 1, 2, 3} | Atr-22 | [0.0, 511.0] | Atr-36 | [0.0, 1.0] |
Atr-9 | [0.0, 30.0] | Atr-23 | [0.0, 511.0] | Atr-37 | [0.0, 1.0] |
Atr-10 | {0, 1, 2, 3, 4, 5} | Atr-24 | [0.0, 1.0] | Atr-38 | [0.0, 1.0] |
Atr-11 | {0, 1} | Atr-25 | [0.0, 1.0] | Atr-39 | [0.0, 1.0] |
Atr-12 | [0.0, 884.0] | Atr-26 | [0.0, 1.0] | Atr-40 | [0.0, 1.0] |
Atr-13 | {0, 1} | Atr-27 | [0.0, 1.0] | Class | {back., ..., warezmaster.} |
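The table uses two notations: square brackets for a numeric range and curly braces for a set of nominal values. As a minimal sketch of how these domain strings could be read programmatically, the hypothetical helper `parse_domain` below (not part of KEEL) classifies a domain string copied from the table:

```python
def parse_domain(text):
    """Classify a domain string from the attribute table.

    Returns ("real", low, high) for "[low, high]" ranges and
    ("nominal", [values]) for "{a, b, c}" sets.
    """
    text = text.strip()
    if text.startswith("[") and text.endswith("]"):
        # Numeric range: two comma-separated bounds.
        low, high = (float(part) for part in text[1:-1].split(","))
        return ("real", low, high)
    if text.startswith("{") and text.endswith("}"):
        # Nominal set: keep the values as strings, in table order.
        values = [value.strip() for value in text[1:-1].split(",")]
        return ("nominal", values)
    raise ValueError(f"unrecognized domain: {text!r}")


atr_0 = parse_domain("[0.0, 58329.0]")    # domain of Atr-0
atr_1 = parse_domain("{icmp, tcp, udp}")  # domain of Atr-1
```

Elided sets such as `{auth, ..., Z39_50}` would be returned with the literal `...` as one of the values, so the full value list has to come from the header file rather than from this table.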
Additional information
This is a 10% subset of the data set used for the Third International Knowledge Discovery and Data Mining Tools Competition, which was held in conjunction with KDD-99, the Fifth International Conference on Knowledge Discovery and Data Mining. The competition task was to build a network intrusion detector: a predictive model capable of distinguishing between bad connections, called intrusions or attacks, and good, normal connections, using both nominal and continuous data.
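The distinction between attacks and normal connections can be sketched as a mapping from the 23 class labels to two groups. This is only an illustration: it assumes that normal connections carry the label `normal.` (with the trailing period seen in labels such as `back.`), which the attribute table above elides, and the function name `is_attack` is hypothetical.

```python
def is_attack(label):
    """Return True for any class label other than the normal class.

    Assumption: the normal class is labeled "normal." and every other
    label names an attack type.
    """
    # Labels in this data set end with a period, e.g. "back.".
    return label.strip().rstrip(".") != "normal"


# Illustrative labels only, not rows taken from the data set.
labels = ["normal.", "back.", "warezmaster.", "normal."]
attack_count = sum(is_attack(label) for label in labels)
```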
In this section you can download some files related to the kddcup data set:
- The complete data set, already formatted in KEEL format, can be downloaded from here.
- A copy of the data set already partitioned by means of a 10-fold cross-validation procedure can be downloaded from here.
- A copy of the data set already partitioned by means of a 5-fold cross-validation procedure can be downloaded from here.
- The header file associated with this data set can be downloaded from here.
- This is not a native data set from the KEEL project. It has been obtained from the UCI Machine Learning Repository. The original page where the data set can be found is: http://archive.ics.uci.edu/ml/datasets/KDD+Cup+1999+Data.
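Once downloaded, a KEEL-formatted file can be read with a few lines of code. The sketch below assumes the usual KEEL layout, in which header lines start with `@` (`@relation`, `@attribute`, `@inputs`, `@outputs`), a `@data` line follows, and each remaining line is one comma-separated instance. The function name `read_keel` and the sample text are hypothetical; the sample rows are invented for illustration and are not taken from the data set.

```python
import re


def read_keel(lines):
    """Split KEEL-style text into attribute declarations and data rows."""
    attributes = []  # (name, declaration) pairs in file order
    rows = []        # one list of string fields per instance
    in_data = False
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if in_data:
            rows.append([field.strip() for field in line.split(",")])
            continue
        # The attribute name ends at whitespace or at the start of a domain.
        match = re.match(r"@attribute\s+([^\s{\[]+)\s*(.*)", line, re.IGNORECASE)
        if match:
            attributes.append((match.group(1), match.group(2)))
        elif line.lower().startswith("@data"):
            in_data = True
    return attributes, rows


# Invented toy sample in the assumed layout.
sample = """@relation kddcup
@attribute Atr-0 real [0.0, 58329.0]
@attribute Atr-1 {icmp, tcp, udp}
@attribute Class {back., normal.}
@inputs Atr-0, Atr-1
@outputs Class
@data
0.0, tcp, normal.
0.0, icmp, back.
"""
attributes, rows = read_keel(sample.splitlines())
```

To read a downloaded file instead of the sample, the same function can be given an open file object, since it only iterates over lines. All values are returned as strings; converting the real-valued attributes to numbers is left to the caller.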