This section describes the main characteristics of the mammographic data set and its attributes:
General information
| Mammographic Mass data set |
| Type | Classification | Origin | Real world |
| Features | 5 | (Real / Integer / Nominal) | (0 / 5 / 0) |
| Classes | 2 | Missing values? | Yes |
| Total instances | 961 | Instances without missing values | 830 |
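The figures above imply that 961 - 830 = 131 instances contain at least one missing value. A minimal sketch of how such instances could be separated from the complete ones is given below. It assumes comma-separated records in which a missing value is written as `?`; that marker, the helper name `split_complete` and the sample rows are illustrative assumptions, and the rows are invented rather than taken from the data set.

```python
import csv

# Assumption: a missing value is encoded as "?" in the raw file.
MISSING = "?"

# Invented rows in the order BI-RADS, Age, Shape, Margin, Density, Severity.
SAMPLE_LINES = [
    "4,50,2,1,3,0",
    "5,71,4,?,3,1",
    "3,36,1,1,?,0",
    "5,62,4,5,3,1",
    "4,45,1,1,2,0",
]


def split_complete(rows):
    """Split rows into those without and those with a missing value."""
    complete = [row for row in rows if MISSING not in row]
    incomplete = [row for row in rows if MISSING in row]
    return complete, incomplete


rows = list(csv.reader(SAMPLE_LINES))
complete, incomplete = split_complete(rows)
print(len(rows), len(complete), len(incomplete))
```

Applied to the full data set, the same filter would be expected to leave the 830 complete instances reported in the table, provided the missing-value marker matches the file actually downloaded.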
Attribute description
| Attribute | Domain |
| BI-RADS | [0,6] |
| Age | [18,96] |
| Shape | [1,4] |
| Margin | [1,5] |
| Density | [1,4] |
| Severity | {0, 1} |
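The domains in the table can be turned into a simple sanity check for loaded records. The sketch below is one possible way to do it, not part of the KEEL tooling: the dictionary `DOMAINS`, the function `out_of_domain` and the convention of representing a missing value as `None` are all assumptions made for illustration.

```python
# Inclusive integer domains copied from the attribute table.
DOMAINS = {
    "BI-RADS": range(0, 7),    # [0,6]
    "Age": range(18, 97),      # [18,96]
    "Shape": range(1, 5),      # [1,4]
    "Margin": range(1, 6),     # [1,5]
    "Density": range(1, 5),    # [1,4]
    "Severity": range(0, 2),   # {0, 1}
}


def out_of_domain(record):
    """Return the names of attributes whose value lies outside its domain.

    Assumption: a missing value is represented as None and is skipped.
    """
    return [
        name
        for name, value in record.items()
        if value is not None and value not in DOMAINS[name]
    ]


# Invented records, used only to exercise the check.
valid = {"BI-RADS": 4, "Age": 50, "Shape": 2, "Margin": 1,
         "Density": 3, "Severity": 0}
suspect = {"BI-RADS": 9, "Age": 50, "Shape": None, "Margin": 1,
           "Density": 3, "Severity": 0}

print(out_of_domain(valid))
print(out_of_domain(suspect))
```

A check like this is useful right after parsing, since a value outside the listed domain usually points to a parsing or data-entry problem.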
Additional information
This data set can be used to predict the severity (benign or malignant) of a mammographic mass lesion from BI-RADS attributes and the patient's age. It contains a BI-RADS assessment, the patient's age and three BI-RADS attributes (shape, margin and density), together with the ground truth (the severity field, which is the target attribute). The data was collected at the Institute of Radiology of the University Erlangen-Nuremberg between 2003 and 2006.
In this section you can download some files related to the mammographic data set:
- The complete data set already formatted in KEEL format can be downloaded from here.
- A copy of the data set already partitioned by means of a 10-fold cross-validation procedure can be downloaded from here.
- A copy of the data set already partitioned by means of a 5-fold cross-validation procedure can be downloaded from here.
- The header file associated with this data set can be downloaded from here.
- This is not a native data set from the KEEL project. It has been obtained from the UCI Machine Learning Repository. The original page where the data set can be found is: http://archive.ics.uci.edu/ml/datasets/Mammographic+Mass.
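For readers who want to build their own partitions rather than download the prepared ones, the following is a minimal sketch of a plain k-fold split over instance indices. It is not the procedure used to produce the KEEL files: this page does not say whether those partitions are stratified, how they were shuffled, or whether instances with missing values were kept, so the function name `k_fold_indices`, the fixed seed and the use of all 961 instances are assumptions.

```python
import random


def k_fold_indices(n, k, seed=0):
    """Yield (train, test) index lists for a shuffled k-fold split.

    Every index in range(n) appears in exactly one test fold, and fold
    sizes differ by at most one.
    """
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices round-robin into k folds.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


splits = list(k_fold_indices(961, 10))
print(sorted(len(test) for _, test in splits))
```

With 961 instances and 10 folds, one test fold holds 97 instances and the other nine hold 96 each. Passing `k=5` gives the 5-fold variant.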