This section describes the main characteristics of the mutagenesis-bonds data set and its attributes:
General information
Mutagenesis-Bonds data set
Type | Multi instance |
Origin | Real world |
Features | 17 |
(Real / Integer / Nominal) | (16 / 0 / 1) |
Instances | 3995 |
Classes | 2 |
Missing values? | No |
Attribute description
Attribute | Domain | Attribute | Domain |
Bonds-bag-id | {3933, ... , 3344} | Elementtype2=cl | [0.0, 1.0] |
Bondtype | [1.0, 7.0] | Elementtype2=f | [0.0, 1.0] |
Charge1 | [-0.533, 1.002] | Elementtype2=h | [0.0, 1.0] |
Charge2 | [-0.781, 1.002] | Elementtype2=i | [0.0, 1.0] |
Elementtype1=br | [0.0, 1.0] | Elementtype2=n | [0.0, 1.0] |
Elementtype1=c | [0.0, 1.0] | Elementtype2=o | [0.0, 1.0] |
Elementtype1=h | [0.0, 1.0] | Quanta1 | [1.0, 232.0] |
Elementtype1=n | [0.0, 1.0] | Quanta2 | [3.0, 232.0] |
Elementtype2=c | [0.0, 1.0] | Class | {0, 1} |
Additional information
The task is to predict the mutagenicity of molecules, that is, to determine whether a molecule is mutagenic or non-mutagenic. The mutagenesis data set consists of 188 molecules, of which 125 are mutagenic (active) and 63 are non-mutagenic (inactive). From a multi-instance learning (MIL) perspective, several transformations of the data are considered; specifically, mutagenesis-bonds represents all atom-bond tuples of a compound molecule as a bag. Each bag therefore corresponds to one molecule, and each instance in it to one atom-bond tuple.
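The bag representation can be illustrated with a minimal Python sketch. It assumes that every row carries a bag identifier (the Bonds-bag-id attribute), a feature vector, and the class label, and that all rows of a bag share the same label. The function name `group_into_bags` and the toy rows are hypothetical and are not part of the data set distribution.

```python
from collections import defaultdict


def group_into_bags(rows):
    """Group (bag_id, features, label) rows into bags keyed by bag id.

    Assumption: all instances of a bag carry the same class label,
    so a mismatch is treated as an error.
    """
    bags = defaultdict(lambda: {"instances": [], "label": None})
    for bag_id, features, label in rows:
        bag = bags[bag_id]
        bag["instances"].append(features)
        if bag["label"] is None:
            bag["label"] = label
        elif bag["label"] != label:
            raise ValueError(f"bag {bag_id} has conflicting labels")
    return dict(bags)


# Hypothetical toy rows: (bag id, [bondtype, charge1, charge2], class).
rows = [
    (1, [1.0, -0.1, 0.2], 1),
    (1, [7.0, 0.0, -0.3], 1),
    (2, [1.0, 0.1, 0.1], 0),
]
bags = group_into_bags(rows)
```

With this layout a multi-instance learner receives one labelled bag per molecule rather than one labelled row per atom-bond tuple.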
In this section you can download some files related to the mutagenesis-bonds data set:
- The complete data set, already formatted in KEEL format, can be downloaded from here.
- A copy of the data set, already partitioned by means of a 10-fold cross-validation procedure, can be downloaded from here.
- The header file associated with this data set can be downloaded from here.
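Once downloaded, a KEEL-format file can be read with a few lines of Python. The sketch below assumes the usual KEEL layout: header lines beginning with `@` (such as `@relation` and `@attribute`), followed by an `@data` line and comma-separated rows. The function name `parse_keel` and the sample text are hypothetical; the sample is not taken from the actual mutagenesis-bonds file.

```python
def parse_keel(text):
    """Split KEEL-style text into header lines and raw data rows.

    Assumption: header lines start with '@' and the rows after '@data'
    are comma-separated. Values are returned as strings.
    """
    header, rows, in_data = [], [], False
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if in_data:
            rows.append([value.strip() for value in line.split(",")])
        elif line.lower().startswith("@data"):
            in_data = True
        elif line.startswith("@"):
            header.append(line)
    return header, rows


# Hypothetical sample, not the real file contents.
sample = """@relation toy
@attribute bag_id integer [1, 2]
@attribute charge real [-1.0, 1.0]
@attribute class {0, 1}
@data
1, -0.1, 1
1, 0.2, 1
2, 0.3, 0
"""
header, rows = parse_keel(sample)
```

The rows are kept as strings here; converting them to numbers would depend on the attribute types declared in the header.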