This section describes the main characteristics of the mutagenesis-chains data set and its attributes:
General information
Mutagenesis-Chains data set |
Type | Multi instance | Origin | Real world |
Features | 25 | (Real / Integer / Nominal) | (24 / 0 / 1) |
Instances | 5349 |
Classes | 2 |
Missing values? | No |
Attribute description
Attribute | Domain | Attribute | Domain | Attribute | Domain |
Chains-bag-id | {5262, ... , 4835} | E1=f | [0.0, 1.0] | E3=c | [0.0, 1.0] |
Bond1 | [1.0, 7.0] | E1=h | [0.0, 1.0] | E3=f | [0.0, 1.0] |
Bond2 | [1.0, 7.0] | E1=i | [0.0, 1.0] | E3=h | [0.0, 1.0] |
Charge1 | [-0.781, 1.002] | E1=n | [0.0, 1.0] | E3=n | [0.0, 1.0] |
Charge2 | [-0.781, 1.002] | E1=o | [0.0, 1.0] | E3=o | [0.0, 1.0] |
Charge3 | [-0.755, 0.597] | E2=c | [0.0, 1.0] | Q1 | [1.0, 232.0] |
E1=br | [0.0, 1.0] | E2=n | [0.0, 1.0] | Q2 | [10.0, 232.0] |
E1=c | [0.0, 1.0] | E2=o | [0.0, 1.0] | Q3 | [1.0, 195.0] |
E1=cl | [0.0, 1.0] | Class | {0, 1} |
Additional information
The problem consists of predicting the mutagenicity of molecules, that is, determining whether a molecule is mutagenic or non-mutagenic. The mutagenesis data set consists of 188 molecules, of which 125 are mutagenic (active) and 63 are non-mutagenic (inactive). From a multi-instance learning (MIL) perspective, several transformations of the data are considered; in particular, mutagenesis-chains represents each compound molecule as a bag whose instances are all adjacent pairs of bonds in that molecule.
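The bag structure described above can be illustrated with a minimal Python sketch. It assumes each instance has already been read as a (bag id, feature list, class) triple and that all instances of a bag share the bag's class label; the function name `group_into_bags` and the toy rows are hypothetical and are not taken from the data set.

```python
from collections import defaultdict


def group_into_bags(rows):
    """Group (bag_id, features, label) instances into bags keyed by bag_id."""
    bags = defaultdict(lambda: {"instances": [], "label": None})
    for bag_id, features, label in rows:
        bag = bags[bag_id]
        bag["instances"].append(features)
        # Assumption: every instance in a bag carries the bag's class label.
        if bag["label"] is None:
            bag["label"] = label
        elif bag["label"] != label:
            raise ValueError(f"bag {bag_id} has conflicting labels")
    return dict(bags)


# Hypothetical toy rows (bag id, a few feature values, class); not real data.
rows = [
    ("m1", [1.0, 7.0, -0.1], 1),
    ("m1", [7.0, 7.0, 0.2], 1),
    ("m2", [1.0, 1.0, 0.0], 0),
]
bags = group_into_bags(rows)
```

Here each key of `bags` plays the role of a Chains-bag-id value, and a multi-instance classifier would predict one class per bag rather than one per instance.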
In this section you can download some files related to the mutagenesis-chains data set:
- The complete data set, already formatted in KEEL format, can be downloaded from here.
- A copy of the data set already partitioned by means of a 10-fold cross-validation procedure can be downloaded from here.
- The header file associated with this data set can be downloaded from here.