This section describes the main characteristics of the musk1 data set and its attributes:
General information
Musk 1 data set |
Type | Multi instance | Origin | Real world |
Features | 167 | (Real / Integer / Nominal) | (166 / 0 / 1) |
Instances | 476 |
Classes | 2 |
Missing values? | No |
Attribute description
Attribute | Domain | Attribute | Domain | Attribute | Domain |
Molecule_name | {MUSK-jf59, ... ,MUSK-316} | F56 | [-282.0, 199.0] | F112 | [-268.0, 211.0] |
F1 | [-9.0, 130.0] | F57 | [-168.0, 231.0] | F113 | [-223.0, 188.0] |
F2 | [-199.0, 98.0] | F58 | [-178.0, 111.0] | F114 | [-206.0, 160.0] |
F3 | [-166.0, 83.0] | F59 | [-173.0, 181.0] | F115 | [-251.0, 201.0] |
F4 | [-115.0, 157.0] | F60 | [-254.0, 178.0] | F116 | [-252.0, 187.0] |
F5 | [-117.0, 238.0] | F61 | [-102.0, 243.0] | F117 | [-104.0, 315.0] |
F6 | [-184.0, 200.0] | F62 | [-195.0, 173.0] | F118 | [-209.0, 137.0] |
F7 | [-170.0, 214.0] | F63 | [-69.0, 110.0] | F119 | [-193.0, 215.0] |
F8 | [-231.0, 188.0] | F64 | [-197.0, 220.0] | F120 | [-201.0, 152.0] |
F9 | [-242.0, 135.0] | F65 | [-159.0, 108.0] | F121 | [-123.0, 254.0] |
F10 | [-284.0, 218.0] | F66 | [6.0, 197.0] | F122 | [-118.0, 270.0] |
F11 | [-327.0, 104.0] | F67 | [-166.0, 262.0] | F123 | [-131.0, 279.0] |
F12 | [-326.0, 196.0] | F68 | [-196.0, 157.0] | F124 | [-121.0, 200.0] |
F13 | [-302.0, 117.0] | F69 | [-134.0, 205.0] | F125 | [-145.0, 149.0] |
F14 | [-348.0, 72.0] | F70 | [-192.0, 168.0] | F126 | [-44.0, 137.0] |
F15 | [-292.0, 77.0] | F71 | [-174.0, 83.0] | F127 | [-285.0, 123.0] |
F16 | [-318.0, 13.0] | F72 | [-150.0, 98.0] | F128 | [-219.0, 238.0] |
F17 | [-225.0, 125.0] | F73 | [-322.0, 133.0] | F129 | [-305.0, 159.0] |
F18 | [-312.0, 139.0] | F74 | [-338.0, 112.0] | F130 | [-188.0, 152.0] |
F19 | [-291.0, 214.0] | F75 | [-272.0, 132.0] | F131 | [-94.0, 140.0] |
F20 | [-243.0, 273.0] | F76 | [-191.0, -65.0] | F132 | [-140.0, 170.0] |
F21 | [-298.0, 196.0] | F77 | [-255.0, 154.0] | F133 | [-322.0, 176.0] |
F22 | [-187.0, 147.0] | F78 | [-318.0, 158.0] | F134 | [-341.0, 187.0] |
F23 | [-255.0, 187.0] | F79 | [-310.0, 185.0] | F135 | [-342.0, 111.0] |
F24 | [-70.0, 312.0] | F80 | [-197.0, 233.0] | F136 | [-197.0, 137.0] |
F25 | [-102.0, 254.0] | F81 | [-258.0, 183.0] | F137 | [-201.0, 186.0] |
F26 | [-243.0, 162.0] | F82 | [-176.0, 192.0] | F138 | [-200.0, 127.0] |
F27 | [-205.0, 157.0] | F83 | [-301.0, 208.0] | F139 | [-246.0, 138.0] |
F28 | [-164.0, 147.0] | F84 | [-98.0, 251.0] | F140 | [-286.0, 207.0] |
F29 | [-140.0, 181.0] | F85 | [-222.0, 183.0] | F141 | [-294.0, 238.0] |
F30 | [-154.0, 258.0] | F86 | [-200.0, 189.0] | F142 | [-186.0, 177.0] |
F31 | [-118.0, 190.0] | F87 | [-208.0, 113.0] | F143 | [-158.0, 213.0] |
F32 | [-149.0, 143.0] | F88 | [-213.0, 105.0] | F144 | [-173.0, 158.0] |
F33 | [-124.0, 219.0] | F89 | [-111.0, 163.0] | F145 | [-179.0, 178.0] |
F34 | [-278.0, 163.0] | F90 | [-159.0, 272.0] | F146 | [-107.0, 240.0] |
F35 | [-156.0, 227.0] | F91 | [-202.0, 168.0] | F147 | [-131.0, -3.0] |
F36 | [16.0, 291.0] | F92 | [-20.0, 167.0] | F148 | [-201.0, 134.0] |
F37 | [-175.0, 224.0] | F93 | [-121.0, 233.0] | F149 | [-215.0, 121.0] |
F38 | [-192.0, 165.0] | F94 | [-326.0, 178.0] | F150 | [-191.0, 174.0] |
F39 | [-145.0, 105.0] | F95 | [-77.0, 149.0] | F151 | [-126.0, 243.0] |
F40 | [-181.0, 189.0] | F96 | [-70.0, 336.0] | F152 | [-125.0, 154.0] |
F41 | [-190.0, 82.0] | F97 | [-191.0, 171.0] | F153 | [-112.0, 207.0] |
F42 | [-151.0, 201.0] | F98 | [-191.0, 194.0] | F154 | [-171.0, 84.0] |
F43 | [-291.0, 168.0] | F99 | [-155.0, 48.0] | F155 | [-145.0, 161.0] |
F44 | [-343.0, 81.0] | F100 | [-155.0, 235.0] | F156 | [-197.0, 113.0] |
F45 | [-315.0, 65.0] | F101 | [-210.0, 88.0] | F157 | [-252.0, 141.0] |
F46 | [-338.0, 115.0] | F102 | [-37.0, 265.0] | F158 | [-324.0, 72.0] |
F47 | [-157.0, 110.0] | F103 | [-296.0, 96.0] | F159 | [-217.0, 173.0] |
F48 | [-294.0, 99.0] | F104 | [-320.0, 95.0] | F160 | [-135.0, 185.0] |
F49 | [-269.0, 180.0] | F105 | [-319.0, 72.0] | F161 | [-126.0, 253.0] |
F50 | [-273.0, 214.0] | F106 | [-287.0, 179.0] | F162 | [-78.0, 291.0] |
F51 | [-329.0, 218.0] | F107 | [-194.0, 144.0] | F163 | [35.0, 302.0] |
F52 | [-205.0, 148.0] | F108 | [-293.0, 52.0] | F164 | [-132.0, 24.0] |
F53 | [-206.0, 159.0] | F109 | [-250.0, 183.0] | F165 | [-258.0, 82.0] |
F54 | [-143.0, 286.0] | F110 | [-295.0, 53.0] | F166 | [-72.0, 235.0] |
F55 | [-113.0, 259.0] | F111 | [-252.0, 117.0] | Class | {0, 1} |
Additional information
The task is to determine whether a drug molecule will bind strongly to a target protein. Each molecule may adopt a wide range of shapes, or conformations. A positive molecule has at least one conformation that binds well (although it is not known which one), whereas a negative molecule has no conformation that binds well. This problem maps naturally onto the multiple-instance learning (MIL) setting: each molecule is a bag, and the conformations it can adopt are the instances in that bag.
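The bag structure described above can be illustrated with a minimal Python sketch. It assumes each row has already been read as a pair of `Molecule_name` and a feature list; the helper names `group_into_bags` and `bag_label`, the molecule names, and the shortened feature vectors are all hypothetical. In the real data the per-conformation binding outcome is unknown, so `bag_label` only illustrates the labelling rule rather than something that can be computed from the file.

```python
from collections import defaultdict


def group_into_bags(rows):
    """Group (molecule_name, features) rows into bags keyed by molecule name."""
    bags = defaultdict(list)
    for molecule_name, features in rows:
        bags[molecule_name].append(features)
    return dict(bags)


def bag_label(instance_binds):
    """MIL rule from the text: a bag is positive (1) if at least one
    of its instances binds well, and negative (0) otherwise."""
    return 1 if any(instance_binds) else 0


# Toy rows: made-up molecule names and 2-value feature vectors
# (the real data set has 166 numeric features per conformation).
rows = [
    ("mol_a", [46.0, -108.0]),
    ("mol_a", [41.0, -188.0]),
    ("mol_b", [49.0, -199.0]),
]
bags = group_into_bags(rows)
```

Here `bags["mol_a"]` holds two conformations and `bags["mol_b"]` holds one. A learner only ever sees the label of the whole bag, never which conformation is responsible for it.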
In this section you can download some files related to the musk1 data set:
- The complete data set, already formatted in KEEL format, can be downloaded from here.
- A copy of the data set, already partitioned by means of a 10-fold cross-validation procedure, can be downloaded from here.
- The header file associated with this data set can be downloaded from here.
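For readers who want to build their own partitions instead of using the pre-partitioned copy, the sketch below shows one way a 10-fold split could be made at the bag level, so that all conformations of a molecule land in the same fold. This is an assumption about a sensible multi-instance split, not a description of the procedure used to produce the downloadable files; the function name `bag_level_folds`, the seed, and the molecule names are hypothetical.

```python
import random


def bag_level_folds(bag_names, n_folds=10, seed=0):
    """Assign whole bags (molecules) to folds so that no bag is split
    across folds. Returns a list of n_folds lists of bag names."""
    names = sorted(set(bag_names))       # one entry per molecule
    random.Random(seed).shuffle(names)   # reproducible shuffle
    # Deal the shuffled names out round-robin across the folds.
    return [names[i::n_folds] for i in range(n_folds)]


# Toy example with made-up molecule names.
example_names = [f"mol_{i}" for i in range(25)]
folds = bag_level_folds(example_names)
```

Each fold then serves once as the test set while the remaining nine form the training set. Splitting by bag rather than by row avoids placing conformations of the same molecule on both sides of a split.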