This section describes the main characteristics of the musk2 data set and its attributes:
General information
Musk 2 data set |
Type | Multi instance | Origin | Real world |
Features | 167 | (Real / Integer / Nominal) | (166 / 0 / 1) |
Instances | 6598 |
Classes | 2 |
Missing values? | No |
Attribute description
Attribute | Domain | Attribute | Domain | Attribute | Domain |
Molecule_name | {MUSK-jf66, ... , MUSK-331} | F56 | [-279.0, 214.0] | F112 | [-266.0, 214.0] |
F1 | [-31.0, 292.0] | F57 | [-170.0, 229.0] | F113 | [-224.0, 194.0] |
F2 | [-199.0, 95.0] | F58 | [-178.0, 113.0] | F114 | [-204.0, 180.0] |
F3 | [-167.0, 81.0] | F59 | [-172.0, 200.0] | F115 | [-250.0, 216.0] |
F4 | [-114.0, 161.0] | F60 | [-250.0, 200.0] | F116 | [-257.0, 253.0] |
F5 | [-118.0, 325.0] | F61 | [-102.0, 254.0] | F117 | [-103.0, 315.0] |
F6 | [-183.0, 200.0] | F62 | [-196.0, 180.0] | F118 | [-212.0, 156.0] |
F7 | [-171.0, 220.0] | F63 | [-100.0, 284.0] | F119 | [-196.0, 209.0] |
F8 | [-225.0, 320.0] | F64 | [-195.0, 225.0] | F120 | [-201.0, 152.0] |
F9 | [-245.0, 147.0] | F65 | [-165.0, 112.0] | F121 | [-121.0, 267.0] |
F10 | [-286.0, 231.0] | F66 | [-9.0, 315.0] | F122 | [-117.0, 258.0] |
F11 | [-328.0, 176.0] | F67 | [-167.0, 234.0] | F123 | [-129.0, 276.0] |
F12 | [-321.0, 184.0] | F68 | [-195.0, 149.0] | F124 | [-127.0, 227.0] |
F13 | [-305.0, 195.0] | F69 | [-134.0, 232.0] | F125 | [-144.0, 299.0] |
F14 | [-342.0, 158.0] | F70 | [-191.0, 158.0] | F126 | [-69.0, 308.0] |
F15 | [-294.0, 172.0] | F71 | [-174.0, 86.0] | F127 | [-286.0, 219.0] |
F16 | [-327.0, 80.0] | F72 | [-152.0, 99.0] | F128 | [-221.0, 241.0] |
F17 | [-224.0, 138.0] | F73 | [-324.0, 181.0] | F129 | [-307.0, 206.0] |
F18 | [-308.0, 189.0] | F74 | [-333.0, 172.0] | F130 | [-189.0, 122.0] |
F19 | [-286.0, 225.0] | F75 | [-274.0, 203.0] | F131 | [-123.0, 281.0] |
F20 | [-252.0, 227.0] | F76 | [-195.0, 21.0] | F132 | [-140.0, 255.0] |
F21 | [-295.0, 194.0] | F77 | [-259.0, 156.0] | F133 | [-319.0, 176.0] |
F22 | [-185.0, 190.0] | F78 | [-313.0, 235.0] | F134 | [-338.0, 169.0] |
F23 | [-253.0, 213.0] | F79 | [-306.0, 193.0] | F135 | [-336.0, 219.0] |
F24 | [-76.0, 317.0] | F80 | [-202.0, 309.0] | F136 | [-196.0, 125.0] |
F25 | [-100.0, 277.0] | F81 | [-255.0, 198.0] | F137 | [-197.0, 186.0] |
F26 | [-242.0, 183.0] | F82 | [-175.0, 201.0] | F138 | [-199.0, 130.0] |
F27 | [-205.0, 164.0] | F83 | [-299.0, 175.0] | F139 | [-243.0, 202.0] |
F28 | [-166.0, 145.0] | F84 | [-98.0, 273.0] | F140 | [-283.0, 203.0] |
F29 | [-142.0, 174.0] | F85 | [-220.0, 193.0] | F141 | [-290.0, 188.0] |
F30 | [-162.0, 266.0] | F86 | [-203.0, 194.0] | F142 | [-185.0, 184.0] |
F31 | [-117.0, 309.0] | F87 | [-207.0, 109.0] | F143 | [-157.0, 239.0] |
F32 | [-143.0, 310.0] | F88 | [-213.0, 172.0] | F144 | [-171.0, 208.0] |
F33 | [-139.0, 207.0] | F89 | [-111.0, 152.0] | F145 | [-179.0, 213.0] |
F34 | [-279.0, 160.0] | F90 | [-157.0, 269.0] | F146 | [-106.0, 261.0] |
F35 | [-160.0, 220.0] | F91 | [-202.0, 235.0] | F147 | [-136.0, 172.0] |
F36 | [-7.0, 324.0] | F92 | [-16.0, 306.0] | F148 | [-200.0, 130.0] |
F37 | [-175.0, 147.0] | F93 | [-125.0, 223.0] | F149 | [-213.0, 117.0] |
F38 | [-190.0, 187.0] | F94 | [-328.0, 184.0] | F150 | [-190.0, 185.0] |
F39 | [-148.0, 107.0] | F95 | [-119.0, 238.0] | F151 | [-140.0, 244.0] |
F40 | [-180.0, 194.0] | F96 | [-69.0, 347.0] | F152 | [-128.0, 153.0] |
F41 | [-188.0, 90.0] | F97 | [-191.0, 165.0] | F153 | [-114.0, 211.0] |
F42 | [-150.0, 367.0] | F98 | [-190.0, 203.0] | F154 | [-173.0, 120.0] |
F43 | [-295.0, 225.0] | F99 | [-157.0, 40.0] | F155 | [-143.0, 379.0] |
F44 | [-343.0, 198.0] | F100 | [-156.0, 237.0] | F156 | [-198.0, 153.0] |
F45 | [-310.0, 147.0] | F101 | [-209.0, 91.0] | F157 | [-257.0, 145.0] |
F46 | [-340.0, 161.0] | F102 | [-33.0, 348.0] | F158 | [-328.0, 94.0] |
F47 | [-159.0, 110.0] | F103 | [-299.0, 173.0] | F159 | [-219.0, 179.0] |
F48 | [-290.0, 179.0] | F104 | [-324.0, 191.0] | F160 | [-136.0, 192.0] |
F49 | [-265.0, 273.0] | F105 | [-319.0, 154.0] | F161 | [-120.0, 411.0] |
F50 | [-279.0, 215.0] | F106 | [-284.0, 212.0] | F162 | [-69.0, 355.0] |
F51 | [-326.0, 172.0] | F107 | [-200.0, 159.0] | F163 | [73.0, 625.0] |
F52 | [-206.0, 177.0] | F108 | [-292.0, 167.0] | F164 | [-289.0, 295.0] |
F53 | [-206.0, 169.0] | F109 | [-249.0, 200.0] | F165 | [-428.0, 168.0] |
F54 | [-147.0, 335.0] | F110 | [-291.0, 141.0] | F166 | [-471.0, 367.0] |
F55 | [-112.0, 269.0] | F111 | [-250.0, 209.0] | Class | {0, 1} |
Additional information
The problem consists of determining whether a drug molecule will bind strongly to a target protein. Each molecule may adopt a wide range of shapes, or conformations. A positive molecule has at least one shape that binds well (although it is not known which one), whereas a negative molecule has no shape that binds well. This problem maps naturally onto a multiple-instance learning (MIL) setting: each molecule is a bag, and the conformations it can adopt are the instances in that bag.
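The bag/instance representation described above can be sketched in Python. This is a minimal illustration, not the KEEL tooling: the molecule names, feature values, per-conformation scores, and the helper names `build_bags` and `predict_bag` are all hypothetical, and each row is assumed to carry the class of its molecule.

```python
from collections import defaultdict

# Hypothetical toy rows: (molecule_name, feature_vector, molecule_class).
# Values are illustrative only and are not taken from the musk2 files.
rows = [
    ("MOL-A", [46.0, -108.0], 1),
    ("MOL-A", [41.0, -188.0], 1),
    ("MOL-B", [40.0, -60.0], 0),
    ("MOL-B", [38.0, -75.0], 0),
    ("MOL-B", [35.0, -70.0], 0),
]

def build_bags(rows):
    """Group conformations (instances) into bags keyed by molecule name."""
    instances = defaultdict(list)
    labels = {}
    for name, features, molecule_class in rows:
        instances[name].append(features)
        labels[name] = molecule_class  # the label belongs to the whole bag
    return dict(instances), labels

def predict_bag(instance_scores, threshold=0.5):
    """A bag is predicted positive if at least one instance scores high enough."""
    return int(max(instance_scores) >= threshold)

bags, bag_labels = build_bags(rows)

# Hypothetical per-conformation binding scores from some instance-level model.
scores = {"MOL-A": [0.1, 0.8], "MOL-B": [0.2, 0.3, 0.1]}
predictions = {name: predict_bag(s) for name, s in scores.items()}
print(predictions)
```

Taking the maximum over instance scores mirrors the rule stated above: one well-binding conformation is enough to make the molecule positive, and a molecule is negative only if every conformation falls below the threshold.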
In this section you can download some files related to the musk2 data set:
- The complete data set, already formatted in KEEL format, can be downloaded from here.
- A copy of the data set, already partitioned by means of a 10-fold cross-validation procedure, can be downloaded from here.
- The header file associated with this data set can be downloaded from here.
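As a rough sketch of how the attribute domains listed above might be read from a KEEL-style header, the snippet below parses `@attribute` lines into a dictionary. The sample header text is an assumption reconstructed from the attribute table (only F1, F2, and Class are shown), the actual header file may differ in detail, and `parse_header` is a hypothetical helper rather than part of KEEL.

```python
import re

# Assumed excerpt of a KEEL-style header; domains copied from the attribute table.
SAMPLE_HEADER = """\
@relation musk2
@attribute F1 real [-31.0, 292.0]
@attribute F2 real [-199.0, 95.0]
@attribute Class {0, 1}
@inputs F1, F2
@outputs Class
"""

# Real-valued attribute: "@attribute NAME real [MIN, MAX]"
REAL_ATTR = re.compile(
    r"@attribute\s+(\S+)\s+real\s*\[\s*([-\d.]+)\s*,\s*([-\d.]+)\s*\]",
    re.IGNORECASE,
)
# Nominal attribute: "@attribute NAME {v1, v2, ...}"
NOMINAL_ATTR = re.compile(r"@attribute\s+(\S+)\s*\{(.*)\}", re.IGNORECASE)

def parse_header(text):
    """Return {attribute_name: (min, max)} for reals, or a list of values for nominals."""
    domains = {}
    for line in text.splitlines():
        line = line.strip()
        match = REAL_ATTR.match(line)
        if match:
            domains[match.group(1)] = (float(match.group(2)), float(match.group(3)))
            continue
        match = NOMINAL_ATTR.match(line)
        if match:
            domains[match.group(1)] = [v.strip() for v in match.group(2).split(",")]
    return domains

domains = parse_header(SAMPLE_HEADER)
print(domains)
```

Lines that are not attribute declarations, such as `@relation`, `@inputs`, and `@outputs`, are simply skipped in this sketch.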