KEEL-dataset - data set description

This section describes the main characteristics of the musk1 data set and its attributes:

General information

Musk 1 data set
Type                         Multi instance
Origin                       Real world
Features                     167
(Real / Integer / Nominal)   (166 / 0 / 1)
Instances                    476
Classes                      2
Missing values?              No

Attribute description

Attribute Domain    Attribute Domain    Attribute Domain
Molecule_name {MUSK-jf59, ..., MUSK-316}    F56 [-282.0, 199.0]    F112 [-268.0, 211.0]
F1 [-9.0, 130.0]    F57 [-168.0, 231.0]    F113 [-223.0, 188.0]
F2 [-199.0, 98.0]    F58 [-178.0, 111.0]    F114 [-206.0, 160.0]
F3 [-166.0, 83.0]    F59 [-173.0, 181.0]    F115 [-251.0, 201.0]
F4 [-115.0, 157.0]    F60 [-254.0, 178.0]    F116 [-252.0, 187.0]
F5 [-117.0, 238.0]    F61 [-102.0, 243.0]    F117 [-104.0, 315.0]
F6 [-184.0, 200.0]    F62 [-195.0, 173.0]    F118 [-209.0, 137.0]
F7 [-170.0, 214.0]    F63 [-69.0, 110.0]    F119 [-193.0, 215.0]
F8 [-231.0, 188.0]    F64 [-197.0, 220.0]    F120 [-201.0, 152.0]
F9 [-242.0, 135.0]    F65 [-159.0, 108.0]    F121 [-123.0, 254.0]
F10 [-284.0, 218.0]    F66 [6.0, 197.0]    F122 [-118.0, 270.0]
F11 [-327.0, 104.0]    F67 [-166.0, 262.0]    F123 [-131.0, 279.0]
F12 [-326.0, 196.0]    F68 [-196.0, 157.0]    F124 [-121.0, 200.0]
F13 [-302.0, 117.0]    F69 [-134.0, 205.0]    F125 [-145.0, 149.0]
F14 [-348.0, 72.0]    F70 [-192.0, 168.0]    F126 [-44.0, 137.0]
F15 [-292.0, 77.0]    F71 [-174.0, 83.0]    F127 [-285.0, 123.0]
F16 [-318.0, 13.0]    F72 [-150.0, 98.0]    F128 [-219.0, 238.0]
F17 [-225.0, 125.0]    F73 [-322.0, 133.0]    F129 [-305.0, 159.0]
F18 [-312.0, 139.0]    F74 [-338.0, 112.0]    F130 [-188.0, 152.0]
F19 [-291.0, 214.0]    F75 [-272.0, 132.0]    F131 [-94.0, 140.0]
F20 [-243.0, 273.0]    F76 [-191.0, -65.0]    F132 [-140.0, 170.0]
F21 [-298.0, 196.0]    F77 [-255.0, 154.0]    F133 [-322.0, 176.0]
F22 [-187.0, 147.0]    F78 [-318.0, 158.0]    F134 [-341.0, 187.0]
F23 [-255.0, 187.0]    F79 [-310.0, 185.0]    F135 [-342.0, 111.0]
F24 [-70.0, 312.0]    F80 [-197.0, 233.0]    F136 [-197.0, 137.0]
F25 [-102.0, 254.0]    F81 [-258.0, 183.0]    F137 [-201.0, 186.0]
F26 [-243.0, 162.0]    F82 [-176.0, 192.0]    F138 [-200.0, 127.0]
F27 [-205.0, 157.0]    F83 [-301.0, 208.0]    F139 [-246.0, 138.0]
F28 [-164.0, 147.0]    F84 [-98.0, 251.0]    F140 [-286.0, 207.0]
F29 [-140.0, 181.0]    F85 [-222.0, 183.0]    F141 [-294.0, 238.0]
F30 [-154.0, 258.0]    F86 [-200.0, 189.0]    F142 [-186.0, 177.0]
F31 [-118.0, 190.0]    F87 [-208.0, 113.0]    F143 [-158.0, 213.0]
F32 [-149.0, 143.0]    F88 [-213.0, 105.0]    F144 [-173.0, 158.0]
F33 [-124.0, 219.0]    F89 [-111.0, 163.0]    F145 [-179.0, 178.0]
F34 [-278.0, 163.0]    F90 [-159.0, 272.0]    F146 [-107.0, 240.0]
F35 [-156.0, 227.0]    F91 [-202.0, 168.0]    F147 [-131.0, -3.0]
F36 [16.0, 291.0]    F92 [-20.0, 167.0]    F148 [-201.0, 134.0]
F37 [-175.0, 224.0]    F93 [-121.0, 233.0]    F149 [-215.0, 121.0]
F38 [-192.0, 165.0]    F94 [-326.0, 178.0]    F150 [-191.0, 174.0]
F39 [-145.0, 105.0]    F95 [-77.0, 149.0]    F151 [-126.0, 243.0]
F40 [-181.0, 189.0]    F96 [-70.0, 336.0]    F152 [-125.0, 154.0]
F41 [-190.0, 82.0]    F97 [-191.0, 171.0]    F153 [-112.0, 207.0]
F42 [-151.0, 201.0]    F98 [-191.0, 194.0]    F154 [-171.0, 84.0]
F43 [-291.0, 168.0]    F99 [-155.0, 48.0]    F155 [-145.0, 161.0]
F44 [-343.0, 81.0]    F100 [-155.0, 235.0]    F156 [-197.0, 113.0]
F45 [-315.0, 65.0]    F101 [-210.0, 88.0]    F157 [-252.0, 141.0]
F46 [-338.0, 115.0]    F102 [-37.0, 265.0]    F158 [-324.0, 72.0]
F47 [-157.0, 110.0]    F103 [-296.0, 96.0]    F159 [-217.0, 173.0]
F48 [-294.0, 99.0]    F104 [-320.0, 95.0]    F160 [-135.0, 185.0]
F49 [-269.0, 180.0]    F105 [-319.0, 72.0]    F161 [-126.0, 253.0]
F50 [-273.0, 214.0]    F106 [-287.0, 179.0]    F162 [-78.0, 291.0]
F51 [-329.0, 218.0]    F107 [-194.0, 144.0]    F163 [35.0, 302.0]
F52 [-205.0, 148.0]    F108 [-293.0, 52.0]    F164 [-132.0, 24.0]
F53 [-206.0, 159.0]    F109 [-250.0, 183.0]    F165 [-258.0, 82.0]
F54 [-143.0, 286.0]    F110 [-295.0, 53.0]    F166 [-72.0, 235.0]
F55 [-113.0, 259.0]    F111 [-252.0, 117.0]    Class {0, 1}

Additional information

The problem consists of determining whether a drug molecule will bind strongly to a target protein. Each molecule may adopt a wide range of shapes, or conformations. A positive molecule has at least one conformation that binds well (although it is not known which one), whereas a negative molecule has no conformation that binds well. This problem maps naturally onto the multiple instance learning (MIL) setting: each molecule is a bag, and the conformations it can adopt are the instances in that bag.
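
The bag/instance representation described above can be illustrated with a minimal Python sketch. It assumes each data row has already been read into a `(molecule_name, features, label)` tuple; the function names, the toy molecule names, and the two-value feature vectors are hypothetical and only keep the example short (the real data set has 166 numeric features per conformation).

```python
from collections import defaultdict


def group_into_bags(rows):
    """Group conformations (instances) by molecule name (bag).

    rows: iterable of (molecule_name, features, label) tuples.
    Returns (bags, labels): bags maps a name to its list of feature
    vectors, labels maps a name to the bag-level class.
    """
    bags = defaultdict(list)
    labels = {}
    for name, features, label in rows:
        bags[name].append(features)
        # A bag is positive as soon as one of its rows is positive.
        labels[name] = max(labels.get(name, 0), label)
    return dict(bags), labels


def predict_bag(instances, instance_scorer, threshold=0.5):
    """Standard MIL assumption: positive if ANY instance scores high enough."""
    return int(any(instance_scorer(x) >= threshold for x in instances))


# Toy rows with hypothetical names and values, not taken from the data set.
toy_rows = [
    ("mol_a", [1.0, 2.0], 1),
    ("mol_a", [0.0, 0.0], 1),
    ("mol_b", [0.5, 0.5], 0),
]
toy_bags, toy_labels = group_into_bags(toy_rows)
```

Here `instance_scorer` stands in for whatever per-conformation model is used; the sketch only shows how bag-level labels and predictions follow from instance-level information.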

In this section you can download some files related to the musk1 data set:

  • The complete data set, already formatted in KEEL format, can be downloaded from here.
  • A copy of the data set, already partitioned by means of a 10-fold cross-validation procedure, can be downloaded from here.
  • The header file associated with this data set can be downloaded from here.
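
As a rough illustration of how the attribute domains listed above could be recovered from a header file, the following Python sketch parses attribute declarations. It assumes the header uses lines of the form `@attribute <name> real [min, max]` for numeric attributes and `@attribute <name> {v1, v2, ...}` for nominal ones; that syntax is an assumption here, so check it against the downloaded header before relying on it. The function name `parse_attribute` is hypothetical.

```python
import re

# Assumed line shapes (verify against the actual header file):
#   @attribute F1 real [-9.0, 130.0]
#   @attribute Class {0, 1}
ATTR_RE = re.compile(
    r"@attribute\s+(\S+)\s*"
    r"(?:(real|integer)\s*\[\s*([-\d.]+)\s*,\s*([-\d.]+)\s*\]"
    r"|\{(.*)\})",
    re.IGNORECASE,
)


def parse_attribute(line):
    """Return (name, kind, domain), or None if the line is not an attribute."""
    match = ATTR_RE.match(line.strip())
    if match is None:
        return None
    name, kind, low, high, values = match.groups()
    if kind:
        # Numeric attribute: the domain is a (min, max) pair.
        return name, kind.lower(), (float(low), float(high))
    # Nominal attribute: the domain is the list of allowed values.
    return name, "nominal", [v.strip() for v in values.split(",")]
```

Lines that do not match the assumed shapes return `None`, so the function can be applied to every line of a header and the misses filtered out.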

 Copyright 2004-2018, KEEL (Knowledge Extraction based on Evolutionary Learning)