Files: 3
Size: 11MB
Format: arff, csv, zip
Created: 5 years ago
Updated: 5 years ago
License: Open Data Commons Public Domain Dedication and License

The resources for this dataset can be found at https://www.openml.org/d/1116

Author:
Source: Unknown - Date unknown
Please cite:

Dataset from the MLRR repository: http://axon.cs.byu.edu:5000/

More information: https://archive.ics.uci.edu/ml/datasets/Musk+(Version+2)

Data Files

Download files in this dataset

File | Description | Size | Download
musk_arff | | 4MB | arff (4MB)
musk | | 4MB | csv (4MB), json (16MB)
musk_zip | Compressed versions of dataset. Includes normalized CSV and JSON data with original data and datapackage.json. | 6MB | zip (6MB)

Field information

Field Name Order Type (Format) Description
ID 1 number (default)
molecule_name 2 string (default)
conformation_name 3 string (default)
f1 4 number (default)
f2 5 number (default)
f3 6 number (default)
f4 7 number (default)
f5 8 number (default)
f6 9 number (default)
f7 10 number (default)
f8 11 number (default)
f9 12 number (default)
f10 13 number (default)
f11 14 number (default)
f12 15 number (default)
f13 16 number (default)
f14 17 number (default)
f15 18 number (default)
f16 19 number (default)
f17 20 number (default)
f18 21 number (default)
f19 22 number (default)
f20 23 number (default)
f21 24 number (default)
f22 25 number (default)
f23 26 number (default)
f24 27 number (default)
f25 28 number (default)
f26 29 number (default)
f27 30 number (default)
f28 31 number (default)
f29 32 number (default)
f30 33 number (default)
f31 34 number (default)
f32 35 number (default)
f33 36 number (default)
f34 37 number (default)
f35 38 number (default)
f36 39 number (default)
f37 40 number (default)
f38 41 number (default)
f39 42 number (default)
f40 43 number (default)
f41 44 number (default)
f42 45 number (default)
f43 46 number (default)
f44 47 number (default)
f45 48 number (default)
f46 49 number (default)
f47 50 number (default)
f48 51 number (default)
f49 52 number (default)
f50 53 number (default)
f51 54 number (default)
f52 55 number (default)
f53 56 number (default)
f54 57 number (default)
f55 58 number (default)
f56 59 number (default)
f57 60 number (default)
f58 61 number (default)
f59 62 number (default)
f60 63 number (default)
f61 64 number (default)
f62 65 number (default)
f63 66 number (default)
f64 67 number (default)
f65 68 number (default)
f66 69 number (default)
f67 70 number (default)
f68 71 number (default)
f69 72 number (default)
f70 73 number (default)
f71 74 number (default)
f72 75 number (default)
f73 76 number (default)
f74 77 number (default)
f75 78 number (default)
f76 79 number (default)
f77 80 number (default)
f78 81 number (default)
f79 82 number (default)
f80 83 number (default)
f81 84 number (default)
f82 85 number (default)
f83 86 number (default)
f84 87 number (default)
f85 88 number (default)
f86 89 number (default)
f87 90 number (default)
f88 91 number (default)
f89 92 number (default)
f90 93 number (default)
f91 94 number (default)
f92 95 number (default)
f93 96 number (default)
f94 97 number (default)
f95 98 number (default)
f96 99 number (default)
f97 100 number (default)
f98 101 number (default)
f99 102 number (default)
f100 103 number (default)
f101 104 number (default)
f102 105 number (default)
f103 106 number (default)
f104 107 number (default)
f105 108 number (default)
f106 109 number (default)
f107 110 number (default)
f108 111 number (default)
f109 112 number (default)
f110 113 number (default)
f111 114 number (default)
f112 115 number (default)
f113 116 number (default)
f114 117 number (default)
f115 118 number (default)
f116 119 number (default)
f117 120 number (default)
f118 121 number (default)
f119 122 number (default)
f120 123 number (default)
f121 124 number (default)
f122 125 number (default)
f123 126 number (default)
f124 127 number (default)
f125 128 number (default)
f126 129 number (default)
f127 130 number (default)
f128 131 number (default)
f129 132 number (default)
f130 133 number (default)
f131 134 number (default)
f132 135 number (default)
f133 136 number (default)
f134 137 number (default)
f135 138 number (default)
f136 139 number (default)
f137 140 number (default)
f138 141 number (default)
f139 142 number (default)
f140 143 number (default)
f141 144 number (default)
f142 145 number (default)
f143 146 number (default)
f144 147 number (default)
f145 148 number (default)
f146 149 number (default)
f147 150 number (default)
f148 151 number (default)
f149 152 number (default)
f150 153 number (default)
f151 154 number (default)
f152 155 number (default)
f153 156 number (default)
f154 157 number (default)
f155 158 number (default)
f156 159 number (default)
f157 160 number (default)
f158 161 number (default)
f159 162 number (default)
f160 163 number (default)
f161 164 number (default)
f162 165 number (default)
f163 166 number (default)
f164 167 number (default)
f165 168 number (default)
f166 169 number (default)
class 170 number (default)
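
The field list above implies a fixed column layout: three identifier columns, 166 numeric features (f1 to f166), and a final class column. As a minimal sketch of how a row might be split along those lines, the following uses only the Python standard library. The helper names `split_row` and `read_musk_csv` are hypothetical, the sample row is synthetic (not taken from the dataset), and treating `class` as an integer label is an assumption.

```python
import csv
import io

N_FEATURES = 166
FEATURE_COLUMNS = [f"f{i}" for i in range(1, N_FEATURES + 1)]


def split_row(row):
    """Split one CSV row (a dict keyed by column name) into
    (identifiers, features, label)."""
    identifiers = (row["ID"], row["molecule_name"], row["conformation_name"])
    features = [float(row[name]) for name in FEATURE_COLUMNS]
    # Assumption: the class column holds a whole-number label.
    label = int(float(row["class"]))
    return identifiers, features, label


def read_musk_csv(handle):
    """Read every row from an open CSV file-like object."""
    return [split_row(row) for row in csv.DictReader(handle)]


# Synthetic one-row sample; the values are made up for illustration.
header = ["ID", "molecule_name", "conformation_name"] + FEATURE_COLUMNS + ["class"]
sample_row = ["1", "MOL-A", "MOL-A_conf1"] + ["0"] * N_FEATURES + ["1"]
sample = io.StringIO(",".join(header) + "\n" + ",".join(sample_row) + "\n")

rows = read_musk_csv(sample)
identifiers, features, label = rows[0]
print(identifiers, len(features), label)
```

To use it on the real data, pass an open handle to the downloaded CSV file instead of the in-memory sample.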

Integrate this dataset into your favourite tool

Use our data-cli tool designed for data wranglers:

data get https://datahub.io/machine-learning/musk
data info machine-learning/musk
tree machine-learning/musk
# Get a list of dataset's resources
curl -L -s https://datahub.io/machine-learning/musk/datapackage.json | grep path

# Get resources

curl -L https://datahub.io/machine-learning/musk/r/0.arff

curl -L https://datahub.io/machine-learning/musk/r/1.csv

curl -L https://datahub.io/machine-learning/musk/r/2.zip
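
The resource listing done with curl and grep above can also be sketched in Python using only the standard library. This assumes the descriptor follows the usual Data Package layout, with a top-level `resources` array whose entries carry `name` and `path`. The helper names `fetch_descriptor` and `resource_paths` are illustrative, and the sample descriptor below is made up so the example runs offline.

```python
import json
from urllib.request import urlopen

DATAPACKAGE_URL = "https://datahub.io/machine-learning/musk/datapackage.json"


def fetch_descriptor(url=DATAPACKAGE_URL):
    """Download and parse a datapackage.json descriptor (needs network access)."""
    with urlopen(url) as response:
        return json.load(response)


def resource_paths(descriptor):
    """Return (name, path) pairs for every resource in the descriptor."""
    return [
        (resource.get("name"), resource.get("path"))
        for resource in descriptor.get("resources", [])
    ]


# Made-up descriptor for illustration; real paths come from fetch_descriptor().
example_descriptor = {
    "name": "musk",
    "resources": [
        {"name": "musk_arff", "path": "data/musk.arff", "format": "arff"},
        {"name": "musk", "path": "data/musk.csv", "format": "csv"},
    ],
}

for name, path in resource_paths(example_descriptor):
    print(name, path)
```

Calling `resource_paths(fetch_descriptor())` would list the paths of the live dataset instead.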

If you are using R, here is how to load the data quickly:

install.packages("jsonlite", repos="https://cran.rstudio.com/")
library("jsonlite")

json_file <- 'https://datahub.io/machine-learning/musk/datapackage.json'
json_data <- fromJSON(paste(readLines(json_file), collapse=""))

# get list of all resources:
print(json_data$resources$name)

# print all tabular data (if any exists)
# seq_along() avoids iterating over 1:0 when there are no resources
for (i in seq_along(json_data$resources$datahub$type)) {
  if (json_data$resources$datahub$type[i] == 'derived/csv') {
    path_to_file <- json_data$resources$path[i]
    data <- read.csv(url(path_to_file))
    print(data)
  }
}

Note: You might need to run the script with root permissions if you are running it on a Linux machine.

If you are using pandas, install the Frictionless Data `datapackage` library and pandas itself:

pip install datapackage
pip install pandas

Now you can load the Data Package with pandas:

import datapackage
import pandas as pd

data_url = 'https://datahub.io/machine-learning/musk/datapackage.json'

# load the Data Package descriptor
package = datapackage.Package(data_url)

# read only the tabular resources into pandas
resources = package.resources
for resource in resources:
    if resource.tabular:
        data = pd.read_csv(resource.descriptor['path'])
        print(data)

For Python, first install the `datapackage` library (all the datasets on DataHub are Data Packages):

pip install datapackage

To load the Data Package into your Python environment, run the following code:

from datapackage import Package

package = Package('https://datahub.io/machine-learning/musk/datapackage.json')

# print list of all resources:
print(package.resource_names)

# print processed tabular data (if any exists)
for resource in package.resources:
    # .get() avoids a KeyError for resources without a 'datahub' entry
    if resource.descriptor.get('datahub', {}).get('type') == 'derived/csv':
        print(resource.read())

If you are using JavaScript, follow the instructions below.

Install the data.js module using npm:

  $ npm install data.js

Once the package is installed, use the following code snippet:

const {Dataset} = require('data.js')

const path = 'https://datahub.io/machine-learning/musk/datapackage.json'

// We're using self-invoking function here as we want to use async-await syntax:
;(async () => {
  const dataset = await Dataset.load(path)
  // get list of all resources:
  for (const id in dataset.resources) {
    console.log(dataset.resources[id]._descriptor.name)
  }
  // get all tabular data (if any exists)
  for (const id in dataset.resources) {
    if (dataset.resources[id]._descriptor.format === "csv") {
      const file = dataset.resources[id]
      // Get a raw stream
      const stream = await file.stream()
      // entire file as a buffer (be careful with large files!)
      const buffer = await file.buffer
      // print data
      stream.pipe(process.stdout)
    }
  }
})()
