Files: 3
Size: 253kB
Format: arff, csv, zip
Created: 3 years ago
Updated: 3 years ago
License: Open Data Commons Public Domain Dedication and License
The resources for this dataset can be found at https://www.openml.org/d/54
Author: Dr. Pete Mowforth and Dr. Barry Shepherd
Source: UCI
Please cite: Siebert, JP. Turing Institute Research Memorandum TIRM-87-018 "Vehicle Recognition Using Rule Based Methods" (March 1987)

Data Files

Download files in this dataset

File | Description | Size | Last changed | Download
vehicle_arff | | 62kB | | arff (62kB)
vehicle | | 55kB | | csv (55kB), json (441kB)
vehicle_zip | Compressed versions of dataset. Includes normalized CSV and JSON data with original data and datapackage.json. | 101kB | | zip (101kB)

Field information

Field Name Order Type (Format) Description
COMPACTNESS 1 number (default)
CIRCULARITY 2 number (default)
DISTANCE_CIRCULARITY 3 number (default)
RADIUS_RATIO 4 number (default)
PR.AXIS_ASPECT_RATIO 5 number (default)
MAX.LENGTH_ASPECT_RATIO 6 number (default)
SCATTER_RATIO 7 number (default)
ELONGATEDNESS 8 number (default)
PR.AXIS_RECTANGULARITY 9 number (default)
MAX.LENGTH_RECTANGULARITY 10 number (default)
SCALED_VARIANCE_MAJOR 11 number (default)
SCALED_VARIANCE_MINOR 12 number (default)
SCALED_RADIUS_OF_GYRATION 13 number (default)
SKEWNESS_ABOUT_MAJOR 14 number (default)
SKEWNESS_ABOUT_MINOR 15 number (default)
KURTOSIS_ABOUT_MAJOR 16 number (default)
KURTOSIS_ABOUT_MINOR 17 number (default)
HOLLOWS_RATIO 18 number (default)
Class 19 string (default)
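
The field table above can be turned into a small schema check. The sketch below uses only the Python standard library to confirm that a CSV header matches the 19 listed fields and to convert the 18 numeric columns to floats. The function name `parse_vehicle_csv` is hypothetical, and the sample row holds invented placeholder values rather than real measurements.

```python
import csv
import io

# Field names in the order given in the field table; the last one is the label.
FIELDS = [
    "COMPACTNESS", "CIRCULARITY", "DISTANCE_CIRCULARITY", "RADIUS_RATIO",
    "PR.AXIS_ASPECT_RATIO", "MAX.LENGTH_ASPECT_RATIO", "SCATTER_RATIO",
    "ELONGATEDNESS", "PR.AXIS_RECTANGULARITY", "MAX.LENGTH_RECTANGULARITY",
    "SCALED_VARIANCE_MAJOR", "SCALED_VARIANCE_MINOR",
    "SCALED_RADIUS_OF_GYRATION", "SKEWNESS_ABOUT_MAJOR",
    "SKEWNESS_ABOUT_MINOR", "KURTOSIS_ABOUT_MAJOR", "KURTOSIS_ABOUT_MINOR",
    "HOLLOWS_RATIO", "Class",
]
NUMERIC_FIELDS = FIELDS[:-1]


def parse_vehicle_csv(text):
    """Return (features, labels) parsed from CSV text with the expected header."""
    reader = csv.DictReader(io.StringIO(text))
    if reader.fieldnames != FIELDS:
        raise ValueError(f"unexpected header: {reader.fieldnames}")
    features, labels = [], []
    for row in reader:
        features.append([float(row[name]) for name in NUMERIC_FIELDS])
        labels.append(row["Class"])
    return features, labels


# Placeholder row: the values 1..18 are invented, not taken from the dataset.
sample = ",".join(FIELDS) + "\n" + ",".join(str(i) for i in range(1, 19)) + ",van\n"
features, labels = parse_vehicle_csv(sample)
print(len(features[0]), labels)
```

Raising on a header mismatch, rather than silently reordering columns, makes it obvious when a different export of the dataset uses other column names.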

Integrate this dataset into your favourite tool

Use our data-cli tool designed for data wranglers:

data get https://datahub.io/machine-learning/vehicle
data info machine-learning/vehicle
tree machine-learning/vehicle

# Get a list of dataset's resources
curl -L -s https://datahub.io/machine-learning/vehicle/datapackage.json | grep path

# Get resources
curl -L https://datahub.io/machine-learning/vehicle/r/0.arff
curl -L https://datahub.io/machine-learning/vehicle/r/1.csv
curl -L https://datahub.io/machine-learning/vehicle/r/2.zip
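
As an alternative to `grep path`, the descriptor can be parsed as JSON. The sketch below assumes the descriptor contains a `resources` list whose entries carry `name`, `format` and `path` keys, which is what the R and JavaScript snippets on this page rely on. The helper name `list_resources` and the inline `example` descriptor (including its relative paths) are illustrative stand-ins, not the real file contents.

```python
import json


def list_resources(descriptor_text):
    """Return (name, format, path) for every resource in a datapackage.json string."""
    descriptor = json.loads(descriptor_text)
    return [
        (res.get("name"), res.get("format"), res.get("path"))
        for res in descriptor.get("resources", [])
    ]


# Illustrative descriptor; the real one is served at
# https://datahub.io/machine-learning/vehicle/datapackage.json
example = json.dumps({
    "name": "vehicle",
    "resources": [
        {"name": "vehicle_arff", "format": "arff", "path": "r/0.arff"},
        {"name": "vehicle", "format": "csv", "path": "r/1.csv"},
        {"name": "vehicle_zip", "format": "zip", "path": "r/2.zip"},
    ],
})

for name, fmt, path in list_resources(example):
    print(f"{name:<14}{fmt:<6}{path}")
```

To run it against the live descriptor, the text could be fetched with `urllib.request.urlopen(url).read().decode()` and passed to the same function.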

If you are using R, here is how to load the data quickly:

install.packages("jsonlite", repos="https://cran.rstudio.com/")
library("jsonlite")

json_file <- 'https://datahub.io/machine-learning/vehicle/datapackage.json'
json_data <- fromJSON(paste(readLines(json_file), collapse=""))

# get list of all resources:
print(json_data$resources$name)

# print all tabular data (if any exists)
for (i in seq_along(json_data$resources$datahub$type)) {
  # isTRUE() skips resources whose type is missing (NA)
  if (isTRUE(json_data$resources$datahub$type[i] == 'derived/csv')) {
    path_to_file <- json_data$resources$path[i]
    data <- read.csv(url(path_to_file))
    print(data)
  }
}

Note: You might need to run the script with root permissions if you are running it on a Linux machine.

Install the Frictionless Data `datapackage` library and pandas:

pip install datapackage
pip install pandas

Now you can load the Data Package into pandas:

import datapackage
import pandas as pd

data_url = 'https://datahub.io/machine-learning/vehicle/datapackage.json'

# to load Data Package into storage
package = datapackage.Package(data_url)

# to load only tabular data
resources = package.resources
for resource in resources:
    if resource.tabular:
        data = pd.read_csv(resource.descriptor['path'])
        print(data)

For Python, first install the `datapackage` library (all the datasets on DataHub are Data Packages):

pip install datapackage

To load the Data Package into your Python environment, run the following code:

from datapackage import Package

package = Package('https://datahub.io/machine-learning/vehicle/datapackage.json')

# print list of all resources:
print(package.resource_names)

# print processed tabular data (if any exists)
for resource in package.resources:
    if resource.descriptor['datahub']['type'] == 'derived/csv':
        print(resource.read())

If you are using JavaScript, follow the instructions below:

Install the data.js module using npm:

  $ npm install data.js

Once the package is installed, use the following code snippet:

const {Dataset} = require('data.js')

const path = 'https://datahub.io/machine-learning/vehicle/datapackage.json'

// We're using a self-invoking function here because we want to use async-await syntax:
;(async () => {
  const dataset = await Dataset.load(path)
  // get list of all resources:
  for (const id in dataset.resources) {
    console.log(dataset.resources[id]._descriptor.name)
  }
  // get all tabular data(if exists any)
  for (const id in dataset.resources) {
    if (dataset.resources[id]._descriptor.format === "csv") {
      const file = dataset.resources[id]
      // Get a raw stream
      const stream = await file.stream()
      // entire file as a buffer (be careful with large files!)
      const buffer = await file.buffer
      // print data
      stream.pipe(process.stdout)
    }
  }
})()

Read me

The resources for this dataset can be found at https://www.openml.org/d/54

Author: Dr. Pete Mowforth and Dr. Barry Shepherd
Source: UCI
Please cite: Siebert, JP. Turing Institute Research Memorandum TIRM-87-018 “Vehicle Recognition Using Rule Based Methods” (March 1987)

NAME vehicle silhouettes

PURPOSE to classify a given silhouette as one of four types of vehicle, using a set of features extracted from the silhouette. The vehicle may be viewed from one of many different angles.

PROBLEM TYPE classification

SOURCE Drs. Pete Mowforth and Barry Shepherd, Turing Institute, George House, 36 North Hanover St., Glasgow G1 2AD

CONTACT Alistair Sutherland, Statistics Dept., Strathclyde University, Livingstone Tower, 26 Richmond St., GLASGOW G1 1XH, Great Britain

     Tel: 041 552 4400 x3033
     
     Fax: 041 552 4711 
     
     e-mail: [email protected]

HISTORY This data was originally gathered at the TI in 1986-87 by JP Siebert. It was partially financed by Barr and Stroud Ltd. The original purpose was to find a method of distinguishing 3D objects within a 2D image by application of an ensemble of shape feature extractors to the 2D silhouettes of the objects. Measures of shape features extracted from example silhouettes of objects to be discriminated were used to generate a classification rule tree by means of computer induction. This object recognition strategy was successfully used to discriminate between silhouettes of model cars, vans and buses viewed from constrained elevation but all angles of rotation. The rule tree classification performance compared favourably to MDC (Minimum Distance Classifier) and k-NN (k-Nearest Neighbour) statistical classifiers in terms of both error rate and computational efficiency. An investigation of these rule trees generated by example indicated that the tree structure was heavily influenced by the orientation of the objects, and grouped similar object views into single decisions.

DESCRIPTION The features were extracted from the silhouettes by the HIPS (Hierarchical Image Processing System) extension BINATTS, which extracts a combination of scale-independent features utilising both classical moments-based measures such as scaled variance, skewness and kurtosis about the major/minor axes and heuristic measures such as hollows, circularity, rectangularity and compactness.

Four “Corgi” model vehicles were used for the experiment: a double decker bus, Chevrolet van, Saab 9000 and an Opel Manta 400. This particular combination of vehicles was chosen with the expectation that the bus, van and either one of the cars would be readily distinguishable, but it would be more difficult to distinguish between the cars.

The images were acquired by a camera looking downwards at the model vehicle from a fixed angle of elevation (34.2 degrees to the horizontal). The vehicles were placed on a diffuse backlit surface (lightbox). The vehicles were painted matte black to minimise highlights. The images were captured using a CRS4000 framestore connected to a VAX 750. All images were captured with a spatial resolution of 128x128 pixels quantised to 64 greylevels. These images were thresholded to produce binary vehicle silhouettes, negated (to comply with the processing requirements of BINATTS) and thereafter subjected to shrink-expand-expand-shrink HIPS modules to remove “salt and pepper” image noise.

The vehicles were rotated and their angle of orientation was measured using a radial graticule beneath the vehicle. 0 and 180 degrees corresponded to “head on” and “rear” views respectively, while 90 and 270 corresponded to profiles in opposite directions. Two sets of 60 images, each set covering a full 360 degree rotation, were captured for each vehicle. The vehicle was rotated by a fixed angle between images. These datasets are known as e2 and e3 respectively.

A further two sets of images, e4 and e5, were captured with the camera at elevations of 37.5 degrees and 30.8 degrees respectively. These sets also contain 60 images per vehicle, apart from e4.van, which contains only 46 owing to the difficulty of containing the van in the image at some orientations.

ATTRIBUTES

     COMPACTNESS                        (average perim)^2/area
     CIRCULARITY                        (average radius)^2/area
     DISTANCE CIRCULARITY               area/(av.distance from border)^2
     RADIUS RATIO                       (max.rad-min.rad)/av.radius
     PR.AXIS ASPECT RATIO               (minor axis)/(major axis)
     MAX.LENGTH ASPECT RATIO            (length perp. max length)/(max length)
     SCATTER RATIO                      (inertia about minor axis)/(inertia about major axis)
     ELONGATEDNESS                      area/(shrink width)^2
     PR.AXIS RECTANGULARITY             area/(pr.axis length*pr.axis width)
     MAX.LENGTH RECTANGULARITY          area/(max.length*length perp. to this)
     SCALED VARIANCE ALONG MAJOR AXIS   (2nd order moment about minor axis)/area
     SCALED VARIANCE ALONG MINOR AXIS   (2nd order moment about major axis)/area
     SCALED RADIUS OF GYRATION          (mavar+mivar)/area
     SKEWNESS ABOUT MAJOR AXIS          (3rd order moment about major axis)/sigma_min^3
     SKEWNESS ABOUT MINOR AXIS          (3rd order moment about minor axis)/sigma_maj^3
     KURTOSIS ABOUT MINOR AXIS          (4th order moment about major axis)/sigma_min^4
     KURTOSIS ABOUT MAJOR AXIS          (4th order moment about minor axis)/sigma_maj^4
     HOLLOWS RATIO                      (area of hollows)/(area of bounding polygon)

     Where sigma_maj^2 is the variance along the major axis and
     sigma_min^2 is the variance along the minor axis, and

     area of hollows = area of bounding poly - area of object

     The area of the bounding polygon is found as a side result of
     the computation to find the maximum length. Each individual
     length computation yields a pair of calipers to the object
     orientated at every 5 degrees. The object is propagated into
     an image containing the union of these calipers to obtain an
     image of the bounding polygon.
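
To make the moment-based attributes more concrete, here is a minimal sketch of how skewness and kurtosis along the principal axes could be computed from a binary silhouette. It is not the BINATTS implementation. It assumes that the moments are means over foreground pixels and that the major axis is the direction of largest coordinate variance; the function name `axis_moments` is hypothetical. Result keys are named after the axis that appears in each formula, so a moment "about the major axis" uses distances measured along the minor axis and is divided by a power of sigma_min.

```python
import math


def axis_moments(pixels):
    """Skewness and kurtosis of foreground pixels relative to the principal axes.

    pixels: list of (x, y) coordinates of silhouette pixels.
    Assumes the shape is not a straight line (both variances are non-zero).
    """
    n = len(pixels)
    cx = sum(x for x, _ in pixels) / n
    cy = sum(y for _, y in pixels) / n
    # Central second-order moments (covariance of the pixel coordinates).
    sxx = sum((x - cx) ** 2 for x, _ in pixels) / n
    syy = sum((y - cy) ** 2 for _, y in pixels) / n
    sxy = sum((x - cx) * (y - cy) for x, y in pixels) / n
    # Orientation of the major (largest-variance) axis.
    theta = 0.5 * math.atan2(2 * sxy, sxx - syy)
    c, s = math.cos(theta), math.sin(theta)
    # u: position along the major axis; v: position along the minor axis.
    u = [(x - cx) * c + (y - cy) * s for x, y in pixels]
    v = [-(x - cx) * s + (y - cy) * c for x, y in pixels]

    def moment(values, order):
        return sum(val ** order for val in values) / n

    sigma_maj = math.sqrt(moment(u, 2))
    sigma_min = math.sqrt(moment(v, 2))
    return {
        "skewness_about_major": moment(v, 3) / sigma_min ** 3,
        "skewness_about_minor": moment(u, 3) / sigma_maj ** 3,
        "kurtosis_about_major": moment(v, 4) / sigma_min ** 4,
        "kurtosis_about_minor": moment(u, 4) / sigma_maj ** 4,
    }


# An 8x4 rectangle: symmetric, so both skewness values should be close to zero.
rectangle = [(x, y) for x in range(8) for y in range(4)]
for key, value in axis_moments(rectangle).items():
    print(key, round(value, 4))
```

Dividing by powers of sigma makes these measures independent of the silhouette's scale, which matches the description of the features as scale independent.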

NUMBER OF CLASSES

     4       OPEL, SAAB, BUS, VAN

NUMBER OF EXAMPLES

             Total no. = 946
             
             No. in each class
             
               opel 240
               saab 240
               bus  240
               van  226
             
             
             100 examples are being kept by Strathclyde for validation.
             So StatLog partners will receive 846 examples.
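
The class counts above follow directly from the capture protocol in DESCRIPTION: four image sets (e2 to e5) of 60 images per vehicle, with e4.van reduced to 46. A short check of that arithmetic, where the variable names are chosen for illustration only:

```python
VEHICLES = ["opel", "saab", "bus", "van"]
IMAGE_SETS = ["e2", "e3", "e4", "e5"]
IMAGES_PER_SET = 60
# e4.van contains only 46 images, as stated in DESCRIPTION.
EXCEPTIONS = {("e4", "van"): 46}

per_class = {
    vehicle: sum(EXCEPTIONS.get((s, vehicle), IMAGES_PER_SET) for s in IMAGE_SETS)
    for vehicle in VEHICLES
}
total = sum(per_class.values())
held_out = 100  # examples kept by Strathclyde for validation

print(per_class)                # {'opel': 240, 'saab': 240, 'bus': 240, 'van': 226}
print(total, total - held_out)  # 946 846
```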

NUMBER OF ATTRIBUTES

             No. of atts. = 18