Passion Project #2

April 4, 2013
Heather Arthur | @harthvader | github/harthur

Machine Learning

in JavaScript

Machine Learning

Applications of ML

  • spam filtering
  • face detection
  • recommendations
  • character recognition
  • traditional programming: rules are chosen by hand
  • ML algorithms: rules are learned from data
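
To make the last two bullets concrete, here is a minimal sketch contrasting a hand-chosen rule with one learned from labelled examples. The spam feature (a count of exclamation marks), the example data, and the function names are all made up for illustration.

```javascript
// Hand-chosen rule: the programmer picks the threshold.
function isSpamByHand(exclamationCount) {
  return exclamationCount > 3;
}

// Learned rule: try each example value as a threshold and keep the one
// that misclassifies the fewest labelled examples.
function learnThreshold(examples) {
  var best = { threshold: 0, errors: Infinity };
  for (var i = 0; i < examples.length; i++) {
    var candidate = examples[i].value;
    var errors = 0;
    for (var j = 0; j < examples.length; j++) {
      var predicted = examples[j].value > candidate;
      if (predicted !== examples[j].isSpam) errors++;
    }
    if (errors < best.errors) best = { threshold: candidate, errors: errors };
  }
  return best.threshold;
}

// Hypothetical labelled data: value = number of exclamation marks.
var examples = [
  { value: 0, isSpam: false },
  { value: 1, isSpam: false },
  { value: 2, isSpam: false },
  { value: 5, isSpam: true },
  { value: 8, isSpam: true }
];

var learnedThreshold = learnThreshold(examples);
```

With this toy data the learned threshold separates the two groups without anyone choosing it by hand.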

JavaScript

Let's solve a problem

The Problem

  • Should I bother?...
  • Does it have cat pics??
  • ???

Cat Detection

Breaking it down

does this image contain a cat?

Breaking it down

is this section a cat head?

We need a classifier

  • Takes a piece of data, tells you which class it's in
  • Trained with known classifications
  • Bayesian (spam), k-nearest neighbors, neural networks, support vector machines
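
As a rough illustration of "trained with known classifications", here is a minimal nearest-neighbor classifier (k = 1), the simplest relative of the k-nearest neighbors approach listed above. The function names and toy data are assumptions for this sketch, not part of any library.

```javascript
// Squared Euclidean distance between two equal-length vectors.
function squaredDistance(a, b) {
  var sum = 0;
  for (var i = 0; i < a.length; i++) {
    var d = a[i] - b[i];
    sum += d * d;
  }
  return sum;
}

// "Training" a nearest-neighbor classifier is just remembering the examples.
function trainNearestNeighbor(inputs, labels) {
  return { inputs: inputs, labels: labels };
}

// Classify by returning the label of the closest remembered example.
function classify(model, input) {
  var bestIndex = 0, bestDistance = Infinity;
  for (var i = 0; i < model.inputs.length; i++) {
    var distance = squaredDistance(model.inputs[i], input);
    if (distance < bestDistance) {
      bestDistance = distance;
      bestIndex = i;
    }
  }
  return model.labels[bestIndex];
}

// Toy data: two "small" points and one "large" point.
var model = trainNearestNeighbor(
  [[0, 0], [1, 1], [9, 9]],
  ["small", "small", "large"]
);
var label = classify(model, [8, 7]);
```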

Support Vector Machines

  • trained with labelled data
  • each piece of data is a vector (array) of numbers
  • label/output is 1 if it's a cat, -1 if not
  • input is [?,?,?,...]

What's the input?

  • Pixels: too much information
  • Edges: leaves shape information

Histogram of Oriented Gradients

  • "HOG descriptor"
  • captures strength + direction of edges
  • Pixels -> array of #s (perfect)
  • 48x48 canvas -> HOG of length 1176
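
A simplified sketch of the idea behind the bullets above: measure the strength and direction of the edge at each pixel, then add the strengths into a small histogram of directions per cell. This is not the hog-descriptor package's implementation. The helper names are hypothetical, and the parameters (6-pixel cells, 2x2-cell blocks, block stride of 1, 6 bins) are an assumption, chosen because they are one combination that reproduces the length 1176.

```javascript
// Read a pixel from a grayscale image (array of rows), clamping coordinates
// so that border pixels reuse their nearest neighbor.
function pixelAt(gray, x, y) {
  var row = gray[Math.max(0, Math.min(gray.length - 1, y))];
  return row[Math.max(0, Math.min(row.length - 1, x))];
}

// Edge strength and unsigned direction (0..PI) at one pixel.
function gradientAt(gray, x, y) {
  var dx = pixelAt(gray, x + 1, y) - pixelAt(gray, x - 1, y);
  var dy = pixelAt(gray, x, y + 1) - pixelAt(gray, x, y - 1);
  var orientation = Math.atan2(dy, dx);
  if (orientation < 0) orientation += Math.PI;
  return { magnitude: Math.sqrt(dx * dx + dy * dy), orientation: orientation };
}

// Histogram of edge directions in one cell, weighted by edge strength.
function cellHistogram(gray, x0, y0, cellSize, bins) {
  var histogram = [];
  for (var b = 0; b < bins; b++) histogram.push(0);
  for (var y = y0; y < y0 + cellSize; y++) {
    for (var x = x0; x < x0 + cellSize; x++) {
      var g = gradientAt(gray, x, y);
      var bin = Math.min(bins - 1, Math.floor(g.orientation / Math.PI * bins));
      histogram[bin] += g.magnitude;
    }
  }
  return histogram;
}

// Descriptor length when cells are grouped into overlapping square blocks.
function descriptorLength(size, cellSize, blockSize, blockStride, bins) {
  var cells = Math.floor(size / cellSize);
  var blocksPerSide = Math.floor((cells - blockSize) / blockStride) + 1;
  return blocksPerSide * blocksPerSide * blockSize * blockSize * bins;
}

// A 6x6 image with one vertical edge: dark on the left, bright on the right.
var edgeImage = [];
for (var r = 0; r < 6; r++) edgeImage.push([0, 0, 0, 255, 255, 255]);

var histogram = cellHistogram(edgeImage, 0, 0, 6, 6); // all weight lands in bin 0
var length = descriptorLength(48, 6, 2, 1, 6);        // 7 * 7 * 2 * 2 * 6 = 1176
```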

Collection

  • negatives: 1000s of random non-cat crops
  • positives: 1000s of cat head crops
    • cat dataset with locations of ears, eyes, nose
    • rotate so eyes are level, crop to frame head
  • resize to 48x48
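
The "rotate so eyes are level" step comes down to a little geometry, sketched below. The point format ({ x, y }) and the function names are assumptions; the dataset's actual annotation format is not shown in these slides.

```javascript
// Angle (in radians) of the line from the left eye to the right eye.
function eyeAngle(leftEye, rightEye) {
  return Math.atan2(rightEye.y - leftEye.y, rightEye.x - leftEye.x);
}

// Rotate a point around a center by the given angle.
function rotatePoint(point, center, angle) {
  var cos = Math.cos(angle), sin = Math.sin(angle);
  var dx = point.x - center.x, dy = point.y - center.y;
  return {
    x: center.x + dx * cos - dy * sin,
    y: center.y + dx * sin + dy * cos
  };
}

// Rotate both eyes around their midpoint by the opposite of the eye angle,
// which puts them on the same horizontal line.
function levelEyes(leftEye, rightEye) {
  var center = {
    x: (leftEye.x + rightEye.x) / 2,
    y: (leftEye.y + rightEye.y) / 2
  };
  var angle = -eyeAngle(leftEye, rightEye);
  return {
    angle: angle,
    leftEye: rotatePoint(leftEye, center, angle),
    rightEye: rotatePoint(rightEye, center, angle)
  };
}

// Hypothetical eye positions on a tilted head.
var leveled = levelEyes({ x: 10, y: 20 }, { x: 30, y: 40 });
```

The same angle would then be applied to the whole image before cropping around the head.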

Training

var hog = require("hog-descriptor");
var svm = require("svm");

var SVM = new svm.SVM();

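// Train the SVM on an array of { canvas, isCat } objects.
function train(pics) {
  var inputs = [], labels = [];

  for (var i = 0; i < pics.length; i++) {
    // each canvas becomes a HOG descriptor (an array of numbers)
    inputs[i] = hog.extractHOG(pics[i].canvas);
    // label is 1 if it's a cat, -1 if not
    labels[i] = pics[i].isCat ? 1 : -1;
  }

  SVM.train(inputs, labels);
}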

What training does

SVM.toJSON()

{
  "N": 200,
  "D": 1176,
  "b": -1.9007738283614,
  "kernelType": "linear",
  "w": [-0.039986109425303, -0.10320175974183, 0.047456416498616, -0.059351675560768,
        0.058363533996746, -0.030757724103858, -0.077885096059827, -0.10909238884521,
        -0.027176815631789, 0.011213708955119, 0.11097928973275, 0.02171080491539,
        0.087237563435941, -0.011851994166769, -0.083436869138139, 0.03666910365707,
        -0.04060984443935, 0.018076869381974, -0.030717675854092, -0.053448579335487,
        ...]
}
Training finds the best values for b and w (above). How, you ask?
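
How the search for those values works is not covered here, but what a linear model does with them can be sketched: take the dot product of w with the input, add b, and look at the sign. This is the standard linear SVM decision rule, not code taken from the svm package. The toy model and its weights are made up.

```javascript
// Margin of a linear model: dot(w, x) + b.
function margin(model, x) {
  var sum = model.b;
  for (var i = 0; i < model.w.length; i++) {
    sum += model.w[i] * x[i];
  }
  return sum;
}

// Positive margin means class 1, otherwise class -1.
function predict(model, x) {
  return margin(model, x) > 0 ? 1 : -1;
}

// Made-up two-dimensional model, for illustration only.
var toyModel = { w: [2, -1], b: -0.5 };

var first = predict(toyModel, [1, 0]);  // margin is 2 - 0.5 = 1.5
var second = predict(toyModel, [0, 1]); // margin is -1 - 0.5 = -1.5
```

For the cat detector, x would be the 1176-number HOG descriptor and w would hold one weight per descriptor entry.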

Is this section a cat head?

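// Classify one canvas section: true if the SVM labels it a cat head.
function isCat(canvas) {
  var descriptor = hog.extractHOG(canvas);

  // the predicted label is 1 for a cat, -1 for not a cat
  var result = SVM.predictOne(descriptor);
  return result === 1;
}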

Does this image contain a cat?

  • Test "windows" at different scales, locations
  • Combine overlapping detections
  • Weed out spurious detections
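
A minimal sketch of the first two bullets: generate square windows at several scales and positions, and measure how much two windows overlap so that nearby detections can be grouped. The scale step, stride, and function names are assumptions; the detector's real parameters are not given in these slides.

```javascript
// Square windows covering a width x height image at growing sizes.
// scaleStep must be greater than 1; strideFraction is the step between
// windows as a fraction of the window size.
function slidingWindows(width, height, minSize, scaleStep, strideFraction) {
  if (scaleStep <= 1) throw new Error("scaleStep must be greater than 1");
  var windows = [];
  var maxSize = Math.min(width, height);
  for (var size = minSize; size <= maxSize; size = Math.ceil(size * scaleStep)) {
    var stride = Math.max(1, Math.floor(size * strideFraction));
    for (var y = 0; y + size <= height; y += stride) {
      for (var x = 0; x + size <= width; x += stride) {
        windows.push({ x: x, y: y, size: size });
      }
    }
  }
  return windows;
}

// Intersection area divided by union area of two square windows (0 to 1).
function overlapRatio(a, b) {
  var x1 = Math.max(a.x, b.x), y1 = Math.max(a.y, b.y);
  var x2 = Math.min(a.x + a.size, b.x + b.size);
  var y2 = Math.min(a.y + a.size, b.y + b.size);
  var intersection = Math.max(0, x2 - x1) * Math.max(0, y2 - y1);
  var union = a.size * a.size + b.size * b.size - intersection;
  return intersection / union;
}

// A 96x96 image: nine 48px windows plus one 96px window.
var windows = slidingWindows(96, 96, 48, 2, 0.5);
var overlap = overlapRatio(windows[0], windows[1]);
```

Each window would be resized to 48x48 and tested with isCat; windows whose overlap ratio is high would count as the same detection.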

Demo

harthur.github.com/kittydar

ML in JS

  • node.js - collection, training
  • Typed arrays - 2x faster training
  • Web Workers - don't block page
  • Still need: building blocks, more speed
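
For the typed arrays bullet, a small sketch of what the change looks like: store each vector as a Float64Array instead of a plain array, and leave the numeric loops as they are. The function names are hypothetical, and the sketch makes no speed claim of its own.

```javascript
// Convert plain arrays of numbers into Float64Array rows.
function toTypedRows(rows) {
  return rows.map(function (row) {
    return new Float64Array(row);
  });
}

// Dot product; works the same on typed and plain arrays.
function dot(a, b) {
  var sum = 0;
  for (var i = 0; i < a.length; i++) {
    sum += a[i] * b[i];
  }
  return sum;
}

var typedRows = toTypedRows([[1, 2, 3], [4, 5, 6]]);
var product = dot(typedRows[0], typedRows[1]); // 4 + 10 + 18
```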

ML in JS

DIY