Machine Learning for JavaScript Hackers
    
    Machine Learning
    Solves tasks that people are good at,
       but traditional computation is bad at.
    Examples
    
      - face detection
      - spam filtering
      - recommendation systems
      - character recognition
    
 
    Traditional computing:
      Hand-chosen conditionals and parameters.
    ML algorithms:
      Rules are learned from data.
  Training
  
    - Learn rules by training on labelled data
    - Same algorithm, different data
    - Generic = power
 
  
    The problem
    
      Should I bother?...
      Does it have any cat pics?
      
        
???
      
 
     
    Breaking it down
    does this image contain a cat?
    
    Breaking it down
    is this subsection a cat head?
    
    We need a classifier
    
      - Takes a piece of data and tells you which class it belongs to.
      - Trained on examples with known classifications.
      - Examples: Bayesian (spam), k-nearest neighbors, support vector machines,
        neural networks.
    
 
    neural networks
    
      
        function runNetwork(input) {
          var net = {"layers":[{"0":{},"1":{}},{"0":{"bias":5.1244897557632765,"weights":{"0":-3.591317000303657,"1":-3.594502936141513}},"1":{"bias":1.4480619514263766,"weights":{"0":-5.021099423700753,"1":-5.055736046304716}},"2":{"bias":0.655017127607016,"weights":{"0":-3.9842614825641096,"1":-4.020357237374914}}},{"0":{"bias":-3.093322979654723,"weights":{"0":7.328941033927063,"1":-5.699647431673055,"2":-3.879799253666414}}}]};
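          // Feed the input forward one layer at a time (layer 0 is the input layer).
          for (var i = 1; i < net.layers.length; i++) {
            var layer = net.layers[i];
            var output = {};
            for (var id in layer) {
              var node = layer[id];
              // Weighted sum of the previous layer's outputs, plus the bias.
              var sum = node.bias;
              for (var iid in node.weights) {
                sum += node.weights[iid] * input[iid];
              }
              // Sigmoid activation squashes the sum into the range (0, 1).
              output[id] = (1 / (1 + Math.exp(-sum)));
            }
            input = output;
          }
          return output;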
        }
      
    
    
      I won't go into how it works, but these floating-point values are learned through training.
      Training is an optimization problem: find the floating-point values that minimize the error on the training set.
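
To make "find the values that minimize the error" concrete, here is a minimal sketch that trains a single sigmoid node by gradient descent. It is not how any particular library trains. The names `trainNeuron`, `meanError` and `learningRate`, the iteration count, and the toy logical-OR data are all assumptions chosen for illustration.

```javascript
// Hypothetical sketch: one sigmoid node trained by gradient descent.
function sigmoid(x) {
  return 1 / (1 + Math.exp(-x));
}

// Same computation as one node in runNetwork: bias plus weighted inputs.
function predict(node, input) {
  var sum = node.bias;
  for (var i = 0; i < input.length; i++) {
    sum += node.weights[i] * input[i];
  }
  return sigmoid(sum);
}

// Mean squared error over the labelled training set.
function meanError(node, data) {
  var total = 0;
  data.forEach(function(example) {
    var diff = example.output - predict(node, example.input);
    total += diff * diff;
  });
  return total / data.length;
}

// Nudge bias and weights downhill on the squared error, one example at a time.
function trainNeuron(data, iterations, learningRate) {
  var node = { bias: 0, weights: data[0].input.map(function() { return 0; }) };
  for (var n = 0; n < iterations; n++) {
    data.forEach(function(example) {
      var out = predict(node, example.input);
      // Error term scaled by the slope of the sigmoid at this output.
      var delta = (example.output - out) * out * (1 - out);
      node.bias += learningRate * delta;
      for (var i = 0; i < node.weights.length; i++) {
        node.weights[i] += learningRate * delta * example.input[i];
      }
    });
  }
  return node;
}

// Toy labelled data (logical OR), used only for illustration.
var data = [
  { input: [0, 0], output: 0 },
  { input: [0, 1], output: 1 },
  { input: [1, 0], output: 1 },
  { input: [1, 1], output: 1 }
];
var untrained = { bias: 0, weights: [0, 0] };
var trained = trainNeuron(data, 2000, 0.5);
console.log(meanError(untrained, data), meanError(trained, data));
```

The second number printed should be smaller than the first: training has moved the bias and weights toward values with lower error on the training set.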
 
  input and output
  
    
      net.train([
        { input: [0.7, 0.1, 0.3], output: [1] },
        { input: [1.0, 0.8, 0.7], output: [0] },
        { input: [0.5, 0.6, 0.7], output: [0] }
      ]);
      var output = net.run([0.5, 0.5, 0.6]);   // [0.001]
    
  
    
      - output is [1] if cat, [0] if not
      - input is [?, ?, ?, ...]
     
    What is the input?
    
      
      too much variation
      
      leaves shape information
     
    We want to make it easy on the network by giving it data
      it can get a better handle on: data that doesn't vary too much.
     
  Histogram of Oriented Gradients
  
    - Captures strength + direction of edges
 
    - Pixels → array of numbers from 0 to 1 (perfect!)
    48x48 canvas → HOG of length 1176
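
A rough sketch of the two ideas above: the strength and direction of an edge at one pixel, and where a descriptor length like 1176 can come from. `gradientAt` and `hogLength` are hypothetical helpers, not part of any library. The parameters passed to `hogLength` (6px cells, 2x2-cell blocks, a block stride of one cell, 6 orientation bins) are an assumption: the slides do not state them, but this combination does reproduce 1176 for a 48x48 canvas.

```javascript
// Hypothetical helper: edge strength and direction at one interior pixel
// of a grayscale image stored as rows of numbers in [0, 1].
function gradientAt(gray, x, y) {
  var gx = gray[y][x + 1] - gray[y][x - 1]; // horizontal change
  var gy = gray[y + 1][x] - gray[y - 1][x]; // vertical change
  return {
    magnitude: Math.sqrt(gx * gx + gy * gy),
    orientation: Math.atan2(gy, gx)
  };
}

// Hypothetical helper: length of a block-normalized HOG descriptor
// for a square image.
function hogLength(size, cellSize, blockSize, blockStride, bins) {
  var cellsPerSide = Math.floor(size / cellSize);
  var blocksPerSide = Math.floor((cellsPerSide - blockSize) / blockStride) + 1;
  var valuesPerBlock = blockSize * blockSize * bins;
  return blocksPerSide * blocksPerSide * valuesPerBlock;
}

// A 3x3 image with a vertical edge on the right-hand side.
var gray = [
  [0, 0, 1],
  [0, 0, 1],
  [0, 0, 1]
];
console.log(gradientAt(gray, 1, 1)); // { magnitude: 1, orientation: 0 }

// Assumed parameters: 8x8 cells, 7x7 blocks, 2*2*6 = 24 values per block.
console.log(hogLength(48, 6, 2, 1, 6)); // 1176
```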
  
  Collection
  
    - negatives: 1000s of cat-free crops
    - positives: 1000s of cat head crops
    - resize to 48x48
    - node-canvas
 
  
  Training
  
    
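    var hog = require("hog-descriptor");
    var brain = require("brain");
    // Turn each labelled picture into a training example:
    // HOG features as the input, [1] for cat or [0] for not-cat as the output.
    var data = pics.map(function(pic) {
      return {
        input: hog.extractHOG(pic.canvas),
        output: [pic.cat ? 1 : 0]
      };
    });
    var net = new brain.NeuralNetwork();
    net.train(data);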
    
  
  Is this subsection a cat head?
  
    
    function isCat(canvas) {
      var features = hog.extractHOG(canvas);
      // net.run returns an array with one value per output node.
      var prob = net.run(features)[0];
      return prob > 0.9;
    }
    
  
  Does this image contain a cat?
  
    - Test "windows" at different scales, locations
    - Combine overlapping detections
    - Weed out spurious detections
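
The steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the demo's actual code: `slidingWindows`, `overlapRatio` and `combineOverlapping` are hypothetical names, the scale step and stride are arbitrary, and the combining step is a plain greedy non-maximum suppression, which may differ from what the demo does.

```javascript
// Hypothetical sketch: enumerate square windows at several scales and locations.
function slidingWindows(width, height, minSize, scaleStep, strideRatio) {
  var windows = [];
  var size = minSize;
  while (size <= Math.min(width, height)) {
    var stride = Math.max(1, Math.round(size * strideRatio));
    for (var y = 0; y + size <= height; y += stride) {
      for (var x = 0; x + size <= width; x += stride) {
        windows.push({ x: x, y: y, size: size });
      }
    }
    // Always grow by at least one pixel so the loop terminates.
    size = Math.max(size + 1, Math.round(size * scaleStep));
  }
  return windows;
}

// Intersection area divided by union area of two square windows.
function overlapRatio(a, b) {
  var w = Math.min(a.x + a.size, b.x + b.size) - Math.max(a.x, b.x);
  var h = Math.min(a.y + a.size, b.y + b.size) - Math.max(a.y, b.y);
  if (w <= 0 || h <= 0) return 0;
  var inter = w * h;
  return inter / (a.size * a.size + b.size * b.size - inter);
}

// Greedy non-maximum suppression: keep the most confident detection,
// drop weaker ones that overlap something already kept.
function combineOverlapping(detections, maxOverlap) {
  var sorted = detections.slice().sort(function(a, b) { return b.prob - a.prob; });
  var kept = [];
  sorted.forEach(function(d) {
    var overlaps = kept.some(function(k) { return overlapRatio(k, d) > maxOverlap; });
    if (!overlaps) kept.push(d);
  });
  return kept;
}

var windows = slidingWindows(96, 96, 48, 2, 0.5);
console.log(windows.length); // 10: nine 48px windows plus one 96px window

var detections = [
  { x: 0, y: 0, size: 48, prob: 0.95 },
  { x: 4, y: 4, size: 48, prob: 0.92 },
  { x: 48, y: 48, size: 48, prob: 0.91 }
];
console.log(combineOverlapping(detections, 0.3).length); // 2
```

In a real detector each window would be resized to 48x48 and passed to a classifier like `isCat`. Requiring several overlapping hits before accepting a detection is one possible way to weed out spurious ones.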
  
 
  Demo
  http://harthur.github.com/kittydar
  Not As Fun
  
    - Weak typing / implicit conversions
      - NaN from 0/0 or undefined + 7
      - worse: Math.abs(null) = 0, silent destruction
    - Speed: not that bad
    - Lack of building blocks
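
The implicit conversions listed above are easy to reproduce, and one way to catch them early is to validate numeric arrays before they reach training code. `assertNumbers` below is a hypothetical helper, not part of any library mentioned here.

```javascript
// The pitfalls from the slide.
var a = 0 / 0;          // NaN
var b = undefined + 7;  // NaN
var c = Math.abs(null); // 0: null is silently converted to a number

// Hypothetical guard: throw as soon as a non-number or non-finite value appears,
// instead of letting it spread through later arithmetic.
function assertNumbers(values) {
  for (var i = 0; i < values.length; i++) {
    if (typeof values[i] !== "number" || !isFinite(values[i])) {
      throw new TypeError("Bad value at index " + i + ": " + values[i]);
    }
  }
  return values;
}

assertNumbers([0.5, 0.1, 1]); // passes
// assertNumbers([0.5, null]) would throw a TypeError
```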