deep learning – THE HYPERTEXT
http://www.thehypertext.com

Netflix for Robots
http://www.thehypertext.com/2015/12/10/netflix-for-robots/
Thu, 10 Dec 2015 06:08:24 +0000

For my final project in Learning Machines, I forced a deep learning machine to watch every episode of The X-Files.

Watching every episode of The X-Files in high school, on Netflix DVDs that came in the mail (remember those?), seemed like the thing to do. It was a great show, with nine seasons of 20+ episodes apiece. So it only seemed fair to provide a robot friend with the same experience.

I’m currently running NeuralTalk2, a truly wonderful piece of open source image captioning software built from convolutional and recurrent neural networks. It requires a GPU to train models, so I’m running it on an Amazon Web Services GPU server instance. At ~50 cents per hour, it’s a lot more expensive than Netflix.

Andrej Karpathy wrote NeuralTalk2 in Torch, which is based on Lua, and it has a lot of dependencies. Even so, it was a lot easier to set up than the Deep Dream code I experimented with over the summer.

Training has involved a lot of trial and error. The learning process sometimes just stalls, and the model often wants to issue the same caption for every image.

Rather than training the machine on a standard image caption dataset, I trained it on dialogue from the subtitles, matched with frames extracted at 10-second intervals from every episode of The X-Files. This is just an experiment, and I’m not expecting stellar results.
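
To give a feel for the data preparation, here’s a minimal sketch, not my actual pipeline: it reads the cue times out of an .srt subtitle file, grabs one frame per cue with ffmpeg (rather than at fixed 10-second intervals), and writes an image/caption list. The file names and the output JSON layout are assumptions; NeuralTalk2’s own preprocessing expects its specific format, so this would need adapting before training.

# Rough sketch only: pair subtitle cues with video frames grabbed by ffmpeg.
# Assumes ffmpeg is on the PATH and subtitles are plain .srt files; the JSON
# layout below is an approximation, not NeuralTalk2's exact input format.
import json
import os
import re
import subprocess

SRT_TIMESTAMP = re.compile(r'(\d+):(\d+):(\d+),\d+ -->')

def srt_cues(srt_path):
    """Yield (seconds, dialogue) for each cue in an .srt subtitle file."""
    with open(srt_path) as f:
        for block in f.read().split('\n\n'):
            lines = [l.strip() for l in block.strip().splitlines() if l.strip()]
            if len(lines) < 3:
                continue
            m = SRT_TIMESTAMP.match(lines[1])
            if not m:
                continue
            hours, mins, secs = [int(x) for x in m.groups()]
            yield hours * 3600 + mins * 60 + secs, ' '.join(lines[2:])

def grab_frame(video_path, seconds, out_path):
    """Extract a single frame at the given offset using ffmpeg."""
    subprocess.check_call([
        'ffmpeg', '-loglevel', 'error', '-ss', str(seconds),
        '-i', video_path, '-frames:v', '1', '-y', out_path
    ])

def build_dataset(video_path, srt_path, out_dir):
    """Write one frame per subtitle cue plus a JSON list of caption pairs."""
    entries = []
    for i, (seconds, dialogue) in enumerate(srt_cues(srt_path)):
        frame_path = os.path.join(out_dir, 'frame_%06d.jpg' % i)
        grab_frame(video_path, seconds, frame_path)
        entries.append({'file_path': frame_path, 'captions': [dialogue]})
    with open(os.path.join(out_dir, 'dataset.json'), 'w') as f:
        json.dump(entries, f, indent=2)

# Example (hypothetical filenames):
# build_dataset('xfiles_s01e01.mkv', 'xfiles_s01e01.srt', 'frames')

From there, NeuralTalk2’s own preprocessing takes over and turns a list like this into the files the trainer consumes.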

That said, the robot is already spitting out some pretty weird and genuinely creepy lines. I can’t wait until I have a version that’s trained well enough to feed in new images and get varied results.

[Screenshots: sample captions generated by the model during training]

Sound Camera, Part III
http://www.thehypertext.com/2015/10/21/sound-camera-part-iii/
Wed, 21 Oct 2015 22:10:27 +0000

I completed the physical prototype of the sound camera inside the enclosure I specified in my prior post, the Kodak Brownie Model 2.


[Photo: IMG_1264]

I started by adding a shutter button to the top of the enclosure. I used a Cherry MX Blue mechanical keyboard switch that I had left over from a project last year.

[Photo: IMG_1268]

The battery and Raspberry Pi just barely fit into the enclosure:

[Photos: IMG_1267, IMG_1265]

The Raspberry Pi camera module is wedged snugly beneath the camera’s front plate:

[Photo: IMG_1263]

In addition to playing the song, I added some functionality that gives the user a bit of context. Using the pico2wave text-to-speech utility, the camera speaks the tags aloud before playing the song. And using SoX, the camera plays an initialization tone generated from the color histogram of the image before reading the tags.

Here’s the code that’s currently running on the Raspberry Pi:

from __future__ import unicode_literals

import os
import json
import uuid
import time
from random import choice as rc
from random import sample as rs
import re
import subprocess

import RPi.GPIO as GPIO
import picamera
from clarifai.client import ClarifaiApi
import requests
from PIL import Image

import sys
import threading

import spotify

import genius_token

# SPOTIFY STUFF

# Assuming a spotify_appkey.key in the current dir
session = spotify.Session()

# Process events in the background
loop = spotify.EventLoop(session)
loop.start()

# Connect an audio sink
audio = spotify.AlsaSink(session)

# Events for coordination
logged_in = threading.Event()
logged_out = threading.Event()
end_of_track = threading.Event()

logged_out.set()


def on_connection_state_updated(session):
    if session.connection.state is spotify.ConnectionState.LOGGED_IN:
        logged_in.set()
        logged_out.clear()
    elif session.connection.state is spotify.ConnectionState.LOGGED_OUT:
        logged_in.clear()
        logged_out.set()


def on_end_of_track(self):
    end_of_track.set()

# Register event listeners
session.on(
    spotify.SessionEvent.CONNECTION_STATE_UPDATED, on_connection_state_updated)
session.on(spotify.SessionEvent.END_OF_TRACK, on_end_of_track)

# Assuming a previous login with remember_me=True and a proper logout
# session.relogin()
# session.login(genius_token.spotify_un, genius_token.spotify_pwd, remember_me=True)

# logged_in.wait()

# CAMERA STUFF

# Init Camera
camera = picamera.PiCamera()

# Init GPIO
GPIO.setmode(GPIO.BCM)

# Button Pin
GPIO.setup(18, GPIO.IN, pull_up_down=GPIO.PUD_UP)

IMGPATH = '/home/pi/soundcamera/img/'

clarifai_api = ClarifaiApi()

def chunks(l, n):
    """Yield successive n-sized chunks from l."""
    for i in xrange(0, len(l), n):
        yield l[i:i+n]

def take_photo():
    fn = str(int(time.time()))+'.jpg' # TODO: Change to timestamp hash
    fp = IMGPATH+fn
    camera.capture(fp)
    return fp

def get_tags(fp):
    fileObj = open(fp)
    result = clarifai_api.tag_images(fileObj)
    resultObj = result['results'][0]
    tags = resultObj['result']['tag']['classes']
    return tags

def genius_search(tags):
    access_token = genius_token.token
    payload = {
        'q': ' '.join(tags),
        'access_token': access_token
    }
    endpt = 'http://api.genius.com/search'
    response = requests.get(endpt, params=payload)
    results = response.json()
    hits = results['response']['hits']
    
    artists_titles = []
    
    for h in hits:
        hit_result = h['result']
        if hit_result['url'].endswith('lyrics'):
            artists_titles.append(
                (hit_result['primary_artist']['name'], hit_result['title'])
            )
    
    return artists_titles

def spotify_search(query):
    endpt = "https://api.spotify.com/v1/search"
    payload = {
        'q': query,
        'type': 'track'
    }
    response = requests.get(endpt, params=payload)
    result = response.json()
    result_zero = result['tracks']['items'][0]
    
    return result_zero['uri']

def main(fn):
    tags = get_tags(fn)
    for tag_chunk in chunks(tags,3):
        artists_titles = genius_search(tag_chunk)
        for artist, title in artists_titles:
            try:
                result_uri = spotify_search(artist+' '+title)
            except IndexError:
                pass
            else:
                print tag_chunk
                byline = "%s by %s" % (title, artist)
                print byline
                to_read = ', '.join(tag_chunk) + ". " + byline
                return to_read, result_uri

def play_uri(track_uri):
    # Play a track
    # audio = spotify.AlsaSink(session)
    session.login(genius_token.spotify_un, genius_token.spotify_pwd, remember_me=True)
    logged_in.wait()
    track = session.get_track(track_uri).load()
    session.player.load(track)
    session.player.play()


def stop_track():
    session.player.play(False)
    session.player.unload()
    session.logout()
    logged_out.wait()
    audio._close()

def talk(msg):
    proc = subprocess.Popen(
        ['bash', '/home/pi/soundcamera/play_text.sh', msg]
    )
    proc.communicate()

def play_tone(freqs):
    freq1, freq2 = freqs
    proc = subprocess.Popen(
        ['play', '-n', 'synth', '0.25', 'saw', "%i-%i" % (freq1, freq2)]
    )
    proc.communicate()

def histo_tone(fp):
    im = Image.open(fp)
    hist = im.histogram()
    vals = map(sum, chunks(hist, 64)) # list of 12 values
    print vals
    map(play_tone, chunks(vals,2))

if __name__ == "__main__":
    input_state = True
    new_state = True
    hold_counter = 0
    while 1:
        input_state = GPIO.input(18)
        if not (input_state and new_state):
            talk("capturing")

            # Hold for 15 seconds to turn off
            while not GPIO.input(18):
                time.sleep(0.1)
                hold_counter += 1
                if hold_counter > 150:
                    os.system('shutdown -h now')
                    sys.exit()

            # Reset hold counter
            hold_counter = 0

            # Else take photo
            try:
                img_fp = take_photo()
                msg, uri = main(img_fp)
                histo_tone(img_fp)
                talk(msg)
                play_uri(uri)
            except:
                print sys.exc_info()

            # Wait for playback to complete or Ctrl+C
            try:
                while not end_of_track.wait(0.1):
                    # If new photo, play new song
                    new_state = GPIO.input(18)
                    if not new_state:
                        stop_track()
                        # time.sleep(2)
                        break
            except KeyboardInterrupt:
                pass

 

Candidate Image Explorer
http://www.thehypertext.com/2015/09/17/candidate-image-explorer/
Thu, 17 Sep 2015 15:53:26 +0000

For this week’s homework in Designing for Data Personalization with Sam Slover, I made progress on a project that I’m working on for Fusion as part of their 2016 US Presidential Election coverage. I began by downloading all the images from each candidate’s Twitter, Facebook, and Instagram accounts (about 60,000 in total), then running those images through Clarifai’s convolutional neural networks to generate descriptive tags.
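
The tagging pass itself reuses the same clarifai.client interface as the sound camera code above. Here’s a minimal sketch of how a directory of downloaded images might be run through it; the directory layout, filenames, and output format are assumptions, and the API credentials are expected to be configured however the client requires.

# Minimal sketch of the tagging pass (assumed directory layout, not the
# production pipeline): run each downloaded image through Clarifai's tag
# endpoint and collect the resulting tags for later upload.
import os

from clarifai.client import ClarifaiApi

clarifai_api = ClarifaiApi()  # assumes API credentials are already configured

def tag_directory(img_dir):
    """Return a list of {filename, tags} records for every image in img_dir."""
    records = []
    for fn in sorted(os.listdir(img_dir)):
        if not fn.lower().endswith(('.jpg', '.jpeg', '.png')):
            continue
        with open(os.path.join(img_dir, fn), 'rb') as img:
            result = clarifai_api.tag_images(img)
        tags = result['results'][0]['result']['tag']['classes']
        records.append({'filename': fn, 'tags': tags})
    return records

# Example (hypothetical path):
# import json
# with open('clinton_tags.json', 'w') as f:
#     json.dump(tag_directory('images/hillary-clinton'), f, indent=2)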

With all the images hosted on Amazon S3 and the tag data hosted on parse.com, I created a simple page where users can explore the candidates’ images by topic and by candidate. The default view shows all topics and all candidates, but users can narrow the selection by making multiple choices from each field. Additionally, more images load as you scroll down the page.

[Screenshots: the explorer page filtered by various candidates and topics]

Unfortunately, the AI-enabled image tagging doesn’t always work as well as one might hope.

[Screenshot: an example of inaccurate tagging]

Here’s the page’s JavaScript code:

var name2slug = {};
var slug2name = {};

Array.prototype.remove = function() {
    var what, a = arguments, L = a.length, ax;
    while (L && this.length) {
        what = a[--L];
        while ((ax = this.indexOf(what)) !== -1) {
            this.splice(ax, 1);
        }
    }
    return this;
}

Array.prototype.chunk = function(chunkSize) {
    var array=this;
    return [].concat.apply([],
        array.map(function(elem,i) {
            return i%chunkSize ? [] : [array.slice(i,i+chunkSize)];
        })
    );
}

function dateFromString(str) {
	var m = str.match(/(\d+)-(\d+)-(\d+)T(\d+):(\d+):(\d+)Z/);
	var date = new Date(Date.UTC(+m[1], +m[2] - 1, +m[3], +m[4], +m[5], +m[6])); // JS months are zero-indexed
	var options = {
	    weekday: "long", year: "numeric", month: "short",
	    day: "numeric", hour: "2-digit", minute: "2-digit"
	};
	return date.toLocaleTimeString("en-us", options);
}

function updatePhotos(query) {
	$.ajax({
		url: 'https://api.parse.com/1/classes/all_photos?limit=1000&where='+JSON.stringify(query),
		type: 'GET',
		dataType: 'json',
		success: function(response) {
			// console.log(response);
			$('#img-container').empty();

			var curChunk = 0;
			var resultChunks = response['results'].chunk(30);

			function appendPhotos(chunkNo) {

				// Stop once every chunk of results has been rendered
				if (chunkNo >= resultChunks.length) { return; }

				resultChunks[chunkNo].map(function(obj){
					var date = dateFromString(obj['datetime'])
					var imgUrl = "https://s3-us-west-2.amazonaws.com/electionscrape/" + obj['source'] + "/400px_" + obj['filename'];
					var fullImgUrl = "https://s3-us-west-2.amazonaws.com/electionscrape/" + obj['source'] + "/" + obj['filename'];
					$('#img-container').append(
						$('<div class=\"grid-item\"></div>').append(
							'<a href=\"'+fullImgUrl+'\"><img src=\"'+imgUrl+'\" width=\"280px\"></a><p>'+slug2name[obj['candidate']]+'</p><p>'+date+'</p><p>'+obj['source']+'</p>'
						) // not a missing semicolon
					);
					// console.log(obj['candidate']);
					// console.log(obj['datetime']);
					// console.log(obj['source']);
					// console.log(obj['filename']);
				});

			}

			appendPhotos(curChunk);

			window.onscroll = function(ev) {
			    if ((window.innerHeight + window.scrollY) >= document.body.offsetHeight) {
			        curChunk++;
			        appendPhotos(curChunk);
			    }
			};


		},
		error: function(response) { console.log("error", response); },
		beforeSend: setHeader
	});
}

function setHeader(xhr) {
	xhr.setRequestHeader("X-Parse-Application-Id", "ID-GOES-HERE");
	xhr.setRequestHeader("X-Parse-REST-API-Key", "KEY-GOES-HERE");
}

function makeQuery(candArr, tagArr) {

	var orArr = tagArr.map(function(tag){
		return { "tags": tag };
	})

	if (tagArr.length === 0 && candArr.length > 0) {
		var query = {
			'candidate': {"$in": candArr}
		};
	}
	else if (tagArr.length > 0 && candArr.length === 0) {
		var query = {
			'$or': orArr
		};
	}
	else if (tagArr.length === 0 && candArr.length === 0) {
		var query = {};
	}
	else {
		var query = {
			'candidate': {"$in": candArr},
			'$or': orArr
		};
	}

	updatePhotos(query);

}

(function(){

$('.grid').masonry({
  // options
  itemSelector: '.grid-item',
  columnWidth: 300
});

var selectedCandidates = [];
var selectedTags = [];

$.getJSON("data/candidates.json", function(data){
	var candNames = Object.keys(data).map(function(slug){
		var name = data[slug]['name'];
		name2slug[name] = slug;
		slug2name[slug] = name;
		return name;
	}).sort();

	candNames.map(function(name){
		$('#candidate-dropdown').append(
			'<li class=\"candidate-item\"><a href=\"#\">'+name+'</a></li>'
		);
	});

	$('.candidate-item').click(function(){
		var name = $(this).text();
		var slug = name2slug[name];
		if ($.inArray(slug, selectedCandidates) === -1) {
			selectedCandidates.push(slug);
			makeQuery(selectedCandidates, selectedTags);
			console.log(selectedCandidates);
			$('#selected-candidates').append(
				$('<button class=\"btn btn-danger btn-xs cand-select-btn\"><span class=\"glyphicon glyphicon-remove\" aria-hidden=\"true\"></span>'+name+'</button>')
					.click(function(){
						$(this).fadeOut("fast", function(){
							selectedCandidates.remove(name2slug[$(this).text()]);
							makeQuery(selectedCandidates, selectedTags);
							console.log(selectedCandidates);
						});
					}) // THIS IS NOT A MISSING SEMI-COLON
			);
		}
	});
});


$.getJSON("data/tags.json", function(data){
	var tags = data["tags"].sort();
	tags.map(function(tag){
		$('#tag-dropdown').append(
			'<li class=\"tag-item\"><a href=\"#\">'+tag+'</a></li>'
		);
	});

	$('.tag-item').click(function(){
		var tag = $(this).text();
		if ($.inArray(tag, selectedTags) === -1) {
			selectedTags.push(tag);
			makeQuery(selectedCandidates, selectedTags);
			console.log(selectedTags);
			$('#selected-tags').append(
				$('<button class=\"btn btn-primary btn-xs tag-select-btn\"><span class=\"glyphicon glyphicon-remove\" aria-hidden=\"true\"></span>'+tag+'</button>')
					.click(function(){
						$(this).fadeOut("fast", function(){
							selectedTags.remove($(this).text());
							makeQuery(selectedCandidates, selectedTags);
							console.log(selectedTags);
						});
					})
			);
		}
	});
});

makeQuery(selectedCandidates, selectedTags);

})();

 

 
