sound camera – THE HYPERTEXT
http://www.thehypertext.com

word.camera exhibition
http://www.thehypertext.com/2015/11/24/word-camera-exhibition/
Tue, 24 Nov 2015 19:58:34 +0000


This week, I’ve been exhibiting my ongoing project, word.camera, at IDFA DocLab in Amsterdam. My installation consists of four cameras:

  1. The original word.camera physical prototype
  2. The sound camera physical prototype
  3. A new word.camera model that uses a context-free grammar to generate poems based on the images it captures (see the sketch after this list)
  4. A talking, pan-tilt-zoom surveillance camera that looks for faces in the hallway and then describes them aloud. (See also: this Motherboard video)
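
For context, here is a minimal sketch of how a context-free grammar can turn image tags into lines of a poem, as in camera 3 above. It is illustrative only, not the grammar running in the exhibited camera; the rules and the Clarifai-style example tags are my own assumptions.

import random

# Toy grammar: nonterminals expand to sequences of terminals and nonterminals.
# 'TAG' is filled from the image tags; everything else is invented for illustration.
GRAMMAR = {
    'LINE': [['the', 'TAG', 'VERB', 'like', 'a', 'TAG'],
             ['a', 'TAG', 'of', 'TAG', 'and', 'TAG']],
    'VERB': [['waits'], ['dissolves'], ['hums']],
}

def expand(symbol, tags):
    """Recursively expand a grammar symbol into a list of words."""
    if symbol == 'TAG':
        return [random.choice(tags)]
    if symbol not in GRAMMAR:
        return [symbol]  # terminal word
    words = []
    for sym in random.choice(GRAMMAR[symbol]):
        words.extend(expand(sym, tags))
    return words

def poem_from_tags(tags, lines=3):
    return '\n'.join(' '.join(expand('LINE', tags)) for _ in range(lines))

print(poem_from_tags(['canal', 'bicycle', 'light']))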

 

During the exhibition, I was also invited to deliver two lectures; I shared my slides from the first and a video of the second.

 

Visitors can reserve the portable cameras for half-hour blocks by leaving their ID at the volunteer kiosk. I have really enjoyed watching people borrow and use my cameras.


 

Sound Camera, Part III
http://www.thehypertext.com/2015/10/21/sound-camera-part-iii/
Wed, 21 Oct 2015 22:10:27 +0000

I completed the physical prototype of the sound camera inside the enclosure I specified in my prior post, the Kodak Brownie Model 2.



I started by adding a shutter button to the top of the enclosure. I used a Cherry MX Blue mechanical keyboard switch that I had left over from a project last year.

 

The battery and Raspberry Pi just barely fit into the enclosure.

 

The Raspberry Pi camera module is wedged snugly beneath the camera’s front plate.

 

In addition to playing the song, I added some functionality that provides a bit of context to the user. Using the pico2wave text-to-speech utility, the camera speaks the tags aloud before playing the song. Additionally, using SoX, the camera plays an initialization tone generated from the color histogram of the image before reading the tags.
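
The play_text.sh helper referenced in the code below isn't included in this post. As a rough sketch of what such a script does, here is a minimal Python equivalent, assuming pico2wave and aplay are installed on the Pi and that /tmp/speech.wav is an acceptable scratch path:

import subprocess

def speak(text, wav_path='/tmp/speech.wav'):
    # Render the text to a WAV file with SVOX Pico, then play it through ALSA.
    subprocess.call(['pico2wave', '-w', wav_path, text])
    subprocess.call(['aplay', wav_path])

speak('capturing')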

Here’s the code that’s currently running on the Raspberry Pi:

from __future__ import unicode_literals

import os
import json
import uuid
import time
from random import choice as rc
from random import sample as rs
import re
import subprocess

import RPi.GPIO as GPIO
import picamera
from clarifai.client import ClarifaiApi
import requests
from PIL import Image

import sys
import threading

import spotify

import genius_token

# SPOTIFY STUFF

# Assuming a spotify_appkey.key in the current dir
session = spotify.Session()

# Process events in the background
loop = spotify.EventLoop(session)
loop.start()

# Connect an audio sink
audio = spotify.AlsaSink(session)

# Events for coordination
logged_in = threading.Event()
logged_out = threading.Event()
end_of_track = threading.Event()

logged_out.set()


def on_connection_state_updated(session):
    if session.connection.state is spotify.ConnectionState.LOGGED_IN:
        logged_in.set()
        logged_out.clear()
    elif session.connection.state is spotify.ConnectionState.LOGGED_OUT:
        logged_in.clear()
        logged_out.set()


def on_end_of_track(self):
    end_of_track.set()

# Register event listeners
session.on(
    spotify.SessionEvent.CONNECTION_STATE_UPDATED, on_connection_state_updated)
session.on(spotify.SessionEvent.END_OF_TRACK, on_end_of_track)

# Assuming a previous login with remember_me=True and a proper logout
# session.relogin()
# session.login(genius_token.spotify_un, genius_token.spotify_pwd, remember_me=True)

# logged_in.wait()

# CAMERA STUFF

# Init Camera
camera = picamera.PiCamera()

# Init GPIO
GPIO.setmode(GPIO.BCM)

# Button Pin
GPIO.setup(18, GPIO.IN, pull_up_down=GPIO.PUD_UP)

IMGPATH = '/home/pi/soundcamera/img/'

clarifai_api = ClarifaiApi()

def chunks(l, n):
    """Yield successive n-sized chunks from l."""
    for i in xrange(0, len(l), n):
        yield l[i:i+n]

def take_photo():
    fn = str(int(time.time()))+'.jpg' # TODO: Change to timestamp hash
    fp = IMGPATH+fn
    camera.capture(fp)
    return fp

def get_tags(fp):
    fileObj = open(fp)
    result = clarifai_api.tag_images(fileObj)
    resultObj = result['results'][0]
    tags = resultObj['result']['tag']['classes']
    return tags

def genius_search(tags):
    access_token = genius_token.token
    payload = {
        'q': ' '.join(tags),
        'access_token': access_token
    }
    endpt = 'http://api.genius.com/search'
    response = requests.get(endpt, params=payload)
    results = response.json()
    hits = results['response']['hits']
    
    artists_titles = []
    
    for h in hits:
        hit_result = h['result']
        if hit_result['url'].endswith('lyrics'):
            artists_titles.append(
                (hit_result['primary_artist']['name'], hit_result['title'])
            )
    
    return artists_titles

def spotify_search(query):
    endpt = "https://api.spotify.com/v1/search"
    payload = {
        'q': query,
        'type': 'track'
    }
    response = requests.get(endpt, params=payload)
    result = response.json()
    result_zero = result['tracks']['items'][0]
    
    return result_zero['uri']

def main(fn):
    tags = get_tags(fn)
    for tag_chunk in chunks(tags,3):
        artists_titles = genius_search(tag_chunk)
        for artist, title in artists_titles:
            try:
                result_uri = spotify_search(artist+' '+title)
            except IndexError:
                pass
            else:
                print tag_chunk
                byline = "%s by %s" % (title, artist)
                print byline
                to_read = ', '.join(tag_chunk) + ". " + byline
                return to_read, result_uri

def play_uri(track_uri):
    # Play a track
    # audio = spotify.AlsaSink(session)
    session.login(genius_token.spotify_un, genius_token.spotify_pwd, remember_me=True)
    logged_in.wait()
    track = session.get_track(track_uri).load()
    session.player.load(track)
    session.player.play()


def stop_track():
    session.player.play(False)
    session.player.unload()
    session.logout()
    logged_out.wait()
    audio._close()

def talk(msg):
    proc = subprocess.Popen(
        ['bash', '/home/pi/soundcamera/play_text.sh', msg]
    )
    proc.communicate()

def play_tone(freqs):
    freq1, freq2 = freqs
    proc = subprocess.Popen(
        ['play', '-n', 'synth', '0.25', 'saw', "%i-%i" % (freq1, freq2)]
    )
    proc.communicate()

def histo_tone(fp):
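    # Collapse the 768-bin RGB histogram into 12 sums, then play them as six short two-frequency saw sweeps via SoX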
    im = Image.open(fp)
    hist = im.histogram()
    vals = map(sum, chunks(hist, 64)) # list of 12 values
    print vals
    map(play_tone, chunks(vals,2))

if __name__ == "__main__":
    input_state = True
    new_state = True
    hold_counter = 0
    while 1:
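        # The shutter button uses an internal pull-up, so GPIO.input(18) reads False (low) while the button is pressed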
        input_state = GPIO.input(18)
        if not (input_state and new_state):
            talk("capturing")

            # Hold for 15 seconds to turn off
            while not GPIO.input(18):
                time.sleep(0.1)
                hold_counter += 1
                if hold_counter > 150:
                    os.system('shutdown now -h')
                    sys.exit()

            # Reset hold counter
            hold_counter = 0

            # Else take photo
            try:
                img_fp = take_photo()
                msg, uri = main(img_fp)
                histo_tone(img_fp)
                talk(msg)
                play_uri(uri)
            except:
                print sys.exc_info()

            # Wait for playback to complete or Ctrl+C
            try:
                while not end_of_track.wait(0.1):
                    # If new photo, play new song
                    new_state = GPIO.input(18)
                    if not new_state:
                        stop_track()
                        # time.sleep(2)
                        break
            except KeyboardInterrupt:
                pass

 

Sound Camera, Part II
http://www.thehypertext.com/2015/10/06/sound-camera-part-ii/
Tue, 06 Oct 2015 02:20:44 +0000

Using JavaScript and Python Flask, I created a functional software prototype of the Sound Camera: rossgoodwin.com/soundcamera

The front-end JavaScript code is available on GitHub. Here is the primary back-end Python code:

import os
import json
import uuid
from base64 import decodestring
import time
from random import choice as rc
from random import sample as rs
import re

import PIL
from PIL import Image
import requests
import exifread

from flask import Flask, request, abort, jsonify
from flask.ext.cors import CORS
from werkzeug import secure_filename

from clarifai.client import ClarifaiApi

app = Flask(__name__)
CORS(app)

app.config['UPLOAD_FOLDER'] = '/var/www/SoundCamera/SoundCamera/static/img'
IMGPATH = '/var/www/SoundCamera/SoundCamera/static/img/'

clarifai_api = ClarifaiApi()

@app.route("/")
def index():
    return "These aren't the droids you're looking for."

@app.route("/img", methods=["POST"])
def img():
	request.get_data()
	if request.method == "POST":
		f = request.files['file']
		if f:
			filename = secure_filename(f.filename)
			f.save(os.path.join(app.config['UPLOAD_FOLDER'], filename))
			new_filename = resize_image(filename)
			return jsonify(uri=main(new_filename))
		else:
			abort(501)

@app.route("/b64", methods=["POST"])
def base64():
	if request.method == "POST":
		fstring = request.form['base64str']
		filename = str(uuid.uuid4())+'.jpg'
		file_obj = open(IMGPATH+filename, 'w')
		file_obj.write(fstring.decode('base64'))
		file_obj.close()
		return jsonify(uri=main(filename))

@app.route("/url")
def url():
	img_url = request.args.get('url')
	response = requests.get(img_url, stream=True)
	orig_filename = img_url.split('/')[-1]
	if response.status_code == 200:
		with open(IMGPATH+orig_filename, 'wb') as f:
			for chunk in response.iter_content(1024):
				f.write(chunk)
		new_filename = resize_image(orig_filename)
		return jsonify(uri=main(new_filename))
	else:
		abort(500)


# def allowed_img_file(filename):
#     return '.' in filename and \
# 		filename.rsplit('.', 1)[1].lower() in set(['.jpg', '.jpeg', '.png'])

def resize_image(fn):
    longedge = 640
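    # EXIF orientation values mapped to the (rotation in degrees, flip) needed to display the image upright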
    orientDict = {
        1: (0, 1),
        2: (0, PIL.Image.FLIP_LEFT_RIGHT),
        3: (-180, 1),
        4: (0, PIL.Image.FLIP_TOP_BOTTOM),
        5: (-90, PIL.Image.FLIP_LEFT_RIGHT),
        6: (-90, 1),
        7: (90, PIL.Image.FLIP_LEFT_RIGHT),
        8: (90, 1)
    }

    imgOriList = []
    try:
        f = open(IMGPATH+fn, "rb")
        exifTags = exifread.process_file(f, details=False, stop_tag='Image Orientation')
        if 'Image Orientation' in exifTags:
            imgOriList.extend(exifTags['Image Orientation'].values)
    except:
        pass

    img = Image.open(IMGPATH+fn)
    w, h = img.size
    newName = str(uuid.uuid4())+'.jpeg'
    if w >= h:
        wpercent = (longedge/float(w))
        hsize = int((float(h)*float(wpercent)))
        img = img.resize((longedge,hsize), PIL.Image.ANTIALIAS)
    else:
        hpercent = (longedge/float(h))
        wsize = int((float(w)*float(hpercent)))
        img = img.resize((wsize,longedge), PIL.Image.ANTIALIAS)

    for val in imgOriList:
        if val in orientDict:
            deg, flip = orientDict[val]
            img = img.rotate(deg)
            if flip != 1:
                img = img.transpose(flip)

    img.save(IMGPATH+newName, format='JPEG')
    os.remove(IMGPATH+fn)
    
    return newName

def chunks(l, n):
    """Yield successive n-sized chunks from l."""
    for i in xrange(0, len(l), n):
        yield l[i:i+n]

def get_tags(fp):
    fileObj = open(fp)
    result = clarifai_api.tag_images(fileObj)
    resultObj = result['results'][0]
    tags = resultObj['result']['tag']['classes']
    return tags

def genius_search(tags):
    access_token = 'd2IuV9fGKzYEWVnzmLVtFnm-EYvBQKR8Uh3I1cfZOdr8j-BGVTPThDES532dym5a'
    payload = {
        'q': ' '.join(tags),
        'access_token': access_token
    }
    endpt = 'http://api.genius.com/search'
    response = requests.get(endpt, params=payload)
    results = response.json()
    hits = results['response']['hits']
    
    artists_titles = []
    
    for h in hits:
        hit_result = h['result']
        if hit_result['url'].endswith('lyrics'):
            artists_titles.append(
                (hit_result['primary_artist']['name'], hit_result['title'])
            )
    
    return artists_titles

def spotify_search(query):
    endpt = "https://api.spotify.com/v1/search"
    payload = {
        'q': query,
        'type': 'track'
    }
    response = requests.get(endpt, params=payload)
    result = response.json()
    result_zero = result['tracks']['items'][0]
    
    return result_zero['uri']

def main(fn):
    tags = get_tags(IMGPATH+fn)
    for tag_chunk in chunks(tags,3):
        artists_titles = genius_search(tag_chunk)
        for artist, title in artists_titles:
            try:
                result_uri = spotify_search(artist+' '+title)
            except IndexError:
                pass
            else:
                return result_uri


if __name__ == "__main__":
    app.run()
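
For reference, here is how a client might call these endpoints once the app is running; the base URL and file names are placeholders, not part of the deployed service.

import requests

BASE = 'http://localhost:5000'  # wherever the Flask app is served

# POST a local image to /img
with open('example.jpg', 'rb') as f:
    r = requests.post(BASE + '/img', files={'file': f})
print(r.json())  # e.g. {"uri": "spotify:track:..."}

# Or hand /url a link to an image already on the web
r = requests.get(BASE + '/url', params={'url': 'http://example.com/photo.jpg'})
print(r.json())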

 

It uses the same algorithm discussed in my prior post. Now that I have the opportunity to test it more, I am not quite satisfied with the results it is providing. First of all, they are not entirely deterministic (you can upload the same photo twice and end up with two different songs in some cases). Moreover, the results from a human face — which I expect to be a common use case — are not very personal. For the next steps in this project, I plan to integrate additional data including GPS, weather, time of day, and possibly even facial expressions in order to improve the output.
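
As a purely illustrative sketch of that last idea (not code from this project), the contextual signals could simply be appended to the tag list before the Genius search; the weather words and time-of-day buckets here are my own assumptions:

def augment_tags(tags, weather=None, hour=None):
    # Add contextual words to the image tags before searching lyrics.
    extra = []
    if weather:
        extra.append(weather)  # e.g. 'rain', 'sunshine'
    if hour is not None:
        if hour < 6 or hour >= 21:
            extra.append('night')
        elif hour < 12:
            extra.append('morning')
        else:
            extra.append('afternoon')
    return tags + extra

# e.g. genius_search(augment_tags(tag_chunk, weather='rain', hour=23))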

The broken cameras I ordered from eBay have arrived, and I have been considering how to use them as cases for the new models. I also purchased a GPS module for my Raspberry Pi, so the next Sound Camera prototype, with new features integrated, will likely be a physical version. I’m planning to use a Kodak Brownie camera (c. 1916).

Sound Camera
http://www.thehypertext.com/2015/09/14/sound-camera/
Mon, 14 Sep 2015 04:06:42 +0000

I have been looking for ways to push the conceptual framework behind word.camera to another domain. This week, I have been prototyping a script that chooses music based on photographs. Ideally, the end result will be a wearable camera / music player that selects tracks for you based on your environment. Unfortunately, the domain sound.camera has been claimed, but I’m still planning to use the name “Sound Camera” for this project.

(Image: a modified iPod Shuffle.)

My code is available as an iPython Notebook.

The script I wrote gets concept words from the image via Clarifai, then searches song lyrics for those words on Genius, then finds the song on Spotify. Below are some images I put through the algorithm; each links to the song that resulted, though you will need to log in to Spotify to hear it.

 

Images: putin, street, landscape, cat.

 

 

The next step will be to get this code working on a Raspberry Pi inside one of the film camera bodies I just received via eBay.
