Sunday, September 15, 2013

BeagleBone Black based voice recognition on an LED Matrix.

A little over a month ago I was at the BrainSilo hacker space in Portland with some friends.
We were playing around with our HackRF Jawbreaker boards, and after a while we got bored and started chatting and throwing crazy ideas in the air. I had picked up a BeagleBone Black at Defcon and I really wanted to do something with it.

One of the ideas was:
"Let's have the BeagleBone do speech recognition, output the result on an LED matrix, see how it messes up, and laugh at it."
So far so good. It seemed like a fun thing for us geeks to do, so I decided to try it; it was the kind of challenge that wouldn't take up too much of my time.

At this point I had only the BeagleBone and nothing else, so I started with the largest hurdle: running voice recognition on the BBB (from this point on I will refer to the BeagleBone Black as the BBB).
I searched the web for solutions. One of them was Texas Instruments' Embedded Speech Recognition solution, which had recently gone open source but oddly requires you to register and wait to be approved as a member before you even get to see a byte of code.
That turned out to be a bust: WAY too complicated to build, run, and extend for a lazy hacker like me. I wanted something that is scripting-language friendly and runs without fancy compile tricks.

So...
I turned to the internet again and looked for Python-based (or at least Python-friendly) voice recognition options. One of them was PocketSphinx, which also happens to run on the BBB almost smoothly. Why almost smoothly?
Well, because the Python side of Sphinx never worked for me, so I had to do an ugly hack, which I will explain later.

Anyway, I got PocketSphinx running on the BBB with this command line:
pocketsphinx_continuous -adcdev hw:1,0 -nfft 2048 -samprate 48000 2>/dev/null
(The microphone is connected to the BBB through a USB sound card off eBay.)
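
For reference, the same invocation can be assembled from Python as an argument list. This is only a minimal sketch: build_sphinx_command is a hypothetical helper, not part of PocketSphinx, and its defaults are simply the values from the command above. As far as I understand the flags, -adcdev names the audio capture device, -samprate sets the sampling rate, and -nfft sets the FFT size, which presumably has to be raised to fit one analysis frame at 48 kHz.

```python
def build_sphinx_command(device="hw:1,0", nfft=2048, samprate=48000):
    """Return the pocketsphinx_continuous invocation as an argument list.

    Hypothetical helper for illustration; the defaults mirror the
    command line used in this post.
    """
    return [
        "pocketsphinx_continuous",
        "-adcdev", device,          # audio capture device (the USB sound card)
        "-nfft", str(nfft),         # FFT size
        "-samprate", str(samprate), # sampling rate in Hz
    ]


if __name__ == "__main__":
    # Print the command so it can be compared with the one typed by hand.
    print(" ".join(build_sphinx_command()))
```

A list like this can be passed to subprocess.Popen directly. The 2>/dev/null part of the original command is shell syntax, so it is not included here.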

So at this point I had in my hands a BBB with fairly decent voice recognition software that actually runs!
Yay!

The next step was to have the BBB display stuff on an LED matrix.
I looked into a few solutions. The first was an SPI-based controller combined with an LED matrix, which didn't really work on the BBB, and because I didn't want to spend too much time re-coding SPI on the BBB I moved on to an I2C-based controller.
I found the right one at Adafruit: the I2C LED Backpack was perfect. It had code examples in Python, and someone was already using it on the BBB.

But I could not find code for displaying scrolling text, or any text for that matter...
I had no choice but to do some coding. I could not upload a .py file, so I pasted my code at the bottom of this post.
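
The idea behind scrolling text can be tried without any hardware. Each character is 5 column bytes (bit 0 is the top pixel), the message becomes one long list of columns padded with 8 blank columns at each end, and every frame shows an 8-column window that slides along by one column. The sketch below is an illustration under those assumptions, not the posted script: it is written for Python 3, uses a hypothetical two-letter FONT table (the 'H' and 'I' glyphs copied from the listing at the bottom of this post), and prints frames as text instead of writing to the I2C backpack.

```python
# Hypothetical mini font: 5 column bytes per glyph, bit 0 = top pixel.
FONT = {
    "H": [0x7F, 0x08, 0x08, 0x08, 0x7F],
    "I": [0x00, 0x41, 0x7F, 0x41, 0x00],
}
BLANK = [0x00] * 5
WIDTH = 8  # the matrix shows 8 columns at a time


def build_columns(text):
    """Turn text into one long list of column bytes, padded at both ends."""
    cols = [0x00] * WIDTH
    for ch in text:
        cols.extend(FONT.get(ch, BLANK))  # unknown characters become blanks
        cols.append(0x00)                 # one blank column between characters
    cols.extend([0x00] * WIDTH)
    return cols


def frames(cols):
    """Yield every 8-column window, sliding one column per frame."""
    for start in range(len(cols) - WIDTH + 1):
        yield cols[start:start + WIDTH]


def render(frame):
    """Draw one frame as 8 text rows; '#' is a lit LED."""
    rows = []
    for bit in range(8):
        rows.append("".join("#" if col & (1 << bit) else "." for col in frame))
    return "\n".join(rows)


if __name__ == "__main__":
    all_frames = list(frames(build_columns("HI")))
    print(render(all_frames[8]))  # the window that starts exactly at the 'H'
```

Printing the frames one after another with a short sleep between them gives the same marquee effect in a terminal that the matrix shows with LEDs.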

The last step was to combine the two, and this is where the ugly hack comes into view: I used Python to run PocketSphinx as a command-line process, read its stdout stream, parse it, and display the result on the LED matrix.
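
The parsing step can be sketched as a small pure function, which makes it testable without a microphone. This is an illustration that mirrors the checks in the script at the bottom of this post. message_for_line is a hypothetical name, and the sample lines are assumed shapes of pocketsphinx_continuous output rather than captured logs.

```python
def message_for_line(line):
    """Map one line of recognizer output to the text to scroll, or None."""
    output = line.rstrip()
    if "READY" in output:
        return "Speak"
    if "please wait" in output:
        return "Please wait"
    if ":" in output:
        # Keep what follows the first colon, treated here as the recognized text.
        return output.split(":", 1)[1].strip() + "."
    return None


if __name__ == "__main__":
    # Assumed example lines, not real captured output.
    samples = [
        "READY....",
        "Stopped listening, please wait...",
        "000000001: hello world",
        "",
    ]
    for sample in samples:
        print(repr(sample), "->", repr(message_for_line(sample)))
```

Unlike the script, this sketch returns at the first match and strips the space after the colon, so a line can never trigger two messages.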

Technicals:
BBB running the Angstrom distro, completely unmodified; just download and use.
PocketSphinx 0.8

References:
Demo Video - http://youtu.be/Ssd_tyXPa5E


Note:
All I did was hack a few things together; a lot of work has been done on each ingredient that led to this result. Thanks go to Adafruit for their code and to PocketSphinx for an implementation that works on the BBB. Without them this could not have worked.
I went through many links on the internets, with many forum posts and suggested solutions to the many problems I encountered, and I apologize for not listing them all.
The code I wrote is posted here under the "do not be a douche" license, meaning you can use it, but don't try to make any profit off it.
Hope you liked this thing!







runVisibleVoice.py
<CODE>
#!/usr/bin/python

import time
import datetime
import math
from Adafruit_8x8 import EightByEight
import sys, select, subprocess

grid = EightByEight(address=0x70)

#print "Press CTRL+Z to exit"
AZ = [0x7E, 0x11, 0x11, 0x11, 0x7E, #  A
0x7F, 0x49, 0x49, 0x49, 0x36,   #  B
0x3E, 0x41, 0x41, 0x41, 0x22,   #  C
0x7F, 0x41, 0x41, 0x22, 0x1C,   #  D
0x7F, 0x49, 0x49, 0x49, 0x41,   #  E
0x7F, 0x09, 0x09, 0x01, 0x01,   #  F
0x3E, 0x41, 0x41, 0x51, 0x32,   #  G
0x7F, 0x08, 0x08, 0x08, 0x7F,   #  H
0x00, 0x41, 0x7F, 0x41, 0x00,   #  I
0x20, 0x40, 0x41, 0x3F, 0x01,   #  J
0x7F, 0x08, 0x14, 0x22, 0x41,   #  K
0x7F, 0x40, 0x40, 0x40, 0x40,   #  L
0x7F, 0x02, 0x04, 0x02, 0x7F,   #  M
0x7F, 0x04, 0x08, 0x10, 0x7F,   #  N
0x3E, 0x41, 0x41, 0x41, 0x3E,   #  O
0x7F, 0x09, 0x09, 0x09, 0x06,   #  P
0x3E, 0x41, 0x51, 0x21, 0x5E,   #  Q
0x7F, 0x09, 0x19, 0x29, 0x46,   #  R
0x46, 0x49, 0x49, 0x49, 0x31,   #  S
0x01, 0x01, 0x7F, 0x01, 0x01,   #  T
0x3F, 0x40, 0x40, 0x40, 0x3F,   #  U
0x1F, 0x20, 0x40, 0x20, 0x1F,   #  V
0x7F, 0x20, 0x18, 0x20, 0x7F,   #  W
0x63, 0x14, 0x08, 0x14, 0x63,   #  X
0x03, 0x04, 0x78, 0x04, 0x03,   #  Y
0x61, 0x51, 0x49, 0x45, 0x43]   #  Z

az = [0x20, 0x54, 0x54, 0x54, 0x78,
0x7F, 0x48, 0x44, 0x44, 0x38,
0x38, 0x44, 0x44, 0x44, 0x20,
0x38, 0x44, 0x44, 0x48, 0x7F,
0x38, 0x54, 0x54, 0x54, 0x18,
0x08, 0x7E, 0x09, 0x01, 0x02,
0x08, 0x14, 0x54, 0x54, 0x3C,
0x7F, 0x08, 0x04, 0x04, 0x78,
0x00, 0x44, 0x7D, 0x40, 0x00,
0x20, 0x40, 0x44, 0x3D, 0x00,
0x00, 0x7F, 0x10, 0x28, 0x44,
0x00, 0x41, 0x7F, 0x40, 0x00,
0x7C, 0x04, 0x18, 0x04, 0x78,
0x7C, 0x08, 0x04, 0x04, 0x78,
0x38, 0x44, 0x44, 0x44, 0x38,
0x7C, 0x14, 0x14, 0x14, 0x08,
0x08, 0x14, 0x14, 0x18, 0x7C,
0x7C, 0x08, 0x04, 0x04, 0x08,
0x48, 0x54, 0x54, 0x54, 0x20,
0x04, 0x3F, 0x44, 0x40, 0x20,
0x3C, 0x40, 0x40, 0x20, 0x7C,
0x1C, 0x20, 0x40, 0x20, 0x1C,
0x3C, 0x40, 0x30, 0x40, 0x3C,
0x44, 0x28, 0x10, 0x28, 0x44,
0x0C, 0x50, 0x50, 0x50, 0x3C,
0x44, 0x64, 0x54, 0x4C, 0x44]
    
space = [0x00,0x00,0x00,0x00,0x00] # ord = 32
dot = [0x00, 0x60, 0x60, 0x00, 0x00] # .  ord = 46
   

# Font test, not called by the script: cycles through each capital letter and its lowercase form.
def main():
    while(True):
      i=0
      while (i<130):
          grid.writeRowRaw(5,AZ[i])
          grid.writeRowRaw(4,AZ[i+1])
          grid.writeRowRaw(3,AZ[i+2])
          grid.writeRowRaw(2,AZ[i+3])
          grid.writeRowRaw(1,AZ[i+4])
          time.sleep(1)
          grid.writeRowRaw(5,az[i])
          grid.writeRowRaw(4,az[i+1])
          grid.writeRowRaw(3,az[i+2])
          grid.writeRowRaw(2,az[i+3])
          grid.writeRowRaw(1,az[i+4])
          i=i+5
          time.sleep(1)
          print i
      
      grid.clear()
      time.sleep(0.05)

def runstring(text):
    #ord a = 97 ==> first element in the array, 97 == 0 98 == 5
    #print text
    grid.clear()
    #start with 8 empty columns so the text scrolls in from a blank display
    scroll = [0x00] * 8
    for c in text:
        #print ord(c)
        num = ord(c)
        if ((65 <= num <= 90) or (97 <= num <= 122)): # is an ASCII letter (skips the punctuation at 91-96)
            #Build a scrolling string
            if (num <= 90): # 'A'..'Z' inclusive; range(65,90) left out 'Z'
                i=int(math.fabs(65-num))*5
                scroll.append(AZ[i]);
                scroll.append(AZ[i+1]);
                scroll.append(AZ[i+2]);
                scroll.append(AZ[i+3]);
                scroll.append(AZ[i+4]);
                scroll.append(0x00);
                #print 'CAPITAL'
            else:
                i=int(math.fabs(97-num))*5
                scroll.append(az[i]);
                scroll.append(az[i+1]);
                scroll.append(az[i+2]);
                scroll.append(az[i+3]);
                scroll.append(az[i+4]);
                scroll.append(0x00);
                #print 'regular'
        else:
            if (num == 46):
                scroll.append(dot[0]);
                scroll.append(dot[1]);
                scroll.append(dot[2]);
                scroll.append(dot[3]);
                scroll.append(dot[4]);
            else:
                scroll.append(space[0]);
                scroll.append(space[1]);
                scroll.append(space[2]);
                scroll.append(space[3]);
                scroll.append(space[4]);
    #end with 8 empty columns so the text scrolls all the way out
    scroll.extend([0x00] * 8)
               
    i=0
    while i + 7 < len(scroll):
        #draw the 8-column window that starts at column i
        for offset in range(8):
            grid.writeRowRaw(7-offset,scroll[i+offset])
        time.sleep(0.04)
        i=i+1
    grid.clear()  
    #print 'end of func'
   
#runstring("Hello My name is Inigo Montoya. You killed my father. Prepare to die");

proc = subprocess.Popen(['sh', '-c', 'pocketsphinx_continuous -adcdev hw:1,0 -nfft 2048 -samprate 48000 2>/dev/null'],stdout=subprocess.PIPE)
while True:
    line = proc.stdout.readline()
    if line != '':
        #the real code does filtering here
        output = line.rstrip()
        print output
        if "READY" in output:
            runstring("Speak")
        if "please wait" in output:
            runstring("Please wait")
        if ":" in output:
            #the recognized text follows the first colon
            runstring(str(output.split(":")[1])+'.')
    else:
        break

</CODE> 
 
