Speech Recognition Using Sphinx3 and Python

First, install the build dependencies:
sudo apt-get install build-essential autoconf libtool automake libasound2-dev python-dev subversion sox libsox-fmt-all

Download the source tarballs from SourceForge:
sphinx3 - http://sourceforge.net/project/showfiles.php?group_id=1904&package_id=68406
sphinxbase - http://sourceforge.net/project/showfiles.php?group_id=1904&package_id=199515

mkdir sphinx && cd sphinx
tar xzpf sphinx3...
tar xzpf sphinxbase...
cd sphinxbase
./autogen.sh --prefix=/usr
make && sudo make install
cd ../sphinx3
./autogen.sh --prefix=/usr
make && sudo make install
cd python
nano setup.py (adjust the path)
python setup.py install
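
To confirm that the bindings ended up somewhere the interpreter can find them, a check along these lines may help. This is only a sketch: the helper name module_available is hypothetical, and it is written for Python 3, whereas the bindings on this page target the Python 2 of the time (there, simply running "import _sphinx3" serves the same purpose).

```python
import importlib.util


def module_available(name):
    """Return True if the interpreter can locate a top-level module `name`."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    # _sphinx3 is the extension module built from sphinx3/python above.
    print("_sphinx3 importable:", module_available("_sphinx3"))
```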

Record yourself saying "yellow" and "please" as WAV files, then convert them to 16 kHz, mono, signed 16-bit raw audio:

sox yellow.wav -r 16000 -c 1 -s -w yellow16k.raw
sox please.wav -r 16000 -c 1 -s -w please16k.raw
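
If a recording is already 16 kHz, mono and 16-bit, the raw file is just the WAV samples without the header. The sketch below does that copy with Python's standard wave module. It is not a replacement for sox: it does no resampling, the function name wav_to_raw is hypothetical, and it assumes the decoder expects little-endian samples, as stored in WAV files.

```python
import wave


def wav_to_raw(wav_path, raw_path, rate=16000):
    """Copy the samples of a 16-bit mono WAV file into a headerless raw file.

    No resampling is done: the input must already match the target format.
    Returns the number of bytes written.
    """
    with wave.open(wav_path, "rb") as wav:
        if wav.getnchannels() != 1:
            raise ValueError("expected mono audio")
        if wav.getsampwidth() != 2:
            raise ValueError("expected 16-bit samples")
        if wav.getframerate() != rate:
            raise ValueError("expected %d Hz audio" % rate)
        frames = wav.readframes(wav.getnframes())
    with open(raw_path, "wb") as raw:
        raw.write(frames)
    return len(frames)
```

For example, wav_to_raw("yellow.wav", "yellow16k.raw") would raise ValueError if yellow.wav was recorded at a different rate, which is a cue to use the sox commands instead.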

Language model and dictionary

cd sphinx
wget http://www.inference.phy.cam.ac.uk/kv227/lm_giga/lm_giga_5k_nvp_3gram.zip
unzip lm_giga_5k_nvp_3gram.zip
svn co https://cmusphinx.svn.sourceforge.net/svnroot/cmusphinx/trunk/share/lm3g2dmp
cd lm3g2dmp && make && cd ..
lm3g2dmp/lm3g2dmp lm_giga_5k_nvp_3gram/lm_giga_5k_nvp_3gram.arpa lm_giga_5k_nvp_3gram

cfgfile (adjust the /src/sphinx paths to match your own sphinx directory)

-samprate 16000
-hmm /src/sphinx/sphinx3/model/hmm/hub4_cd_continuous_8gau_1s_c_d_dd
-dict /src/sphinx/lm_giga_5k_nvp_3gram/lm_giga_5k_nvp.sphinx.dic
-fdict /src/sphinx/lm_giga_5k_nvp_3gram/lm_giga_5k_nvp.sphinx.filler
-lm /src/sphinx/lm_giga_5k_nvp_3gram/lm_giga_5k_nvp_3gram.arpa.DMP
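
Because every path in the cfgfile hangs off the same sphinx directory, the file can be generated rather than edited by hand. A minimal sketch, assuming the directory layout shown above; the function name build_cfg and its root parameter are hypothetical:

```python
import posixpath


def build_cfg(root, samprate=16000):
    """Return cfgfile text with all paths rooted at the sphinx directory `root`."""
    lm_dir = posixpath.join(root, "lm_giga_5k_nvp_3gram")
    hmm_dir = posixpath.join(
        root, "sphinx3/model/hmm/hub4_cd_continuous_8gau_1s_c_d_dd")
    options = [
        ("-samprate", str(samprate)),
        ("-hmm", hmm_dir),
        ("-dict", posixpath.join(lm_dir, "lm_giga_5k_nvp.sphinx.dic")),
        ("-fdict", posixpath.join(lm_dir, "lm_giga_5k_nvp.sphinx.filler")),
        ("-lm", posixpath.join(lm_dir, "lm_giga_5k_nvp_3gram.arpa.DMP")),
    ]
    # One "flag value" pair per line, as in the listing above.
    return "".join("%s %s\n" % (flag, value) for flag, value in options)


if __name__ == "__main__":
    with open("cfgfile", "w") as cfg:
        cfg.write(build_cfg("/src/sphinx"))
```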

ctlfile (one raw file name per line, without the .raw extension)

please16k
yellow16k
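
Judging by the listing above, each ctlfile entry is the name of a raw file with the .raw extension removed. Under that assumption, the file can be built from whatever recordings are in the directory. The helper names here are hypothetical:

```python
import os


def utterance_ids(raw_dir):
    """List the base names of all .raw files in raw_dir, sorted."""
    return sorted(
        os.path.splitext(name)[0]
        for name in os.listdir(raw_dir)
        if name.endswith(".raw")
    )


def write_ctlfile(raw_dir, ctl_path):
    """Write one utterance ID per line to ctl_path and return the IDs."""
    ids = utterance_ids(raw_dir)
    with open(ctl_path, "w") as ctl:
        for utt in ids:
            ctl.write(utt + "\n")
    return ids


if __name__ == "__main__":
    print(write_ctlfile(".", "ctlfile"))
```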

Test

cd sphinx
sphinx3_livepretend ctlfile . cfgfile

Python speech recognition

cd sphinx
python
import _sphinx3
print dir(_sphinx3)
_sphinx3.parse_argfile("cfgfile")
_sphinx3.init()
data = open("please16k.raw", "rb").read()
_sphinx3.decode_raw(data)

Look for a line like this in the decoder output:

FWDVIT: PLEASE (* 107 428Z131015)
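
When the decoder output is captured to a file, the recognized words can be pulled out of such lines with a small parser. This sketch assumes only the format of the sample line above (the words sit between "FWDVIT:" and the parenthesized part); the name parse_fwdvit is hypothetical, and other decoder versions may log differently.

```python
import re

# Words sit between the "FWDVIT:" tag and the trailing parenthesized field.
FWDVIT_RE = re.compile(r"^FWDVIT:\s*(.*?)\s*\(.*\)\s*$")


def parse_fwdvit(log_text):
    """Return one list of words for every FWDVIT line found in log_text."""
    results = []
    for line in log_text.splitlines():
        match = FWDVIT_RE.match(line.strip())
        if match:
            results.append(match.group(1).split())
    return results


if __name__ == "__main__":
    print(parse_fwdvit("FWDVIT: PLEASE (* 107 428Z131015)"))
```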

Sending recognized words over OSC

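import _sphinx3, sys, os, osc

# OSC destination
host = "localhost"
portout = 9000
osc.init()
# Tell the receiver that the recognizer is starting up
osc.sendMsg("/ready","1",host,portout)

# Load cfgfile from the directory containing this script
_sphinx3.parse_argfile(sys.path[0]+"/cfgfile")
_sphinx3.init()
# Read the raw audio as bytes and decode it
data = open("/tmp/r16k.raw", "rb").read()
words = _sphinx3.decode_raw(data)
# Send the first element of the result, lowercased
osc.sendMsg("/words",words[0].lower(),host,portout)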
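
If the osc module is not available, a message with a single string argument can be put together by hand. The sketch below follows the OSC 1.0 encoding (NUL-terminated strings padded to a multiple of four bytes, sent as one UDP datagram). It is not the osc module's implementation, and the function names are hypothetical.

```python
import socket


def _osc_string(text):
    """Encode text as an OSC string: NUL-terminated, padded to 4 bytes."""
    data = text.encode("ascii") + b"\x00"
    return data + b"\x00" * (-len(data) % 4)


def encode_osc(address, value):
    """Build an OSC message carrying a single string argument."""
    # Layout: address pattern, type tag string (",s" = one string), argument.
    return _osc_string(address) + _osc_string(",s") + _osc_string(value)


def send_osc(address, value, host="localhost", port=9000):
    """Send one OSC message as a UDP datagram."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.sendto(encode_osc(address, value), (host, port))
    finally:
        sock.close()
```

With these helpers, send_osc("/words", "please") would stand in for the final sendMsg call above.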

To record audio from Python, PyAudio can be used: http://people.csail.mit.edu/hubert/pyaudio/
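
A recording loop with PyAudio might look roughly like the following. It is a sketch rather than tested production code: the function names and the output path are hypothetical, it uses the default input device, and it writes 16 kHz mono 16-bit samples to match the raw files used on this page.

```python
def chunk_count(seconds, rate=16000, chunk=1024):
    """Number of reads of `chunk` frames needed to cover `seconds` of audio."""
    return -(-int(seconds * rate) // chunk)  # ceiling division


def record_raw(path, seconds, rate=16000, chunk=1024):
    """Record mono 16-bit audio from the default input device into a raw file.

    The recording is rounded up to a whole number of chunks, so it may run
    slightly longer than `seconds`.
    """
    import pyaudio  # third-party; imported here so this file loads without it

    audio = pyaudio.PyAudio()
    stream = audio.open(format=pyaudio.paInt16, channels=1, rate=rate,
                        input=True, frames_per_buffer=chunk)
    try:
        with open(path, "wb") as raw:
            for _ in range(chunk_count(seconds, rate, chunk)):
                raw.write(stream.read(chunk))
    finally:
        stream.stop_stream()
        stream.close()
        audio.terminate()
```

Calling record_raw("/tmp/r16k.raw", 2) would then produce a file that the OSC script above could decode.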
