Showing posts with label simon 0.4. Show all posts

06 January 2013

304. Getting started with Simon 0.4 on Debian Wheezy/Testing (very basic)

Here's how to get started with Simon 0.4 -- although be warned that I've never used Simon before, and that there are likely better resources out there.

In the few cases where I use the command line, I have presumed that you are using the same locations as shown in this post: http://verahill.blogspot.com.au/2013/01/303-building-simon-04-speech.html

In case you screw up, to wipe all previous settings, try:

find ~/.kde -name "simon*" -print0 | xargs -0 rm -rf

If you're really desperate, nuke the entire ~/.kde folder, although that obviously has repercussions if you're actually using KDE and not GNOME. Also look under ~/tmp/$USER-kde -- I had simond put files there too.
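Before deleting anything it may be worth listing what would be removed. A minimal sketch, assuming the same ~/.kde location as above; the function name list_simon_files is made up for illustration and it only prints, it never deletes:

```shell
#!/bin/sh
# List everything under a directory whose name starts with "simon",
# without deleting anything. Defaults to ~/.kde (an assumption taken
# from the find command above).
list_simon_files() {
    find "${1:-$HOME/.kde}" -name 'simon*' -print
}

# Intended use (not run here):
#   list_simon_files            # look under ~/.kde
#   list_simon_files /some/dir  # or under another directory
```

If the list looks right, the find/rm command above removes the same set of files.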

Anyway.


Running simon
simond &
simon

If you need to kill simond you can do
kill %1

assuming that you don't have any other background procs in that terminal.
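If there might be other background jobs in the terminal, one alternative is to remember the process ID instead of relying on job number 1. A rough sketch; start_bg, stop_bg and BG_PID are hypothetical names, not anything Simon provides:

```shell
#!/bin/sh
# Start a command in the background and remember its PID in BG_PID.
start_bg() {
    "$@" &
    BG_PID=$!
}

# Stop the process recorded by start_bg, if it is still running.
stop_bg() {
    if [ -n "${BG_PID:-}" ] && kill -0 "$BG_PID" 2>/dev/null; then
        kill "$BG_PID"
        # Reap the child; ignore its non-zero exit status after the kill.
        wait "$BG_PID" 2>/dev/null || true
    fi
    BG_PID=
}

# Intended use (not run here):
#   start_bg simond
#   simon
#   stop_bg
```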

1. Scenarios:
Click on Open, select Download and pick the scenarios which you are interested in. To make sure that things are working I'm more or less following the video.
In Scenarios, select a couple of H4W scenarios (e.g. keyboard, mouse etc.) -- BUT NOT THE FIREFOX ONE, which causes trouble.








2. Speech Models:
Click on Open model, select Download and pick the Speech Model which you want. Pick the HUB4 model, since from the YouTube video it appears that you should match your scenarios and speech models, e.g. VF with VF and H4W with HUB4.






3. Server
Nothing weird here:


4. Sound Devices
I run my stuff via pulseaudio -- i.e. whatever the input source is there, will be used.


5. Volume
Do a bit of talking and see how the volume pans out. Ideally you should have any amplification of your microphone turned off, since that causes higher noise levels.


Julius problem:
If after clicking Finish you see this, you will want to work out what went wrong:
Make sure that you
1. compiled Simon with -DPJULIUS correctly set, and
2. exported the Julius bin directory in your PATH:
export PATH=$PATH:$HOME/.simon/julius/bin

Then restart simond and simon e.g. quit simon and kill simond, then launch simond in the background and then simon. To test if it's working correctly, go to Actions/Synchronize. No error means that it's working.
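To check the first point without rebuilding, the value passed with -DPJULIUS can be read back from the CMakeCache.txt in the build directory, where cmake stores -D entries as NAME:TYPE=value. A small sketch; the function name is made up, and the build directory path in the usage comment is an assumption based on the compile post:

```shell
#!/bin/sh
# Print the value of one variable from a CMakeCache.txt file.
# Cache entries have the form NAME:TYPE=value.
cmake_cache_value() {
    # $1 = path to CMakeCache.txt, $2 = variable name
    sed -n "s/^$2:[A-Z_]*=//p" "$1"
}

# Intended use (not run here; the build directory is an assumption):
#   cmake_cache_value ~/tmp/simon/build/CMakeCache.txt PJULIUS
```

An empty result suggests the option never reached cmake; a path that does not contain bin/julius suggests it was set to the wrong place.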


6. Get dictionary
Do e.g.
cd ~/.simon/
mkdir files
cd files
wget http://www.repository.voxforge1.org/downloads/SpeechCorpus/Trunk/Lexicon/VoxForge.tgz
tar xvf VoxForge.tgz

then select e.g. Standard, and click on Open "Standard"



Click on Import Dictionary:
Select Shadow Dictionary and Next:
Pick HTK lexicon:

Import your file:
You should now see it compile the model and there should be stuff under Shadow Dictionary:



Testing
At this point you should be able to test whether you're being recognised. Make sure that the button says 'Activated' and not 'Activate'. Try speaking a few commands, e.g. 'One', 'Two', 'Three', 'Chaos', 'Control' etc.
You'll see if it works by looking at "Last recognition results:".


Thoughts
Stuff that doesn't work: Configure Audio from within Simon, Configure Acoustic Model from within Simon, and adaptive training with the VoxForge base model (it complains about the grammar).

Stuff that does work: speech recognition, adaptive training for the HUB4 base model (and beautifully so).

While it's easy enough to get started, it does appear that there's no easy way of running the wizard again in case you want to change e.g. the base model or install more scenarios via download.

Obviously such a piece of software is fairly complex, and will have a high error rate, yet this is the type of software which should ideally just sit in the background and not really be noticed by the user -- it should (ideally) not require more work than e.g. using a mouse or a keyboard.

We're not quite there yet -- hence why we're at 0.4 and not 1.0. Anyway, it's still kind of neat when my garbled utterances are correctly (most of the time) recognised by Simon. Now, how to get it to actually do stuff for me?

A. Get Scenarios later
visit http://kde-files.org/index.php?xcontentmode=692&PHPSESSID=0e48f2edd26bf70e676459a5465ec675


B. Get Base models later
visit http://kde-files.org/index.php?xcontentmode=648

05 January 2013

303. Compiling Simon on Debian Testing (Simon 0.4)

Here's how to get Simon 0.4.0 compiled. Simon does speech recognition, which allows you to voice-control your computer. It does NOT do transcription like e.g. Dragon NaturallySpeaking.

See e.g. here, here and here for more information about Simon.

Simon uses cmake. I hate cmake since it makes life a lot more difficult than it needs to be.

Also, note that Simon relies heavily on KDE, so you need a number of KDE related files -- if not the whole desktop -- installed. Obviously, it runs fine under GNOME, which is where I'm using it.

This post is limited to describing how to compile it, not how to use it -- that may come later.



First get the dependencies. The list of deps is taken from http://userbase.kde.org/Simon/Development_Environment#Requirements, and expanded (e.g. libboost).

sudo apt-get install qt4-qtconfig kdelibs5-dev libxtst-dev libsamplerate-dev kdepimlibs5-dev libboost-dev
sudo apt-get install build-essential cmake gettext kdeartwork libqwt-dev libqt4-sql-sqlite libphonon-dev libattica0 libattica-dev zlib1g-dev libasound2-dev 


Next, sort out Julius.
(I think it used to be in the repos, but it's not there anymore from what I can see.)

mkdir ~/tmp/simon -p
cd ~/tmp/simon
wget http://jaist.dl.sourceforge.jp/julius/56549/julius-4.2.2.tar.gz
tar xvf julius-4.2.2.tar.gz
cd julius-4.2.2/
./configure --prefix=$HOME/.simon/julius

****************************************************************
Julius/Julian libsent library rev.4.2.2:

- Audio I/O
    primary mic device API    : alsa (Advanced Linux Sound Architecture)
    available mic device API  : alsa oss
    supported audio format    : RAW and WAV only
    NetAudio support          : no
- Language Modeling
    class N-gram support      : yes
- Libraries
    file decompression by     : zlib library
- Process management
    fork on adinnet input     : no

  Note: compilation time flags are now stored in "libsent-config".
        If you link this library, please add output of
        "libsent-config --cflags" to CFLAGS and
        "libsent-config --libs" to LIBS.
****************************************************************

make
make install
cd ../


HTK
You'll need HTK to use adaptive models with the VoxForge base. It's not necessary otherwise.
To get HTK, register here: http://htk.eng.cam.ac.uk/register.shtml

You should receive a password immediately afterwards. Then go to http://htk.eng.cam.ac.uk/download.shtml and download the Linux sources. Put the HTK-3.4.1.tar.gz file in ~/tmp/simon.
cd ~/tmp/simon
tar xvf HTK-3.4.1.tar.gz
cd htk/
./configure --prefix=$HOME/.simon/htk --disable-hslab
make all
mkdir -p ~/.simon/htk
make install
cd ../

For some reason it's built as 32 bit, not 64. Also, it wants the parent install directory to exist before running make install.
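The usual way to check the word size of a binary is the file command. As a dependency-free alternative, here is a rough sketch that reads the ELF header directly: byte 4 (EI_CLASS) is 1 for 32-bit and 2 for 64-bit. The function name elf_class is made up, and the path in the usage comment assumes the --prefix used above and HTK's HVite tool:

```shell
#!/bin/sh
# Report whether an ELF binary is 32- or 64-bit by reading the EI_CLASS
# byte (offset 4 of the ELF header): 1 means 32-bit, 2 means 64-bit.
elf_class() {
    # The first four bytes of an ELF file are 0x7f 'E' 'L' 'F'.
    magic=$(od -An -tx1 -N4 "$1" | tr -d ' \n')
    if [ "$magic" != "7f454c46" ]; then
        echo "not ELF"
        return 1
    fi
    class=$(od -An -tu1 -j4 -N1 "$1" | tr -d ' \n')
    case "$class" in
        1) echo "32-bit" ;;
        2) echo "64-bit" ;;
        *) echo "unknown" ;;
    esac
}

# Intended use (not run here; the path is an assumption):
#   elf_class "$HOME/.simon/htk/bin/HVite"
```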

Next, sphinx:

wget http://sourceforge.net/projects/cmusphinx/files/sphinxbase/0.8/sphinxbase-0.8.tar.gz
tar xvf sphinxbase-0.8.tar.gz 
cd sphinxbase-0.8/
./configure --prefix=$HOME/.simon/sphinxbase
make
make install
cd ../


wget http://sourceforge.net/projects/cmusphinx/files/pocketsphinx/0.8/pocketsphinx-0.8.tar.gz
tar xvf pocketsphinx-0.8.tar.gz
cd pocketsphinx-0.8/
./configure --prefix=$HOME/.simon/pocketsphinx SphinxBase_CFLAGS=-I$HOME/.simon/sphinxbase/include/sphinxbase SphinxBase_LIBS=-L$HOME/.simon/sphinxbase/lib
make
make install
cd ../

wget http://sourceforge.net/projects/cmusphinx/files/sphinxtrain/1.0.8/sphinxtrain-1.0.8.tar.gz
tar xvf sphinxtrain-1.0.8.tar.gz
cd sphinxtrain-1.0.8/
./configure --prefix=$HOME/.simon/sphinxtrain SphinxBase_CFLAGS=-I$HOME/.simon/sphinxbase/include/sphinxbase SphinxBase_LIBS=-L$HOME/.simon/sphinxbase/lib
make
make install
cd ../


I couldn't sort out OpenCV, so this Simon won't be watching you.


Finally, Simon.
wget http://mirrors.mit.edu/kde/stable/simon/0.4.0/src/simon-0.4.0.tar.bz2
tar xvf simon-0.4.0.tar.bz2
mkdir build/
cd build/
cmake -DCMAKE_INSTALL_PREFIX=$HOME/.simon \
  -DPOCKETSPHINX_LIBRARIES=$HOME/.simon/pocketsphinx/lib/libpocketsphinx.so \
  -DPOCKETSPHINX_INCLUDE_DIR=$HOME/.simon/pocketsphinx/include/ \
  -DSphinxBase_INCLUDE_DIR=$HOME/.simon/sphinxbase/include/ \
  -DSphinxBase_LIBRARY=$HOME/.simon/sphinxbase/lib/libsphinxbase.so \
  -DPJULIUS=$HOME/.simon/julius \
  -DPSPHINX=$HOME/.simon/sphinxbase \
  ../simon-0.4.0/

at which point you should see:
-----------------------------------------------------------------------------
-- The following external packages were located on your system.
-- This installation will have the extra features provided by these packages.
-----------------------------------------------------------------------------
   * LibSampleRate - Resampling library
   * KDE PIM Libs - KDE Libraries for PIM
   * Sphinxbase - Open source toolkit for speech recognition
   * PocketSphinx - PocketSphinx is a small-footprint continuous speech recognition system

-----------------------------------------------------------------------------
-- The following OPTIONAL packages could NOT be located on your system.
-- Consider installing them to enable more features from this software.
-----------------------------------------------------------------------------
   * qaccessibilityclient
     KDE client-side accessibility library
     Required to enable ATSPI plugin.
   * OpenCV
     OpenCV (Open Source Computer Vision) is a library of programming functions for real time computer vision
     Required for Simon Vision

-----------------------------------------------------------------------------


And now a bit of a hackjob -- we give the sphinx include dirs above, and while that works fine for most include files (they use e.g. sphinxbase/includeme.h), a number of files (prim_type, sphinx_config.h etc.) point towards the wrong directory (e.g. #include ). Two solutions -- either edit the files or put symlinks where the files are looking. Symlinks work well for me.


ln -s ~/.simon/sphinxbase/include/sphinxbase/sphinx_config.h ~/.simon/sphinxbase/include/sphinx_config.h
ln -s ~/.simon/pocketsphinx/include/pocketsphinx/pocketsphinx_export.h ~/.simon/pocketsphinx/include/pocketsphinx_export.h
ln -s ~/.simon/pocketsphinx/include/pocketsphinx/cmdln_macro.h ~/.simon/pocketsphinx/include/cmdln_macro.h
ln -s ~/.simon/pocketsphinx/include/pocketsphinx/ps_lattice.h ~/.simon/pocketsphinx/include/ps_lattice.h
ln -s ~/.simon/pocketsphinx/include/pocketsphinx/ps_mllr.h ~/.simon/pocketsphinx/include/ps_mllr.h
ln -s ~/.simon/pocketsphinx/include/pocketsphinx/fsg_set.h ~/.simon/pocketsphinx/include/fsg_set.h
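
The same links can be created with a small helper that skips headers which are missing or already linked, so it is safe to run twice. This is only a sketch; link_headers is a made-up name, and the directories in the usage comment are the ones from the ln commands above:

```shell
#!/bin/sh
# Link each named header from a package subdirectory up into the
# parent include directory, skipping links that already exist.
link_headers() {
    # $1 = include dir, $2 = subdirectory holding the real headers,
    # remaining arguments = header file names
    inc=$1
    sub=$2
    shift 2
    for h in "$@"; do
        if [ -e "$inc/$sub/$h" ] && [ ! -e "$inc/$h" ]; then
            ln -s "$inc/$sub/$h" "$inc/$h"
        fi
    done
}

# Intended use (not run here), mirroring the commands above:
#   link_headers "$HOME/.simon/sphinxbase/include" sphinxbase sphinx_config.h
#   link_headers "$HOME/.simon/pocketsphinx/include" pocketsphinx \
#       pocketsphinx_export.h cmdln_macro.h ps_lattice.h ps_mllr.h fsg_set.h
```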

make
make install
cd ~/.simon/bin
export PATH=$PATH:$HOME/.simon/julius/bin:$HOME/.simon/sphinxtrain/bin:$HOME/.simon/htk/bin:$HOME/.simon/bin
simond &
simon
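
If the export ends up in a shell startup file, PATH grows every time it is sourced. A minimal sketch of an append that skips directories already listed, plus a quick check that the binaries can actually be found; path_append and missing_tools are hypothetical helper names:

```shell
#!/bin/sh
# Append a directory to PATH only if it is not already listed.
path_append() {
    case ":$PATH:" in
        *":$1:"*) ;;
        *) PATH="$PATH:$1" ;;
    esac
}

# Print the names of any commands that cannot be found on PATH.
missing_tools() {
    for t in "$@"; do
        command -v "$t" >/dev/null 2>&1 || echo "$t"
    done
}

# Intended use (not run here):
#   path_append "$HOME/.simon/julius/bin"
#   path_append "$HOME/.simon/sphinxtrain/bin"
#   path_append "$HOME/.simon/htk/bin"
#   path_append "$HOME/.simon/bin"
#   export PATH
#   missing_tools julius simond simon
```

No output from missing_tools means all the listed commands were found.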

I haven't really explored it yet, but here are the first few dialogues:




There's an official video which you may want to watch before you get started:



Issues:
Stuff that doesn't work: Configure Audio from within simon, Configure Acoustic Model from within Simon
Stuff that does work: speech recognition.

Note:
If you set it up 'wrong' the first time around (or it crashed coding samples as above) and want the wizard to run again, delete all the configuration files:
rm ~/.kde/share/apps/simon* -rf
rm ~/.kde/share/config/simon*
