07 January 2013

305. make -jN -- should N equal number of cores or N+1 cores? Optimal number of threads per core

Update: I repeated this test by compiling kernel 3.7.2 using different settings (http://verahill.blogspot.com.au/2013/01/321-compiling-kernel-372-on-debian.html) -- given the length of the compile and its reliance on CPU grunt it is probably a better test case. It came out showing that N -- or even N-1 -- was better than over-committing.

The original commentator also offered this explanation:
Historically N+1 or even N*1.5 was used & worked better on memory / I-O constrained systems where the available cache was used as a short lasting one to feed the extra committed threads / processes while I-O was in progress. As you've observed correctly this is not the case on machines that have an abundance of RAM, where this acts as a long lasting cache, no data that got written to disk will be read back > spawning additional threads / processes has therefore a detrimental effect on efficiency due to (much) more rescheduling / TLB shootdown interrupts. In short, when available ram is larger then total disk-space needed for build N = amount of logical cpu's if not N = logical cpu's + 1 Setting the global environment variable (CONCURRENCY_LEVEL) instead of fixed values for -j for automated builds using the previously mentioned #export CONCURRENCY_LEVEL=`getconf _NPROCESSORS_ONLN` is always the safest bet, especially when using server grade machines and high speed 0 seek time solid state disks ...
I think the conclusion is the one offered above -- stick to N for optimal performance, unless you have a compelling reason not to. I should also emphasize that I don't have a background in computing of any sort, whereas the poster is a professional in the HPC field.

So if I'm allowed to paraphrase and make conclusions:
for a very short compile, like the one in this post, you may find that N+1 seemingly gives a better result since disk I/O plays a big part relative to the code generation (and whatever else a compiler does). For a longer, more 'normal' compilation disk I/O plays a smaller part.

If your RAM is too small and you need to cache to disk repeatedly, then that obviously increases the disk I/O as well.

In the end, the penalty for over-committing (http://verahill.blogspot.com.au/2013/01/321-compiling-kernel-372-on-debian.html) is large enough that it's a better bet to just go for N threads.

I really shouldn't be surprised -- it's the same effect you see when launching a computational job: you do NOT want to launch more threads than cores.
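
A minimal sketch of "stick to N" in practice, assuming a POSIX shell. It takes N from the same getconf call quoted in the comment and passes it to make explicitly. As far as I know, CONCURRENCY_LEVEL is read by Debian's make-kpkg rather than by make itself, so an explicit -j is the safer choice for a plain make. The JOBS override variable and the fallback to 1 are my own additions for illustration.

```shell
#!/bin/sh
# Number of logical CPUs currently online; fall back to 1 if getconf
# is missing or does not know this variable (fallback is an assumption).
ncpu=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)

# JOBS is a hypothetical override variable, not something make reads.
jobs=${JOBS:-$ncpu}

echo "would run: make -j$jobs"
# Uncomment inside a source tree to actually build:
# make -j"$jobs"
```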

Original post:
I got a comment recently regarding the number of threads that should be used for make:
make -j7 is the number of cores +1 

Stop copy paste nonsense.... sigh...

make -j1 will spawn 1 worker process
-j7 will spawn 7. 

#export CONCURRENCY_LEVEL=`getconf _NPROCESSORS_ONLN`

makes adding -jjob unnecessary 
on an i7 this is the same as -j8

When in doubt check top.....

So the question is: for N cores, should you spawn N threads or N+1? The poster has a valid point -- there's not that much data on what really is the best configuration, and while most people keep repeating the (mostly) accepted N+1 (or 1.5*N) wisdom, we really need more hard numbers.

So here's my real-world unscientific benchmark for compiling Gromacs 4.5.5 on a six core AMD Phenom II 1055T with 8 GB RAM and a slow 5400 rpm hard drive (disk I/O plays into things as well). I'm using gcc 4.7.2-4 and Debian Wheezy/Testing.

To get the data I used this script, maketest.sh:

make distclean
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/lib/openmpi/lib:/opt/openblas/lib
export LDFLAGS="-L/opt/fftw/fftw-3.3.2/single/lib -L/opt/openblas/lib -lopenblas"
export CPPFLAGS="-I/opt/fftw/fftw-3.3.2/single/include -I/opt/openblas/include"
./configure --disable-mpi --enable-float --with-fft=fftw3 --with-external-blas --with-external-lapack --program-suffix=_sp --prefix=/opt/gromacs/gromacs-4.5.5
time make -j$1

which I called with e.g.
sh maketest.sh 6
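
To collect the whole series in one go, the script can be wrapped in a loop. This is only a sketch: run_series and the n<N>.log file names are hypothetical, and it assumes maketest.sh sits in the current directory.

```shell
#!/bin/sh
# Run maketest.sh once per job count from FIRST to LAST,
# keeping one log file per run.
run_series() {
    first=$1
    last=$2
    n=$first
    while [ "$n" -le "$last" ]; do
        # the timing from 'time' goes to stderr, so capture both streams
        sh maketest.sh "$n" > "n$n.log" 2>&1
        n=$((n + 1))
    done
}

# Example (takes a while, each run is a full rebuild):
# run_series 1 12
```

Each run starts with make distclean, so the runs do not reuse each other's object files, although the page cache will still be warm after the first run.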

Admittedly, this is a fairly short build but it is a 'real' one.

Results:
N    Time (real)
1    9 m 52 s
2    5 m 18 s
3    3 m 48 s
4    3 m 02 s
5    2 m 24 s
6    2 m 16 s
7    2 m 05 s
8    2 m 06 s
9    2 m 07 s
10   2 m 07 s
11   2 m 08 s
12   2 m 09 s
Or as a plot:
The build time decreases steeply with the number of threads up to N=7 and then flattens out. The blue line is at 125 seconds, i.e. the minimum, where the slope is zero.
I'm actually quite surprised at how N+1 turned out to be the best configuration, although in general it seems that you don't suffer any penalty for using more threads, so 1.5*N works just as well.
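
To put numbers on the scaling, here is a small sketch that computes the speedup relative to N=1 and the efficiency per job (speedup divided by N) from the table above, with the times converted to whole seconds. The function name speedup_table is made up for this example, and efficiency beyond N=6 should be read with care since the machine only has six cores.

```shell
#!/bin/sh
# Speedup = t(1) / t(N); efficiency = speedup / N.
# Input columns: N, wall-clock time in seconds (from the results table).
speedup_table() {
    awk 'NR == 1 { t1 = $2 }
         { s = t1 / $2
           printf "%2d  %4d s  speedup %.2f  efficiency %.2f\n", $1, $2, s, s / $1 }' <<'EOF'
1 592
2 318
3 228
4 182
5 144
6 136
7 125
8 126
9 127
10 127
11 128
12 129
EOF
}

speedup_table
```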

I also ran sar (sysstat; sar -u 1 180 |gawk '{print $3,$5,$8}' |tee n7.dat ) for -j7 to see how the load varies with time during make (I collected a little bit of data before and after make, hence the flat line at the end):
The black/blue (user/idle) lines are what are interesting here
The build is very evidently not perfectly parallel at all stages, and that will also affect the optimal number of threads/core.


Raw results
N=1
real    9m51.519s
user    6m43.316s
sys     0m44.092s
N=2
real    5m18.359s
user    7m3.548s
sys     0m46.112s
N=3
real    3m47.850s
user    7m22.732s
sys     0m47.064s
N=4
real    3m2.131s
user    7m56.068s
sys     0m41.744s
N=5
real    2m24.258s
user    7m53.140s
sys     0m34.928s
N=6
real    2m16.429s
user    8m15.088s
sys     0m27.160s
N=7
real    2m5.361s
user    7m50.200s
sys     0m28.280s
N=8
real    2m5.820s
user    7m52.380s
sys     0m27.548s
N=9
real    2m7.266s
user    7m54.344s
sys     0m28.340s
N=10
real    2m7.057s
user    7m56.628s
sys     0m27.872s
N=11
real    2m7.728s
user    7m58.276s
sys     0m27.332s
N=12
real    2m8.819s
user    8m0.600s
sys     0m27.544s
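
The raw numbers also give a rough idea of how busy the CPU was: (user + sys) / real is the average number of cores kept busy during the build. A minimal sketch, assuming time values in the 9m51.519s format shown above; to_seconds and busy_cores are hypothetical helper names.

```shell
#!/bin/sh
# Convert a value such as 9m51.519s into seconds.
to_seconds() {
    echo "$1" | awk -F'[ms]' '{ printf "%.3f\n", $1 * 60 + $2 }'
}

# Average number of busy cores: (user + sys) / real.
# Usage: busy_cores REAL USER SYS
busy_cores() {
    real=$(to_seconds "$1")
    user=$(to_seconds "$2")
    sys=$(to_seconds "$3")
    awk -v r="$real" -v u="$user" -v s="$sys" \
        'BEGIN { printf "%.2f\n", (u + s) / r }'
}

# N=7 from the raw results above
busy_cores 2m5.361s 7m50.200s 0m28.280s
```

For N=7 this comes out at about 3.98, i.e. on average roughly four of the six cores were doing work, which fits the observation that the build is not perfectly parallel at all stages.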

06 January 2013

304. Getting started with Simon 0.4 on Debian Wheezy/Testing (very basic)

Here's how to get started with Simon 0.4 -- although be warned that I've never used Simon before, and that there are likely better resources out there.

In the few cases where I use the command line, I have presumed that you are using the same locations as shown in this post: http://verahill.blogspot.com.au/2013/01/303-building-simon-04-speech.html

In case you screw up, to wipe all previous settings, try:

find ~/.kde -name "simon*"|xargs -I {} rm {} -rf

If you're really desperate, nuke the entire ~/.kde folder, although that obviously has repercussions if you're actually using KDE and not GNOME. Also look under ~/tmp/$USER-kde -- I had simond put files there too.

Anyway.


Running simon
simond &
simon

If you need to kill simond you can do
kill %1

assuming that you don't have any other background procs in that terminal.

1. Scenarios:
Click on Open, select Download and pick the scenarios which you are interested in. To make sure that things are working I'm more or less following the video.
In Scenarios, select a couple of H4W scenarios  (e.g. keyboard, mouse etc.) -- BUT NOT THE FIREFOX ONE, which causes trouble.








2. Speech Models:
Click on Open model, select Download and pick the Speech Model which you want. Pick the HUB4 model, since from the YouTube video it appears that you should match your scenarios and speech models, e.g. VF with VF and H4W with HUB4.






3. Server
Nothing weird here:


4. Sound Devices
I run my stuff via pulseaudio -- i.e. whatever the input source is there, will be used.


5. Volume
Do a bit of talking and see how the volume pans out. Ideally you should have any amplification of your microphone turned off, since that causes higher noise levels.


Julius problem:
If after clicking Finish you see this, you will want to work out what went wrong:
Make sure that you
1. compiled Simon with -DPJULIUS correctly set, and
2. exported the directory:
export PATH=$PATH:$HOME/.simon/julius/bin

Then restart simond and simon e.g. quit simon and kill simond, then launch simond in the background and then simon. To test if it's working correctly, go to Actions/Synchronize. No error means that it's working.
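
A quick way to check the PATH part before restarting anything, as a sketch: have_tool is a made-up helper, and 'julius' is the binary name I would expect to find under ~/.simon/julius/bin, so treat that as an assumption and adjust it if your build differs.

```shell
#!/bin/sh
# Report whether a program can be found on the current PATH.
have_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found at $(command -v "$1")"
    else
        echo "$1: NOT on PATH"
        return 1
    fi
}

# Assumed binary name, see note above.
have_tool julius || echo "try: export PATH=\$PATH:\$HOME/.simon/julius/bin"
```

Note that the export only affects the terminal it was typed in, so simond has to be started from that same terminal.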


6. Get dictionary
Do e.g.
cd ~/.simon/
mkdir files
cd files
wget http://www.repository.voxforge1.org/downloads/SpeechCorpus/Trunk/Lexicon/VoxForge.tgz
tar xvf VoxForge.tgz

then select e.g. Standard, and click on Open "Standard"



  Click on Import Dictionary:
Select Shadow Dictionary and Next:
Pick HTK lexicon:

Import your file:
You should now see it compile the model and there should be stuff under Shadow Dictionary:



Testing
At this point you should be able to test whether you're being recognised. Make sure that the button says 'Activated' and not 'Activate'. Try speaking a few commands e.g. 'One', 'Two', 'Three','Chaos','Control' etc.
'Chaos'

'Control'
You'll see if it works by looking at "Last recognition results:".


Thoughts
Stuff that doesn't work: Configure Audio from within Simon, Configure Acoustic Model from within Simon, and adaptive training with the VoxForge base model (it complains about the grammar).

Stuff that does work: speech recognition, adaptive training for the HUB4 base model (and beautifully so).

While it's easy enough to get started, it does appear that there's no easy way of running the wizard again in case you want to change e.g. the base model or install more scenarios via download.

Obviously such a piece of software is fairly complex, and will have a high error rate, yet this is the type of software which should ideally just sit in the background and not really be noticed by the user -- it should (ideally) not require more work than e.g. using a mouse or a keyboard.

We're not quite there yet -- hence why we're at 0.4 and not 1.0. Anyway, it's still kind of neat when my garbled utterances are correctly (most of the time) recognised by Simon. Now, how to get it to actually do stuff for me?

A. Get Scenarios later
visit http://kde-files.org/index.php?xcontentmode=692&PHPSESSID=0e48f2edd26bf70e676459a5465ec675


B. Get Base models later
visit http://kde-files.org/index.php?xcontentmode=648

05 January 2013

303. Compiling Simon on Debian Testing (Simon 0.4)

Here's how to get Simon 0.4.0 compiled. Simon does speech recognition, which allows you to voice-control your computer. It does NOT do transcription, e.g. like Dragon Naturally Speaking.

See e.g. here, here and here for more information about Simon.

Simon uses cmake. I hate cmake since it makes life a lot more difficult than it needs to be.

Also, note that Simon relies heavily on KDE, so you need a number of KDE related files -- if not the whole desktop -- installed. Obviously, it runs fine under GNOME, which is where I'm using it.

This post is limited to describing how to compile it, not how to use it -- that may come later.



First get the dependencies. The list of deps is taken from http://userbase.kde.org/Simon/Development_Environment#Requirements, and expanded (e.g. libboost).

sudo apt-get install qt4-qtconfig kdelibs5-dev libxtst-dev libsamplerate-dev kdepimlibs5-dev libboost-dev
sudo apt-get install build-essential cmake gettext kdeartwork libqwt-dev libqt4-sql-sqlite libphonon-dev libattica0 libattica-dev zlib1g-dev libasound2-dev 


Next sort out Julius
(I think it used to be in the repos, but it's not there anymore from what I can see.)

mkdir ~/tmp/simon -p
cd ~/tmp/simon
wget http://jaist.dl.sourceforge.jp/julius/56549/julius-4.2.2.tar.gz
tar xvf julius-4.2.2.tar.gz
cd julius-4.2.2/
./configure --prefix=$HOME/.simon/julius
****************************************************************
Julius/Julian libsent library rev.4.2.2:

- Audio I/O
    primary mic device API    : alsa (Advanced Linux Sound Architecture)
    available mic device API  : alsa oss
    supported audio format    : RAW and WAV only
    NetAudio support          : no
- Language Modeling
    class N-gram support      : yes
- Libraries
    file decompression by     : zlib library
- Process management
    fork on adinnet input     : no

  Note: compilation time flags are now stored in "libsent-config".
        If you link this library, please add output of
        "libsent-config --cflags" to CFLAGS and
        "libsent-config --libs" to LIBS.
****************************************************************
make
make install
cd ../


HTK
You'll need HTK to use adaptive models with the Voxforge base. It's not necessary otherwise.
To get HTK, register here: http://htk.eng.cam.ac.uk/register.shtml

You should receive a password immediately afterwards. Then go to http://htk.eng.cam.ac.uk/download.shtml and download the Linux sources. Put the HTK-3.4.1.tar.gz file in ~/tmp/simon.
cd ~/tmp/simon
tar xvf HTK-3.4.1.tar.gz
cd htk/
./configure --prefix=$HOME/.simon/htk --disable-hslab
make all
mkdir -p ~/.simon/htk
make install

For some reason it's built as 32 bit, not 64. Also, it wants the parent install directory to exist before running make install.

Next, sphinx:

cd ~/tmp/simon
wget http://sourceforge.net/projects/cmusphinx/files/sphinxbase/0.8/sphinxbase-0.8.tar.gz
tar xvf sphinxbase-0.8.tar.gz 
cd sphinxbase-0.8/
./configure --prefix=$HOME/.simon/sphinxbase
make
make install
cd ../


wget http://sourceforge.net/projects/cmusphinx/files/pocketsphinx/0.8/pocketsphinx-0.8.tar.gz
tar xvf pocketsphinx-0.8.tar.gz
cd pocketsphinx-0.8/
./configure --prefix=$HOME/.simon/pocketsphinx SphinxBase_CFLAGS=-I$HOME/.simon/sphinxbase/include/sphinxbase SphinxBase_LIBS=-L$HOME/.simon/sphinxbase/lib
make
make install
cd ../

wget http://sourceforge.net/projects/cmusphinx/files/sphinxtrain/1.0.8/sphinxtrain-1.0.8.tar.gz
tar xvf sphinxtrain-1.0.8.tar.gz
cd sphinxtrain-1.0.8/
./configure --prefix=$HOME/.simon/sphinxtrain SphinxBase_CFLAGS=-I$HOME/.simon/sphinxbase/include/sphinxbase SphinxBase_LIBS=-L$HOME/.simon/sphinxbase/lib
make
make install


I couldn't sort out OpenCV, so this Simon won't be watching you.


Finally, Simon.
wget http://mirrors.mit.edu/kde/stable/simon/0.4.0/src/simon-0.4.0.tar.bz2
tar xvf simon-0.4.0.tar.bz2
mkdir build/
cd build/
cmake -DCMAKE_INSTALL_PREFIX=$HOME/.simon -DPOCKETSPHINX_LIBRARIES=$HOME/.simon/pocketsphinx/lib/libpocketsphinx.so -DPOCKETSPHINX_INCLUDE_DIR=$HOME/.simon/pocketsphinx/include/ -DSphinxBase_INCLUDE_DIR=$HOME/.simon/sphinxbase/include/ -DSphinxBase_LIBRARY=$HOME/.simon/sphinxbase/lib/libsphinxbase.so -DPJULIUS=$HOME/.simon/julius -DPSPHINX=$HOME/.simon/sphinxbase ../simon-0.4.0/

at which point you should see:
-----------------------------------------------------------------------------
-- The following external packages were located on your system.
-- This installation will have the extra features provided by these packages.
-----------------------------------------------------------------------------
   * LibSampleRate - Resampling library
   * KDE PIM Libs - KDE Libraries for PIM
   * Sphinxbase - Open source toolkit for speech recognition
   * PocketSphinx - PocketSphinx is a small-footprint continuous speech recognition system

-----------------------------------------------------------------------------
-- The following OPTIONAL packages could NOT be located on your system.
-- Consider installing them to enable more features from this software.
-----------------------------------------------------------------------------
   * qaccessibilityclient  KDE client-side accessibility library
     Required to enable ATSPI plugin.
   * OpenCV  OpenCV (Open Source Computer Vision) is a library of programming functions for real time computer vision
     Required for Simon Vision

-----------------------------------------------------------------------------


And now a bit of a hack job -- we give the sphinx include dirs above, and while that works fine for most include files (they use e.g. sphinxbase/includeme.h), a number of files (prim_type, sphinx_config.h etc.) point towards the wrong directory (e.g. #include ). Two solutions -- either edit the files or put symlinks where the files are looking. Symlinks work well for me.


ln -s ~/.simon/sphinxbase/include/sphinxbase/sphinx_config.h ~/.simon/sphinxbase/include/sphinx_config.h
ln -s ~/.simon/pocketsphinx/include/pocketsphinx/pocketsphinx_export.h ~/.simon/pocketsphinx/include/pocketsphinx_export.h
ln -s ~/.simon/pocketsphinx/include/pocketsphinx/cmdln_macro.h ~/.simon/pocketsphinx/include/cmdln_macro.h
ln -s ~/.simon/pocketsphinx/include/pocketsphinx/ps_lattice.h ~/.simon/pocketsphinx/include/ps_lattice.h
ln -s ~/.simon/pocketsphinx/include/pocketsphinx/ps_mllr.h ~/.simon/pocketsphinx/include/ps_mllr.h
ln -s ~/.simon/pocketsphinx/include/pocketsphinx/fsg_set.h ~/.simon/pocketsphinx/include/fsg_set.h

make
make install
cd ~/.simon/bin
export PATH=$PATH:$HOME/.simon/julius/bin:$HOME/.simon/sphinxtrain/bin:$HOME/.simon/htk/bin:$HOME/.simon/bin
simond &
simon
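
The header symlinks created before make can also be set up with a small loop, which is handy if more headers turn out to be missing. This is a sketch only: link_headers is a hypothetical helper, and it uses ln -sf (rather than the plain ln -s above) so that a stale link from an earlier attempt gets replaced.

```shell
#!/bin/sh
# Link headers from INCDIR/SUBDIR/ up into INCDIR/ so that includes
# without the subdirectory prefix resolve.
# Usage: link_headers INCDIR SUBDIR FILE...
link_headers() {
    incdir=$1
    subdir=$2
    shift 2
    for h in "$@"; do
        ln -sf "$incdir/$subdir/$h" "$incdir/$h"
    done
}

# The same links as in this post:
# link_headers "$HOME/.simon/sphinxbase/include" sphinxbase sphinx_config.h
# link_headers "$HOME/.simon/pocketsphinx/include" pocketsphinx \
#     pocketsphinx_export.h cmdln_macro.h ps_lattice.h ps_mllr.h fsg_set.h
```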

I haven't really explored it yet, but here are the first few dialogues:




There's an official video which you may want to watch before you get started:



Issues:
Stuff that doesn't work: Configure Audio from within simon, Configure Acoustic Model from within Simon
Stuff that does work: speech recognition.

Note:
If you set it up 'wrong' the first time around (or it crashed coding samples as above) and want the wizard to run again, delete all the configuration files:
rm ~/.kde/share/apps/simon* -rf
rm ~/.kde/share/config/simon*
