11 September 2012

232. Compile parallel (threaded) povray 3.7-rc6 on Debian Wheezy

Update 13 May 2013: This build won't work with v3.7-rc7 on debian wheezy if you have libjpeg62 installed. See http://verahill.blogspot.com.au/2013/05/413-povray-37-rc7-on-debian-wheezy.html.

Remove libjpeg62 and it works fine though.

Original post
Expanding my little cluster has got me thinking about additional uses for it. The primary purpose is obviously work i.e. MD simulations using gromacs and ab initio calcs using NWChem and Gaussian. I'm also testing it with John the Ripper to see how well the users of the linux box in the lab are choosing their passwords.

At that point I realised that it'd be sweet to have at least an OMP capable version of povray to speed things up when polishing figures for those elusive journal covers.

Debian testing currently uses v. 3.6.1 of povray but

  POV-Ray 3.6 does not support multithreaded rendering. POV-Ray 3.7 does.

So compile we will although v 3.7 is beta, so be aware.
sudo mkdir /opt/povray
sudo chown $USER /opt/povray

wget http://povray.org/redirect/www.povray.org/beta/source/povray-3.7.0.RC6.tar.gz
tar xvf povray-3.7.0.RC6.tar.gz
cd povray-3.7.0.RC6/
sudo apt-get install libboost-all-dev libpng-dev libjpeg-dev libtiff-dev build-essential libsdl-dev

Note: libboost-all-dev is big. It might be enough with libboost-thread-dev

./configure --prefix=/opt/povray --program-suffix=_3.7 COMPILED_BY="me@here"
===============================================================================
POV-Ray 3.7.0.RC5 has been configured.

Built-in features:
  I/O restrictions:          enabled
  X Window display:          disabled
  Supported image formats:   gif tga iff ppm pgm hdr png jpeg tiff
  Unsupported image formats: openexr

Compilation settings:
  Build architecture:  x86_64-unknown-linux-gnu
  Built/Optimized for: x86_64-unknown-linux-gnu (using -march=native)
  Compiler vendor:     gnu
  Compiler version:    g++ 4.7
  Compiler flags:      -pipe -Wno-multichar -Wno-write-strings -fno-enforce-eh-specs -s -O3 -ffast-math -march=native -pthread

Type 'make check' to build the program and run a test render.
Type 'make install' to install POV-Ray on your system.

The POV-Ray components will be installed in the following directories:
  Program (executable):       /opt/povray/bin
  System configuration files: /opt/povray/etc/povray/3.7
  User configuration files:   $HOME/.povray/3.7
  Standard include files:     /opt/povray/share/povray-3.7/include
  Standard INI files:         /opt/povray/share/povray-3.7/ini
  Standard demo scene files:  /opt/povray/share/povray-3.7/scenes
  Documentation (text, HTML): /opt/povray/share/doc/povray-3.7
  Unix man page:              /opt/povray/share/man
===============================================================================

The way it is configured we can keep our debian version of povray and install the newer version (povray_3.7)

make
make install

Seems like -geometry 1000x1000 doesn't work anymore. Instead use -H1000 -W1000

I've played around with it a little bit and it does parallel (threaded) execution nicely.

wget http://www.ms.uky.edu/~lee/visual05/povray/fourcube7.pov
./povray_3.7 -H1000 -W1000 fourcube7.pov +A0.1
takes 9 seconds on an AMD II X3. The standard, serial Debian version takes 21 seconds.

231. Compiling john the ripper: single/serial, parallel/OMP and MPI

Update: updated for v1.7.9-jumbo-7 since hccap2john in 1.7.9-jumbo-6 was broken

For no particular reason at all, here's how to compile John the Ripper on Debian Testing (Wheezy). It's very easy, and this post is probably a bit superfluous. The standard version only supports serial and parallel (OMP). See below for MPI.


The regular version: 

mkdir ~/tmp
cd ~/tmp
wget http://www.openwall.com/john/g/john-1.7.9.tar.gz
tar xvf john-1.7.9.tar.gz
cd john-1.7.9/src

If you don't edit the Makefile you build a serial/single-threaded binary.
If you want to build a threaded version for a single node with a multicore processor (OMP) do:
Edit Makefile and uncomment row 19 or 20

 18 # gcc with OpenMP
 19 OMPFLAGS = -fopenmp
 20 OMPFLAGS = -fopenmp -msse2
make clean linux-x86-64
cd ../run

You now have a binary called john in your ../run folder.


The Jumbo version:
If you want to build a distributed version with MPI (can split jobs across several nodes) you need the enhanced, community version:

sudo apt-get install openmpi-bin libopenmpi-dev
cd ~/tmp
wget http://www.openwall.com/john/g/john-1.7.9-jumbo-7.tar.gz
tar xvf john-1.7.9-jumbo-7.tar.gz 
cd john-1.7.9-jumbo-7/src

Edit the Makefile
  20 ## Uncomment the TWO lines below for MPI (can be used together with OMP as well)
  21 ## For experimental MPI_Barrier support, add -DJOHN_MPI_BARRIER too.
  22 ## For experimental MPI_Abort support, add -DJOHN_MPI_ABORT too.
  23 CC = mpicc -DHAVE_MPI
  24 MPIOBJ = john-mpi.o

and do
make clean linux-x86-64-native
cd ../run

I had a look at the passwords on one of our lab boxes -- it immediately discovered that someone had used 'password' as the password...


These test were run on my old AMD II X3 445. Processes which don't speed up with MP are highlighted in red. LM DES is borderline -- it's faster, but doesn't scale well.

Here's the single thread/serial version:
./john --test
Benchmarking: Traditional DES [128/128 BS SSE2-16]... DONE
Many salts:     2906K c/s real, 2918K c/s virtual
Only one salt:  2796K c/s real, 2807K c/s virtual
Benchmarking: BSDI DES (x725) [128/128 BS SSE2-16]... DONE
Many salts:     95564 c/s real, 95948 c/s virtual
Only one salt:  93593 c/s real, 93781 c/s virtual
Benchmarking: FreeBSD MD5 [32/64 X2]... DONE
Raw:    14094 c/s real, 14122 c/s virtual
Benchmarking: OpenBSD Blowfish (x32) [32/64 X2]... DONE
Raw:    918 c/s real, 919 c/s virtual
Benchmarking: Kerberos AFS DES [48/64 4K]... DONE
Short:  474316 c/s real, 475267 c/s virtual
Long:   1350K c/s real, 1356K c/s virtual
Benchmarking: LM DES [128/128 BS SSE2-16]... DONE
Raw:    39843K c/s real, 39923K c/s virtual
Benchmarking: generic crypt(3) [?/64]... DONE
Many salts:     262867 c/s real, 263393 c/s virtual
Only one salt:  260121 c/s real, 260642 c/s virtual
Benchmarking: Tripcode DES [48/64 4K]... DONE
Raw:    369843 c/s real, 370584 c/s virtual
Benchmarking: dummy [N/A]... DONE
Raw:    99512K c/s real, 99712K c/s virtual
Here's the OMP version:
Benchmarking: Traditional DES [128/128 BS SSE2-16]... DONE
Many salts:     6706K c/s real, 2555K c/s virtual
Only one salt:  5015K c/s real, 2091K c/s virtual
Benchmarking: BSDI DES (x725) [128/128 BS SSE2-16]... DONE
Many salts:     205670 c/s real, 85411 c/s virtual
Only one salt:  238524 c/s real, 86720 c/s virtual
Benchmarking: FreeBSD MD5 [32/64 X2]... DONE
Raw:    38400 c/s real, 13812 c/s virtual
Benchmarking: OpenBSD Blowfish (x32) [32/64 X2]... DONE
Raw:    2306 c/s real, 845 c/s virtual
Benchmarking: Kerberos AFS DES [48/64 4K]... DONE
Short:  474675 c/s real, 476581 c/s virtual
Long:   1332K c/s real, 1335K c/s virtual
Benchmarking: LM DES [128/128 BS SSE2-16]... DONE
Raw:    49046K c/s real, 16785K c/s virtual
Benchmarking: generic crypt(3) [?/64]... DONE
Many salts:     721670 c/s real, 246640 c/s virtual
Only one salt:  699168 c/s real, 239605 c/s virtual
Benchmarking: Tripcode DES [48/64 4K]... DONE
Raw:    367444 c/s real, 369657 c/s virtual
Benchmarking: dummy [N/A]... DONE
Raw:    100351K c/s real, 100552K c/s virtual
And here's the MPI version:
mpirun -n 3 ./john --test
(note that this includes a great many more tests than the default version)
Benchmarking: Traditional DES [128/128 BS SSE2-16]... (3xMPI) DONE
Many salts:     8533K c/s real, 8707K c/s virtual
Only one salt:  7705K c/s real, 8110K c/s virtual
Benchmarking: BSDI DES (x725) [128/128 BS SSE2-16]... (3xMPI) DONE
Many salts:     279808 c/s real, 282634 c/s virtual
Only one salt:  273362 c/s real, 276096 c/s virtual
Benchmarking: FreeBSD MD5 [128/128 SSE2 intrinsics 12x]... (3xMPI) DONE
Raw:    65124 c/s real, 65781 c/s virtual
Benchmarking: OpenBSD Blowfish (x32) [32/64 X2]... (3xMPI) DONE
Raw:    2722 c/s real, 2749 c/s virtual
Benchmarking: Kerberos AFS DES [48/64 4K]... (3xMPI) DONE
Short:  1387K c/s real, 1415K c/s virtual
Long:   3880K c/s real, 3959K c/s virtual

Benchmarking: LM DES [128/128 BS SSE2-16]... (3xMPI) DONERaw:    114781K c/s real, 115940K c/s virtual

I don't quite understand the Kerberos results.



Other targets of interest are:

linux-x86-64-avx         Linux, x86-64 with AVX (2011+ Intel CPUs)
linux-x86-64-xop         Linux, x86-64 with AVX and XOP (2011+ AMD CPUs)
linux-x86-64             Linux, x86-64 with SSE2 (most common)
linux-x86-avx            Linux, x86 32-bit with AVX (2011+ Intel CPUs)
linux-x86-xop            Linux, x86 32-bit with AVX and XOP (2011+ AMD CPUs)
linux-x86-sse2           Linux, x86 32-bit with SSE2 (most common, if 32-bit)
linux-x86-mmx            Linux, x86 32-bit with MMX (for old computers)
linux-x86-any            Linux, x86 32-bit (for truly ancient computers)

The FX 8150 does AVX and XOP, while my 1055T doesn't.

The community version has more options:

linux-x86-64-native      Linux, x86-64 'native' (all CPU features you've got)
linux-x86-64-gpu         Linux, x86-64 'native', CUDA and OpenCL (experimental)
linux-x86-64-opencl      Linux, x86-64 'native', OpenCL (experimental)
linux-x86-64-cuda        Linux, x86-64 'native', CUDA (experimental)
linux-x86-64-avx         Linux, x86-64 with AVX (2011+ Intel CPUs)
linux-x86-64-xop         Linux, x86-64 with AVX and XOP (2011+ AMD CPUs)
linux-x86-64[i]          Linux, x86-64 with SSE2 (most common)
linux-x86-64-icc         Linux, x86-64 compiled with icc
linux-x86-64-clang       Linux, x86-64 compiled with clang
linux-x86-gpu            Linux, x86 32-bit with SSE2, CUDA and OpenCL (experimental)
linux-x86-opencl         Linux, x86 32-bit with SSE2 and OpenCL (experimental)
linux-x86-cuda           Linux, x86 32-bit with SSE2 and CUDA (experimental)
linux-x86-sse2[i]        Linux, x86 32-bit with SSE2 (most common, 32-bit)
linux-x86-native         Linux, x86 32-bit, with all CPU features you've got (not necessarily best)
linux-x86-mmx            Linux, x86 32-bit with MMX (for old computers)
linux-x86-any            Linux, x86 32-bit (for truly ancient computers)
linux-x86-clang          Linux, x86 32-bit with SSE2, compiled with clang
linux-alpha              Linux, Alpha
linux-sparc              Linux, SPARC 32-bit
linux-ppc32-altivec      Linux, PowerPC w/AltiVec (best)
linux-ppc32              Linux, PowerPC 32-bit
linux-ppc64              Linux, PowerPC 64-bit
linux-ia64               Linux, IA-64

10 September 2012

230. ROCKS 5.4.3, ATLAS and Gromacs on Xeon X3480

After doing another round of 'benchmarks' (there are so many factors that differ between the systems that it's difficult to tell exactly what I'm measuring) I'm back to looking at my BLAS/LAPACK.


So here's compiling ATLAS on a cluster made up of six dual-socket mobos with 2x quadcore XeonX3480 CPUs and 8 Gb RAM. The cluster is running ROCKS 5.4.3, which is a spin based on Centos 5.6. We then compile GROMACS using ATLAS and compare it with Openblas. Please note that I am not an expert on optimisations (or computers or anything) so comparing Openblas vs ATLAS won't tell you which one is 'better'. They are just numbers based on what someone once observed on a particular system under a particular set of circumstances.

Hurdles: I first had to deal with the lapack + bad symbols + recompile with -fPIC problem (solved by using netlib lapack and building shared libraries), then encountered the 'libgmx.so: undefined reference to _gfortran_' issue (solved by adding -lgfortran to LDFLAGS).


ATLAS
sudo mkdir /share/apps/ATLAS
sudo chown $USER /share/apps/ATLAS
cd ~/tmp
wget http://www.netlib.org/lapack/lapack-3.4.1.tgz
 wget http://downloads.sourceforge.net/project/math-atlas/Developer%20%28unstable%29/3.9.72/atlas3.9.72.tar.bz2
tar xvf atlas3.9.72.tar.bz2
cd ATLAS/
mkdir build
cd build
.././configure --prefix=/share/apps/ATLAS -Fa alg '-fPIC' --with-netlib-lapack-tarfile=$HOME/tmp/lapack-3.4.1.tgz --shared
OS configured as Linux (1)
Assembly configured as GAS_x8664 (2)
Vector ISA Extension configured as  SSE3 (6,448)
Architecture configured as  Corei1 (25)
Clock rate configured as 3059Mhz
make
DONE  STAGE 5-1-0 at 15:23
ATLAS install complete.  Examine
ATLAS/bin/<arch>/INSTALL_LOG/SUMMARY.LOG for details.

ls lib/
libatlas.a  libcblas.a  libf77blas.a  libf77refblas.a  liblapack.a  libptcblas.a  libptf77blas.a  libptlapack.a  libsatlas.so  libtatlas.so  libtstatlas.a  Makefile  Make.inc
make install

In addition to successful copying you'll also get errors along the lines of

cp: cannot stat `/home/me/tmp/ATLAS/build/lib/libsatlas.dylib': No such file or directory
make[1]: [install_lib] Error 1 (ignored)
cp /home/me/tmp/ATLAS/build/lib/libtatlas.dylib /share/apps/ATLAS/lib/.
cp: cannot stat `/home/me/tmp/ATLAS/build/lib/libtatlas.dylib': No such file or directory
make[1]: [install_lib] Error 1 (ignored)
cp /home/me/tmp/ATLAS/build/lib/libsatlas.dll /share/apps/ATLAS/lib/.
cp: cannot stat `/home/me/tmp/ATLAS/build/lib/libsatlas.dll': No such file or directory
make[1]: [install_lib] Error 1 (ignored)
cp /home/me/tmp/ATLAS/build/lib/libtatlas.dll /share/apps/ATLAS/lib/.
cp: cannot stat `/home/me/tmp/ATLAS/build/lib/libtatlas.dll': No such file or directory
make[1]: [install_lib] Error 1 (ignored)
cp /home/me/tmp/ATLAS/build/lib/libsatlas.so /share/apps/ATLAS/lib/.
cp /home/me/tmp/ATLAS/build/lib/libtatlas.so /share/apps/ATLAS/lib/.
because those files don't exist. 


Gromacs

FFTW3 was first build according to this. The only difference is the install targets (--prefix) -- I put things in /share/apps/gromacs/.fftwsingle and /share/apps/gromacs/.fftwdouble. Gromacs was downloaded and extracted as shown in that post, and /share/apps/gromacs was created.
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/opt/openmpi/lib:/share/apps/ATLAS/lib
#single precision
export LDFLAGS="-L/share/apps/gromacs/.fftwsingle/lib -L/share/apps/ATLAS/lib -latlas -llapack -lf77blas -lcblas -lgfortran"
export CPPFLAGS="-I/share/apps/gromacs/.fftwsingle/include -I/share/apps/ATLAS/include/atlas"
./configure --disable-mpi --enable-float --with-fft=fftw3 --with-external-blas --with-external-lapack --program-suffix=_spa --prefix=/share/apps/gromacs
make -j3
make install
#double precision
make distclean
export LDFLAGS="-L/share/apps/gromacs/.fftwdouble/lib -L/share/apps/ATLAS/lib -latlas -llapack -lf77blas -lcblas -lgfortran"
export CPPFLAGS="-I/share/apps/gromacs/.fftwdouble/include -I/share/apps/ATLAS/include/atlas"
./configure --disable-mpi --disable-float --with-fft=fftw3 --with-external-blas --with-external-lapack --program-suffix=_dpa --prefix=/share/apps/gromacs
make -j3
make install
#single + mpi
make distclean
export LDFLAGS="-L/share/apps/gromacs/.fftwsingle/lib -L/share/apps/ATLAS/lib -latlas -llapack -lf77blas -lcblas -lgfortran"
export CPPFLAGS="-I/share/apps/gromacs/.fftwsingle/include -I/share/apps/ATLAS/include/atlas""
./configure --enable-mpi --enable-float --with-fft=fftw3 --with-external-blas --with-external-lapack --program-suffix=_spampi --prefix=/share/apps/gromacs
make -j3
make install
#double + mpi
make distclean
export LDFLAGS="-L/share/apps/gromacs/.fftwdouble/lib -L/share/apps/ATLAS/lib -latlas -llapack -lf77blas -lcblas -lgfortran"
export CPPFLAGS="-I/share/apps/gromacs/.fftwdouble/include -I/share/apps/ATLAS/include/atlas"
./configure --enable-mpi --disable-float --with-fft=fftw3 --with-external-blas --with-external-lapack --program-suffix=_dpampi --prefix=/share/apps/gromacs
make -j3
make install
The -lgfortran is IMPORTANT, or you'll end up with
libgmx.so: 'undefined reference to _gfortran_' type errors.

Performance
I ran a 6x6x6 nm box of water for 5 million steps (10 ns) to get a rough idea of the performance.
Make sure to put
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/share/apps/ATLAS/lib
in your ~/.bashrc, and to include it in your SGE jobs files (if that's what you use).

I allocated 8 Gb RAM and 8 cores for each run.

Double precision:
Openblas: 10.560 ns/day (11.7 GFLOPS, runtime 8182 seconds)
ATLAS  : 10.544 ns/day (11.6 GFLOPS, runtime 8194 seconds)

Single precision:
Openblas: 17.297 ns/dat (19.1 GFLOPS, runtime 4995 seconds)
ATLAS:   17.351 ns/day (19.2 GFLOPS, runtime 4980 seconds)
That's 15 seconds difference on a 1h 20 min run. I'd say they are identical.

07 September 2012

229. Compile ATLAS (+ gromacs, nwchem) on AMD FX 8150 on Debian Testing (Wheezy)

Xianyi's openblas doesn't seem to be ready for AMD FX 8150 yet. I've played with ATLAS in the past, but  for some reason didn't see the same performance with NWChem and ATLAS as I saw with NWChem and Openblas, so I never ended up using it.

I'm also having issues using openblas with CPMD and quantum espresso, and ATLAS is a well-established, respectable project, so it's time to give it another shot. As in most cases in these situations, it's probably a matter of PEBKAC.

Building ATLAS
Anyway. On we go...

mkdir /opt/ATLAS
chown ${USER} /opt/ATLAS
mkdir ~/tmp
cd ~/tmp

wget http://www.netlib.org/lapack/lapack-3.4.1.tgz
 wget http://downloads.sourceforge.net/project/math-atlas/Developer%20%28unstable%29/3.9.72/atlas3.9.72.tar.bz2
tar xvf atlas3.9.72.tar.bz2
cd ATLAS/

Edit ATLAS/Make.top
change the V on line 6 to lowercase i.e. from
- $(ICC) -V 2>&1 >> bin/INSTALL_LOG/ERROR.LOGto
- $(ICC) -v 2>&1 >> bin/INSTALL_LOG/ERROR.LOG
mkdir build/
cd build/


sudo apt-get install cpufreq-utils
cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
ondemand
sudo cpufreq-set -g performance

Unfortunately that only takes care of cpu0:

cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
performance
but

cat /sys/devices/system/cpu/cpu1/cpufreq/scaling_governor
ondemand
So...since we have 8 cores (cpu0-cpu7):

sudo cp /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor /sys/devices/system/cpu/cpu1/cpufreq/scaling_governor
sudo cp /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor /sys/devices/system/cpu/cpu2/cpufreq/scaling_governor
sudo cp /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor /sys/devices/system/cpu/cpu3/cpufreq/scaling_governor
sudo cp /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor /sys/devices/system/cpu/cpu4/cpufreq/scaling_governor
sudo cp /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor /sys/devices/system/cpu/cpu5/cpufreq/scaling_governor
sudo cp /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor /sys/devices/system/cpu/cpu6/cpufreq/scaling_governor
sudo cp /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor /sys/devices/system/cpu/cpu7/cpufreq/scaling_governor

OK, we're ready to compile:
.././configure --prefix=/opt/ATLAS -Fa alg '-fPIC' --with-netlib-lapack-tarfile=$HOME/tmp/lapack-3.4.1.tgz --shared

Some of the info that's important is:
OS configured as Linux (1)
Assembly configured as GAS_x8664 (2)
Vector ISA Extension configured as  AVXFMA4 (4,496)
Architecture configured as  AMDDOZER (34)
Clock rate configured as 3600Mhz
If that checks out you don't need to manually set your architecture. To get a list over options, do
 make xprint_enums ; ./xprint_enums

If all is well,

make
make install

You should now be done.

Linking Gromacs against ATLAS

export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/lib/openmpi/lib:/opt/ATLAS/lib
#single precision
export LDFLAGS="-L/opt/fftw/fftw-3.3.2/single/lib -L/opt/ATLAS/lib -lsatlas -ltatlas -lgfortran"
export CPPFLAGS="-I/opt/fftw/fftw-3.3.2/single/include -I/opt/ATLAS/include"
./configure --disable-mpi --enable-float --with-fft=fftw3 --with-external-blas --with-external-lapack --program-suffix=_spatlas --prefix=/opt/gromacs/gromacs-4.5.5
make -j6 2>make.err 1>make.log
make install

#double precision
make distclean
export LDFLAGS="-L/opt/fftw/fftw-3.3.2/double/lib -L/opt/ATLAS/lib -lsatlas -ltatlas -lgfortran" 
export CPPFLAGS="-I/opt/fftw/fftw-3.3.2/double/include -I/opt/ATLAS/include"
./configure --disable-mpi --disable-float --with-fft=fftw3 --with-external-blas --with-external-lapack --program-suffix=_dpatlas --prefix=/opt/gromacs/gromacs-4.5.5
make -j6 2>make2.err 1>make2.log
make install

#single + mpi
make distclean
export LDFLAGS="-L/opt/fftw/fftw-3.3.2/single/lib -L/opt/ATLAS/lib -lsatlas -ltatlas -lgfortran"
export CPPFLAGS="-I/opt/fftw/fftw-3.3.2/single/include -I/opt/ATLAS/include"
./configure --enable-mpi --enable-float --with-fft=fftw3 --with-external-blas --with-external-lapack --program-suffix=_spmpiatlas --prefix=/opt/gromacs/gromacs-4.5.5
make -j6 2>make3.err 1>make3.log
make install

#double + mpi
make distclean
export LDFLAGS="-L/opt/fftw/fftw-3.3.2/double/lib -L/opt/ATLAS/lib -lsatlas  -ltatlas  -lgfortran" 
export CPPFLAGS="-I/opt/fftw/fftw-3.3.2/double/include -I/opt/ATLAS/include"
./configure --enable-mpi --disable-float --with-fft=fftw3 --with-external-blas --with-external-lapack --program-suffix=_dpmpiatlas --prefix=/opt/gromacs/gromacs-4.5.5
make -j6 2>make4.err 1>make4.log
make install

Linking NWChem against ATLAS

export LARGE_FILES=TRUE
export TCGRSH=/usr/bin/ssh
export NWCHEM_TOP=`pwd`
export NWCHEM_TARGET=LINUX64
export NWCHEM_MODULES="all"
export BLASOPT="-L/opt/ATLAS/lib -lsatlas -ltatlas"
export USE_MPI=y
export USE_MPIF=y
export USE_MPIF4=y
export MPI_LOC=/usr/lib/openmpi/lib
export MPI_INCLUDE=/usr/lib/openmpi/include
export LIBRARY_PATH="$LIBRARY_PATH:/usr/lib/openmpi/lib:/opt/ATLAS/lib"
export LIBMPI="-lmpi -lopen-rte -lopen-pal -ldl -lmpi_f77 -lpthread"
export LDFLAGS="-I/opt/ATLAS/include"
cd $NWCHEM_TOP/src
make clean
make nwchem_config
make FC=gfortran 2> make.err 1>make.log
export FC=gfortran
cd $NWCHEM_TOP/contrib
./getmem.nwchem

228. Setting up Asus (nvidia) GF 210 on Debian Testing

NOTE:  Unless I remove the legacy driver, *DM will not start. Instead I only get a blank screen with a blinking cursor. See below for solution.

Here's how to get ASUS (nvidia) GF210 up and running in debian testing (wheezy)

First edit /etc/modules, and add
blacklist nouveau

You can either reboot at this point or try
sudo rmmod nouveau

To see whether nouveau got unloaded, do
lsmod |grep nouv

If nothing is returned, then you're good.

Make sure that your card got recognised:

lspci
01:00.0 VGA compatible controller: NVIDIA Corporation GT218 [GeForce 210] (rev a2)

I like smxi, so here's how to get the drivers up and running using smxi, which is a fancy shell script.

sudo su
cd /usr/local/bin
wget -Nc smxi.org/smxi.zip
unzip smxi.zip
smxi

The first time you run smxi you have a couple of things to sort out -- lots of little questions to answer. If you don't feel comfortable yet with linux, avoid liquorix since it'll make your debian box deviate more from the standard setup (the liquorix kernel is fine and safe and I've used it in the past before I started rolling my own kernel, but it's more difficult for someone to troubleshoot your system the more it deviates from their own). Other than that most questions aren't that important. I enable non-free immediately after setting up a new box, and while there are sound political reasons for NOT doing it, there are plenty of practical reasons in favour of it.

Anyway, eventually you're done with the setup, and with making sure that your system is up to date. Select Continue to Graphics, then select debian-nvidia

If all goes well you'll get the dkms install of the nvidia driver. You're probably asked whether to generate a new xorg.conf, which you should. You may also get a message about the nvidia driver needing to be added to your xorg.conf.

Once you're done installing the driver, you're asked whether to start your desktop or to quit. While it's fine to start your desktop at this point, why not select quit and check that all went well?

lsmod|grep nvi
nvidia               8028141  0
i2c_core               24002  2 i2c_piix4,nvidia

cat /etc/X11/xorg.conf|grep nvi
        Driver  "nvidia"
        Driver  "nvidia"
        Driver  "nvidia"

Looks fine.

I often have problems with the legacy drivers (blank screen with blinking cursor), so
sudo apt-get purge nvidia-*-legacy-173xx-*

Do
aptitude search 173 
to make sure all the legacy drivers are gone.

Purge if there's still something around.

Framebuffer 
If possible I like to enable framebuffer (it gives you fancier graphics capabilities in terminal mode e.g. browsing with images using w3m). I've had all manner of headaches doing so with the newer nvidia drivers though, so don't be too surprised if it doesn't pan out.

Edit the following line in your /etc/default/grub
GRUB_CMDLINE_LINUX_DEFAULT="quiet text vga=0x0318 nomodeset"
To see what code to use, look here.
This method is supposed to be deprecated, but I don't have any experience using vbetool.

As for the other options, only 'text' is important -- it will make you boot into the terminal and you will have to start your (default) desktop by doing startx. IF you want to boot into e.g. gdm3, kdm or another *dm, then DO NOT ADD text.

Reboot. To see whether your framebuffer is active do
ls /etc/fb*
/dev/fb0


04 September 2012

227. New compute node using AMD FX-8150. Gromacs, nwchem performance/benchmarks

Update: reconfiguring your nwchem binary using getmem.nwchem can speed things up considerably. Most of the runtimes are obtained without using getmem.nwchem and are thus all using the same amount of memory, regardless of what is available. Binaries which have been reconfigured are shown as such.

The short summary: I first wasn't that happy with my choice of the the AMD FX-8150, but after sorting out the ACML libs and getting things benchmarked I'm much more satisfied. The only situation in which I'm not seeing this processor outclass the other systems seems to be that using the Commercial Ab Initio Package, which arrived as a precompiled binary (Portland Fortran).

In general it seems that the FX 8150 is about 10% faster than the i5-2400 for the computations I've tested here -- but beware that the AMD processor is using the machine vendor math libs, while the intel unit is using openblas.

Note that the AMD Phenom II X6 1055T is SLOWER with the ACML libs than with Openblas.

The Lengthy Preamble
I seem to remember promising myself not to get another AMD since, while they may or may not be 'the good guys' (e.g. the Intel/Dell thingy), empirically I keep on seeing my Quadcore Intel i5-2400 3.1 GHz sweeping the floor with my Phenom II X6 1055T 2.8 GHz. Sure, part of the issue is the clock frequency, but  the difference seems to be a lot bigger than that.

At any rate, I ended up building a new node for my little cluster. Remember that these are Australian prices. Oh, how I miss you, Newegg -- not just because of the price, but because of the choice.

Luckily it seems like my choice of the FX-8150 has paid off. Also, at the moment of writing the intel i5-2400 and the AMD fx-8150 sell for the same price locally.


The Setup

It's basically an eight-core 3.6 GHz box with 16 GB RAM (expandable to 32 Gb, 4 slots) and a 7200 rpm HDD. I've heard the eight-core FX 8150 uncharitably described as a quad-core with advanced hyper-threading, but I wouldn't be qualified to comment. Interestingly, sinfo registers it as a quad core, while htop and all other programs considers it an 8-core. Finally, looking at this image it looks like the whole 8 core thing is a bit of a cheat -- the whole 4 floating point vs 8 integer processing units.

Gigabyte GA-990FXA-D3 AM3 990FX DDR3 Motherboard AU$ 128 Link
AMD AM3+ x8 FX-8150 3.6Ghz Boxed CPU AU$209 Link
PV316G186C0K 16G Kit(8Gx2) DDR3 1866 AU$ 129 Link
Hitachi 3.5" Desktar 1TB SATA3 HDD 7200rpm AU$83 Link
Corsair GS800 V2 ATX Power Supply Unit AU$ 138 Link
TP-LINK TG-3269 PCI Gigabit PCI Network Card AU$ 8Link
ASUS Vento TA-U11 without PSU AU$99 Link
ASUS 1GB GF210 PCI-E VGA Card Link

NOTE that the mobo does NOT have onboard video. I didn't pick up on that before buying the parts, but luckily had an old ATI card floating around.

The fan on the PSU is a bit annoying. It stays off for the most part (some posts say it should never be completely off, one post said it should be) but starts up in a weird way -- basically the electricity is given in small jolts. Or it's just broken. Other than that it works fine.

Preparation
It's for reasons like these I write this blog. After having installed debian testing I set up NFS, added the box as a node under Sun Grid Engine (Link), set up Gaussian (Link), and compiled Gromacs.

I encountered separate issues trying to compile Openblas (Bulldozer cores aren't supported) and Nwchem with internal libs (odd stuff). I've given up on Openblas and managed to compile nwchem against the AMD ACML. Same went for gromacs -- I eventually recompiled gromacs against ACML. Maybe it's unfair to compare ACML vs Openblas on the i5-2400, but ACML is free, MKL isn't.


Performance -- setup
Note that while I do use NFS it's not in the 'traditional way'. Each node exports a local folder to the front node so that SGE can see it. However, when you run your calcs everything is stored in a local folder, and using a locally compiled version of the number crunching software. In other words, network performance should not affect the benchmarks.

Neon is NOT using openblas, while Boron and Tantalum are. Xianyi's version of openblas won't compile on Bulldozer at the moment (it seems). I will rebuild gromacs with the ACML libs and do the benchmark again.

Also, please note that these 'benchmarks' aren't absolute -- I'm not an expert on optimising performance. You can probably use them to get an idea of the relative computational grunt of the different hardware combinations though.

FX 8150 is a lot more fun with ACML. The Phenom II 1055T is no fun with ACML.

I recompiled nwchem and gromacs on Boron (see below) to see what ACML vs Openblas would be like. I've yet to run those jobs, but will post the results when I have.

Unlike the FX-8150, the Phenom II X6 1055T does not support AVX, FMA3 or FMA4.

Configuration:
Boron (B): Phenom II X6 2.8 GHz, 8Gb RAM (2.8*6=16.8 GFLOPS predicted)
Neon (Ne): FX-8150 X8 3.6 GHz, 16 Gb RAM (3.6*8=28.8 GFLOPS predicted (int), 3.6*4=14.4 GFLOPS (fpu))
Tantalum (Ta): Quadcore i5-2400 3.1 GHz, 8 Gb RAM (3.1*4=12.4 GFLOPS predicted)
Vanadium (V):  Dual socket 2x Quadcore Xeon X3480 3.06 GHz, 8Gb. CentOS (ROCKS 5.4.3)/openblas.

Results

Gromacs --double (1 ns 6x6x6 nm tip4p water box; dynamic load balancing, double precision, 500k steps)
B  :  10.662 ns/day (11.8  GFLOPS, runtime 8104 seconds)***
B  :    9.921 ns/day ( 10.9 GFLOPS, runtime 8709 seconds)**
Ne:  10.606 ns/day (11.7  GFLOPS, runtime 8146 seconds) *
Ne:  12.375 ns/day (13.7  GFLOPS, runtime 6982 seconds)**
Ne:  12.385 ns/day (13.7  GFLOPS, runtime 6976 seconds)****
Ta:  10.825 ns/day (11.9  GFLOPS, runtime 7981 seconds)***
V :   10.560 ns/dat (11.7  GFLOPS, runtime 8182 seconds)***
*no external blas/lapack.
**using ACML libs
*** using openblas
**** using ATLAS

Gromacs --single (1 ns 6x6x6 nm tip4p water box; dynamic load balancing, single precision, 500 k steps)
B  :   17.251 ns/day (19.0 GFLOPS, runtime 5008 seconds)***
Ne:   21.874 ns/day (24.2 GFLOPS, runtime  3950 seconds)**
Ne:   21.804 ns/day (24.1 GFLOPS, runtime 3963  seconds)****
Ta:   17.345 ns/day (19.2 GFLOPS, runtime  4982 seconds)***
V :   17.297 ns/day (19.1 GFLOPS, runtime 4995 seconds)***
*no external blas/lapack.
**using ACML libs
*** using openblas
**** using ATLAS

NWChem (opt biphenyl cation, cp-md/pspw):
B  :   5951 seconds**
B  :   4084 seconds ***
B  :   1988 seconds***x
Ne:   3689 seconds**
Ta :   4102 seconds***
V :    5396 seconds***

*no external blas/lapack.
**using ACML libs
*** using openblas
x Reconfigured using getmem.nwchem

NWChem (opt biphenyl cation, geovib, 6-31G**/ub3lyp):
B  :  2841 seconds **
B  :  2410 seconds***
B  :  2101 seconds ***x
Ne: 1665 seconds **
Ta : 1785 seconds***
Ta : 1789 seconds***x
V  : 2600 seconds***

*no external blas/lapack.
**using ACML libs
*** using openblas
x Reconfigured using getmem.nwchem

A Certain Commercial Ab Initio Package (Freq calc of pre-optimised H14C19O3 at 6-31+G*/rb3lyp):
B  :    2h 00 min (CPU time 10h 37 min)
Ne:   1h 37 min (CPU time: 11h 13 min)
Ta:   1h 26 min (CPU time: 5h 27 min)
V  :   2h 15 min (CPU time 15h 50 min)
Using precompiled binaries.

More:
Since I couldn't use Xianyi's openblas with FX 8150 I downloaded the AMD ACML. I've had issues with that before, which is why I haven't been using that as a rule. This time I was motivated enough to hammer it out though. Anyway, here's the cpuid output from the acml 5.2.0:
./cpuid.exe 
Chip manufacturer: AuthenticAMD
AuthenticAMD family 15 extended family 6 model 1
Model Name: AMD FX(tm)-8150 Eight-Core Processor        
Chip supports SSE
Chip supports SSE2
Chip supports SSE3
Chip supports AVX
Chip does not support FMA3
Chip supports FMA4
See the other post from today about build nwchem with acml (hint: use the fma4_int64 libs but avoid mp).

Here's 1055T:
Chip manufacturer: AuthenticAMD
AuthenticAMD family 15 extended family 1 model 10
Model Name: AMD Phenom(tm) II X6 1055T Processor
Chip supports SSE
Chip supports SSE2
Chip supports SSE3
Chip does not support AVX
Chip does not support FMA3
Chip does not support FMA4


Issues

Openblas:
You will get SGEMM related errors trying to build openblas according to the instructions I've posted on this site before. Apparently it has to do with the way the architecture is autoselected during build. Or something. I couldn't make it work.

NwChem:
I tried building nwchem with the internal libs, but had no luck. See other posts on this blog for general instructions. Building with the AMD ACML worked fine though.


Files:

NWChem (opt biphenyl cation, cp-md/pspw):
Title "Test 1"
Start  biphenyl_cation_twisted-1
echo
charge 1
geometry autosym units angstrom
 C     0.00000     -3.54034     0.00000
 C     -1.20296     -2.84049     -0.216000
 C     -1.20944     -1.46171     -0.206253
 C     0.00000     -0.721866     0.00000
 C     1.20944     -1.46171     0.206253
 C     1.20296     -2.84049     0.216000
 C     0.00000     0.721866     0.00000
 C     1.20944     1.46171     -0.206253
 C     1.20296     2.84049     -0.216000
 C     -1.20944     1.46171     0.206253
 C     0.00000     3.54034     0.00000
 C     -1.20296     2.84049     0.216000
 H     0.00000     -4.62590     0.00000
 H     -2.12200     -3.38761     -0.395378
 H     -2.13673     -0.938003     -0.401924
 H     2.12200     -3.38761     0.395378
 H     2.12200     3.38761     -0.395378
 H     -2.13673     0.938003     0.401924
 H     0.00000     4.62590     0.00000
 H     -2.12200     3.38761     0.395378
 H     2.13673     0.938003     -0.401924
 H     2.13673     -0.938003     0.401924
end
nwpw
  simulation_cell
     lattice_vectors
      2.000000e+01 0.000000e+00 0.000000e+00
      0.000000e+00 2.000000e+01 0.000000e+00
      0.000000e+00 0.000000e+00 2.000000e+01
  end
  mult 2
  np_dimensions -1  -1
  tolerances 1e-7  1e-7
end
driver
  default
end
task pspw optimize
NWChem (opt biphenyl cation, geovib, 6-31G**/ub3lyp):
Title "Test 2"
Start  biphenyl_cation_twisted
echo
charge 1
geometry autosym units angstrom
 C     0.00000     -3.56301     0.00000
 C     -1.13927     -2.85928     -0.393841
 C     -1.13879     -1.46545     -0.394153
 C     0.00000     -0.742814     0.00000
 C     1.13879     -1.46545     0.394153
 C     1.13927     -2.85928     0.393841
 C     0.00000     0.742814     0.00000
 C     1.13879     1.46545     -0.394153
 C     1.13927     2.85928     -0.393841
 C     -1.13879     1.46545     0.394153
 C     0.00000     3.56301     0.00000
 C     -1.13927     2.85928     0.393841
 H     0.00000     -4.64896     0.00000
 H     -2.02827     -3.39662     -0.711607
 H     -2.02148     -0.928265     -0.727933
 H     2.02827     -3.39662     0.711607
 H     2.02827     3.39662     -0.711607
 H     -2.02148     0.928265     0.727933
 H     0.00000     4.64896     0.00000
 H     -2.02827     3.39662     0.711607
 H     2.02148     0.928265     -0.727933
 H     2.02148     -0.928265     0.727933
end
basis "ao basis" cartesian print
  H library "6-31G**"
  C library "6-31G**"
END
dft
  mult 2
  XC b3lyp
  mulliken
end
driver
end
task dft optimize
task dft freq numerical

226. ACML libs and nwchem -- what libs to choose to avoid 'Singularity in Pulay matrix' hang.

The problem:
If I compile nwchem against the acml libs (gfortran64_fma4 in acml-5-2-0-gfortran-64bit.tgz) everything appears to go fine, but once I try to run stuff I get


           Memory utilization after 1st SCF pass:
           Heap Space remaining (MW):       12.94            12937848
          Stack Space remaining (MW):       13.11            13107006
   convergence    iter        energy       DeltaE   RMS-Dens  Diis-err    time
 ---------------- ----- ----------------- --------- --------- ---------  ------
 d= 0,ls=0.0,diis     1    -74.9488845804 -7.49D+01  1.85D-02  1.70D-01     0.4
  Singularity in Pulay matrix. Error and Fock matrices removed.


and then the node hangs with 100% CPU.

The (obvious) solution:
To some this will be obvious, but to someone not skilled in the art, like myself, it isn't.
Of course, I could've just RTFM...but what academic does that?
"ACML and MKL can support 64-bit integers if the appropriate library is chosen. For MKL, one can choose the ILP64 Version of Intel® MKL, while for ACML the int64 libraries should be chosen, e.g. in the case of ACML 4.4.0 using a PGI compiler /opt/acml/4.4.0/pgi64_int64/lib/libacml.a"
So, when you go to download your libraries from the AMD website make to download at a minimum the 64 integer file (e.g.acml-5-2-0-gfortran-64bit-int64.tgz).

How I built nwchem:

export LARGE_FILES=TRUE
export TCGRSH=/usr/bin/ssh
export NWCHEM_TOP=`pwd`
export NWCHEM_TARGET=LINUX64
export NWCHEM_MODULES="all"
#export PYTHONVERSION=2.7
export PYTHONHOME=/usr
export BLASOPT="-L/opt/acml/acml5.2.0/gfortran64_fma4_int64/lib -lacml"
export USE_MPI=y
export USE_MPIF=y
export USE_MPIF4=y
export MPI_LOC=/usr/lib/openmpi/lib
export MPI_INCLUDE=/usr/lib/openmpi/include
export LIBRARY_PATH="$LIBRARY_PATH:/usr/lib/openmpi/lib:/opt/acml/acml5.2.0/gfortran64_fma4_int64/lib"
export LIBMPI="-lmpi -lopen-rte -lopen-pal -ldl -lmpi_f77 -lpthread"
cd $NWCHEM_TOP/src
make clean
make nwchem_config
make FC=gfortran 2> make.err 1>make.log
export FC=gfortran
cd ../contrib ./getmem.nwchem


Don't forget to add the acml libs to the LD_LIBRARY_PATH in your ~/.bashrc, e.g.
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/lib/openmpi/lib:/opt/acml/acml5.2.0/gfortran64_fma4_int64/lib



31 August 2012

225. Sun GridEngine: commlib error: got select error (Connection refused)


The issue:
On doing qhost I get
error: commlib error: got select error (Connection refused)
error: unable to send message to qmaster using port 6444 on host "beryllium": got send error
Beryllium is my hostnode.

Same thing happens with qstat and any other imaginable SGE command.

The solution:
It's an obvious one -- just restart the services. I mean, it took me twenty minutes to re-remember that, but it should have been obvious. Most of the time, services are managed using scripts in /etc/init.d/ and that's the case here too. So, hanging my head in shame, here's the solution:

ls /etc/init.d/grid*
/etc/init.d/gridengine-exec  /etc/init.d/gridengine-master

sudo service gridengine-master restart

qhost should now slowly be populated

Done.

How I got there:


 ps aux|grep sge
sgeadmin  3173  0.0  0.0  56844  3428 ?        Sl   Aug20   6:29 /usr/lib/gridengine/sge_execd

 tree /var/spool/gridengine -L 4 -d
/var/spool/gridengine
|-- execd
|   `-- beryllium
|       |-- active_jobs
|       |-- jobs
|       `-- job_scripts
|-- qmaster
|   `-- job_scripts
`-- spooldb

Looking at /var/spool/gridengine/execd/beryllium/messages

08/20/2012 10:47:17|  main|beryllium|I|starting up GE 6.2u5 (lx26-amd64)
08/30/2012 15:06:57|  main|beryllium|E|commlib error: got read error (closing "beryllium/qmaster/1")
08/30/2012 15:06:58|  main|beryllium|W|can't register at qmaster "beryllium": abort qmaster registration due to communication errors
less /var/lib/gridengine/rupert/common/act_qmaster
beryllium

So looks ok.

Oddly, there's nothing funny in /tmp -- no execd_messages.* files.

 ps aux|grep sge
sgeadmin  3173  0.0  0.0  56844  3428 ?        Sl   Aug20   6:29 /usr/lib/gridengine/sge_execd
sudo kill 3173


start-stop-daemon --exec /usr/sbin/sge_execd --start --user sgeadmin
Which didn't seem to do anything.

start-stop-daemon --exec /usr/sbin/sge_qmaster --start --user sgeadmin
which doesn't seem to do anything either.

/usr/lib/gridengine/gethostname -aname
critical error: Please set the environment variable SGE_ROOT.
export SGE_ROOT=/var/lib/gridengine
/usr/lib/gridengine/gethostname -aname
beryllium

 service gridengine-master restart
Restarting Sun Grid Engine Master Scheduler: sge_qmasterrm: cannot remove `/var/run/gridengine/qmaster.pid': Permission denied
.
cat /var/run/gridengine/qmaster.pid
3198
ps aux|grep 3198
yields nothing
sudo rm  /var/run/gridengine/qmaster.pid

sudo service gridengine-master restart
Restarting Sun Grid Engine Master Scheduler: sge_qmaster.
ps aux|grep sge
sgeadmin 32178  2.5  0.0  69004  6112 ?        Sl   09:40   0:00 /usr/lib/gridengine/sge_qmaster
qstat
job-ID  prior   name       user         state submit/start at     queue                          slots ja-task-ID 
-----------------------------------------------------------------------------------------------------------------
    715 0.75000 submit__la me         r     08/22/2012 08:10:32 six.q@boron                        6        
    720 0.25194 submit__63 me         r     08/22/2012 11:15:02 four.q@tantalum                    4        
    716 0.74817 submit__la me         qw    08/22/2012 08:11:28                                    6        
    719 0.70429 submit__la me         qw    08/22/2012 08:38:17                                    6        
    721 0.25071 submit__63 me         qw    08/22/2012 11:15:35                                    4        
    722 0.25000 submit__32 me         qw    08/22/2012 11:16:01                                    4    

 qhost
HOSTNAME                ARCH         NCPU  LOAD  MEMTOT  MEMUSE  SWAPTO  SWAPUS
-------------------------------------------------------------------------------
global                  -               -     -       -       -       -       -
beryllium               lx26-amd64      3     -    7.8G       -   14.9G       -
boron                   lx26-amd64      6  6.10    7.6G    1.4G   14.9G  240.8M
tantalum                lx26-amd64      4  4.01    7.7G    1.6G   14.9G     0.0

sudo service gridengine-exec restart
Restarting Sun Grid Engine Execution Daemon: sge_execd.

qhost
HOSTNAME                ARCH         NCPU  LOAD  MEMTOT  MEMUSE  SWAPTO  SWAPUS
-------------------------------------------------------------------------------
global                  -               -     -       -       -       -       -
beryllium               lx26-amd64      3  0.22    7.8G    3.0G   14.9G  141.3M
boron                   lx26-amd64      6  6.09    7.6G    1.4G   14.9G  240.8M
tantalum                lx26-amd64      4  4.01    7.7G    1.6G   14.9G     0.0




22 August 2012

224. Disabling tracker-miner-fs

Yes, yes, I shouldn't have a full desktop install on a computational node, but the nodes serve as instant replacement desktops if something goes awry with my main desktop, and occasionally visitors get to use them to access the internet in order to avoid getting bored.

Anyway, tracker-miner-fs is eating up 26% of my 8 Gb RAM on one of my nodes running KDE, and I really don't need it. I mean, I don't know if it's useful to most people running a full DE, but on my node I most certainly, definitely don't need it.

Given the number of posts online with questions about tracker ('What is it?", "Why is it using up all my resources?" etc.) I think that there's a bit of a PR problem. If it's a program that is noticeable because it makes demands on your computer system, the users should be allowed to know why putting up with this extra drain on resources is desirable -- or not.

Anyway.

aptitude show tracker says:
"Tracker is an advanced framework for first class objects with associated metadata and tags. It provides a one stop solution for all metadata, tags, shared object databases, search tools and indexing."

...which means what exactly in practical terms?

man tracker-miner-fs 
NAME
       tracker-miner-fs - Used to crawl the file system to mine data.
man tracker-store
NAME
       tracker-store - database indexer and query daemon
My guess would be that tracker-miner is basically indexing files for faster search, but I really don't know. It's one daemon I'm happy to expel.

There's a standard place for stuff that's supposed to be brought up with x:

ls /etc/xdg/autostart/

-rw-r--r-- 1 root root   306 May  3 09:42 at-spi-dbus-bus.desktop
-rw-r--r-- 1 root root  6216 Jun 20 06:58 evolution-alarm-notify.desktop
-rw-r--r-- 1 root root  7404 Oct 14  2011 gdu-notification-daemon.desktop
-rw-r--r-- 1 root root  5340 May 24 08:46 gnome-keyring-gpg.desktop
-rw-r--r-- 1 root root  6711 May 24 08:46 gnome-keyring-pkcs11.desktop
-rw-r--r-- 1 root root  6282 May 24 08:46 gnome-keyring-secrets.desktop
-rw-r--r-- 1 root root  5138 May 24 08:46 gnome-keyring-ssh.desktop
-rw-r--r-- 1 root root  6681 May 30 21:02 gnome-sound-applet.desktop
-rw-r--r-- 1 root root  7018 Apr 28 09:27 gsettings-data-convert.desktop
-rw-r--r-- 1 root root   460 Oct 21  2011 guake.desktop
-rw-r--r-- 1 root root   301 Jun 24 16:52 hplip-systray.desktop
-rw-r--r-- 1 root root   238 Dec  2  2011 kerneloops-applet.desktop
-rw-r--r-- 1 root root  4673 Mar 25 08:49 nm-applet.desktop
-rw-r--r-- 1 root root   250 Sep 10  2011 notification-daemon.desktop
-rw-r--r-- 1 root root  4651 Nov 12  2011 polkit-gnome-authentication-agent-1.desktop
-rw-r--r-- 1 root root  7112 Dec 23  2011 print-applet.desktop
-rw-r--r-- 1 root root  3864 Oct  1  2011 pulseaudio.desktop
-rw-r--r-- 1 root root   633 May 20 06:08 pulseaudio-kde.desktop
-rw-r--r-- 1 root root  3288 Aug 12 12:05 tracker-miner-fs.desktop
-rw-r--r-- 1 root root  3004 Aug 12 12:05 tracker-store.desktop

-rw-r--r-- 1 root root 11041 Apr  4 22:02 user-dirs-update-gtk.desktop
-rw-r--r-- 1 root root   433 Nov  3  2011 wicd-tray.desktop
-rw-r--r-- 1 root root   150 Feb  8  2012 xfce4-settings-helper-autostart.desktop
-rw-r--r-- 1 root root   357 Aug  1  2011 xfce4-volumed.desktop
Incidentally, the folder on that particular node betrays a history of previously installed desktop environments...

To stop and remove the tracker-miner processes, do
tracker-control -r 
It removes the databases it has created as well.


To disable:
Launch tracker-preferences from the KDE menu. Uncheck all options under 'Semantics'. Uncheck all places under locations. Clicking apply doesn't seem to have any effect, but if you open tracker-preferences again you'll probably find that it worked.

To disable tracker-miner-fs and tracker-store from the terminal you can probably edit:

tracker-miner-fs.desktop:
51 Icon=
 52 Exec=/usr/lib/tracker/tracker-miner-fs
 53 Terminal=false
 54 Type=Application
 55 Categories=Utility;
 56 X-GNOME-Autostart-enabled=false
 57 X-KDE-autostart-enabled=false
 58 X-KDE-StartupNotify=false
 59 X-KDE-UniqueApplet=true
 60 NoDisplay=true

tracker-store.desktop
54 Icon=
 55 Exec=/usr/lib/tracker/tracker-store
 56 Terminal=false
 57 Type=Application
 58 Categories=Utility;
 59 X-GNOME-Autostart-enabled=false
 60 X-KDE-autostart-enabled=false
 61 X-KDE-StartupNotify=false
 62 X-KDE-UniqueApplet=true
 63 NoDisplay=true
 64 OnlyShowIn=GNOME;KDE;XFCE;

Reasons why we don't simply uninstall it:

apt-cache rdepends tracker
         tracker
Reverse Depends:
  tracker-miner-fs
  tracker-gui
  tracker-gui
  tracker-gui
  nautilus
  tracker-miner-evolution
  tracker-utils
  tracker-extract
  brasero
  shared-mime-info
  tracker-utils
  tracker-miner-fs
  tracker-miner-evolution
  tracker-gui
  tracker-gui
  tracker-gui
  tracker-extract
  tracker-explorer
  tracker-dbg
  shared-mime-info
  rygel-tracker
  nautilus (twice?)
  catfish

21 August 2012

223. Moving disks, devices from one box to another -- issues with network interfaces

Long story short: edit /etc/udev/rules.d/70-persistent-net.rules

Long story:
I have a very small beowulf cluster keeping my office warm in these antipodean winter months. For some silly reason I was using the front node, a six core + 8 Gb box, as my daily desktop. That of course meant I wasn't really using it for computations. In addition to the front node I have a four core i5-somethingorother with 8 Gb RAM (fast!) and a slovenly AMD X3 /4 Gb to actually run the jobs. They are connected via a gigabit switch (192.168.1.0/24) for nfs exports and a 10/100 router (192.168.2.0/24) for WAN access.

I finally decided that 1) I didn't need a six-core box to prepare latex documents, run octave jobs and make pretty gnuplot plots and that 2) having a slow 3-core AMD box to run heavy nwchem jobs was not fast enough. On the other hand, I didn't want to set up/reinstall/move all my stuff from one harddrive to another.

Linux is wonderful in that it's often just a case of ripping out a harddrive and moving it to a different physical. Windows will scream bloody murder, but linux normally does it pretty well. Same here.

The main issue was the three network cards that I wanted to set up (three separate subnets) and which I configure via /etc/network/interfaces. I simply couldn't call the networks cards what I wanted.

Well, as is obvious in hindsight, you should pay a visit to /etc/udev, and more specifically, /etc/udev/rules.d/70-persistent-net.rules

It looks something like this:

# This file was automatically generated by the /lib/udev/write_net_rules# program, run by the persistent-net-generator.rules rules file.## You can modify it, as long as you keep each rule on a single# line, and change only the value of the NAME= key.
# PCI device 0x10ec:0x8168 (r8169)SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="00:YY:XX:96:XX:32", ATTR{dev_id}=="0x0", ATTR{type}=="1", KERNEL=="eth*", NAME="eth0"SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="00:XX:YY:83:0a:48", ATTR{dev_id}=="0x0", ATTR{type}=="1", KERNEL=="eth*", NAME="eth1"SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="00:XX:YY:64:0b:46", ATTR{dev_id}=="0x0", ATTR{type}=="1", KERNEL=="eth*", NAME="eth2"
# PCI device 0x1814:0x3062 (rt2860)SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="c8:YY:XX:cf:1f:5d", ATTR{dev_id}=="0x0", ATTR{type}=="1", KERNEL=="ra*", NAME="ra0"
# USB device 0x:0x (rt2800usb)SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="c8:YY:XX:c8:91:e6", ATTR{dev_id}=="0x0", ATTR{type}=="1", KERNEL=="wlan*", NAME="wlan0"






Basically, make sure you can figure out the mac addresses of the different network cards (ip addr helped me more than ifconfig) you can simply go in and edit the ATTR{address}=="" statements and the NAME="" variables. Make sure that there are no conflicts, obviously.

After that, everything should be fine.

If you are using network-manager (i.e. stock GNOME setup) then you will want to pay attention to the /etc/NetworkManager/system-connections/ as well -- open and edit suspiciously named files like e.g. eth0.

They'll look like this:

[802-3-ethernet]
duplex=full
mac-address=00:YY:XX:96:93:32
[connection]
id=eth0
uuid=fa5YYYY-XXXX-43a3-8502-f8ba2d28ZZZZ
type=802-3-ethernet
timestamp=1326324509
[ipv6]
method=auto
[ipv4]
method=auto




20 August 2012

222. Fighting and, eventually, installing HP laserjet P1102w on debian.

This printer is great when it works. It's just always a pain to set up, and it seems to be down to the way the install scripts are written. Setting up the printer was easy in the end -- but getting there? Not so much. Obviously, part of the reason is my own stupidity. Summary: it works on linux, but you'll need a bit of time and patience.

Update: here's an interesting post on the same topic: http://downfromthetrees.com/linux-installing-a-proprietrary-hp-printer-driver.html The only thing I would add is that instead of using sudo as described in that post you may have to use gksu in order to avoid the Magic Cookie error.

The short guide:
You NEED a binary plugin. It blows, but you do. Blowing harder: there's a lot of references to letting hp-setup download it for you -- but I'm willing to bet that it will fail for you.

Anyway, do this:
wget http://www.openprinting.org/download/printdriver/auxfiles/HP/plugins/hplip-3.12.6-plugin.run
chmod +x hplip-3.12.6-plugin.run
gksudo ./hplip-3.12.6-plugin.run

Hopefully that's good enough to take you through to system-config-printer and/or hp-setup. I did a lot of things, but it was the gksudo ./hplip-3.12.6-plugin.run that did it in the end.

And don't forget to add yourself to the lpadmin and perhaps lp groups:

cat /etc/group|grep lp
lp:x:7:verahill
lpadmin:x:111:verahill
I don't know how necessary it is, but heck, why not?


Here's the full, painful (and probably embarrassing) route I took:
It should not be this difficult to get a bloody printer to work. With older printer models it's been quick and easy, without having to download any 'binary plugins'. This is just effing ridiculous. Guess who's not buying another HP printer...

I reorganised my office, and as part of that decided to move my printer from my desktop to one of my networked boxes to free up some space on my desk. My desktop is running GNOME, while the target box is running KDE.

If you install the hplip packages in debian you end up with a very helpful script called hp-setup. You can run that as a user, but try running it as root via sudo, and you get

Invalid MIT-MAGIC-COOKIE-1 keyhp-setup: cannot connect to X server :0

OK, so you can run it as user. Everything looks great, BUT...you'll find that in this case the install will fail at the step of downloading the magical, mythological 'binary plugin'.

Also, as a user you're unlikely to be able to be able to put files in the places where they belong anyway, so there's that (see e.g. the first comment below).

You then do a bit of googling for hplip-3.12.6-plugin.run, and end up here: https://aur.archlinux.org/packages.php?ID=44646

You do
wget http://www.openprinting.org/download/printdriver/auxfiles/HP/plugins/hplip-3.12.6-plugin.run
chmod +x hplip-3.12.6-plugin.run
./hplip-3.12.6-plugin.run

This opens a gui where you get to accept the T&C. Not much seems to happen beyond that though. Oh, and you can't do the sudo-thingy because, well, you get the magic cookie error.

OK, take a step back.

sudo lpstat -a

global-mfp accepting requests since Mon 20 Aug 2012 10:46:48 EST
HP-LaserJet-Professional-P1102w-2 accepting requests since Mon 20 Aug 2012 13:49:19 EST
lpoptions -d HP-LaserJet-Professional-P1102w-2

lpq
HP-LaserJet-Professional-P1102w-2 is ready

Test it:
lp /etc/fstab
request id is HP-LaserJet-Professional-P1102w-2-1 (1 file(s))

 lpq
HP-LaserJet-Professional-P1102w-2 is ready and printing
Rank    Owner   Job     File(s)                         Total Size
active  (null)  1       untitled                        2048 bytes
Printing my foot it is.

Hmm...
cat /etc/group|grep lpadmin
lpadmin:x:114:
OK, edit /etc/group and add myself to lpadmin.

I can open system-config-printer, but I can't 1) delete any printers (which would've been nice) or 2) start it with sudo.

I can use http://localhost:631, but as a user I'm somehow not allowed to remove printers, and I use sudo -- there's no 'root' password.

At this point I was getting frustrated and nuked all printers clear of my system.

sudo lpadmin -x HP-LaserJet-Professional-P1102w-2

sudo apt-get install system-config-printer-kde

Now I had both 'printing' (= vanilla CUPS system-config-printer) and 'Printer Configuration' in the KDE menu.

Still no dice.

OK, another step back.
mkdir hp
./hplip-3.12.6-plugin.run --target hp
cd hp/
sudo cp 86-hpmud-sysfs_hp_laserjet_professional_p1102w.rules /etc/udev/rules.d

Still no effing luck.

Tried one last desperate thing:
gksudo ./hplip-3.12.6-plugin.run
Clicked the agreement. Said done.
Hit 'print test page' and...it printed!

So...you need to be root...but the MIT magic cookie issue combined with the requirement to have a friggin' gui (why???) made that painful. Well, I guess that's why we're told to use gksu whend oing graphical stuff...

OK, only things remaining: network sharing. Should be simple. Opened up all ports on print-host box to accept traffic from my desktop (and desktop only). No luck.

gksu system-config-printer
Click on server, settings -> check Publish shared printers connected to this system.

On my desktop it was then just add a matter of adding the printer in GNOME (ipp://192.168.1.101:631/printers/HP-LaserJet-Professional-P1102w)

Sweet.

SOLVED.


16 August 2012

221. First steps with Moodle (the .deb version) in Debian

This is a very brief, very basic outline.

Looking online the general recommendation seems to be to install moodle yourself and avoid the debian package version. While not a stranger to rolling my own, I just want to play with moodle for learning purposes because that's what my uni uses.

Anyway, do
sudo apt-get install moodle

Once it's installed, look at
ls /etc/apache2/sites-enabled

If there's nothing moodly, then do
sudo cp /etc/moodle/apache2.conf /etc/apache2/sites-enabled/moodle.conf
sudo service apache2 restart

Navigate to http://localhost/moodle

The actual installation takes ages, so go get yourself a cup of tea/coffee to prevent your mouse finger from getting itchy. In other words, just wait until you get a list showing that all steps have completed and that you may continue.

You need to allow cookies.


Fill out the admin stuff - country, password, city etc etc.



At this point you can log in, create a class and start exploring. Have fun.


15 August 2012

220. My first few lectures in Australia...and the importance of thermodynamics

I'm not an expert on anything. It's not a matter of false modesty, but an observation based on the fact that I come across new things on a regular basis, even within my field of research. When I do, I try to learn.

But to learn new things I need a toolbox - a set of skills that I rely on to put anything new into context. The toolbox for a chemist is basically thermodynamics. Simple, applied thermodynamics. As a chemist you can get away with memorising the expression for Gibbs free energy and the general expression for rate laws, and you can do a whole lot of fun/damage.

I'm only at the beginning of my current class, and I'm an inexperienced lecturer, but I had a bit of a shock today discovering that my class -- second semester of the second year at a 'good' university -- don't feel comfortable dealing with free energies and standard potentials.

The origin lies in the idea of students as clients here -- universities want pass rates in excess of 70-80%, while most of the faculty likely experienced a first year as undergraduates where the Gen Chem class (which did thermodynamics until you eyes were bleeding) had a failure rate of 60% or above.

We might have thought it was harsh to fail that many students at the very beginning, but the result was that the more inspired/motivated students made it through, and the ones who weren't willing to dedicate the effort necessary to become professionals got a kick in the pants to look for majors that actually interested and inspired them.

If something interests you it  becomes 'easy' -- either because you instinctually  understand it, or more likely, because you simply put in that extra effort to teach yourself.

Instead, the impetus to pass as many students as possible -- and to get 'good' student feedback which will help your promotion -- means that the students are never challenged. 'Difficult' topics are avoided and taught late in the course or ,increasingly, not at all.

Rumour has it that one of the reasons why some Australian universities are adopting a Bologna-inspired model is because they can use the masters section of the education to cover the things the students should have learned as undergraduates --- and thus still produce graduates with the skills that their chosen major indicates that they should have.

It's pretty damning.

The consequence is dire -- some scary example of PROFESSORS -- that is: professionals entrusted with teaching the next generation of scientists and engineers -- in the STEM fields who don't appear to understand basic thermodynamics or more specifically: entropy and the distinction between open and closed systems based on their use of thermodynamics to 'disprove' Evolution. They may be appearing to be capable professionals in every other sense and may well do 'good' research. The individuals may be appearing to do capable research in every other aspect and may be wonderful people, but their use of the 2nd law of thermodynamics as an argument against evolution is just misguided.

Andrew McIntosh -- Professor of Thermodynamics(!) and Combustion Theory at Leeds. Website.
Stuart Burgess -- Professor, Department of Engineering at Bristol. Website

In fact, the list here would presumably include mostly people of a similar persuasion. While I've seen Andrew's and Stuart's writings, I feel comfortable commenting on their opinions, but since I am not as confident about the rest of the people on the list, let's just highlight the fact that it include people from (the universities of) Sheffield, Cambridge, Liverpool and Cardiff.

Or what about this letter: http://www.bcseweb.org.uk/index.php/Main/EstelleMorris
The fact that the signatories mention their affiliations is an obvious way of trying ot use those affiliations to attach significance to their views.

Anyway, here's MC Hawking's take on it: http://www.mchawking.com/includes/lyrics/entropy_lyrics.php

14 August 2012

219. Seeing animals in metropolitan Melbourne

As a foreigner in Melbourne one of my first questions to my local pals was where I could see typical Aussie animals. The answer was 'The Zoo', which just doesn't cut it.

Well, it would've helped if I had asked the more outdoorsy types, because you can actually be fairly sure to catch glimpses of kangaroos, wallabies, bandicoots, corellas, rosellas, cockatoos and kookaburras without having to travel too far. This is biased towards the eastern suburbs, since that's where I roam.

I (almost) always see...

Kangaroos: As it turns out,  the most iconic animal of Australia is very easy to find around Melbourne. Sure, it's Grey Kangaroo and not the perhaps more famous Red Kangaroo, but they are everywhere if you know where to look, and in many places they are used to humans.
This is one kangaroo that isn't bothered by human presence. Lysterfield Lake Park on Tramline track.

On the best places is along Tram Line Track in Lysterfield Lake Park:
Lysterfield Lake Park

Tram line track is where the action's at

The walking's easy for the most part. Some of the tracks get soggy and muddy due to bikes and horses ruining them.

For the most part, the park's flat.
If you go on the smaller trails and keep quiet you may spot the odd Swamp Wallaby -- they are much smaller, solitary and tend to leg it whenever they see humans. Apart from their difference in size and fuzzier fur, you can recognise them by 1. they are (almost) always alone, 2. their tail is white-tipped and 3. their stance when hopping is different from that of kangaroos (lean forward more).

Other safe places to spot kangaroos would be Churchill national park, the area around Police Paddocks, and Cardinia Lake Park -- but Cardinia Reservoir Park tends to get really busy with people playing games and bbq:ing which tends to ruin the experience.

 Lysterfield Park north of Wellington road also has a lot of Kangaroos, but they aren't always easy to spot.

Wallabies: The Swamp Wallaby is quite common around Melbourne, but is much more shy than Kangaroos. It's also solitary -- if you spot a kangaroo you know that there are others around, but not so with wallabies.
The picture is heavily overexposed to show the wallaby hiding in the shade off of Possum Gully track in Cranbourne

The best place to spot wallabies seems to be the park surrounding the Botanical Park in Cranbourne. If you're quiet and attentive you will spot Wallabies in the sandier parts of the park before they run away. If you bring binoculars you're like to see wallabies hiding in the stands of ferns abutting the forests or in the middle of the fields. Again, they take a bit more patience to spot.

Cranbourne Botanic Gardens

Sometimes you can spot Wallabies among the fern in the more open part of the park

Otherwise, carefully walk on the sandier tracks among the scrub -- and be aware that any wallaby that sees you will either leg it, or stand absolutely still in the shade. So you need keen eyes.


I've seen wallabies up by Lysterfield Lake Park every time I go there, but you'll have to find the smaller tracks/breaks to spot them.

Crimson Rosellas: This is a typical forest parrot. Dandenong Ranges is a great place to see them, and because of their crimson red colour they aren't that difficult to spot. The young are green and tend to form flocks towards the end of summer. You sometimes see a single green young and its two parents as well earlier in the summer.
Just outside Churchill national park

The best places to spot them  is in shaded and wooded areas right at the edge of the woods. Rock track next to the golf course outside Olinda tends to be good, but in general, these birds are fairly common in the area. Typically you'd see a pair, but the young do flock at times.

A flock of crimson rosellas




Rock track

The lookout on Chalet road is a good place to park your car and start walking.

Another safe place is Grant's Picnic Ground north of Belgrave. There's an area where you can feed the parrots, and it's mostly cockatoos and rosellas. Just be warned -- they bring in bus-loads of tourists, so if you're looking for a 'wild' experience it can be a bit disappointing. The parrots are wild and free alright, but they are certainly used to the presence of humans.

Grant's picnic ground. It has some nice tracks as well.

There's a visitor centre where you can buy birdseed.



Sulphur-crested Cockatoos:
A cockatoo grazing on a football field in the morning sun

Apart from Grant's Picnic Ground (see Crimson Rosellas) where you are more or less guaranteed to find them, the area around Shepherd's bush in the Dandenong Valley area is pretty good. It also has the advantage of being easy to get to.

Eastern Rosellas: Probably the prettiest parrot I've seen. It's not always that easy to find, but it tends to be in more open areas than the Crimson Rosella. Also, it tend to forage on the ground. Normally, using your ears is the way to know where to look Rosellas have a much more melodic call than the other parrots and are easy to distinguish based on that alone.
They are small and fairly shy. The red head gives it away though.  Police Paddocks.
The best area so far seems to be Police paddocks just east of Stud Road in the city's far east, where there's a bit of open forest. You're likely to spot Kangaroos there as well.


Southern Brown Bandicoot: 
I've seen them two out of three time when visiting Cranbourne botanic gardens (See Wallabies above). They are normally around the visitor centre and the BBQ area. If you walk into the park you might see them cross the road. Going on the smaller tracks along the fence tends to be rewarding as well. Basically think of them as big rats in terms of what you're looking for -- they are less paranoid than rats, but you should still walk softly.
A large bandicoot over by the playground behind Cranbourne botanic gardens

Rainbow Lorikeets: You see them everywhere in the eastern suburbs. If you want to see them up close, look for e.g. flowering bottle brush and you might get lucky.
A rainbow lorikeet in our garden.
Kookaburras: Kookaburras aren't rare, but can be tricky to locate. The only place which has always  produced consistent sightings is Lysterfield Lake Park -- again on tramline track. Otherwise, the more open patches of forest (mountain ash) up in Dandenong ranges can yield lots of sightings depending on season. If you go along trig track up by the sky high observatory in the Dandong ranges you may spot lots of kookaburras and crimson rosellas. Otherwise, rock track over by Olinda occasionally has the odd kookaburra laughing away.
Laughing kookaburra at Cardinia Reservoir.


I have once seen...

Echidnas: I've seen one on a track in Cardinia Reservoir Park, and one next to the road (alive and digging) in the area surrounding the Cranbourne Botanic Gardens. You're unlikely to spot them. Instead, use your ears -- you'll hear them digging in the grass/leaves.

Normally Echidnas are camera shy, but this guy was happy to pose.

Emu: I've only seem one once, and that was just east of Cardinia Reservoir Park, while driving on Red Hill road. The emu was in the fenced off area belong to Melbourne Water.
A lone emu near Cardinia.


Corellas: from time to time I see lots of them -- they seem to be very common in the area around Monash University, but they aren't always there when I visit. Especially the corner of Blackburn road and Wellington road occasionally has flocks of them.

Little Corellas hanging out in a town along Great Ocean Road.