21 September 2012

244. Molden on debian testing

Update: avogadro can write gamess input files, but seems to offer little in the way of showing detailed output from gamess output files. Also, some of the input files contain keywords which don't exist.

Original post:
Nothing beats a good GUI, so after butting heads with gabedit again (and losing again, although this time I think I was trying to make it do something it wasn't designed for) I've decided to try Molden.

To download, go to the page below. Be a good citizen and register yourself as a user (it will help motivate funding for development), then download: http://www.cmbi.ru.nl/molden/howtoget.html

cd ~/tmp
wget ftp://ftp.cmbi.ru.nl/pub/molgraph/molden/molden5.0.tar.gz
tar xvf molden5.0.tar.gz
cd molden5.0/

Edit makefile and remove -lXmu from line 20:

16 CC = cc
17 FC = gfortran
18 LIBS =  -lX11 -lm
19 LDR = ${FC} 
20 LIBSG = -L/usr/X11R6/lib -lGLU -lGL -lX11 -lm
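If you'd rather not open an editor, the same change can be sketched with sed. This is only a sketch: strip_xmu is a made-up helper name, and it assumes the LIBSG line in your copy of the makefile contains -lXmu exactly once. A backup is kept as makefile.bak.

```shell
# Hypothetical helper: drop -lXmu from the LIBSG line of the given makefile.
# Only lines starting with LIBSG are touched; a backup is kept as <file>.bak.
strip_xmu() {
    sed -i.bak '/^LIBSG/ s/ *-lXmu//' "$1"
}

# Usage, from inside molden5.0/:
#   strip_xmu makefile
```

Check the result with grep LIBSG makefile before running make.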

cd surf/

Edit Makefile and change it from

 46 depend: $(DEPEND)
 47     @ echo making dependencies...
 48     @ echo ' ' > makedep
 49     @ makedepend $(INCLUDE) -f makedep $(DEPEND)

to

 46 depend: $(DEPEND)
 47     @ echo making dependencies...
 48     @ echo ' ' > makedep
 49     @ $(CC) $(INCLUDE) -M $(DEPEND) > makedep
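The edit above can also be scripted. The sketch below assumes the makedepend line in surf/Makefile reads exactly as shown in the "from" listing; use_cc_for_depend is a made-up helper name, not something that ships with Molden.

```shell
# Hypothetical helper: swap the makedepend call in the given Makefile for
# the compiler's own dependency generator ($(CC) -M).
# A backup is kept as <file>.bak.
use_cc_for_depend() {
    sed -i.bak \
        's|makedepend \$(INCLUDE) -f makedep \$(DEPEND)|$(CC) $(INCLUDE) -M $(DEPEND) > makedep|' \
        "$1"
}

# Usage, from inside molden5.0/surf/:
#   use_cc_for_depend Makefile
```

If the line in your Makefile differs, sed simply leaves the file unchanged, so it is worth checking the depend target afterwards.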

Save and go back up one level, and run make:
 cd ../
 make

You're pretty much done.

I like putting things in /opt, so
sudo mkdir /opt/molden
sudo chown $USER /opt/molden
cp -R ~/tmp/molden5.0/* /opt/molden

stick
export PATH=$PATH:/opt/molden
in your ~/.bashrc

Type
molden
to run
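If you rerun your setup steps now and then, you may want to avoid adding the same export line to ~/.bashrc twice. A minimal sketch (add_path_once is a made-up helper name):

```shell
# Hypothetical helper: append "export PATH=$PATH:<dir>" to an rc file
# unless that exact line is already present.
add_path_once() {
    dir=$1
    rcfile=$2
    line="export PATH=\$PATH:$dir"
    grep -qxF -- "$line" "$rcfile" 2>/dev/null || printf '%s\n' "$line" >> "$rcfile"
}

# Usage:
#   add_path_once /opt/molden ~/.bashrc
```

Open a new terminal (or source ~/.bashrc) for the change to take effect.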

Molden can read output files from gamess -- still exploring the exact capabilities, but e.g. the convergence information can be accessed:


and you can get nifty contour plots of the electron density of orbitals etc.


Error:

If you don't edit the surf/Makefile as shown above you'll get

make[1]: Leaving directory `/home/me/tmp/molden5.0/ambfor'
make -C surf depend
make[1]: Entering directory `/home/me/tmp/molden5.0/surf'
making dependencies...
make[1]: makedepend: Command not found
make[1]: *** [depend] Error 127
make[1]: Leaving directory `/home/me/tmp/molden5.0/surf'

20 September 2012

243. My own personal benchmarks for NWChem, gromacs with atlas, openblas, acml on AMD and intel

Update: you can compile against acml on intel as well, and against mkl on amd. Still need to do some performance testing to see how well it works. The artificial penalty of running mkl on AMD is well-publicised and led to a lawsuit, but I don't know how acml performs on intel.


The title says it all, really. Since I'm back to exploring ways of improving performance for my little cluster I figured I'd break this out as a separate post. Most of this data was found here before: http://verahill.blogspot.com.au/2012/09/new-compute-node-using-amd-fx-8150.html

All units are running up-to-date debian testing (wheezy).

Configuration:
Boron (B): Phenom II X6 2.8 GHz, 8Gb RAM (2.8*6=16.8 GFLOPS predicted)
Neon (Ne): FX-8150 X8 3.6 GHz, 16 Gb RAM (3.6*8=28.8 GFLOPS predicted (int), 3.6*4=14.4 GFLOPS (fpu))
Tantalum (Ta): Quadcore i5-2400 3.1 GHz, 8 Gb RAM (3.1*4=12.4 GFLOPS predicted)
Vanadium (V):  Dual socket 2x Quadcore Xeon X3480 3.06 GHz, 8Gb. CentOS (ROCKS 5.4.3)/openblas.
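The predicted figures above are just clock speed times core count, i.e. they assume one floating-point operation per core per cycle (a deliberately crude estimate). A small sketch of that arithmetic, with predicted_gflops as a made-up helper name:

```shell
# Hypothetical helper: naive peak estimate = clock (GHz) x number of cores,
# assuming one floating-point operation per core per cycle.
predicted_gflops() {
    awk -v ghz="$1" -v cores="$2" 'BEGIN { printf "%.1f\n", ghz * cores }'
}

predicted_gflops 2.8 6   # Boron:    16.8
predicted_gflops 3.6 8   # Neon:     28.8 (integer cores)
predicted_gflops 3.1 4   # Tantalum: 12.4
```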

Results

Gromacs --double (1 ns 6x6x6 nm tip4p water box; dynamic load balancing, double precision, 500k steps)
B :  10.662 ns/day (11.8 GFLOPS, runtime 8104 seconds) ***
B :   9.921 ns/day (10.9 GFLOPS, runtime 8709 seconds) **
Ne:  10.606 ns/day (11.7 GFLOPS, runtime 8146 seconds) *
Ne:  12.375 ns/day (13.7 GFLOPS, runtime 6982 seconds) **
Ne:  12.385 ns/day (13.7 GFLOPS, runtime 6976 seconds) ****
Ta:  10.825 ns/day (11.9 GFLOPS, runtime 7981 seconds) ***
V :  10.560 ns/day (11.7 GFLOPS, runtime 8182 seconds) ***
*no external blas/lapack.
**using ACML libs
*** using openblas
**** using ATLAS
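Since each run simulates 1 ns, the ns/day figures should follow from the runtimes as 86400 seconds per day divided by the runtime. A quick sketch to check that (ns_per_day is a made-up helper name); it reproduces the listed values to within rounding in the last digit:

```shell
# Hypothetical helper: throughput in ns/day from simulated time (ns)
# and wall-clock runtime (seconds).
ns_per_day() {
    awk -v ns="$1" -v secs="$2" 'BEGIN { printf "%.3f\n", ns * 86400 / secs }'
}

ns_per_day 1 8709   # Boron with ACML: 9.921
ns_per_day 1 6982   # Neon with ACML:  12.375
```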

Gromacs --single (1 ns 6x6x6 nm tip4p water box; dynamic load balancing, single precision, 500 k steps)
B :  17.251 ns/day (19.0 GFLOPS, runtime 5008 seconds) ***
Ne:  21.874 ns/day (24.2 GFLOPS, runtime 3950 seconds) **
Ne:  21.804 ns/day (24.1 GFLOPS, runtime 3963 seconds) ****
Ta:  17.345 ns/day (19.2 GFLOPS, runtime 4982 seconds) ***
V :  17.297 ns/day (19.1 GFLOPS, runtime 4995 seconds) ***
*no external blas/lapack.
**using ACML libs
*** using openblas
**** using ATLAS

NWChem (opt biphenyl cation, cp-md/pspw):
B :  5951 seconds **
B :  4084 seconds ***
B :  5782 seconds ***xy
Ne:  3689 seconds **
Ta:  4102 seconds ***
Ta:  4230 seconds ***xy
V :  5396 seconds ***

*no external blas/lapack.
**using ACML libs
*** using openblas
x Reconfigured using getmem.nwchem
y NWChem 6.1.1

NWChem (opt biphenyl cation, geovib, 6-31G**/ub3lyp):
B :  2841 seconds **
B :  2410 seconds ***
B :  2101 seconds ***x
B :  2196 seconds ***xy
Ne:  1665 seconds **
Ta:  1785 seconds ***
Ta:  1789 seconds ***xy
V :  2600 seconds ***

*no external blas/lapack.
**using ACML libs
*** using openblas
x Reconfigured using getmem.nwchem
y NWChem 6.1.1

A Certain Commercial Ab Initio Package (Freq calc of pre-optimised H14C19O3 at 6-31+G*/rb3lyp):
B :  2h 00 min (CPU time: 10h 37 min)
Ne:  1h 37 min (CPU time: 11h 13 min)
Ta:  1h 26 min (CPU time: 5h 27 min)
V :  2h 15 min (CPU time: 15h 50 min)
Using precompiled binaries.
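One way to read the wall-clock and CPU times above: dividing CPU time by wall-clock time gives a rough idea of how many cores were kept busy on average. This is my own back-of-the-envelope reading, not something the package reports, and cpu_wall_ratio is a made-up helper name.

```shell
# Hypothetical helper: ratio of CPU time to wall-clock time.
# Arguments: cpu_hours cpu_minutes wall_hours wall_minutes
cpu_wall_ratio() {
    awk -v ch="$1" -v cm="$2" -v wh="$3" -v wm="$4" \
        'BEGIN { printf "%.2f\n", (ch * 60 + cm) / (wh * 60 + wm) }'
}

cpu_wall_ratio 10 37 2 0    # Boron (6 cores):    5.31
cpu_wall_ratio 5 27 1 26    # Tantalum (4 cores): 3.80
```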


Gamess:
(I'm still learning how to run gamess efficiently, so take these values with a huge saucer of salt for now.) bn.inp does a geometry optimisation of a biphenyl cation (mult 2) at ub3lyp/6-31G**. bn.inp has no $STATPT card while bn3.inp does, and it makes a huge difference. But is that only because bn3.inp does 20 steps (NSTEP=20) and then kills the run? The default is 50 steps, and it does seem like all the runs do the maximum number of steps and then exit.

 Again, still learning. See below for input files. Will fix this post as I learn what the heck I'm doing. The relative run times on each node are still comparable, though; just don't use the numbers to compare the run speed of e.g. nwchem vs gamess.
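To find out whether a run actually converged or just ran out of steps, one option is to grep the output file. The sketch below is based on my recollection of the gamess log format: it assumes each geometry search point produces a line containing "NSERCH:" and that a converged optimisation prints "EQUILIBRIUM GEOMETRY LOCATED". Check both strings against your own output before trusting it. gamess_opt_status is a made-up helper name.

```shell
# Hypothetical helper: report whether a gamess optimisation converged.
# ASSUMPTION: the log has one "NSERCH:" line per geometry search point and
# prints "EQUILIBRIUM GEOMETRY LOCATED" on convergence.
gamess_opt_status() {
    log=$1
    points=$(grep -c 'NSERCH:' "$log")
    if grep -q 'EQUILIBRIUM GEOMETRY LOCATED' "$log"; then
        echo "converged after $points search points"
    else
        echo "not converged after $points search points"
    fi
}

# Usage:
#   gamess_opt_status bn.log
```

If the count matches the step limit and the run is reported as not converged, the timings for bn.inp and bn3.inp mostly reflect the step limit rather than the machine.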

Gamess using bn.inp with atlas
B:    9079 seconds
Ne: 7252 seconds
Ta:  9283 seconds

Gamess using bn.inp with openblas
B:   9071 seconds
Ta: 9297 seconds

Gamess using bn.inp with acml
Ne: 7062 seconds

Gamess using bn3.inp with atlas. 
B: 4016 seconds
Ne: 3162 seconds
Ta: 4114 seconds

MPQC:
Here I've used the version in the debian repos. I've created a hostfile
neon slots=8 max_slots=8
tantalum slots=4 max_slots=4
boron slots=6 max_slots=6

and then just looked at changing the order and slots assignment as well as total number of cores assigned using mpirun.
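For reference, a sketch of what such a run could look like. The hostfile content is the one above; total_slots is a made-up helper, input.in is a placeholder input file name, and the mpirun line is left commented out since it depends on your own setup.

```shell
# Write the hostfile shown above into the current directory.
cat > hostfile <<'EOF'
neon slots=8 max_slots=8
tantalum slots=4 max_slots=4
boron slots=6 max_slots=6
EOF

# Hypothetical helper: add up the slots= entries (max_slots= is ignored).
total_slots() {
    awk '{ for (i = 2; i <= NF; i++) if ($i ~ /^slots=/) n += substr($i, 7) }
         END { print n + 0 }' "$1"
}

total_slots hostfile   # 18

# Example launch using every slot (input.in is a placeholder):
#   mpirun --hostfile hostfile -np "$(total_slots hostfile)" mpqc input.in
```

Changing the order of the lines in the hostfile, or the -np value, changes which nodes the processes end up on.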

Simple test case looking at number of cores/distribution:
n cores : Seconds : Configuration (cores, exec nodes)
4       :  11     : 4(Ta)
4       :  17     : 4(Ne)
4       :  17     : 4(B)
4       :  42     : 2(Ta)+2(B)
6       :  12     : 6(B)
6       :  13     : 6(Ne)
6       :  74     : 2(Ta)+2(B)+2(Ne)
8       :  12     : 8(Ne)
10      :  43     : 4(Ta)+6(B)
12      :  47     : 4(Ta)+8(Ne)
14      :  55     : 6(B)+8(Ne)
18      : 170     : 4(Ta)+6(B)+8(Ne)

My beowulf cluster doesn't seem to be much of a supercomputer. All in all, this looks like a pretty good argument in favour of upgrading to infiniband...


bn.inp:
 $CONTRL 
COORD=CART UNITS=ANGS scftyp=uhf dfttyp=b3lyp runtyp=optimize 
ICHARG=1 MULT=2 maxit=100
$END
 $system mwords=2000 $end
 $BASIS gbasis=n31 ngauss=6 ndfunc=1 npfunc=1 $END
 $guess guess=huckel $end

 $DATA
biphenyl
C1
C      6.0      0.0000000000   -3.5630100000    0.0000000000 
C      6.0     -1.1392700000   -2.8592800000   -0.3938400000 
C      6.0     -1.1387900000   -1.4654500000   -0.3941500000 
C      6.0      0.0000000000   -0.7428100000    0.0000000000 
C      6.0      1.1387900000   -1.4654500000    0.3941500000 
C      6.0      1.1392700000   -2.8592800000    0.3938400000 
C      6.0      0.0000000000    0.7428100000    0.0000000000 
C      6.0      1.1387900000    1.4654500000   -0.3941500000 
C      6.0      1.1392700000    2.8592800000   -0.3938400000 
C      6.0     -1.1387900000    1.4654500000    0.3941500000 
C      6.0      0.0000000000    3.5630100000    0.0000000000 
C      6.0     -1.1392700000    2.8592800000    0.3938400000 
H      1.0      0.0000000000   -4.6489600000    0.0000000000 
H      1.0     -2.0282700000   -3.3966200000   -0.7116100000 
H      1.0     -2.0214800000   -0.9282700000   -0.7279300000 
H      1.0      2.0282700000   -3.3966200000    0.7116100000 
H      1.0      2.0282700000    3.3966200000   -0.7116100000 
H      1.0     -2.0214800000    0.9282700000    0.7279300000 
H      1.0      0.0000000000    4.6489600000    0.0000000000 
H      1.0     -2.0282700000    3.3966200000    0.7116100000 
H      1.0      2.0214800000    0.9282700000   -0.7279300000 
H      1.0      2.0214800000   -0.9282700000    0.7279300000 
 $END


bn3.inp:
 $CONTRL
COORD=CART UNITS=ANGS scftyp=uhf dfttyp=b3lyp runtyp=optimize 
ICHARG=1 MULT=2 maxit=100
$END
 $system mwords=2000 $end
 $BASIS gbasis=n31 ngauss=6 ndfunc=1 npfunc=1 $END
 $STATPT OPTTOL=0.0001 NSTEP=20 HSSEND=.TRUE. $END
 $guess guess=huckel $end

 $DATA
biphenyl
C1
C      6.0      0.0000000000   -3.5630100000    0.0000000000 
C      6.0     -1.1392700000   -2.8592800000   -0.3938400000 
C      6.0     -1.1387900000   -1.4654500000   -0.3941500000 
C      6.0      0.0000000000   -0.7428100000    0.0000000000 
C      6.0      1.1387900000   -1.4654500000    0.3941500000 
C      6.0      1.1392700000   -2.8592800000    0.3938400000 
C      6.0      0.0000000000    0.7428100000    0.0000000000 
C      6.0      1.1387900000    1.4654500000   -0.3941500000 
C      6.0      1.1392700000    2.8592800000   -0.3938400000 
C      6.0     -1.1387900000    1.4654500000    0.3941500000 
C      6.0      0.0000000000    3.5630100000    0.0000000000 
C      6.0     -1.1392700000    2.8592800000    0.3938400000 
H      1.0      0.0000000000   -4.6489600000    0.0000000000 
H      1.0     -2.0282700000   -3.3966200000   -0.7116100000 
H      1.0     -2.0214800000   -0.9282700000   -0.7279300000 
H      1.0      2.0282700000   -3.3966200000    0.7116100000 
H      1.0      2.0282700000    3.3966200000   -0.7116100000 
H      1.0     -2.0214800000    0.9282700000    0.7279300000 
H      1.0      0.0000000000    4.6489600000    0.0000000000 
H      1.0     -2.0282700000    3.3966200000    0.7116100000 
H      1.0      2.0214800000    0.9282700000   -0.7279300000 
H      1.0      2.0214800000   -0.9282700000    0.7279300000 
 $END

19 September 2012

242. Briefly: Compiling NWChem 6.1.1 with Python on Debian Testing (Wheezy)

Back at the end of June a minor version of NWChem (bug fixes) was released.

There isn't much difference between compiling 6.1.1 and 6.1. The main difference is which line to edit for python compatibility (NWChem 6.1 here: http://verahill.blogspot.com.au/2012/05/building-nwchem-61-on-debian.html )

1. Install dev packages
sudo apt-get install libopenmpi-dev openmpi-bin python2.7-dev zlib1g-dev libssl-dev

2. Compile openblas  or download e.g. acml.
(compiles fine, but doesn't run, with atlas)

3. NWchem goodness:

sudo mkdir /opt/nwchem
sudo chown $USER /opt/nwchem
cd /opt/nwchem
wget http://www.nwchem-sw.org/images/Nwchem-6.1.1-src.2012-06-27.tar.gz
(or go here http://www.nwchem-sw.org/index.php/Download)

tar xvf Nwchem-6.1.1-src.2012-06-27.tar.gz
cd nwchem-6.1.1-src/src/config/

Edit makefile.h and change (line numbers are just for convenience -- don't add them)
1956 #   EXTRA_LIBS += -ltk -ltcl -L/usr/X11R6/lib -lX11 -ldl
1957      EXTRA_LIBS +=    -lnwcutil  -lpthread -lutil -ldl
1958   LDOPTIONS = -Wl,--export-dynamic

to
1956 #   EXTRA_LIBS += -ltk -ltcl -L/usr/X11R6/lib -lX11 -ldl
1957      EXTRA_LIBS +=    -lnwcutil  -lpthread -lutil -ldl -lssl -lz
1958   LDOPTIONS = -Wl,--export-dynamic

cd to /opt/nwchem/nwchem-6.1.1-src/. Create a file called buildconf.sh with the following content:
export LARGE_FILES=TRUE
export TCGRSH=/usr/bin/ssh
export NWCHEM_TOP=`pwd`
export NWCHEM_TARGET=LINUX64
export NWCHEM_MODULES="all python"
export PYTHONVERSION=2.7
export PYTHONHOME=/usr
export BLASOPT="-L/opt/openblas/lib -lopenblas"
#export BLASOPT="-L/opt/acml/acml5.2.0/gfortran64_int64/lib -lacml"
#export BLASOPT="-L/opt/ATLAS/lib -lsatlas -ltatlas"
export USE_MPI=y
export USE_MPIF=y
export USE_MPIF4=y
export MPI_LOC=/usr/lib/openmpi/lib
export MPI_INCLUDE=/usr/lib/openmpi/include
export LIBRARY_PATH=$LIBRARY_PATH:/usr/lib/openmpi/lib:/opt/openblas/lib
#export LIBRARY_PATH=$LIBRARY_PATH:/usr/lib/openmpi/lib:/opt/acml/acml5.2.0/gfortran64_int64/lib
#export LIBRARY_PATH=$LIBRARY_PATH:/usr/lib/openmpi/lib:/opt/ATLAS/lib
export LIBMPI="-lmpi -lopen-rte -lopen-pal -ldl -lmpi_f77 -lpthread"
cd $NWCHEM_TOP/src
make clean
make nwchem_config
make FC=gfortran 1> make.log 2>make.err
export FC=gfortran
cd ../contrib
./getmem.nwchem

Start the compilation:
time sh buildconf.sh 

 On a quadcore i5-2400 it took 18 minutes.
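Because the build output is redirected to make.log and make.err, a failed build is easy to miss. A minimal sketch for checking the result, assuming the binary ends up in bin/LINUX64/ under the source tree (check_nwchem_build is a made-up helper name):

```shell
# Hypothetical helper: check that the nwchem binary was produced.
# Arguments: NWCHEM_TOP directory, optional target (defaults to LINUX64).
check_nwchem_build() {
    top=$1
    target=${2:-LINUX64}
    bin="$top/bin/$target/nwchem"
    if [ -x "$bin" ]; then
        echo "found $bin"
    else
        echo "missing $bin; see $top/src/make.err" >&2
        return 1
    fi
}

# Usage:
#   check_nwchem_build /opt/nwchem/nwchem-6.1.1-src
```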