Update 27/6/2013:
Please note that Kirill Berezovsky has published a series of posts on GAMESS US, including how to compile it for both CPU and GPU use. See
http://biochemicalmatters.blogspot.com.au/2013/06/gamess-us-frequently-asked-questions_26.html
http://biochemicalmatters.blogspot.ru/2013/06/gamess-us-frequently-asked-questions_1687.html
http://biochemicalmatters.blogspot.ru/2013/06/gamess-us-frequently-asked-questions_1447.html
http://biochemicalmatters.blogspot.com.au/2013/06/gamess-us-frequently-asked-questions.html
Update 21 May 2013: See the comments below
this post. This approach most likely works -- what had been confusing me was the lack of GPU timings reported in the output, but that doesn't necessarily mean the GPU isn't being used. A commenter below
this post, using nvidia-smi, observed GPU usage, although the speed-up was not major.
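If you want to check for yourself, the easiest thing is to keep an eye on the card from a second terminal while a job is running:

nvidia-smi -l 1      # poll GPU utilisation once per second
watch -n 1 nvidia-smi     # equivalent alternative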
Update 10/05/2013: fixed libcchem compile.
Everything compiles fine, and computations run correctly and fast. To date, as far as Google reveals,
there's only one other detailed step-by-step example of a successful compilation of GAMESS with GPU support out there.
For various reasons I'm beginning to suspect that ATLAS isn't working out for me -- I've had convergence problems with ATLAS that disappear when I use ACML instead (
see post B).
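As an aside, a quick way of checking which maths library a binary actually got linked against is ldd on the finished executable (the binary name below assumes the lked step towards the end of this post; statically linked libraries won't show up here):

ldd /opt/gamess_cuda/gamess.gpu.12.x | grep -iE 'atlas|acml|blas'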
I was in part following
http://combichem.blogspot.com.au/2011/02/compiling-gamess-with-cuda-gpu-support.html and ./libcchem/aaa.readme.1st
This took a while to hammer out, so the write-up is a bit messy.
Set up
sudo apt-get install libboost-all-dev build-essential g++ gfortran automake nvidia-cuda-toolkit python-cheetah openmpi-bin libopenmpi-dev zlib1g-dev checkinstall
mkdir ~/tmp
Get gamess (see e.g.
http://verahill.blogspot.com.au/2012/09/compiling-and-testing-gamess-us-on.html).
Put gamess-current.tar.gz in ~/tmp
cd ~/tmp
tar xvf gamess-current.tar.gz
sudo mv gamess /opt/gamess_cuda
sudo chown $USER:$USER /opt/gamess_cuda -R
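Before going any further it's worth confirming that the CUDA toolkit and driver are actually working:

nvcc --version     # should report the toolkit release
nvidia-smi         # should list your GPU(s)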
Preparing Boost
Edit /usr/include/boost/mpl/aux_/integral_wrapper.hpp so that lines 47-49 read as follows (the change is the added || defined(__CUDACC__) on line 48):
47 // other compilers (e.g. MSVC) are not particulary happy about it
48 #if BOOST_WORKAROUND(__EDG_VERSION__, <= 238) || defined(__CUDACC__)
49 typedef struct AUX_WRAPPER_NAME type;
Edit /usr/include/boost/mpl/size_t_fwd.hpp so that lines 20-29 read:
20
21 BOOST_MPL_AUX_ADL_BARRIER_NAMESPACE_OPEN
22 #if defined(__CUDACC__)
23 typedef std::size_t std_size_t;
24 template< std_size_t N > struct size_t;
25 #else
26 template< std::size_t N > struct size_t;
27 #endif
28
29 BOOST_MPL_AUX_ADL_BARRIER_NAMESPACE_CLOSE
Edit /usr/include/boost/mpl/size_t.hpp so that lines 19-28 read:
19 #if defined(__CUDACC__)
20 #define AUX_WRAPPER_VALUE_TYPE std_size_t
21 #define AUX_WRAPPER_NAME size_t
22 #define AUX_WRAPPER_PARAMS(N) std_size_t N
23 #else
24 #define AUX_WRAPPER_VALUE_TYPE std::size_t
25 #define AUX_WRAPPER_NAME size_t
26 #define AUX_WRAPPER_PARAMS(N) std::size_t N
27 #endif
28
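To check that the edits above actually get the Boost MPL headers past nvcc, you can try compiling a trivial .cu file -- a throwaway sketch written to /tmp via a heredoc:

cat > /tmp/mpl_test.cu <<'EOF'
// minimal check: this fails to compile against unpatched boost::mpl under nvcc
#include <boost/mpl/size_t.hpp>
int main() { return 0; }
EOF
nvcc -c /tmp/mpl_test.cu -o /tmp/mpl_test.o && echo "boost patches look OK"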
HDF5
You'll have to compile HDF5 yourself for now, since H5Cpp.h (i.e. C++ support) is missing from the Debian packages.
mkdir ~/tmp
cd ~/tmp
wget http://www.hdfgroup.org/ftp/HDF5/current/src/hdf5-1.8.10-patch1.tar.gz
tar xvf hdf5-1.8.10-patch1.tar.gz
cd hdf5-1.8.10-patch1/
export CC=/usr/bin/gcc-4.6 && export CXX=/usr/bin/g++-4.6
./configure --prefix=/opt/gamess_cuda/hdf5 --with-pthread --enable-cxx --enable-threadsafe --enable-unsupported
make
mkdir /opt/gamess_cuda/hdf5/lib -p
mkdir /opt/gamess_cuda/hdf5/include -p
sudo checkinstall
This package will be built according to these values:
0 - Maintainer: [ root@neon ]
1 - Summary: [ hdf5-cxx]
2 - Name: [ hdf5-1.8.10 ]
3 - Version: [ 1.8.10-1 ]
4 - Release: [ 1 ]
5 - License: [ GPL ]
6 - Group: [ checkinstall ]
7 - Architecture: [ amd64 ]
8 - Source location: [ hdf5-1.8.10-patch1 ]
9 - Alternate source location: [ ]
10 - Requires: [ ]
11 - Provides: [ hdf5-1.8.10 ]
12 - Conflicts: [ ]
13 - Replaces: [ ]
Make sure to edit the Version field, since the default taken from 'patch1' leads to an error (the version string must start with a digit).
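It doesn't hurt to check that the C++ bindings ended up where libcchem will later look for them:

ls /opt/gamess_cuda/hdf5/include/H5Cpp.h
ls /opt/gamess_cuda/hdf5/lib/libhdf5.a /opt/gamess_cuda/hdf5/lib/libhdf5_cpp.a /opt/gamess_cuda/hdf5/lib/libhdf5_hl.a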
OpenMPI 1.6
I can't remember why I ended up compiling OpenMPI myself instead of using the stock Debian version.
sudo apt-get install build-essential gfortran
wget http://www.open-mpi.org/software/ompi/v1.6/downloads/openmpi-1.6.tar.bz2
tar xvf openmpi-1.6.tar.bz2
cd openmpi-1.6/
sudo mkdir /opt/openmpi/
sudo chown ${USER} /opt/openmpi/
./configure --prefix=/opt/openmpi/1.6/ --with-sge
make
make install
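A quick sanity check of the new OpenMPI build:

/opt/openmpi/1.6/bin/mpirun --version
/opt/openmpi/1.6/bin/mpif90 --version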
Compiling libcchem
cd /opt/gamess_cuda/libcchem
Edit /opt/gamess_cuda/libcchem/rysq/src/externals/boost/cuda/device_ptr.hpp so that lines 4-8 read:
4 #include <cstdlib>
5 #include <iterator>
6 #include <stddef.h>
7
8 namespace boost {
Edit /opt/gamess_cuda/libcchem/src/externals/boost/cuda/device_ptr.hpp so that lines 4-9 read:
4 #include <cstdlib>
5 #include <iterator>
6 #include <stddef.h>
7
8 namespace boost {
9 namespace cuda {
./configure --with-gamess --with-hdf5=/opt/gamess_cuda/hdf5 CPPFLAGS="-I/opt/gamess_cuda/hdf5/include" --with-cuda=/usr --disable-openmp --prefix=/opt/gamess_cuda/libcchem --with-gpu=fermi --with-integer8 --with-cublas
make
make install
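If the build went through, the libraries that lked will link against later in this post should now be in place:

ls /opt/gamess_cuda/libcchem/lib
# you should see the cchem_gamess, cchem and rysq libraries here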
Configure Gamess US
Mainly follow this: http://verahill.blogspot.com.au/2012/09/compiling-and-testing-gamess-us-on.html
cd /opt/gamess_cuda
./config
please enter your target machine name: linux64
GAMESS directory? [/opt/gamess_cuda] /opt/gamess_cuda
Setting up GAMESS compile and link for GMS_TARGET=linux64
GAMESS software is located at GMS_PATH=/opt/gamess_cuda
Please provide the name of the build location.
This may be the same location as the GAMESS directory.
GAMESS build directory? [/home/me/tmp/gamess]
Please provide a version number for the GAMESS executable.
This will be used as the middle part of the binary's name,
for example: gamess.00.x
Version? [00] 12r2
Please enter your choice of FORTRAN: gfortran
gfortran is very robust, so this is a wise choice.
Please type 'gfortran -dumpversion' or else 'gfortran -v' to
detect the version number of your gfortran.
This reply should be a string with at least two decimal points,
such as 4.1.2 or 4.6.1, or maybe even 4.4.2-12.
The reply may be labeled as a 'gcc' version,
but it is really your gfortran version.
Please enter only the first decimal place, such as 4.1 or 4.6:
4.6
Enter your choice of 'mkl' or 'atlas' or 'acml' or 'none': atlas
Please enter the Atlas subdirectory on your system: /opt/ATLAS/lib
Math library 'atlas' will be taken from /opt/ATLAS
If you have an expensive but fast network like Infiniband (IB), and
if you have an MPI library correctly installed,
choose 'mpi'.
communication library ('sockets' or 'mpi')? mpi
Enter MPI library (impi, mvapich2, mpt, sockets): openmpi
Please enter your openmpi's location: /opt/openmpi/1.6
Build Gamess US
cd /opt/gamess_cuda/ddi/
./compddi
cd ../
Edit comp:
872 # see ~/gamess/libcchem/aaa.readme.1st for more information
873 set GPUCODE=true
874 if ($GPUCODE == true) then
and
1663 # -fno-whole-file suppresses argument's data type checking
1664 set OPT='-O0'
1665 if (".$GMS_DEBUG_FLAGS" != .) set OPT="$GMS_DEBUG_FLAGS"
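If you'd rather script the GPUCODE edit than hunt for the line in an editor, something like this should work -- a sketch that assumes the stock comp reads 'set GPUCODE=false', so verify the result; the OPT change is best done by hand, since -O2 occurs in several places in comp:

cd /opt/gamess_cuda
cp comp comp.bak                                    # keep a backup
sed -i 's/^set GPUCODE=false/set GPUCODE=true/' comp
grep -n 'GPUCODE=' comp                             # verify the change took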
./compall
Edit lked:
69 #
70 set GPUCODE=true
71 #
72 # 5. optional MPQC interface
and
958 case openmpi:
959 set MPILIBS="-L$GMS_MPI_PATH/lib"
960 set MPILIBS="$MPILIBS -lmpi -lpthread"
961 breaksw
and
1214 if ($GPUCODE == true) then
1215 echo " Using 'libcchem' add-in C++ codes for Nvidia/CUDA GPUs."
1216 set GPU_LIBS="-L/opt/gamess_cuda/libcchem/lib -lcchem_gamess -lcchem -lrysq"
1217 set GPU_LIBS="$GPU_LIBS -lcudart -lcublas"
1218 ### GPU_LIBS="$GPU_LIBS -lcudart -lcublas"
1219 set GPU_LIBS="$GPU_LIBS /usr/lib/libboost_thread.a"
1220 set GPU_LIBS="$GPU_LIBS /opt/gamess_cuda/hdf5/lib/libhdf5.a"
1221 set GPU_LIBS="$GPU_LIBS /opt/gamess_cuda/hdf5/lib/libhdf5_cpp.a"
1222 set GPU_LIBS="$GPU_LIBS /opt/gamess_cuda/hdf5/lib/libhdf5_hl.a"
1223 set GPU_LIBS="$GPU_LIBS /opt/gamess_cuda/hdf5/lib/libhdf5.a"
1224 set GPU_LIBS="$GPU_LIBS /opt/ATLAS/lib/libcblas.a"
1225 set GPU_LIBS="$GPU_LIBS -lz"
1226 set GPU_LIBS="$GPU_LIBS -lstdc++"
1227 ### GPU_LIBS="$GPU_LIBS -lgomp"
1228 set GPU_LIBS="$GPU_LIBS -lpthread"
1229 echo " libcchem GPU code's libraries are"
1230 echo "$GPU_LIBS"
1231 else
./lked gamess gpu.12
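If linking succeeded, you should end up with the binary that the run script below expects, with the CUDA runtime and CUBLAS dynamically linked:

ls -lh /opt/gamess_cuda/gamess.gpu.12.x
ldd /opt/gamess_cuda/gamess.gpu.12.x | grep -E 'cudart|cublas'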
Create gpurun:
#!/bin/csh
set TARGET=mpi
set SCR=$HOME/scratch
set USERSCR=/scratch
set GMSPATH=/opt/gamess_cuda
set JOB=$1
set VERNO=$2
set NCPUS=$3
@ NUMGPU=1
if ($NUMGPU > 0) then
  @ NUMCPU = $NCPUS - 1
  echo libcchem kernels will use $NUMCPU cores and $NUMGPU GPUs per node...
  set echo
  setenv CCHEM_PROFILE 1
  setenv NUM_THREADS $NCPUS
  #--if ($NUMGPU == 0) setenv GPU_DEVICES -1
  #--if ($NUMGPU == 2) setenv GPU_DEVICES 0,1
  #--if ($NUMGPU == 4) setenv GPU_DEVICES 0,1,2,3
  #setenv LD_LIBRARY_PATH /share/apps/cuda/lib64:$LD_LIBRARY_PATH
  ###### LD_LIBRARY_PATH /usr/local/cuda/lib64:$LD_LIBRARY_PATH
  unset echo
else
  setenv GPU_DEVICES -1
endif
if ( $JOB:r.inp == $JOB ) set JOB=$JOB:r
echo "Copying input file $JOB.inp to your run's scratch directory..."
cp $JOB.inp $SCR/$JOB.F05
setenv TRAJECT $USERSCR/$JOB.trj
setenv RESTART $USERSCR/$JOB.rst
setenv INPUT $SCR/$JOB.F05
setenv PUNCH $USERSCR/$JOB.dat
if ( -e $TRAJECT ) rm $TRAJECT
if ( -e $PUNCH ) rm $PUNCH
if ( -e $RESTART ) rm $RESTART
source $GMSPATH/gms-files.csh
setenv LD_LIBRARY_PATH /opt/openmpi/1.6/lib:$LD_LIBRARY_PATH
set path= ( /opt/openmpi/1.6/bin $path )
mpiexec -n $NCPUS $GMSPATH/gamess.gpu.$VERNO.x | tee $JOB.out
cp $PUNCH .
echo 'export PATH=$PATH:/opt/gamess_cuda' >> ~/.bashrc
source ~/.bashrc
chmod +x gpurun
cd tests/standard/
gpurun exam44 12 2
The only evidence of GPU usage in the output is a block like this one in e.g. exam44.out:
388 -----------------------
389 MP2 CONTROL INFORMATION
390 -----------------------
391 NACORE = 6 NBCORE = 6
392 LMOMP2 = F AOINTS = DUP
393 METHOD = 2 NWORD = 0
394 MP2PRP = F OSPT = NONE
395 CUTOFF = 1.00E-09 CPHFBS = BASISAO
396 CODE = GPU
397
398 NUMBER OF CORE -A- ORBITALS = 6
399 NUMBER OF CORE -B- ORBITALS = 6
but in the summary only CPU utilisation is mentioned.
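A quick way of checking whether a given output file took the GPU code path is to grep for the CODE field shown above:

grep 'CODE' exam44.out
# 'CODE = GPU' indicates the libcchem MP2 code was used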
I modified rungms:
me@neon:/opt/gamess_cuda/tests/standard$ diff /opt/gamess_cuda/gpurungms /opt/gamess/rungms
59,62c59,62
< set TARGET=mpi
< set SCR=$HOME/scratch
< set USERSCR=/scratch
< set GMSPATH=/opt/gamess_cuda
---
> set TARGET=sockets
> set SCR=/scr/$USER
> set USERSCR=~$USER/scr
> set GMSPATH=/u1/mike/gamess
67d66
< set NNODES=1
513c512
< set PPN=$3
---
> set PPN=$4
601c600
< @ PPN2 = $PPN
---
> @ PPN2 = $PPN + $PPN
742c741
< @ NUMGPU=1
---
> @ NUMGPU=0
752c751
< # setenv LD_LIBRARY_PATH /share/apps/cuda/lib64:$LD_LIBRARY_PATH
---
> setenv LD_LIBRARY_PATH /share/apps/cuda/lib64:$LD_LIBRARY_PATH
793c792,793
< /opt/openmpi/1.6/bin/mpiexec -n $NPROCS $GMSPATH/gamess.$VERNO.x < /dev/null
---
> mpiexec.hydra -f $PROCFILE -n $NPROCS \
> /home/mike/gamess/gamess.$VERNO.x < /dev/null