Update 27/6/2013:
Please note that Kirill Berezovsky has published a series of posts on GAMESS US, including how to compile it for both CPU and GPU use. See
http://biochemicalmatters.blogspot.com.au/2013/06/gamess-us-frequently-asked-questions_26.html
http://biochemicalmatters.blogspot.ru/2013/06/gamess-us-frequently-asked-questions_1687.html
http://biochemicalmatters.blogspot.ru/2013/06/gamess-us-frequently-asked-questions_1447.html
http://biochemicalmatters.blogspot.com.au/2013/06/gamess-us-frequently-asked-questions.html
Update 21 May 2013: See the comments below this post. This approach most likely works -- what has been confusing me is the lack of reports of GPU timings in the output, but this doesn't necessarily mean that the GPU isn't being used. The poster below, using nvidia-smi, observed GPU usage, although the speed-up was not major.
Blogspot needs versioning.
I lost the entire post when it was almost complete. Screw this.
Everything compiles fine, but no GPU output during calculation.
I see no evidence of the GPU being used at any stage. Otherwise all is good -- the calcs run fine on the CPU.
Maybe someone else will have a better idea.
I looked at libcchem/aaa.readme.1st and http://combichem.blogspot.com.au/2011/02/compiling-gamess-with-cuda-gpu-support.html to get as far as I did.
Setting up gamess
Get gamess (see e.g. http://verahill.blogspot.com.au/2012/09/compiling-and-testing-gamess-us-on.html). Put gamess-current.tar.gz in ~/tmp
sudo apt-get install libboost-all-dev build-essential g++ gfortran automake nvidia-cuda-toolkit python-cheetah openmpi-bin libopenmpi-dev zlib1g-dev checkinstall mkdir ~/tmp cd ~/tmp tar xvf gamess-current.tar.gz sudo mv gamess /opt/gamess_cuda sudo chown $USER:$USER /opt/gamess_cuda -R
ACML
Download both the 'regular' and the int64 gfortran packages from AMD:
http://developer.amd.com/tools-and-sdks/cpu-development/amd-core-math-library-acml/acml-downloads-resources/#download
tar xvf acml-5-3-1-gfortran-64bit-int64.tgz tar xvf acml-5-3-1-gfortran-64bit.tgz sh install-acml-5-3-1-gfortran-64bit-int64.shYou'll get something like this:Where do you want to install ACML? Press return to use the default location (/opt/acml5.3.1), or enter an alternative path. The directory will be created if it does not already exist. > /opt/acml/acml5.3.1sh install-acml-5-3-1-gfortran-64bit.shWhere do you want to install ACML? Press return to use the default location (/opt/acml5.3.1), or enter an alternative path. The directory will be created if it does not already exist. > /opt/acml/acml5.3.1
/opt/acml/acml5.3.1 |-- Doc |-- gfortran64 |-- gfortran64_fma4 |-- gfortran64_fma4_int64 |-- gfortran64_fma4_mp |-- gfortran64_fma4_mp_int64 |-- gfortran64_int64 |-- gfortran64_mp |-- gfortran64_mp_int64 `-- util
where
* fma4 is for cpus with FMA4 support (use util/cpuid to check)
* int64 is for double-precision float (integer*8) I think
* mp is for openmp. For MPI do not use the _mp_ libraries!
Pick your library/ies and add them to the LD_LIBRARY_PATH, e.g.:
echo 'export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/opt/acml/acml5.3.1/gfortran64_int64/lib' >> ~/.bashrc source ~/.bashrc
CBLAS
cd /opt/netlib/ wget http://www.netlib.org/blas/blast-forum/cblas.tgz tar xvf cblas.tgz cd CBLAS/
Edit Makefile.LINUX
24 25 BLLIB = /opt/acml/acml5.3.1/gfortran64_int64/lib/libacml.a 26 CBLIB = ../lib/cblas_$(PLAT).a 27
cp Makefile.LINUX Makefile.in make
patching libboost
sudo su cd /usr/include/boost patch -p1 < /opt/gamess_cuda/libcchem/boot/ exit
Make the following changes by hand if the patch didn't work:
/usr/include/boost/mpl/aux_/integral_wrapper.hpp
/usr/include/boost/mpl/size_t_fwd.hpp47 // other compilers (e.g. MSVC) are not particulary happy about it 48 #if BOOST_WORKAROUND(__EDG_VERSION__, <= 238) || defined(__CUDACC__) 49 typedef struct AUX_WRAPPER_NAME type;
/usr/include/boost/mpl/size_t.hpp20 21 BOOST_MPL_AUX_ADL_BARRIER_NAMESPACE_OPEN 22 #if defined(__CUDACC__) 23 typedef std::size_t std_size_t; 24 template< std_size_t N > struct size_t; 25 #else 26 template< std::size_t N > struct size_t; 27 #endif 28 29 BOOST_MPL_AUX_ADL_BARRIER_NAMESPACE_CLOSE
19 #if defined(__CUDACC__) 20 #define AUX_WRAPPER_VALUE_TYPE std_size_t 21 #define AUX_WRAPPER_NAME size_t 22 #define AUX_WRAPPER_PARAMS(N) std_size_t N 23 #else 24 #define AUX_WRAPPER_VALUE_TYPE std::size_t 25 #define AUX_WRAPPER_NAME size_t 26 #define AUX_WRAPPER_PARAMS(N) std::size_t N 27 #endif 28
HDF5
mkdir ~/tmp cd ~/tmp wget http://www.hdfgroup.org/ftp/HDF5/current/src/hdf5-1.8.10-patch1.tar.gz tar xvf hdf5-1.8.10-patch1.tar.gz cd hdf5-1.8.10-patch1/ export CC=/usr/bin/gcc-4.6 && export CXX=/usr/bin/g++-4.6 ./configure --prefix=/opt/gamess_cuda/hdf5 --with-pthread --enable-cxx --enable-threadsafe --enable-unsupported make mkdir /opt/gamess_cuda/hdf5/lib -p mkdir /opt/gamess_cuda/hdf5/include -p sudo checkinstallMake sure to edit the Version field since Patch-1 leads to an error (must start with digit).This package will be built according to these values: 0 - Maintainer: [ root@neon ] 1 - Summary: [ hdf5-cxx] 2 - Name: [ hdf5-1.8.10 ] 3 - Version: [ 1.8.10-1 ] 4 - Release: [ 1 ] 5 - License: [ GPL ] 6 - Group: [ checkinstall ] 7 - Architecture: [ amd64 ] 8 - Source location: [ hdf5-1.8.10-patch1 ] 9 - Alternate source location: [ ] 10 - Requires: [ ] 11 - Provides: [ hdf5-1.8.10 ] 12 - Conflicts: [ ] 13 - Replaces: [ ]
LIBCCHEM
Edit /opt/gamess_cuda/libcchem/src/externals/boost/cuda/device_ptr.hpp and /opt/gamess_cuda/libcchem/rysq/src/externals/boost/cuda/device_ptr.hpp. Insert
somewhere at the beginning of each file.#include <stddef.h>
./configure --with-gamess --with-hdf5=/opt/gamess_cuda/hdf5 CPPFLAGS="-I/opt/gamess_cuda/hdf5/include" --with-cuda=/usr --disable-openmp --prefix=/opt/gamess_cuda/libcchem --with-gpu=fermi --with-integer8 --with-cublas make make install
Configure GAMESS US
cd /opt/gamess_cuda ./configplease enter your target machine name: linux64 GAMESS directory? [/opt/gamess_cuda] GAMESS build directory? [/opt/gamess_cuda] Version? [00] 12 Please enter your choice of FORTRAN: gfortran Please enter only the first decimal place, such as 4.1 or 4.6: 4.6 Enter your choice of 'mkl' or 'atlas' or 'acml' or 'none': acml enter this full pathname: /opt/acml/acml5.3.1 communication library ('sockets' or 'mpi')? mpi Enter MPI library (impi, mvapich2, mpt, sockets): openmpi Please enter your openmpi's location: /opt/openmpi/1.6
Compile
cd ddi/ ./compddi cd ..
Edit comp
and872 # see ~/gamess/libcchem/aaa.readme.1st for more information 873 set GPUCODE=true 874 if ($GPUCODE == true) then
1663 # -fno-whole-file suppresses argument's data type checking 1664 set OPT='-O0' 1665 if (".$GMS_DEBUG_FLAGS" != .) set OPT="$GMS_DEBUG_FLAGS"
./compall
Edit lked
and69 # 70 set GPUCODE=true 71 # 72 # 5. optional MPQC interface
and958 case openmpi: 959 set MPILIBS="-L$GMS_MPI_PATH/lib" 960 set MPILIBS="$MPILIBS -lmpi -lpthread" 961 breaksw
1214 if ($GPUCODE == true) then 1215 echo " Using 'libcchem' add-in C++ codes for Nvidia/CUDA GPUs." 1216 set GPU_LIBS="-L/opt/gamess_cuda/libcchem/lib -lcchem_gamess -lcchem -lrysq" 1217 set GPU_LIBS="$GPU_LIBS -lcudart -lcublas" 1218 ### GPU_LIBS="$GPU_LIBS -lcudart -lcublas" 1219 set GPU_LIBS="$GPU_LIBS /usr/lib/libboost_thread.a" 1220 set GPU_LIBS="$GPU_LIBS /opt/gamess_cuda/hdf5/lib/libhdf5.a" 1221 set GPU_LIBS="$GPU_LIBS /opt/gamess_cuda/hdf5/lib/libhdf5_cpp.a" 1222 set GPU_LIBS="$GPU_LIBS /opt/gamess_cuda/hdf5/lib/libhdf5_hl.a" 1223 set GPU_LIBS="$GPU_LIBS /opt/gamess_cuda/hdf5/lib/libhdf5.a" 1224 set GPU_LIBS="$GPU_LIBS /opt/acml/acml5.3.1/gfortran64_int64/lib/libacml.a /opt/netlib/CBLAS/lib/cblas_LINUX.a" 1225 set GPU_LIBS="$GPU_LIBS -lz" 1226 set GPU_LIBS="$GPU_LIBS -lstdc++" 1227 ### GPU_LIBS="$GPU_LIBS -lgomp" 1228 set GPU_LIBS="$GPU_LIBS -lpthread" 1229 echo " libcchem GPU code's libraries are" 1230 echo "$GPU_LIBS" 1231 else
./lked gamess gpu.12
Run script
Create rungpu:
chmod +x it to make it executable.#!/bin/csh -v set TARGET=mpi set SCR=$HOME/scratch set USERSCR=/scratch set GMSPATH=/opt/gamess_cuda set JOB=$1 set VERNO=$2 set NCPUS=$3 set PPN=$3 @ NUMGPU=1 if ($NUMGPU > 0) then @ NUMCPU = $NCPUS - 1 echo libcchem kernels will use $NUMCPU cores and $NUMGPU GPUs per node... set echo setenv CCHEM_PROFILE 1 setenv NUM_THREADS $NCPUS setenv GPU_DEVICES 0 #--if ($NUMGPU == 0) setenv GPU_DEVICES -1 #--if ($NUMGPU == 2) setenv GPU_DEVICES 0,1 #--if ($NUMGPU == 4) setenv GPU_DEVICES 0,1,2,3 #setenv LD_LIBRARY_PATH /share/apps/cuda/lib64:$LD_LIBRARY_PATH ###### LD_LIBRARY_PATH /usr/local/cuda/lib64:$LD_LIBRARY_PATH unset echo else echo NO GPU setenv GPU_DEVICES -1 endif if ( $JOB:r.inp == $JOB ) set JOB=$JOB:r echo "Copying input file $JOB.inp to your run's scratch directory..." cp $JOB.inp $SCR/$JOB.F05 setenv TRAJECT $USERSCR/$JOB.trj setenv RESTART $USERSCR/$JOB.rst setenv INPUT $SCR/$JOB.F05 setenv PUNCH $USERSCR/$JOB.dat if ( -e $TRAJECT ) rm $TRAJECT if ( -e $PUNCH ) rm $PUNCH if ( -e $RESTART ) rm $RESTART source $GMSPATH/gms-files.csh setenv LD_LIBRARY_PATH /opt/openmpi/1.6/lib:/opt/netlib/CBLAS/lib:/opt/acml/acml5.3.1/gfortran64_int64/lib set path= ( /opt/openmpi/1.6/bin $path ) /opt/openmpi/1.6/bin/mpiexec -n $NCPUS $GMSPATH/gamess.gpu.$VERNO.x|tee $JOB.out cp $PUNCH .
Add /opt/gamess_cuda to path:
echo 'export PATH=$PATH:/opt/gamess_cuda' source ~/.bashrc
Testing
cd /opt/gamess_cuda/tests/standard gpurun exam44 12 2