20 January 2012

54. Compiling GROMACS with GPU support using OpenMM (OpenMM 4.0 source or 3.1.1 Linux 64 binaries) on debian testing

To compile gromacs without GPU support (and that's probably what you should do) look here:
GPU support is most likely NOT going to be useful for you.

For a better way (out-of-tree) of building a newer openMM, see:

The method below describes an in-tree build of OpenMM. Do an out-of-tree build (e.g. see link above) to avoid headaches.

OK, enough with the 'bold'. Basically, I wrote the post below at a time when I was at a very early stage of learning how to compile my own programs. I'm obviously still learning, but I have published better methods now -- see the links above for those.
Be aware that in-tree (when you build your package in the root of the source tree) building is not supported for OpenMM and you will probably be told so if posting on forums asking for help. Out-of-tree (when you build in a directory outside the source tree structure) is supported and is described above. So, while I'm leaving this post here for posterity, use it as inspiration, but don't follow it blindly.

/25th of September 2012.

Original Post:
First read this: 1. Use EITHER the OpenMM3.1.1-Linux64 binaries
 2. OR, which is better, compile OpenMM4.0 from source -- see here http://verahill.blogspot.com/2012/01/debian-testing-64-wheezy_20.html -- if you do this you can skip step 3.

Do NOT use: 1. the OpenMM4.0-Linux64 binaries or the OpenMM3.1.1-Source source -- at least I have had issues with those versions.

The following guide was last tested: 20/01/2012
It has not been checked for typos -- whenever you use a command with 'sudo' in it, try to understand what it does first.

Finally: if it doesn't work the first time, try a few more times. I've never managed to build on the first try, but restarting with make clean, cmake CMakeLists.txt, make works in the end every time. Sigh...

You may also want to remove CMakeCache.txt in the source root.

1. Things to install first

Install cmake and autoconf if you haven't already

sudo apt-get install cmake autoconf

2. Edit ~/.bashrc or /etc/profile
Put this at the end of your ~/.bashrc (or /etc/profile for multi-user systems)

export LD_LIBRARY_PATH=/lib/openmm:/usr/lib/nvidia-cuda-toolkit:/usr/lib/nvidia:$LD_LIBRARY_PATH
export OPENMM_PLUGIN_DIR=/usr/local/openmm/lib/plugins
export OPENMM_ROOT_DIR=/usr/local/openmm 

Load the new settings with:
source ~/.bashrc
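
A quick sanity check after sourcing the file can save a failed cmake run later on. The sketch below is only an illustration: check_openmm_env is a made-up helper name, not something shipped with OpenMM or GROMACS, and all it does is confirm that the two OpenMM variables are set and point at existing directories.

```shell
# Hypothetical helper (not part of OpenMM or GROMACS): checks that the
# OpenMM variables exported above are set and point at real directories.
check_openmm_env() {
    status=0
    for name in OPENMM_ROOT_DIR OPENMM_PLUGIN_DIR; do
        eval "value=\${$name:-}"    # look up the variable whose name is in $name
        if [ -z "$value" ]; then
            echo "$name is not set"
            status=1
        elif [ ! -d "$value" ]; then
            echo "$name=$value (directory not found)"
            status=1
        else
            echo "$name=$value (ok)"
        fi
    done
    return "$status"
}
```

Run check_openmm_env in a fresh terminal. Before OpenMM is installed in step 3 it will of course report the directories as missing.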

3. OpenMM
To download OpenMM you need to register with simtk.org. Registering is, all things considered, fairly easy and quick.

Either follow this guide for compiling OpenMM4.0 from source (preferred) or follow the steps below to use the 3.1.1 binaries:

Download the OpenMM3.1.1-Linux64.zip file (if applicable).

Put the .zip file in ~/tmp

cd ~/tmp
unzip OpenMM3.1.1-Linux64.zip
cd OpenMM3.1.1-Linux64

sudo mkdir -p /lib/openmm/plugins
sudo mkdir -p /lib/openmm/include

sudo cp lib/*.so /lib/openmm
sudo cp include/* -R /lib/openmm/include
sudo cp plugins/* -R /lib/openmm/plugins

This is the structure you're aiming for:

|-- include
|   |-- internal
|   |-- openmm
|   `-- serialization
`-- plugins


sudo mkdir /usr/local/openmm
sudo cp ~/tmp/OpenMM3.1.1/* -R /usr/local/openmm

So that you end up with

|-- bin
|-- docs
|   `-- api
|       `-- search
|-- include
|   `-- openmm
|       |-- internal
|       `-- serialization
|-- lib
|   `-- plugins
|-- licenses
`-- python
    |-- simtk
    |   |-- openmm
    |   `-- unit
    `-- src
        `-- swig_doxygen
            |-- doxygen
            `-- swig_lib
                `-- python
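
Before moving on, it may be worth checking that the pieces the GROMACS build will ask for are actually in place, since missing headers are behind errors 2 and 3 in the addendum. A minimal sketch, assuming the layout shown above; check_openmm_tree is a made-up helper name.

```shell
# Hypothetical helper: verifies that an OpenMM install tree contains the
# pieces the GROMACS build asks for (OpenMM.h, openmm/Kernel.h, lib/plugins).
check_openmm_tree() {
    root="${1:-/usr/local/openmm}"
    missing=0
    for path in include/OpenMM.h include/openmm/Kernel.h lib/plugins; do
        if [ -e "$root/$path" ]; then
            echo "found   $root/$path"
        else
            echo "MISSING $root/$path"
            missing=1
        fi
    done
    return "$missing"
}
```

Called without arguments it inspects /usr/local/openmm; pass another directory as the first argument to check a different install location.
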
4. Gromacs-4.5.5
Get and untar gromacs:
cd ~/tmp
wget ftp://ftp.gromacs.org/pub/gromacs/gromacs-4.5.5.tar.gz
tar -xvf gromacs-4.5.5.tar.gz
mv gromacs-4.5.5/ gromacs_gpu/

Let's start preparing our build:
cd gromacs_gpu/
make clean
export OPENMM_ROOT_DIR=/usr/local/openmm

EDIT/COMMENT: I think autoconf/automake and cmake are mutually exclusive. I only left autoconf in since I did invoke it when I did my build.
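
The cmake call itself is not shown at this point in the post. Going by the flags named elsewhere in the text (-DGMX_OPENMM=ON, plus -DGMX_THREADS=OFF from error 5 in the addendum), an in-tree configure step would presumably look something like the sketch below. The wrapper name configure_gromacs_gpu is made up, and the exact flags used for the original build are an assumption.

```shell
# Assumed reconstruction of the configure step (in-tree build).
# Run from inside ~/tmp/gromacs_gpu; configure_gromacs_gpu is a made-up name.
configure_gromacs_gpu() {
    # Fall back to the install location used in this post if the variable is unset.
    export OPENMM_ROOT_DIR="${OPENMM_ROOT_DIR:-/usr/local/openmm}"
    rm -f CMakeCache.txt    # drop cached settings from earlier attempts
    cmake -DGMX_OPENMM=ON -DGMX_THREADS=OFF .
}
```

Removing CMakeCache.txt first means a retry starts from a clean configuration, which matches the advice at the top of the post.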

The output of a successful run of cmake follows below -- note the binary suffix, as well as the Found CUDA and Found OpenMM lines.

-- Using default binary suffix: "-gpu"
-- Using default library suffix: "_gpu"
-- No external FFT libraries needed for the OpenMM build, switching to fftpack!
-- Switching off CPU-based acceleration, the OpenMM build does not support/need any!
-- Found CUDA: /usr (Required is at least version "3.1")
-- Found OpenMM: /usr/local/openmm
-- Looking for fseeko
-- Looking for fseeko - found
-- Using internal FFT library - fftpack
-- Configuring done
-- Generating done
-- Build files have been written to: /home/me/tmp/gromacs_mopac

Next, some editing. First we copy two of the files generated by the -DGMX_OPENMM=ON switch of the cmake command, then we remove the offending flag from each of them.

This command should be on a single line:
cp gromacs_gpu/src/kernel/gmx_gpu_utils/CMakeFiles/gmx_gpu_utils_generated_gmx_gpu_utils.cu.o.cmake ~/tmp

This command is also a single line:
cp gromacs_gpu/src/kernel/gmx_gpu_utils/CMakeFiles/gmx_gpu_utils_generated_memtestG80_core.cu.o.cmake ~/tmp

We now have two *.cu.o.cmake files in our ~/tmp
Open each one and remove all instances of -fexcess-precision=fast in them. Then copy the files back to their original location:

(This command should be on a single line:)
cp gmx_gpu_utils_generated_gmx_gpu_utils.cu.o.cmake gromacs_gpu/src/kernel/gmx_gpu_utils/CMakeFiles/

(This command is also a single line:)
cp gmx_gpu_utils_generated_memtestG80_core.cu.o.cmake gromacs_gpu/src/kernel/gmx_gpu_utils/CMakeFiles/

We copy and edit since these files get regenerated every time you do the cmake -DGMX_OPENMM=ON command.
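
Since the edit has to be redone after every cmake run, it can also be scripted instead of done by hand. A minimal sketch, assuming a sed that accepts -i with a backup suffix (GNU sed on Debian does); strip_excess_precision is a made-up helper name, and it edits the files where they are rather than copying them to ~/tmp first.

```shell
# Hypothetical helper: strips -fexcess-precision=fast from the given files
# in place, leaving a .bak copy of each original next to it.
strip_excess_precision() {
    for file in "$@"; do
        sed -i.bak 's/-fexcess-precision=fast *//g' "$file"
    done
}

# Intended use, from ~/tmp right after running cmake:
# strip_excess_precision \
#     gromacs_gpu/src/kernel/gmx_gpu_utils/CMakeFiles/gmx_gpu_utils_generated_gmx_gpu_utils.cu.o.cmake \
#     gromacs_gpu/src/kernel/gmx_gpu_utils/CMakeFiles/gmx_gpu_utils_generated_memtestG80_core.cu.o.cmake
```

The result should match the diffs shown under error 4 in the addendum: the CMAKE_HOST_FLAGS line keeps -Wall -Wno-unused and loses only the excess-precision flag.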
Quick aside: a clue to -fexcess-precision:
* Here: "On x86 targets, code containing floating-point calculations may run significantly slower when compiled with GCC 4.5 in strict C99 conformance mode than they did with earlier GCC versions. This is due to stricter standard conformance of the compiler and can be avoided by using the option -fexcess-precision=fast."
* Here: "-fexcess-precision=fast # disables the GCC 4.5 strict floating point C99 standards conformance for improved floating point performance"
* I'm running gcc (Debian) 4.6.2-12 and g++ 4.6.2-4
* I've tried with both -fexcess-precision=fast and -fexcess-precision=standard. Neither works.
* Not all excess-precision options work with all languages.
* Using g++ 4.5 someone got "cc1plus: sorry, unimplemented: -fexcess-precision=standard for C++" and comes to the conclusion that "Bah. Still, at least the -ffloat-store workaround still helps, for this case at any rate. Also, if you get GCC to use SSE instructions, there's no issue with excess precision".
I think the conclusion is that removing -fexcess-precision=fast MAY lead to a speed penalty (up to 2x) but will not change the results of the sim/calc.
OK, time to build!
cd gromacs_gpu/
make

and then install:
sudo make install

If all went well you'll now have mdrun-gpu in your /usr/local/gromacs/bin folder together with 93 other -gpu bin files!

Files for benchmarking can be downloaded from here: http://www.gromacs.org/@api/deki/files/128/=gromacs-gpubench-dhfr.tar.gz

If you get:

Program mdrun-gpu, VERSION 4.5.5
Source code file: /home/me/tmp/gromacs_gpu/src/kernel/openmm_wrapper.cpp, line: 1324
Fatal error:
The selected GPU (#0, GeForce GT 430) is not supported by Gromacs! Most probably you have a low-end GPU which would not perform well, or new hardware that has not been tested with the current release. If you still want to try using the device, use the force-device=yes option.
For more information and tips for troubleshooting, please check the GROMACS
website at http://www.gromacs.org/Documentation/Errors

when trying to benchmark, use

mdrun-gpu -device "OpenMM:deviceid=0, force-device=yes" -v

to force execution.

Make sure to monitor your GPU temperature -- it will easily reach 80 degrees Celsius on a fan-less video card.
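
One way to keep an eye on the temperature from a second terminal is nvidia-smi. This is only a sketch: it assumes nvidia-smi is installed and that your driver reports temperature.gpu for your card, which may not hold for older drivers or low-end GeForce cards. gpu_temp_check is a made-up helper name.

```shell
# Hypothetical helper: prints the temperature of the first GPU and warns
# once it reaches the limit given as $1 (default 80 degrees Celsius).
gpu_temp_check() {
    limit="${1:-80}"
    temp=$(nvidia-smi --query-gpu=temperature.gpu --format=csv,noheader,nounits | head -n 1)
    case "$temp" in
        ''|*[!0-9]*)
            # Empty or non-numeric answer, e.g. when the card is unsupported.
            echo "could not read GPU temperature"
            return 2
            ;;
    esac
    echo "GPU temperature: ${temp} C"
    if [ "$temp" -ge "$limit" ]; then
        echo "WARNING: at or above ${limit} C"
        return 1
    fi
}
```

Something like `while sleep 5; do gpu_temp_check 80; done` would poll it every five seconds during a benchmark run.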

Early 'results' (all values in ns/day):
Test                   GPU: GT 430   GPU: GT 520   CPU (6 core AMD Phenom II, 2.8 GHz)
dhfr-impl-1nm.bench    31.8          19.5          27.2
dhfr-impl-2nm.bench    31.9          19.9          5.8
dhfr-solv-PME.bench    2.7           1.7           8.6

Two significant problems are present:
1. The card heats up quickly to 80 degrees Celsius. It may thus be throttled.
2. The video card is in use by GNOME at the same time. Will rerun later without X.

The difference in speed for the 2nm test is significant: 31.9 ns/day on the GT 430 against 5.8 ns/day on the CPU.
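
To put rough numbers on the comparison, the GT 430 figures can be divided by the CPU figures. The sketch below only redoes that arithmetic with awk; the function name is made up and the values are copied from the results table.

```shell
# Hypothetical helper: GT 430 throughput divided by CPU throughput,
# using the ns/day values from the results table (test, GT 430, CPU).
gt430_vs_cpu() {
    awk '{ printf "%s %.2f\n", $1, $2 / $3 }' <<'EOF'
dhfr-impl-1nm.bench 31.8 27.2
dhfr-impl-2nm.bench 31.9 5.8
dhfr-solv-PME.bench 2.7 8.6
EOF
}

gt430_vs_cpu
```

This gives ratios of 1.17, 5.50 and 0.31: the GPU is slightly ahead for the 1nm implicit-solvent test, far ahead for the 2nm one, and roughly three times slower than the CPU for the PME test.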

5. Optional
If you haven't already, add this to ~/.bashrc (or /etc/profile)

then run
source ~/.bashrc
to load the new settings

Addendum: Errors and solutions
Do a fresh cycle of make clean, cmake, make and make install after each fix. Also remove CMakeCache.txt before re-running cmake.

1. Missing libxml2-dev

[ 89%] Building C object src/mdlib/CMakeFiles/md.dir/gmx_qhop_xml.c.o
/home/me/tmp/gromacs_gpu/src/mdlib/gmx_qhop_xml.c:48:27: fatal error: libxml/parser.h: No such file or directory
compilation terminated.
make[3]: *** [src/mdlib/CMakeFiles/md.dir/gmx_qhop_xml.c.o] Error 1
make[2]: *** [src/mdlib/CMakeFiles/md.dir/all] Error 2
make[1]: *** [src/kernel/CMakeFiles/mdrun.dir/rule] Error 2
make: *** [mdrun] Error 2

sudo apt-get install libxml2-dev

2. Missing OpenMM.h
Linking C shared library libgmxpreprocess_gpu.so
[ 98%] Built target gmxpreprocess
Scanning dependencies of target openmm_api_wrapper
[ 98%] Building CXX object src/kernel/CMakeFiles/openmm_api_wrapper.dir/openmm_wrapper.cpp.o
/home/me/tmp/gromacs_gpu/src/kernel/openmm_wrapper.cpp:59:20: fatal error: OpenMM.h: No such file or directory
compilation terminated.
make[3]: *** [src/kernel/CMakeFiles/openmm_api_wrapper.dir/openmm_wrapper.cpp.o] Error 1
make[2]: *** [src/kernel/CMakeFiles/openmm_api_wrapper.dir/all] Error 2
make[1]: *** [src/kernel/CMakeFiles/mdrun.dir/rule] Error 2
make: *** [mdrun] Error 2

sudo mkdir -p /usr/local/openmm/include
sudo cp ~/tmp/OpenMM3.1.1-Source/openmmapi/include/* -R /usr/local/openmm/include/

3. Missing Kernel.h
Linking CXX static library libgmx_gpu_utils.a
[  0%] Built target gmx_gpu_utils
[  0%] Building CXX object src/kernel/CMakeFiles/openmm_api_wrapper.dir/openmm_wrapper.cpp.o
In file included from /usr/local/openmm/include/OpenMM.h:36:0,
                 from /home/me/tmp/gromacs_gpu/src/kernel/openmm_wrapper.cpp:59:
/usr/local/openmm/include/openmm/BrownianIntegrator.h:36:27: fatal error: openmm/Kernel.h: No such file or directory
compilation terminated.
make[3]: *** [src/kernel/CMakeFiles/openmm_api_wrapper.dir/openmm_wrapper.cpp.o] Error 1
make[2]: *** [src/kernel/CMakeFiles/openmm_api_wrapper.dir/all] Error 2
make[1]: *** [src/kernel/CMakeFiles/mdrun.dir/rule] Error 2
make: *** [mdrun] Error 2
make: *** No rule to make target `mdrun-install'.  Stop.

Kernel.h was missing from the openmmapi folder when I compiled OpenMM myself. Use the pre-compiled version.

4. cc1plus: Unrecognised option -fexcess-precision=fast

(The percentage shown depends on whether you used make, make mdrun, make gmx_gpu_utils etc.)
[  0%] Building NVCC (Device) object src/kernel/gmx_gpu_utils/./gmx_gpu_utils_generated_memtestG80_core.cu.o
cc1plus: error: unrecognized command line option "-fexcess-precision=fast"
CMake Error at CMakeFiles/gmx_gpu_utils_generated_memtestG80_core.cu.o.cmake:198 (message):
  Error generating

make[3]: *** [src/kernel/gmx_gpu_utils/./gmx_gpu_utils_generated_memtestG80_core.cu.o] Error 1
make[2]: *** [src/kernel/gmx_gpu_utils/CMakeFiles/gmx_gpu_utils.dir/all] Error 2
make[1]: *** [src/kernel/CMakeFiles/mdrun.dir/rule] Error 2
make: *** [mdrun] Error 2

You need to edit gmx_gpu_utils_generated_memtestG80_core.cu.o.cmake and gmx_gpu_utils_generated_gmx_gpu_utils.cu.o.cmake and remove all instances of -fexcess-precision=fast.
The files in src/kernel/gmx_gpu_utils/ will be overwritten every time you run cmake, so keep copies of the two edited files around in case you need to re-build.

Here's the diff (edited vs unedited) for gmx_gpu_utils_generated_gmx_gpu_utils.cu.o.cmake:

< set(CMAKE_HOST_FLAGS  -Wall -Wno-unused   )
> set(CMAKE_HOST_FLAGS  -fexcess-precision=fast -Wall -Wno-unused   )

Here's the diff (edited vs unedited) for gmx_gpu_utils_generated_memtestG80_core.cu.o.cmake:

< set(CMAKE_HOST_FLAGS  -Wall -Wno-unused   )
> set(CMAKE_HOST_FLAGS  -fexcess-precision=fast -Wall -Wno-unused   )

5. Nonbonded kernel
[ 78%] Building C object src/gmxlib/CMakeFiles/gmx.dir/nonbonded/nb_kernel_x86_64_sse/nb_kernel400_x86_64_sse.c.o
/home/me/tmp/gromacs_gpu/src/gmxlib/nonbonded/nb_kernel_x86_64_sse/nb_kernel400_x86_64_sse.c: In function ‘nb_kernel400nf_x86_64_sse’:
/home/me/tmp/gromacs_gpu/src/gmxlib/nonbonded/nb_kernel_x86_64_sse/nb_kernel400_x86_64_sse.c:629:32: error: ‘gmx_invsqrt_exptab’ undeclared (first use in this function)
/home/me/tmp/gromacs_gpu/src/gmxlib/nonbonded/nb_kernel_x86_64_sse/nb_kernel400_x86_64_sse.c:629:32: note: each undeclared identifier is reported only once for each function it appears in
/home/me/tmp/gromacs_gpu/src/gmxlib/nonbonded/nb_kernel_x86_64_sse/nb_kernel400_x86_64_sse.c:629:59: error: ‘gmx_invsqrt_fracttab’ undeclared (first use in this function)
make[3]: *** [src/gmxlib/CMakeFiles/gmx.dir/nonbonded/nb_kernel_x86_64_sse/nb_kernel400_x86_64_sse.c.o] Error 1
make[2]: *** [src/gmxlib/CMakeFiles/gmx.dir/all] Error 2
make[1]: *** [src/kernel/CMakeFiles/mdrun.dir/rule] Error 2
make: *** [mdrun] Error 2

Solution: instead of just using cmake, use cmake ../gromacs_gpu/ -DGMX_OPENMM=ON -DGMX_THREADS=OFF


1 comment:

  1. Other people have had success with the method above -- http://www.mail-archive.com/gmx-users@gromacs.org/msg47525.html

    That post also suggests what sounds like an arguably simpler method.