26 April 2013

397. Briefly: compiling ck (Con Kolivas) kernel 3.8.9 in arch linux


mkdir ~/tmp
cd ~/tmp 
wget http://www.kernel.org/pub/linux/kernel/v3.0/linux-3.8.9.tar.bz2
tar xvf linux-3.8.9.tar.bz2
cd linux-3.8.9/
wget http://ck.kolivas.org/patches/3.0/3.8/3.8-ck1/patch-3.8-ck1.bz2
bunzip2 patch-3.8-ck1.bz2
patch -p1 < patch-3.8-ck1
cp /proc/config.gz .
gunzip config.gz
mv config .config
make oldconfig
make -j2
make -j2 modules
sudo make modules_install
sudo make headers_install INSTALL_HDR_PATH=/usr/src/linux-3.8.9-ck1-ARCH
sudo cp arch/x86_64/boot/bzImage /boot/vmlinuz-3.8.9-ck1-ARCH
sudo cp System.map /boot/System-3.8.9-ck1-ARCH.map
sudo mkinitcpio -k 3.8.9-ck1-ARCH -c /etc/mkinitcpio.conf -g /boot/initramfs-3.8.9-ck1-ARCH.img
sudo grub-mkconfig -o /boot/grub/grub.cfg

396. Compiling gromacs 4.6 with gpu support, openblas and fftw3 on debian wheezy

NOTE: with ACML my performance on my FX8150 and FX8350 nodes is only 25% of that with Openblas (double precision). Yes, for some reason gromacs is four times faster with openblas than with the machine vendor libraries in my tests.

Here are the release notes: http://www.gromacs.org/About_Gromacs/Release_Notes/Versions_4.6.x
As far as I understand you don't have to rely on openmm anymore for CUDA. Yes, the PITA of compiling openmm is gone!

Note that GPU calcs only speed things up under certain, specific conditions  -- and not all nvidia cards are supported (or equal). My own set-up, using statically cooled graphics cards, is definitely not appropriate for a GPU cluster. Once nwchem comes out with GPU support I might upgrade to fancier $200 graphics cards (maybe COSMO in NWChem will finally become more reasonable in terms of computational cost), but there's little reason for that at the moment.

Not all cards are created equal either -- e.g. GT210, which has GPU compute capability 1.2, is too poor to run with gromacs. GT430 (compute cap GT430) works. Both are obviously not viable for professional work.

Also note that it seems that you still need to use OPENMM if you want GPU support for implicit solvation.

Gromacs used to be easy to install. It's become a fair bit more complicated between 4.5.5 and 4.6. See here for gromacs 4.5.5: http://verahill.blogspot.com.au/2012/05/gromacs-with-external-fftw3-and-blas-on.html

CUDA: If you want to build with cuda you need gcc-4.6, which is still available in the wheezy repos. 4.7 won't work. Luckily, you can have both on your system, but you'll need to specify CC and CXX as shown below.

Openblas
Note that the links to the openblas file tends to die after a while, so you might have to download it manually.

sudo mkdir /opt/openblas
sudo chown ${USER} /opt/openblas
cd ~/tmp
wget http://github.com/xianyi/OpenBLAS/tarball/v0.2.6
tar xvf v0.2.6
cd xianyi-OpenBLAS-87b4d0c/
wget http://www.netlib.org/lapack/lapack-3.4.1.tgz
make all BINARY=64 CC=/usr/bin/gcc FC=/usr/bin/gfortran USE_THREAD=0 INTERFACE64=1 1> make.log 2>make.err
make PREFIX=/opt/openblas install
cp lib*.*  /opt/openblas/lib

add
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/opt/openblas/lib
to your ~/.bashrc [for later use with nwchem and ecce, add /opt/openblas/lib to /etc/ld.so.conf and do sudo ldconfig -- you might want to make libopenblas.so and libopenblas.so.0 sym links to the main lib, libopenblas_bulldozer-r0.2.6.so]

single-precision gromacs 4.6 with both CPU and GPU

CUDA
If you have an nvidia card and want to enable GPU calcs, do
sudo apt-get install nvidia-cuda-toolkit gcc-4.6 g++-4.6

If /usr/lib/libcuda.so is nothing by a symmlink to /usr/lib/libcuda.so.1, and the file /usr/lib/libcuda.so.1 is missing (this was the case on my wheezy amd64), then do
sudo rm /usr/lib/libcuda.so
sudo ln -s /usr/lib/x86_64-linux-gnu/libcuda.so.1 /usr/lib/libcuda.so

You can also simply make sure that there/s no /usr/lib/libcuda.so.

Continue with the gromacs compilation:
cd ~/tmp
sudo apt-get install cmake
wget ftp://ftp.gromacs.org/pub/gromacs/gromacs-4.6.tar.gz
tar xvf gromacs-4.6.tar.gz
mkdir build_gromacs46
cd build_gromacs46
sudo mkdir /opt/gromacs
sudo chown ${USER} /opt/gromacs
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/lib/openmpi/lib:/opt/openblas/lib
export LDFLAGS="-L/opt/openblas/lib -lopenblas"
export CPPFLAGS="-I/opt/openblas/include"
export CC=/usr/bin/gcc-4.6 && export CXX=/usr/bin/g++-4.6 && cmake -DGMX_FFT_LIBRARY=fftw3 -DGMX_BUILD_OWN_FFTW=On -DGMX_DOUBLE=off -DCMAKE_INSTALL_PREFIX=/opt/gromacs/gromacs4.6_single -DGMX_EXTERNAL_BLAS=/opt/openblas/lib ../gromacs-4.6
make
make install

Note: for acml I used this instead:

export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/lib/openmpi/lib:/opt/acml/acml5.2.0/gfortran64_fma4_int64/lib
export LDFLAGS="-L/opt/acml/acml5.2.0/gfortran64_fma4_int64/lib -lacml"
export CPPFLAGS="-I/opt/acml/acml5.2.0/gfortran64_fma4_int64/include"
export CC=/usr/bin/gcc-4.6 && export CXX=/usr/bin/g++-4.6 && cmake -DGMX_FFT_LIBRARY=fftw3 -DGMX_BUILD_OWN_FFTW=On -DGMX_DOUBLE=off -DCMAKE_INSTALL_PREFIX=/opt/gromacs/gromacs4.6_single -DGMX_EXTERNAL_BLAS=/opt/acml/acml5.2.0/gfortran64_fma4_int64/lib ../gromacs-4.6


Double-precision gromacs without GPU acceleration:

cd ~/tmp/build_gromacs46
rm * -rf
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/lib/openmpi/lib:/opt/openblas/lib
export LDFLAGS="-L/opt/openblas/lib -lopenblas"
export CPPFLAGS="-I/opt/openblas/include"
export CC=/usr/bin/gcc-4.6 && export CXX=/usr/bin/g++-4.6 && cmake -DGMX_FFT_LIBRARY=fftw3 -DGMX_BUILD_OWN_FFTW=On -DGMX_DOUBLE=on -DGMX_GPU=off -DCMAKE_INSTALL_PREFIX=/opt/gromacs/gromacs4.6_double -DGMX_EXTERNAL_BLAS=/opt/openblas/lib ../gromacs-4.6
make
make install

Note: for acml I used this instead:
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/lib/openmpi/lib:/opt/acml/acml5.2.0/gfortran64_fma4_int64/lib
export LDFLAGS="-L/opt/acml/acml5.2.0/gfortran64_fma4_int64/lib -lacml"
export CPPFLAGS="-I/opt/acml/acml5.2.0/gfortran64_fma4_int64/include"
export CC=/usr/bin/gcc-4.6 && export CXX=/usr/bin/g++-4.6 && cmake -DGMX_FFT_LIBRARY=fftw3 -DGMX_BUILD_OWN_FFTW=On -DGMX_DOUBLE=on -DGMX_GPU=off -DCMAKE_INSTALL_PREFIX=/opt/gromacs/gromacs4.6_double -DGMX_EXTERNAL_BLAS=/opt/acml/acml5.2.0/gfortran64_fma4_int64/lib ../gromacs-4.6

Add gromacs to path:
echo 'export PATH=$PATH:/opt/gromacs/gromacs4.6_single/bin:/opt/gromacs/gromacs4.6_double/bin' >> ~/.bashrc

Switching between GPU and CPU
You can use the same binary for both, but remember that only the single precision binaries have GPU support to begin with. To set gpu vs cpu, use the -nb option in mdrun:
-nb enum auto Calculate non-bonded interactions on: auto, cpu, gpu or gpu_cpu


Quick test:
cd ~/tmp
wget http://www.gromacs.org/@api/deki/files/128/=gromacs-gpubench-dhfr.tar.gz
tar xvf \=gromacs-gpubench-dhfr.tar.gz
cd dhfr/GPU/dhfr-solv-PME.bench
mdrun -nb cpu -s topol.tpr -testverlet

Hit ctrl+c to stop and get statistics. Then try
mdrun -nb gpu -s topol.tpr -testverlet

I got
XPU ns/day -------------- auto 7.4 GPU 7.7 CPU 4.1 gpu_cpu 7.5

where I have a 3 core 3.1 GHz AMD Athlon II X3 445 CPU and an NVIDIA GeForce GT 430 graphics card -- neither of which is anything special.

Note also that the ns/day values depended highly on how long I let the calc run, and as I didn't time it and make them run the same amount of time, I suspect that auto, GPU and gpu_cpu are all about the same.

25 April 2013

395. Compiling the ck-kernel 3.8.x on debian (and patching i915)


What is the ck kernel? In short it is a patch for the mainline kernel which is supposed to render it more responsive for desktop use. See here: https://wiki.archlinux.org/index.php/Linux-ck

I'm not going to try to 'benchmark' this against the stock kernel, since I'm not clear what a good metric would be -- how do you measure 'responsiveness'? Come to think of it -- what's a good, objective definition?

Download the kernel version that you need from http://kernel.org and the ck path from Con's web page at http://users.on.net/~ckolivas/kernel/

I also ended up applying this patch to sort out recent i915 issues:
http://verahill.blogspot.com.au/2013/03/368-slow-mouse-and-keyboard-triggered.html
https://bbs.archlinux.org/viewtopic.php?pid=1248190
http://forums.gentoo.org/viewtopic-p-7278760.html
https://bbs.archlinux.org/viewtopic.php?pid=1254285#p1254285

I did it by creating a file called i915.patch and placing it in the kernel source root:
--- drivers/gpu/drm/drm_crtc_helper.c 2013-04-26 10:24:07.987942008 +1000 +++ ../../linux-3.8.8/drivers/gpu/drm/drm_crtc_helper.c 2013-04-17 15:11:28.000000000 +1000 @@ -1067,7 +1067,7 @@ void drm_helper_hpd_irq_event(struct drm enum drm_connector_status old_status; bool changed = false; - if (!dev->mode_config.poll_enabled || !drm_kms_helper_poll) + if (!dev->mode_config.poll_enabled) return; mutex_lock(&dev->mode_config.mutex);
Note that as far as I know it doesn't solve the issue, but it does allow you to disable polling by editing /etc/default/grub and adding
GRUB_CMDLINE_LINUX_DEFAULT="quiet drm_kms_helper.poll=N"
followed by running
sudo update-grub

So far it's working fine (after testing for ca one hour).

Anyway, here's kernel 3.8.5 (should work on all 3.8.x):
sudo apt-get install kernel-package fakeroot build-essential ncurses-dev
mkdir ~/tmp
cd ~/tmp
wget https://www.kernel.org/pub/linux/kernel/v3.x/linux-3.8.5.tar.xz
tar xvf linux-3.8.5.tar.xz
cd linux-3.8.5/
wget http://ck.kolivas.org/patches/3.0/3.8/3.8-ck1/patch-3.8-ck1.bz2
bunzip2 patch-3.8-ck1.bz2
patch -p1 < patch-3.8-ck1
patching file arch/powerpc/platforms/cell/spufs/sched.c patching file Documentation/scheduler/sched-BFS.txt patching file Documentation/sysctl/kernel.txt patching file fs/proc/base.c patching file include/linux/init_task.h patching file include/linux/ioprio.h patching file include/linux/sched.h patching file init/Kconfig patching file init/main.c patching file kernel/delayacct.c patching file kernel/exit.c patching file kernel/posix-cpu-timers.c patching file kernel/sysctl.c patching file lib/Kconfig.debug patching file include/linux/jiffies.h patching file drivers/cpufreq/cpufreq.c patching file drivers/cpufreq/cpufreq_ondemand.c patching file drivers/cpufreq/cpufreq_conservative.c patching file kernel/sched/bfs.c patching file kernel/sched/Makefile patching file include/uapi/linux/sched.h patching file include/linux/swap.h patching file mm/memory.c patching file mm/swapfile.c patching file mm/vmscan.c patching file arch/x86/Kconfig patching file kernel/Kconfig.hz patching file kernel/Kconfig.preempt patching file Makefile
patch -p0 -R < i915.patch
patching file drivers/gpu/drm/drm_crtc_helper.c
make-kpgk clean cat /boot/config-`uname -r`>.config make oldconfig time fakeroot make-kpkg -j4 --initrd kernel_image kernel_headers sudo dpkg -i ../linux*3.8.5-ck*.deb

where 4 is the number of cores on your machine (note: it only has to do with compiling -- you can use the compiled binaries on any number of cores).

That's it. Now all you have to do is reboot and you'll be using your new kernel. Simple, eh?