23 May 2013

430. Strange issue with NWChem, openmpi, SGE and ECCE

This one's a bit odd.

Odd in the sense that

  • the math libs (acml) I'm using should be suitable for the processors that I'm using them for.
  • it only happens when I submit with ECCE + SGE. Calcs on the input files are fine if I launch them by hand



The problem:
I'm having issues launching jobs on two nodes where the nwchem 6.3 binaries were compiled against acml 5.3.1 (gfortran, int64). I'm launching the jobs from ECCE, and SGE has been set up and working for a long time. My two other nodes, an i5-2400 linked against openblas and an AMD FX 8150 linked against acml 5.3.1 (gfortran, fma4, int64), work absolutely fine.

Both binaries were linked with acml using
export BLASOPT="-L/opt/acml/acml5.3.1/gfortran64_int64/lib -lacml"
export LIBRARY_PATH="$LIBRARY_PATH:/usr/lib/openmpi/lib:/opt/acml/acml5.3.1/gfortran64_int64/lib"

The first node is an AMD Phenom II X6 1055T, while the second one is an ancient, recently revived AMD Athlon 64 X2 3800+. The acml util cpuid.exe gives
Chip manufacturer: AuthenticAMD
AuthenticAMD family 15 extended family 1 model 10
Model Name: AMD Phenom(tm) II X6 1055T Processor
Chip supports SSE
Chip supports SSE2
Chip supports SSE3
Chip does not support AVX
Chip does not support FMA3
Chip does not support FMA4
and
Model Name: AMD Athlon(tm) 64 X2 Dual Core Processor 3800+
Chip supports SSE
Chip supports SSE2
Chip supports SSE3
Chip does not support AVX
Chip does not support FMA3
Chip does not support FMA4
respectively. On the AMD Phenom II X6 1055T I kept getting
Scaling coordinates for geometry "geometry" by 1.889725989
(inverse scale = 0.529177249)
0:Illegal Instruction error, status=: 4
(rank:0 hostname:boron pid:12386):ARMCI DASSERT fail. ../../ga-5-2/armci/src/common/signaltrap.c:SigIllHandler():276 cond:0

On the Athlon 64 X2 3800+ the job would just exit at
Directory information
---------------------
0 permanent = .
0 scratch = /home/me/scratch
There would be no other errors (in e.g. .po or .o files).
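
An "Illegal Instruction" generally means the process hit an opcode that the CPU doesn't implement, so it may be worth confirming the instruction-set flags from inside the same environment the job runs in. What follows is only a minimal sketch, assuming a Linux node with /proc/cpuinfo. has_cpu_flag is a made-up helper name, and note that Linux lists SSE3 as pni and FMA3 as fma:

```shell
# has_cpu_flag FLAG [CPUINFO_FILE]  (hypothetical helper, not part of acml or nwchem)
# Succeeds if FLAG is listed on the first "flags" line of the cpuinfo file.
has_cpu_flag() {
    flag="$1"
    cpuinfo="${2:-/proc/cpuinfo}"
    # Split the flags line into one token per line, then match the flag exactly.
    grep -m1 '^flags' "$cpuinfo" 2>/dev/null | tr ' \t' '\n\n' | grep -xF "$flag" >/dev/null
}

# Linux calls SSE3 "pni" and FMA3 "fma".
for f in pni avx fma fma4; do
    if has_cpu_flag "$f"; then echo "$f: yes"; else echo "$f: no"; fi
done
```

Dropping the loop into the submitted job script and into an interactive shell on the same node should give the same answers. If it doesn't, the job isn't running where you think it is.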

If I launch the job by hand, e.g.
mpirun -n 6 nwchem nwch.nw
it works fine.
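
Since the same input works by hand but not via ECCE + SGE, one thing that might be worth ruling out is that the two routes pick up different binaries or libraries. A rough sketch, assuming nothing beyond a POSIX shell (env_snapshot and the file names are made up for illustration):

```shell
# env_snapshot FILE  (hypothetical helper)
# Record the settings most likely to decide which nwchem binary and which
# shared libraries a job ends up using.
env_snapshot() {
    {
        echo "host=$(uname -n)"
        echo "PATH=$PATH"
        echo "LD_LIBRARY_PATH=${LD_LIBRARY_PATH:-}"
        echo "nwchem=$(command -v nwchem || echo 'not found')"
        echo "mpirun=$(command -v mpirun || echo 'not found')"
    } > "$1"
}

# Run once from the interactive shell...
env_snapshot "${TMPDIR:-/tmp}/env.byhand"
# ...and once more from inside the submitted job script (e.g. writing env.sge),
# then compare the two files with diff.
```

Any difference in the nwchem= or LD_LIBRARY_PATH= lines would be a good place to start looking.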



The partial solution
The errors on the AMD Phenom II X6 1055T went away when I used openblas instead of acml:
export BLASOPT="-L/opt/openblas/lib -lopenblas"
export LIBRARY_PATH="$LIBRARY_PATH:/usr/lib/openmpi/lib:/opt/openblas/lib"

See e.g. http://verahill.blogspot.com.au/2013/05/424-nwchem-63-on-debian-wheezy.html for general compilation instructions.
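
To confirm that a rebuilt binary really picked up openblas rather than acml, the output of ldd can be filtered. This is a sketch only: it assumes the BLAS was linked dynamically (a static link won't show up in ldd at all), and linked_blas is a made-up helper name:

```shell
# linked_blas  (hypothetical helper)
# Reads `ldd` output on stdin and prints the acml/openblas lines, if any.
linked_blas() {
    grep -E 'libacml|libopenblas' || echo "no acml/openblas shared library listed"
}

# Only meaningful where nwchem is on the PATH.
if command -v nwchem >/dev/null 2>&1; then
    ldd "$(command -v nwchem)" | linked_blas
fi
```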

The odd thing:
With openblas the AMD Athlon X2 3800+ suddenly gives
Scaling coordinates for geometry "geometry" by 1.889725989
(inverse scale = 0.529177249)
0:Illegal Instruction error, status=: 4
(rank:0 hostname:beryllium pid:9267):ARMCI DASSERT fail. ../../ga-5-2/armci/src/common/signaltrap.c:SigIllHandler():276 cond:0

429. Briefly: Wine, libGL/libGLU, Blender and nvidia -- 3D acceleration under Wine

The short version:
sudo apt-get install libgl1-nvidia-glx:i386

The longer version:
I looked into the possibility of running the 32 bit Windows version of Blender under wine following a series of comments on this post: http://verahill.blogspot.com.au/2013/05/416-wine-1530-in-chroot.html?showComment=1369252731918#c3560166574961895965

This particular system has a GeForce GT 430. It's running Debian Wheezy 64 bit, and has 32 bit wine 1.5.30 compiled as shown in this post: http://verahill.blogspot.com.au/2013/05/416-wine-1530-in-chroot.html


False start
First I downloaded and installed Blender. Running it using
wine ~/.wine/drive_c/Program\ Files/Blender\ Foundation/Blender/blender.exe
led to
err:ole:CoGetClassObject class {24e669e1-e90f-4595-a012-b0fd3ccc5c5a} not registered
err:ole:CoGetClassObject no class object {24e669e1-e90f-4595-a012-b0fd3ccc5c5a} could be created for context 0x1
err:module:load_builtin_dll failed to load .so lib for builtin L"GLU32.dll": libGL.so.1: cannot open shared object file: No such file or directory
err:module:import_dll Loading library GLU32.dll (which is needed by L"C:\\Program Files\\Blender Foundation\\Blender\\blender.exe") failed (error c000007a).
err:module:LdrInitializeThunk Main exe initialization for L"C:\\Program Files\\Blender Foundation\\Blender\\blender.exe" failed, status c0000135
Using locate I found something that sounded right-ish
sudo ln -s /usr/lib/i386-linux-gnu/libGLU.so.1 /lib32/libGLU.so.1

Tried again:
wine: Call from 0x7b83c562 to unimplemented function ntoskrnl.exe.IoAssignResources, aborting
wine: Unimplemented function ntoskrnl.exe.IoAssignResources called at address 0x7b83c562 (thread 003c), starting debugger...
err:module:load_builtin_dll failed to load .so lib for builtin L"GLU32.dll": libGL.so.1: cannot open shared object file: No such file or directory
err:module:import_dll Loading library GLU32.dll (which is needed by L"C:\\Program Files\\Blender Foundation\\Blender\\blender.exe") failed (error c000007a).
err:module:LdrInitializeThunk Main exe initialization for L"C:\\Program Files\\Blender Foundation\\Blender\\blender.exe" failed, status c0000135
The complaint is still about libGL.so.1, so one more symlink:
sudo ln -s /usr/lib/mesa-diverted/i386-linux-gnu/libGL.so.1.2 /lib32/libGL.so.1

And tried again
wine: Call from 0x7b83c562 to unimplemented function ntoskrnl.exe.IoAssignResources, aborting
wine: Unimplemented function ntoskrnl.exe.IoAssignResources called at address 0x7b83c562 (thread 003c), starting debugger...
err:module:load_builtin_dll failed to load .so lib for builtin L"GLU32.dll": libGL.so.1: cannot open shared object file: No such file or directory
err:module:import_dll Loading library GLU32.dll (which is needed by L"C:\\Program Files\\Blender Foundation\\Blender\\blender.exe") failed (error c000007a).
err:module:LdrInitializeThunk Main exe initialization for L"C:\\Program Files\\Blender Foundation\\Blender\\blender.exe" failed, status c0000135
OK, so /lib32 isn't in the ld path:
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/lib32

And tried again.
ALSA lib pcm.c:2217:(snd_pcm_open_noupdate) Unknown PCM default
ALSA lib dlmisc.c:254:(snd1_dlobj_cache_get) Cannot open shared library /usr/lib/i386-linux-gnu/alsa-lib/libasound_module_pcm_pulse.so
err:winediag:X11DRV_WineGL_InitOpenglInfo Direct rendering is disabled, most likely your OpenGL drivers haven't been installed correctly (using GL renderer "GeForce GT 430/PCIe/SSE2", version "1.4 (2.1.2 NVIDIA 304.88)").
Writing: /tmp/\blender.crash.txt
Fine. Back to the drawing board.
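
In hindsight, asking the loader what it knows about is quicker than symlinking by trial and error. A minimal sketch, assuming the usual `ldconfig -p` output on a 64 bit Debian system, where 64 bit entries are tagged x86-64 and 32 bit ones aren't. has_32bit_lib is a made-up helper name, and the loader cache doesn't cover directories that are only in LD_LIBRARY_PATH:

```shell
# has_32bit_lib SONAME  (hypothetical helper)
# Reads `ldconfig -p` output on stdin and succeeds if SONAME has at least one
# entry that is not tagged x86-64, i.e. one that a 32 bit wine could load.
has_32bit_lib() {
    tab=$(printf '\t')
    grep -F "${tab}$1 (" | grep -v 'x86-64' | grep . >/dev/null
}

for lib in libGL.so.1 libGLU.so.1; do
    if /sbin/ldconfig -p 2>/dev/null | has_32bit_lib "$lib"; then
        echo "$lib: 32 bit copy known to the loader"
    else
        echo "$lib: no 32 bit copy in the loader cache"
    fi
done
```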

(Btw, the libasound_module_pcm_pulse.so message could in principle be fixed by installing libasound2-plugins:i386, but doing so would lead to broken packages.)

Solution:
aptitude show libgl1-nvidia-glx-ia32
Package: libgl1-nvidia-glx-ia32
New: yes
[..]
Replaces: nvidia-glx-ia32 (< 195.36.31), nvidia-glx-ia32 (< 195.36.31)
Description: please switch to multiarch libgl1-nvidia-glx:i386

Half the time these things break when installing the i386 packages, but sure, let's try:
sudo apt-get install libgl1-nvidia-glx:i386
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following package was automatically installed and is no longer required:
  libgl1-nvidia-alternatives-ia32
Use 'apt-get autoremove' to remove it.
Recommended packages:
  libxvmcnvidia1:i386
The following packages will be REMOVED:
  libgl1-nvidia-glx-ia32
The following NEW packages will be installed:
  libgl1-nvidia-glx:i386
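
The :i386 suffix only resolves if i386 has been enabled as a foreign architecture in dpkg. It already was on this system, but on a fresh install it may be worth checking first. A small sketch (has_foreign_arch is a made-up helper name):

```shell
# has_foreign_arch ARCH  (hypothetical helper)
# Reads `dpkg --print-foreign-architectures` output on stdin (one architecture
# per line) and succeeds if ARCH is among them.
has_foreign_arch() {
    grep -xF "$1" >/dev/null
}

if command -v dpkg >/dev/null 2>&1; then
    if dpkg --print-foreign-architectures | has_foreign_arch i386; then
        echo "i386 is enabled"
    else
        echo "i386 is not enabled; 'sudo dpkg --add-architecture i386' followed by 'sudo apt-get update' would enable it"
    fi
fi
```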

Try again:
wine ~/.wine/drive_c/Program\ Files/Blender\ Foundation/Blender/blender.exe

It works!

Note that this means that you can run Facio 17.1.1 on Wine as well (it's an unofficial GUI for e.g. GAMESS US/WinGAMESS, Firefly, etc.)

22 May 2013

428. system-config-kickstart on debian

I don't know much about kickstart, but you can compile the Red Hat tool using the Canonical-patched sources. This is in response to this post: http://forums.debian.net/viewtopic.php?f=17&t=104286

Note that:
1. The Debian way to create pre-configured installations is using Preseed
2. Debian has python-pykickstart for those who want kickstart.
3. I haven't tested system-config-kickstart beyond making sure that it runs

Dependencies and preparation:
sudo apt-get install build-essential gfortran checkinstall python-all-dev cdbs debhelper quilt intltool python-central rarian-compat pkg-config gnome-doc-utils samba python-libuser libuser1 python-glade2 console-setup hwdata python-apt
sudo apt-get install isoquery
mkdir ~/tmp
cd ~/tmp

localechooser
cd ~/tmp
wget http://archive.ubuntu.com/ubuntu/pool/main/l/localechooser/localechooser_2.49ubuntu4.tar.gz
tar xvf localechooser_2.49ubuntu4.tar.gz
cd localechooser/
dpkg-buildpackage -uc -us
sudo dpkg -i ../localechooser-data_2.49ubuntu4_all.deb 

system-config-kickstart
cd ~/tmp
wget http://archive.ubuntu.com/ubuntu/pool/main/s/system-config-kickstart/system-config-kickstart_2.5.20.orig.tar.gz
tar xvf system-config-kickstart_2.5.20.orig.tar.gz
wget http://archive.ubuntu.com/ubuntu/pool/main/s/system-config-kickstart/system-config-kickstart_2.5.20-0ubuntu22.diff.gz
gunzip system-config-kickstart_2.5.20-0ubuntu22.diff.gz
patch -p0 < system-config-kickstart_2.5.20-0ubuntu22.diff
cd system-config-kickstart-2.5.20/
dpkg-buildpackage -uc -us
sudo dpkg -i ../system-config-kickstart_2.5.20-0ubuntu22_all.deb

Edit line 46 in /usr/share/system-config-kickstart/packageGroupList.py so that it reads
availparse = apt_pkg.TagFile(availfile)

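ParseTagFile is deprecated in Debian's python-apt in favour of TagFile, and the Step and Section names are now lower case, so you'll need to do a bit of impromptu patching. The sed commands below also take care of the line 46 edit: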
sudo sed -i 's/ParseTagFile/TagFile/g' /usr/share/system-config-kickstart/*.py
sudo sed -i 's/availparse.Step/availparse.step/g' /usr/share/system-config-kickstart/*.py
sudo sed -i 's/availparse.Section/availparse.section/g' /usr/share/system-config-kickstart/*.py
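
If you'd rather keep backups and see what the substitutions do before touching /usr/share, the same three renames can be wrapped up and tried on a scratch file first. This is just a sketch: patch_apt_pkg_names is a made-up name and the sample lines are invented for illustration, not copied from packageGroupList.py:

```shell
# patch_apt_pkg_names FILE...  (hypothetical helper)
# Applies the three renames in place, keeping a .bak copy of each file.
# The dots are escaped so that they only match a literal '.'.
patch_apt_pkg_names() {
    sed -i.bak \
        -e 's/ParseTagFile/TagFile/g' \
        -e 's/availparse\.Step/availparse.step/g' \
        -e 's/availparse\.Section/availparse.section/g' \
        "$@"
}

# Try it on a scratch file; these sample lines are made up.
scratch=$(mktemp)
printf 'availparse = apt_pkg.ParseTagFile(availfile)\nwhile availparse.Step():\n    name = availparse.Section["Package"]\n' > "$scratch"
patch_apt_pkg_names "$scratch"
cat "$scratch"
rm -f "$scratch" "$scratch.bak"
```

On the real files the function would have to be run from a root shell, since sudo can't see shell functions.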

Then start with
gksu system-config-kickstart