23 July 2013

483. MS data, part III: generating a matrix by combining several spectra, and plotting it in gnuplot

This post is primarily intended for two particular students. However, the problem it addresses is something that a lot of spectrometrists/scopists who want to take their data presentation to the next level have encountered.

My presumption:
You're running linux.

You've already exported your data as csv files as shown in this post: http://verahill.blogspot.com.au/2013/07/474-exporting-data-from-wsearch32-and.html

In addition, for the specifics in the commands below I will presume that this data is based on a cone voltage sweep from 0 to 300 in 10 volt steps. I thus have a series of files named: 0.csv, 10.csv, 20.csv, ..., 290.csv, 300.csv.

You should be able to easily customize the approach to e.g. time or concentration dependent data.

Let's get started:
0. Pre-reqs
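Make sure you have gawk, sed, xargs, gnuplot, paste and python (with numpy) installed. On debian, xargs is part of findutils and paste is part of coreutils, so do
sudo apt-get install gawk sed findutils coreutils gnuplot python python-numpy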

1. Convert the csv files to dat files
Create the following script and call it csv2dat.sh
#!/bin/bash
for e in 0 10 20 30 40 50 60 70 80 90 100 110 120 130 140 150 160 170 180 190 200 210 220 230 240 250 260 270 280 290 300
do
 tail -n +8 $e.csv | sed 's/\,/\t/g'| gawk '{print $2,$4}' > $e.dat
done
and run it
sh csv2dat.sh

If all went well, you'll have a series of space-separated .dat files (gawk's print joins the two fields with a single space) which contain the m/z and the relative (not absolute) abundance.
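
If you'd rather not use the shell pipeline, the same conversion can be sketched in Python 3. This is only a sketch under a few assumptions: the first seven lines of each csv file are header lines (that is what tail -n +8 skips), m/z and relative abundance sit in columns 2 and 4, and no field is empty or contains spaces (gawk's whitespace splitting would count such columns differently). The names csv_to_dat and convert_all are made up for illustration.

```python
def csv_to_dat(csv_text, skip=7, mz_col=1, ab_col=3):
    """Return 'm/z abundance' lines from the text of one exported csv file.

    skip=7 mirrors 'tail -n +8'. Columns are 0-based here, so 1 and 3
    correspond to gawk's $2 and $4.
    """
    out = []
    for line in csv_text.splitlines()[skip:]:
        fields = line.split(',')
        if len(fields) <= max(mz_col, ab_col):
            continue  # skip blank or short lines
        out.append(fields[mz_col].strip() + ' ' + fields[ab_col].strip())
    return '\n'.join(out) + '\n'


def convert_all(start=0, stop=300, step=10):
    """Convert 0.csv, 10.csv, ..., 300.csv in the current directory."""
    for v in range(start, stop + 1, step):
        with open('%d.csv' % v) as src, open('%d.dat' % v, 'w') as dst:
            dst.write(csv_to_dat(src.read()))
```

Calling convert_all() in the directory holding the csv files should produce the same .dat files as csv2dat.sh, provided the assumptions above hold for your export.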

2. Extract ALL m/z values from all files
Create a file called homogenize.sh and put the following in it:
#!/bin/bash
for e in {0..200..10} {210..300..10}
do
 cat $e.dat | gawk '{print $1}'
 echo ""
done
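We'll run the homogenize script (it does nothing of the sort though), and then use the unix tools sort and uniq to sort the m/z values in reverse numerical order and to get rid of all non-unique values. The script uses brace expansion, so run it with bash rather than sh (which is dash on debian). Also, uniq only removes adjacent duplicates, so the sorting has to come first:
bash homogenize.sh > allmz.dat
sort -gr allmz.dat > temp.dat
uniq temp.dat mz.dat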
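
For comparison, here is a small Python 3 sketch of the same idea. It collects the values in a set, so the order in which duplicates appear doesn't matter (uniq on its own only collapses duplicates that sit on adjacent lines). The function name unique_mz is hypothetical.

```python
def unique_mz(lines):
    """Return the distinct m/z values in `lines`, sorted from high to low."""
    values = set()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # blank separator lines written by homogenize.sh
        values.add(float(line))
    return sorted(values, reverse=True)
```

It could be used as unique_mz(open('allmz.dat')), writing one value per line to mz.dat.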

3. Pad the data with zeroes
Create a file called makelist.py, and put the following in it. Watch out for tab lengths etc. It's written for python 2.x, and probably won't work under python 3. It was also hacked together from an earlier script which didn't quite work the way I hoped it would.

#!/usr/bin/env python
import sys
from numpy import linspace
infile=sys.argv[1]

f=open(infile,'r')
arr=[]

print "Read %s" %infile
for line in f:
 line=line.rstrip('\n')
 try:
  arr+=[round(float(line),3)]
 except:
  pass
  #print line
f.close()

mylist=arr
mylist.sort(reverse=True)

print "Calculating spacing"
spacing=1.0
old=max(mylist)
for i in range(0,len(mylist)):
 if round(abs(old-mylist[i]),3)<spacing and not (abs(old-mylist[i])==0):
  spacing=round(abs(old-mylist[i]),3)  
 old=mylist[i]

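# linspace needs an integer number of points
values=int(round(1+(max(mylist)-min(mylist))/spacing))
print "Max, min, resolution: ",max(mylist),min(mylist),spacing
# round the grid so that it compares equal to the rounded m/z values read below
completelist=[round(x,3) for x in linspace(max(mylist),min(mylist),values)]
mylist=completelist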

voltages=[0,10,20,30,40,50,60,70,80,90,100,110,120,130,140,150,160,170,180,190,200,210,220,230,240,250,260,270,280,290,300]
for n in voltages:
 print "voltage: ",n,'\n'
 # start from zeroes for every voltage so that intensities don't carry over from the previous file
 myys=[0]*len(mylist)
 f=open(str(n)+'.dat','r')
 g=open(str(n)+'pad.dat','w')
 arrx=[]
 arry=[]

 for line in f:
  line=line.rstrip('\n')
  line=line.split(' ')
  try:
   line[0]=round(float(line[0]),3)
   line[1]=float(line[1])
   arrx+=[line[0]]
   arry+=[line[1]]
  except:
   pass

 # range(0,len(x)) covers every element, including the last one
 for i in range(0,len(arrx)):
  try:
   myys[mylist.index(arrx[i])]=arry[i]
  except ValueError:
   # this m/z value is not on the grid; skip it
   pass

 for i in range(0,len(myys)):
  g.write(str(myys[i])+'\n')
 f.close()
 g.close()

h=open('mz.x','w')

for i in range(0,len(mylist)):
 h.write(str(mylist[i])+'\n')
h.close()

Run
python makelist.py allmz.dat

Getting 'fail' messages is ok -- most likely it's due to an empty line. You can check that everything worked out by doing e.g.
wc 0pad.dat 220pad.dat

The numbers in the first column should be the same if the files have the same number of lines.
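
The padding step boils down to placing each measured (m/z, intensity) pair on a regular grid and leaving every other grid point at zero. Below is a minimal Python 3 sketch of that idea. It computes the grid position arithmetically instead of searching a list, which avoids floating point equality tests. pad_spectrum and its parameters are hypothetical names and are not part of makelist.py.

```python
def pad_spectrum(points, mz_max, mz_min, spacing):
    """Place (m/z, intensity) pairs on a regular grid running from mz_max
    down to mz_min in steps of `spacing`. Unmeasured positions stay 0.0."""
    n = int(round((mz_max - mz_min) / spacing)) + 1
    padded = [0.0] * n
    for mz, intensity in points:
        i = int(round((mz_max - mz) / spacing))  # 0 is the highest m/z
        if 0 <= i < n:
            padded[i] = intensity
    return padded
```

Every spectrum padded with the same mz_max, mz_min and spacing ends up with the same number of rows, which is what the wc check above verifies.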

4. Make a matrix
Paste all the ms data side-by-side.
paste 0pad.dat 10pad.dat 20pad.dat 30pad.dat 40pad.dat 50pad.dat 60pad.dat 70pad.dat 80pad.dat 90pad.dat 100pad.dat 110pad.dat 120pad.dat 130pad.dat 140pad.dat 150pad.dat 160pad.dat 170pad.dat 180pad.dat 190pad.dat 200pad.dat 210pad.dat 220pad.dat 230pad.dat 240pad.dat 250pad.dat 260pad.dat 270pad.dat 280pad.dat 290pad.dat 300pad.dat > allpad.dat
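
Typing out 31 file names invites typos. As a convenience, the command line can be generated instead. This is a Python 3 sketch, and paste_command is a made-up helper name; it only builds the string and does not run anything.

```python
def paste_command(start=0, stop=300, step=10, out='allpad.dat'):
    """Build the paste command line for the padded files 0pad.dat ... 300pad.dat."""
    names = ['%dpad.dat' % v for v in range(start, stop + 1, step)]
    return 'paste ' + ' '.join(names) + ' > ' + out
```

print(paste_command()) gives a line that can be copied into the shell.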

5. Rotate the matrix
Create a script called rotate.sh:
gawk '
{
 for (i=1; i<=NF; i++) {
  a[NR,i] = $i
 }
}
NF>p { p = NF }
END {
 for(j=1; j<=p; j++) {
  str=a[1,j]
  for(i=2; i<=NR; i++){
   str=str" "a[i,j];
  }
  print str
 }
}' $1
and run
sh rotate.sh allpad.dat > matrix.rot.dat
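
The rotation is a plain matrix transpose. A Python 3 sketch of the same operation follows, assuming every row has the same number of columns (zip stops at the shortest row, whereas the gawk version pads short rows with empty fields). The function name rotate is hypothetical.

```python
def rotate(text):
    """Transpose a whitespace-separated matrix given as text."""
    rows = [line.split() for line in text.splitlines() if line.strip()]
    return '\n'.join(' '.join(col) for col in zip(*rows)) + '\n'
```

For a file with 0.9 M values this holds everything in memory at once, just like the gawk array does.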

6. Plot using gnuplot
See the following script for an example. Note that plotting in gnuplot using 'matrix' you don't get the benefit of proper axes labels. Instead we do a bit of on-the-fly maths to get the axes right. Specifically:
using (2999.3-($1/10)):($2*10):($3)

means that for the m/z axis ($1) we take the highest value (in our case 2999.3) and subtract 0.1 m/z (our resolution) for each data point. This data is in each row. For the CV axis ($2), which goes down the columns in our matrix.rot.dat, we have thirty-one values (0 to 300 V). Each one corresponds to an increase of 10 V starting at 0 V, hence we multiply by 10. gnuplot counts both matrix indices from 0, so no further offset is needed. $3 is the intensity, which we don't need to fiddle with.
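
To make the on-the-fly maths concrete, here is the same mapping as a small Python 3 function. It is a sketch that assumes, as the gnuplot script below does, that the matrix indices start at 0, that the highest m/z is 2999.3, that the resolution is 0.1 m/z and that the voltage step is 10 V. The name matrix_to_axes is made up.

```python
def matrix_to_axes(col, row, mz_max=2999.3, mz_step=0.1, cv_step=10):
    """Map 0-based matrix indices to (m/z, cone voltage).

    col runs along a row of matrix.rot.dat (m/z, high to low),
    row runs down the file (one row per cone voltage).
    """
    mz = mz_max - col * mz_step
    cv = row * cv_step
    return round(mz, 3), cv
```

The first element of the first row maps to (2999.3, 0) and the eleventh element of the last (31st) row to (2998.3, 300).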

Save the following as cntr.gplt
set term png size 1000,1000
set output 'map.png'
set zrange [-10:110]
set yrange [0:300]
unset surface
set contour base
set cntrparam levels 15
set view 0,0
unset ztics
unset key
splot 'matrix.rot.dat' matrix using (2999.3-($1/10)):($2*10):($3) with lines palette

Running
gnuplot cntr.gplt

VERY SLOWLY (in my case I had 0.9 M data points) gives us

If you're confused as to why the data doesn't go beyond 250 Volt (y axis) it's because I made a mistake at one point.

Changing the ranges a bit we get
And even more zoomed in:

Soon to come as a separate post:
 the same data, but as a stacked plot. Here's what it looks like though:

482. kernel 3.10.2 with CK patch

NOTE: the 304.88 nvidia kernel modules DO NOT BUILD on this kernel. I've also tried 3.10.5 and it also does not work.

NOTE II: I'm getting random slowdowns on my SL410 laptop with intel graphics. Not sure if it's the same issue as this: http://verahill.blogspot.com.au/2013/03/368-slow-mouse-and-keyboard-triggered.html
Once kworker shows up in top everything grinds to a slow crawl.

Nothing odd here. For a list of what questions to expect when going from 3.9 to 3.10, see e.g. http://verahill.blogspot.com.au/2013/07/468-kernel-310-on-debian.html

The CK patch set supposedly improves desktop performance of the kernel. As it seems like Con doesn't update that page anymore, go directly to the patches: http://ck.kolivas.org/patches/3.0/

sudo apt-get install xz-utils kernel-package fakeroot ncurses-dev
mkdir ~/tmp
cd ~/tmp
wget https://www.kernel.org/pub/linux/kernel/v3.x/linux-3.10.2.tar.xz
tar xvf linux-3.10.2.tar.xz
cd linux-3.10.2/
wget http://ck.kolivas.org/patches/3.0/3.10/3.10-ck1/patch-3.10-ck1.bz2
bunzip2 patch-3.10-ck1.bz2
patch -p1 < patch-3.10-ck1
patching file arch/powerpc/platforms/cell/spufs/sched.c
patching file Documentation/scheduler/sched-BFS.txt
patching file Documentation/sysctl/kernel.txt
patching file fs/proc/base.c
patching file include/linux/init_task.h
patching file include/linux/ioprio.h
patching file include/linux/sched.h
patching file init/Kconfig
patching file init/main.c
patching file kernel/delayacct.c
patching file kernel/exit.c
patching file kernel/posix-cpu-timers.c
patching file kernel/sysctl.c
patching file lib/Kconfig.debug
patching file include/linux/jiffies.h
patching file drivers/cpufreq/cpufreq.c
patching file drivers/cpufreq/cpufreq_ondemand.c
patching file kernel/sched/bfs.c
patching file include/uapi/linux/sched.h
patching file include/linux/sched/rt.h
patching file kernel/stop_machine.c
patching file drivers/cpufreq/cpufreq_conservative.c
patching file kernel/sched/Makefile
patching file kernel/time/Kconfig
patching file kernel/Kconfig.preempt
patching file kernel/Kconfig.hz
patching file arch/x86/Kconfig
patching file Makefile
make-kpkg clean
cat /boot/config-`uname -r`>.config
make oldconfig
time fakeroot make-kpkg -j3 --initrd kernel_image kernel_headers
sudo dpkg -i ../*3.10.2-ck*.deb
sudo rm /lib/modules/3.10.2-ck1/build
sudo ln -s /usr/src/linux-headers-3.10.2-ck1/ /lib/modules/3.10.2-ck1/build
sudo dkms autoinstall -k 3.10.2-ck1

22 July 2013

481. A little bit of samba on the command line

I have a bit of a problem with samba currently.

My problem is that my computers are sitting behind a router (on a 192.168.2.0/24 subnet) and the computers that I want to access sit on the university network, to which the router is connected. The address range is, say, 131.172.x.x.

In other words, I (think I) want to use samba across two subnets.

I've opened up ports 137-139,445 to tcp and udp on both the router and in iptables on my desktop.

My problem:
1. I can't see the network shares of the other computers using
   a) nautilus (Network/Windows Network)
   b) nmblookup
   c) sambascanner

2. I can't connect to network shares using their netbios names. For example, I'd like to connect to e.g. smb://avance400/data, but I have to use the IP address instead. For some curious reason not even that works using nautilus.

Workaround:
So here's not a solution, but a workaround.

I can connect to other computers from the command line as long as I know the IP address, and here's how
smbclient //131.172.123.30/data -U myuni/me

If you actually want to mount the share (which is password protected), do
sudo mount -t cifs -o user=me //131.172.123.30/data /media/smbmounts/

where /media/smbmounts belongs to you (e.g. sudo mkdir /media/smbmounts && sudo chown $USER /media/smbmounts).

And that's more or less it.

Some additional information:
If you don't get prompted for the password, and get
mount: block device //131.172.123.30/data is write-protected, mounting read-only
mount: cannot mount block device //131.172.123.30/data read-only

but supplying the password as part of the command line works, then you are missing cifs-utils, so install them.

Note that mount.cifs can read credentials from a special file (passed via the credentials= option), which you chmod to 600. My chief issue with that is that ~/.bash_history has exactly the same permissions (u+rw, go-rwx) and so I don't see how that's any safer than exposing everything by supplying your password as part of the mount command. Both should be avoided if possible.

On the other hand you could argue that since the password is transmitted over the network in cleartext you're inviting trouble either way...




480. MS data, part II. Plotting and comparing with predicted isotopic envelopes

NOTE: I've heard rumours about problems with Matt Monroe's calculator on Windows 7 Home, and on Windows 8. I've heard reports of it working on Windows 7 Professional. Given that wsearch also has issues,  this may be linked to VB.

This post, like this one, is written with two particular students in mind.

MS here stands for Mass Spectrometry.

I'll be presuming that you have exported your data as a csv file as shown in http://verahill.blogspot.com.au/2013/07/474-exporting-data-from-wsearch32-and.html

Our scenario:
So you've exported your data as e.g. data.csv, and you have assigned a signal in your spectrum to a species, and you'd now like to plot the predicted and observed isotopic envelopes in a way that will help you compare them.

The signal we have identified is at 211.90 m/z and we think it belongs to [Ga(CH3OH)2(OH)(NO3)]+.


The Linux way:
You'll need: sed, gawk, gnuplot, pyisocalc or Matt Monroe's Molecular Weight calculator

1. Generating the isotopic envelope:

A. Using pyisocalc:
Set the charge to 1 and output the data to 1.dat, with a gaussian broadening factor of 0.3:
isocalc -f 'Ga(CH3OH)2(OH)(NO3)' -c 1 -o 1.dat -g 0.3

B. Using Matt Monroe's molecular weight calculator
Go to Tools/Isotopic Distribution Modelling
In the spectrum window, go to Edit, Copy Data Points, and paste into e.g. a Gedit window. Save as 1.dat


2. Formatting the data.csv for gnuplot (can skip for spreadsheet programs):
In a single line we remove the first seven lines (tail -n +8 starts printing at line 8), replace all commas (,) with tabs, only keep the m/z and relative isotopic abundance columns (2 and 4) and save the output to data.dat
tail -n +8 data.csv |sed 's/\,/\t/g'|gawk '{print $2,$4}' > data.dat

3. Plotting:

A. Using gnuplot:
Create a file called 1.gplt which contains the gnuplot commands:
set term postscript eps enhanced color
set output '1.eps'
set xrange [206:220]
plot '1.dat' u ($1-0.05):($2*0.092) w lines ti 'Calculated' lc -1 lw 2,\
'data.dat' u 1:2 w lines ti 'Observed' lc 1 lw 2
($1-0.05) means we're offsetting the calculated data by 0.05 m/z. ($2*0.092) means that we're scaling the calculated data intensity to match that of the observed. lc sets line colour and lw sets the width
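
The post doesn't say how the scaling factor was arrived at. One plausible way, and this is an assumption rather than the method used here, is to match the tallest calculated peak to the tallest observed peak within the plotted m/z window. A Python 3 sketch with the hypothetical name scale_factor:

```python
def scale_factor(calc, obs, lo, hi):
    """Ratio of the tallest observed to the tallest calculated peak within
    the m/z window [lo, hi]. Both inputs are lists of (m/z, intensity) pairs."""
    calc_max = max(y for x, y in calc if lo <= x <= hi)
    obs_max = max(y for x, y in obs if lo <= x <= hi)
    return obs_max / calc_max
```

With the contents of 1.dat and data.dat read into lists of pairs, scale_factor(calc, obs, 206, 220) gives a starting value that can then be fine-tuned by eye in the plot command.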


If you want the output as png instead of eps, just change the first two lines to
set term png size 1000,667
set output '1.png'
Using pyisocalc
Using Matthew Monroe's calculator

B. Using QtiPlot
Qtiplot is in the debian repos and is 'origin'-like (as in Microcal Origin).

You'll need to rescale your calculated data first, which is a major drawback:
cat 1.dat|gawk '{print $1-0.05,$2*0.095}'> 1_scaled.dat

Start QtiPlot and select Open. Make sure you select 'all files' as the file type. Open 1_scaled.dat.

Next, make sure that the spreadsheet is active, and go to File, Import, Import Ascii

Change the type of column 3

Select all columns and go to Plot, Line. Change the axes (double click on the axes and set the new ranges), set the top and right axes not to show, edit the titles etc.



The Windows way: 
You'll probably need: excel or open/libreoffice, origin, pyisocalc or Matt Monroe's Molecular Weight calculator

Doing this on windows is a PITA compared to Linux, and I don't have the time to go through it. If you do have Origin, it should be straightforward to translate the instructions above into an MS Win-like environment.

Any scaling will have to be done in Excel or a similar spreadsheet program. Not difficult, but it'll add a few extra steps.

19 July 2013

479. Compiling Wine 1.6 on Debian (using a chroot)

Update:
I noticed
configure: libOSMesa 32-bit development files not found (or too old), OpenGL rendering in bitmaps won't be supported.

popping up at the end of ./configure. I've added a fix for it based on http://forum.winehq.org/viewtopic.php?f=2&t=17713

Original post:
Here's a generic way of building Wine 1.6 which is now stable. And yes, it's the instructions for 1.5.28-1.6-rcX recycled.

See here for information about 3D acceleration using libGL/U with Wine: http://verahill.blogspot.com.au/2013/05/429-briefly-wine-libglliubglu-blender.html

Getting started:
If you set up e.g. a chroot to build 1.5.28 you don't need to set up a new chroot to build 1.6. In that case, skip the set-up step below and instead re-enter your existing chroot like this:

sudo mount -o bind /proc wine32/proc
sudo cp /etc/resolv.conf wine32/etc/resolv.conf
sudo chroot wine32
su sandbox
cd ~/tmp

And skip to 'Building wine'.

Otherwise do this:
Setting up the Chroot
sudo apt-get install debootstrap
mkdir $HOME/tmp/architectures/wine32 -p
cd $HOME/tmp/architectures
sudo debootstrap --arch i386 wheezy $HOME/tmp/architectures/wine32 http://ftp.au.debian.org/debian/
sudo mount -o bind /proc wine32/proc
sudo cp /etc/resolv.conf wine32/etc/resolv.conf
sudo chroot wine32

You're now in the chroot:
apt-get update
apt-get install locales sudo vim
echo 'export LC_ALL="C"'>>/etc/bash.bashrc
echo 'export LANG="C"'>>/etc/bash.bashrc
echo '127.0.0.1 localhost beryllium' >> /etc/hosts
source /etc/bash.bashrc
adduser sandbox
usermod -g sudo sandbox
echo 'Defaults !tty_tickets' >> /etc/sudoers
su sandbox
cd ~/

Replace 'beryllium' with the name of your host system (it's just to suppress error messages)

Building Wine
While still in the chroot, continue (the i386 is ok; don't worry about it -- you don't actually need it):

sudo apt-get install libx11-dev:i386 libfreetype6-dev:i386 libxcursor-dev:i386 libxi-dev:i386 libxxf86vm-dev:i386 libxrandr-dev:i386 libxinerama-dev:i386 libxcomposite-dev:i386 libglu-dev:i386 libosmesa-dev:i386 libglu-dev:i386 libosmesa-dev:i386 libdbus-1-dev:i386 libgnutls-dev:i386 libncurses-dev:i386 libsane-dev:i386 libv4l-dev:i386 libgphoto2-2-dev:i386 liblcms-dev:i386 libgstreamer-plugins-base0.10-dev:i386 libcapi20-dev:i386 libcups2-dev:i386 libfontconfig-dev:i386 libgsm1-dev:i386 libtiff-dev:i386 libpng-dev:i386 libjpeg-dev:i386 libmpg123-dev:i386 libopenal-dev:i386 libldap-dev:i386 libxrender-dev:i386 libxml2-dev:i386 libxslt-dev:i386 libhal-dev:i386 gettext:i386 prelink:i386 bzip2:i386 bison:i386 flex:i386 oss4-dev:i386 checkinstall:i386 ocl-icd-libopencl1:i386 opencl-headers:i386 libasound2-dev:i386 build-essential
mkdir ~/tmp
cd ~/tmp
wget http://prdownloads.sourceforge.net/wine/wine-1.6.tar.bz2

tar xvf wine-1.6.tar.bz2
cd wine-1.6/


Optional:
To avoid getting the

configure: libOSMesa 32-bit development files not found (or too old), OpenGL rendering in bitmaps won't be supported.

message, do the following:
1. Edit configure so that line 9450 reads
 LIBS="-lOSMesa -lGLU -lGL $X_LIBS $X_PRE_LIBS $XLIB -lm $X_EXTRA_LIBS $LIBS"

2. Also change line 9473 to
 *) ac_cv_lib_soname_OSMesa=libOSMesa.so

Does it change anything? I don't know. But it removes the error message which is triggered by missing symbols so I think it does since the symbols are found in GLU/GL.
End optional.

Then do
./configure
time make -j3
sudo checkinstall --install=no
checkinstall 1.6.2, Copyright 2009 Felipe Eduardo Sanchez Diaz Duran
This software is released under the GNU GPL.

The package documentation directory ./doc-pak does not exist.
Should I create a default set of package docs? [y]:

Preparing package documentation...OK

Please write a description for the package.
End your description with an empty line or EOF.
>> wine 1.6
>>

*****************************************
**** Debian package creation selected ***
*****************************************

This package will be built according to these values:

0 - Maintainer: [ root@beryllium ]
1 - Summary: [ wine 1.6]
2 - Name: [ wine ]
3 - Version: [ 1.6]
4 - Release: [ 1 ]
5 - License: [ GPL ]
6 - Group: [ checkinstall ]
7 - Architecture: [ i386 ]
8 - Source location: [ wine-1.6 ]
9 - Alternate source location: [ ]
10 - Requires: [ ]
11 - Provides: [ wine ]
12 - Conflicts: [ ]
13 - Replaces: [ ]
Checkinstall takes a little while (In particular this step: 'Copying files to the temporary directory...').

Installing Wine

Exit the chroot
sandbox@beryllium:~/tmp/wine-1.6$ exit
exit
root@beryllium:/# exit
exit
me@beryllium:~/tmp/architectures$ 

On your host system
 Enable multiarch* and install ia32-libs, since you've built a proper 32 bit binary:

sudo dpkg --add-architecture i386
sudo apt-get update
sudo apt-get install ia32-libs

*At some point I think ia32-libs may be replaced by proper multiarch packages, but maybe not. So we're kind of doing both here.

 Copy the .deb package and install it
sudo cp wine32/home/sandbox/tmp/wine-1.6/wine_1.6-1_i386.deb .
sudo chown $USER wine_1.6-1_i386.deb
sudo dpkg -i wine_1.6-1_i386.deb

17 July 2013

478. Briefly: proftpd on debian

I need to transfer raw mass spec files off of the computer controlling our waters zmd, and it seems like I may be the only one in the department wishing to do so.

Since the computer is running Windows NT 4 and doesn't support USB drives out of the box, and I'm a bit worried about installing new software (e.g. old versions of filezilla via oldapps) on a computer on which a lot of people rely, I have two options:

* use SMB i.e. a windows share
or
* use ftp

I'm having all sorts of trouble getting my samba to work well at work -- my computers are sitting on a 192.168.2.0/24 LAN behind a router connected to the corporate network which has proper IP addresses (i.e. not using a reserved private network address space). I haven't managed to get my computer behind the router to 'see' the other computers and their shares at work beyond my router. I can, however, connect directly to the computers using e.g. smbclient -- they just won't show up in e.g. nautilus under windows network or using nmblookup. At any rate, connecting directly to the target computer prompts me for a password and it seems that there are no open, accessible shares on that computer, only password protected ones.

Win NT has a DOS ftp client, so I finally decided to set up a quick and dirty ftp server on my workstation in my office so that I could transfer a couple of data files to figure out my other issue -- whether I have any piece of software that can actually open the masslynx .raw files. Turns out that neither wsearch32 nor openchrom can, so the exercise has been somewhat futile, although it has to be said that I'd like to be in charge of any raw data that leads to publications, and so I should be able to manage the storage of it myself.

Note: ftp is an inherently unsafe method since it doesn't use encryption. Use a separate user for this with no privileges, change the password of that user regularly, and close port 21 whenever you aren't using it in order to not advertise that you are running an ftp server. Use ssh/sftp if at all possible.

Anyway, setting up an ftp server was easy.

This method follows this post, http://ubuntuforums.org/showthread.php?t=79588, almost verbatim.

First install proftpd

sudo apt-get install proftpd

Edit /etc/shells:
# /etc/shells: valid login shells
/bin/csh
/bin/sh
#/usr/bin/es
#/usr/bin/ksh
#/bin/ksh
#/usr/bin/rc
#/usr/bin/esh
/bin/dash
/bin/bash
/bin/rbash
#/usr/bin/screen
#/bin/tcsh
#/usr/bin/tcsh
#/bin/ksh93
/bin/false

sudo adduser ftpuser

su ftpuser
cd ~
mkdir download
mkdir upload
exit

Edit /etc/proftpd/proftpd.conf. In addition to what was already there, I added
UserAliasOnly on
UserAlias spinebill ftpuser
ExtendedLog /var/log/ftp.log
TransferLog /var/log/xferlog
SystemLog /var/log/syslog.log
AllowStoreRestart on

<Directory /home/ftpuser>
 Umask 022 022
 AllowOverwrite off
 <Limit MKD STOR DELE XMKD RNFR RNTO RMD XRMD>
  DenyAll
 </Limit>
</Directory>

<Directory /home/ftpuser/download/*>
 Umask 022 022
 AllowOverwrite off
 <Limit MKD STOR DELE XMKD RNFR RNTO RMD XRMD>
  DenyAll
 </Limit>
</Directory>

<Directory /home/ftpuser/upload/>
 Umask 022 022
 AllowOverwrite on
 <Limit READ RMD DELE>
  DenyAll
 </Limit>
 <Limit STOR CWD MKD>
  AllowAll
 </Limit>
</Directory>

Include /etc/proftpd/conf.d/

su ftpuser
chsh -s /bin/false
exit

Check the syntax:
sudo proftpd -td5

Test:
ftp `hostname`
Connected to beryllium.
220 ProFTPD 1.3.4a Server (Debian) [192.168.1.1]
Name (beryllium:me): spinebill
331 Password required for spinebill
Password:
230 User spinebill logged in
Remote system type is UNIX.
Using binary mode to transfer files.
ftp>
I have since tested this from the Win NT 4 computer and everything is working well. I had to familiarise myself with the windows ftp client first: http://www.nsftools.com/tips/MSFTP.htm

477. OpenChrom - Dempster

I don't want to get into writing software reviews, but given how much OpenChrom, an open source program for mass spectrometry which can open a range of proprietary formats, has evolved since I tried the release code-named Syringe (http://verahill.blogspot.com.au/2012/09/using-openchrom-to-open-aglient-d-esi.html) in September 2012, I think a brief update may be in order.

Essentially, OpenChrom now seems a lot easier and more natural to use.

The installation is the same as before (I've copied the old post below):

1. Install Java v1.7 (need > 1.6)
You can either use openjdk 7 or (Oracle) Java. See here for a general guide to installing Oracle/Sun Java.

As for openjdk, you can easily install it:
sudo apt-get install openjdk-7-jdk

(the openjdk-7-jre package is enough if you don't want the full developer's kit)

Anyway.

Make sure that you've selected the right version:
 sudo update-alternatives --config java
There are 7 choices for the alternative java (providing /usr/bin/java).

  Selection    Path                                            Priority   Status
------------------------------------------------------------
  0            /usr/lib/jvm/java-6-openjdk-amd64/jre/bin/java   1061      auto mode
  1            /usr/bin/gij-4.4                                 1044      manual mode
  2            /usr/bin/gij-4.6                                 1046      manual mode
  3            /usr/bin/gij-4.7                                 1047      manual mode
  4            /usr/lib/jvm/j2re1.6-oracle/bin/java             314       manual mode
  5            /usr/lib/jvm/j2sdk1.6-oracle/jre/bin/java        315       manual mode
  6            /usr/lib/jvm/java-6-openjdk-amd64/jre/bin/java   1061      manual mode
 *7            /usr/lib/jvm/java-7-openjdk-amd64/jre/bin/java   1051      manual mode




2. Get openchrom
cd ~/tmp
wget http://aarnet.dl.sourceforge.net/project/openchrom/REL-0.8.0-PREV/openchrom_linux.gtk.x86_64_0.8.0-PREV.zip
unzip openchrom_linux.gtk.x86_64_0.8.0-PREV.zip
cd linux.gtk.x86_64/OpenChrom/
sudo mkdir /opt/openchrom
sudo chown $USER /opt/openchrom 
cp * -R /opt/openchrom
chmod +x /opt/openchrom/openchrom

Stick

alias openchrom='/opt/openchrom/openchrom'

in your ~/.bashrc and source it.




A few notes:
You can now start OpenChrom by running
openchrom
from the command line.

You can install plug-ins, e.g. to open Agilent files, by going to Plugins/OpenChrom Marketplace. Plugins will typically run for 30 days, but can be unlocked for perpetuity by adding a serial key, which is free. Add license keys by first registering on the openchrom website, going to http://www.openchrom.net/main/content/plugins.php, and clicking on the plugin you want a serial key for. You can then enter that key by going e.g. Window/Preferences/Converter in the OpenChrom software.

Everyone has their own idea of what a good piece of mass spec software should look like, and I suspect that OpenChrom caters to them all. Layouts are called Perspectives here. On the flip side, if you accidentally use a perspective that doesn't suit you, you may be incredibly frustrated until you figure out what's going on.

I like wsearch32, so my preferred view is to go to Window/Perspective Switcher -> Chromatogram MS (exact).

Opening spectra is much easier than before: now simply go to File/Open Chromatogram (MSD) and click on the file (or directory structure in the case of e.g. Agilent .D directories) and open:

Anyway, openchrom is getting better, is easier to use, and is still the only open source program that I know which can handle such an array of proprietary formats. Also, it can export data in both .csv and .xls formats, but you will need to install plugins for that, which is luckily very simple.


15 July 2013

476: Rehash: using a browser proxy via tunnel, through a router and with reverse ssh

I may have covered this at some point, but if so, I can't find the post.

Here's the situation:
You have a linux computer at work, which is behind a corporate firewall.
You have a router at home which runs an ssh server (e.g. running tomato).
You have a computer at home, which sits behind the router above.
You want to browse from home using the corporate network

In my case it's a little bit different -- I want to make a change to the router my office network (I have my own office) sits behind, and the easiest way to do that is by logging onto that router via http (it's a stock netgear router).

How to:
First, at work, connect to your home router using reverse ssh, so that all traffic on port 19999 on the router gets sent to port 22 on your work computer:
ssh -R 19999:localhost:22 root@myhomerouter

Later, at home, forward all traffic to port 8989 on your home computer to localhost:19999 on your router (which then gets sent to port 22 on your work computer):
ssh -L 8989:localhost:19999 root@192.168.2.1

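We've assumed that the router sits on 192.168.2.1 from inside the LAN. Localhost in this command refers to the router (the destination of an -L forward is resolved on the machine you ssh into), while localhost in the -R command before that refers to your work computer (the destination of an -R forward is resolved on the machine you run ssh on).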

Then, in a different terminal, open a proxy through port 8989:
 ssh -D 8888 me@localhost -p 8989

Finally, you can now edit your browser/network settings to use a SOCKS proxy on port 8888 like you would with any other proxy.

475. How to get into a Chemistry PhD program in Australia -- or at least a reply from a prospective supervisor

Here's yet another non-linux post. I'm currently getting ready for the start of the new semester and teaching, and so haven't had much time to work on improving my computer skills.

Anyway.

I've been advertising for an international PhD student for the past 9 months and have so far only had one great applicant and three acceptable applicants. That's out of ca 200 applicants in total.

So what does 'acceptable' mean? In this case my use actually agrees with the literal meaning -- students who will stand a chance of being accepted to the PhD program. It also means students who I could imagine working with.

The formal requirements will likely differ between different institutions, and between supervisors. In addition, some supervisors may be looking for different personalities in their prospective hires, than others.

I don't think that I'm being unnecessarily harsh in evaluating applicants: colleagues who have reviewed my shortlists thought that I had, if anything, been a bit too optimistic in my evaluations.

At any rate, if you are looking for a PhD, be aware that there are a lot of applicants out there, and only a limited amount of money and places, so you will want to spend some time on your application.

So here are a few of my thoughts:
Before reading, keep in mind that I understand that applying for a PhD, especially if you are from the developing world and applying for a PhD position in the industrialised world, can be very tough, and sometimes depressing. You don't receive a reply to most of your applications, and when you do, the responses are normally negative.


* Try to familiarise yourself with the formal requirements, and address them in the first paragraph in your email to a prospective supervisor. In the case of my uni, there are two main requirements:
-- an undergraduate degree equivalent to a first class honours degree in Australia
-- a sufficiently good score at the IELTS

That's it. However, the hubris of many universities in Australia means that the first requirement is a significant hurdle. Typically, good grades are just the beginning. In addition to that, the applicant needs to hold a masters degree (by research) and have a couple of papers in ISI rated journals. Obviously almost none of our own undergraduate students would meet that, but there you go.

So in your first paragraph, state what unis you did your degrees at, what your cumulative GPAs (or equivalent) were, how many papers you have published and what your overall band score AND section scores on the IELTS (or TOEFL) are.

At this stage, that's much more important than your background, your hobbies, or anything else. If you can't meet the minimum requirements for entry to the PhD program, everything else doesn't matter.

* Read the advertisement, and follow any instructions
I ask applicants to submit all their documents as PDFs. Yet, I get plenty of applications with .doc, .docx, jpeg etc attached. You didn't read the instructions -- will you be more careful as a PhD student? Remember that you are competing against plenty of applicants who did read the instructions.

Did I ask for your IELTS results? Didn't attach or mention them in your email/CV? Not a good sign. Also, it means that you're probably not a candidate.

* Address the supervisor and the supervisor's research
I get way, way too many emails that start with  'Dear Sir', or 'Dear Professor' or even worse: 'Dear Sir/Madam'. Put my name in there. It'll show me that you spent at least a few minutes on personalising your email. If you don't make that effort, why should I make the effort of reading your email and looking at your documents?

Also, please do mention the research of the supervisor you are applying to. It doesn't need to be anything insightful or special, but just write something like: 'I find your research into catalytic activation of molecules in ionic liquids very interesting.' or 'I read your article in Green Chemistry, 2013, 10, 2345 and found it very interesting. In particular, I liked how it showed how the selectivity of blah blah blah'.

The reason is not that you are showing off your great scientific skills (you've got an undergraduate degree -- we don't expect much), but that it shows you spent a bit of effort writing your email and personalising it. Also, flattery -- in moderation -- can occasionally help (don't go overboard, so be careful -- too much makes you seem insincere).

* Don't cold-call
This should go without saying. I've had one student email me in the morning, then call me in the afternoon. That kind of behaviour is probably correct if you are applying for certain jobs in the Real World (marketing?), but not for a PhD in chemistry. It's a sure-fire way of annoying people.

* Don't send a LinkedIn invite
I don't have time to scroll through your profile and try to compile a CV for you. Send me your CV in pdf format instead. Also, I don't know you, and have no incentive to add you to my 'network'.

* Be careful about 'hobbies' and 'interests'.
To me as a potential supervisor they really don't matter (again, this is my personal opinion). I know that the idea is to show that you are a well-rounded individual, but knowing that you like 'travel' or that you consider 'internet browsing' a skill will not be the edge that gets you into a PhD programme.

* 'It can't help you, only harm you'.
Keep this in mind. Unless it's a piece of information required in the advertisement, or that you are absolutely certain will help your application, consider leaving it out. You may include it to highlight a particular skill or trait, but remember that a CV can be interpreted ambiguously, and your intent may not be obvious. What you feel shows how independent and committed you are can instead be read as a sign that you are unfocussed or a difficult person to work with, or it can simply draw attention away from more important aspects of your CV.

* Attending lectures, conferences
In their CVs, some applicants include lectures by famous people that they've attended, or conferences that they've gone to.

Here's the problem for me: most first year PhD students struggle with the notion that doing the work is no longer enough. Doing the experiments, or following your supervisor's instructions, is not enough. To get a PhD you need to make that extra effort and make things work. And if it doesn't work, you put in 150% effort -- the extra 50% being extra-curricular work on finding a related project that will work. Life as a PhD student can be easy if you are lucky, but most often is not -- life is incredibly good when your project is working, but on the flip-side it can be hard, depressing and demoralising when it isn't. Your supervisor can alleviate some of that, but remember that your supervisor is only there to point you in a general direction -- the PhD is all about making the transition to becoming an INDEPENDENT researcher.

So be careful -- if you've presented posters or given talks at conferences or at other universities, you should definitely list them, but under a suitable heading -- NOT publications. They'll detract attention from the publications, and the publications are what will get you an offer of acceptance.

* Do not make things conditional
I had an applicant who was borderline (in terms of meeting the requirements), and in those cases the supervisor putting extra effort into cajoling the university administration MAY occasionally be enough to get a student accepted (don't count on it). If your prospective supervisor asks you to re-take IELTS, don't write something along the lines of 'I will, but only if this is the last hurdle'.

I understand it's expensive, but remember: even if you meet all the requirements I cannot guarantee that you get accepted. And I can't wait months for each student to pass through the application system -- I need to hire someone now. So be proactive.

* Face-to-face (or skype/video) interview is a good sign
If your supervisor asks for a skype interview, this is a great sign. And likely this isn't really done in order to gauge your scientific skills, but just to get a feel for your personality. Also, it's a way of making sure that your English levels are good enough that you can communicate with your supervisor. Finally, if you are borderline in terms of IELTS/TOEFL, your supervisor may be able to argue that your English is good enough based on that interview. So take the opportunity.

And send an email a few hours after the interview thanking them for the opportunity. 1-2 lines is enough. It will show that you're a decent human being.

* Be prompt in replying to emails
It doesn't matter what stage of the application you are at -- until the paperwork has been signed you are still on probation. If you take several days to reply to any of my emails, then you are likely to be dropped. The reason is simple: if you take a week to get things done when you are a PhD student, then you will be a disaster for me. A disaster that I'll have to live with for the next 3-4 years, and who will be using up my research grant, and potentially ruining my career.

I understand that the reason for you being slow may be different -- maybe you are just nervous, maybe you have nothing to say, maybe you feel you are intruding. Still, be prompt.

So:
if you can show that you can read and follow instructions, and if you can make my life easy by addressing the selection criteria in a clear way, and if you seem like a person I might enjoy working with for the next 3-4 years, then you stand a fair chance of getting an offer.

If I think you'll need constant supervision, are sloppy and won't follow instructions, or that our personalities will clash, I'll probably avoid you no matter how good your grades are.


12 July 2013

474. MS data, part I: Exporting data as csv from wsearch32, and generating MS assignments using Matt Monroe's molecular weight calculator

NOTE: I've heard rumours about problems with wsearch on Windows 7 Home, and on Windows 8. I've heard reports of it working on Windows 7 Professional. Curiously, it works just fine on linux under wine.

This post is written with two particular students in mind. I could put this in a pdf and email it, but why not share with the wider world since other people may encounter the same issues?

See here for part II: http://verahill.blogspot.com.au/2013/07/480-ms-data-part-ii-plotting-and.html


1. Exporting data from wsearch32
To install wsearch32 under wine, see here: http://verahill.blogspot.com.au/2013/01/321-wsearch32-in-wine.html

In order to export data from wsearch so that you can plot it in e.g. gnuplot, octave, origin or excel, do the following:

Open a spectrum (chromatogram) and pick a slice, then click on the M/I icon in the bottom right:

 Pick Save As

 And save as e.g. csv (comma separated file)

Done.

2. Using formula finder in Matthew Monroe's Molecular Weight Calculator
To install the molecular weight calculator in wine, see here: http://verahill.blogspot.com.au/2012/09/matt-monroes-molecular-weight.html

Open the molecular weight calculator and go to edit abbreviations.

 Add an abbreviation for MeO. We'll call it Methx, and it has a charge of -1:
 Methanol:
 Nitrate:
 Hit OK to save the changes.

 Go to formula finder:

We'll be looking for Ga, NO3, MeOH, O, H, MeO. Then click on Formula Finder Options:
 Limit the charge to 1:
 And search:

You can do fancier stuff, e.g. searching directly for the m/z and bound the search to min/max amounts of different elements:
 As shown here:
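
The kind of search the formula finder performs can be sketched in a few lines of Python 3. This is only a rough illustration and not how the Molecular Weight Calculator is implemented: it brute-forces every combination of the building blocks used above, keeps those with the requested net charge, and compares the resulting m/z with a target. The monoisotopic masses, the formal charges assigned to each fragment, the tolerance and the function names are all assumptions made for this sketch, and the electron mass is ignored.

```python
from itertools import product

# Monoisotopic masses in u. The electron mass is ignored in this sketch.
MASS = {"H": 1.007825, "C": 12.0, "N": 14.003074, "O": 15.994915, "Ga": 68.925574}

# Building blocks as (composition, formal charge). The charges are
# assumptions chosen for this illustration.
FRAGMENTS = {
    "Ga":   ({"Ga": 1}, +3),
    "NO3":  ({"N": 1, "O": 3}, -1),
    "MeOH": ({"C": 1, "H": 4, "O": 1}, 0),
    "MeO":  ({"C": 1, "H": 3, "O": 1}, -1),
    "O":    ({"O": 1}, -2),
    "H":    ({"H": 1}, +1),
}


def fragment_mass(name):
    """Sum the elemental masses of one building block."""
    composition, _ = FRAGMENTS[name]
    return sum(MASS[element] * n for element, n in composition.items())


def find_formulas(target_mz, tolerance=0.01, max_count=4, charge=1):
    """Return all fragment combinations whose m/z is within tolerance.

    charge must be non-zero; each fragment may occur 0..max_count times.
    """
    names = list(FRAGMENTS)
    hits = []
    for counts in product(range(max_count + 1), repeat=len(names)):
        total_charge = sum(FRAGMENTS[n][1] * c for n, c in zip(names, counts))
        if total_charge != charge:
            continue
        mass = sum(fragment_mass(n) * c for n, c in zip(names, counts))
        if abs(mass / abs(charge) - target_mz) <= tolerance:
            hits.append({n: c for n, c in zip(names, counts) if c})
    return hits


if __name__ == "__main__":
    for hit in find_formulas(130.962):
        print(hit)
```

Limiting the charge early, as in the Formula Finder Options step, is what keeps the list of candidates manageable: most combinations are discarded before their mass is even computed.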




10 July 2013

473. Programming a Metrohm Titrino -- not a how-to, just a ramble

Many, many years ago I learned basic programming using BASIC (the version that came with PC DOS 5, I think). I even wrote the odd game, but it was all pretty awful. A few years later I learned Turbo Pascal, which was a fantastic experience compared to BASIC. It felt all sciency and grown up, and it was my first experience with a real IDE. I even ended up buying a TP book, and became somewhat proficient. This must've been when I was around 18-19. I then stopped programming completely.

At around 30 years of age I decided it was time to get serious about programming again -- I was doing mass spectrometry and needed a simple program that could generate a series of solutions to the identity of a mass/charge ratio given a range of elements. I probably had a quick look at C and C++, but ended up getting a Python book and have been a happy Python programmer ever since.

The problem is that I've never been a /good/ python programmer -- and in all these years I've never fully understood the use for (or, in all fairness, use OF) OOP. And at the moment it seems to be holding me back -- all the examples that I find of the use of the threading module as well as writing GUIs (using e.g. wxPython) involve using classes. And I just don't understand them well enough to sort out what I need done.

Anyway, long story short: I've written a basic program for communicating with a Metrohm Titrino 736 GP via RS 232. It's found here: https://sourceforge.net/projects/pytitrino/

Currently:
* the code is a mess (see above)
* it works fine for doing monotonic and dynamic end point titrations (MET and DET)
* it saves data to a file, but does so silently (i.e. when you run you won't get any feedback that things are working properly...)
* it uses the thread (not threading) module
* I've managed to pass parameters back and forth between the thread and the main loop using Queue
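
The last point, passing values back and forth between a worker thread and the main loop, can be sketched as below. This is not the pytitrino code: it is a minimal Python 3 illustration that uses the threading and queue modules (rather than thread and Queue), and the worker function, the command strings and the 'ack:' replies are all made up for the example. No serial port is involved.

```python
import queue
import threading


def titrator_worker(commands, results):
    """Stand-in for the instrument thread: read commands, post replies."""
    while True:
        command = commands.get()
        if command is None:  # sentinel: the main loop asks us to stop
            break
        # A real worker would talk to the instrument here.
        results.put("ack:" + command)


def run(commands_to_send):
    """Send each command to the worker and collect one reply per command."""
    commands = queue.Queue()
    results = queue.Queue()
    worker = threading.Thread(target=titrator_worker, args=(commands, results))
    worker.start()
    replies = []
    for command in commands_to_send:
        commands.put(command)
        replies.append(results.get(timeout=5))
    commands.put(None)
    worker.join()
    return replies


if __name__ == "__main__":
    print(run(["start", "stop"]))
```

Using one queue per direction keeps the worker free of shared state, and no classes are needed.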

There are probably much better solutions. One day I hope to be able to stick a GUI on top of it, but the more I look at it, the more I get the impression that one writes the GUI first, then the engine...not that I'd know.

Anyway. That's what I've been up to. Anyone with a bit of programming experience, who is in possession of an old-school Titrino (i.e. using RS 232) and wants to save $1.5k in software licenses may be interested in taking the sources and turning them into something useful.


03 July 2013

472. Briefly: Iranian PhD students in Australia

I'm not going to leave much in the way of a comment, but this doesn't seem to have been publicised enough. Searching the web quickly didn't bring this up at all, and it's a shame since it's important, in particular if you are an Iranian national thinking about doing a PhD in Australia.

About a week ago the faculty in the chemistry department at my university were informed that heavy restrictions in terms of access to instrumentation have been put in place for students from North Korea, Syria and Iran via Federal legislation.


While I don't think there are any students from North Korea or Syria around, there are several Iranian students at different stages of their PhD. In fact, I would say around 50% of our applicants are from India, 25% are from Pakistan and 20% are from Iran (in terms of accepted students the ratio is very different).


In practical terms, this means that Iranian students in the department are not allowed to use:
FT-IR
UV-Vis
NMR
Mass spectrometers
Raman
dosimeters
OES/AAS
etc.

All of which are standard instruments which most chemists would find necessary to do research. In addition, they can hardly be considered as being cutting edge, trade secrets or anything like that -- commercial NMR instruments have been around since the 1950s, infrared and UV/Visible spectroscopy go much further back. Mass spectrometry is a standard tool which, although many of the current designs only go back to the 1980s (e.g. ESI), is so conceptually simple and innocuous that (to me) restrictions on it don't make sense. And so on.

In addition, supervisors of Iranian students have been asked to draw up a risk management plan to prevent student access to the above instruments, which is a particular problem given that they are used in teaching as well, and are available on a walk-in basis to undergraduate students doing projects in research labs.

Currently, any supervisor who has an Iranian student needing to use any of the instruments above will need to assign another student to do these measurements for the Iranian national.

While this doesn't formally preclude Iranians from coming to Australia to do a PhD, we have been advised that we should reject any applicants at this point. This may change once the university has figured out exactly where they must draw the line in terms of restricting Iranians' access to different facilities, but for now it's a blanket ban.

My personal opinion is that while you'd be led by the media to think of anyone from North Korea, Syria and Iran as potential spies, these are real people too. Many Iranians would either be completely uninterested in politics, or actively antipathetic to their regime. And the best thing about democracies -- we shouldn't have any issues with them supporting their government either. So I don't really agree with this as a security measure to prevent nuclear proliferation, which must surely be the stated goal.

And if the idea is to put in place sanctions to promote regime change, then why limit the type of instrumentation that students can access? Or are we trying to punish the children of the leadership in Iran? Then why not limit the sanctions to those specifically? Top students tend to come from all socioeconomic classes.

The timing is also very odd, given the recent election of a moderate.

And why Iran and not Belarus, China, Zimbabwe etc.?

Again, I don't like putting opinion pieces on this blog (other than as minor parts/rants of posts with actual content) but I think this should be publicized more.



02 July 2013

471. Debian Jessie -- gnome-shell bug

Update 3/7/2013:
there are now *gnome-bluetooth packages (3.8.1-2) in the jessie repos. While I haven't looked closer at them, I presume that they fix this issue.

(on a different note: dist-upgrade currently removes gnome...)

Original post:
I've used debian testing since early 2011, and I've only had a few minor issues during that time.

However, sometimes things happen that remind you that the Testing release is not meant for mission critical work (and make me happy that I only use Jessie on my laptop, which I mainly use at home).

So...

Last night I did upgrade and dist-upgrade, which installed the following packages according to /var/log/apt/history.log:
Start-Date: 2013-07-01  22:03:17
Commandline: apt-get dist-upgrade
Install: p11-kit:amd64 (0.18.3-2, automatic), libgnome-bluetooth11:amd64 (3.8.1-1, automatic), libgcr-base-3-1:amd64 (3.8.2-3, automatic), libtasn1-6:amd64 (3.3-1, automatic), libgcr-ui-3-1:amd64 (3.8.2-3, automatic)
Upgrade: libnm-gtk0:amd64 (0.9.8.2-1, 0.9.8.2-1+b1), libgcr-3-1:amd64 (3.4.1-3, 3.8.2-3), gir1.2-gcr-3:amd64 (3.4.1-3, 3.8.2-3), network-manager-gnome:amd64 (0.9.8.2-1, 0.9.8.2-1+b1), gnome-keyring:amd64 (3.4.1-5, 3.8.2-2), gcr:amd64 (3.4.1-3, 3.8.2-3), gnome-bluetooth:amd64 (3.4.2-1, 3.8.1-1), gir1.2-gnomebluetooth-1.0:amd64 (3.4.2-1, 3.8.1-1), gir1.2-gck-1:amd64 (3.4.1-3, 3.8.2-3)
End-Date: 2013-07-01  22:03:29
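
When a dist-upgrade breaks something, it helps to see at a glance which packages changed version. A small Python 3 sketch for pulling that out of an 'Upgrade:' line like the one above is shown below; the function name and the regular expression are my own, and it only assumes the 'name:arch (old, new)' layout visible in the log excerpt.

```python
import re

# Matches entries such as "gcr:amd64 (3.4.1-3, 3.8.2-3)".
ENTRY = re.compile(r"(?P<name>[^\s,()]+) \((?P<old>[^,()]+), (?P<new>[^,()]+)\)")


def parse_upgrades(line):
    """Return {package: (old_version, new_version)} for an 'Upgrade:' line."""
    if not line.startswith("Upgrade:"):
        return {}
    body = line[len("Upgrade:"):]
    return {
        match.group("name"): (match.group("old"), match.group("new"))
        for match in ENTRY.finditer(body)
    }


if __name__ == "__main__":
    sample = ("Upgrade: gnome-bluetooth:amd64 (3.4.2-1, 3.8.1-1), "
              "gcr:amd64 (3.4.1-3, 3.8.2-3)")
    for package, (old, new) in parse_upgrades(sample).items():
        print(package, old, "->", new)
```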

Now, when I log in to gnome via gdm3, I get an empty desktop with no menus, no hot-spots or anything else indicating that things worked out. Alt+F2 doesn't work either, and conky doesn't start.

The only thing that does work is
* my keyboard shortcuts (I've mapped ctrl+shift+Down arrow to chromium)
* guake (which starts with gnome)

ps aux|grep gnome-shell
returns nothing, which might be a clue.

Looking at the debian forums the closest post seems to be (although erroneously labelled -- gdm3 DOES start): http://forums.debian.net/viewtopic.php?f=6&t=105393&p=504077&hilit=gnome+shell#p504077

That in turn led to this bug report: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=712861

My gnome-shell version is 3.4.2-8.

I don't understand how gnome-bluetooth causes this, especially given that I've disabled bluetooth in rcconf, but whatever it takes...

I tried applying the patch but it failed:
mkdir ~/tmp
cd ~/tmp
wget "http://bugs.debian.org/cgi-bin/bugreport.cgi?msg=66;filename=GnomeBluetooth.patch;att=1;bug=712861" -O blue.patch
sed -i 's_js/ui/status/bluetooth.js_/usr/share/gnome-shell/js/ui/status/bluetooth.js_g' blue.patch
sudo patch -p0 < blue.patch

Instead, I ended up making the changes to /usr/share/gnome-shell/js/ui/status/bluetooth.js by hand (remember that you can always switch to a tty using ctrl+alt+Fx). Lines 6-9 should read:
const Gio = imports.gi.Gio;
const GnomeBluetoothApplet = imports.gi.GnomeBluetoothApplet;
const GnomeBluetooth = imports.gi.GnomeBluetooth;
const Gtk = imports.gi.Gtk;

and then delete the Applet part of GnomeBluetoothApplet in the KillswitchState references so that lines 38-46 read
        this._killswitch.connect('toggled', Lang.bind(this, function() {
            let current_state = this._applet.killswitch_state;
            if (current_state != GnomeBluetooth.KillswitchState.HARD_BLOCKED &&
                current_state != GnomeBluetooth.KillswitchState.NO_ADAPTER) {
                this._applet.killswitch_state = this._killswitch.state ?
                    GnomeBluetooth.KillswitchState.UNBLOCKED:
                    GnomeBluetooth.KillswitchState.SOFT_BLOCKED;
            } else
                this._killswitch.setToggleState(false);

Then do it again for lines 96-101:
    _updateKillswitch: function() {
        let current_state = this._applet.killswitch_state;
        let on = current_state == GnomeBluetooth.KillswitchState.UNBLOCKED;
        let has_adapter = current_state != GnomeBluetooth.KillswitchState.NO_ADAPTER;
        let can_toggle = current_state != GnomeBluetooth.KillswitchState.NO_ADAPTER &&
                         current_state != GnomeBluetooth.KillswitchState.HARD_BLOCKED;
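
If you'd rather not edit the file by hand, the renaming part of the change can be sketched in Python 3 as below. This is only an illustration based on the excerpts shown here: it assumes that the KillswitchState references are the only places where the Applet part needs to go, it leaves the existing GnomeBluetoothApplet import alone, and it does not add the GnomeBluetooth import line. The function name is made up. Back up bluetooth.js before trying anything like this on the real file.

```python
import re


def patch_killswitch_refs(source):
    """Point KillswitchState lookups at GnomeBluetooth instead of the applet."""
    return re.sub(
        r"\bGnomeBluetoothApplet\.KillswitchState\b",
        "GnomeBluetooth.KillswitchState",
        source,
    )


if __name__ == "__main__":
    before = "let on = current_state == GnomeBluetoothApplet.KillswitchState.UNBLOCKED;"
    print(patch_killswitch_refs(before))
```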



At this point I rebooted and everything was back to normal (you can try simply doing 'sudo service gdm3 restart' instead of rebooting).
Anyway, done.

470. Very briefly: compiling nwchem 6.3 with ifort and mkl

This used to be part of http://verahill.blogspot.com.au/2013/07/469-intel-compiler-on-debian.html, but I think it makes more sense making it a separate post.

I did this on debian wheezy.

1. Installing mkl and the compiler
MKL: http://verahill.blogspot.com.au/2013/06/465-intel-mkl-math-kernel-library-on.html
Intel compiler collection: http://verahill.blogspot.com.au/2013/07/469-intel-compiler-on-debian.html

I will henceforth presume that you have put the files in the same location as shown in those posts, and that you have created /etc/ld.so.conf.d/intel.conf as shown in the second post.

2. Compiling nwchem 6.3
sudo apt-get install build-essential libopenmpi-dev openmpi-bin
sudo mkdir /opt/nwchem -p
sudo chown $USER:$USER /opt/nwchem
cd /opt/nwchem
wget "http://www.nwchem-sw.org/download.php?f=Nwchem-6.3.revision1-src.2013-05-28.tar.gz" -O Nwchem-6.3.revision1-src.2013-05-28.tar.gz
tar xvf Nwchem-6.3.revision1-src.2013-05-28.tar.gz
mv nwchem-6.3-src.2013-05-28 nwchem-6.3-src.2013-05-28.ifort
cd nwchem-6.3-src.2013-05-28.ifort

export NWCHEM_TOP=`pwd`
export LARGE_FILES=TRUE
export TCGRSH=/usr/bin/ssh
export NWCHEM_TARGET=LINUX64
export NWCHEM_MODULES="all"
export PYTHONVERSION=2.7
export PYTHONHOME=/usr

export BLASOPT="-L/opt/intel/composer_xe_2013.4.183/mkl/lib/intel64/ -lmkl_core -lmkl_sequential -lmkl_intel_ilp64"
export LIBRARY_PATH="$LIBRARY_PATH:/usr/lib/openmpi/lib:/opt/intel/composer_xe_2013.4.183/mkl/lib/intel64/"

export USE_MPI=y
export USE_MPIF=y
export USE_MPIF4=y
export MPI_LOC=/usr/lib/openmpi/lib
export MPI_INCLUDE=/usr/lib/openmpi/include


export LIBMPI="-lmpi -lopen-rte -lopen-pal -ldl -lmpi_f77 -lpthread"
export ARMCI_NETWORK=SOCKETS

cd $NWCHEM_TOP/src

make clean
make nwchem_config
make FC=ifort 1> make.log 2>make.err

cd $NWCHEM_TOP/contrib
export FC=ifort
./getmem.nwchem
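
Since the build takes a while, it can be worth checking up front that the directories named in the exports above exist. The Python 3 sketch below does that for the -L path in BLASOPT and for MPI_LOC, MPI_INCLUDE and NWCHEM_TOP; the function name is made up, and it only looks at those variables, so treat it as a convenience check and not as part of the nwchem build system.

```python
import os
import shlex


def missing_build_paths(env):
    """Return directories named in the build environment that do not exist."""
    candidates = []
    # Library search paths passed to the linker, e.g. -L/opt/intel/.../intel64/
    for token in shlex.split(env.get("BLASOPT", "")):
        if token.startswith("-L") and len(token) > 2:
            candidates.append(token[2:])
    for key in ("MPI_LOC", "MPI_INCLUDE", "NWCHEM_TOP"):
        if env.get(key):
            candidates.append(env[key])
    return [path for path in candidates if not os.path.isdir(path)]


if __name__ == "__main__":
    # Run this in the same shell session in which the exports were made.
    for path in missing_build_paths(os.environ):
        print("missing:", path)
```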

And it works quite fine. See e.g. here if you want to patch nwchem so that it compiles with python, and to support gabedit.

3. Performance
This is always a bit contentious, and I want to be upfront with the fact that I haven't spent much time considering whether my test example is a good one. I simply did a geo-opt + vibrational analysis as shown in this post: http://verahill.blogspot.com.au/2013/05/430-briefly-crude-comparison-of.html

The jobs were run using all cores available on that node.
gnu = gfortran + acml 5.3.1 for the Phenom and FX8150, and openblas for the i5-2400 and the Athlon 3800+.
ifort = ifort + mkl for all architectures.

The times are in seconds and are CPU times, not wall times.

Arch                      cores  gnu      ifort    Instruction sets
-------------------------------------------------------------------------
AMD Athlon 64 X2 3800+    2      10828*   12516    sse, sse2, sse3
AMD Phenom II X6 1055T    6      2044     2048     sse, sse2, sse3
AMD FX8150                8      1611     1507     sse, sse2, sse3, AVX, FMA4
Intel i5-2400             4      1652*    1498     sse, sse2, sse3, sse4, AVX

In the last case I also compiled using gfortran but with mkl and got 1550s.

It's a fairly small sample set, but it does seem that there's a little bit of an advantage with mkl+ifort over gfortran+acml on the newest AMD core. One would need much more data though.
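
To put numbers on that, the relative change in CPU time when going from the gnu build to the ifort build can be worked out from the table. The short Python 3 sketch below just does the arithmetic; the dictionary and the function name are mine, and the times are copied from the table.

```python
# CPU times in seconds, copied from the table above: (gnu, ifort).
TIMES = {
    "Athlon 64 X2 3800+": (10828, 12516),
    "Phenom II X6 1055T": (2044, 2048),
    "FX8150": (1611, 1507),
    "i5-2400": (1652, 1498),
}


def percent_change(gnu, ifort):
    """Change in CPU time when going from the gnu build to the ifort build."""
    return 100.0 * (ifort - gnu) / gnu


if __name__ == "__main__":
    for arch, (gnu, ifort) in TIMES.items():
        print("%-20s %+6.1f%%" % (arch, percent_change(gnu, ifort)))
```

That works out to roughly +15.6% on the Athlon (ifort slower), +0.2% on the Phenom, -6.5% on the FX8150 and -9.3% on the i5-2400.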

A clear downside of using mkl and ifort is the fact that they are not freely available though -- i.e. you can register and download them for free for non-commercial use, but there's no guarantee that your colleague, next-door-neighbour or distant-cousin will be able to use it.

01 July 2013

469. Intel compiler (icc, icpc, ifort) on Debian

I've heard it said that MKL is faster than ACML even on AMD cpus. I've also heard it said that Intel compiler + mkl beats everything else, even on AMD cpus.

So let's test the veracity of that statement. I'm in particular looking forward to seeing how this affects my amd fx 8150.

Note that the ACML libs are available as separate packages for different compilers -- download the right libs when linking (i.e. the gnu ones for gcc, and the intel ones for intel composer)

But first we need to install and set-up the intel compiler suite, and that's what we'll do in this post.

In the example below I've installed it on an AMD Athlon II X3, hence the message about non-Intel architecture.

Installation:

Register for the Intel Parallel Studio XE as touched upon in this post: http://verahill.blogspot.com.au/2013/06/465-intel-mkl-math-kernel-library-on.html

Download. It's about 2 GB. Then extract, and run install.sh

sudo apt-get install build-essential
sudo sh install.sh
Step no: 1 of 7 | Welcome
--------------------------------------------------------------------------------
Welcome to the Intel(R) Parallel Studio XE 2013 Update 3 for Linux*
installation program.
--------------------------------------------------------------------------------
You will complete the steps below during this installation:
Step 1 : Welcome
Step 2 : License
Step 3 : Activation
Step 4 : Intel(R) Software Improvement Program
Step 5 : Options
Step 6 : Installation
Step 7 : Complete
--------------------------------------------------------------------------------
Press "Enter" key to continue or "q" to quit:
--------------------------------------------------------------------------------
Checking the prerequisites. It can take several minutes. Please wait...
--------------------------------------------------------------------------------

Step no: 1 of 7 | Options > Missing Optional Pre-requisite(s)
--------------------------------------------------------------------------------
There are one or more optional unresolved issues. It is highly recommended to
resolve them all before you continue the installation. You can fix them without
exiting from the installation and re-check. Or you can quit from the
installation, fix them and run the installation again.
--------------------------------------------------------------------------------
Missing optional pre-requisites
-- Intel(R) VTune(TM) Amplifier XE 2013 Update 5: unsupported OS
-- Intel(R) Inspector XE 2013 Update 5: unsupported OS
-- Intel(R) Advisor XE 2013 Update 2: unsupported OS
-- Intel(R) Composer XE 2013 Update 3 for Linux*: unsupported OS
--------------------------------------------------------------------------------
1. Skip missing optional pre-requisites [default]
2. Show the detailed info about issue(s)
3. Re-check the pre-requisites
h. Help
b. Back to the previous menu
q. Quit
--------------------------------------------------------------------------------
Please type a selection or press "Enter" to accept default choice [1]:

Step no: 2 of 7 | License
--------------------------------------------------------------------------------
As noted in the Intel(R) Software Development Product End User License
Agreement, the Intel(R) Software Development Product you install will send
Intel [..]
--------------------------------------------------------------------------------
Do you agree to be bound by the terms and conditions of this license agreement?
Type "accept" to continue or "decline" to back to the previous menu: accept

Step no: 3 of 7 | Activation
--------------------------------------------------------------------------------
If you have purchased this product and have the serial number and a connection
to the internet you can choose to activate the product at this time. Activation
is a secure and anonymous one-time process that verifies your software licensing
rights to use the product. Alternatively, you can choose to evaluate the product
or defer activation by choosing the evaluate option. Evaluation software will
time out in about one month. Also you can use license file, license manager, or
remote activation if the system you are installing on does not have internet
access activation options.
--------------------------------------------------------------------------------
1. I want to activate my product using a serial number [default]
2. I want to evaluate my product or activate later
3. I want to activate either remotely, or by using a license file, or by using a
   license manager
h. Help
b. Back to the previous menu
q. Quit
--------------------------------------------------------------------------------
Please type a selection or press "Enter" to accept default choice [1]:
Note: Press "Enter" key to back to the previous menu.
Please type your serial number (the format is XXXX-XXXXXXXX):
--------------------------------------------------------------------------------
Activation completed successfully.
--------------------------------------------------------------------------------
Press "Enter" key to continue:

Step no: 4 of 7 | Intel(R) Software Improvement Program
--------------------------------------------------------------------------------
Help improve your experience with Intel(R) software
Participate in the design of future Intel software. Select 'Yes' to give us
permission to learn about how you use your Intel software and we will do the
rest.
- No Personal contact information is collected
- There are no surveys or additional follow-up emails by opting in
- You can stop participating at any time
Learn more about Intel(R) Software Improvement Program
http://software.intel.com/en-us/articles/software-improvement-program
With your permission, Intel may automatically receive anonymous information
about how you use your current and future Intel software.
--------------------------------------------------------------------------------
1. Yes, I am willing to participate and improve Intel software. (Recommended)
2. No, I don't want to participate in the Intel(R) Software Improvement Program
   at this time.
b. Back to the previous menu
q. Quit
--------------------------------------------------------------------------------
Please type a selection:

Step no: 5 of 7 | Options
--------------------------------------------------------------------------------
You are now ready to begin installation. You can use all default installation
settings by simply choosing the "Start installation Now" option or you can
customize these settings by selecting any of the change options given below
first. You can view a summary of the settings by selecting
"Show pre-install summary".
--------------------------------------------------------------------------------
1. Start installation Now
2. Change install directory [ /opt/intel ]
3. Change components to install [ All ]
4. Change advanced options
5. Show pre-install summary
h. Help
b. Back to the previous menu
q. Quit
--------------------------------------------------------------------------------
Please type a selection or press "Enter" to accept default choice [1]:
--------------------------------------------------------------------------------
Checking the prerequisites. It can take several minutes. Please wait...
--------------------------------------------------------------------------------

Step no: 5 of 7 | Options > Missing Optional Pre-requisite(s)
--------------------------------------------------------------------------------
There are one or more optional unresolved issues. It is highly recommended to
resolve them all before you continue the installation. You can fix them without
exiting from the installation and re-check. Or you can quit from the
installation, fix them and run the installation again.
--------------------------------------------------------------------------------
Missing optional pre-requisites
-- Intel(R) VTune(TM) Amplifier XE 2013 Update 5: The system does not have an
   Intel Architecture processor
--------------------------------------------------------------------------------
1. Skip missing optional pre-requisites [default]
2. Show the detailed info about issue(s)
3. Re-check the pre-requisites
h. Help
b. Back to the previous menu
q. Quit
--------------------------------------------------------------------------------
Please type a selection or press "Enter" to accept default choice [1]:

Step no: 7 of 7 | Complete
--------------------------------------------------------------------------------
Thank you for installing and using the
Intel(R) Parallel Studio XE 2013 Update 3 for Linux*
Reminder: Intel(R) VTune(TM) Amplifier XE users must be members of the "vtune"
permissions group in order to use Event-based Sampling.
To register your product purchase, visit
https://registrationcenter.intel.com/RegCenter/registerexpress.aspx?clientsn=N433-4XGWTJLB

To get started using Intel(R) VTune(TM) Amplifier XE 2013 Update 5:
- To set your environment variables: source /opt/intel/vtune_amplifier_xe_2013/amplxe-vars.sh
- To start the graphical user interface: amplxe-gui
- To use the command-line interface: amplxe-cl
- For more getting started resources: /opt/intel/vtune_amplifier_xe_2013/documentation/en/welcomepage/get_started.html.

To get started using Intel(R) Inspector XE 2013 Update 5:
- To set your environment variables: source /opt/intel/inspector_xe_2013/inspxe-vars.sh
- To start the graphical user interface: inspxe-gui
- To use the command-line interface: inspxe-cl
- For more getting started resources: /opt/intel/inspector_xe_2013/documentation/en/welcomepage/get_started.html.

To get started using Intel(R) Advisor XE 2013 Update 2:
- To set your environment variables: source /opt/intel/advisor_xe_2013/advixe-vars.sh
- To start the graphical user interface: advixe-gui
- To use the command-line interface: advixe-cl
- For more getting started resources: /opt/intel/advisor_xe_2013/documentation/en/welcomepage/get_started.html.

To get started using Intel(R) Composer XE 2013 Update 3 for Linux*:
- Set the environment variables for a terminal window using one of the following
  (replace "intel64" with "ia32" if you are using a 32-bit platform).
  For csh/tcsh: $ source /opt/intel/bin/compilervars.csh intel64
  For bash: $ source /opt/intel/bin/compilervars.sh intel64
  To invoke the installed compilers:
  For C++: icpc
  For C: icc
  For Fortran: ifort
  To get help, append the -help option or precede with the man command.
- For more getting started resources:
  /opt/intel/composer_xe_2013/Documentation/en_US/get_started_lc.htm.
  /opt/intel/composer_xe_2013/Documentation/en_US/get_started_lf.htm.

To view movies and additional training, visit
http://www.intel.com/software/products.
--------------------------------------------------------------------------------
q. Quit [default]
--------------------------------------------------------------------------------
Please type a selection or press "Enter" to accept default choice [q]:


The Files and Setup:
The compiler binaries are now found in /opt/intel/composer_xe_2013.3.163/bin/intel64/ . Of particular interest are ifort, icc and icpc (fortran, c and c++).

In addition, you'll need the lib and include files, which are found in /opt/intel/composer_xe_2013.3.163/compiler/lib/intel64/ and /opt/intel/composer_xe_2013.3.163/compiler/include/intel64.

You could simply add the libs using LD_LIBRARY_PATH, but a perhaps easier and better method is to create a file, /etc/ld.so.conf.d/intel.conf, with one directory per line:
/opt/intel/composer_xe_2013.3.163/compiler/lib/intel64
/opt/intel/composer_xe_2013.4.183/mkl/lib/intel64
Once that's done, run
sudo ldconfig

Then do
echo 'PATH=$PATH:/opt/intel/composer_xe_2013.3.163/bin/intel64' >> ~/.bashrc
source ~/.bashrc
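
A quick way to confirm that the PATH change took effect is to look the compilers up from a fresh shell. The following is a minimal Python 3 sketch, not part of the original setup; the function name find_compilers is made up for illustration. It only reports where, if anywhere, ifort, icc and icpc are found on PATH.

```python
import shutil

def find_compilers(names=("ifort", "icc", "icpc")):
    """Map each compiler name to its full path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}

if __name__ == "__main__":
    for name, path in find_compilers().items():
        print("%-6s %s" % (name, path or "not found on PATH"))
```

If any of the three shows up as not found, check that the line really ended up in ~/.bashrc and that the shell has been restarted or the file sourced.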

Testing:
See this post for an example of how to compile nwchem using ifort: http://verahill.blogspot.com.au/2013/07/470-very-briefly-compiling-nwchem-63.html

468. Kernel 3.10 on Debian

NOTE I: As of 3.10.2 the nvidia module still will not build. I've also tried 3.10.5, and it does not work there either.

NOTE II: I'm getting random slowdowns on my SL410 laptop with intel graphics. Not sure if it's the same issue as this: http://verahill.blogspot.com.au/2013/03/368-slow-mouse-and-keyboard-triggered.html
Once kworker shows up in top everything grinds to a slow crawl. I also notice that I never used 3.9 on that laptop, so the issue may be present there too.

There are several ways of building a kernel. The easiest (a purely subjective statement) is to use kernel-package i.e. make-kpkg. However, every now and again I see people writing that it's been deprecated.

Either way, start by doing
sudo apt-get install fakeroot build-essential ncurses-dev
mkdir ~/tmp
cd ~/tmp
wget https://www.kernel.org/pub/linux/kernel/v3.x/linux-3.10.tar.xz
tar xvf linux-3.10.tar.xz
cd linux-3.10/
cat /boot/config-`uname -r`>.config
make oldconfig
Timer tick handling
  1. Periodic timer ticks (constant rate, no dynticks) (HZ_PERIODIC) (NEW)
> 2. Idle dynticks system (tickless idle) (NO_HZ_IDLE) (NEW)
  3. Full dynticks system (tickless) (NO_HZ_FULL) (NEW)
choice[1-3]:
Memory placement aware NUMA scheduler (NUMA_BALANCING) [N/y/?] (NEW)
Simple CPU accounting cgroup subsystem (CGROUP_CPUACCT) [N/y/?] (NEW)
Group CPU scheduler (CGROUP_SCHED) [N/y/?] (NEW)
Automatic process group scheduling (SCHED_AUTOGROUP) [N/y/?] (NEW)
Choose SLAB allocator
  1. SLAB (SLAB) (NEW)
> 2. SLUB (Unqueued Allocator) (SLUB)
choice[1-2?]:
Linux guest support (HYPERVISOR_GUEST) [N/y/?] (NEW)
Timer frequency
  1. 100 HZ (HZ_100)
  2. 250 HZ (HZ_250) (NEW)
  3. 300 HZ (HZ_300)
> 4. 1000 HZ (HZ_1000)
choice[1-4?]:
Memory Hotplug (ACPI_HOTPLUG_MEMORY) [N/y/?] (NEW)
AMD frequency sensitivity feedback powersave bias (X86_AMD_FREQ_SENSITIVITY) [N/m/?] (NEW)
Kernel support for scripts starting with #! (BINFMT_SCRIPT) [Y/n/m/?] (NEW)
InfiniBand media type support (TIPC_MEDIA_IB) [N/y/?] (NEW)
Network Coding (BATMAN_ADV_NC) [N/y/?] (NEW)
NETLINK: mmaped IO (NETLINK_MMAP) [N/y/?] (NEW)
NETLINK: socket monitoring interface (NETLINK_DIAG) [N/m/y/?] (NEW)
Dummy IRQ handler (DUMMY_IRQ) [N/m/y/?] (NEW)
Generic on-chip SRAM driver (SRAM) [N/y/?] (NEW)
ME Enabled Intel Chipsets (INTEL_MEI_ME) [N/m/y/?] (NEW)
Block device as cache (BCACHE) [N/m/y/?] (NEW)
Qualcomm Atheros AR816x/AR817x support (ALX) [N/m/y/?] (NEW)
QLOGIC QLCNIC 83XX family SR-IOV Support (QLCNIC_SRIOV) [Y/n/?] (NEW)
Realtek RTL8152 Based USB 2.0 Ethernet Adapters (USB_RTL8152) [N/m/?] (NEW)
Atheros ath9k rate control (ATH9K_LEGACY_RATE_CONTROL) [N/y/?] (NEW)
rt2800usb - Include support for rt55xx devices (EXPERIMENTAL) (RT2800USB_RT55XX) [N/y/?] (NEW) y
Realtek RTL8188EE Wireless Network Adapter (RTL8188EE) [N/m/?] (NEW)
IMS Passenger Control Unit driver (INPUT_IMS_PCU) [N/m/?] (NEW)
Qualcomm Single-wire Serial Bus Interface (SSBI) (SSBI) [N/m/y/?] (NEW)
Analog Devices ADT7310/ADT7320 (SENSORS_ADT7310) [N/m/y/?] (NEW)
National Semiconductor LM95234 (SENSORS_LM95234) [N/m/?] (NEW)
Nuvoton NCT6775F and compatibles (SENSORS_NCT6775) [N/m/y/?] (NEW)
generic cpu cooling support (CPU_THERMAL) [N/y/?] (NEW) y
ChromeOS Embedded Controller (MFD_CROS_EC) [N/m/y/?] (NEW)
Silicon Laboratories 4761/64/68 AM/FM radio. (MFD_SI476X_CORE) [N/m/?] (NEW)
System Controller Register R/W Based on Regmap (MFD_SYSCON) [N/y/?] (NEW)
TI TPS65912 Power Management chip (MFD_TPS65912) [N/y/?] (NEW)
Conexant cx25821 support (VIDEO_CX25821) [N/m/?] (NEW)
Cypress firmware helper routines (CYPRESS_FIRMWARE) [N/m] (NEW)
QXL virtual GPU (DRM_QXL) [N/m/?] (NEW)
Apple infrared receiver (HID_APPLEIR) [N/m/?] (NEW)
Enable USB persist by default (USB_DEFAULT_PERSIST) [Y/n/?] (NEW)
USB-Wishbone adapter interface driver (USB_SERIAL_WISHBONE) [N/m/?] (NEW)
USB Physical Layer drivers (USB_PHY) [N/y/?] (NEW)
PXA 27x (USB_PXA27X) [N/m/?] (NEW)
MARVELL PXA2128 USB 3.0 controller (USB_MV_U3D) [N/m/?] (NEW)
LED Support for TI LP5562 LED driver chip (LEDS_LP5562) [N/m/?] (NEW)
LED Camera Flash/Torch Trigger (LEDS_TRIGGER_CAMERA) [N/m/y/?] (NEW) y
iSCSI Extentions for RDMA (iSER) target support (INFINIBAND_ISERT) [N/m/?] (NEW)
Set system time from RTC on startup and resume (RTC_HCTOSYS) [Y/n/?] (NEW)
Set the RTC time based on NTP synchronization (RTC_SYSTOHC) [Y/n/?] (NEW)
RTC used to set the system time (RTC_HCTOSYS_DEVICE) [rtc0] (NEW)
WIS GO7007 MPEG encoder support (VIDEO_GO7007) [N/m/?] (NEW)
DesignWare USB2 DRD Core Support (USB_DWC2) [N/m/?] (NEW)
pvpanic device support (PVPANIC) [N/m/y/?] (NEW)
Reset Controller Support (RESET_CONTROLLER) [N/y/?] (NEW)
XFS Verbose Warnings (XFS_WARN) [N/y/?] (NEW)
Btrfs will run sanity tests upon loading (BTRFS_FS_RUN_SANITY_TESTS) [N/y/?] (NEW)
Btrfs debugging support (BTRFS_DEBUG) [N/y/?] (NEW)
EFI Variable filesystem (EFIVAR_FS) [N/m/y/?] (NEW)
torture tests for RCU (RCU_TORTURE_TEST) [N/m/y/?] (NEW)
Ring buffer startup self test (RING_BUFFER_STARTUP_TEST) [N/y/?] (NEW)
Test functions located in the string_helpers module at runtime (TEST_STRING_HELPERS) [N/m/y] (NEW)
CMAC support (CRYPTO_CMAC) [N/m/y/?] (NEW)
SHA256 digest algorithm (SSSE3/AVX/AVX2) (CRYPTO_SHA256_SSSE3) [N/m/y/?] (NEW) y
SHA512 digest algorithm (SSSE3/AVX/AVX2) (CRYPTO_SHA512_SSSE3) [N/m/y/?] (NEW) y
Camellia cipher algorithm (x86_64/AES-NI/AVX2) (CRYPTO_CAMELLIA_AESNI_AVX2_X86_64) [N/m/y/?] (NEW) m
Serpent cipher algorithm (x86_64/AVX2) (CRYPTO_SERPENT_AVX2_X86_64) [N/m/y/?] (NEW) m
KVM legacy PCI device assignment support (KVM_DEVICE_ASSIGNMENT) [Y/n/?] (NEW)
VHOST_SCSI TCM fabric driver (VHOST_SCSI) [N/m/?] (NEW)
make menuconfig

You can now enable any additional modules by navigating the menu structure. Note that most likely you don't have to enable anything in this step, but it can come in handy if there's a major transition (e.g. the way multimedia was handled changed between kernel 3.5 and 3.6) or if you want to enable a previously disabled option.

Then pick either method 1 or 2 below.

If you only want to compile modules that are currently in use (not a good idea if you want to use the same kernel on a range of computers, or have USB devices that aren't currently plugged in) you can do that by using make localmodconfig instead of make oldconfig. I wouldn't recommend it -- in most cases it won't make a faster kernel, and disk space and memory tend not to be much of an issue these days.


Method 1. kernel-package
Below, change -j2 to -jX, where X is the number of cores in your CPU (not cores+1 or anything funny like that; see other posts on this blog for compilation performance tests).

sudo apt-get install kernel-package
make-kpkg clean
time fakeroot make-kpkg -j2 --initrd kernel_image kernel_headers
sudo dpkg -i ../linux-image-3.10.0_3.10.0-10.00.Custom_amd64.deb ../linux-headers-3.10.0_3.10.0-10.00.Custom_amd64.deb
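
The X in -jX can be looked up rather than typed by hand. Here is a minimal Python 3 sketch, assuming os.cpu_count() reports the number of cores you want to use (it counts logical CPUs, so on hyperthreaded machines it may be higher than the physical core count). The helper name kpkg_command is hypothetical, and the script only prints the command rather than running it.

```python
import os

def kpkg_command(jobs=None):
    """Build the make-kpkg command line used above with -jX filled in."""
    if jobs is None:
        jobs = os.cpu_count() or 1  # cpu_count() can return None
    return ["fakeroot", "make-kpkg", "-j%d" % jobs, "--initrd",
            "kernel_image", "kernel_headers"]

if __name__ == "__main__":
    print(" ".join(kpkg_command()))
```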


This took 49 minutes on a 3-core AMD Athlon II, and used ca 7 GB.

The files are shown below:
-rw-r--r--  1 me me 8.4M Jul  1 16:02 linux-headers-3.10.0_3.10.0-10.00.Custom_amd64.deb
-rw-r--r--  1 me me  32M Jul  1 16:00 linux-image-3.10.0_3.10.0-10.00.Custom_amd64.deb


Method 2. make deb-pkg

make clean
time make deb-pkg -j2
sudo dpkg -i ../linux-firmware-image_3.10.0-2_amd64.deb ../linux-headers-3.10.0_3.10.0-2_amd64.deb ../linux-libc-dev_3.10.0-2_amd64.deb ../linux-image-3.10.0_3.10.0-2_amd64.deb

It took ca 50 minutes, used ca 7 GB, and generated the following files:
-rw-r--r-- 1 me me 1.1M Jul  1 16:58 ../linux-firmware-image_3.10.0-2_amd64.deb
-rw-r--r-- 1 me me 9.7M Jul  1 16:58 ../linux-headers-3.10.0_3.10.0-2_amd64.deb
-rw-r--r-- 1 me me 458M Jul  1 17:13 ../linux-image-3.10.0_3.10.0-2_amd64.deb
-rw-r--r-- 1 me me 920K Jul  1 16:58 ../linux-libc-dev_3.10.0-2_amd64.deb

Note the size of the linux-image-3.10.0_3.10.0-2_amd64.deb package.


Difference:
There are a few differences. One method, the kpkg one, is supposedly 'deprecated', but it has been working fine so far and I expect it to keep working, at least for some time.

The make deb-pkg method also generates a much bigger image file -- by a factor of 14 or so (458M vs 32M).
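
The factor follows from the two file listings above: 458M for the make deb-pkg image against 32M for the make-kpkg one. As a quick check in Python 3 (sizes in MB as reported by ls -lh, so the result is only approximate):

```python
# Image sizes in MB, taken from the ls listings for the two methods
kpkg_image_mb = 32.0      # linux-image-3.10.0_3.10.0-10.00.Custom_amd64.deb
debpkg_image_mb = 458.0   # linux-image-3.10.0_3.10.0-2_amd64.deb

ratio = debpkg_image_mb / kpkg_image_mb
print("deb-pkg image is %.1f times larger" % ratio)
```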

Otherwise, the chief difference, from what I can see, shows up if you want to uninstall the kernel. Removing the linux-libc-dev package requires you to downgrade it using apt-get, by specifying a version, e.g.

apt-cache policy linux-libc-dev
linux-libc-dev:
  Installed: 3.10.0-1
  Candidate: 3.10.0-1
  Version table:
 *** 3.10.0-1 0
        100 /var/lib/dpkg/status
     3.9.6-1~bpo70+1 0
        100 http://ftp.iinet.net.au/debian/debian/ wheezy-backports/main amd64 Packages
        100 http://ftp.debian.org/debian/ wheezy-backports/main amd64 Packages
     3.2.46-1 0
        500 http://ftp.iinet.net.au/debian/debian/ wheezy/main amd64 Packages
     3.2.41-2+deb7u2 0
        500 http://security.debian.org/ wheezy/updates/main amd64 Packages
sudo apt-get install linux-libc-dev=3.9.6-1~bpo70+1

I personally prefer the kernel-package approach.

467. wget and tor issue

Ever since I set up Tor on my debian workstation I've been having issues using wget:
wget https://www.kernel.org/pub/linux/kernel/v3.x/linux-3.10.tar.xz
--2013-07-01 11:32:35-- https://www.kernel.org/pub/linux/kernel/v3.x/linux-3.10.tar.xz
Connecting to 127.0.0.1:9050... connected.
Proxy tunneling failed: Tor is not an HTTP ProxyUnable to establish SSL connection.
For fun I also tried torify although I don't want to download the kernel via Tor:

torify wget https://www.kernel.org/pub/linux/kernel/v3.x/linux-3.10.tar.xz
--2013-07-01 11:36:20-- https://www.kernel.org/pub/linux/kernel/v3.x/linux-3.10.tar.xz
Connecting to 127.0.0.1:9050... 11:36:20 libtorsocks(26463): connect: Connection is to a local address (127.0.0.1), may be a TCP DNS request to a local DNS server so have to reject to be safe. Please report a bug to http://code.google.com/p/torsocks/issues/entry if this is preventing a program from working properly with torsocks.
failed: No such file or directory.
Retrying.
Note that I DON'T want to use wget with Tor. I don't want to eat up bandwidth on the Tor network for stuff like this. When I use wget I want to use a direct connection.

I haven't configured /etc/wgetrc and so I was a bit surprised that this kept on happening.

The solution:
edit /etc/wgetrc and put
use_proxy=off

anywhere. And you're done.
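
This post does not establish where the proxy setting came from. One place worth checking (an assumption, not something verified here) is the environment, since wget also honours variables such as http_proxy and https_proxy. Below is a minimal Python 3 sketch that lists any such variables; the function name proxy_variables is made up for illustration.

```python
import os

PROXY_NAMES = ("http_proxy", "https_proxy", "ftp_proxy", "no_proxy")

def proxy_variables(env=None):
    """Return the proxy-related variables that are set in env (default: os.environ)."""
    if env is None:
        env = os.environ
    found = {}
    for key, value in env.items():
        # Match case-insensitively so that HTTP_PROXY etc. are listed as well
        if key.lower() in PROXY_NAMES and value:
            found[key] = value
    return found

if __name__ == "__main__":
    hits = proxy_variables()
    if hits:
        for key, value in sorted(hits.items()):
            print("%s=%s" % (key, value))
    else:
        print("No proxy variables set in this environment")
```

For a one-off direct download, wget --no-proxy followed by the URL should bypass the proxy without editing /etc/wgetrc.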

28 June 2013

466. morph xyz -- python script to morph .xyz files

Rather naively I was hoping that by comparing two molecule .xyz files and generating an average of them I would be able to conveniently generate a half-decent transition state guess.

Turns out that it's not quite that simple. However, I've written the software, so I might as well share it.

Note that it's written in python 2.7 (i.e. not python 3)

Run the script without arguments for help. General usage is
morphxyz -i 1.xyz 2.xyz -o morph.xyz
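
The optional -w switch scales the displacement between the two structures: for each coordinate the script computes one + w*(two - one)/2, so w=0 returns the first structure, w=1 the midpoint and w=2 the second structure. Here is a minimal Python 3 sketch of that formula on a single made-up coordinate pair (the function name interpolate is for illustration only; the script itself is Python 2.7):

```python
def interpolate(one, two, w=1.0):
    """Blend two coordinates the way morphxyz does: one + w*(two - one)/2."""
    return [a + w * (b - a) / 2.0 for a, b in zip(one, two)]

start = [0.0, 0.0, 0.0]   # made-up coordinates of one atom in 1.xyz
end = [1.0, 2.0, -4.0]    # the same atom in 2.xyz

for w in (0.0, 1.0, 2.0):
    print(w, interpolate(start, end, w))
```

Values of w outside 0 to 2 extrapolate beyond either structure.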


So here it is:

morphxyz:
#!/usr/bin/python

import sys

def getvars(arguments):
 switches={}

 version='0.1'
 
 try:
  if "-i" in arguments:
   switches['in_one']=arguments[arguments.index('-i')+1]
   switches['in_two']=arguments[arguments.index('-i')+2]
   print 'Input: %s and %s'% (switches['in_one'],switches['in_two'])
  else:
   arguments="--help";
 except:
  arguments="--help";
  
 try:
  if "-o" in arguments:
   switches['o']=arguments[arguments.index('-o')+1].lower()
   print 'Output: %s'% switches['o']
  else:
   arguments="--help";
 except:
  arguments="--help";

 try:
  if "-w" in arguments:
   switches['w']=float(arguments[arguments.index('-w')+1])
   print 'Weighting: %g'% switches['w']
  else:
   print 'Assuming no weighting'
   switches['w']=1.0;
 except:
  switches['w']=1.0;

 doexit=0
 try:
  if ("-h" in arguments) or ("--help" in arguments):
   print '\t\t morphxyz version %s' % version
   print '\t-i\t two xyz files to morph'
   print '\t-o\t output file'
   print '\t-w\t weight one structure vs the other (1=average; 0=start; 2=end)'
   print 'Exiting'
   doexit=1
 except:
  a=0 # do nothing
 if doexit==1:
  sys.exit(0)

 return switches

def getcmpds(switches):
 
 cmpds={}
 
 g=open(switches['in_one'],'r') 
 n=0
 xyz=[]
 atoms=[]
 
 for line in g:
  n+=1
  line=line.rstrip('\n')
  if n==1:
   cmpds['atoms_one']=int(line)
  elif n==2:
   cmpds['title_one']=line
  else:
   line=line.split(' ')
   line=filter(None,line)
   xyz+=[[float(line[1]),float(line[2]),float(line[3])]]
   atoms+=[line[0].capitalize()]
 cmpds['coords_one']=xyz
 cmpds['elements_one']=atoms
 
 g.close()
 
 g=open(switches['in_two'],'r') 
 n=0
 xyz=[]
 atoms=[]
 
 for line in g:
  n+=1
  line=line.rstrip('\n')
  if n==1:
   cmpds['atoms_two']=int(line)
  elif n==2:
   cmpds['title_two']=line
  else:
   line=line.split(' ')
   line=filter(None,line)
   xyz+=[[float(line[1]),float(line[2]),float(line[3])]]
   atoms+=[line[0].capitalize()]
 cmpds['coords_two']=xyz
 cmpds['elements_two']=atoms
 g.close()
 
 cmpds['w']=switches['w']
 
 return cmpds

def morph(cmpds):
 coords_one=cmpds['coords_one']
 coords_two=cmpds['coords_two']
 
 coords_morph=[]
 coords_diff=[]
 for n in range(0,cmpds['atoms_one']):
  morph_x=coords_one[n][0]+cmpds['w']*(coords_two[n][0]-coords_one[n][0])/2.0
  morph_y=coords_one[n][1]+cmpds['w']*(coords_two[n][1]-coords_one[n][1])/2.0
  morph_z=coords_one[n][2]+cmpds['w']*(coords_two[n][2]-coords_one[n][2])/2.0
  diff_x=coords_two[n][0]-coords_one[n][0]
  diff_y=coords_two[n][1]-coords_one[n][1]
  diff_z=coords_two[n][2]-coords_one[n][2]
  coords_morph+=[[morph_x,morph_y,morph_z]]
  coords_diff+=[[diff_x,diff_y,diff_z]]
 cmpds['coords_morph']=coords_morph
 cmpds['coords_diff']=coords_diff
 return cmpds

def genxyzstring(coords,element):
 x_str='%10.5f'% coords[0]
 y_str='%10.5f'% coords[1]
 z_str='%10.5f'% coords[2]
 
 xyz_string=element+(3-len(element))*' '+10*' '+\
 (8-len(x_str))*' '+x_str+10*' '+(8-len(y_str))*' '+y_str+10*' '+(8-len(z_str))*' '+z_str+'\n'
 
 return xyz_string

def writemorph(cmpds,outfile):
 g=open(outfile,'w') 
 h=open('diff.xyz','w')
 g.write(str(cmpds['atoms_one'])+'\n'+'\n')
 h.write(str(cmpds['atoms_one'])+'\n'+'\n')
 
 for n in range(0,cmpds['atoms_one']):
  coords=cmpds['coords_morph'][n]
  diffcoords=cmpds['coords_diff'][n]
  
  g.write(genxyzstring(coords, cmpds['elements_one'][n]))
  h.write(genxyzstring(diffcoords, cmpds['elements_one'][n]))
    
 g.close()
 h.close()
 return 0

if __name__=="__main__":
 arguments=sys.argv[1:len(sys.argv)]
 switches=getvars(arguments)
 cmpds=getcmpds(switches)
 
 if cmpds['atoms_one']!=cmpds['atoms_two']:
  print 'The number of atoms differ. Exiting'
  sys.exit(1)
 elif cmpds['elements_one']!=cmpds['elements_two']:
  print 'The types of atoms differ. Exiting'
  sys.exit(1)
  
 cmpds=morph(cmpds)
 success=writemorph(cmpds,switches['o'])
 if success==0:
  print 'Conversion seems successful'
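
The reader in getcmpds splits each line on single spaces, so a tab-separated file or a trailing blank line will trip it up with an IndexError. Here is a minimal Python 3 sketch of a more forgiving reader, assuming the usual .xyz layout (atom count, title line, then element x y z on each line). The name read_xyz is hypothetical and not part of the script above.

```python
def read_xyz(path):
    """Return (title, elements, coords) from a simple .xyz file."""
    with open(path) as handle:       # closed automatically on exit
        lines = handle.read().splitlines()
    natoms = int(lines[0])
    title = lines[1] if len(lines) > 1 else ""
    elements, coords = [], []
    for line in lines[2:]:
        fields = line.split()        # any run of spaces or tabs
        if len(fields) < 4:
            continue                 # skip blank or malformed lines
        elements.append(fields[0].capitalize())
        coords.append([float(v) for v in fields[1:4]])
    if len(elements) != natoms:
        raise ValueError("expected %d atoms, found %d" % (natoms, len(elements)))
    return title, elements, coords
```

Comparing the atom count in the header with the number of coordinate lines actually read catches truncated files before any morphing is attempted.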