29 September 2012

249. Quick but precise isotopic pattern (isotope envelope) calculator in Octave

UPDATE: Below is an accurate calculator, but it is impractically slow for large molecules. A practical AND accurate calculator is found here: http://verahill.blogspot.com.au/2012/10/isotopic-pattern-caculator-in-python.html

Use the post below to learn about the fundamental theory, but then look at the other post to understand how to implement it.

Old post:
Getting fast and accurate isotopic patterns can be tricky using the tools that are available online, for download, or as part of commercial packages. A particular problem is that different tools give slightly different values -- so which one to trust?

The answer: the tool for which you know that the algorithm is sound.

The extreme conclusion of that way of thinking is to write your own calculator.
Below is the conceptual process of calculating the isotopic pattern of a molecule using GNU Octave.

You need the linear algebra package:
sudo apt-get install octave octave-linear-algebra

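b is the isotopic distribution for an element, and bb are the masses of those isotopes. Note that the heavier isotope is entered here as the mass of the lightest isotope plus a whole number of mass units, which puts every peak on a unit-spaced grid (Matt Monroe's output below uses the same spacing). The true masses differ slightly, e.g. 36.96590 for 37Cl and 7.01600 for 7Li.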

Once you've got a computational engine it's not too difficult to expand it for more general cases, account for charge, and instrument resolution.


Molecule: Cl4

b=[0.7578,0.2422];
bb=[34.96885,36.96885];
e=prod(cartprod(b,b,b,b),2);
ee=sum(cartprod(bb,bb,bb,bb),2);
n=4;
g=histc([ee e],linspace(min(ee),max(ee),n*(max(ee)-min(ee)+1)),2);
h=linspace(min(ee),max(ee),n*(max(ee)-min(ee)+1));
distr=e'*g;
plot(h,100.*distr/max(distr))
[h' (100.*distr/max(distr))']
Here's the output for n=1 (the listing above uses n=4, which gives a finer mass grid):
   139.87540    78.22048
   140.87540     0.00000
   141.87540   100.00000
   142.87540     0.00000
   143.87540    47.94141
   144.87540     0.00000
   145.87540    10.21502
   146.87540     0.00000
   147.87540     0.81620

And here's the output from Matt Monroe's calculator:
Isotopic Abundances for Cl4
  Mass/Charge Fraction  Intensity
   139.87541 0.3297755   78.22
   140.87541 0.0000000    0.00
   141.87541 0.4215974  100.00
   142.87541 0.0000000    0.00
   143.87541 0.2021197   47.94
   144.87541 0.0000000    0.00
   145.87541 0.0430662   10.22
   146.87541 0.0000000    0.00
   147.87541 0.0034411    0.82
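
For readers who don't have Octave at hand, the same brute-force enumeration can be sketched in Python with only the standard library. This is a rough equivalent of the cartprod/prod/sum steps above, not a port of the exact binning: itertools.product plays the role of cartprod, and the function name brute_force_pattern and the rounding to 5 decimals (used to merge combinations with the same mass) are my own assumptions.

```python
from collections import defaultdict
from itertools import product
from math import prod

# Abundances and masses copied from the Octave example above.
CL_ABUNDANCE = [0.7578, 0.2422]
CL_MASS = [34.96885, 36.96885]

def brute_force_pattern(atoms, decimals=5):
    """atoms: one (abundances, masses) pair per atom in the molecule.
    Enumerates every isotope combination and returns {mass: probability}."""
    per_atom = [list(zip(ab, ms)) for ab, ms in atoms]
    pattern = defaultdict(float)
    for combo in product(*per_atom):
        # Rounding merges combinations that have the same total mass.
        mass = round(sum(m for _, m in combo), decimals)
        pattern[mass] += prod(p for p, _ in combo)
    return dict(pattern)

cl4 = brute_force_pattern([(CL_ABUNDANCE, CL_MASS)] * 4)
top = max(cl4.values())
for mass in sorted(cl4):
    print(f"{mass:10.5f} {100 * cl4[mass] / top:10.5f}")
```

This prints only the five non-zero peaks. The empty odd-mass rows in the tables above come from the unit-spaced grid, which this sketch doesn't use.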


Another molecule: Li2Cl2

Here's the code:
a=[0.0759,0.9241];
aa=[6.01512,7.01512];
b=[0.7578,0.2422];
bb=[34.96885,36.96885];
e=prod(cartprod(a,a,b,b),2);
ee=sum(cartprod(aa,aa,bb,bb),2);
n=1;
g=histc([ee e],linspace(min(ee),max(ee),n*(max(ee)-min(ee)+1)),2);
h=linspace(min(ee),max(ee),n*(max(ee)-min(ee)+1));
distr=e'*g;
plot(h,100.*distr/max(distr))
[h' (100.*distr/max(distr))']

ans =

    81.96794     0.67170
    82.96794    16.35626
    83.96794   100.00000
    84.96794    10.45523
    85.96794    63.71604
    86.96794     1.67079
    87.96794    10.17116

vs Matt Monroe's calculator:
Isotopic Abundances for Li2Cl2
  Mass/Charge Fraction  Intensity
    81.96795 0.0033082    0.67
    82.96795 0.0805564   16.36
    83.96795 0.4925109  100.00
    84.96795 0.0514932   10.46
    85.96795 0.3138084   63.72
    86.96795 0.0082288    1.67
    87.96795 0.0500941   10.17
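
As the update at the top of this post says, the Cartesian-product approach gets slow quickly: every extra atom multiplies the number of combinations by the number of isotopes of that element. One common way around this is to add the atoms one at a time and merge identical masses after each step, which is a repeated convolution. The Python sketch below illustrates that idea. I'm not claiming it's what the linked Python post does, and the function names and the rounding to 5 decimals are assumptions of mine.

```python
from collections import defaultdict

def add_atom(pattern, abundances, masses, decimals=5):
    """Convolve an existing {mass: probability} pattern with one more atom."""
    out = defaultdict(float)
    for mass, prob in pattern.items():
        for p, m in zip(abundances, masses):
            # Equal masses are merged here, so the table stays small.
            out[round(mass + m, decimals)] += prob * p
    return dict(out)

def pattern_by_convolution(composition, decimals=5):
    """composition: list of (abundances, masses, count) tuples."""
    pattern = {0.0: 1.0}
    for abundances, masses, count in composition:
        for _ in range(count):
            pattern = add_atom(pattern, abundances, masses, decimals)
    return pattern

# Values copied from the Octave example above.
LI = ([0.0759, 0.9241], [6.01512, 7.01512])
CL = ([0.7578, 0.2422], [34.96885, 36.96885])

li2cl2 = pattern_by_convolution([(*LI, 2), (*CL, 2)])
top = max(li2cl2.values())
for mass in sorted(li2cl2):
    print(f"{mass:10.5f} {100 * li2cl2[mass] / top:10.5f}")
```

The table of intermediate peaks only grows with the mass range instead of exponentially with the number of atoms, which is what makes this practical for larger molecules.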

We can then expand the code to plot a Gaussian-broadened spectrum instead of a stick spectrum:
a=[0.0759,0.9241];
aa=[6.01512,7.01512];
b=[0.7578,0.2422];
bb=[34.96885,36.96885];
e=prod(cartprod(a,a,b,b),2);
ee=sum(cartprod(aa,aa,bb,bb),2);
n=1;

g=histc([ee e],linspace(min(ee),max(ee),n*(max(ee)-min(ee)+1)),2);
h=linspace(min(ee),max(ee),n*(max(ee)-min(ee)+1));
distr=e'*g;
gauss= @(x,c,r,s) r.*1./(s.*sqrt(2*pi)).*exp(-0.5*((x-c)./s).^2);
k=100.*distr/max(distr);

npts=1000;
resolution=0.25;

x=linspace(min(ee)-1,max(ee)+1,npts);
l=cumsum(gauss(x,h',k',resolution));
l=100*l./max(l(rows(l),:));
plot(x,l(rows(l),:))

which gives:

Compare with Matt Monroe's calculator:
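
The broadening step on its own can also be sketched in plain Python. Note that in the Octave code the variable called resolution is used as the standard deviation of each Gaussian, and the sketch below keeps that meaning under the name sigma. The function names and the use of the Li2Cl2 stick values from the output above are my own choices for illustration.

```python
from math import exp, pi, sqrt

def gaussian(x, centre, area, sigma):
    """Gaussian with the given area, centred on one stick of the pattern."""
    return area / (sigma * sqrt(2 * pi)) * exp(-0.5 * ((x - centre) / sigma) ** 2)

def broaden(sticks, sigma=0.25, npts=1000, pad=1.0):
    """sticks: list of (mass, intensity). Returns (xs, ys) with ys scaled to 100."""
    lo = min(m for m, _ in sticks) - pad
    hi = max(m for m, _ in sticks) + pad
    step = (hi - lo) / (npts - 1)
    xs = [lo + i * step for i in range(npts)]
    # Each point on the curve is the sum of the Gaussians from every stick.
    ys = [sum(gaussian(x, m, k, sigma) for m, k in sticks) for x in xs]
    top = max(ys)
    return xs, [100 * y / top for y in ys]

# Li2Cl2 stick spectrum taken from the Octave output above.
sticks = [(81.96794, 0.67170), (82.96794, 16.35626), (83.96794, 100.0),
          (84.96794, 10.45523), (85.96794, 63.71604), (86.96794, 1.67079),
          (87.96794, 10.17116)]
xs, ys = broaden(sticks)
```

xs and ys can then be handed to whatever plotting tool you prefer (gnuplot, for instance, after writing them to a two-column file).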

28 September 2012

248. Matt Monroe's Molecular Weight Calculator under Wine on Linux

I've downloaded the source code to Matt Monroe's molecular weight calculator in the past, and having replaced wsearch32 (+wine) with OpenChrom I figured I'd go online, download it and see what Mono can do for me. I had a vague recollection that the source code was only freely available online for a short while, and as it turns out I couldn't find it this time.

Anyway, not finding the source code I decided to update my Molecular Weight calculator from version 6.46 to 6.49 which (finally) allows you to set the charge of an ion WITHOUT having the mass of a H+ added for each charge. It's not difficult to compensate for, but it's always confusing to new students.
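
The difference between the two charge conventions is simple arithmetic, sketched here in Python. The function names are made up for illustration, the proton mass is the usual approximate value in u, and the mass of the removed or added electrons is ignored in the first function to keep things short.

```python
PROTON_MASS = 1.007276  # approximate mass of H+ in u

def mz_plain(formula_mass, charge):
    """m/z when the formula already describes the ion: just divide by |z|.
    Electron masses are ignored in this sketch."""
    return formula_mass / abs(charge)

def mz_protonated(neutral_mass, charge):
    """m/z when one H+ is added per positive charge, i.e. [M + zH]z+."""
    return (neutral_mass + charge * PROTON_MASS) / charge

# A made-up species with mass 1000 u, observed as a 2+ ion.
print(mz_plain(1000.0, 2))       # 500.0
print(mz_protonated(1000.0, 2))  # roughly 501.007
```

For inorganic species whose formula already carries the charge, the first convention is the one you want, which is why the new option is welcome.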

1. Install Wine and winetricks, add dlls
You can either install wine from the repos (old version)
sudo apt-get install wine-bin

Or you can download a newer, unstable version from dev.carbon-project.org:
http://verahill.blogspot.com.au/2012/01/debian-testingwheezy-64-bit-installing.html

Or you can compile your own:
http://verahill.blogspot.com.au/2013/01/308-compiling-wine-1521-on-debian.html



The mono step was a right headache and would fail unless I nuked everything winetricks and wine knew about each other.

To get winetricks and set everything up:

sudo apt-get install cabextract
wget http://winetricks.org/winetricks
sudo mv winetricks /usr/local/bin/
sudo chmod +x /usr/local/bin/winetricks
wget http://downloads.sourceforge.net/project/wine/Wine%20Mono/0.0.4/wine-mono-0.0.4.msi
wine msiexec /i wine-mono-0.0.4.msi
You're now asked whether to download and install mono...sigh...more often than not this has failed in the past.
winetricks vcrun6sp6
Download the file from the browser window that just opened
cd ~/.cache/winetricks/vcrun6sp6 
mv ~/Downloads/Vs6sp6.exe .
winetricks vcrun6sp6
winetricks corefonts
winetricks riched30
wine uninstaller --remove '{E45D8920-A758-4088-B6C6-31DBB276992E}'
winetricks dotnet20
cd ~/.cache/winetricks/dotnet20/
mv ~/Downloads/dotnetfx.exe .
winetricks dotnet20
Ignore this error. Installation will take a while after that. Have patience. Like 10 minutes kind of patience.
And finally:
winetricks wsh57


2. Download the molecular weight calculator and install
If you go to http://www.alchemistmatt.com/mwtwin.html
you get redirected to here: http://omics.pnl.gov/software/MWCalculator.php


cd ~/tmp
mkdir molw
cd molw/
wget http://omics.pnl.gov/installers/MolecularWeightCalculator_Installer.zip
wget http://omics.pnl.gov/installers/MwtWinDll_SourceAndSupportingDLLs.zip
ls *.zip|xargs -I {} unzip {}
unzip MwtWinDll_Source_v3.4.4518.zip

You'll get some warning about Revisionhistory.txt etc. being overwritten. That's fine.

Launch the install with
wine msiexec /i MolecularWeightCalculator.msi




If you try to launch the mol weight calculator at this point you'll get an error about a missing MwtWinDLL.dll:

So sort that out:

cd ~/tmp/molw/bin
regsvr32 MwtWinDll.dll
Successfully registered DLL MwtWinDll.dll
[If I tried to copy the dll to the wine structure first and register that copy I got:
DllRegisterServer not implemented in DLL C:\windows\system\MwtWinDll.dll]
If it seems weird to install wine-mono and then remove it as is done above, it's to get around a bug which causes dotnet20 installation to fail.


Anyway, you're pretty much done:
 wine ~/.wine/drive_c/Program\ Files/Molecular\ Weight\ Calculator/mwtwin.exe

Yay!






Comment:
Getting there was a bit of a trek, passing through a whole lot of different sets of dlls:
winetricks msflxgrd
winetricks vcrun2005
winetricks vb6run
winetricks mdac28
winetricks comctl32ocx
winetricks comctl32

The solution above should suffice though.

I even ended up installing mol weight calc on a windows box and using dependency walker, but not even that sorted it out -- googling for scrrun did it in the end.

In particular this last error was bloody annoying:
"Object doesn't support this action." What, saving?

"Error saving default options file. Use the /X switch at the command line to prevent this error."
But it was solved by doing winetricks wsh57

27 September 2012

247. Setting up Openchrom (and using it to open Agilent .D ESI-MS files on Linux)

I've been using Wsearch (http://www.wsearch.com.au/wsearch32/wsearch32.htm) to process Agilent ChemStation ESI-MS spectra for the past few years. It and Matt Monroe's Molecular Weight calculator (why, oh why is there no comparable molecular weight calculator for linux?) have been the only two reasons why I've bothered with Wine under Linux. OpenChrom is written in Java and so will run on both Good (Linux) and Evil (OS X and Win) operating systems.

Having finally discovered openchrom (v 0.6 so still early days) I can now finally retire wsearch from my own computers (it's still a good piece of software, but it's crippled to encourage the purchasing of a 'full' version, and I've had no luck purchasing a license from the author in spite of having tried several times during the past couple of years). OpenChrom can export an entire agilent experiment as a '3D' csv file which makes processing a lot more fun.

As an aside: I hate proprietary file formats since they prevent me from using my own tools (cat, sed, gawk, gnuplot, octave) when processing -- or at a minimum make it more difficult. Most universities and grant agencies now add a provision regarding data management in their grant acceptance agreements/work conduct policies. In general these provisions state that the data shall be made available publicly and/or managed by a university repository. What is REALLY missing is a clause about using open formats -- and that should be taken into account when acquiring new instrumentation. All else being equal, an instrument which is 'open' will be a lot cheaper to manage in the long run since you won't have to feel locked in in terms of software. That's incidentally a reason why I like Metrohm, since they provide details of their RS-232 interface, allowing you to write your own software.

Anyway, here's how to get set up:


1. Install Java v1.7 (need > 1.6)
You can either use openjdk 7 or (Oracle) Java. See here for a general guide to installing Oracle/Sun Java.

As for openjdk, you can easily install it:
sudo apt-get install openjdk-7-jdk

(the openjdk-7-jre package is enough if you don't want the full developer's kit)

Anyway.

Make sure that you've selected the right version:
 sudo update-alternatives --config java
There are 7 choices for the alternative java (providing /usr/bin/java).

  Selection    Path                                            Priority   Status
------------------------------------------------------------
  0            /usr/lib/jvm/java-6-openjdk-amd64/jre/bin/java   1061      auto mode
  1            /usr/bin/gij-4.4                                 1044      manual mode
  2            /usr/bin/gij-4.6                                 1046      manual mode
  3            /usr/bin/gij-4.7                                 1047      manual mode
  4            /usr/lib/jvm/j2re1.6-oracle/bin/java             314       manual mode
  5            /usr/lib/jvm/j2sdk1.6-oracle/jre/bin/java        315       manual mode
  6            /usr/lib/jvm/java-6-openjdk-amd64/jre/bin/java   1061      manual mode
 *7            /usr/lib/jvm/java-7-openjdk-amd64/jre/bin/java   1051      manual mode
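
If you'd rather check the version from a script than by eye, something along these lines would do. It's only a sketch: the function names are made up, the example banners are illustrative rather than captured from a real machine, and the regular expression assumes the usual version "x.y..." banner layout. java -version prints to stderr, which is why that stream is the one captured.

```python
import re
import subprocess

def parse_java_version(banner):
    """Extract (major, minor) from the text printed by `java -version`."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', banner)
    if match is None:
        raise ValueError(f"no version string in: {banner!r}")
    return int(match.group(1)), int(match.group(2) or 0)

def java_is_new_enough(banner, minimum=(1, 7)):
    """True if the banner reports at least the minimum (major, minor)."""
    return parse_java_version(banner) >= minimum

def installed_java_banner():
    """Return what `java -version` prints (it goes to stderr, not stdout)."""
    result = subprocess.run(["java", "-version"], capture_output=True, text=True)
    return result.stderr

# Illustrative banners only.
print(java_is_new_enough('java version "1.6.0_24"'))  # False
print(java_is_new_enough('java version "1.7.0_03"'))  # True
```

On a machine with Java installed, java_is_new_enough(installed_java_banner()) gives the answer for whichever alternative is currently selected.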



2. Get openchrom
cd ~/tmp
wget http://sourceforge.net/projects/openchrom/files/REL-0.6.0/openchrom_linux.gtk.x86_64_0.6.0.zip
unzip openchrom_linux.gtk.x86_64_0.6.0.zip
cd linux.gtk.x86_64/OpenChrom/
sudo mkdir /opt/openchrom
sudo chown $USER /opt/openchrom 
cp -R * /opt/openchrom
chmod +x /opt/openchrom/openchrom

Stick

alias openchrom='/opt/openchrom/openchrom'

in your ~/.bashrc and source it.




3. Get plugins
On first boot you're asked whether you want to get additional plugins using the 'Openchrom marketplace'. Since I'm mainly processing data from an Agilent ESI-MS, I wanted the plugin for Agilent files. The website says that you need a license key for plugins BUT that it's free to register for one.

This is a 30-days trial version. Afterwards, you need a valid serial key.
You can get a free serial key after registration on http://www.openchrom.net.
You can use the converter for commercial or non-commercial purposes free of charge, but you are not allowed to redistribute this software without my permission.

Note that clicking on links on the website didn't lead me to a link to download the plugin. Instead, in OpenChrom click on the Plug-ins menu:






As always, make sure you trust your suppliers.


And then you're done installing.

There's nothing odd about registering other than this: you will receive an email with a confirmation of your registration in clear text WITH YOUR PASSWORD. So...be aware of that.


4. You can now browse in the tree to the left and select your .D folder:

There's a bit of clever thinking when it comes to the functionality of the program. The upside of this I think will eventually be that it's easy to get a consistent experience for a set of users (not unimportant for a research group). The downside is that it's a bit clunky getting started. Play with it for an hour and you'll get the hang of it, so it's not really that much of a hurdle. Also, too many options seem to be context sensitive -- I am having real trouble finding various options under the 'Accurate' perspective which I can find under the 'default' perspective.


5. Some comments:

It's still early days for OpenChrom (v 0.6) , and there are a few minor issues which may or may not affect you:

* Registration keys. They are easy enough to get (register online, log in, click on the plug-in you want and you'll see the key), but if you have installed a new plug-in and open your first spectrum right after that you'll be asked for registration keys. The dialogue won't tell you which plug-in it is asking about though, so if you've just installed three different plug-ins you'll have to do some trial-and-error. This is fixed in the upcoming version.

* Raw/gaussian plot of mass spectrum. This took a while to figure out, but you have to use perspectives. The default (heavily zoomed in) view looks like this:
This might be good enough for those organic types...us 108 element inorganic types want more detail
If you go to the top right corner, click on 'other' (perspectives)
and select accurate you get
Bingo!
And then there's the obligatory wish list:

* A good quality isotopic pattern calculator would be nice. Anyone who has compared the output from different pieces of software will have discovered that different calculators may yield very different patterns. I think some of it boils down to truncation rather than incorrect isotopic ratios, but that just highlights how difficult it can be to implement a seemingly simple concept. The only calculator which I trust AND find useful is Matt Monroe's calculator -- the predicted patterns look good, and you get proper Gaussian broadening which means that it looks 'right'. This would be perfect as a plug-in. If only I knew how to properly implement it...

* A good quality ion generator -- some pieces of software (Hi Matt) allow you to select a handful of elements or fragments, pick a range of charges and input an m/z value, and then spit out a list of possible identities for your signal. It's a good thing to have by your side the first time you look at a complex mixture trying to figure out what products may be present. This would be perfect as a plug-in. I've written this type of program before, but in python using for-loops...a vectorized version should be faster and maybe even easier to write.
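
To make the ion generator idea concrete, here is a minimal sketch of the loop-based kind in Python. It's neither the vectorized version wished for above nor how Matt Monroe's program works internally. The function name, the tolerance and the fragment masses (monoisotopic 7Li and 35Cl) are assumptions for the example, and electron masses are ignored.

```python
from itertools import product

def candidate_ions(fragments, max_counts, charges, target_mz, tol=0.01):
    """fragments: {name: mass}; max_counts: {name: max copies};
    charges: iterable of non-zero integers.
    Returns a list of (error, charge, counts, m/z), best match first."""
    names = list(fragments)
    hits = []
    # Try every combination of fragment counts up to the given maxima.
    for counts in product(*(range(max_counts[n] + 1) for n in names)):
        if not any(counts):
            continue  # skip the empty combination
        mass = sum(c * fragments[n] for c, n in zip(counts, names))
        for z in charges:
            mz = mass / abs(z)
            if abs(mz - target_mz) <= tol:
                hits.append((abs(mz - target_mz), z, dict(zip(names, counts)), mz))
    return sorted(hits, key=lambda hit: hit[0])

hits = candidate_ions({"Li": 7.01600, "Cl": 34.96885},
                      {"Li": 4, "Cl": 4},
                      charges=[1, 2], target_mz=49.00085)
for error, z, counts, mz in hits:
    print(z, counts, round(mz, 5))
```

With these inputs both Li2Cl (1+) and Li4Cl2 (2+) match the same m/z, which is exactly the kind of ambiguity that the isotopic pattern then has to resolve.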