22 May 2012

159. PES scanning of methanol bonds, angles, torsion using nwchem, nwgeom and python

NOTE: there's something dodgy with the potential/bond length plots -- the optimal bond lengths come out way too long. I'll leave this post up here anyway, but be WARNED.

This is an overview of the general idea together with some explicit examples -- more of a sketch than a step-by-step account.

Today's molecule is Methanol.

0. ecce and nwgeom/python
You need to set PYTHONPATH to /opt/nwchem/nwchem-6.1/contrib/python in order for nwchem to find the nwgeom module. That's easy enough on a local system, since ~/.bashrc is read -- but ~/.bashrc won't be read on remote systems. For those you need to edit your CONFIG.<machine> files (in ~/.ECCE on your main node) -- add
setup {
     setenv PYTHONPATH /opt/nwchem/nwchem-6.1/contrib/python
}
and/or

NWChemEnvironment {
          PYTHONPATH /opt/nwchem/nwchem-6.1/contrib/python
}      
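On the local system, the corresponding line in ~/.bashrc is just:

export PYTHONPATH=$PYTHONPATH:/opt/nwchem/nwchem-6.1/contrib/python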

1. Draw the molecule
Draw the carbon, then the oxygen, then the protons on the carbon, then the protons on the oxygen. Basically, draw the backbone first, then add the protons by hand. Turn it into a residue-based system (under 'build') and optimise the structure using e.g. RHF/6-31G* (this was written with MM/FF parametrisation in mind -- for simple scanning, just do whatever you want).

2. Calculate the partial charges (RHF/6-31G*)
You can e.g. constrain the methyl groups, or force all the methyl protons to be equal. It's a bit of a soft science, really. After the calc has finished, assign charges. I can't claim to understand which method is better (RESP, CRESP, CRESP2 etc.).

 (this was written with MM/FF parametrisation in mind -- for simple scanning, skip this step)

Here's CRESP2 (some variability...):
C -0.721
O -0.368
H 0.240 (methyl)
H 0.240 (methyl)
H 0.240 (methyl)
H 0.368 (hydroxyl)

Also, assign (atom) Types (CT, OH, HC, HC, HC, HO) -- this is done by hand. Pick atom table, select residue view (or something similar), and fill in the Type column.

Then click on Tools, check Residue table, click on the menu-looking icon in the residue view, and select Write fragment. Make sure you put the fragment file in a place where ecce and nwchem will find it (e.g. amber_u). For some reason I can't get ecce to actually change the name of my residues, so edit the fragment file by hand and change all instances of UNK to the same name as the fragment file, e.g. TST if you called it TST.frg.
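A quick way of doing that substitution (assuming you did call the file TST.frg):

sed -i 's/UNK/TST/g' TST.frg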


3. Write down the bonds, angles and dihedral angles.
Bonds (Angstrom)
H-C 1.087
C-O 1.400
O-H 0.946

Angles (degrees)
H-O-C 109.467
O-C-H 112.039
H-C-H 108.682

Torsion (degrees)
H-C-O-H -61.229

When you scan the parameters using python you want to
1) check that the lowest-energy conformation makes sense, and
2) not deviate too much from the ideal angle/bond/torsion.

Things get weird if you do.

4. Try to determine bond strength
This is best done outside of ecce, and you really should have compiled nwchem with python support to make this easier.

Copy the input file you used for the ESP calc. Call it bonds.nw. Remove anything about esp and all task directives, then add:
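Something along these lines should do. This is a minimal sketch rather than the exact input: the atom numbering (O=1, C=2, methyl protons 3-5, hydroxyl proton 6 -- deduced from the atom pairs mentioned below), the 1.2-2.2 A scan range, and the choice of scf as the theory are all my assumptions, so check them against your own geometry listing.

python
# sketch: scan the C-O bond (atoms 1-2; assumed numbering -- ecce may reorder
# atoms, so check your geometry listing) from 1.2 to 2.2 A in 0.1 A steps
results = []
for i in range(11):
    r = 1.2 + i*0.1
    # adjust the 1-2 distance via the zcoord block; 'constant' freezes it
    input_parse('''
      geometry adjust
        zcoord
          bond 1 2 %f co constant
        end
      end
    ''' % r)
    en = task_energy('scf')    # task_optimize('scf') would give a relaxed scan instead
    results.append((r, en))

# print a two-column table that's easy to grep out and plot
print '# r(A)      E(hartree)'
for (r, en) in results:
    print '%.2f    %.6f' % (r, en)
end

task python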


Technically I think a harmonic fit of the bond only really needs a couple of data points near the minimum, unless you want to fit something like a Morse potential to capture the dissociation behaviour, but it certainly looks neater getting the full curve.

Then run using
mpirun -n 2 nwchem bonds.nw | tee bond12.out

You can also do it in ecce -- but the plotting will have to be done by hand (I open the output file with vim, select the energy/atom-distance columns, :w bond.dat, and then plot with gnuplot)
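For the gnuplot part, something like the following works. It's a sketch assuming two whitespace-separated columns (bond length, then energy) in bond.dat; the Morse form and the starting guesses are my own choices (c is roughly the RHF/6-31G* energy of methanol in hartree):

# fit a Morse potential to the scan and plot data plus fit
De = 0.2 ; a = 1.0 ; r0 = 1.4 ; c = -115.0    # rough starting guesses
f(x) = De*(1 - exp(-a*(x - r0)))**2 + c
fit f(x) 'bond.dat' via De, a, r0, c
plot 'bond.dat' using 1:2 with points, f(x) with lines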

Do the same for atom pairs 1-6 and 2-3. Make sure to pull the atoms far enough apart that the energy tails off (the 1-6 pair in the figure needs to be separated more)


5. Angles
As you'll discover, it's not just a matter of throwing in random numbers and scanning -- if you don't collect enough points, or if your first point is far away from the optimal angle, the data will look very odd.
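To make that concrete, here's a sketch of an angle scan in the same style as the bond scan -- the O-C-H angle, again with my assumed numbering (O=1, C=2, methyl protons 3-5), bracketing the optimised value of ~112 degrees from step 3:

python
# sketch: scan the O-C-H angle (atoms 1-2-3, central atom in the middle;
# assumed numbering) from 92 to 132 degrees in 5 degree steps
results = []
for i in range(9):
    ang = 92.0 + i*5.0
    input_parse('''
      geometry adjust
        zcoord
          angle 1 2 3 %f och constant
        end
      end
    ''' % ang)
    en = task_energy('scf')
    results.append((ang, en))

print '# angle(deg)  E(hartree)'
for (ang, en) in results:
    print '%.1f    %.6f' % (ang, en)
end

task python

Starting close to the optimised value and bracketing it on both sides is what keeps the data from looking odd.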




6. Torsion
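Same idea again -- a sketch of scanning the H-C-O-H dihedral through a full turn (atom numbering as before is my assumption: methyl H=3, C=2, O=1, hydroxyl H=6). For torsional profiles a relaxed scan with task_optimize is the more common choice:

python
# sketch: scan the H-C-O-H torsion (atoms 3-2-1-6; assumed numbering) from
# -180 to 180 in 30 degree steps; methanol's three-fold rotor should give
# three equivalent minima
results = []
for i in range(13):
    t = -180.0 + i*30.0
    input_parse('''
      geometry adjust
        zcoord
          torsion 3 2 1 6 %f hcoh constant
        end
      end
    ''' % t)
    en = task_optimize('scf')    # relax everything else at each dihedral value
    results.append((t, en))

print '# torsion(deg)  E(hartree)'
for (t, en) in results:
    print '%.1f    %.6f' % (t, en)
end

task python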






21 May 2012

158. Setting up ecce with qsub at An Australian University computational cluster

EDIT: this works for G09 on that particular cluster. Come back in a week or two for a more general solution (end of May 2012/beginning of June 2012).

I don't feel comfortable revealing where I work. But imagine that you end up working at an Australian university in, say, Melbourne. I do recognise that I'm giving enough information here to make it possible to identify who I am (and there are many reasons not to want to be identifiable -- partly because students can be mean and petty, and partly because I suffer from the delusion that IT rules apply to Other People, and not me -- and I have described ways of doing things you're not supposed to be doing in this blog).

Anyway.

My old write-ups of ecce are pretty bad, if not outright inaccurate. In spite of that, I presume you've managed to set up ECCE well enough to run stuff on nodes of your local cluster.

Now it's time for the next level -- submitting to a remote site using SGE/qsub.

So far I've only tried this out with G09 -- they are currently looking at setting up nwchem on the university cluster. I'm not sure what the best approach to the "#$ -pe g03_smp2 2" switch is for nwchem.

-- START HERE --

EVERYTHING I DESCRIBE IS DONE ON YOUR DESKTOP, NOT ON THE REMOTE SYSTEM. Sorry for shouting, but don't go a-messing with the remote computational cluster -- we only want to teach ecce how to submit jobs remotely. The remote cluster should be unaffected.

1. Creating the Machine
To set up a site with a queue manager, start
ecce -admin

Do something along the lines of what's shown in the figure above.

If you're not sure whether your qsub belongs to PBS or SGE, type qstat -help and look at the first line returned, e.g. SGE 6.2u2_1.

2. Configure the site
Now, edit your ecce-6.3/apps/siteconfig/CONFIG.msgln4 (local nodes go into ~/.ECCE, but remote SITES go in apps/siteconfig -- and that's what we're working with here).

   NWChem: /usr/local/bin/NWCHEM
   Gaussian-03: /usr/local/bin/G09
   perlPath: /usr/bin/perl
   qmgrPath: /usr/bin/qsub
 
   SGE {
   #$ -S /bin/csh
   #$ -cwd
   #$ -l h_rt=$wallTime
   #$ -l h_vmem=4G
   #$ -j y
   #$ -pe g03_smp2 2

   module load gaussian/g09
   }
A word of advice -- open the file in vim (save using :wq!) or do a chmod +w on it first since it will be set to read-only by default.


3. Queue limits
The same goes for the next file, which controls various job limits, ecce-6.3/apps/siteconfig/msgln4.Q:
# Queue details for msgln4
Queues:    squ8

squ8|minProcessors:       2
squ8|maxProcessors:       6
squ8|runLimit:       4320
squ8|memLimit:       4000
squ8|scratchLimit:       0

4. Connect
In the ecce launcher-mathingy click on Machine Browser, and Set Up Remote Access for the remote cluster. Basically, type in your user name and password.

Click on machine status to make sure that it's connecting.

5. Test it out!
If all is well you should be good to go.

157. Restarting a gaussian (g09) job on an SGE system (qsub)

The Australian University I'm working at has a computational cluster where jobs are submitted using qsub. This post is more like a personal note to myself, but the point about resubmitting jobs may be of use to someone.

This page is useful reading for how these types of scripts with mixed shell and gaussian stuff work: http://www.gaussian.com/g_tech/g_ur/m_running.htm

0. The qsub header (Notes To Myself)

http://cf.ccmr.cornell.edu/cgi-bin/w3mman2html.cgi?qsub(1B)
#$ -S /bin/sh -- shell to use. Can be csh, tcsh etc.
#$ -cwd -- execute in the Current Working Directory.
#$ -l h_rt=12:00:00 -- maximum allowed run time in hours; -l is a list of resource limits.
#$ -l h_vmem=4G -- memory limit (see http://www.biostat.jhsph.edu/bit/cluster-usage.html#MemSpec). Should match %mem, e.g. 4000mb.

#$ -j y -- join: "Declares if the standard error stream of the job will be merged with the standard output stream of the job." It creates *.o* and *.p* files with what would've been echoed in the terminal.

#$ -pe g03_smpX X -- parallel environment. X is the number of slots; g03_smpX is the name of the parallel (SMP) environment set up for gaussian on the cluster.

1. Setting up simple jobs
First create a standard template, let's call it qsub.header

#!/bin/sh
#$ -S /bin/sh
#$ -cwd
#$ -l h_rt=12:00:00
#$ -l h_vmem=4G
#$ -j y
#$ -pe g03_smp2 2
module load gaussian/g09
time G09 << END > g09_output.log
and save it in your home folder ~. (Note that there's no terminating END line -- the here-document simply runs to the end of the file once the gaussian input has been appended. The shell may warn about this, but it works.)

Create an input file, e.g. water.in and put it in your work directory, e.g. ~/g09


%chk=water.chk
%nprocshared=2
#P ub3lyp/6-31G* opt

water energy

0  1
O
H  1  1.0
H  1  1.0  2  120.0

Put them together:
cat ~/qsub.header > water.qsub
cat ~/g09/water.in >> water.qsub

#!/bin/sh
#$ -S /bin/sh
#$ -cwd
#$ -l h_rt=12:00:00
#$ -l h_vmem=4G
#$ -j y
#$ -pe g03_smp2 2
module load gaussian/g09
time G09 << END > g09_output.log

%chk=water.chk
%nprocshared=2
#P ub3lyp/6-31G* opt

water energy

0  1
O
H  1  1.0
H  1  1.0  2  120.0

2. Restarting jobs

Most likely your home folder is shared across the nodes via nfs.

To find out, submit
#!/bin/sh
#$ -S /bin/sh
#$ -cwd
#$ -l h_rt=12:00:00
#$ -l h_vmem=4G
#$ -j y
#$ -pe g03_smp2 2
pwd
ls -lah
tree -L 1 -d
to get some directory information about the nodes.

Once you have that, just put the absolute path to your .chk file in your restart script, e.g.

#!/bin/sh
#$ -S /bin/sh
#$ -cwd
#$ -l h_rt=12:00:00
#$ -l h_vmem=4G
#$ -j y
#$ -pe g03_smp2 2
module load gaussian/g09
time G09 << END > g09_output.log

%chk=/nfs/home/hpcsci/username/g09/water.chk
%nprocshared=2
#P ub3lyp/6-31G* opt guess=read geom=allcheck

With geom=allcheck the title, charge, multiplicity and geometry are all read from the checkpoint file, and guess=read takes the initial wavefunction from it too, so no coordinate section is needed in the restart input.