17 June 2013

456. Adding NWChem basis sets to ECCE. Part 2. A solution: nwchem2ecce.py

UPDATED!

I've moved the finished scripts to here:
https://sourceforge.net/projects/nwbas2ecce/

They work! I've also added a number of converted basis sets to the sourceforge repo under 'examples'. You'll also find example ecp and ECPOrbital files.

Phew...

Here's the README:
The programmes are not 'intelligent' -- they won't check that you are doing something reasonable. Bad input = bad output. __Installation__: Download eccepag and nwbas2ecce They are both python (2.7) programmes, so you will need to install python to run them. On linux, this is normally very easy. E.g. on debian, run 'sudo apt-get install python2.7' and you are done. If you want, you can put the files in /usr/local/bin and do 'sudo chmod +x /usr/local/bin/eccepage' 'sudo chmod +x /usr/local/bin/nwbas2ecce' and you will be able to call the scripts from any directory. __Usage__ nwbas2ecce can turn a full basis set, or a, ECP basis set, into an ECCE compatible set of basis set files. Typically, an nwchem basis set consists of a single file, e.g. 3-21g. It can also be divided into several files, e.g. def2-svp and def-ecp, where the effective core potentials (ecps) are in def2-ecp. Other basis set files, like lanl2dz_ecp, contains both the orbital and the contraction parts. Typically, a ECCE basis set suite consists of: basis.BAS basis.BAS.meta basis.POT (for ECP) basis.POT.meta (for ECP) Sometimes polarization and diffuse functions are separated from the main .BAS file. E.g. 3-21++G* consists of 3-21G.BAS 3-21GS.BAS POPLDIFF.BAS , in addition to the meta files. The meta files are just markup-language type files with e.g. references. Note that you don't HAVE to break up the basis set components like this. Since the basis set data can be broken up into smaller files, the overall basis set is defined as an entry in a category file. For example, 3-21G is defined in the category file 'pople', and points to 3-21G.BAS. 3-21G* is also defined in pople, but point to both 3-21G.BAS and 3-21GS.BAS. ECP works in a similar way, by combining a .BAS and a .POT file. Note that the .POT files look different from the .BAS files. nwbas2ecce generates .BAS and .POT files based on whether there are basis/end or ecp/end sections in the nwchem basis set file. If there are both, both POT and BAS files are generated. All these files are contained in server/data/Ecce/system/GaussianBasisSetLibrary Finally, you need to generate .pag and .dir files that go into the server/data/Ecce/system/GaussianBasisSetLibrary/.DAV directory. The .dir file is always empty, while the .pag file is unfortunately a binary file. eccepag can, however, generate it with the right input. See e.g. http://verahill.blogspot.com.au/2013/06/455-adding-nwchem-basis-sets-to-ecce.html for more detailed information __Example__ We'll use def2-svp as an example. The nwchem basis set file def2-svp contains the basis set, while def2-ecp contains the core potentials. Use def2-svp to generate DEF2_SVP.BAS, DEF2_SVP.BAS.meta. Use def2-ecp to generate DEF2_ECP.POT, DEF2_ECP.POT.meta. As part of the generation, .descriptor files are also generated. These contain information that should go into the category file(s). Then generate the .pag files for both the POT and the BAS files, and touch the .dir files into existence. Do like this: nwbas2ecce -i def2-svp -o DEF2_SVP.BAS -n 'def2-svp' nwbas2ecce -i def2-ecp -p DEF2_ECP.POT -n 'def2-ecp' eccepag -n def2-svp -t ECPOrbital -c ORBITAL -y Segmented -s Y -o DEF2_SVP.BAS.pag eccepag -n def2-ecp -t ecp -c AUXILIARY -o DEF2_ECP.POT.pag NOTE: I don't actually know if def2-svp is segmented, and spherical. I don't think it matters for the .pag file generation. Also note that most inputs are case sensitive. Look at a similar .pag file for hints. You now have the following files: DEF2_ECP.POT DEF2_ECP.POT.descriptor DEF2_ECP.POT.meta DEF2_ECP.POT.pag DEF2_SVP.BAS DEF2_SVP.BAS.descriptor DEF2_SVP.BAS.meta DEF2_SVP.BAS.pag Copy the files. Note that you need to select the correct target directory, and that will vary with where you installed ECCE. I'll assume it's in /opt/ecce cp DEF2* /opt/ecce/server/data/Ecce/system/GaussianBasisSetLibrary cd /opt/ecce/server/data/Ecce/system/GaussianBasisSetLibrary mv *.pag .DAV/ touch .DAV/DEF2_SVP.BAS.dir .DAV/DEF2_ECP.POT.dir cat DEF2_SVP.BAS.descriptor >> ECPOrbital cat DEF2_ECP.POT.descriptor >> ECPOrbital cat DEF2_ECP.POT.descriptor >> ecp Edit ECPOrbital so that it reads: name= def2-svp files= DEF2_SVP.BAS DEF2_ECP.POT atoms= H He Li Be B C N O F Ne Na Mg Al Si P S Cl Ar K Ca Sc Ti V Cr Mn Fe Co Ni Cu Zn Ga Ge As Se Br Kr Rb Sr Y Zr Nb Mo Tc Ru Rh Pd Ag Cd In Sn Sb Te I Xe Cs Ba La Hf Ta W Re Os Ir Pt Au Hg Tl Pb Bi Po At Rn atoms= Rb Sr Y Zr Nb Mo Tc Ru Rh Pd Ag Cd In Sn Sb Te I Xe Cs Ba La Hf Ta W Re Os Ir Pt Au Hg Tl Pb Bi Po At Rn
/pre>

455. Adding NWChem basis sets to ECCE. Part 1. The formats

I've written a python script that cam
1. do automatic conversion of nwchem basis set files to .BAS and .POT
2. generate entries that can be added to the category file

What it currently can't do is generate a .pag file.

The python script is not in this post. I'll release it soon though.


The structure:
ECCE stores basis sets in server/data/Ecce/system/GaussianBasisSetLibrary/.

The number of files associated with a basis set varies, and the way a basis set is set up seems to vary as well depending on who added it.

Each basis set needs at least the following files:
basis.BAS
basis.BAS.meta
.DAV/basis.BAS.pag
.DAV/basis.BAS.dir

In addition, the basis set needs to be added to the correct category by being added to one of the following files:

Charge correlation_consistent DFTOrbital diffuse ecp ECPOrbital Exchange other_generally_contracted other_segmented polarization pople rydberg
e.g. 6-31G goes to pople, while LANL2DZ/ECP goes to ECPOrbital.

Looking at the basis set tool in ECCE you have the following categories/subcategories:
Orbital: Pople Shared, Other Segmented, Corr. Consistent, Other Gen. Contr., ECP Orbital, DFT Orbital. Auxiliary: Polarization, Diffuse, Rydberg. ECP: DFT: Charge Fitting, Exchange Fitting.
What it means is that you can 'mix and match' by adding your .BAS or .POT files to different category files (e.g. you can have LANL2DZ dp both ECPOrbital, ecp and polarization, all at the same time. See below for how basis sets can be broken up.

Example: The simple cases: 3-21G, 3-21G*, 3-21++G*
For a basis set like 3-21G there are two files: 3-21G.BAS and 3-21G.BAS.meta.
In addition grep shows that there's an entry in the file pople for 3-21G.

The .BAS file:
The entry for C in 3-21G.BAS looks like this:
atom=C contraction shell=S num_primitives=3 num_coefficients=1 172.2560 0.0617669 25.91090 0.358794 5.533350 0.700713 contraction shell=SP num_primitives=2 num_coefficients=2 3.664980 -0.395897 0.236460 0.770545 1.215840 0.860619 contraction shell=SP num_primitives=1 num_coefficients=2 0.195857 1.000000 1.000000
Nothing too strange. For example, the nwchem format for C in 3-21g is:
basis "C_3-21G" CARTESIAN C S 172.2560000 0.0617669 25.9109000 0.3587940 5.5333500 0.7007130 C SP 3.6649800 -0.3958970 0.2364600 0.7705450 1.2158400 0.8606190 C SP 0.1958570 1.0000000 1.0000000 end
Writing a python script that translates between the two is simple.


The .BAS.meta file:
The 3-21G.BAS.meta file looks like this:
references Elements References -------- ---------- H - Ne: J.S. Binkley, J.A. Pople, W.J. Hehre, J. Am. Chem. Soc 102 939 (1980) Na - Ar: M.S. Gordon, J.S. Binkley, J.A. Pople, W.J. Pietro and W.J. Hehre, J. Am. Chem. Soc. 104, 2797 (1983). K - Ca: K.D. Dobbs, W.J. Hehre, J. Comput. Chem. 7, 359 (1986). Ga - Kr: K.D. Dobbs, W.J. Hehre, J. Comput. Chem. 7, 359 (1986). Sc - Zn: K.D. Dobbs, W.J. Hehre, J. Comput. Chem. 8, 861 (1987). Y - Cd: K.D. Dobbs, W.J. Hehre, J. Comput. Chem. 8, 880 (1987). Cs : A 3-21G quality set derived from the Huzinage MIDI basis sets. E.D. Glendening and D. Feller, J. Phys. Chem. 99, 3060 (1995) references info 3-21G Split Valence Basis ------------------------- Elements Contraction References H - He: (3s) -> [2s] J.S. Binkley, J.A. Pople and W.J. Hehre, Li - Ne: (6s,3p) -> [3s,2p] J. Am. Chem. Soc. 102, 939 (1980). Na - Ar: (9s,6p) -> [4s,3p] M.S. Gordon, J.S. Binkley, J.A. Pople, W.J. Pietro and W.J. Hehre, J. Am. Chem. Soc. 104, 2797 (1983) K - Ca: (12s,9p) -> [5s,4p] K.D. Dobbs, W.J. Hehre, J. Comput. Chem. 7, Ga - Kr: (12s,9p,3d) -> [5s,4p,1d] 359 (1986). Sc - Zn: (12s,9p,3d) -> [5s,4p,2d] K.D. Dobbs, W.J. Hehre, J. Comput. Chem. 8, 861 (1987). Rb - Sr: (15s,12p,3d)-> [6s,5p,1d] Y - Cd: (15s,12p,6d)-> [6s,5p,3d] K.D. Dobbs, W.J. Hehre, J. Comput. Chem. 8, In - I: (15s,12p,6d)-> [6s,5p,2d] 880 (1987). Cs : (18s,12p,6d)-> [6s,5p,2d] A 3-21G quality set derived from the Huzinage MIDI basis sets. E.D. Glendening and D. Feller, J. Phys. Chem. 99, 3060 (1995). The 3-21G basis set contains the same number of Gaussian primitives as the STO-3G basis, but the valence electrons are described with two functions per AO instead of one. In most cases the 3-21G basis set gives results which are as good as the more expensive 4-31G and 6-31G sets. 3-21G Atomic Energies ROHF State UHF (noneq) ROHF (noneq) ROHF(equiv) HF Limit (equiv) ----- ---------- ----------- ----------- --------- H 2-S -0.496199 -0.496199 -0.496199 -0.50000 He 1-S -2.835680 -2.835680 -2.835680 -2.86168 Li 2-S -7.381513 -7.381513 -7.381513 -7.43273 Be 1-S -14.486820 -14.486820 -14.486820 -14.57302 B 2-P -24.389762 -24.389634 -24.148989 -24.52906 C 3-P -37.481070 -37.480389 -37.480389 -37.68862 N 4-S -54.105390 -54.103658 -54.103658 -54.40094 O 3-P -74.393657 -74.392512 -74.391782 -74.80940 F 2-P -98.845009 -98.844645 -98.844230 -99.40935 Ne 1-S -127.132546 -127.803824 -127.803824 -128.54710 Na 2-S -160.854064 -160.854041 -160.854041 -161.85891 Mg 1-S -198.468103 -198.468103 -198.468103 -199.61463 Al 2-P -240.551046 -240.551024 -240.551010 -241.87671 Si 3-P -287.344431 -287.344419 -287.344393 -288.85436 P 4-S -339.000079 -339.000027 -339.000027 -340.71878 S 3-P -395.551336 -395.551083 -395.550591 -397.50490 Cl 2-P -457.276552 -457.276414 -457.276096 -459.48207 Ar 1-S -524.342962 -524.342962 -524.342962 -526.81751 K 2-S -596.152980 -596.152923 -596.152923 -599.16479 info comments 2/16/95 - DFF - Modify the format of the literature citation. 12/07/93 - SJB - Add Nb to Xe. 8/4/93 - DFF - Add Y and Zr. 12/2/92 - DFF - Add Rb and Sr. 7/13/90 - DFF - Original creation of this file from MIA basis set library. comments
Again, most of this can be extracted using a shell/python/perl script from the corresponding 3-21g nwchem basis set file.

The entry for 3-21G in 'pople':
name= 3-21G files= 3-21G.BAS atoms= H He Li Be B C N O F Ne Na Mg Al Si P S Cl Ar K Ca Sc Ti V Cr Mn Fe Co Ni Cu Zn Ga Ge As Se Br Kr Rb Sr Y Zr Nb Mo Tc Ru Rh Pd Ag Cd In Sn Sb Te I Xe Cs
This simple seems to be a list over the files that describe the basis set and the elements supported. Can be autogenerated using a script.

Intermission: polarization and diffuse orbitals, and ECP.
At this stage it's pretty simple. We now have a rough idea of what's needed. We just need to understand how to expand our basis sets.

For 3-21G* and 3-21++G* the polarisation and diffuse orbitals are separated into 3-21GS-AGG.BAS and 3-21GS.BAS, and 3-21PPGS-AGG.BAS and 3-21GS.BAS, and POPLDIFF.BAS. All -AGG.BAS files are empty, so I'm not sure why they are there.

Anyway, this might make it a bit clearer:
3-21G = 3-21G.BAS 3-21G* = 3-21G.BAS + 3-21GS.BAS 3-21++G* = 3-21G.BAS + 3-21GS.BAS + POPLDIFF
What happens to e.g. pople is this:
name= 3-21G* files= 3-21GS-AGG.BAS 3-21G.BAS 3-21GS.BAS atoms= H He Li Be B C N O F Ne Na Mg Al Si P S Cl Ar atoms= Na Mg Al Si P S Cl Ar name= 3-21++G* files= 3-21PPGS-AGG.BAS 3-21G.BAS POPLDIFF.BAS 3-21GS.BAS atoms= H He Li Be B C N O F Ne Na Mg Al Si P S Cl Ar atoms= H Li Be B C N O F Ne Na Mg Al Si P S Cl Ar atoms= Na Mg Al Si P S Cl
The -AGG.BAS files are empty. The first atoms line corresponds to entries in 3-21G.BAS, while for 3-21G* the second one corresponds to entries in 3-21GS.BAS. Likewise,
atoms= H Li Be B C N O F Ne Na Mg Al Si P S Cl Ar
are entries in POPLDIFF.BAS.

The good news: it's almost identical when it comes to ECP. Here's the ECPOrbital entry for LANL2DZ:
name= LANL2DZ ECP files= LANL2DZ.BAS LANL2DZ.POT atoms= H Li Be B C N O F Ne Na Mg Al Si P S Cl Ar K Ca Sc Ti V Cr Mn Fe Co Ni Cu Zn Ga Ge As Se Br Kr Rb Sr Y Zr Nb Mo Tc Ru Rh Pd Ag Cd In Sn Sb Te I Xe Cs Ba La Hf Ta W Re Os Ir Pt Au Pb Bi U Np Pu atoms= Na Mg Al Si P S Cl Ar K Ca Sc Ti V Cr Mn Fe Co Ni Cu Zn Ga Ge As Se Br Kr Rb Sr Y Zr Nb Mo Tc Ru Rh Pd Ag Cd In Sn Sb Te I Xe Cs Ba La Hf Ta W Re Os Ir Pt Au Pb Bi U Np Pu
and the ecp entry:
name= LANL2DZ ECP files= LANL2DZ.POT atoms= Na Mg Al Si P S Cl Ar K Ca Sc Ti V Cr Mn Fe Co Ni Cu Zn Ga Ge As Se Br Kr Rb Sr Y Zr Nb Mo Tc Ru Rh Pd Ag Cd In Sn Sb Te I Xe Cs Ba La Hf Ta W Re Os Ir Pt Au Pb Bi U Np Pu

The POT file is a little bit different from the .BAS file:
atom=Na ncore=10 lmax=2 ecp_potential%l=2%shell=d potential%num_exponents=5 1 175.5502590 -10.0000000 2 35.0516791 -47.4902024 2 7.9060270 -17.2283007 2 2.3365719 -6.0637782 2 0.7799867 -0.7299393 ecp_potential%l=0%shell=s-d potential%num_exponents=5 0 243.3605846 3.0000000 1 41.5764759 36.2847626 2 13.2649167 72.9304880 2 3.6797165 23.8401151 2 0.9764209 6.0123861 ecp_potential%l=1%shell=p-d potential%num_exponents=6 0 1257.2650682 5.0000000 1 189.6248810 117.4495683 2 54.5247759 423.3986704 2 13.7449955 109.3247297 2 3.6813579 31.3701656 2 0.9461106 7.1241813

.DAV files
The good news: the .DAV/basis.dir file is empty.
The bad news: .DAV/basis.pag is a binary file.
I haven't yet figured out the exact structure of it nor the best way to auto-generate it.
I think the best illustration is to show the od -c output for a few .POT.pag files:
LANL2DZ.POT.pag:
0000000 \b \0 371 003 354 003 345 003 340 003 325 003 312 003 302 003 0000020 233 003 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 0000040 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 * 0001620 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 004 \0 \0 002 D 0001640 A V : \0 h t t p : / / w w w . e 0001660 m s l . p n l . g o v / e c c e 0001700 : \0 M E T A D A T A \0 A U X I L 0001720 I A R Y \0 1 : c a t e g o r y \0 0001740 \0 e c p \0 1 : t y p e \0 \0 L A N 0001760 L 2 D Z E C P \0 1 : n a m e \0 0002000
SBKJC.POT.pag:
0000000 \b \0 371 003 356 003 347 003 342 003 327 003 314 003 304 003 0000020 235 003 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 0000040 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 * 0001620 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 004 \0 \0 0001640 002 D A V : \0 h t t p : / / w w w 0001660 . e m s l . p n l . g o v / e c 0001700 c e : \0 M E T A D A T A \0 A U X 0001720 I L I A R Y \0 1 : c a t e g o r 0001740 y \0 \0 e c p \0 1 : t y p e \0 \0 S 0001760 B K J C E C P \0 1 : n a m e \0 0002000

Trial and error in making files for def2-svp has shown me that you can copy e.g. LANL2DZ.POT.pag to DEF2_ECP.POT.pag, and edit with vim (use binary mode -b) but that you'll need to add enough spaces to the name so that the files both end at the same place. E.g. this works:
0000000 \b \0 371 003 354 003 345 003 340 003 325 003 312 003 302 003 0000020 233 003 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 0000040 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 * 0001620 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 004 \0 \0 002 D 0001640 A V : \0 h t t p : / / w w w . e 0001660 m s l . p n l . g o v / e c c e 0001700 : \0 M E T A D A T A \0 A U X I L 0001720 I A R Y \0 1 : c a t e g o r y \0 0001740 \0 e c p \0 1 : t y p e \0 \0 d e f 0001760 2 - e c p \0 1 : n a m e \0 0002000
but this doesn't (removed a single space at the end of def2-ecp):
0000000 \b \0 371 003 354 003 345 003 340 003 325 003 312 003 302 003 0000020 233 003 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 0000040 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 * 0001620 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 004 \0 \0 002 D 0001640 A V : \0 h t t p : / / w w w . e 0001660 m s l . p n l . g o v / e c c e 0001700 : \0 M E T A D A T A \0 A U X I L 0001720 I A R Y \0 1 : c a t e g o r y \0 0001740 \0 e c p \0 1 : t y p e \0 \0 d e f 0001760 2 - e c p \0 1 : n a m e \0 0001777
Note that the names should correspond to the names of the nwchem basis sets and/or files e.g. either 3-21gs or 3-21G*. Or LANL2DZ ECP or lanl2dz_ecp.

As far as I understand the solution will lie in how WebDAV uses .pag files. I don't know anything about that just yet though.

Anyway, that's it for now. There's now enough information to write your own scripts.

454. If I had a magic wand: stuff I'd fix in ECCE

Development of ECCE isn't as rapid as it used to, and one reason may be that the cost of implementing new features may be increasingly expensive, together with resources being finite.

However, ECCE is open source since about a year. Sure, you may not be able to submit your changes without vetting to a sourceforge git repo, but the good folks at EMSL/PNNL are very open to receiving patches and possibly merging them with the main trunk if you do write them.

Go to the ECCE forum and post improvements if you have made them. The forum is here: http://www.nwchem-sw.org/index.php/Special:AWCforum/sf/id11/General_ECCE_Topics.html

NOTE: as always with open source projects, simply making demands for features and general improvements in the forum, unless very small or highly critical, is probably a bit impolite.

I've made a few abortive attempts at making improvements to ECCE, and as these efforts often go, my success has been rather mixed (I think the migration from getopts to Getop:std is my only real contribution so far). My main talent seems to be in writing posts about ECCE, which I suppose isn't entirely without merit.

However, I'd like to become more involved, even though a significant barrier is the fact that ECCE is written in at least four different languages (java, C, perl, python).

So here's my wish list, which I might be amending with time (hopefully with solutions...)

1. Bugs
* When creating a new directory, all open directories lower in rank auto-close.
Example. Say we have the directory /jobs/catalyst open/expanded. If we create the directory /jobs/transition, /jobs/catalyst and everything below will close. It can be quite annoying since you may lose track of what jobs you were running and where.

* Separate DFT maxiter from SCF maxiter
NOTE: what I wrote below is wrong. DFT;iterations N;end works as advertised in DFT, scf; maxiter N;end is ignored.
Old text: When setting up DFT jobs using the theory editor, section called SCF Convergence really only sets the DFT options. In other words, if you set SCF max iterations to 999, you're really setting the number of DFT iterations, not SCF. It would be nice to add an SCF section to DFT jobs, and rename the SCF section to DFT.

* Can't run command because too many files open
While in reality you actually don't have anything open.
Not sure exactly what causes it, but it's manifested by the ECCE animation (in the left-most button) in the ECCE menu thingy (the small window with buttons for Exit, Organizer, Builder, Viewer, Machine Browser etc.) not stopping, and no window launching. In the terminal the following is shown:
[..] exit; echo GOODBYE Unable to create a socket Error: java.net.SocketException: Too many open files

Doing
lsof |grep ecce
in one case yielded 9784 hits, most of which were of the type
java 18010 31175 me 99u REG 8,6 0 17194388 /home/me/.ecce/ecce-v6.4b/server/activemq/data/kr-store/data/hash-index-blob_ecce_url_property_data-Subscriptions java 18010 31175 me 106u REG 8,6 0 17196148 /home/me/.ecce/ecce-v6.4b/server/activemq/data/kr-store/data/hash-index-topic-data_ecce_ejs_kill java 18010 31175 me 107u REG 8,6 0 17196150 /home/me/.ecce/ecce-v6.4b/server/activemq/data/kr-store/data/hash-index-blob_ecce_ejs_kill-Subscriptions java 18010 31176 me cwd DIR 8,6 4096 16517897 /home/me/.ecce/ecce-v6.4b/apps/bin java 18010 31176 me mem REG 8,6 14014 16919807 /home/me/.ecce/ecce-v6.4b/server/activemq/bin/run.jar java 18010 31176 me mem REG 8,6 367444 16919829 /home/me/.ecce/ecce-v6.4b/server/activemq/lib/optional/log4j-1.2.14.jar java 18010 31176 me mem REG 8,6 467331 16919826 /home/me/.ecce/ecce-v6.4b/server/activemq/lib/optional/spring-beans-2.5.1.jar java 18010 31176 me mem REG 8,6 81403 16919823 /home/me/.ecce/ecce-v6.4b/server/activemq/lib/activemq-console-5.1.0.jar java 18010 31176 me mem REG 8,6 455159 16919827 /home/me/.ecce/ecce-v6.4b/server/activemq/lib/optional/spring-context-2.5.1.jar java 18010 31176 me mem REG 8,6 128051 16919828 /home/me/.ecce/ecce-v6.4b/server/activemq/lib/optional/xbean-spring-3.3.jar java 18010 31176 me mem REG 8,6 52915 16919819 /home/me/.ecce/ecce-v6.4b/server/activemq/lib/commons-logging-1.1.jar java 18010 31176 me mem REG 8,6 2141382 16919830 /home/me/.ecce/ecce-v6.4b/server/activemq/lib/optional/derby-10.1.3.1.jar java 18010 31176 me mem REG 8,6 275073 16919825 /home/me/.ecce/ecce-v6.4b/server/activemq/lib/optional/spring-core-2.5.1.jar java 18010 31176 me mem REG 8,6 16030 16919820 /home/me/.ecce/ecce-v6.4b/server/activemq/lib/geronimo-j2ee-management_1.0_spec-1.0.jar java 18010 31176 me mem REG 8,6 32359 16919821 /home/me/.ecce/ecce-v6.4b/server/activemq/lib/geronimo-jms_1.1_spec-1.1.1.jar java 18010 31176 me mem REG 8,6 2315247 16919822 /home/me/.ecce/ecce-v6.4b/server/activemq/lib/activemq-core-5.1.0.jar java 18010 31176 me 12r REG 8,6 14014 16919807 /home/me/.ecce/ecce-v6.4b/server/activemq/bin/run.jar java 18010 31176 me 21r REG 8,6 2141382 16919830 /home/me/.ecce/ecce-v6.4b/server/activemq/lib/optional/derby-10.1.3.1.jar java 18010 31176 me 22r REG 8,6 367444 16919829 /home/me/.ecce/ecce-v6.4b/server/activemq/lib/optional/log4j-1.2.14.jar java 18010 31176 me 23r REG 8,6 467331 16919826 /home/me/.ecce/ecce-v6.4b/server/activemq/lib/optional/spring-beans-2.5.1.jar java 18010 31176 me 24r REG 8,6 455159 16919827 /home/me/.ecce/ecce-v6.4b/server/activemq/lib/optional/spring-context-2.5.1.jar java 18010 31176 me 25r REG 8,6 275073 16919825 /home/me/.ecce/ecce-v6.4b/server/activemq/lib/optional/spring-core-2.5.1.jar java 18010 31176 me 26r REG 8,6 128051 16919828 /home/me/.ecce/ecce-v6.4b/server/activemq/lib/optional/xbean-spring-3.3.jar java 18010 31176 me 27r REG 8,6 81403 16919823 /home/me/.ecce/ecce-v6.4b/server/activemq/lib/activemq-console-5.1.0.jar java 18010 31176 me 28r REG 8,6 2315247 16919822 /home/me/.ecce/ecce-v6.4b/server/activemq/lib/activemq-core-5.1.0.jar java 18010 31176 me 29r REG 8,6 52915 16919819 /home/me/.ecce/ecce-v6.4b/server/activemq/lib/commons-logging-1.1.jar java 18010 31176 me 30r REG 8,6 16030 16919820 /home/me/.ecce/ecce-v6.4b/server/activemq/lib/geronimo-j2ee-management_1.0_spec-1.0.jar java 18010 31176 me 31r REG 8,6 32359 16919821 /home/me/.ecce/ecce-v6.4b/server/activemq/lib/geronimo-jms_1.1_spec-1.1.1.jar

And it repeats. I don't know why -- I guess files are opened and never closed.

* MD bugs?
Hammering out MD related bugs. I don't understand the ECCE support for MD well enough to know what is a bug and what is a feature issue, but I do know that I've had issues getting MD sims running through simple point and click. So I'm not even sure that there are MD related bugs -- but you do get the impression that there are.

Luckily there are people putting up tutorials online: http://saccharides.blogspot.tw/2013/06/ecce-md-calculation.html

* Long lines in nwchem input
There's a bug when dealing with lines longer than 254 chars. Today (18th of June 2013) I encountered it for the first time in a situation unrelated to pasting BSE code -- this time the ecce_print line was simply too long. Read more here about the bug: http://verahill.blogspot.com.au/2013/02/347-minor-ecce-oddity-when-pasting.html

* Long lines in xterm/csh
xterm -f (invoked by alt+l) on a running calculation doesn't work is the resulting command is longer than 256 characters. e.g.
xterm -title "optimization_def2svp_boat_acetonitrile_to_start_again_g09_pcm_ts-1" -bg "#b7b8ba" -fg "#000000" -sb -geom 80x40 -e tail -f /home/calc/boron/testing/testing/catalyst/optimization_def2svp_boat_acetonitrile_to_start_again_g09_pcm_ts-1/g03.g03out

which has 257 characters doesn't work when used by ECCE (which uses (t)csh), but works fine in bash.


* Rounding in ECCE when using Coefficients and Exponents. 
NOTE: this is fixed in ECCE 7.0

When you select a bass set in ECCE you can either represent it as
basis Be library '6-31g' end
or you can check the 'Use Exponents and Coefficients' in the ECCE Editor. If you check that box you'll get something like this instead:
basis "ao basis" cartesian print Be S 1264.58570000 0.00194500 189.93681000 0.01483500 43.15908900 0.07209100 12.09866300 0.23715400 3.80632300 0.46919900 1.27289000 0.35652000 [..]
However, the 6-31G.BAS file gives the coefficients and exponents as
atom=Be contraction shell=S num_primitives=6 num_coefficients=1 1264.5857 0.0019448 189.93681 0.0148351 43.159089 0.0720906 12.098663 0.2371542 3.8063232 0.4691987 1.2728903 0.3565202
Note the higher precision (there are more extreme examples, but this one will do for demonstration).
Finally, if you define the basis set using the * library '6-31g' way, you get
Be (Beryllium) -------------- Exponent Coefficients -------------- --------------------------------------------------------- 1 S 1.26458570E+03 0.001945 1 S 1.89936810E+02 0.014835 1 S 4.31590890E+01 0.072091 1 S 1.20986630E+01 0.237154 1 S 3.80632320E+00 0.469199 1 S 1.27289030E+00 0.356520 2 S 3.19646310E+00 -0.112649 2 S 7.47813300E-01 -0.229506 2 S 2.19966300E-01 1.186917

In other words, you get somewhat different precision depending on how you use a basis set (3.8063232 vs 3.806323). Not a major issue, but it's still somewhat odd behaviour. The issue is much more significant when I use the def2 basis sets.

In terms of the energy at b3lyp/6-31g for an isolated Be atom I get this with Coeffs/Exps:
Total DFT energy = -14.668063062134 One electron energy = -19.130485660973 Coulomb energy = 7.245896594547 Exchange-Corr. energy = -2.783473995707 Nuclear repulsion energy = 0.000000000000 Numeric. integr. density = 3.999999797518 Total iterative time = 0.2s
vs this with Be library '6-31g'
Total DFT energy = -14.668063028950 One electron energy = -19.130484712269 Coulomb energy = 7.245894834880 Exchange-Corr. energy = -2.783473151561 Nuclear repulsion energy = 0.000000000000 Numeric. integr. density = 3.999999797514 Total iterative time = 0.2s


2. Features
* A script for importing/adding new basis sets, as they become supported by nwchem. I've started work on this but I'm stuck without understanding why. See part 1 here: http://verahill.blogspot.com.au/2013/06/455-adding-nwchem-basis-sets-to-ecce.html

DONE!
http://verahill.blogspot.com.au/2013/06/456-adding-nwchem-basis-sets-to-ecce.html
https://sourceforge.net/projects/nwbas2ecce/ !

Note: the source distribution of ECCE 7.0 will contain a few helper scripts for importing basis sets.

* More options when setting up COSMO calculations. Currently you can set the dielectric constant, but nothing else. being able to set at a minimum rsolv and iscren would be welcome.

NOTE: this is fixed in ECCE 7.0

* Better Gaussian input/output support. This will by necessity have to be produced by someone outside PNNL as they are banned from receiving a license for Gaussian (they are considered competitors owing to the development of nwchem).

* Adding support for more computational packages, especially GAMESS US and Dalton (since they are free/open source).

* Updated documentation/more examples in documentation. The easiest solution is having more people blogging about what they are doing -- and HOW they are doing it.

3. Other
* I'd like to see ECCE included in e.g. Debian (assuming that the license is acceptable and that EMSL/PNNL allow it+). However, not only am I not proficient in making canonical debian packages, the build script for ECCE is a bit more advanced than the usual configure/make/make install ones. So it'll take someone with a bit more expertise than myself.

+one issue is that funding is often tied to being able to demonstrate that a piece of software is being used. And the easiest way to show that it's used is to retain control over downloads/encouraging people to register.