Showing posts with label pspw. Show all posts

25 November 2013

531. Briefly: NWChem 6.3 -- issues with planewave (PSPW) module and AMD FX8150 and 8350 CPUs

This is more of an announcement or warning than a proper blog post:

Both the FX8350 and the FX8150 have trouble running the pspw module: the calculations end in exploding structures.

My other nodes have no trouble running the job in question. Also, the issue was only found in nwchem 6.3 -- nwchem 6.1.1 worked fine. So it's not an FX8150/FX8350-related fault per se.

Again, see the post at the site for more information.

24 May 2012

162. PSPW/Carr-Parrinello using ECCE

This is more of a note to self about Car-Parrinello using ecce/nwchem. As always, this isn't about the science, but about making the computation run at all. And what I may consider a bug may in fact be a feature.

If you simply click your way through ecce and try to launch a pspw car-parrinello calc, it will fail.

Two problems:

  • task pspw car-parrinello expects a .movecs file to be present. You can 'solve' this by putting task pspw steepest_descent before your task pspw car-parrinello statement
  • if you relaunch a run, often you get a crash with an error referring to writing after EOF. You can solve this by cleaning out your run directory.
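
The second workaround, cleaning out the run directory, could be scripted roughly like this. It is only a sketch: the one suffix it looks for, `.emotion`, is taken from the Fortran error shown under Error 5, and the function name and the `stale` subdirectory are made up for illustration. It moves files aside rather than deleting them, so nothing is lost if the guess is wrong.

```python
import shutil
from pathlib import Path

# Assumption: these suffixes mark files left over from an earlier CPMD run.
# Only ".emotion" comes from the error message in this post; extend the
# tuple after checking what your own run directory actually contains.
STALE_SUFFIXES = (".emotion",)


def stash_stale_files(run_dir, suffixes=STALE_SUFFIXES, stash_name="stale"):
    """Move leftover files into run_dir/stash_name instead of deleting them.

    Returns the list of new paths, sorted by file name.
    """
    run_dir = Path(run_dir)
    stash = run_dir / stash_name
    moved = []
    # sorted() materialises the listing before anything is moved.
    for path in sorted(run_dir.iterdir()):
        if path.is_file() and path.suffix in suffixes:
            stash.mkdir(exist_ok=True)
            target = stash / path.name
            shutil.move(str(path), str(target))
            moved.append(target)
    return moved
```

Calling `stash_stale_files("/path/to/run_dir")` before relaunching leaves the input file and everything else in place.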

Problem number one comes down to this:

"Velocity Wavefunction Datafile
The one-electron orbital velocities are stored in a velocity wavefunction datafile. This is a binary file and cannot be directly edited. This datafile is used by the Car-Parrinello task and can be generated using the v_wavefunction_initializer task."

CPMD depends on certain files (at a minimum, .movecs) being present and needs an optimisation first. On top of that, adding either
task pspw energy
task pspw optimize
will more often than not prevent your ecce.out file from showing any of the MD output. So it's still better to set up your input file by hand and run everything by hand in a dedicated directory. Ecce doesn't copy runtime files back and forth, and that's the main problem here.

Really, what I find a problem is that I'd like to optimize and equilibrate a set of molecules, then continue using the equilibrated set.

If all you're trying to do is to get something, anything to work to get a feel for how this stuff works, then continue reading.

Problematic example file:

  1 scratch_dir /home/me/jobs/scratch
  2 Title "biphenyl_ground_twisted_cpmd_1-1"
  4 Start  biphenyl_ground_twisted_cpmd_1-1
  6 echo
  8 charge 0
 10 geometry autosym units angstrom
 11  C     0.00676622     3.53807     0.0197363
 12  C     -1.29633     2.88855     0.554869
 13  C     -1.31879     1.38415     0.519460
 14  C     0.00129627     0.730174     -0.000557722
 15  C     1.28746     1.38368     -0.578129
 16  C     1.31931     2.90453     -0.512952
 17  C     -0.0100394     -0.758319     -0.0224661
 18  C     1.33004     -1.36336     0.563945
 19  C     1.24425     -2.89848     0.485842
 20  C     -1.31683     -1.36948     -0.531559
 21  C     0.0254501     -3.54181     0.0318405
 22  C     -1.30632     -2.89413     -0.540694
 23  H     0.0916004     4.71976     0.273191
 24  H     -2.74374     3.75562     1.47791
 25  H     -2.78549     0.594633     1.40665
 26  H     2.70470     3.74589     -1.20122
 27  H     3.09496     -3.77313     1.52949
 28  H     -2.76973     -0.640827     -1.49262
 29  H     0.0203915     -4.70472     -0.288098
 30  H     -2.76848     -3.72979     -1.38695
 31  H     2.88815     -0.631319     1.39550
 32  H     2.66933     0.621436     -1.58686
 33 end
 35 ecce_print ecce.out
 37 nwpw
 38   mult 1
 39   np_dimensions -1  -1
 40   tolerances 1e-7  1e-7
 41   car-parrinello
 42     time_step 5.000000e+00
 43     fake_mass 5.000000e+02
 44     loop 10 100
 45     scaling 1.000000e+00 1.000000e+00
 46   end
 47 end
 49 task pspw car-parrinello
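
To get a feel for how short this test run is, the car-parrinello settings can be turned into a simulated time. The sketch below rests on two assumptions that are not stated in the post: that time_step is given in atomic units of time, and that the two numbers after loop are inner and outer iteration counts whose product is the total number of steps. The function name is made up for illustration.

```python
# 1 atomic unit of time expressed in femtoseconds.
AU_TIME_FS = 0.02418884


def cpmd_length(time_step_au, inner, outer):
    """Return (total_steps, total_time_fs) for a car-parrinello block."""
    steps = inner * outer
    return steps, steps * time_step_au * AU_TIME_FS


# Values from the input above: time_step 5.0, loop 10 100
steps, fs = cpmd_length(time_step_au=5.0, inner=10, outer=100)
print(f"{steps} steps, about {fs:.1f} fs")  # 1000 steps, about 120.9 fs
```

Under those assumptions the example covers only about 0.12 ps, which is fine for checking that the job runs at all.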

Quick 'solution': add the following two lines (3 and 48) to the file above:
  3 memory 200 mw
 48 task pspw steepest_descent

The line numbers are added by me. Remove them before running.
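
Removing the numbers by hand gets tedious, so here is a minimal sketch that does it. It assumes every line of the listing starts with a number followed by whitespace, as in the listing above; the function name is made up. Don't run it on a file that is already clean, since a line that legitimately starts with a number would lose it.

```python
import re

# Optional indentation, the line number, then exactly one whitespace character.
# Anything after that (including further indentation) is kept.
_NUMBER_PREFIX = re.compile(r"^\s*\d+\s")


def strip_line_numbers(text):
    """Remove the leading line number from every line of a numbered listing."""
    cleaned = []
    for line in text.splitlines():
        cleaned.append(_NUMBER_PREFIX.sub("", line, count=1))
    return "\n".join(cleaned) + "\n"
```

Something like `open("clean.nw", "w").write(strip_line_numbers(open("numbered.nw").read()))` would then produce a runnable input file (both file names are placeholders).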

You can also stick task pspw energy or optimize in there -- but the way ecce does it, with just a task pspw car-parrinello, won't work. Either way, it'd be nice to be able to carry over the movecs files between calculations.

See below for various errors:

Error #1:
If you set up the run using ecce, it won't work and there won't be any real error message to explain why the run exits immediately.

294      >>>  JOB STARTED       AT Thu May 24 14:19:04 2012  <<<
295           ================ input data ========================
296   library name resolved from: compiled reference
297   NWCHEM_NWPW_LIBRARY set to: </opt/nwchem/nwchem-6.1/src/nwpw/libraryps/>
298   library name resolved from: compiled reference
299   NWCHEM_NWPW_LIBRARY set to: </opt/nwchem/nwchem-6.1/src/nwpw/libraryps/>
301 -----ECCE Log Information-----
302 Starting Job: Thu May 24 14:19:02 EST 2012
303 Using /home/me/jobs/scratch as nwchem SCRATCH_DIR
304 nwchem exit status = -1
305 Final exit status = -1
306 Completed Job: Thu May 24 14:19:05 EST 2012

If you launch the run in the terminal (without mpirun -- mpi suppresses error messages sometimes) you get:
     >>>  JOB STARTED       AT Thu May 24 14:20:15 2012  <<<
          ================ input data ========================
  library name resolved from: compiled reference
  NWCHEM_NWPW_LIBRARY set to: </opt/nwchem/nwchem-6.1/src/nwpw/libraryps/>
  library name resolved from: compiled reference
  NWCHEM_NWPW_LIBRARY set to: </opt/nwchem/nwchem-6.1/src/nwpw/libraryps/>
ERROR:  Could not open pipe from input file
The reason is that ecce doesn't carry files over from previous simulations -- you need the .movecs file. This can be generated by
  task pspw steepest_descent 

If you could run all your jobs in the same directory that wouldn't be a problem.

Error #2:
438      >>>  JOB STARTED       AT Thu May 24 14:24:38 2012  <<<
439           ================ input data ========================
440  ------------------------------------------------------------------------
441  out of heap memory        0
442  ------------------------------------------------------------------------
443  ------------------------------------------------------------------------
444   current input line :
445     48: task pspw energy
446  ------------------------------------------------------------------------
447  ------------------------------------------------------------------------
448  ------------------------------------------------------------------------
449  For more information see the NWChem manual at        index.php/NWChem_Documentation

We chucked task pspw steepest_descent in before our task pspw car-parrinello and now get a new error: out of heap memory. Easily fixed. You can set e.g. 200MW under pspw/details or add
memory 200 MW
by hand.

Of course, if you add it by clicking in ecce then your task pspw steepest_descent line will be removed, so you'll have to add that by hand again.
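
Since ecce keeps dropping the hand-added lines, re-adding them can be scripted. The following is a minimal sketch, not something ecce provides: the function name is made up, and placing the memory line just before the Start line simply mirrors where it sits (line 3) in the example above.

```python
def patch_input(text,
                memory_line="memory 200 mw",
                pre_task="task pspw steepest_descent",
                md_task="task pspw car-parrinello"):
    """Re-add the hand-edited lines to an ecce-generated NWChem input.

    - inserts pre_task directly before md_task unless it is already present
    - inserts memory_line before the Start line unless a memory line exists
    Running it twice gives the same result as running it once.
    """
    lines = text.splitlines()
    stripped = [line.strip().lower() for line in lines]
    has_memory = any(s.startswith("memory") for s in stripped)
    has_pre = pre_task.lower() in stripped
    out = []
    for line, s in zip(lines, stripped):
        if not has_memory and s.startswith("start"):
            out.append(memory_line)
            has_memory = True
        if not has_pre and s == md_task.lower():
            out.append(pre_task)
            has_pre = True
        out.append(line)
    return "\n".join(out) + "\n"
```

Run it on the input file each time after changing settings in ecce and before launching the job by hand.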

Error #3:
According to the manual "This [movecs] datafile is used by the Car-Parrinello task and can be generated using the v_wavefunction_initializer task."
Well, try
task pspw v_wavefunction_initializer
and you get

>>>> PSPW Serial Module - v_wavefunction_initializer <<<<
0:Segmentation Violation error, status=: 11
(rank:0 hostname:beryllium pid:24675):ARMCI DASSERT fail. ../../ga-5-1/armci/src/common/signaltrap.c:SigSegvHandler():310 cond:0
Last System Error Message from Task 0:: No such file or directory
And you find that there's a file called ????

Error #4:
Going full out:

task pspw wavefunction_initializer
task pspw pseudopotential_formatter
task pspw v_wavefunction_initializer
task pspw car-parrinello

 >>>> PSPW Serial Module - wavefunction_initializer <<<<
0:Floating Point Exception error, status=: 8
(rank:0 hostname:beryllium pid:26026):ARMCI DASSERT fail. ../../ga-5-1/armci/src/common/signaltrap.c:SigFpeHandler():249 cond:0
Last System Error Message from Task 0:: No such file or directory

Error #5:
If you haven't cleared out your run directory you get this via ecce:
        ============ Car-Parrinello iteration ==============
     >>>  ITERATION STARTED AT Thu May 24 18:03:07 2012  <<<
    iter.         KE+Energy             Energy        KE_psi        KE_ion   Temperature
      10  -0.1662131203E+02  -0.1662582428E+02   0.43690E-02   0.39005E-02        143.80

-----ECCE Log Information-----
Starting Job: Thu May 24 18:00:41 EST 2012

and this if you run in the terminal:

At line 847 of file cpmdv5.F (unit = 31, file = './cpmd_test.emotion')
Fortran runtime error: Sequential READ or WRITE not allowed after EOF marker, possibly use REWIND or BACKSPACE
      10   0.1175850345E+06   0.3940717762E+03   0.18793E-03   0.27950E-02       1852.03
mpirun has exited due to process rank 0 with PID 26303 on
node beryllium exiting without calling "finalize". This may
have caused other processes in the application to be
terminated by signals sent by mpirun (as reported here).
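
Because ecce itself reports little more than an exit status, it can help to scan the terminal output for the messages collected in this post. The sketch below is just that, a lookup table built from the five errors above; the hints are a reading of this post, not an official NWChem diagnosis, and the names are made up.

```python
# Error text copied from the logs in this post, paired with the workaround
# (if any) that the post arrived at.
KNOWN_FAILURES = [
    ("Could not open pipe from input file",
     "missing .movecs: put task pspw steepest_descent before the CPMD task"),
    ("out of heap memory",
     "raise the memory setting, e.g. memory 200 mw"),
    ("Segmentation Violation error",
     "initializer task crashed: fall back to task pspw steepest_descent"),
    ("Floating Point Exception error",
     "initializer task crashed: fall back to task pspw steepest_descent"),
    ("Sequential READ or WRITE not allowed after EOF marker",
     "clean out the run directory before relaunching"),
]


def diagnose(log_text):
    """Return (signature, hint) pairs for every known message in log_text."""
    lowered = log_text.lower()
    return [(sig, hint) for sig, hint in KNOWN_FAILURES
            if sig.lower() in lowered]


if __name__ == "__main__":
    sample = ("Fortran runtime error: Sequential READ or WRITE not allowed "
              "after EOF marker, possibly use REWIND or BACKSPACE")
    for signature, hint in diagnose(sample):
        print(f"{signature} -> {hint}")
```

An empty result only means that none of these five messages showed up, not that the run was fine.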