16 May 2012

150. nwchem 6.1 -- diagnostic data

EDIT:
Compiling nwchem 6.1 with internal libs on debian: http://verahill.blogspot.com.au/2012/05/compiling-nwchem-61-with-internal-libs.html
Compiling nwchem 6.1 with openblas on debian: http://verahill.blogspot.com.au/2012/05/building-nwchem-61-on-debian.html

This post isn't really meant to be read -- it just contains troubleshooting data for nwchem. It was originally meant to demonstrate that no matter what I did, nwchem 6.1 would segfault on debian. Well, suddenly some of the builds work.

Background
A visible difference when compiling nwchem 6.1 vs nwchem 6.0 on the same machine, or nwchem 6.1 on debian vs nwchem on rocks/centos, is that the step below takes much, much longer (20-30 minutes) with nwchem 6.1 on debian.
ar r /opt/nwchem/nwchem-6.1/lib/LINUX64/libnwcutil.a md5wrap.o md5.o
ranlib /opt/nwchem/nwchem-6.1/lib/LINUX64/libnwcutil.a
/opt/nwchem/nwchem-6.1/lib/LINUX64/libnwcutil.a
In all cases I've manually edited src/config/makefile.h by adding -lz -lssl to line 1914 (needed for python).
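The makefile.h edit can also be scripted instead of done by hand. What follows is only a minimal sketch, assuming the extra flags simply go at the end of the line in question. `append_flags` is a hypothetical helper name, not something shipped with nwchem, and line 1914 is the number from my copy of 6.1, so print that line first (sed -n '1914p' src/config/makefile.h) and check it is the one you expect.

```shell
# append_flags FILE LINE FLAGS
# Print FILE to stdout with " FLAGS" appended to line number LINE.
# Hypothetical helper. FLAGS must not contain '/', '&' or backslashes,
# because it is pasted straight into a sed replacement.
append_flags() {
    sed "${2}s/\$/ ${3}/" "$1"
}

# Example (path and line number as in this post; adjust for your tree):
#   append_flags src/config/makefile.h 1914 "-lz -lssl" > makefile.h.new
#   diff src/config/makefile.h makefile.h.new   # review before copying it back
```

Writing to stdout rather than editing in place keeps the original file untouched until the diff has been checked.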

INDEX

  • BAD BUILD debian -- internal libs
  • Successful build on debian -- external libs
  • Successful build on ROCKS 5.4.3
  • Good build: Nwchem 6.0 with openblas

BAD BUILD debian on AMD athlon II X3 (-march=opteron) -- internal libs:
Build script:

export LARGE_FILES=TRUE
export TCGRSH=/usr/bin/ssh
export NWCHEM_TOP=`pwd`
export NWCHEM_TARGET=LINUX64
export NWCHEM_MODULES="all python"
export PYTHONHOME=/usr
export PYTHONVERSION=2.7
export USE_MPI=y
export USE_MPIF=y
export USE_MPIF4=y
export MPI_LOC=/usr/lib/openmpi/lib
export MPI_INCLUDE=/usr/lib/openmpi/include
export LIBRARY_PATH=$LIBRARY_PATH:/usr/lib/openmpi/lib
export LIBMPI="-lmpi -lopen-rte -lopen-pal -ldl -lmpi_f77 -lpthread"
cd $NWCHEM_TOP/src
make clean
make nwchem_config
make FC=gfortran

 ldd nwchem 
        linux-vdso.so.1 =>  (0x00007fff0b927000)
        libmpi.so.0 => /usr/lib/libmpi.so.0 (0x00002b2ae1537000)
        libopen-rte.so.0 => /usr/lib/libopen-rte.so.0 (0x00002b2ae17eb000)
        libopen-pal.so.0 => /usr/lib/libopen-pal.so.0 (0x00002b2ae1a3a000)
        libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00002b2ae1c91000)
        libmpi_f77.so.0 => /usr/lib/libmpi_f77.so.0 (0x00002b2ae1e96000)
        libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00002b2ae20ce000)
        libutil.so.1 => /lib/x86_64-linux-gnu/libutil.so.1 (0x00002b2ae22ea000)
        libz.so.1 => /usr/lib/x86_64-linux-gnu/libz.so.1 (0x00002b2ae24ee000)
        libssl.so.1.0.0 => /usr/lib/x86_64-linux-gnu/libssl.so.1.0.0 (0x00002b2ae2704000)
        libgfortran.so.3 => /usr/lib/x86_64-linux-gnu/libgfortran.so.3 (0x00002b2ae2962000)
        libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00002b2ae2c79000)
        libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00002b2ae2efb000)
        libquadmath.so.0 => /usr/lib/x86_64-linux-gnu/libquadmath.so.0 (0x00002b2ae3111000)
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00002b2ae3347000)
        libcrypto.so.1.0.0 => /usr/lib/x86_64-linux-gnu/libcrypto.so.1.0.0 (0x00002b2ae36ce000)
        libnsl.so.1 => /lib/x86_64-linux-gnu/libnsl.so.1 (0x00002b2ae3ab2000)
        /lib64/ld-linux-x86-64.so.2 (0x00002b2ae1315000)
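
A quick way to see whether a binary pulls in an external BLAS at run time is to filter the ldd output for BLAS-like names. This is a minimal sketch: `blas_libs` is a hypothetical helper, and the name patterns are my own guess at what to look for, not an exhaustive list. Nothing in the listing above matches, which fits a build where the BLAS routines come from the internal libs and are linked into the nwchem binary itself.

```shell
# blas_libs: read `ldd` output on stdin and print the names of shared
# libraries that look like a BLAS implementation.
# Returns non-zero (and prints nothing) when none are found.
# The patterns below are assumptions, not a complete list.
blas_libs() {
    awk '$1 ~ /blas|atlas|mkl|acml/ { print $1; found = 1 }
         END { exit !found }'
}

# Example:
#   ldd nwchem | blas_libs || echo "no external BLAS linked dynamically"
```

Note that a BLAS linked statically from a .a file would not show up here either, so an empty result only says that nothing is loaded dynamically.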

mpirun -n 1 nwchem nwch.nw
      Screening Tolerance Information
      -------------------------------
          Density screening/tol_rho: 1.00D-10
          AO Gaussian exp screening on grid/accAOfunc:  14
          CD Gaussian exp screening on grid/accCDfunc:  20
          XC Gaussian exp screening on grid/accXCfunc:  20
          Schwarz screening/accCoul: 1.00D-08
0:Segmentation Violation error, status=: 11
(rank:0 hostname:boron pid:7270):ARMCI DASSERT fail. ../../ga-5-1/armci/src/common/signaltrap.c:SigSegvHandler():310 cond:0
nwchem: malloc.c:3096: sYSMALLOc: Assertion `(old_top == (((mbinptr) (((char *) &((av)->bins[((1) - 1) * 2])) - __builtin_offsetof (struct malloc_chunk, fd)))) && old_size == 0) || ((unsigned long) (old_size) >= (unsigned long)((((__builtin_offsetof (struct malloc_chunk, fd_nextsize))+((2 * (sizeof(size_t))) - 1)) & ~((2 * (sizeof(size_t))) - 1))) && ((old_top)->size & 0x1) && ((unsigned long)old_end & pagemask) == 0)' failed.
[boron:07270] *** Process received signal ***
[boron:07270] Signal: Aborted (6)
[boron:07270] Signal code:  (-6)
[boron:07270] [ 0] /lib/x86_64-linux-gnu/libpthread.so.0(+0xf030) [0x2b5a8893b030]
[boron:07270] [ 1] /lib/x86_64-linux-gnu/libc.so.6(gsignal+0x35) [0x2b5a89bd7475]
[boron:07270] [ 2] /lib/x86_64-linux-gnu/libc.so.6(abort+0x180) [0x2b5a89bda6f0]
[boron:07270] [ 3] /lib/x86_64-linux-gnu/libc.so.6(+0x75d4a) [0x2b5a89c1ad4a]
[boron:07270] [ 4] /lib/x86_64-linux-gnu/libc.so.6(+0x78c73) [0x2b5a89c1dc73]
[boron:07270] [ 5] /lib/x86_64-linux-gnu/libc.so.6(__libc_malloc+0x70) [0x2b5a89c1f960]
[boron:07270] [ 6] /usr/lib/libopen-pal.so.0(+0x32538) [0x2b5a882ca538]
[boron:07270] [ 7] /usr/lib/libopen-pal.so.0(opal_show_help_vstring+0xac) [0x2b5a882c804c]
[boron:07270] [ 8] /usr/lib/libopen-rte.so.0(orte_show_help+0xac) [0x2b5a8806406c]
[boron:07270] [ 9] /usr/lib/libmpi.so.0(MPI_Abort+0x74) [0x2b5a87ddfd14]
[boron:07270] [10] ./nwchem(armci_msg_abort+0x12) [0x2a27a22]
[boron:07270] [11] ./nwchem(dassertp_fail+0x113) [0x2a14d43]
[boron:07270] [12] /lib/x86_64-linux-gnu/libc.so.6(+0x324f0) [0x2b5a89bd74f0]
[boron:07270] [13] ./nwchem(dgemm_+0x44a) [0x2c65b9a]
[boron:07270] [14] ./nwchem() [0x2931960]
[boron:07270] [15] ./nwchem(wnga_matmul+0x1f05) [0x2934e15]
[boron:07270] [16] ./nwchem(ga_dgemm_+0x9b) [0x28a5deb]
[boron:07270] [17] ./nwchem(diis_bld12_+0x10a) [0x6ee9da]
[boron:07270] [18] ./nwchem(dft_main0d_+0x1fac) [0x6d32d8]
[boron:07270] [19] ./nwchem(nwdft_+0xb89) [0x6c8cf5]
[boron:07270] [20] ./nwchem(dft_energy_gradient_+0x3b) [0x6a1adb]
[boron:07270] [21] ./nwchem(task_gradient_doit_+0x36f) [0x535f47]
[boron:07270] [22] ./nwchem(task_gradient_+0x2cc) [0x5373c7]
[boron:07270] [23] ./nwchem(driver_+0x1f2) [0x642785]
[boron:07270] [24] ./nwchem(task_optimize_+0x4eb) [0x537e3b]
[boron:07270] [25] ./nwchem(task_+0x10b6) [0x528e27]
[boron:07270] [26] ./nwchem() [0x521b80]
[boron:07270] [27] ./nwchem(main+0x1d) [0x522081]
[boron:07270] [28] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xfd) [0x2b5a89bc3ead]
[boron:07270] [29] ./nwchem() [0x5202b1]
[boron:07270] *** End of error message ***
--------------------------------------------------------------------------
mpirun noticed that process rank 0 with PID 7270 on node boron exited on signal 6 (Aborted).
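
The backtrace is easier to read once it is reduced to the frames inside the nwchem binary. The sketch below does that with sed and awk; both function names are hypothetical. It assumes, as in the trace above, that dassertp_fail is the last frame belonging to the signal handler, so the next nwchem frame is the one that was running when the SIGSEGV arrived. For this trace that frame is dgemm_.

```shell
# nwchem_frames: read an Open MPI backtrace on stdin and print the symbol
# names of the frames inside the nwchem binary, innermost first.
# Frames without a symbol, e.g. "./nwchem() [0x2931960]", are skipped.
nwchem_frames() {
    sed -n 's/.*\] \.\/nwchem(\([A-Za-z_0-9]*\)+0x[0-9a-f]*).*/\1/p'
}

# first_frame_after_handler: print the nwchem symbol that follows
# dassertp_fail in the trace.
# Assumption: dassertp_fail is the last handler-side frame. If the frame
# that actually faulted has no symbol, this prints the next named one.
first_frame_after_handler() {
    nwchem_frames | awk 'prev == "dassertp_fail" { print; exit } { prev = $0 }'
}

# Example:
#   mpirun -n 1 nwchem nwch.nw 2>&1 | first_frame_after_handler
```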

nwchem nwch.nw
      Screening Tolerance Information
      -------------------------------
          Density screening/tol_rho: 1.00D-10
          AO Gaussian exp screening on grid/accAOfunc:  14
          CD Gaussian exp screening on grid/accCDfunc:  20
          XC Gaussian exp screening on grid/accXCfunc:  20
          Schwarz screening/accCoul: 1.00D-08
0:Segmentation Violation error, status=: 11
(rank:0 hostname:boron pid:7282):ARMCI DASSERT fail. ../../ga-5-1/armci/src/common/signaltrap.c:SigSegvHandler():310 cond:0
nwchem: malloc.c:3096: sYSMALLOc: Assertion `(old_top == (((mbinptr) (((char *) &((av)->bins[((1) - 1) * 2])) - __builtin_offsetof (struct malloc_chunk, fd)))) && old_size == 0) || ((unsigned long) (old_size) >= (unsigned long)((((__builtin_offsetof (struct malloc_chunk, fd_nextsize))+((2 * (sizeof(size_t))) - 1)) & ~((2 * (sizeof(size_t))) - 1))) && ((old_top)->size & 0x1) && ((unsigned long)old_end & pagemask) == 0)' failed.
[boron:07282] *** Process received signal ***
[boron:07282] Signal: Aborted (6)
[boron:07282] Signal code:  (-6)
[boron:07282] [ 0] /lib/x86_64-linux-gnu/libpthread.so.0(+0xf030) [0x2ade86678030]
[boron:07282] [ 1] /lib/x86_64-linux-gnu/libc.so.6(gsignal+0x35) [0x2ade87914475]
[boron:07282] [ 2] /lib/x86_64-linux-gnu/libc.so.6(abort+0x180) [0x2ade879176f0]
[boron:07282] [ 3] /lib/x86_64-linux-gnu/libc.so.6(+0x75d4a) [0x2ade87957d4a]
[boron:07282] [ 4] /lib/x86_64-linux-gnu/libc.so.6(+0x78c73) [0x2ade8795ac73]
[boron:07282] [ 5] /lib/x86_64-linux-gnu/libc.so.6(__libc_malloc+0x70) [0x2ade8795c960]
[boron:07282] [ 6] /usr/lib/libopen-pal.so.0(+0x32538) [0x2ade86007538]
[boron:07282] [ 7] /usr/lib/libopen-pal.so.0(opal_show_help_vstring+0xac) [0x2ade8600504c]
[boron:07282] [ 8] /usr/lib/libopen-rte.so.0(orte_show_help+0xac) [0x2ade85da106c]
[boron:07282] [ 9] /usr/lib/libmpi.so.0(MPI_Abort+0x74) [0x2ade85b1cd14]
[boron:07282] [10] ./nwchem(armci_msg_abort+0x12) [0x2a27a22]
[boron:07282] [11] ./nwchem(dassertp_fail+0x113) [0x2a14d43]
[boron:07282] [12] /lib/x86_64-linux-gnu/libc.so.6(+0x324f0) [0x2ade879144f0]
[boron:07282] [13] ./nwchem(dgemm_+0x43f) [0x2c65b8f]
[boron:07282] [14] ./nwchem() [0x2931960]
[boron:07282] [15] ./nwchem(wnga_matmul+0x1f05) [0x2934e15]
[boron:07282] [16] ./nwchem(ga_dgemm_+0x9b) [0x28a5deb]
[boron:07282] [17] ./nwchem(diis_bld12_+0x10a) [0x6ee9da]
[boron:07282] [18] ./nwchem(dft_main0d_+0x1fac) [0x6d32d8]
[boron:07282] [19] ./nwchem(nwdft_+0xb89) [0x6c8cf5]
[boron:07282] [20] ./nwchem(dft_energy_gradient_+0x3b) [0x6a1adb]
[boron:07282] [21] ./nwchem(task_gradient_doit_+0x36f) [0x535f47]
[boron:07282] [22] ./nwchem(task_gradient_+0x2cc) [0x5373c7]
[boron:07282] [23] ./nwchem(driver_+0x1f2) [0x642785]
[boron:07282] [24] ./nwchem(task_optimize_+0x4eb) [0x537e3b]
[boron:07282] [25] ./nwchem(task_+0x10b6) [0x528e27]
[boron:07282] [26] ./nwchem() [0x521b80]
[boron:07282] [27] ./nwchem(main+0x1d) [0x522081]
[boron:07282] [28] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xfd) [0x2ade87900ead]
[boron:07282] [29] ./nwchem() [0x5202b1]
[boron:07282] *** End of error message ***
Aborted



 ltrace nwchem nwch.nw

MPI_Barrier(0x119f72d0, 0x7fff2e777340, 1, 0x2e2e7d0, 1)                                                                                           = 0
MPI_Barrier(0x119f72d0, 0x7fff2e776d90, 1, 0x7fff2e776dd0, 1)                                                                                      = 0
--- SIGSEGV (Segmentation fault) ---
vfprintf(0x2b1e48f4e7a0, "%d:%s: %d\n", 0x7fff2e7757580:Segmentation Violation error, status=: 11
)                                                                                            = 44
getpid()                                                                                                                                           = 7346
printf("(rank:%d hostname:%s pid:%d):ARM"..., 0, "boron", 7346(rank:0 hostname:boron pid:7346):ARMCI DASSERT fail. ../../ga-5-1/armci/src/common/signaltrap.c:SigSegvHandler():310 cond:0
)                                                                                    = 124
__errno_location()                                                                                                                                 = 0x2b1e49552088
free(0x119effb0)                                                                                                                                   = <void>
signal(15, NULL)                                                                                                                                   = 0x02a56230
signal(2, 0x02a56460)                                                                                                                              = 0x02a564b0
signal(11, 0x2b1e472f2690)                                                                                                                         = 0x02a56320
MPI_Abort(0x119f72d0, 11, 0, -1, 0x7fff2e775440nwchem: malloc.c:3096: sYSMALLOc: Assertion `(old_top == (((mbinptr) (((char *) &((av)->bins[((1) - 1) * 2])) - __builtin_offsetof (struct malloc_chunk, fd)))) && old_size == 0) || ((unsigned long) (old_size) >= (unsigned long)((((__builtin_offsetof (struct malloc_chunk, fd_nextsize))+((2 * (sizeof(size_t))) - 1)) & ~((2 * (sizeof(size_t))) - 1))) && ((old_top)->size & 0x1) && ((unsigned long)old_end & pagemask) == 0)' failed.
 <unfinished ...>
--- SIGABRT (Aborted) ---
[boron:07346] *** Process received signal ***
[boron:07346] Signal: Aborted (6)
[boron:07346] Signal code:  (-6)
[boron:07346] [ 0] /lib/x86_64-linux-gnu/libpthread.so.0(+0xf030) [0x2b1e47963030]
[boron:07346] [ 1] /lib/x86_64-linux-gnu/libc.so.6(gsignal+0x35) [0x2b1e48bff475]
[boron:07346] [ 2] /lib/x86_64-linux-gnu/libc.so.6(abort+0x180) [0x2b1e48c026f0]
[boron:07346] [ 3] /lib/x86_64-linux-gnu/libc.so.6(+0x75d4a) [0x2b1e48c42d4a]
[boron:07346] [ 4] /lib/x86_64-linux-gnu/libc.so.6(+0x78c73) [0x2b1e48c45c73]
[boron:07346] [ 5] /lib/x86_64-linux-gnu/libc.so.6(__libc_malloc+0x70) [0x2b1e48c47960]
[boron:07346] [ 6] /usr/lib/libopen-pal.so.0(+0x32538) [0x2b1e472f2538]
[boron:07346] [ 7] /usr/lib/libopen-pal.so.0(opal_show_help_vstring+0xac) [0x2b1e472f004c]
[boron:07346] [ 8] /usr/lib/libopen-rte.so.0(orte_show_help+0xac) [0x2b1e4708c06c]
[boron:07346] [ 9] /usr/lib/libmpi.so.0(MPI_Abort+0x74) [0x2b1e46e07d14]
[boron:07346] [10] ./nwchem(armci_msg_abort+0x12) [0x2a27a22]
[boron:07346] [11] ./nwchem(dassertp_fail+0x113) [0x2a14d43]
[boron:07346] [12] /lib/x86_64-linux-gnu/libc.so.6(+0x324f0) [0x2b1e48bff4f0]
[boron:07346] [13] ./nwchem(dgemm_+0x446) [0x2c65b96]
[boron:07346] [14] ./nwchem() [0x2931960]
[boron:07346] [15] ./nwchem(wnga_matmul+0x1f05) [0x2934e15]
[boron:07346] [16] ./nwchem(ga_dgemm_+0x9b) [0x28a5deb]
[boron:07346] [17] ./nwchem(diis_bld12_+0x10a) [0x6ee9da]
[boron:07346] [18] ./nwchem(dft_main0d_+0x1fac) [0x6d32d8]
[boron:07346] [19] ./nwchem(nwdft_+0xb89) [0x6c8cf5]
[boron:07346] [20] ./nwchem(dft_energy_gradient_+0x3b) [0x6a1adb]
[boron:07346] [21] ./nwchem(task_gradient_doit_+0x36f) [0x535f47]
[boron:07346] [22] ./nwchem(task_gradient_+0x2cc) [0x5373c7]
[boron:07346] [23] ./nwchem(driver_+0x1f2) [0x642785]
[boron:07346] [24] ./nwchem(task_optimize_+0x4eb) [0x537e3b]
[boron:07346] [25] ./nwchem(task_+0x10b6) [0x528e27]
[boron:07346] [26] ./nwchem() [0x521b80]
[boron:07346] [27] ./nwchem(main+0x1d) [0x522081]
[boron:07346] [28] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xfd) [0x2b1e48bebead]
[boron:07346] [29] ./nwchem() [0x5202b1]
[boron:07346] *** End of error message ***
unexpected breakpoint at 0x2b1e48bff474
--- SIGABRT (Aborted) ---
+++ killed by SIGABRT +++

strace nwchem nwch.nw


write(1, "(rank:0 hostname:boron pid:7439)"..., 124(rank:0 hostname:boron pid:7439):ARMCI DASSERT fail. ../../ga-5-1/armci/src/common/signaltrap.c:SigSegvHandler():310 cond:0
) = 124
rt_sigaction(SIGTERM, {SIG_DFL, [TERM], SA_RESTORER|SA_RESTART, 0x2b5c984ef4f0}, {0x2a56230, [TERM], SA_RESTORER|SA_RESTART, 0x2b5c984ef4f0}, 8) = 0
rt_sigaction(SIGINT, {0x2a56460, [INT], SA_RESTORER|SA_RESTART, 0x2b5c984ef4f0}, {0x2a564b0, [INT], SA_RESTORER|SA_RESTART, 0x2b5c984ef4f0}, 8) = 0
rt_sigaction(SIGSEGV, {0x2b5c96be2690, [SEGV], SA_RESTORER|SA_RESTART, 0x2b5c984ef4f0}, {0x2a56320, [SEGV], SA_RESTORER|SA_RESTART, 0x2b5c984ef4f0}, 8) = 0
open("/usr/share/openmpi/help-mpi-api.txt", O_RDONLY) = 14
write(2, "nwchem: malloc.c:3096: sYSMALLOc"..., 431nwchem: malloc.c:3096: sYSMALLOc: Assertion `(old_top == (((mbinptr) (((char *) &((av)->bins[((1) - 1) * 2])) - __builtin_offsetof (struct malloc_chunk, fd)))) && old_size == 0) || ((unsigned long) (old_size) >= (unsigned long)((((__builtin_offsetof (struct malloc_chunk, fd_nextsize))+((2 * (sizeof(size_t))) - 1)) & ~((2 * (sizeof(size_t))) - 1))) && ((old_top)->size & 0x1) && ((unsigned long)old_end & pagemask) == 0)' failed.
) = 431
rt_sigprocmask(SIG_UNBLOCK, [ABRT], NULL, 8) = 0
tgkill(7439, 7439, SIGABRT)             = 0
--- SIGABRT (Aborted) @ 0 (0) ---
write(2, "[boron:07439] *** Process receiv"..., 46[boron:07439] *** Process received signal ***
) = 46
futex(0x2b5c98840840, FUTEX_WAKE_PRIVATE, 2147483647) = 0
write(2, "[boron:07439] Signal: Aborted (6"..., 67[boron:07439] Signal: Aborted (6)
[boron:07439] Signal code:  (-6)
) = 67
mmap(NULL, 134217728, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_NORESERVE, -1, 0) = 0x2b5ca16c5000
munmap(0x2b5ca16c5000, 43233280)        = 0
munmap(0x2b5ca8000000, 23875584)        = 0
mprotect(0x2b5ca4000000, 135168, PROT_READ|PROT_WRITE) = 0
futex(0x2b5c98842590, FUTEX_WAKE_PRIVATE, 2147483647) = 0
futex(0x2b5c98286404, FUTEX_WAKE_PRIVATE, 2147483647) = 0
write(2, "[boron:07439] [ 0] /lib/x86_64-l"..., 83[boron:07439] [ 0] /lib/x86_64-linux-gnu/libpthread.so.0(+0xf030) [0x2b5c97253030]
) = 83
write(2, "[boron:07439] [ 1] /lib/x86_64-l"..., 82[boron:07439] [ 1] /lib/x86_64-linux-gnu/libc.so.6(gsignal+0x35) [0x2b5c984ef475]
) = 82
write(2, "[boron:07439] [ 2] /lib/x86_64-l"..., 81[boron:07439] [ 2] /lib/x86_64-linux-gnu/libc.so.6(abort+0x180) [0x2b5c984f26f0]
) = 81
write(2, "[boron:07439] [ 3] /lib/x86_64-l"..., 78[boron:07439] [ 3] /lib/x86_64-linux-gnu/libc.so.6(+0x75d4a) [0x2b5c98532d4a]
) = 78
write(2, "[boron:07439] [ 4] /lib/x86_64-l"..., 78[boron:07439] [ 4] /lib/x86_64-linux-gnu/libc.so.6(+0x78c73) [0x2b5c98535c73]
) = 78
write(2, "[boron:07439] [ 5] /lib/x86_64-l"..., 88[boron:07439] [ 5] /lib/x86_64-linux-gnu/libc.so.6(__libc_malloc+0x70) [0x2b5c98537960]
) = 88
write(2, "[boron:07439] [ 6] /usr/lib/libo"..., 72[boron:07439] [ 6] /usr/lib/libopen-pal.so.0(+0x32538) [0x2b5c96be2538]
) = 72
write(2, "[boron:07439] [ 7] /usr/lib/libo"..., 91[boron:07439] [ 7] /usr/lib/libopen-pal.so.0(opal_show_help_vstring+0xac) [0x2b5c96be004c]
) = 91
write(2, "[boron:07439] [ 8] /usr/lib/libo"..., 83[boron:07439] [ 8] /usr/lib/libopen-rte.so.0(orte_show_help+0xac) [0x2b5c9697c06c]
) = 83
write(2, "[boron:07439] [ 9] /usr/lib/libm"..., 73[boron:07439] [ 9] /usr/lib/libmpi.so.0(MPI_Abort+0x74) [0x2b5c966f7d14]
) = 73
write(2, "[boron:07439] [10] ./nwchem(armc"..., 62[boron:07439] [10] ./nwchem(armci_msg_abort+0x12) [0x2a27a22]
) = 62
write(2, "[boron:07439] [11] ./nwchem(dass"..., 61[boron:07439] [11] ./nwchem(dassertp_fail+0x113) [0x2a14d43]
) = 61
write(2, "[boron:07439] [12] /lib/x86_64-l"..., 78[boron:07439] [12] /lib/x86_64-linux-gnu/libc.so.6(+0x324f0) [0x2b5c984ef4f0]
) = 78
write(2, "[boron:07439] [13] ./nwchem(dgem"..., 54[boron:07439] [13] ./nwchem(dgemm_+0x446) [0x2c65b96]
) = 54
write(2, "[boron:07439] [14] ./nwchem() [0"..., 42[boron:07439] [14] ./nwchem() [0x2931960]
) = 42
write(2, "[boron:07439] [15] ./nwchem(wnga"..., 60[boron:07439] [15] ./nwchem(wnga_matmul+0x1f05) [0x2934e15]
) = 60
write(2, "[boron:07439] [16] ./nwchem(ga_d"..., 56[boron:07439] [16] ./nwchem(ga_dgemm_+0x9b) [0x28a5deb]
) = 56
write(2, "[boron:07439] [17] ./nwchem(diis"..., 58[boron:07439] [17] ./nwchem(diis_bld12_+0x10a) [0x6ee9da]
) = 58
write(2, "[boron:07439] [18] ./nwchem(dft_"..., 59[boron:07439] [18] ./nwchem(dft_main0d_+0x1fac) [0x6d32d8]
) = 59
write(2, "[boron:07439] [19] ./nwchem(nwdf"..., 53[boron:07439] [19] ./nwchem(nwdft_+0xb89) [0x6c8cf5]
) = 53
write(2, "[boron:07439] [20] ./nwchem(dft_"..., 66[boron:07439] [20] ./nwchem(dft_energy_gradient_+0x3b) [0x6a1adb]
) = 66
write(2, "[boron:07439] [21] ./nwchem(task"..., 66[boron:07439] [21] ./nwchem(task_gradient_doit_+0x36f) [0x535f47]
) = 66
write(2, "[boron:07439] [22] ./nwchem(task"..., 61[boron:07439] [22] ./nwchem(task_gradient_+0x2cc) [0x5373c7]
) = 61
write(2, "[boron:07439] [23] ./nwchem(driv"..., 54[boron:07439] [23] ./nwchem(driver_+0x1f2) [0x642785]
) = 54
write(2, "[boron:07439] [24] ./nwchem(task"..., 61[boron:07439] [24] ./nwchem(task_optimize_+0x4eb) [0x537e3b]
) = 61
write(2, "[boron:07439] [25] ./nwchem(task"..., 53[boron:07439] [25] ./nwchem(task_+0x10b6) [0x528e27]
) = 53
write(2, "[boron:07439] [26] ./nwchem() [0"..., 41[boron:07439] [26] ./nwchem() [0x521b80]
) = 41
write(2, "[boron:07439] [27] ./nwchem(main"..., 50[boron:07439] [27] ./nwchem(main+0x1d) [0x522081]
) = 50
write(2, "[boron:07439] [28] /lib/x86_64-l"..., 92[boron:07439] [28] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xfd) [0x2b5c984dbead]
) = 92
write(2, "[boron:07439] [29] ./nwchem() [0"..., 41[boron:07439] [29] ./nwchem() [0x5202b1]
) = 41
write(2, "[boron:07439] *** End of error m"..., 43[boron:07439] *** End of error message ***
) = 43
rt_sigreturn(0x2b5c9883e880)            = 0
rt_sigaction(SIGABRT, {SIG_DFL, ~[], SA_RESTORER, 0x2b5c984ef4f0}, NULL, 8) = 0
tgkill(7439, 7439, SIGABRT)             = 0
--- SIGABRT (Aborted) @ 0 (0) ---
+++ killed by SIGABRT +++
Aborted

file /usr/lib/openmpi/lib/*.[012]
/usr/lib/openmpi/lib/libmca_common_sm.so.1.0.0:  ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, BuildID[sha1]=0x731251ada812ba409b917e7ec925e75cb95fcc52, stripped
/usr/lib/openmpi/lib/libmpi_cxx.so.0.0.1:        ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, BuildID[sha1]=0xe679d9fdb2473fb97bd107690b02491280462741, stripped
/usr/lib/openmpi/lib/libmpi_f77.so.0.0.1:        ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, BuildID[sha1]=0xdbdf7ec4bd4a97f956e3ff21ebbbdb065e8dc07d, stripped
/usr/lib/openmpi/lib/libmpi_f90.so.0.0.1:        ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, BuildID[sha1]=0xf1a77e4b758dcfe4838c6aaf5e92f65f770aded8, stripped
/usr/lib/openmpi/lib/libmpi.so.0.0.2:            ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, BuildID[sha1]=0xd8f75c840e38687f22b8a8f2c52ffb29b8a24441, stripped
/usr/lib/openmpi/lib/libopenmpi_malloc.so.0.0.0: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, BuildID[sha1]=0xaaf2d64a0c2b989d4bcc93be37e6af122ba701e8, stripped
/usr/lib/openmpi/lib/libopen-pal.so.0.0.0:       ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, BuildID[sha1]=0x6f1f28e594e0db80ea54dba9e3f497c5787039f2, stripped
/usr/lib/openmpi/lib/libopen-rte.so.0.0.0:       ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, BuildID[sha1]=0x15b7f2b0e8ef92cb7cd357c9f8c7693d3d90a058, stripped

cat src/tools/build/config.log |egrep "BLAS|blas|lapack|LAPACK"
configure:18162: Checks for BLAS,LAPACK,ScaLAPACK
configure:18408: Attempting to locate BLAS library
configure:18414: checking for BLAS with user-supplied flags
configure:18546: checking for BLAS in AMD Core Math Library
configure:18682: checking for BLAS in Intel Math Kernel Library
configure:18818: checking for BLAS in ATLAS
configure:18939: gfortran -o conftest     conftest.f -lf77blas -latlas -lm  >&5
/usr/bin/ld: cannot find -lf77blas
configure:18961: checking for BLAS in PhiPACK libraries
configure:19079: gfortran -o conftest     conftest.f -lsgemm -ldgemm -lblas -lm  >&5
configure:19101: checking for BLAS in Apple Accelerate.framework
configure:19227: checking for BLAS in Apple vecLib.framework
configure:19353: checking for BLAS in Alpha CXML library
configure:19608: checking for BLAS in Sun Performance Library
configure:19861: checking for BLAS in SGI/Cray Scientific Library
configure:19987: checking for BLAS in SGIMATH library
configure:20113: checking for BLAS in IBM ESSL library
configure:20224: gfortran -o conftest     conftest.f -lessl -lblas -lm  >&5
configure:20246: checking for BLAS in generic library
configure:20344: gfortran -o conftest     conftest.f -lblas -lm  >&5
configure:20574: Attempting to locate LAPACK library
configure:20637: checking for Fortran 77 LAPACK with user-supplied flags
configure:20652: gfortran -o conftest      conftest.f  -lblas -lm  >&5
configure:20676: checking for dgetrs_ in -llapack
configure:20709: cc -o conftest        conftest.c -llapack  -L/usr/lib/openmpi/lib/../lib -L/usr/lib/gcc/x86_64-linux-gnu/4.6 -L/usr/lib/gcc/x86_64-linux-gnu/4.6/../../../x86_64-linux-gnu -L/usr/lib/gcc/x86_64-linux-gnu/4.6/../../../../lib -L/lib/x86_64-linux-gnu -L/lib/../lib -L/usr/lib/x86_64-linux-gnu -L/usr/lib/../lib -L. -L/usr/lib/openmpi/lib -L/usr/lib/gcc/x86_64-linux-gnu/4.6/../../.. -lgfortran -lm -lquadmath -lblas -lm  >&5
/usr/bin/ld: cannot find -llapack
| #define HAVE_BLAS 1
| #define BLAS_SIZE 4
configure:20676: checking for dgetrs_ in -llapack_rs6k
configure:20709: cc -o conftest        conftest.c -llapack_rs6k  -L/usr/lib/openmpi/lib/../lib -L/usr/lib/gcc/x86_64-linux-gnu/4.6 -L/usr/lib/gcc/x86_64-linux-gnu/4.6/../../../x86_64-linux-gnu -L/usr/lib/gcc/x86_64-linux-gnu/4.6/../../../../lib -L/lib/x86_64-linux-gnu -L/lib/../lib -L/usr/lib/x86_64-linux-gnu -L/usr/lib/../lib -L. -L/usr/lib/openmpi/lib -L/usr/lib/gcc/x86_64-linux-gnu/4.6/../../.. -lgfortran -lm -lquadmath -lblas -lm  >&5
/usr/bin/ld: cannot find -llapack_rs6k
| #define HAVE_BLAS 1
| #define BLAS_SIZE 4
configure:20741: WARNING: LAPACK library not found, using internal LAPACK
configure:20919: Attempting to locate SCALAPACK library
configure:20926: checking for SCALAPACK with user-supplied flags
configure:20992: gfortran -o conftest       -L/usr/lib/openmpi/lib  conftest.f   -lblas  -lmpi -lopen-rte -lopen-pal -ldl -lmpi_f77 -lpthread -lm  >&5
configure:21014: checking for SCALAPACK in generic library
configure:21081: gfortran -o conftest       -L/usr/lib/openmpi/lib  conftest.f -lscalapack  -lblas  -lmpi -lopen-rte -lopen-pal -ldl -lmpi_f77 -lpthread -lm  >&5
/usr/bin/ld: cannot find -lscalapack
configure:21213: WARNING: ScaLAPACK library not found, interfaces won't be defined
| #define HAVE_BLAS 1
| #define BLAS_SIZE 4
| #define HAVE_LAPACK 0
| #define HAVE_SCALAPACK 0
| #define HAVE_BLAS 1
| #define BLAS_SIZE 4
| #define HAVE_LAPACK 0
| #define HAVE_SCALAPACK 0
configure:38904:           BLAS_LDFLAGS=
configure:38906:              BLAS_LIBS=-lblas
configure:38908:          BLAS_CPPFLAGS=
ac_cv_lib_lapack___dgetrs_=no
ac_cv_lib_lapack_rs6k___dgetrs_=no
BLAS_CPPFLAGS=''
BLAS_LDFLAGS=''
BLAS_LIBS='-lblas'
HAVE_BLAS_FALSE='#'
HAVE_BLAS_TRUE=''
HAVE_LAPACK_FALSE=''
HAVE_LAPACK_TRUE='#'
HAVE_SCALAPACK_FALSE=''
HAVE_SCALAPACK_TRUE='#'
LAPACK_CPPFLAGS=''
LAPACK_LDFLAGS=''
LAPACK_LIBS=''
SCALAPACK_CPPFLAGS=''
SCALAPACK_LDFLAGS=''
SCALAPACK_LIBS=''
#define HAVE_BLAS 1
#define BLAS_SIZE 4
#define HAVE_LAPACK 0
#define HAVE_SCALAPACK 0
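
The grep output above is long, and the part that matters is the final set of #define lines. As far as I can tell, the copies prefixed with "| " are dumps attached to failed test programs, while the un-prefixed ones at the end of config.log are what configure settled on. A minimal sketch that prints only those; `linalg_summary` is a hypothetical helper name.

```shell
# linalg_summary CONFIG_LOG
# Print the final BLAS/LAPACK/ScaLAPACK verdict from an autoconf config.log.
# Only un-prefixed "#define" lines are kept, and duplicates are removed.
linalg_summary() {
    grep -E '^#define (HAVE_(BLAS|LAPACK|SCALAPACK)|BLAS_SIZE) ' "$1" |
        LC_ALL=C sort -u
}

# Example (path as used in this post):
#   linalg_summary src/tools/build/config.log
```

For the log above this leaves four lines: BLAS_SIZE 4, HAVE_BLAS 1, HAVE_LAPACK 0 and HAVE_SCALAPACK 0.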



Successful build on debian -- external libs

Build script:
export LARGE_FILES=TRUE
export TCGRSH=/usr/bin/ssh
export NWCHEM_TOP=`pwd`
export NWCHEM_TARGET=LINUX64
export NWCHEM_MODULES="all python"
export PYTHONVERSION=2.7
export PYTHONHOME=/usr
export BLASOPT="-L/opt/openblas/lib -lopenblas -lopenblas_barcelona-r0.1.1"
export USE_MPI=y
export USE_MPIF=y
export USE_MPIF4=y
export MPI_LOC=/usr/lib/openmpi/lib
export MPI_INCLUDE=/usr/lib/openmpi/include
export LIBRARY_PATH=$LIBRARY_PATH:/usr/lib/openmpi/lib
export LIBMPI="-lmpi -lopen-rte -lopen-pal -ldl -lmpi_f77 -lpthread"
cd $NWCHEM_TOP/src
make clean
make nwchem_config
make FC=gfortran

ldd nwchem 
        linux-vdso.so.1 =>  (0x00007fffbdfff000)
        libopenblas.so.0 => /opt/openblas/lib/libopenblas.so.0 (0x00002b1065eef000)
        libmpi.so.0 => /usr/lib/libmpi.so.0 (0x00002b1066cfc000)
        libopen-rte.so.0 => /usr/lib/libopen-rte.so.0 (0x00002b1066faf000)
        libopen-pal.so.0 => /usr/lib/libopen-pal.so.0 (0x00002b10671fe000)
        libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00002b1067456000)
        libmpi_f77.so.0 => /usr/lib/libmpi_f77.so.0 (0x00002b106765a000)
        libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00002b1067892000)
        libutil.so.1 => /lib/x86_64-linux-gnu/libutil.so.1 (0x00002b1067aaf000)
        libz.so.1 => /usr/lib/x86_64-linux-gnu/libz.so.1 (0x00002b1067cb2000)
        libssl.so.1.0.0 => /usr/lib/x86_64-linux-gnu/libssl.so.1.0.0 (0x00002b1067ec8000)
        libgfortran.so.3 => /usr/lib/x86_64-linux-gnu/libgfortran.so.3 (0x00002b1068127000)
        libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00002b106843d000)
        libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00002b10686bf000)
        libquadmath.so.0 => /usr/lib/x86_64-linux-gnu/libquadmath.so.0 (0x00002b10688d6000)
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00002b1068b0b000)
        libcrypto.so.1.0.0 => /usr/lib/x86_64-linux-gnu/libcrypto.so.1.0.0 (0x00002b1068e92000)
        libnsl.so.1 => /lib/x86_64-linux-gnu/libnsl.so.1 (0x00002b1069277000)
        /lib64/ld-linux-x86-64.so.2 (0x00002b1065ccd000)
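
To compare this listing with the one from the failing build, the load addresses have to be ignored and only the library names compared. The following is a minimal sketch that does this for two saved ldd outputs; the function names and the file names in the example are hypothetical. For the two debian builds in this post the only name that differs is libopenblas.so.0.

```shell
# lib_names FILE: print the sorted, unique first column (the library name)
# of a saved `ldd` output.
lib_names() {
    awk '{ print $1 }' "$1" | LC_ALL=C sort -u
}

# ldd_diff FILE_A FILE_B: print the names that appear in only one of the two
# listings. Names only in FILE_A start at column 1; names only in FILE_B
# are indented with a tab (standard `comm -3` layout).
ldd_diff() {
    _a=$(mktemp) _b=$(mktemp)
    lib_names "$1" > "$_a"
    lib_names "$2" > "$_b"
    LC_ALL=C comm -3 "$_a" "$_b"
    rm -f "$_a" "$_b"
}

# Example (hypothetical file names):
#   ldd /path/to/bad/nwchem  > bad.ldd
#   ldd /path/to/good/nwchem > good.ldd
#   ldd_diff bad.ldd good.ldd
```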

mpirun -n 1 nwchem test.nw
WORKS!


gfortran --version
GNU Fortran (Debian 4.6.3-1) 4.6.3

ar t libnwcutil.a
basis.o bas_input.o bas_contrib.o bas_checksum.o basisP.o bas_blas.o bas_blasP.o bas_vec_info.o geom.o geom_input.o geom_input2.o geom_3d.o geom_2d.o geom_1d.o geom_numcore.o geom_checksum.o geom_print_ecce.o geom_freeze.o geom_fragment.o geom_getsym.o geom_hnd.o c_inp.o hnd_rdfree.o inp.o inp_irange.o inp_ilist.o pstat_init.o pstat_term.o pstat_alloc.o pstat_on.o pstat_off.o pstat_pr_all.o pstat_free.o pstat_pr_han.o pstat_pr_det.o hdbm.o rtdb_f2c.o rtdb.o rtdb_seq.o context.o context_f2c.o dosymops.o gensym.o sym_vec_sym.o ludcmp.o mprint.o opprint.o sym_put_geom.o spgen.o sym_map.o sym_nwc.o sym_apply_op.o sym_get_cart.o sym_grp_name.o sym_ap_cart.o sym_cent_map.o sym_num_ops.o sym_ops_get.o sym_pr_all.o sym_geom_prj.o cross_prod.o sym_op_cname.o sym_pr_ops.o deter3.o sym_op_clsfy.o sym_tr_bs_op.o sym_bs_irrep.o sym_op_type.o sym_char_tab.o sym_pr_ctab.o sym_inv_op.o sym_mo_adapt.o sym_g_sym.o wrcell.o dctr.o sym_irrepname.o sym_abelian.o sym_sym.o sym_mo_ap_op.o sym_bas_op.o sym_sh_pair.o md5wrap.o md5.o output.o errquit.o ffflush.o print_center.o util_flush.o util_host.o util_date.o input_echo.o util_transpose.o ga_iter_diag.o ga_maxelt.o ga_pcg_min.o line_search.o ga_orth_vec.o ga_ran_fill.o ga_mix.o ga_list.o ga_it_proj.o ga_screen.o ga_get_diag.o fortchar.o seq_output.o ga_mat2col.o util_ch_brd.o two_ind_trn.o util_pname.o sread.o swrite.o banner.o util_print.o util_version.o util_nwchem_paper.o mk_fit_xf.o int_2c_ga.o ga_local_mdot.o util_cpusec.o util_wallsec.o gather.o scatter.o ga_trace_dg.o lcopy.o util_legal.o util_file_name.o util_io_unit.o util_speak.o util_rtdb_speak.o util_file_copy.o util_file_unlink.o util_system.o util_sleep.o util_rtdb_state.o ecce_print.o util_random.o util_job.o util_getenv.o util_getarg.o util_nwchemrc.o util_md.o util_md_c.o util_md_sockets.o dgewr.o atoi.o indint.o util_wall_remain.o ga_normf.o corr_mk_ref.o nw_inp_from_file.o bgj.o movecs_ecce.o get_density.o moeig_read.o util_debug.o util_erf.o ga_it2.o ma_print.o 
freeze_input.o ga_extra.o util_test.o util_ga_test.o util.o util_patch_test.o util_ndim_test.o util_perf_test.o util_test_lu.o util_test_eig.o util_dra_test.o util_eaf_test.o util_sf_test.o ga_lkain_2cpl3.o util_io.o util_xyz.o util_ma.o util_mpinap.o linux_cpu.o linux_shift.o linux_random.o erfc.o linux_setfpucw.o ga_matpow.o util_pack.o dabssum.o dabsmax.o dfill.o ifill.o mabyte_fill.o ga_it_lsolve.o ga_it_orth.o ga_orthog.o idamin.o util_jacobi.o stpr_sjacobi.o util_memcpy.o ga_accback.o ga_asymmetr.o util_gnxtval.o nxtask.o util_mirror.o util_sgroup.o icopy.o dsum.o dgefa.o




Successful build on ROCKS 5.4.3 on dell cluster (intel):
Script:
export LARGE_FILES=TRUE
export TCGRSH=/usr/bin/ssh
export NWCHEM_TOP=/share/apps/nwchem/nwchem-6.1
export NWCHEM_TARGET=LINUX64
export NWCHEM_MODULES="all python"
export PYTHONHOME=/opt/rocks
export PYTHONVERSION=2.4
export USE_MPI=y
export USE_MPIF=y
export USE_MPIF4=y
export MPI_LOC=/opt/openmpi
export MPI_INCLUDE=/opt/openmpi/include
export LIBRARY_PATH=$LIBRARY_PATH:/opt/openmpi/lib:/share/apps/openblas
export LIBMPI="-lmpi -lopen-rte -lopen-pal -ldl -lmpi_f77 -lpthread"
export BLASOPT="-L/share/apps/openblas/lib -lopenblas -lopenblas_nehalem-r0.1.1 -lopenblas_nehalemp-r0.1.1"
cd $NWCHEM_TOP/src
#make clean
make  nwchem_config
make  FC=gfortran

ldd /share/apps/nwchem/nwchem-6.1/bin/LINUX64/nwchem 
        linux-vdso.so.1 =>  (0x00007fff301fd000)
        libopenblas.so.0 => /share/apps/openblas/lib/libopenblas.so.0 (0x00002aea76d0b000)
        libmpi.so.0 => /opt/openmpi/lib/libmpi.so.0 (0x00002aea77a16000)
        libopen-rte.so.0 => /opt/openmpi/lib/libopen-rte.so.0 (0x00002aea77ded000)
        libopen-pal.so.0 => /opt/openmpi/lib/libopen-pal.so.0 (0x00002aea7806f000)
        libdl.so.2 => /lib64/libdl.so.2 (0x00000039ee200000)
        libmpi_f77.so.0 => /opt/openmpi/lib/libmpi_f77.so.0 (0x00002aea782e2000)
        libpthread.so.0 => /lib64/libpthread.so.0 (0x00000039ee600000)
        libutil.so.1 => /lib64/libutil.so.1 (0x00000039f7a00000)
        libgfortran.so.1 => /usr/lib64/libgfortran.so.1 (0x00002aea78515000)
        libm.so.6 => /lib64/libm.so.6 (0x00000039ede00000)
        libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00000039fa400000)
        libc.so.6 => /lib64/libc.so.6 (0x00000039eda00000)
        libnsl.so.1 => /lib64/libnsl.so.1 (0x00000039f0a00000)
        /lib64/ld-linux-x86-64.so.2 (0x00000039ed600000)

This build works, whether I specify the openblas libs or use the internal nwchem libs.
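
Switching between the two variants comes down to whether BLASOPT is exported before running make, since the internal-libs build script in this post simply leaves it out. A minimal sketch of that toggle follows. `select_blas` is a hypothetical helper, and it only passes -lopenblas, which is a simplification of the BLASOPT lines used in the scripts here.

```shell
# select_blas PREFIX
# Export BLASOPT when an OpenBLAS library is found under PREFIX/lib;
# otherwise unset it so the build falls back to the internal libs.
# Hypothetical helper; the simplified flag list is an assumption.
select_blas() {
    if [ -e "$1/lib/libopenblas.so" ] || [ -e "$1/lib/libopenblas.a" ]; then
        BLASOPT="-L$1/lib -lopenblas"
        export BLASOPT
    else
        unset BLASOPT
    fi
}

# Example (prefix as used on the ROCKS cluster in this post):
#   select_blas /share/apps/openblas
#   echo "BLASOPT=${BLASOPT-<unset, internal libs>}"
```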

Good build: Nwchem 6.0 with openblas on AMD Phenom II X6 -- this is my 'reference' build since it works perfectly.
Build script:
export LARGE_FILES=TRUE
export TCGRSH=/usr/bin/ssh
export NWCHEM_TOP=`pwd`
export NWCHEM_TARGET=LINUX64
export NWCHEM_MODULES="all python"
export PYTHONHOME=/usr
export PYTHONVERSION=2.7
export USE_MPI=y
export USE_MPIF=y
export MPI_LOC=/usr/lib/openmpi/lib
export MPI_INCLUDE=/usr/lib/openmpi/include
export BLASOPT="-L/opt/openblas/lib -lopenblas -lopenblas_barcelona-r0.1.1"
export LIBRARY_PATH=$LIBRARY_PATH:/usr/lib/openmpi/lib
export LIBMPI="-lmpi -lopen-rte -lopen-pal -ldl -lmpi_f77 -lpthread"
cd $NWCHEM_TOP/src
make clean
make nwchem_config
make FC=gfortran

ldd /opt/nwchem/nwchem-6.0/bin/LINUX64/nwchem
        linux-vdso.so.1 =>  (0x00007fff3cd72000)
        libopenblas.so.0 => /opt/openblas/lib/libopenblas.so.0 (0x00002b3404bb2000)
        libmpi.so.0 => /usr/lib/libmpi.so.0 (0x00002b34059bf000)
        libopen-rte.so.0 => /usr/lib/libopen-rte.so.0 (0x00002b3405c72000)
        libopen-pal.so.0 => /usr/lib/libopen-pal.so.0 (0x00002b3405ec1000)
        libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00002b3406119000)
        libmpi_f77.so.0 => /usr/lib/libmpi_f77.so.0 (0x00002b340631d000)
        libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00002b3406555000)
        libutil.so.1 => /lib/x86_64-linux-gnu/libutil.so.1 (0x00002b3406772000)
        libz.so.1 => /usr/lib/x86_64-linux-gnu/libz.so.1 (0x00002b3406975000)
        libssl.so.1.0.0 => /usr/lib/x86_64-linux-gnu/libssl.so.1.0.0 (0x00002b3406b8b000)
        libgfortran.so.3 => /usr/lib/x86_64-linux-gnu/libgfortran.so.3 (0x00002b3406dea000)
        libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00002b3407100000)
        libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00002b3407382000)
        libquadmath.so.0 => /usr/lib/x86_64-linux-gnu/libquadmath.so.0 (0x00002b3407599000)
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00002b34077ce000)
        libcrypto.so.1.0.0 => /usr/lib/x86_64-linux-gnu/libcrypto.so.1.0.0 (0x00002b3407b55000)
        libnsl.so.1 => /lib/x86_64-linux-gnu/libnsl.so.1 (0x00002b3407f3a000)
        /lib64/ld-linux-x86-64.so.2 (0x00002b3404990000)
