09 June 2015

612. Randomly Rebooting Router (E2500-AU v1.0 w/ TomatoUSB)

Rolling update:
* 24 June 2015: 7 days uptime with wifi working perfectly. Did reboot it last night because my work computer lost contact with the router somehow (connects via reverse tunnel). The issue with the Randomly Rebooting Router can be considered solved. Obviously, it's solved by crippling the router by turning off the 5 GHz band and tkip (the latter may not be related though).

* Submitted a bug report: http://tomato.groov.pl/?page_id=334&bugerator_nav=display&bug_project=1&issue=1833

* 16 June 2015 12:42 AEST.  The router has been up for two days and four hours (and counting) in spite of heavy use of our phones. Seems like turning off 5 GHz and/or switching from AES to TKIP has worked. A fair criticism is that I don't have much of a baseline to compare with when it comes to reboots, but subjectively there's a lot less swearing over crappy wifi the past two days.

* 14 June 2015 08:05 AEST. After two days of uptime when radio-silence was enforced, we turned our phones back on. The router rebooted later that night. Same thing happened the next night. After briefly putting dd-wrt on the router, I put tomatousb back on it, turned off 5 GHz and changed from AES to TKIP. The router has been up since 9.30 pm last night (10 h and counting)

Found this bug report: http://tomato.groov.pl/?page_id=334&bugerator_nav=display&bug_project=1&issue=1813

Also read this with interest: http://movingpackets.net/2013/11/18/linksys-e2500-deserves-no-airplay/

I've seen posts that find that dd-wrt doesn't have the randomly rebooting issue. dd-wrt doesn't support dual band, at least on e2500. I was surprised that v1 of the cisco linksys firmware had the same exact issue (random reboots when 5 GHz is on). It's all pointing in a specific direction.

Not sure why using the 5 GHz channel with my laptop doesn't trigger the reboots, but maybe they did -- but happened less frequently due to the lower number of 5 GHz capable devices prior to us getting the phones.

* 10 June 2015 16:24 AEST. Since turning off wifi on the Samsung Galaxy S4 phones (but using the two laptops and the tablet listed below) the router has stayed up for 24 hours 8 hours and 11 minutes, and counting. The night between Monday and Tuesday, when we were using our phones, the router rebooted at least twice.


This is another one of those posts that don't offer a solution, but rather states a problem. I'm doing this in the hope that others who are making similar observations as I am will see this and...well, feel slightly less alone at the very least. In the best case, someone will have a solution and offer it as a comment.

So, here's the issue: 
* I have a Linksys E2500-AU v1.0 that is running TomatoUSB (howto)
Tomato v1.28.0000 MIPSR2-128 K26 USB Max ======================================================== Welcome to the Linksys E2500 v1.0 [TomatoUSB] Uptime: 08:14:26 up 1 min Load average: 0.52, 0.18, 0.06 Mem usage: 28.4% (used 17.06 of 59.96 MB) WAN : 192.168.1.100/24 @ 58:6D:8F:D3:XX:XX LAN : 192.168.2.1/24 @ DHCP: 192.168.2.2 - 192.168.2.202 WL0 : volatile @ channel: AU13 @ 58:6D:8F:D3:XX:XX WL1 : volatile50 @ channel: AU153 @ 00:01:36:1F:XX:XX ========================================================
* It has a "Broadcom BCM5357 chip rev 2 pkg 8"

* For a long time it, and its predecessor (a WRT-54GL), were running just fine. The predecessor got replaced due to a fried power supply.

* Over the past six-seven months there have been issues with the wireless signal dropping. It isn't just the wireless transmission being stopped and restarted, but the router actually reboots (according to uptime).

* We used to have the following wireless devices: Fujitsu lifebook (v100?), Thinkpad SL410, Google Nexus One and a HTC Legend. At some point we also got a Samsung Galaxy Tab 2. This configuration was running for a few years.

* Coinciding roughly with the perceived start of the rebooting issue was me purchasing a Samsung Galaxy S4 (i9505).

* The issue got a lot worse recently.

* Recently my partner also got a Samsung Galaxy S4 (i9505).

* I have an almost identical router (v2) at work, and the current uptime is 140 days. I do connect very occasionally via wireless to it using my Samsung Galaxy S4. However, this router has a "Broadcom BCM5357 chip rev 1 pkg 8".

What seems to be happening:
The Samsung Galaxy S4 phones seem to be destabilising the router and causing reboots. No, wait, hear me out. It shouldn't happen, and the adage about 'correlation vs causation' may well be true in this case too, but there are precendents (apparently) when it comes to Intel wireless devices:

On http://www.linksysinfo.org/index.php?threads/tomatousb-keeps-resetting-the-router.33208/ from 2010
Hrm...routers used to spontaneously reboot when the wireless driver failed on Tomato as a result of an Intel (mobile) wireless driver bug on Windows. Maybe similar?
On http://www.linksysinfo.org/index.php?threads/random-reboots.21020/ from 2007
Currently using DD-wrt V24, it's been up for 25 days I can confirm it has something to do with the broadcom wireless drivers.

and
What kind of wireless device does your laptop have? Is it Intel 2100/2200 by any chance?
And in the end the thread concludes that it was due to users with Intel 2100/2200 cards.

The Samsung Galaxy S4 has a Qualcomm Snapdragon 600 APQ8064AB, with is a system-on-a-chip. The Nexus One and HTC Legend also had snapdragons, but obviously much older models. The Galaxy Tab 2 seems to have a Texas Instrument chip (TI OMAP 4430).

Could it be that the phones are causing the issues?

Luckily it's something that's reasonably easy to test, so I'm looking forward to reporting back in a couple of days (of enforced radio silence).

A different test will be to swap routers (but not power supplies) between work and home and see if the behaviour is location dependent. That will take a lot more effort though due to the very specific set-ups.

As the logs get erased on reboot I'm tracking the uptime from now on using autossh and logging from a work computer that's always on.


Some more:
Below is a post regarding iphones, and rebooting routers.

While that post is not related to 5GHz causing issues, it's got me thinking that as neither the Fujitsu, Galaxy Tab, Nexus One or HTC legend support 5 GHz but the Samsung Galaxy S4 phones do, the issue may be possibly related to that. There is obviously quite a lot of things to test.
ADDITIONAL INFORMATION: in the mean time we have been checking and elimination as well. We have been trying to connect certain wireless devices to the network through the DAP and something odd has come up. We have been trying mobile phones (smartphones) at first. My own telephone (Samsung Galaxy S3) seems to cause no troubles. With that phone connected for a whole day, internet connection does not fail once (the router does not reboot). I have been trying both 5Ghz and 2.4 Ghz. bands, both worked okay. One colleague also wanted to connect his Apple Iphone 3G (S?) to the network. I told him he could but this phone could not find the 5Ghz network, so I have switched back to 2.4Ghz again. The iPhone connected and within 5 minutes the connection interrupted. I have set the network back to 5Ghz (so the iPhone could no longer connect) and changed the network settings again. This morning I switched back to 2.4 with only my phone connected. Not a problem. This afternoon I let my colleague connect his phone again and disconnected my Samsung. Within 5 minutes the router started to reboot! After I had the iPhone disconnected again and let another colleague connect his phone, a Samsung Galaxy S(1). So far no problems.

Tomato Anon
Somehow the Spontaneously Rebooting Router doesn't show up here, while the stable one does: http://anon.groov.pl/index.php?country=Australia

Either way, the anon database is a great way of quickly finding out what Tomato version you can put on your router.

05 June 2015

611. Building ecce on debian jessie

NOTE:
* I've confirmed that ECCE built like this installs and works perfectly on a Thinkpad SL410 with intel graphics

* It also compiles and runs perfectly on a home built desktop with external nvidia card (GF119) using the binary non-free nvidia drivers

* It also compiles and runs perfectly on a home built desktop with onboard nvidia (GeForce 7025/nForce 630a) using the nouveau drivers. I had issues on this desktop before, but reinstalled debian jessie from scratch. Before that I used nvidia-legacy drivers, which may or may not  (probably not) have had something to do with it not working.

* UPDATE: instead of putting
#include <freetype.h>
the recommended method is to use
#include FT_FREETYPE_H
I've updated the patch_script.sh below accordingly. Same goes for ftoutln.h vs FT_OUTLINE_H, but the latter didn't work (error saying #include must use "" or <>)

* UPDATE: If you're having issues with undetected -lGL and -lGLU in the wxwidgets step, it's because
667       endif
668       cat configure.orig | sed -e 's%^SEARCH_INCLUDE="\\%SEARCH_INCLUDE="$ECCE_HOME/${ECCE_SYSDIR}3rdparty/mesa/include \\%' >! configure
669       chmod a+x configure
670 

needs to be changed to
667       endif
668       cat configure.orig | sed -e 's%^SEARCH_INCLUDE="\\%SEARCH_INCLUDE="$ECCE_HOME/${ECCE_SYSDIR}3rdparty/mesa/include $ECCE_HOME/${ECCE_SYSDIR}3rdparty/mesa/lib \\%' >! configure
669       chmod a+x configure
670 
I had this issue on a debian wheezy system with the vendor nvidia libraries.
I wouldn't have spotted this bug otherwise.

* Another error from my debian wheezy nvidia system: if you get
checking how to run the C preprocessor... x86_64-linux-gnu-gcc -E ./configure: line 2880: syntax error near unexpected token `Using' ./configure: line 2880: `  AC_MSG_NOTICE(Using external PCRE library from $PCRE_CONFIG)'
then make sure you're not using autoconf2.13, which is an obsolete version. I think I have it due to my system originally being installed back in 2010.

* Finally, I'm currently  working on fixing minor things that have been nagging me in ecce. One is the basis set quicklist (in src/dsm/edsiimpl/ICalcUtils.C), but obviously that's a personal preference. A more serious one is the 256 character limit for csh commands:
Exceeds maximum C shell command length of 256 characters
Note that this isn't the C shell complaining -- this is a built-in limit in ecce (in src/comm/rcommand/RCommand.C). I've changed that limit to 16384 characters (the real limit is much, much higher)
I've also added two basis sets to ECCE, and have tinkered with the exchange-correlation functionals.

I'll try to push the fixes back upstreams when I'm ready if they'll accept them; otherwise I'll create my own github/sourceforge repo.

THE POST:
Building ECCE on debian wheezy was a breeze. Building ECCE on debian jessie was painful.

In the end it boiled down to two things:
*freetype headers are no longer in freetype2/freetype/ but in freetype2/

*-Wformat-security is turned on by default

mkdir ~/tmp/ecce_compile -p
cd ~/tmp/ecce_compile
sudo apt-get install bzip2 build-essential autoconf libtool ant pkg-config
sudo apt-get install gtk+-2.0-dev libxt-dev csh gfortran openjdk-7-jdk python-dev
sudo apt-get install libjpeg-dev imagemagick xterm libfreetype6-dev libfl-dev libtool-bin

As usual I'm not 100% sure when it comes to the necessary packages. libfl-dev might not be needed.

Download the ECCE source code and put the ecce-v7.0-src.tar.bz2 file in ~/tmp/ecce_compile. Put the patch_script.sh file (see below in this post for the code) in ~/tmp/ecce_compile. Then do
 
tar xvf ecce-v7.0-src.tar.bz2 
cd ecce-v7.0/
export ECCE_HOME=`pwd`
cd build/
./build_ecce

You'll now step through a list over programs and libraries that are needed and what ECCE can find. If you're having issues with e.g. javac and java being different versions, use sudo update-alternative --config javac.

At the end you'll be asked whether to skip these checks next time -- answer y(es).

Next do
./build_ecce|tee xerxes.log && ./build_ecce |tee mesa.log && ./build_ecce |tee wxwidgets.log
sh ../../patch_script.sh && ./build_ecce|tee wxpython.log 
./build_ecce|tee httpd.log && ./build_ecce|tee ecce.log

If all went well you've managed to build the ecce binaries. If not, go through wxpython.log and check for errors relating to format-security statements. Then go through ecce.log and look for issues with freetype.

What to do with the binaries? Follow one of the many ECCE installation posts on this blog, e.g. http://verahill.blogspot.com.au/2013/08/487-version-70-of-ecce-out-now.html

NOTE that if you get ''Invalid null command." trying to execute install_ecce.v7.0.csh, install tcsh and do
tcsh install_ecce.v7.0.csh

The patch_script.sh file -- when copying, make sure to check that the lines end where they are supposed to and not broken up.
#!/bin/bash cp ${ECCE_HOME}/build/3rdparty-dists/wxPython-src-2.8.12.1.tar.bz2 ${ECCE_HOME}/3rdparty/build/ cd ${ECCE_HOME}/3rdparty/build/ tar xvf wxPython-src-2.8.12.1.tar.bz2 rm wxPython-src-2.8.12.1.tar.bz2 cd ${ECCE_HOME}/3rdparty/build/wxPython-src-2.8.12.1/wxPython grep -rsl "PyErr_Format(PyExc_RuntimeError, mesg)" *|xargs -I {} sed -i 's/PyErr_Format(PyExc_RuntimeError, mesg)/PyErr_Format(PyExc_RuntimeError, "%s", mesg)/g' {} cd ${ECCE_HOME}/3rdparty/build/wxPython-src-2.8.12.1/wxPython/src/gtk/ sed -i 's/wxLogFatalError(m)/wxLogFatalError("%s", m.c_str())/g' _misc_wrap.cpp sed -i 's/wxLogError(m)/wxLogError("%s", m.c_str())/g' _misc_wrap.cpp sed -i 's/wxLogWarning(m)/wxLogWarning("%s", m.c_str())/g' _misc_wrap.cpp sed -i 's/wxLogMessage(m)/wxLogMessage("%s", m.c_str())/g' _misc_wrap.cpp sed -i 's/wxLogInfo(m)/wxLogInfo("%s", m.c_str())/g' _misc_wrap.cpp sed -i 's/wxLogDebug(m)/wxLogDebug("%s", m.c_str())/g' _misc_wrap.cpp sed -i 's/wxLogVerbose(m)/wxLogVerbose("%s", m.c_str())/g' _misc_wrap.cpp sed -i 's/wxLogStatus(pFrame, m)/wxLogStatus(pFrame, "%s", m.c_str())/g' _misc_wrap.cpp sed -i 's/wxLogStatus(m)/wxLogStatus("%s", m.c_str())/g' _misc_wrap.cpp sed -i 's/wxLogSysError(m)/wxLogSysError("%s", m.c_str())/g' _misc_wrap.cpp sed -i 's/wxLogGeneric(level, m)/wxLogGeneric(level, "%s", m.c_str())/g' _misc_wrap.cpp sed -i 's/wxLogTrace(mask, m)/wxLogTrace(mask, "%s", m.c_str())/g' _misc_wrap.cpp cd ${ECCE_HOME}/src grep -srl "<freetype/freetype.h>" |xargs -I {} sed -i 's,<freetype/freetype.h>,FT_FREETYPE_H,g' {} grep -srl "freetype/" |xargs -I {} sed -i 's,freetype/,,g' {} cd ${ECCE_HOME}/build


What I found during troubleshooting:

In files such as:
./3rdparty/build/wxPython-src-2.8.12.1/wxPython/src/gtk/_core_wrap.cpp
./3rdparty/build/wxPython-src-2.8.12.1/wxPython/src/gtk/_gdi_wrap.cpp
./3rdparty/build/wxPython-src-2.8.12.1/wxPython/src/gtk/_windows_wrap.cpp
./3rdparty/build/wxPython-src-2.8.12.1/wxPython/src/gtk/_controls_wrap.cpp
there are sections which look like this:
863 } else { 864 PyErr_Format(PyExc_RuntimeError, mesg); 865 }
Compiling with -Wformat-security means that you'll have to patch all those expression to
863 } else { 864 PyErr_Format(PyExc_RuntimeError, "%s", mesg); 865 }
There were similar issue with wxLog*(m) statements in other files, e.g.
3rdparty/build/wxPython-src-2.8.12.1/wxPython/src/gtk/_misc_wrap.cpp -> ("%s", m.c_str()) 3093 m.Replace(wxT("%"), wxT("%%")); 3094 wxLogFatalError(m); 3095 } .. 3177 m.Replace(wxT("%"), wxT("%%")); 3178 wxLogTrace(mask, m); 3179 }

04 June 2015

610. Opening G09 output files in ECCE. A rough method

Update: Note that one thing that's not recognised at the moment are the MOs. Somehow I don't think that should be too difficult. Likewise, I think one should be able to import nwchem files by a little bit of editing like below.

Original post:
When opening a gaussian output file with the ECCE viewer:
* a symlink to the file is put in a subdirectory of /tmp
* the file is parsed for indications as to what the file type is:
+go+cd /tmp/ecce_me/jobs/Gaussian03__HEa24c +go+ln -s /home/me/calcs/test/Outputs/g03.g03out g03.g03out; echo CMDSTAT=$status CMDSTAT=0 +go+grep "Gaussian 98, Revision" g03.g03out; echo CMDSTAT=$status CMDSTAT=1 +go+grep "Gaussian 94, Revision" g03.g03out; echo CMDSTAT=$status CMDSTAT=1 +go+grep "Gaussian 03, Revision" g03.g03out; echo CMDSTAT=$status CMDSTAT=1 +go+grep "%begin%input" g03.g03out; echo CMDSTAT=$status CMDSTAT=1 +go+grep "%begin%input" g03.g03out; echo CMDSTAT=$status CMDSTAT=1 +go+grep "Northwest Computational Chemistry Package" /home/me/test/Outputs/g03.g03out; echo CMDSTAT=$status
That's easy enough to fool by simply putting a Gaussian 03 line in the output (assuming that the G09 and G03 output are similar enough).

Here's a successful example, in the sense that ECCE found that it was a Gaussian 03 file:
CMDSTAT=0 +go+grep "Gaussian 98, Revision" g03.g03out; echo CMDSTAT=$status CMDSTAT=1 +go+grep "Gaussian 94, Revision" g03.g03out; echo CMDSTAT=$status CMDSTAT=1 +go+grep "Gaussian 03, Revision" g03.g03out; echo CMDSTAT=$status Gaussian 03, Revision CMDSTAT=0 +go+if (-w /tmp/ecce_me/jobs/Gaussian03__7491Sb) echo TRUE TRUE +go+echo $PATH; echo CMDSTAT=$status Word too long. +go+if (-x Gaussian-03.expt) echo TRUE +go+exit; echo GOODBYE exit; echo GOODBYE

However, it fails to detect that Gaussian-03.expt is present and executable (both of which are true).

[SOLUTION]

To sort that out and to enable G09 detection, edit apps/data/client/cap/Gaussian-03.edml:
465 <output mimetype="chemical/x-gaussian03-output" type="parse" verifypattern="Gaussian 09, Revision">g03.g03out</output> 466 <output mimetype="chemical/x-gaussian-03-output" type="parse" verifypattern="Gaussian 09, Revision">g03.out</output> 467 <output mimetype="chemical/x-gaussian03-output" type="parse" verifypattern="Gaussian 03, Revision">g03.g03out</output> 468 <output mimetype="chemical/x-gaussian-03-output" type="parse" verifypattern="Gaussian 03, Revision">g03.out</output> .. 480 <importer>${ECCE_HOME}/scripts/parsers/Gaussian-03.expt </importer>
Your G09 files should now open properly (most of the time).

My ecce_env is fine and my runtime_setup.sh file is called by bash, but somehow it wouldn't find the Gaussian-03.expt file. Maybe it has something to do with the use of csh -f

NOTE that the file isn't imported -- it's just opened. It would've been nice if you could actually import the calculation into ECCE. Still, being able to view it is a nice start. 

609. NBO6 on a debian cluster (/w g09)

Curse blogspot and the lack of revision control and backups! I lost my post when it was almost finished.

So here's a briefer version. I have bought NBO6 and I want to integrate it with gaussian G09 rev. D (you can't use it directly with earlier binary versions)

My 'instructions' are basically copy/pasted from the NBO6 installation instructions -- this is a tl;dr version.
sudo cp nbo6.0-bin-linux-x86_64.tar.gz /opt/ cd /opt/ sudo tar xvf nbo6.0-bin-linux-x86_64.tar.gz sudo chown $USER:$USER nbo6 -R vim nbo6/bin/gaunbo6
Edit:
3 set INT = i8 4 set BINDIR = /opt/nbo6/bin
In your queue file add
set path = ( /opt/nbo6/bin $path )
In my case, as I use ECCE I edited my apps/siteconfig/CONFIG.node files:
Gaussian-03Command{ set path = ( /opt/nbo6/bin $path ) setenv GAUSS_SCRDIR /home/me/scratch setenv GAUSS_EXEDIR /opt/gaussian/g09d/g09/bsd:/opt/gaussian/g09d/g09/local:/opt/gaussian/g09d/g09/extras:/opt/gaussian/g09d/g09 /opt/gaussian/g09d/g09/g09< $infile > $outfile echo 0 }
I then tested it by running a basic gaussian calculation:
%Chk=H2O_631g.chk #P rOPBE/6-31G 6D 10F SCRF=(PCM,Solvent=water) Punch=(MO) pop=(nbo6) H2O 6-31G 0 1 ! charge and multiplicity O 0.00000 0.00000 0.118491 H 0.00000 0.754898 -0.473964 H 0.00000 -0.754898 -0.473964
and got
... 411 fchk file "/home/me/scratch/Gau-24659.EFC" 412 mat. el file "/home/me/scratch/Gau-24659.EUF" 413 414 Writing Wrt12E file "/home/me/scratch/Gau-24659.EUF" 415 Gaussian matrix elements Version 1 NLab= 7 Len12L=8 Len4L=8 416 Write GAUSSIAN SCALARS from file 501 offset 0 to matrix element file. .. 429 Write ALPHA FOCK MATRIX from file 10536 offset 0 to matrix element file. 430 No 2e integrals to process. 431 Perform NBO analysis... 432 433 *********************************** NBO 6.0 *********************************** 434 N A T U R A L A T O M I C O R B I T A L A N D 435 N A T U R A L B O N D O R B I T A L A N A L Y S I S 436 ********************* Me ********************* .. 447 Filename set to /home/me/scratch/Gau-24659 .. 620 ------------------------------- 621 Total Lewis 9.99612 ( 99.9612%) 622 Valence non-Lewis 0.00029 ( 0.0029%) 623 Rydberg non-Lewis 0.00358 ( 0.0358%) 624 ------------------------------- 625 Total unit 1 10.00000 (100.0000%) 626 Charge unit 1 0.00000 627 628 $CHOOSE 629 LONE 1 2 END 630 BOND S 1 2 S 1 3 END 631 $END 632 633 Maximum scratch memory used by NBO was 62605 words 634 Maximum scratch memory used by G09NBO was 9032 words ...