01 February 2013

329. ECCE, xterm and X forwarding: fixing broken "tail -f on output" in ECCE/'untrusted X11 forwarding' error


The problem
In ECCE when you highlight a running job on a remote server which you've set up with the frontendMachine option (here and here and here) which is a ROCKS 5.4.3/CentOS server and e.g. hit Alt+L or "Run Mgmt"/"Tail -f on Output file" and nothing happens, and when you set ECCE to provide verbose output (add "ECCE_RCOM_LOGMODE true" to ecce/apps/siteconfig/site_runtime) you see the following errors:

X11 connection rejected because of wrong authentication. X connection to localhost:43.0 broken (explicit kill or server shutdown).
and
OpenSSH_4.3p2, OpenSSL 0.9.8e-fips-rhel5 01 Jul 2008 Warning: untrusted X11 forwarding setup failed: xauth key data not generated Warning: No xauth data; using fake authentication data for X11 forwarding.
Obviously there are non-ECCE related situation where you may see these errors too. Doesn't matter -- same solution.


The diagnostics
cat /etc/ssh/sshd_config |grep X11
X11Forwarding yes X11DisplayOffset 10
cat /etc/ssh/ssh_config |grep X11|grep -v ^#
ForwardX11 yes
sudo cat /etc/ssh/sshd_config |grep X11|grep -v ^#
X11Forwarding yes X11DisplayOffset 10

So, why localhost:43? And why isn't it working? From my workstation to the cluster which is connected to the net via the front node, and then from the cluster front to the cluster front's local name.

ssh -X server.external.dns
echo $DISPLAY
localhost:42.0
ssh -X server.local.dns
Warning: untrusted X11 forwarding setup failed: xauth key data not generated Warning: No xauth data; using fake authentication data for X11 forwarding.
echo $DISPLAY
localhost:44.0
yet
ssh -Y server.local.dns

works fine.

The solution:
Simpler than I thought:
I edited ~/.ssh/config on the server, and did
Host server.local.dns Hostname server.local.dns User me ForwardX11 yes ForwardX11Trusted yes

And now it works!

Presumably I could've just edited /etc/ssh_config instead, but it's a multi-user cluster and I'm happier to change things on a user-by-user basis.

28 January 2013

328. Liberate your router: dd-wrt on Netgear WGT624 v4

UPDATE 1 Feb 2013: I haven't had any explicit problems with my router since flashing it. Everything is apparently working well and my network connection is reliable and fine (if only subjectively a bit slower than before --  running a speed test shows that it's as fast as ever so not sure what's happening). HOWEVER, I've suddenly started having issues with ECCE and submitting jobs via a frontendMachine -- I kept getting "cannot 'cd' to run directory" errors, but the ECCE log contains no errors messages at all. This wouldn't happen for very small NWChem input files, and it would happen ca 80% of the time. Normally I wouldn't suspect this was a router issue, but changing back to my (unflashed) AR430W resolved the issues immediately. Somehow I suspect this is a router version of this http://verahill.blogspot.com.au/2012/09/briefly-packet-corrupt-during-ssh.html, but then I should see error messages in the ECCE log...

Other than that I'm really happy with dd-wrt (no sarcasm intended -- I've had no other issues and I love the power dd-wrt gives me over my hardware).

Finally, there's the old adage about correlation vs causuality. We'll see if the errors start popping up again while using my AR430W.

Original post:

I've been using Tomato with my WRT54G for a couple of years now, and I'm incredibly happy with it. Since I have a couple of old routers (airlink 1010 ar430w and netgear wgt624 v4) with stock firmware lying around I figured it was time to turn them into something useful. So here's how to flash the netgear router. If it stands up to sustained use I'll be writing an AR430W guide later.


dd-wrt


Lengthy preamble
The stock firmware basically does nothing for me -- it's clunky, slow, and there's no terminal access. In particular, I want busybox/ssh, Tomato does all that for me, but it doesn't support a particularly wide range of routers (I reckon that Tomato is the reason why Linksys WRT54GL still costs $90 in Australia, in spite of being old as sin -- those who doubt the value of opening up their hardware may want to consider the RoI on that one)

In addition to Tomato, there's also DD-WRT (supported devices) and OpenWRT (supported devices). DD-WRT support a huge number of routers, but it appears to be a whole lot more complicated to install than Tomato. Maybe this varies according to the router as well.

For instructions you're referred via the database to the dd-wrt forum thread about your router. The problem with this is that you'll be facing 30-odd pages with instructions, problems, dead-ends etc. Some threads end with a step-by-step summary on how to install dd-wrt, but not all do.

Anyway, here's my best attempt at writing a simple and complete step-by-step guide to replacing the stock firmware on Netgear WGT24 v4 with DD-WRT on Debian Testing/Wheezy. I'm basically just following this blog post: http://lauriaus.no-ip.org/blog/?p=90 , but hopefully I've added enough detail to make it possible for just about anyone to follow this guide.

Please consult http://www.dd-wrt.com/site/support/router-database to see what files you need. NOTE: the files below only apply to v4 of Netgear WGT624. Installing them on any other router may brick it.



On your linux computer:

Get the files:
sudo apt-get install atftpd tftp putty
cd /tmp
mkdir ftpdboot
cd ftpdboot/
wget http://www.dd-wrt.com/dd-wrtv2/downloads/others/redboot_collection/images_default/redboot_ap61_16M_4M_admtek.rom
wget http://www.dd-wrt.com/routerdb/de/download/Netgear/WGT624/v4/linux.bin/3614 -O linux.bin
wget http://www.dd-wrt.com/routerdb/de/download/Netgear/WGT624/v4/wgt624v4-firmware.bin/3613 -O wgt624v4-firmware.bin

Edit /etc/default/atftpd:
USE_INETD=false #true OPTIONS="--tftpd-timeout 300 --retry-timeout 5 --mcast-port 1758 --mcast-addr 239.239.239.0-255 --mcast-ttl 1 --maxthread 100 --verbose=5 /tmp/ftpdboot"

Edit /etc/inetd.conf
32 tftp dgram udp4 wait nobody /usr/sbin/tcpd /usr/sbin/in.tftpd --tftpd-timeout 300 --retry-timeout 5 --mcast-port 1758 --mcast-addr 239.239.239.0-255 --mcast-ttl 1 --maxthread 100 --verbose=5 /tmp/ftpdboot
and do
sudo /etc/init.d/openbsd-inetd reload

for good luck. If you don't have openbsd-inetd you may have xinetd or inetutils-inetd installed instead (I think openbsd-inetd is default on debian). Edit the command as necessary.

Edit your /etc/network/interfaces file:
auto eth0 iface eth0 inet static address 192.168.1.155 gateway 192.168.1.1 netmask 255.255.255.0

and run
sudo service networking restart

Make sure that your card came up ok (do e.g. ip addr)
2: eth0: broadcast mtu 1500 qdisc pfifo_fast state UP qlen 1000 link/ether 00:26:9e:27:9b:20 brd ff:ff:ff:ff:ff:ff inet 192.168.1.155/24 brd 192.168.1.255 scope global eth0
Continue.

Prepare two terminals, side by side (or start a screen session with two tabs open). In one, type
echo "^C"> end.txt
putty telnet 192.168.1.1:9000 -m end.txt

But don't hit enter after the second command.

In the other terminal, type
ping 192.168.1.1

but don't hit enter.

Connect the ethernet port on your computer to one of the ethernet LAN ports (not WAN/Internet) on your router.

You are next going to unplug the power from the router, and hit enter after the ping command. Immediately when you get ping replies:
64 bytes from 192.168.1.1: icmp_req=4 ttl=64 time=0.371 ms
you hit enter after the putty command in the other window. If nothing good happens, then redo (i.e. unplug the router, hit enter after the ping command etc. Don't start the ping until you're re-plugged the router).

Ready? GO!
ping 192.168.1.1
64 bytes from 192.168.1.1: icmp_req=1 ttl=64 time=0.371 ms
putty telnet 192.168.1.1:9000 -m end.txt

And you should get

Before you continue make sure that you've opened up your firewall e.g. if you're not connected to the internet you can go crazy like this:
sudo iptables -P INPUT ACCEPT
sudo iptables -P OUTPUT ACCEPT
sudo iptables -P FORWARD ACCEPT

And don't forget to restore your firewall once you're done.
Time to get dangerous.

RedBoot> fis init
About to initialize [format] FLASH image system - continue (y/n)? Y *** Initialize FLASH Image System ... Erase from 0xbffe0000-0xbfff0000: . ... Program from 0x80ff0000-0x81000000 at 0xbffe0000: . RedBoot> ip_address -h 192.168.1.155 IP: 192.168.1.1/255.255.255.0, Gateway: 192.168.1.254 Default server: 192.168.1.155
RedBoot> load -r -b %{FREEMEMLO} redboot_ap61_16M_4M_admtek.rom
Using default protocol (TFTP) TFTP timed out 1/15 Can't load 'redboot_ap61_16M_4M_admtek.rom': operation timed out
Try again:
RedBoot> load -r -b %{FREEMEMLO} redboot_ap61_16M_4M_admtek.rom
Using default protocol (TFTP) Raw file loaded 0x80040c00-0x8005007f, assumed entry at 0x80040c00
RedBoot> fis create -l 0x30000 -e 0xbfc00000 RedBoot fis create -l 0x30000 -e 0xbfc00000 RedBoot An image named 'RedBoot' exists - continue (y/n)? y ... Erase from 0xbfc00000-0xbfc30000: ... ... Program from 0x80040c00-0x80050080 at 0xbfc00000: . ... Erase from 0xbffe0000-0xbfff0000: . ... Program from 0x80ff0000-0x81000000 at 0xbffe0000: .
RedBoot> reset

You'll see a couple of flashing lights on the router as the only indication that something just happened. Kill your current putty connection and start a new one.
=~=~=~=~=~=~=~=~=~=~=~= PuTTY log 2013.01.28 20:29:38 =~=~=~=~=~=~=~ ^C
RedBoot> fis init
About to initialize [format] FLASH image system - continue (y/n)? y *** Initialize FLASH Image System ... Erase from 0xbffe0000-0xbfff0000: . ... Program from 0x80ff0000-0x81000000 at 0xbffe0000: .
RedBoot> ip_address -h 192.168.1.155
IP: 192.168.1.1, Default server: 192.168.1.155
RedBoot> load -r -b 0x80041000 linux.bin
Using default protocol (TFTP) Raw file loaded 0x80041000-0x803ecfff, assumed entry at 0x80041000
RedBoot> fis create linux

Be patient -- this step takes a long time: 19 minutes in my case (some routers take an hour). Write down the time when it starts and WAIT at least 20 minutes.
... Erase from 0xbfc10000-0xbffbc000: ........................................................... ... Program from 0x80041000-0x803ed000 at 0xbfc10000: ........................................................... ... Erase from 0xbffe0000-0xbfff0000: . ... Program from 0x80ff0000-0x81000000 at 0xbffe0000: .
RedBoot> fconfig
Run script at boot: true Boot script: .. fis load -l kernel .. go Enter script, terminate with empty line
>> fis load -l linux >> exec >> Boot script timeout (1000ms resolution): 12 Use BOOTP for network configuration: false bootp_my_gateway_ip: 192.168.1.254 Local IP address: 192.168.1.1 bootp_my_ip_mask: 255.255.255.0 Default server IP address: 192.168.1.55 Console baud rate: 9600 GDB connection port: 9000 Force console for special debug messages: false net_debug: false Update RedBoot non-volatile configuration - continue (y/n)? y ... Erase from 0xbffe0000-0xbfff0000: . ... Program from 0x80ff0000-0x81000000 at 0xbffe0000: . RedBoot> reset

Done!

You can now navigate to 192.168.1.1 in your router, but unplug, replug the router for good luck.
Success!


I created a user called admin and set a password i.e. there's no pw or username you need to know a priori.
click on services


check sshd

Don't trust important infrastructure with passwords. Use keys.




And finally
ssh root@192.168.1.1

and hopefully you're in.

First impressions:
The busybox ('linux') version is a bit too sparse for my liking -- no netstat command...but it's still obviously a major step up from the stock firmware. dd-wrt is different from tomato -- if you're used to one you're not necessarily going to feel comfortable with the other. Luckily, dd-wrt is widely used and there are plenty of resource online. In addition, there's a demo ( http://www.dd-wrt.com/demo/ ) so you can try it out before installing it.

How to set up 'static' dhcp (i.e. make sure that some computers always have the same IP address while still running a dhcp server) wasn't completely obvious either, but this post helped: http://www.dd-wrt.com/wiki/index.php/Static_DHCP


25 January 2013

327. Installing Octave, Gnuplot, maxima etc on OSX 10.8.2 via macports

Update 7 Feb 2013: Got my hand on a Mac again and sorted out the octave forge packages.

NOTE: while on the surface this has little to do with linux, it IS in our interest to get people on other platforms to use the same tools we do. Or at least compatible ones. So knowing about macports will help even the most die-hard linux user. In fact, it will help especially those.


I'm no friend of Apple, Mac or OS X for many reasons (restrictive environment, weird mouse, their typical target audience), but pragmatism will make you happier than zealotry. What doesn't work for me obviously works for other people.

Students (and professors) at Australian universities do use apple laptops in significant number though (yet our ERP system only works properly with Internet Explorer -- try requesting leave using chrome on linux and you may be in for a surprise) so I've recently been faced with the issue of installing my standard linux tools on students' apple laptops. In fact it's gotten so bad that several Australian universities 'give' ipads to all their students for 'free' (nothing is free when you're paying for it through your tuition). As someone working at a uni I resent this since, by using the one platform where you can't install what you want on it, this puts pressure on me to restrict my teaching to what can be had via the almighty app store. (there's a lot of BS about blended learning, and you'd be lucky to find a white board anywhere. Blackboards are completely gone which is idiocy of the first degree)

Anyway. There's plenty of people here using Macbook-thingies and it's in my own interest to get them on the narrow path to justice, liberty and the FOSS way.

Macports is a really cool package manager/repository for linux software for Mac OSX and works by compiling the software from the sources -- pretty much how I imagine that the Gentoo experience may be like. Anyway, it works fine although it takes a fair amount of time to install things.


So here's how to do it (no screenshots because I'm typing this from memory):
  1. Open the App Store and install XCode (free)
  2. Open Xcode, go to Preferences/Download, and install Command Line Tools
  3. Download macports via this link for OS X 10.8: https://distfiles.macports.org/MacPorts/MacPorts-2.1.2-10.8-MountainLion.pkg
    Other versions of OS X are also supported, see here: http://www.macports.org/install.php
  4. Install the downloaded macports by opening the .pkg file
  5. Open a Terminal window  (Applications/Utilities/Terminal) and run
  6. sudo port -v selfupdate
  7. Run
  8. sudo port install gnuplot maxima vim nano xterm
    which will take a while -- it needs to set up the build environment from scratch in addition to regular dependencies. Set aside an hour or so just in case. If there's an error, try running the command once more.

  9. Run
  10. sudo port install octave-devel qtoctave-mac
    which will take a long while. If the compile seems to have stopped, checked the titlebar of the terminal window -- the command it's executing will continously change during the compile)

  11. Run octave by running the command
    octave
    in the terminal.
  12. To install octave packages you can install them in the Octave environment :
    pkg install -forge miscellaneous struct general optim
  13. addpath
  14. In case you're having problems actually using the octave-forge packages, you might need to create/edit your ~/.octaverc along the lines of
    addpath('/Users/verahill/octave/optim-1.2.2')
    Replace verahill with the proper user name, and edit the version number as needed.

  15. X11/Xquartz - setting DISPLAY
  16. At this point
    echo $DISPLAY
    gave nothing, and trying to launch an X11 program (e.g. xterm) complained that DISPLAY was not set. Setting DISPLAY manually (export DISPLAY=:0.0) didn't help either.

    'Bad' solution: the first solution was to install xorg-server (sudo port install xorg-server), and then manually
    X &
    export DISPLAY=:0.0
    
    or
    Xquartz &
    export DISPLAY=:0.0
    
    Both X and Xquartz come from macports.

    Good solution
    I then tried to change tack and went to http://xquartz.macosforge.org/landing/ and downloaded XQuartz-2.7.4.dmg (I was originally under the impression that it came as default with "mountain lion" but no.).
    Open the file, then run the .pkg file in that archive. Log out of OSX, then log in again. Now try launching e.g. xterm form a terminal and it should work.


The entire process will take well over an hour, but at the end of it you'll have Octave, Gnuplot AND a complete build environment!

And there are plenty more things you can install with macports (e.g. qtoctave (qtoctave-mac), gedit (gedit gedit-plugins), kile, maxima, qtiplot).

Not sure why I am so excited over it since all these things are available in most linux repos, but there you go -- compiling stuff is ALWAYS exciting.