28 October 2012

267. ECCE client connecting to remote site via reverse port/local port forwarding

The situation I'm about to describe is quite specific, yet I don't think it's that unusual.

A. I've got a computer at work which is behind a firewall so that I can't connect directly to it from the outside. This will be referred to as Work.

B. I've got a laptop at home which is connected to a wireless router. This will be referred to as Home.

C. The router is a Linksys/tomato router, which is accessible from the outside (myrouter.com). This will be referred to as Router.

I'd like to connect from home to my ecce server at work so that I can monitor and submit jobs.

At Work:
ssh -R 19997:localhost:8096 root@myrouter.com

At Home:
ssh -L 5555:localhost:19997 root@myrouter.com

We're basically tying together port 5555 at Home with port 8096 at Work, via an intermediary server.
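Since the same pattern comes back for every port you want to tie together, here's a minimal sketch that prints the matching pair of commands for a given set of ports. The function name tunnel_pair is made up for illustration; the ports and router address are the example values from this post.

```shell
# Hypothetical helper (not part of ecce or ssh): print the two ssh commands
# needed to tie home_port at Home to work_port at Work via relay_port on Router.
tunnel_pair() {
    home_port=$1
    relay_port=$2
    work_port=$3
    router=$4
    # Run at Work: expose Work's port on the router as relay_port
    echo "Work: ssh -R ${relay_port}:localhost:${work_port} ${router}"
    # Run at Home: map a local port onto the router's relay_port
    echo "Home: ssh -L ${home_port}:localhost:${relay_port} ${router}"
}

tunnel_pair 5555 19997 8096 root@myrouter.com
```

The relay port (19997 here) only has to match between the two commands and be free on the router.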

At Home, edit your ecce/apps/siteconfig/Dataservers and change the relevant lines to

<eccedata>
  <ecceserver>
    <url>http://localhost:5555/Ecce</url>
    <desc>ECCE Data Server--remote</desc>
  </ecceserver>

  <basisset>http://localhost:5555/Ecce/system/GaussianBasisSetLibrary</basisset>
</eccedata>

Note that submission actually happens from your ecce client, not your server (i.e. from Home, not Work), so to get your submission scripts in order you may have to do a bit of fiddling. E.g. if your ecce server is also the queue master for an SGE batch system:

Work:
ssh -R 19999:localhost:22 root@myrouter.com

Home:
ssh -L 5454:localhost:19999 root@myrouter.com

Home:
Edit /apps/siteconfig/remote_shells.site and add
ssh_p5454: ssh -XC -p 5454|scp -P 5454|xterm

But you can read more about that here: http://verahill.blogspot.com.au/2012/05/port-redirection-with-eccenwchem.html

25 October 2012

266. Back-up your gmail using getmail on debian testing/wheezy

Since I'm thinking about moving universities, the issue of backing up my work email account (which is hosted by gmail) is weighing at the back of my mind.

I'm just following this post: https://wiki.archlinux.org/index.php/Backup_Gmail_with_getmail

This post is duplicating most of what's done there (I like the Arch tutes -- well written and thorough) so this is more of a 'yes, I followed it and it works' kind of post.

Don't forget that gmail has an imap bandwidth limit of ca 2.5 GB per day: https://support.google.com/a/bin/answer.py?hl=en&answer=1071518

First install getmail4. You can probably ignore bug #633799 (I'm assuming that you use apt-listbugs and receive the warning below -- if not, you probably shouldn't be using testing...)

sudo apt-get install getmail4
#633799 - getmail causes irrecoverable mail corruption when using mbox
mkdir ~/.getmail
touch ~/.getmail/getmailrc
chmod og-rwx ~/.getmail/getmailrc
mkdir /media/backups/workmail
cd /media/backups/workmail
mkdir cur new tmp

If you don't make the last three folders getmail will complain.
Edit your ~/.getmail/getmailrc
[retriever]
type = SimpleIMAPSSLRetriever
server = imap.gmail.com
mailboxes = ("[Gmail]/All Mail",)
username = firstname.lastname@mycompany.com
password = myPassword

[destination]
type = Maildir
path = /media/backups/workmail/

[options]
verbose = 2
message_log = ~/.getmail/log

# retrieve only new messages
# if set to true it will re-download ALL messages every time!
read_all = false

# do not alter messages
delivered_to = false
received = false

Note that you may have a folder called [Google Mail]/All Mail instead of [Gmail]/All Mail.

More options here: http://pyropus.ca/software/getmail/configuration.html

Next run getmail

NOTE: if you interrupt (by e.g. ctrl+c) and then resume by running getmail again you'll download all emails again. If you let it run to completion, however, everything will work properly, and in the future it will only download new emails.

getmail
msg 1/5053 (5621 bytes) msgid 648042553/1 from someone@somewhere delivered to Maildir /media/backups/workmail/
[..]
msg 772/5053 (10839 bytes) from someone@somewhere delivered to Maildir /media/backups/workmail/
msg 773/5053 (4377 bytes) from someone@somewhere delivered to Maildir /media/backups/workmail
[..]
5053 messages (783074371 bytes) retrieved, 0 skipped
Summary: Retrieved 5053 messages (783074371 bytes) from SimpleIMAPSSLRetriever:firstname.lastname@mycompany.com@imap.gmail.com:993

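A quick sanity check is to count the files in the Maildir and compare with the number in getmail's summary. This is just a sketch: count_maildir is a name I made up, and it assumes one file per message under new/ and cur/, which is how a Maildir is laid out.

```shell
# Hypothetical helper: count the message files in a Maildir.
# Freshly delivered mail sits in new/; a mail client may later move it to cur/.
count_maildir() {
    maildir=$1
    find "$maildir/new" "$maildir/cur" -type f | wc -l | tr -d ' '
}

# Example (path from this post):
# count_maildir /media/backups/workmail
```

If the number is larger than what getmail reported, you have probably downloaded some messages twice, e.g. after an interrupted run.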
If that all worked ok, edit your crontab and make it run e.g. once per day:

crontab -e
00 23 * * * getmail -q
Since I have a private gmail account as well I created a file called privatemailrc and use it with
getmail -r ~/.getmail/privatemailrc -q

I also created folders /media/backups/privatemail/cur, /media/backups/privatemail/tmp and /media/backups/privatemail/new and point to those in my privatemailrc file.
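If you back up more than one account, the folder setup can be wrapped in a small function. A minimal sketch, where make_maildir is a made-up name and the path in the comment is the one from this post:

```shell
# Hypothetical helper: create a Maildir with the three subfolders getmail expects.
# mkdir -p also creates the parent folder and doesn't complain if it already exists.
make_maildir() {
    maildir=$1
    mkdir -p "$maildir/cur" "$maildir/new" "$maildir/tmp"
}

# Example:
# make_maildir /media/backups/privatemail
```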

24 October 2012

265. shmmax revisited -- and shmall, shmmni

I've upgraded two of my nodes -- my old 4 core node with 8 GB RAM now has 4x4=16 GB RAM, while my old 8 core node with 16 GB RAM now has 4x8=32 GB RAM.

When using nwchem you eventually will run into an shmmax problem:


******************* ARMCI INFO ************************
The application attempted to allocate a shared memory segment of 44498944 bytes in size. This might be in addition to segments that were allocated succesfully previously. The current system configuration does not allow enough shared memory to be allocated to the application.

This is most often caused by:
1) system parameter SHMMAX (largest shared memory segment) being too small or
2) insufficient swap space.
Please ask your system administrator to verify if SHMMAX matches the amount of memory needed by your application and the system has sufficient amount of swap space. Most UNIX systems can be easily reconfigured to allow larger shared memory segments,
see http://www.emsl.pnl.gov/docs/global/support.html
In some cases, the problem might be caused by insufficient swap space.
*******************************************************
0:allocate: failed to create shared region : -1
(rank:0 hostname:boron pid:17222):ARMCI DASSERT fail. shmem.c:armci_allocate():1082 cond:0

I haven't gotten that in a while since I increased shmmax to 6572498432, but running a frequency calculation on a large molecule with unrestricted DFT triggered it again on my 32 GB node. So I hit google. These posts were informative:
http://www.pythian.com/news/245/the-mysterious-world-of-shmmax-and-shmall/
http://padmavyuha.blogspot.com.au/2010/12/configuring-shmmax-and-shmall-for.html
http://yuji.wordpress.com/2011/11/03/what-is-shmmax-shmall-shmmni-shared-memory-max/


me@neon:~$ cat /proc/sys/kernel/shmall
2097152
me@neon:~$ cat /proc/sys/kernel/shmmni
4096
me@neon:~$ cat /proc/sys/kernel/shmmax
6572498432

shmall is given in pages, so that works out to 2097152 pages * 4096 bytes/page = 8589934592 bytes, i.e. 8 GB (counting 1024*1024*1024 bytes per gigabyte) of total shared memory. And the values are the same on all my nodes in spite of the memory available varying.
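The conversion from pages to bytes can be checked with a few lines of shell arithmetic. A sketch, with shmall_to_bytes being a made-up name, and the page size passed in explicitly (4096 bytes on my boxes; getconf PAGE_SIZE will tell you yours):

```shell
# Hypothetical helper: convert a shmall value (in pages) to bytes.
shmall_to_bytes() {
    pages=$1
    page_size=$2
    echo $(( pages * page_size ))
}

bytes=$(shmall_to_bytes 2097152 4096)
echo "$bytes bytes = $(( bytes / 1024 / 1024 / 1024 )) GB"
# prints: 8589934592 bytes = 8 GB

# On a live system:
# shmall_to_bytes "$(cat /proc/sys/kernel/shmall)" "$(getconf PAGE_SIZE)"
```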

Another way of looking at it:
ipcs -lm

------ Shared Memory Limits --------
max number of segments = 4096
max seg size (kbytes) = 6418455
max total shared memory (kbytes) = 8388608
min seg size (bytes) = 1


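Your shmall is the total amount of shared memory, counted in pages; shmmni is the maximum number of shared memory segments (it happens to be 4096, the same number as the page size in bytes, but the two are unrelated); and shmmax is the largest size, in bytes, that a single shared memory segment can have.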

So if I get things right, and parroting what's said on the pages above, your shmall (times the page size) should approach but not exceed your total physical memory, your shmmni is better left alone, and your shmmax can be anywhere up to your total RAM.

The links above cite Oracle recommendations which state that (for 32 bit systems) shmmax should be 4 GB - 1 byte OR half your RAM, whichever is smaller. I'll show the half-your-RAM case here, but will be testing using 80% of my RAM for my calcs.
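Just to make the 32 bit rule concrete, here it is as a sketch in shell. oracle_shmmax is a name I made up, and the rule is only what the linked posts report, not something I've checked against Oracle's own docs:

```shell
# Hypothetical helper: the smaller of (RAM / 2) and (4 GB - 1 byte).
oracle_shmmax() {
    ram_bytes=$1
    half=$(( ram_bytes / 2 ))
    cap=4294967295   # 4 GB - 1 byte
    if [ "$half" -lt "$cap" ]; then
        echo "$half"
    else
        echo "$cap"
    fi
}

oracle_shmmax 34359738368   # 32 GB of RAM, prints 4294967295
oracle_shmmax 4294967296    # 4 GB of RAM, prints 2147483648
```

On a 64 bit box with lots of RAM the 4 GB cap always wins, which is why I'm going with half the RAM instead.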

 So for my boxes:

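32 GB RAM => shmmax=16 GB, shmall=(32-2 GB)/4096, shmmni=4096
(Note that (32-2 GB)/4096 is strictly 7864320 pages; the 7340032 used below corresponds to 28 GB, i.e. it leaves 4 GB out of shmall rather than 2 GB.)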
sudo sysctl -w kernel.shmmax=17179869184
sudo sysctl -w kernel.shmall=7340032
ipcs -lm

------ Shared Memory Limits --------
max number of segments = 4096
max seg size (kbytes) = 16777216
max total shared memory (kbytes) = 29360128
min seg size (bytes) = 1

16 GB RAM => shmmax=8 GB, shmall=(16-2 GB)/4096, shmmni=4096
sudo sysctl -w kernel.shmmax=8589934592
sudo sysctl -w kernel.shmall=3670016


If you're happy with those values, make them permanent by editing your /etc/sysctl.conf and adding the relevant lines (here for the 32 GB box):
kernel.shmmax=17179869184
kernel.shmall=7340032


So here are the formulae (assuming that you set shmmax to half your RAM and leave 2 GB out of shmall):
shmmax=RAM (bytes)/2
shmmni=4096
shmall=(RAM (bytes)-2147483648)/page size (4096 bytes)
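Here's the same thing as a sketch in shell, printing lines in the format sysctl.conf expects. shm_settings is a made-up name; the divisor for shmall is the page size, which I assume to be 4096 bytes unless you pass something else (check with getconf PAGE_SIZE).

```shell
# Hypothetical helper: shmmax = RAM/2, shmall = (RAM - 2 GB) / page size.
# Usage: shm_settings RAM_IN_BYTES [PAGE_SIZE_IN_BYTES]
shm_settings() {
    ram_bytes=$1
    page_size=${2:-4096}
    reserve=2147483648   # the 2 GB left out of shmall
    echo "kernel.shmmax=$(( ram_bytes / 2 ))"
    echo "kernel.shmall=$(( (ram_bytes - reserve) / page_size ))"
}

shm_settings 17179869184   # 16 GB box
# prints:
# kernel.shmmax=8589934592
# kernel.shmall=3670016
```

For the 32 GB box (34359738368 bytes) this gives shmmax=17179869184 and shmall=7864320.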