24 October 2012

265. shmmax revisited -- and shmall, shmmni

I've upgraded two of my nodes -- my old 4 core node with 8 GB ram now has 4x4=16 GB RAM, while my old 8 core, 16 GB ram now has 4*8=32 GB ram.

When using nwchem you eventually will run into an shmmax problem:


******************* ARMCI INFO ************************
The application attempted to allocate a shared memory segment of 44498944 bytes in size. This might be in addition to segments that were allocated succesfully previously. The current system configuration does not allow enough shared memory to be allocated to the application.

This is most often caused by:
1) system parameter SHMMAX (largest shared memory segment) being too small or
2) insufficient swap space.
Please ask your system administrator to verify if SHMMAX matches the amount of memory needed by your application and the system has sufficient amount of swap space. Most UNIX systems can be easily reconfigured to allow larger shared memory segments,
see http://www.emsl.pnl.gov/docs/global/support.html
In some cases, the problem might be caused by insufficient swap space.
*******************************************************
0:allocate: failed to create shared region : -1
(rank:0 hostname:boron pid:17222):ARMCI DASSERT fail. shmem.c:armci_allocate():1082 cond:0

I haven't gotten that in a while since I increased shmmax to 6572498432, but running a frequency calculation on a large molecule with unrestricted DFT triggered it again on my 32 GB node. So I hit google. These posts were informative:
http://www.pythian.com/news/245/the-mysterious-world-of-shmmax-and-shmall/
http://padmavyuha.blogspot.com.au/2010/12/configuring-shmmax-and-shmall-for.html
http://yuji.wordpress.com/2011/11/03/what-is-shmmax-shmall-shmmni-shared-memory-max/


me@neon:~$  cat /proc/sys/kernel/shmall
2097152
me@neon:~$ cat /proc/sys/kernel/shmni
4096
me@neon:~$ cat /proc/sys/kernel/shmmax
6572498432

That works out to (4096 bytes/page*2097152)*(1/(1024*1024*1024) bytes per gigabyte) pages=8.192 GB. And they are the same on all my nodes in spite of the memory available varying.

Another way of looking at it:
ipcs -lm

------ Shared Memory Limits --------
max number of segments = 4096
max seg size (kbytes) = 6418455
max total shared memory (kbytes) = 8388608
min seg size (bytes) = 1


Your shmmall is the number of pages total, the shmmni is the page size and the shmmax is the largest contigouos chunk of RAM available.

 So if I get things right, and parroting what's said on the pages above, your shmmall should approach but not exceed your total physical memory, you shmni is better left alone, and your shmmax can be anywhere up to your total RAM.

The links above cite Oracle recommendations which state that (for 32 bit system) it should be 4 GB - 1 byte OR half your RAM, whichever is smaller. I'll show that case here, but will be testing using 80% of my RAM for my calcs.

 So for my boxes:

32 GB RAM => shmmax=16GB, shmmall=(32-2 GB)/4095, shmni=4096
sudo sysctl -w kernel.shmmax=17179869184
sudo sysctl -w kernel.shmall=7340032
ipcs -lm

------ Shared Memory Limits -------- max number of segments = 4096 max seg size (kbytes) = 16777216 max total shared memory (kbytes) = 29360128 min seg size (bytes) = 1
16 GB RAM => shmmax=8GB, shmmall=(16-2 GB)/4096, shmni=4096
sudo sysctl -w kernel.shmmax=8589934592
sudo sysctl -w kernel.shmall=3670016


If you're happy with those values, make them permanent by editing your sysctl.conf and adding the relevant lines:
kernel.shmmax=17179869184
kernel.shmall=7340032


So here are the formulae (assuming that you set shmmax to half your ram and leave 2 gb out of shmall):
shmmax=RAM (bytes)/2
shmni=4096
shmmall=(RAM(bytes)-2147483648)/shmni

No comments:

Post a Comment