15 May 2013

414. Frequency vs cores? Crude benchmarking on AMD FX 8150

I'm thinking about building my next computational node, and one issue which is preoccupying me is whether to go for lots of cores (e.g. a dual sock mobo with two 16 core 2.1 GHz cpus) or for a balance of cores and frequency (e.g. single-socket mobo with a 3.8 GHz 8 core cpu). Remember, this is built with private money -- not research grants -- so the budget is tight.

I mean, I can't look at something like this without wanting to buy it: http://www.newegg.com/Product/Product.aspx?Item=N82E16819113036. The question is whether I'm better off buying another one or two fx8150 for the price of 16x2 down-clocked cores.

Benchmarking with the FX 8150 actually makes some sense here if one of the newegg reviewers is to be believed, since the Opteron 6272 is described as two 8150s glued together and down-clocked.

The system: 32 gb ram, fx 8150, nwchem 6.1.1 with acml 5.3.1 (gfortran,int64, fma4) and openmpi.

Short of finding benchmarks for the type of applications that interest me (nwchem, mostly), I figure I could get a rough idea by throttling the frequency of my eight-core FX8150 and compare with unthrottled runs where the number of cores is limited.

Two things to take into account when looking at the times below:
  • modern processors are complex beasts -- I don't claim to fully understand threads vs virtual threads and integer vs FPU. In the FX8150 there are four fpus but eight cores. What this really means in practical terms when doing these particular test calculations, I don't know.
  • This isn't my job, and I need my nodes for running job-related calcs, so by necessity I had to use a short test job. There's inevitably some variability in the results, and using longer test jobs might affect the results somewhat.
  • The execution times vary A LOT for 'identical' conditions (see raw data), hence why I repeated the runs in bold ten times at 3.6 GHz to get reasonably solid comparison values. Still not perfect since the distribution isn't properly gaussian.

The specific question I wanted answered is:
Are 8 threads at 2.1 GHz significantly better than 4 threads at 3.6 GHz?
Short answer: No.
Looks like I won't be investing in 2 x 16 core 2.1 GHz cpus after all.


Optimization
c/f     3.60    3.30    2.70    2.10    1.40
8       44/3    49/6    58/1    75/6    110/5  
7       48/3                     72
6       52/1                    106
5       59/4            85       97
4       67/8            93     113/10    156
3       85/7
2      117/10
1      237/24
c=number of cores; f= frequency in GHz.

(times in seconds. 44/3 means 44 s +/- 3 s)

The way I read this is that it's better to have a 4-core 3.6 GHz cpu than an 8-core 2.1 GHz CPU. The whole 4 FPU/8 cores has me confused though, so I'm not sure whether that's affecting the results in a significant way.

The other thing to take into account is that there isn't normally a linear relationship between number of cores and execution times anyway -- doubling the number of cores doesn't normally lead to a halving of the execution time, so 16 cores at 2.10 GHz wouldn't necessarily be anywhere near 75/2=37 s. (again, that's ignoring the 2 cores/1 fpu issue)

-------------
c/f: raw data
--------------
8/3.6: 37.7,47.4,46.9,38.8, 46.8, 42.4,46.6, 43.9,44.7,42.8 => 44+/-3 s
7/3.6: 41.3,48.7,47.9,48.8,47.0,48.8,50.8,42.4,52.1,47.9 => 48+/-3 s
6/3.6: 49.5,53.4,50.5,53.4,52.4,53.3,51.3,53.4,52.5,53.55 => 52+/-1 s
5/3.6: 54.1,57.1, 67.7,52.2,59.6,58.4,59.8,57.6,59.4,58.6 => 59+/-4 s
4/3.6: 83.1,63.5,73.7,70.0,68.6,58.1,58.1,67.2,69.9,58.2 => 67 +/-8 s
3/3.6: 89.5, 86.0, 82.8, 97.9, 74.4,86.2,89.7, 86.3, 74.5, 86.2 => 85 +/-7 s
2/3.6: 114.1,137.4, 118.6, 108.3, 116.3, 123.6, 104.4,124.3,104.7, 120.6 => 117+/-10 s
1/3.6: 242.6,201.9,232.9,242.7, 233.2,202.0,233.1,265.2, 278.9,233.5 => 237+/- 24
8/3.3: 51.9, 42.4,42.7,55.3,43.3,55.8,54.6,48.1,42.4,48.1 => 49+/-6 s
8/2.7: 59.4, 57.3,59.1,57.8,58.9,56.8,59.0,58.5,59.2,56.9 => 58+/-1
8/2.1: 75.6,82.9,73.7,65.1,76.9,84.3,65.4,73.9,76.4,78.1 => 75+/-6 s
8/1.4: 112.5,110.5,112.1,108.6,113.1,114.4,112.4,109.1,97.9 => 110+/-5
4/2.1: 124.9,103.7,104.1, 92.4, 117.6,115.5,117.5,120.1,115.6,120.2 => 113+/-10 s

An alternative would be to report the fastest time (out of e.g. 10 tries) since it represents maximum capacity.



optimization input
scratch_dir /scratch
start benzeneopt 

geometry units angstroms
C  0.100  1.396  0.000
C  1.209  0.698  0.000
C  1.209 -0.698  0.000
C  0.000 -1.396  0.000
C -1.209 -0.698  0.000
C -1.209  0.698  0.000
H  0.000  2.479  0.000
H  2.147  1.240  0.000
H  2.147 -1.240  0.000
H  0.000 -2.479  0.000
H -2.147 -1.240  0.000
H -2.147  1.240  0.000
end

basis
 H library "6-31+g*" 
 c library "6-31+g*"
end
dft
 direct
end

task dft optimize



Setting frequency
The following script was called with the frequency in GHz, e.g. sudo setfreq 3.6

setfreq
/usr/bin/cpufreq-set -c 0 -g userspace
/usr/bin/cpufreq-set -c 1 -g userspace
/usr/bin/cpufreq-set -c 2 -g userspace
/usr/bin/cpufreq-set -c 3 -g userspace
/usr/bin/cpufreq-set -c 4 -g userspace
/usr/bin/cpufreq-set -c 5 -g userspace
/usr/bin/cpufreq-set -c 6 -g userspace
/usr/bin/cpufreq-set -c 7 -g userspace
/usr/bin/cpufreq-set -c 0 -f $1G
/usr/bin/cpufreq-set -c 1 -f $1G
/usr/bin/cpufreq-set -c 2 -f $1G
/usr/bin/cpufreq-set -c 3 -f $1G
/usr/bin/cpufreq-set -c 4 -f $1G
/usr/bin/cpufreq-set -c 5 -f $1G
/usr/bin/cpufreq-set -c 6 -f $1G
/usr/bin/cpufreq-set -c 7 -f $1G

13 May 2013

413. Povray 3.7-rc7 on debian wheezy

Update 15/5/2013: Seems like povray has been removed from debian! http://packages.debian.org/search?keywords=povray This means it's even safer to remove libjpeg62

Original post:
The latest beta version of povray, povray 3.7-rc7 will only build if you don't have libjpeg62 installed. Luckily, not much seems to rely on libjpeg62 anymore other than the debian version of povray.

If you want both the debian version and this version of povray installed at the same time, use a chroot environment to build.

The main reason for wanting to build your own povray 3.7 is that it supports parallel processing and so can speed up rendering significantly.

Note: you can compile povray 3.6 using the same instructions. Download it from here: http://www.povray.org/redirect/www.povray.org/ftp/pub/povray/Official/Unix/povray-3.6.tar.bz2

Building povray 3.7-rc7
sudo mkdir /opt/povray
sudo chown $USER /opt/povray 

mkdir ~/tmp
cd ~/tmp
sudo apt-get autoremove libjpeg62
sudo apt-get install libboost-all-dev libpng-dev libjpeg8-dev libtiff-dev build-essential checkinstall libsdl-dev

wget http://www.povray.org/redirect/www.povray.org/beta/source/povray-3.7.0.RC7.tar.bz2
tar xvf povray-3.7.0.RC7.tar.bz2
cd povray-3.7.0.RC7/
./configure --prefix=/opt/povray --program-suffix=_3.7 COMPILED_BY="me@here
make 
sudo checkinstall
0 - Maintainer: [ root@boron ] 1 - Summary: [ povray 3.7-rc7 ] 2 - Name: [ povray ] 3 - Version: [ 3.7.0.RC7 ] 4 - Release: [ 1 ] 5 - License: [ GPL ] 6 - Group: [ checkinstall ] 7 - Architecture: [ amd64 ] 8 - Source location: [ povray-3.7.0.RC7 ] 9 - Alternate source location: [ ] 10 - Requires: [ ] 11 - Provides: [ povray ] 12 - Conflicts: [ ] 13 - Replaces: [ ]
echo 'export PATH=$PATH:/opt/povray/bin' >> ~/.bashrc source ~/.bashrc

And that's it. Note that we added a suffix to the binary, so you'll have to call it with 'povray_3.7' instead of just 'povray'.

10 May 2013

412. system-config-samba on debian

I don't use samba anymore, but I do remember that Ubuntu had a very simple tool for setting up samba shares (turns out it's by redhat, not canonical). While it might be better in the long run to craft your own smb.conf, system-config-samba is convenient for quickly setting things up.

Anyway, someone at forums.debian.net asked for it, which got me thinking about making a post:

sudo apt-get install build-essential gfortran checkinstall python-all-dev cdbs debhelper quilt intltool python-central rarian-compat pkg-config gnome-doc-utils samba python-libuser libuser1 python-glade2
mkdir ~/tmp
cd ~/tmp
wget https://launchpad.net/ubuntu/+archive/primary/+files/system-config-samba_1.2.63.orig.tar.gz
tar xvf system-config-samba_1.2.63.orig.tar.gz
wget https://launchpad.net/ubuntu/+archive/primary/+files/system-config-samba_1.2.63-0ubuntu5.diff.gz
gunzip system-config-samba_1.2.63-0ubuntu5.diff.gz
patch -p0 < system-config-samba_1.2.63-0ubuntu5.diff
cd system-config-samba-1.2.63/
dpkg-buildpackage -uc -us
sudo dpkg -i ../system-config-samba_1.2.63-0ubuntu5_all.deb
sudo touch /etc/libuser.conf
gksu system-config-samba

And here we go:
i.e. it at the very least reads my /etc/samba/smb.conf accurately.

To make system-config-samba show up in your gnome menus, add it via Main Menu. Make sure to include gksu in the command to launch it.