25 September 2012

246. Cluster network performance testing (very basic) on Debian Testing using a gigabit switch

Playing with hpcc got me thinking about my network connection.

My cluster looks like this:
I've got four nodes connected via two networks, 192.168.2.0/24 and 192.168.1.0/24. The 192.168.1.0/24 network runs over a gigabit switch, and Be (see below) acts as the gateway. The 192.168.2.0/24 network is connected via a crappy old Netgear 10/100 router (DHCP) and provides access to the outside world (hello MAC spoofing :) ). Each box shares a folder via NFS using a unique name.
_Nodes_
Be: AMD II X3, 8 GB RAM (192.168.1.1): Realtek Semiconductor Co., Ltd. RTL-8169 Gigabit Ethernet (rev 10)
Ta: Intel i5-2400, 8 GB RAM (192.168.1.150): Intel Corporation 82579LM Gigabit Network Connection (rev 04)
B: AMD Phenom II X6, 8 GB RAM (192.168.1.101): Realtek Semiconductor Co., Ltd. RTL8111/8168B PCI Express Gigabit Ethernet controller (rev 03)
Ne: AMD FX 8150 X8, 16 GB RAM (192.168.1.120): Realtek Semiconductor Co., Ltd. RTL-8169 Gigabit Ethernet (rev 10)
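
The 192.168.1.0/24 side is just a static address on eth1 on each node. A minimal /etc/network/interfaces sketch for B would look something like this (the eth0/DHCP stanza for the 192.168.2.0/24 side is an assumption about interface naming, adjust to taste):

# 192.168.2.0/24 - DHCP from the old Netgear router (interface name assumed)
auto eth0
iface eth0 inet dhcp

# 192.168.1.0/24 - the gigabit network
auto eth1
iface eth1 inet static
    address 192.168.1.101
    netmask 255.255.255.0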

So, time to test the network performance:
sudo apt-get install iperf

On all your boxes (e.g. using clusterssh), start the iperf server:
iperf -s
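
If you don't want to keep a terminal open on every box, iperf should also be happy to put itself in the background:
iperf -s -D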

Then on each of your nodes run:
iperf -c 192.168.1.1 && iperf -c 192.168.1.101 && iperf -c 192.168.1.150 && iperf -c 192.168.1.120

------------------------------------------------------------
Client connecting to 192.168.1.1, TCP port 5001
TCP window size: 45.7 KByte (default)
------------------------------------------------------------
[  3] local 192.168.1.101 port 37893 connected with 192.168.1.1 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-10.0 sec   564 MBytes   473 Mbits/sec
------------------------------------------------------------
Client connecting to 192.168.1.101, TCP port 5001
TCP window size:  169 KByte (default)
------------------------------------------------------------
[  3] local 192.168.1.101 port 35926 connected with 192.168.1.101 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-10.0 sec  15.5 GBytes  13.3 Gbits/sec
------------------------------------------------------------
Client connecting to 192.168.1.150, TCP port 5001
TCP window size: 22.9 KByte (default)
------------------------------------------------------------
[  3] local 192.168.1.101 port 48257 connected with 192.168.1.150 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-10.0 sec   564 MBytes   473 Mbits/sec
------------------------------------------------------------
Client connecting to 192.168.1.120, TCP port 5001
TCP window size: 22.9 KByte (default)
------------------------------------------------------------
[  3] local 192.168.1.101 port 43236 connected with 192.168.1.120 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-10.0 sec   617 MBytes   517 Mbits/sec
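
Typing that on every node gets old quickly. Something along these lines would collect the whole mesh in one go (just a sketch, assuming passwordless ssh between the nodes and iperf -s already running everywhere; the script name is made up):

#!/bin/bash
# run_mesh.sh -- run iperf from every node to every node, keep the summary line
NODES="192.168.1.1 192.168.1.101 192.168.1.150 192.168.1.120"
for client in $NODES; do
    for server in $NODES; do
        echo "=== $client -> $server ==="
        ssh "$client" iperf -c "$server" | tail -n 1
    done
done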


Overall, this is what I got (rows are the client, columns the server; MBit/s, the diagonal being each node talking to itself):
     Be     B     Ta    Ne
Be   13.7G  310   308   316
B    564    15.5G 564   617
Ta   726    660   19.7G 936
Ne   882    484   917   19.4G

I'm not sure whether to expect a metric gigabit (1000 Mbit/s) or a binary one (1024 Mbit/s), but looking at the results the best link manages 936 Mbit/s and the worst only 308 Mbit/s. They all have gigabit network cards, so ideally all of them should be able to reach at least that 936 Mbit/s.
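
As a back-of-the-envelope sanity check: with the standard 1500-byte MTU each frame carries 1460 bytes of TCP payload but costs 1538 bytes on the wire (Ethernet header, FCS, preamble and inter-frame gap), so the practical ceiling for TCP over gigabit is roughly

echo "1000 * 1460 / 1538" | bc -l
# ~949 Mbit/s (a bit less once TCP options are counted)

So Ta to Ne at 936 Mbit/s is already about as good as it gets; it's the other links that are underperforming.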

And now, let's try to improve it:
I went through the whole shebang with
sudo ifconfig eth1 mtu 9000
sudo ifconfig eth1 mtu 8000
etc.
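
A throwaway loop does the same search automatically (a sketch, assuming eth1 and relying on ifconfig failing when the driver rejects a value):

for mtu in 9000 8000 7200 7100 7000 4000 1500; do
    if sudo ifconfig eth1 mtu $mtu 2>/dev/null; then
        echo "largest accepted MTU: $mtu"
        break
    fi
done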
Anyway, these are the maximum MTUs each card would accept that way:
Be: 7100
B:  7100
Ne: 9000
Ta: 9000

I then set the MTU to 7100 on all the nodes and tried pinging from node to node with the don't-fragment bit set. The ping payload is the MTU minus 28 bytes of IP and ICMP headers, so 7100 - 28 = 7072:
ping -s 7072 -M do 192.168.1.101

Well, that maxed out at a payload of 1472 bytes, i.e. MTU 1500, which was the original value. So I'm a bit confused.
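
Two things worth checking at this point (a sketch; tracepath should be in the iputils-tracepath package): whether the interface really took the new MTU, and what path MTU two nodes actually see:

ip link show dev eth1        # does it still say mtu 1500?
tracepath -n 192.168.1.101   # reports the pmtu along the path

If eth1 still reports 1500, the new MTU never stuck, which would explain the 1472 ceiling.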


Settings:
Be:
eth1      Link encap:Ethernet  HWaddr 00:f0:4d:83:0a:48  
          inet addr:192.168.1.1  Bcast:192.168.1.255  Mask:255.255.255.0
          inet6 addr: fe80::2f0:4dff:fe83:a48/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:24124966 errors:0 dropped:27064 overruns:0 frame:0
          TX packets:19569426 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:25859945667 (24.0 GiB)  TX bytes:14200267703 (13.2 GiB)
B:
eth1      Link encap:Ethernet  HWaddr 02:00:8c:50:2f:6b  
          inet addr:192.168.1.101  Bcast:192.168.1.255  Mask:255.255.255.0
          inet6 addr: fe80::8cff:fe50:2f6b/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:14540970 errors:0 dropped:36651 overruns:0 frame:0
          TX packets:16801915 errors:0 dropped:2 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:12347398135 (11.4 GiB)  TX bytes:18008416370 (16.7 GiB)
Ta:
eth1      Link encap:Ethernet  HWaddr 78:2b:cb:b3:a4:b7  
          inet addr:192.168.1.150  Bcast:192.168.1.255  Mask:255.255.255.0
          inet6 addr: fe80::7a2b:cbff:feb3:a4b7/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:14717233 errors:0 dropped:68232 overruns:0 frame:0
          TX packets:17769966 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:13860096243 (12.9 GiB)  TX bytes:20207270880 (18.8 GiB)
          Interrupt:20 Memory:e1a00000-e1a20000 
Ne:
eth1      Link encap:Ethernet  HWaddr 90:2b:34:93:75:e6  
          inet addr:192.168.1.120  Bcast:192.168.1.255  Mask:255.255.255.0
          inet6 addr: fe80::922b:34ff:fe93:75e6/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:13567520 errors:0 dropped:0 overruns:0 frame:0
          TX packets:10710054 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:13086635236 (12.1 GiB)  TX bytes:12381041605 (11.5 GiB)
