28 February 2012

86. Building sinfo 0.0.45 on Debian Testing

I use sinfo to keep an eye on my cluster:
http://www.ant.uni-bremen.de/whomes/rinas/sinfo/#down

The debian repo version is 0.0.42-1
The latest version is sinfo 0.0.45

Here's the changelog since 0.0.42
sinfo 0.0.45 - Tue, 13 Mar 2012 07:07:27 +0100
  corrected README compile hint for FreeBSD
  added configure flag --disable-IPv6 to disable IPv6 support
sinfo 0.0.44 - Tue, 13 Dec 2011 18:32:33 +0100
  added reconnect for TCP connections
  added LIBADD to make it --as-needed linkable tnx to T.Harder
sinfo 0.0.43 - Thu, 01 Sep 2011 09:00:13 +0200
  fixed printing bug (integer underflow) when using sinfo -L or -W tnx to J.Erkkilae


There's little reason for compiling it yourself, but there's really no reason not to try either. I like debian and I like using apt-get to manage my system. Learning to be a bit more independent won't hurt though.
So here we go:

--START HERE --
wget http://www.ant.uni-bremen.de/whomes/rinas/sinfo/download/sinfo-0.0.45.tar.gz
tar -xvf sinfo-0.0.45.tar.gz 
cd sinfo-0.0.45/
sudo apt-get install libboost-dev libasio-dev libboost-signals-dev
./configure
(If IPv6 is disabled in your interface configuration you must use ./configure --disable-IPv6 instead, or sinfod will silently exit)
make 
sudo checkinstall


Done.

Start the sinfo daemon by
sudo sinfod --quiet --bcastaddress=192.168.1.255


Use
sinfo
to monitor

If you had sinfo installed before, autoremove it, then edit the left-behind /etc/init.d/sinfo script and change
/usr/sbin/sinfod
 to
 /usr/local/sbin/sinfod
Otherwise see http://verahill.blogspot.com.au/2012/02/debian-testing-wheezy-64-building_23.html for an example of writing your own init.d script.
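
The path edit in the old init script can also be done with sed instead of by hand. This is only a sketch: repoint_daemon is a hypothetical helper name, GNU sed (the Debian default) is assumed for -i, and I'm assuming the old path appears literally in the script.

```shell
#!/bin/sh
# Replace every occurrence of one daemon path with another in an init script.
# Usage: repoint_daemon SCRIPT OLD_PATH NEW_PATH
repoint_daemon() {
    script=$1
    old=$2
    new=$3
    # '|' is the sed delimiter so the slashes in the paths need no escaping.
    # -i edits the file in place (GNU sed).
    sed -i "s|$old|$new|g" "$script"
}

# Example (run as root). The new path is whatever the shell resolves sinfod to:
# repoint_daemon /etc/init.d/sinfo /usr/sbin/sinfod "$(command -v sinfod)"
```

Using command -v for the new path avoids typing the install location by hand. Make a copy of the script first if you want to be able to go back.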


To see all the cluster nodes running sinfo, just start
sinfo
If you don't see anything, you most likely haven't opened up your firewall: each node needs to be able to receive the broadcast packets.

Build errors:

Error:
In file included from message.cc:2:0:
message.h:5:34: fatal error: boost/shared_array.hpp: No such file or directory
compilation terminated.
make[2]: *** [message.lo] Error 1
make[2]: Leaving directory `/home/me/tmp/sinfo-0.0.45/libmessage'
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory `/home/me/tmp/sinfo-0.0.45/libmessage'
make: *** [all-recursive] Error 1

Solution:
sudo apt-get install libboost-dev


Error:
In file included from udpmessagereceiver.cc:2:0:
udpmessagereceiver.h:4:20: fatal error: asio.hpp: No such file or directory
compilation terminated.
make[1]: *** [udpmessagereceiver.lo] Error 1
make[1]: Leaving directory `/home/me/tmp/sinfo-0.0.45/libmessageio'
make: *** [all-recursive] Error 1

Solution:
sudo apt-get install libasio-dev
which provides
/usr/include/asio.hpp
which is different from the asio.hpp included in libboost


Error:
/usr/bin/ld: cannot find -lboost_signals-mt
collect2: ld returned 1 exit status
make[2]: *** [sinfod] Error 1
make[2]: Leaving directory `/home/me/tmp/sinfo-0.0.45/sinfod'
make[1]: *** [all] Error 2
make[1]: Leaving directory `/home/me/tmp/sinfo-0.0.45/sinfod'
make: *** [all-recursive] Error 1

Solution:
sudo apt-get install libboost-signals-dev
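
Two of the three errors above come from a missing header, so a quick check before running make can save a failed build. A minimal sketch: missing_headers is a hypothetical helper name, and /usr/include as the location of the boost header is my assumption (the asio.hpp path is the one given above).

```shell
#!/bin/sh
# Print every header from the list that is missing under the given include root.
# Usage: missing_headers INCLUDE_ROOT HEADER...
missing_headers() {
    root=$1
    shift
    for header in "$@"; do
        # -f is true only for an existing regular file (symlinks are followed)
        [ -f "$root/$header" ] || printf '%s\n' "$header"
    done
}

# The two headers named in the compile errors above
missing_headers /usr/include boost/shared_array.hpp asio.hpp
```

No output means both headers were found. The linker error (-lboost_signals-mt) is about a library rather than a header, so this check does not cover it.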


85. Nvidia bug causes evolution to crash/segmentation fault. Temporary and permanent fixes on Debian Testing

The nvidia-tls bug is affecting evolution too...
(and it's not just GNOME - http://www.linuxmintusers.de/index.php?topic=6859.0)


UPDATE: I have two nvidia boxes running debian testing. Only the one with GT 430 is exhibiting problems. My GT 520 box is unaffected.


UPDATE: Here's how to downgrade your drivers:
http://verahill.blogspot.com.au/2012/03/debian-testing-downgrading-nvidia.html

If you don't want to read the entire post, here's the summary:
1. I think the only semi-permanent solution is to downgrade from 295.20 to nvidia driver version 290.10
2. you can run evolution with
strace -o evolution.log evolution
and IT WILL NOT CRASH
3. It doesn't matter whether you install the nvidia binary straight from nvidia, via sgfxi, or the debian way with nvidia-kernel-dkms/glx. Evolution still dies.

PS strace is normally used to trace system calls when troubleshooting. That it prevents evolution from crashing is completely unintended, but it works as a quick fix.

PPS What nvidia-tls does:

"The nvidia-tls libraries (/usr/lib/libnvidia-tls.so.x.y.z and /usr/lib/tls/libnvidia-tls.so.x.y.z); these files provide thread local storage support for the NVIDIA OpenGL libraries (libGL, libGLcore, and libglx). Each nvidia-tls library provides support for a particular thread local storage model (such as ELF TLS), and the one appropriate for your system will be loaded at run time."




The symptoms:
Start evolution, and it will crash with a segmentation fault within the first ten seconds or so.

dmesg points to the nvidia bug:

[19690.606196] evolution[13032]: segfault at 10 ip 00007f5a0f53ac0f sp 00007f59ddde6508 error 6 in libnvidia-tls.so.295.20[7f5a0f53a000+3000]
[21476.236668] evolution[18197]: segfault at 10 ip 00007fd4389c2c0f sp 00007fd418d56508 error 6 in libnvidia-tls.so.295.20[7fd4389c2000+3000]
[21513.224145] evolution[18387]: segfault at 10 ip 00007f2cd3e85c0f sp 00007f2cb3a44508 error 6 in libnvidia-tls.so.295.20[7f2cd3e85000+3000]
[21954.867694] evolution[19803]: segfault at 10 ip 00007f1680aa9c0f sp 00007f165bffe508 error 6 in libnvidia-tls.so.295.20[7f1680aa9000+3000]
[22129.426444] evolution[20435]: segfault at 10 ip 00007f2a05bf8c0f sp 00007f29e5725508 error 6 in libnvidia-tls.so.295.20[7f2a05bf8000+3000]
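
To check whether other programs are hitting the same library, the segfault lines can be tallied per process and library. A rough sketch: count_segfaults is a hypothetical helper name, and the sed pattern assumes the exact line layout shown above.

```shell
#!/bin/sh
# Count segfault lines per (process, library) pair from dmesg-style input on stdin.
count_segfaults() {
    # Assumed layout: "[time] proc[pid]: segfault at ... in lib.so[addr+size]"
    grep 'segfault at' |
        sed -n 's/^\[ *[0-9.]*\] *\([^[]*\)\[[0-9]*\]: segfault .* in \([^[]*\)\[.*$/\1 \2/p' |
        sort | uniq -c
}

# Example (dmesg may need root on some systems):
# dmesg | count_segfaults
```

Each output line is a count followed by the process name and the library the fault was reported in.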


Running
CAMEL_DEBUG=all evolution >& evolution.log
three times, it crashed at a different point each time:

First time:

DB SQL operation [BEGIN] started
Camel SQL Exec:
BEGIN
Camel SQL Exec:
COMMIT
DB Operation ended. Time Taken : 0.000060
###########
received: * LSUB (\HasNoChildren) "/" "INBOX"
received: B00005 OK Success
sending : B00006 LIST "" "*"
--> Segmentation fault

Second time:

===========
DB SQL operation [ATTACH DATABASE ':memory:' AS mem] started
Camel SQL Exec:
ATTACH DATABASE ':memory:' AS mem
POP3_STREAM_LINE (25): '-ERR unrecognized command'
DB Operation ended. Time Taken : 0.011516
###########

Database succesfully opened  

===========
DB SQL operation [ATTACH DATABASE ':memory:' AS mem] started
Camel SQL Exec:
ATTACH DATABASE ':memory:' AS mem
DB Operation ended. Time Taken : 0.010961
###########
**
GLib-GIO:ERROR:/tmp/buildd/glib2.0-2.30.2/./gio/gdbusmessage.c:1986:append_value_to_blob: assertion failed: (g_utf8_validate (v, -1, &end) && (end == v + len))
--> Segmentation fault

Third time:

===========
DB SQL operation [BEGIN] started
Camel SQL Exec:
BEGIN
Camel SQL Exec:
COMMIT
DB Operation ended. Time Taken : 0.000070
###########
sending : A00004 SELECT INBOX
--> Segmentation fault

strace:
I can't crash evolution with either strace or valgrind running. Now why is that?


Solution:
Downgrading. Which turns out to be more difficult than one would imagine.

UPDATE: Here's how to downgrade your drivers:
http://verahill.blogspot.com.au/2012/03/debian-testing-downgrading-nvidia.html

If you don't want to downgrade the nvidia drivers:
A temporary solution is, odd as it may seem, to use
strace -o evolution.log evolution
because it just refuses to crash. I don't know why, but it works. 

84. Downloading Debian installation ISOs using jigdo

Jigdo is the preferred way of downloading and maintaining debian ISOs for those who can't use torrents (e.g. because of company policy). The advantage is that you only need to download what has changed since you last used jigdo, which saves traffic.

What I'm showing here is also described on the debian website -- my contribution is just to provide another, perhaps more detailed, example.

Anyway, it's fairly easy to use jigdo:
sudo apt-get install jigdo-file

mkdir ~/jigdo
cd ~/jigdo
jigdo-lite

Pick a url from here: http://www.debian.org/CD/jigdo-cd/#which
I'll use a jigdo file for amd64 testing by clicking here http://cdimage.debian.org/cdimage/weekly-builds/amd64/jigdo-cd/ and picking the following url:
http://cdimage.debian.org/cdimage/weekly-builds/amd64/jigdo-cd/debian-testing-amd64-CD-1.jigdo

You also need to pick a mirror -- your country mirror or a nearby university are good choices. I picked ftp://ftp.au.debian.org/debian.
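
If you want more than one CD or a different architecture, the .jigdo URL can be put together from its parts. A small sketch: jigdo_url is a hypothetical helper name, and I'm assuming the other weekly builds follow the same naming pattern as the amd64 CD-1 URL above.

```shell
#!/bin/sh
# Build the weekly-build .jigdo URL for a given architecture and CD number.
# Usage: jigdo_url ARCH CD_NUMBER
jigdo_url() {
    arch=$1
    cd_number=$2
    printf 'http://cdimage.debian.org/cdimage/weekly-builds/%s/jigdo-cd/debian-testing-%s-CD-%s.jigdo\n' \
        "$arch" "$arch" "$cd_number"
}

# Prints the URL used in this post
jigdo_url amd64 1
```

The printed URL is what you paste at the jigdo-lite prompt.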

Here's how it works:

me@beryllium:~/jigdo$ jigdo-lite


Jigsaw Download "lite"
Copyright (C) 2001-2005  |  jigdo@
Richard Atterer          |  atterer.net
Loading settings from `/home/me/.jigdo-lite'


-----------------------------------------------------------------
To resume a half-finished download, enter name of .jigdo file.
To start a new download, enter URL of .jigdo file.
You can also enter several URLs/filenames, separated with spaces,
or enumerate in {}, e.g. `http://server/cd-{1_NONUS,2,3}.jigdo'
jigdo [http://cdimage.debian.org/cdimage/weekly-builds/amd64/jigdo-cd/debian-testing-amd64-CD-1.jigdo]: http://cdimage.debian.org/cdimage/weekly-builds/amd64/jigdo-cd/debian-testing-amd64-CD-1.jigdo


Not downloading .jigdo file - `debian-testing-amd64-CD-1.jigdo' already present


-----------------------------------------------------------------
Images offered by `http://cdimage.debian.org/cdimage/weekly-builds/amd64/jigdo-cd/debian-testing-amd64-CD-1.jigdo':
  1: 'Debian GNU/Linux testing "Wheezy" - Official Snapshot amd64 CD Binary-1 20120220-05:20 (20120220)' (debian-testing-amd64-CD-1.iso)


Further information about `debian-testing-amd64-CD-1.iso':
Generated on Mon, 20 Feb 2012 05:33:08 +0000


-----------------------------------------------------------------
If you already have a previous version of the CD you are
downloading, jigdo can re-use files on the old CD that are also
present in the new image, and you do not need to download them
again. Mount the old CD ROM and enter the path it is mounted under
(e.g. `/mnt/cdrom').
Alternatively, just press enter if you want to start downloading
the remaining files.
Files to scan: 


-----------------------------------------------------------------
The jigdo file refers to files stored on Debian mirrors. Please
choose a Debian mirror as follows: Either enter a complete URL
pointing to a mirror (in the form
`ftp://ftp.debian.org/debian/'), or enter any regular expression
for searching through the list of mirrors: Try a two-letter
country code such as `de', or a country name like `United
States', or a server name like `sunsite'.
Debian mirror [http://sluglug.ucsc.edu/debian/]: ftp://ftp.au.debian.org/debian


Downloading .template file
--2012-02-28 10:22:04--  http://cdimage.debian.org/cdimage/weekly-builds/amd64/jigdo-cd/debian-testing-amd64-CD-1.template
Resolving cdimage.debian.org (cdimage.debian.org)... 2001:6b0:e:2018::138, 2001:6b0:e:2018::163, 130.239.18.163, ...
Connecting to cdimage.debian.org (cdimage.debian.org)|2001:6b0:e:2018::138|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 48901055 (47M) [text/plain]


Saving to: `debian-testing-amd64-CD-1.template'


100%[===================================================================================================================================================>] 48,901,055  4.92M/s   in 15s     


2012-02-28 10:22:19 (3.16 MB/s) - `debian-testing-amd64-CD-1.template' saved [48901055/48901055]

[....]

FINISHED --2012-02-28 10:37:02--
Total wall clock time: 4.1s
Downloaded: 3 files, 311K in 2.4s (128 KB/s)
Found 3 of the 3 files required by the template                                                                                                                                  
Successfully created `debian-testing-amd64-CD-1.iso'


-----------------------------------------------------------------
Finished!
The fact that you got this far is a strong indication that `debian-testing-amd64-CD-1.iso'
was generated correctly. I will perform an additional, final check,
which you can interrupt safely with Ctrl-C if you do not want to wait.


OK: Checksums match, image is good!  



jigdo downloads every package the image needs, so this takes a while. Once that's done, the iso is built automatically.

If you want to update your iso - delete it and run jigdo-lite again.