Lindqvist -- a blog about Linux and Science. Mostly.: July 2011

31 July 2011

12. Pulseaudio - controlling where the output goes

The non-free flash plugin can be a bit of a headache, since even if you install it and flash videos play nicely, you may not get any sound. On a laptop, you might not notice it, since the default output seems to be the laptop speakers. However, you may have problems piping the sounds through any attached USB speakers. Same goes for desktops, where the default output may be the sounds ports on your motherboard.

The fix is simple (assuming you have pulseaudio installed)
create an /etc/asound.conf file or a ~/.asoundrc file, with the following in it:

pcm.!default.type pulse
ctl.!default.type pulse

Rebooting should take care of things

11. Sorting out problems with NVIDIA or ATI graphics cards on Debian Testing

This bug has been a problem for a while now when running debian testing - if you do a dist-upgrade, chances are your prorietary drivers get borked.

Symptoms include gdm not starting i.e. no Gnome login window. Instead you get dumped into the terminal.

There are two fixes - one quick and dirty one and one which is more long term.

1. Quick and dirty
If you want Gnome up and running without necessarily using graphics acceleration, simply log in in your terminal and rename your xorg.conf.

sudo mv xorg.conf xorg.conf.old

This will allow you to use the graphics card on your motherboard again. It won't help with a PCI card much.

2. A real fix
SMXI. It may seems scary, but is fairly straightforward. Log in (presumably in the terminal)

sudo su
cd /usr/local/bin && wget -Nc smxi.org/smxi.zip && unzip smxi.zip && smxi

The sudo su logs you in as root. You could also sudo each command individually above.

Follow the instructions and let the smxi script install your graphics drivers.

28 July 2011

10. Combining MS data into a matrix

The following script takes a series of files with comma-delimited data named 0v.csv, 10v.csv etc., extracts everything in column 4 (in my case it contains relative abundance values from MS) from row 7 to the bottom, and stores those columns in files called 0.dat, 10.dat etc.

The second column (in my case it contains m/z values) is extracted from one of the csv files, and stored in mz.x.

Next, all the .dat files are pasted together, with the columns side by side, and the mz.x data added as the first column.

The data is rotated using the rotate script I've published earlier.

A file with cone-voltage values, cv.xy, has already been prepared by hand (single column), and is pasted together with the m/z data.

procms.sh
#!/usr/bin.bash
for e in 0 10 20 30 40 50 60 70 80 90 100 110 120 130 140 150 160 170 180 190 200 220 240 260 280 300
do
tail -n +8 $e\v.csv | sed 's/\,/\t/g'| gawk '{print $4}' > $e.dat
done

#get m/z
tail -n +8 0v.csv | sed 's/\,/\t/g'| gawk '{print $2}' > mz.x

paste 0.dat 10.dat 20.dat 30.dat 40.dat 50.dat 60.dat 70.dat 80.dat 90.dat 100.dat 110.dat 120.dat 130.dat 140.dat 150.dat 160.dat 170.dat 180.dat 190.dat 200.dat 220.dat 240.dat 260.dat 280.dat 300.dat > all.dat
paste mz.x all.dat > $1dat
rotate $1.dat > $1\rot.dat
paste cv.xy $1\rot.dat > $1\tot.dat

26 July 2011

9. rotating a matrix

I lifted this off of some website at some point, but can't remember where and so can't give credit.

rotate.sh
gawk '
{
for (i=1; i<=NF; i++) {
a[NR,i] = $i
}
}
NF>p { p = NF }
END {
for(j=1; j<=p; j++) {
str=a[1,j]
for(i=2; i<=NR; i++){
str=str" "a[i,j];
}
print str
}
}' $1

8. Very rudimentary peak-finding

The following scripts looks through a file with one x column and several y columns, and returns values larger than a certain cutoff.

Usage:
findpeaks filename columnnumber cutoffvalue
e.g.
findpeaks test.dat 3 0.01

findpeaks:
#!/usr/bin/python2.6
import sys
infile=sys.argv[1]
colno=int(sys.argv[2])
cutoff=float(sys.argv[3])

minsig=100
maxsig=0

spectra=open(infile,'r')

for spectrum in spectra:
spectrum=spectrum.rstrip('\n')
spectrum=spectrum.split(' ')
# print spectrum

try:
spectrum[colno]=float(spectrum[colno])
if spectrum[colno]>cutoff:
print spectrum[1],' ',spectrum[colno]

if spectrum[colno]<minsig:
minsig=spectrum[colno]
if spectrum[colno]>maxsig:
maxsig=spectrum[colno]

except:
a=0
maxsig=0
print "not working: "

print "max sig: ",maxsig,", minsig: ",minsig
spectra.close

20 July 2011

7. Processing 1D Bruker nmr data

Bruker 1D binary NMR files can be processed using a combination of cat, grep, sed, gawk and od, together with python and octave (w/ octave-optim) for some fancy line-fitting.

brukdig2asc:
#!/bin/bash
#usage: brukdig2asc
SW=`cat acqus | grep 'SW_h' | sed 's/\=/\t/g' | gawk '{print $2}'| tr -d '\n'`
TD=`cat acqus | grep 'TD=' | sed 's/\=/\t/g' | gawk '{print $2}'| tr -d '\n'`
O=`cat acqus | grep '$O1=' | sed 's/\=/\t/g' | gawk '{print $2}'`
SFO=`cat acqus | grep 'SFO1=' | sed 's/\=/\t/g' | gawk '{print $2}'`
#TIME=16384
#SWEEP=23809.5238095238
#AQ=`echo "1/(23809.5238095238/(16384/2))" | bc -lq`
cp fid fid.bin
ls fid.bin | cpio -o | cpio -i --swap -u
od -An -t dI -v -w8 fid.bin| gawk '{print NR,$1,$2}'| sed '1,64d' >fid.asc1
pynmr $SW $TD $O $SFO
makespec

pynmr:
#!/usr/bin/python2.6
import sys
#print str(sys.argv)
sweepwidth=float(sys.argv[1])
nopts=int(sys.argv[2])
centrefreq=float(sys.argv[3])
basefreq=float(sys.argv[4])

aq=1/(sweepwidth/(nopts/2))
#print str(sweepwidth),str(nopts)
f=open('fid.asc1','r')
g=open('fid.asc','w')
for line in f:
    line=line.rstrip('\n')
    line=line.split(' ')
#    print line
    freq=float(line[0])/(nopts/2)*sweepwidth+(centrefreq-sweepwidth/2)
    line[0]=(float(line[0])/(nopts/2))*aq
    g.write(str(line[0])+'\t'+str(line[1])+'\t'+str(line[2])+'\t'+str(freq)+'\n')
f.close
g.close

makespec:

#!/bin/bash
octave --silent --eval "fid=load('fid.asc');
#make xaxis
[nopts b]=size(fid);
aq=max(fid(:,1));
sw=nopts/aq;
freqx=linspace(0,sw,nopts)';

#apodizing
lb=5/10000;
fid(:,2)=fid(:,2).*exp(-lb.*freqx);
fid(:,3)=fid(:,3).*exp(-lb.*freqx);

#phasing
spec=[fid(:,1) real(fftshift(fft(fid(:,2)+i*fid(:,3)))) imag(fftshift(fft(fid(:,2)+i*fid(:,3))))];
[a b]=size(spec); spec(a/2,2:3)=[0 0];
phc=linspace(0,2*pi,180);
maxsig=0;k=1;
for n=1:180;
localmax=max( real( (spec(:,2)+i*spec(:,3)).*exp(i*phc(n)) ));
if (localmax>maxsig)
maxsig=localmax;
k=n;
endif
endfor;
#simple baseline
absd=inline('m+t*0','t','m');
guess=0;
[f m kvg iter corp covp covr stdresid z r2]=leasqr(fid(:,4),real((spec(:,2)+i*spec(:,3)).*exp(i*phc(k))),guess,absd);
#disp(m)
#disp(sqrt(diag(covp)))

#make spectrum
spectrum=[fid(:,4) real((spec(:,2)+i*spec(:,3)).*exp(i*phc(k)))-m imag((spec(:,2)+i*spec(:,3)).*exp(i*(phc(k)+pi/2)))-m];

#fitting
pkg load optim
[a b]=max(spectrum(:,2));
centre=fid(b,4);
guess=[10 max(spec(:,2))]; #centre width height
#disp(guess)
lorentzian=inline('p(2)*(1/pi)*(p(1)/2)./((t-centre).^2+(0.5*p(1))^2)','t','p');
[f p r2]=leasqr(fid((b-150):(b+150),4),spectrum((b-150):(b+150),3),guess,lorentzian);

#filter out artefacts from fitting set
filtered=[0 0];
res=floor((max(fid(:,4))-min(fid(:,4)))/nopts);
for l=(b-ceil(5*p(1)/res)):(b+5*ceil(p(1))/res)
delta=lorentzian(fid(l,4),p)-spectrum(l,2);

if (delta>(lorentzian(fid(l,4),p))/1.2)
# do nothing
else
filtered=[filtered; fid(l,4) spectrum(l,2)];
endif
endfor

filtered=[ filtered(2:size(filtered(:,2)),1) filtered(2:(size(filtered(:,2))),2) ];
[f p r2]=leasqr(filtered(:,1),filtered(:,2),p,lorentzian);

#disp(p')
#disp(r2)
params=[centre centre/67.8 max(lorentzian(fid(:,4),p)) p(1) 1.000 p(2)];
disp(params)
#save
spex=[fid(:,4) real((spec(:,2)+i*spec(:,3)).*exp(i*phc(k)))-m imag((spec(:,2)+i*spec(:,3)).*exp(i*(phc(k)+pi/2)))-m lorentzian(fid(:,4),p)];
save spectrum.dat spex;"

6. Miscellaneous scripts

The following two scripts look through a set of files with paired xy values but not necessarily the same number of x values in each file, extracts all the x values from all the files (script numero uno), then makes sure that all x values are present in all files (and zero-fills to make up for missing data). In short, it's a way of creating files that can be used to generate a matrix of values where all rows have the same number of fields.

The first script:
homogenise.sh
#!/bin/bash
for e in {0..200..10} {220..300..20}
   do
       cat $e.dat | gawk '{ printf("%s\t",$1) }'
       echo ""
   done

The second script:

makelist.py
#!/usr/bin/python2.6
import sys
infile=sys.argv[1]

f=open(infile,'r')
arr=[]

for line in f:
    line=line.rstrip('\n')
    line=line.split('\t')
    try:
        for i in line:
            i=float(i)
            if i not in arr:
                arr+=[i]
    except:
        arr=arr

f.close

arr.sort()
mylist=arr
last = mylist[-1]
for i in range(len(mylist)-2, -1, -1):
    if last == mylist[i]:
        del mylist[i]
    else:
        last = mylist[i]

voltages=[0,10,20,30,40,50,60,70,80,90,100,110,120,130,140,150,160,170,180,190,200,220,240,260,280,300]
myys=[0]*len(mylist)

for n in voltages:
    print "voltage: ',n,'\n'
    f=open(str(n)+'.dat','r')
    g=open(str(n)+'full.dat','w')
    arrx=[]
    arry=[]

    for line in f:
        line=line.rstrip('\n')
        line=line.split(' ')
        try:
            line[0]=float(line[0])
            line[1]=float(line[1])
            arrx+=[line[0]]
            arry+=[line[1]]

        except:
            #do nothing
            print "fail"

    for i in range(0,len(arrx)-1):

        try:
            myys[mylist.index(arrx[i])]=arry[i]
        except:
            a=0

    for i in range(0,len(myys)-1):
        g.write(str(mylist[i])+'\t'+str(myys[i])+'\n')

    f.close
    g.close

Pages