Showing posts with label sed. Show all posts
Showing posts with label sed. Show all posts

03 March 2013

352. Installing coreutils and sed, gawk on Windows (without cygwin)

My third Windows XP post. Only a few more to go...

You can also set up all these tools via cygwin: http://verahill.blogspot.com.au/2013/03/353-cygwin-with-octave-and-gnuplot-on.html

1. coreutils
coreutils covers a fair number of the basic linux file tools. In particular cat, paste and join are of interest with respect to data processing.

A. Go to http://sourceforge.net/projects/gnuwin32/files/coreutils/5.3.0/coreutils-5.3.0.exe/download?use_mirror=waix&download= and download coreutils-5.3.0.exe. Run the file.

B. Make sure to add C:\Program Files\GnuWin32\bin to the PATH:
* Right-click on my computer, select properties, Advanced.
* Click on Environment variables
* In the second box, 'system variables', highlight 'Path' and click on 'Edit'. In the variable value box, add ';C:\Program Files\GnuWin32\bin'

C. Go to the start meny, click on Run..., and type in 'cmd'
In the (fake) DOS window, type e.g. ls -lah to see if everything works.

2. gawk
Go to http://gnuwin32.sourceforge.net/packages/gawk.htm and open http://downloads.sourceforge.net/project/gnuwin32/gawk/3.1.6-1/gawk-3.1.6-1-setup.exe

Run the .exe and use the default installation location (which is the same as the coreutils above -- so you don't have to add a separate path). Run cmd and do e.g.

cat ntuser.ini|gawk "{print $2}"
S e t t i n g s ; T e m p o r a r y
3. sed
Sed is installed like gawk -- go to http://gnuwin32.sourceforge.net/packages/sed.htm and go to http://gnuwin32.sourceforge.net/downlinks/sed.php

Run the .exe file and install in the default location. Try e.g.

cat ntuser.ini|sed "s/e/E/g"

06 June 2012

177. Jerry-rigging g09 UV/VIS spectra in gnuplot and/or octave

EDIT: I had a nicer post with lots of figures before. Because I realised that the data is good enough to be included in a future paper we're working on, I had to take everything down again. All the data in the plots now is made up (hence 'fakeuv.dat'), and I haven't made the plots look nice.

I don't like proprietary formats for anything. They never, ever benefit anyone other than the software vendor.

Almost as bad as using binary proprietary formats is not providing export facilities to ascii formats.

I may have missed it, but I was using gaussview to look at td-dft calculated uv/vis spectra -- and couldn't find a way of exporting the data. Sure, I could export the graph as a png, svg etc. file. But not double column tab-separated ascii file.

There's a bit of fudging in what I'm doing  -- I'll be the first one to admit that.

So here's single line to export the wavelengths and intensities:
cat g03.g03out|grep Excited|grep -v singles|sed 's/=/\t/g'|gawk '{print $7,$10}'>uvvis.dat

You can plot them in gnuplot using
plot 'uvvis.dat' u 1:2 w impulse

The problem is that these are just spikes -- not the smooth uv/vis like spectra we're used to. On the other hand, if I understand things correctly, this is the REAL data, while the smoothed uv/vis spectrum above is more for presentation purposes. I might obviously be wrong, and I am by no stretch a computational or theoretical chemist - I just like their tools.

We've got an immensely powerful tool at our hands: Octave!
data=load('fakeuv.dat');
gauss= @(x,c,r,s) r.*1./(s.*sqrt(2*pi)).*exp(-0.5*((x-c)./s).^2)
x=linspace(250,850,600);
plot(x,cumsum(gauss(x,data(:,1),data(:,2),20)))

where 20 is an abitrary value. Anyway, this is how it looks:
We can try s=30 instead:

We export it
outdata=cumsum(gauss(x,data(:,1),data(:,2),30));
exportdata=[x' outdata'];
save 'uvvis2.sim' exportdata
and plot it in gnuplot
plot 'uvvis2.sim' u 1:48 w lines
It might not look like the UV/VIS spectrum you're used to, but as I said in the beginning, the data's all made up -- using 'real' calculated data I got a beautiful spectrum.

20 July 2011

7. Processing 1D Bruker nmr data

Bruker 1D binary NMR files can be processed using a combination of cat, grep, sed, gawk and od, together with python and octave (w/ octave-optim) for some fancy line-fitting.

 brukdig2asc:
 #!/bin/bash
#usage: brukdig2asc
SW=`cat acqus | grep 'SW_h' | sed 's/\=/\t/g' | gawk '{print $2}'| tr -d '\n'`
TD=`cat acqus | grep 'TD=' | sed 's/\=/\t/g' | gawk '{print $2}'| tr -d '\n'`
O=`cat acqus | grep '$O1=' | sed 's/\=/\t/g' | gawk '{print $2}'`
SFO=`cat acqus | grep 'SFO1=' | sed 's/\=/\t/g' | gawk '{print $2}'`
#TIME=16384
#SWEEP=23809.5238095238
#AQ=`echo "1/(23809.5238095238/(16384/2))" | bc -lq`
cp fid fid.bin
ls fid.bin | cpio -o | cpio -i --swap -u
od -An -t dI -v -w8 fid.bin| gawk '{print NR,$1,$2}'| sed '1,64d' >fid.asc1
pynmr $SW $TD $O $SFO
makespec

pynmr:
#!/usr/bin/python2.6
import sys
#print str(sys.argv)
sweepwidth=float(sys.argv[1])
nopts=int(sys.argv[2])
centrefreq=float(sys.argv[3])
basefreq=float(sys.argv[4])

aq=1/(sweepwidth/(nopts/2))
#print str(sweepwidth),str(nopts)
f=open('fid.asc1','r')
g=open('fid.asc','w')
for line in f:
    line=line.rstrip('\n')
    line=line.split(' ')
#    print line
    freq=float(line[0])/(nopts/2)*sweepwidth+(centrefreq-sweepwidth/2)
    line[0]=(float(line[0])/(nopts/2))*aq
    g.write(str(line[0])+'\t'+str(line[1])+'\t'+str(line[2])+'\t'+str(freq)+'\n')
f.close
g.close

makespec:

#!/bin/bash
octave --silent --eval "fid=load('fid.asc');
#make xaxis
[nopts b]=size(fid);
aq=max(fid(:,1));
sw=nopts/aq;
freqx=linspace(0,sw,nopts)';

#apodizing
lb=5/10000;
fid(:,2)=fid(:,2).*exp(-lb.*freqx);
fid(:,3)=fid(:,3).*exp(-lb.*freqx);

#phasing
spec=[fid(:,1) real(fftshift(fft(fid(:,2)+i*fid(:,3)))) imag(fftshift(fft(fid(:,2)+i*fid(:,3))))];
[a b]=size(spec); spec(a/2,2:3)=[0 0];
phc=linspace(0,2*pi,180);
maxsig=0;k=1;
for n=1:180;
        localmax=max( real( (spec(:,2)+i*spec(:,3)).*exp(i*phc(n)) ));
        if (localmax>maxsig)
                maxsig=localmax;
                k=n;
        endif
endfor;
#simple baseline
absd=inline('m+t*0','t','m');
guess=0;
[f m kvg iter corp covp covr stdresid z r2]=leasqr(fid(:,4),real((spec(:,2)+i*spec(:,3)).*exp(i*phc(k))),guess,absd);
#disp(m)
#disp(sqrt(diag(covp)))

#make spectrum
spectrum=[fid(:,4) real((spec(:,2)+i*spec(:,3)).*exp(i*phc(k)))-m imag((spec(:,2)+i*spec(:,3)).*exp(i*(phc(k)+pi/2)))-m];

#fitting
pkg load optim
[a b]=max(spectrum(:,2));
centre=fid(b,4);
guess=[10 max(spec(:,2))]; #centre width height
#disp(guess)
lorentzian=inline('p(2)*(1/pi)*(p(1)/2)./((t-centre).^2+(0.5*p(1))^2)','t','p');
[f p r2]=leasqr(fid((b-150):(b+150),4),spectrum((b-150):(b+150),3),guess,lorentzian);

#filter out artefacts from fitting set
filtered=[0 0];
res=floor((max(fid(:,4))-min(fid(:,4)))/nopts);
for l=(b-ceil(5*p(1)/res)):(b+5*ceil(p(1))/res)
delta=lorentzian(fid(l,4),p)-spectrum(l,2);

if (delta>(lorentzian(fid(l,4),p))/1.2)
# do nothing
else
filtered=[filtered; fid(l,4) spectrum(l,2)];
endif
endfor

filtered=[ filtered(2:size(filtered(:,2)),1) filtered(2:(size(filtered(:,2))),2)  ];
[f p r2]=leasqr(filtered(:,1),filtered(:,2),p,lorentzian);

#disp(p')
#disp(r2)
params=[centre centre/67.8 max(lorentzian(fid(:,4),p)) p(1) 1.000 p(2)];
disp(params)
#save
spex=[fid(:,4) real((spec(:,2)+i*spec(:,3)).*exp(i*phc(k)))-m imag((spec(:,2)+i*spec(:,3)).*exp(i*(phc(k)+pi/2)))-m lorentzian(fid(:,4),p)];
save spectrum.dat spex;"