13 February 2013

338. Annotating PDFs in linux -- revisited. Still no obvious solution

Update 28/02/2014: I've had a look at I, Librarian and Master PDF Editor here: http://verahill.blogspot.com.au/2014/02/558-more-options-for-pdf-annotation.html

There are a few main reasons why even platform-agnostic people have trouble moving to linux. One of them is poor compatibility of open solutions with MS Office (incompatibility between MS Equations and libre/openoffice is my main gripe, and some may find the lack of native EndNote an issue), and one quite particular to academics is the lack of proper pdf annotation (traditionally you edit your galley proofs by making annotations on the pdf -- my latest paper from Wiley came back with a doc file though, which was a bit surprising, but promising).

I've gotten by in the past by using 'pdf x-change viewer', which is a windows program, under wine. However, there are a lot of things I don't like about that solution, and the search for a native solution has continued to the point where I'm willing to throw money at it.

Why the exercise is unfair
The problem is that our definition of what 'works' and what doesn't is based on what we expect it to look like -- and that's based on our experience. I have colleagues who consider libreoffice 'crap' because it behaves or looks different from MS Office. While I agree that it's not viable as a replacement for Office when collaborating with Office users who refuse to use libreoffice, it works fine if libreoffice is all you use. Anyway, it's unfair. Same goes for pdf annotation -- we expect it to look and function like adobe acrobat, simply because that's what we're used to. Any deviation is akin to a bug.

So keep that in mind when I write off some of the alternatives that actually do work -- just not in the way I expect them to.

I've found a lot of people (online) swearing by e.g. Xournal, so even if it doesn't work for me, you may decide differently.

Commercial software in academia

I don't normally like buying software which is critical to my work. There are many reasons for it, and cost is only a minor one (although I don't like spending taxpayer money on overpriced software).

Instead, if the software is critical to the science that I'm doing, I prefer to write my own algorithms in octave or python, and if it's a bit more peripheral I'm still wary of becoming reliant on a piece of software that may one day disappear -- either because I'm forced to upgrade through planned obsolescence, or because the company goes bankrupt/discontinues the software without releasing it as open source.

I cannot recommend Mendeley. And that's for two reasons:
1. it installs a list file in /etc/apt/sources.list.d/ without asking. That's the way malware (and google...) behaves!
2. There's plenty of mention about how it's 'free', but the free version is very restrictive, and if you create a private group you're stuck with a nagging message saying that you "must" upgrade if you want to add more users/groups. A simple one-time message would suffice.

Maybe I'm overreacting, but I don't appreciate this behaviour at all. Beyond that the annotation function works fine, so you decide for yourself.

Anyway, here's a short list of programs I've considered:
(FOSS = Free and Open Source Software)



Evince
Evince is FOSS and now (since when?) supports adding annotations. It doesn't work very well (it's slow and sometimes doesn't save annotations), and you can't delete annotations, so be careful what you save.


pdfedit
pdfedit is another standard linux package. It does highlighting well, but annotates by adding text on top of the document -- not as a sticky note. FOSS.


flpsed
Does annotation, but as text superimposed on the pdf document, not as a collapsible sticky note. FOSS.

Qoppa PDF Studio
This is java based, and runs on Linux. I must say up front that there's one aspect of it that I really don't like: it's available in a Standard and a Professional version. That's the kind of artificial crippling of software that Microsoft likes to engage in, but I thought we were beyond that on Linux... The price, $89, is too steep for something that I'd only use for annotations. Note also that the trial version puts a big nasty watermark over everything -- but you can hardly fault them for that, since it's not free. Closed source. Commercial.
Other than that it works, although it's not as pretty as Mendeley.



Xournal
Xournal can export annotated PDFs, but it doesn't do annotation in the same way as the other programs, i.e. using sticky notes. Instead you simply add text on top of the pdf, and it doesn't really do it for me. FOSS.
Xournal -- the annotations are not easy to spot


Whyteboard
It draws on top of PDF using imagemagick. FOSS.

Mendeley
While it's meant mainly for reference management, this does proper PDF annotation as well, and is platform agnostic. However, it is closed source, requires you to log in (even if you're using the desktop client) every time you use it, and needs you to explicitly keep documents (at least their titles etc.) out of the shared web catalogue. Other than that the pdf annotation works beautifully. I get really annoyed by the requirement to log in even when working offline, however. It's free in the sense of gratis -- but only up to 100 MB of shared document space, you can only have one private group, and it can only have two members (+you). If you want more you need to pay (see e.g. here).
In practical terms, it seems to use GMT to time stamp annotations and I haven't found an obvious way of changing that (without going online). Also, it installs a file into /etc/apt/sources.list.d/ without asking.


Misc
Another 'solution' that pops up is Okular, which works in a roundabout way -- i.e. the annotations aren't stored as part of the pdf. Again, it looks pretty, but the annotations are not exported with the pdf.
UPDATE: Note that this doesn't seem to be an issue anymore -- see comments below and this post: https://groakat.wordpress.com/2013/08/27/annotating-pdf-with-okular/ -- note that the new version is NOT in either wheezy or jessie i.e. they won't work.



I tried FoxIt reader as well which claims to do annotation but doesn't on Linux -- and the windows version is not functional under linux/wine.

PDF X-change viewer/wine pops up so often as a suggested solution that it's beginning to look like spam. It does work though.


An online only option is http://www.pdfescape.com -- but the paranoid part of me doesn't like the idea of uploading documents that are meant to be private.

12 February 2013

337. Modifying Nwchem 6.1.1 to work with GabEdit

Karol Strutynski left the following comment on a post about NWChem and Gabedit:

Hello,
I have one important comment:
The vectors coefficients in the nwchem output are incomplete!
The default behaviour of nwchem is to print 10 first coefficients with value bigger than 0.15. For systems with many atoms it is not enough, usually its not even close.

This behaviour is hard-coded in the nwchem source.
To change this you must search each instance of movecs_print_anal in the source code and replace 0.15d0 for smaller value in appropriate calls.
Furthermore you must change one loop in the src/ddscf/movecs_pr_anal.F file and around 200 line there will be loop:
do klo = 0, min(n-1,9), 2
You must increase the range of this loop, for something more reasonable like:
do klo = 0, min(n-1,199), 2

After recompiling the nwchem will print more coefficients and the gabedit will produce more reliable orbitals.

Best regards,
Karol Strutynski

So let's modify NWChem. I'll be modifying the 27th of June release of NWChem 6.1.1, which you'll obtain as Nwchem-6.1.1-src.2012-06-27.tar.gz from http://www.nwchem-sw.org/index.php/Download.


Change the threshold 0.15d0 (shown in red in the original post) to something smaller (I tried 0.01d0) in the movecs_print_anal calls in the following files:
 /src/ddscf/uhf.F
 146  9611    continue
 147          call movecs_print_anal(basis, ilo, ihi, 0.15d0, g_movecs,
 148      $        'UHF Final Alpha Molecular Orbital Analysis',
 149      $        .true., dbl_mb(k_eval), oadapt, int_mb(k_irs),
 150      $        .true., dbl_mb(k_occ))
 151          call movecs_print_anal(basis, ilo, ihi, 0.15d0, g_movecs(2),
 152      $        'UHF Final Beta Molecular Orbital Analysis',
 153      $        .true., dbl_mb(k_eval+nbf), oadapt, int_mb(k_irs+nmo),
 154      $        .true., dbl_mb(k_occ+nbf))

/src/ddscf/scf_vec_guess.F
506          if (scftype.eq.'RHF' .or. scftype.eq.'ROHF') then
507             call movecs_print_anal(basis, 1,
508      &           nprint, 0.15d0, g_movecs,
509      &           'ROHF Initial Molecular Orbital Analysis',
510      &           .true., dbl_mb(k_eval), oadapt, int_mb(k_irs),
511      &           .true., dbl_mb(k_occ))
512          else
513             nprint = min(nalpha+20,nmo)
514             call movecs_print_anal(basis, max(1,nbeta-20),
515      &           nprint, 0.15d0, g_movecs,
516      &           'UHF Initial Alpha Molecular Orbital Analysis',
517      &           .true., dbl_mb(k_eval), oadapt, int_mb(k_irs),
518      &           .true., dbl_mb(k_occ))
519             call movecs_print_anal(basis, max(1,nbeta-20),
520      &           nprint, 0.15d0, g_movecs(2),
521      &           'UHF Initial Beta Molecular Orbital Analysis',
522      &           .true., dbl_mb(k_eval+nbf), oadapt, int_mb(k_irs+nmo),
523      &           .true., dbl_mb(k_occ+nbf))

/src/ddscf/rohf.F
155          endif
156          call movecs_print_anal(basis, ilo, ihi, 0.15d0, g_movecs,
157      $        'ROHF Final Molecular Orbital Analysis',
158      $        .true., dbl_mb(k_eval), oadapt, int_mb(k_irs),
159      $        .true., dbl_mb(k_occ))

/src/mcscf/mcscf.F
680       if (util_print('final vectors analysis', print_default))
681      $     call movecs_print_anal(basis,
682      $     max(1,nclosed-10), min(nbf,nclosed+nact+10),
683      $     0.15d0, g_movecs, 'Analysis of MCSCF natural orbitals',
684      $     .true., dbl_mb(k_evals), .true., int_mb(k_sym),
685      $     .true., dbl_mb(k_occ))
686 c

/src/nwdft/scf_dft_cg/dft_cg_solve.F
166           call movecs_fix_phase(g_movecs(ispin))
167           call movecs_print_anal(basis, ilo, ihi, 0.15d0,
168      &         g_movecs(ispin),blob,
169      &         .true., dbl_mb(k_eval+(ispin-1)*nbf),
170      &         oadapt, int_mb(k_irs+(ispin-1)*nbf),
171      &         .true., dbl_mb(k_occ+(ispin-1)*nbf))
172         enddo

/src/nwdft/scf_dft/dft_scf.F
1736             call movecs_print_anal(ao_bas_han, ilo, ihi, 0.15d0,
1737      &           g_movecs(ispin),
1738      &           blob,
1739      &           .true., dbl_mb(k_eval(ispin)), oadapt,
1740      &           int_mb(k_ir+(ispin-1)*nbf_ao),
1741      &           .true., dbl_mb(k_occ+(ispin-1)*nbf_ao))

/src/nwdft/scf_dft/dft_mxspin_ovlp.F
186       call movecs_print_anal(basis,int_mb(k_non),int_mb(k_non)
187      & ,0.15d0,g_alpha,'Alpha Orbitals without Beta Partners',
188      &   .false., 0.0 ,.false., 0 , .false., 0 )
189 c
190       if (nct.GE.2) then
191       do i = 2,nct
192       ind = int_mb(k_non+i-1)
193       call movecs_print_anal(basis,ind,ind
194      & ,0.15d0,g_alpha,' ',
195      &   .false., 0.0 ,.false., 0 , .false., 0 )
196       enddo
197       endif

352 c
353        call movecs_print_anal(basis, 1, nalp, 0.15d0, g_ualpha,
354      & 'Alpha Orb. w/o Beta Partners (after maxim. alpha/beta overlap)',
355      &   .false., 0.0 ,.false., 0 , .false., 0 )
356 c
Otherwise one could presumably edit the header in ./src/ddscf/movecs_pr_anal.F directly and substitute thresh. At a minimum you should edit that file according to Karol's instructions: change the 9 in min(n-1,9) below (shown in red in the original post) to e.g. 199.

/src/ddscf/movecs_pr_anal.F
198             do klo = 0, min(n-1,9), 2
199                khi = min(klo+1,n-1)
200                write(LuOut,2) (
201      $              int_mb(k_list+k)+1,
202      $              dbl_mb(k_vecs+int_mb(k_list+k)),
203      $              (byte_mb(k_tags+int_mb(k_list+k)*16+m),m=0,15),
204      $              k = klo,khi)
205  2             format(1x,2(i5,2x,f12.6,2x,16a1,4x))
206             enddo
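
The two edits described above can also be scripted. What follows is only a sketch, assuming GNU sed and that you run it from the top of the unpacked NWChem source; the helper names lower_print_threshold and widen_coeff_loop are made up for this post and are not part of NWChem. Note that the first helper replaces every 0.15d0 in the files you hand it, so have a look at the grep output before trusting the result. Also, these are fixed-form Fortran files, so keep the new threshold the same width as the old one (0.01d0 is) to avoid pushing a line past column 72.

```shell
# Sketch only: both helper names are hypothetical, not part of NWChem.

# Replace the print threshold 0.15d0 with a new value in the given files.
# Usage: lower_print_threshold 0.01d0 src/ddscf/uhf.F src/ddscf/rohf.F ...
lower_print_threshold() {
    lpt_new="$1"
    shift
    # list the lines that are about to change
    grep -n '0\.15d0' "$@" || echo "no 0.15d0 found"
    sed -i "s/0\.15d0/${lpt_new}/g" "$@"
}

# Raise the upper limit of the coefficient print loop, i.e. turn
#   do klo = 0, min(n-1,9), 2
# into
#   do klo = 0, min(n-1,199), 2
# Usage: widen_coeff_loop 199 src/ddscf/movecs_pr_anal.F
widen_coeff_loop() {
    wcl_limit="$1"
    wcl_file="$2"
    sed -i "s/\(do klo = 0, min(n-1,\)9\(), 2\)/\1${wcl_limit}\2/" "$wcl_file"
}
```

Running grep -rn 'movecs_print_anal' src/ beforehand is a quick way of checking that the list of files above is complete for your version of the source.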

Compilation
At this point you should be able to follow post 242 (Briefly: Compiling NWChem 6.1.1 with Python on Debian Testing (Wheezy)) and compile nwchem with python etc. Don't forget to edit /src/config/makefile.h for python support as shown in that post. Once you're done with that you can compare the GabEdit plots with and without the modification.

Alternatively, if you're simply making changes to a copy of nwchem that you've compiled before, you can speed things up by a factor of ca 300 by following this post:
http://verahill.blogspot.com.au/2013/04/380-modifying-nwchem-code-without-full.html



The difference:
I ran a job on benzene as described in post 281 (Visualising NWChem output with GabEdit). I chose to use the ELF (electron localisation function) on output from the unmodified and modified nwchem binaries. It's a pretty big difference:

Original

Modified

08 February 2013

336. Compiling ATLAS, netblas, lapack and openblas on Arch Linux

Here's another Arch post.

I was a bit surprised to find that there's no ATLAS in the standard Arch repositories (it is in AUR though), so here's how to build some of the more common math libraries for yourself:


ATLAS

sudo pacman -S wget base-devel gcc-fortran cpupower
sudo systemctl enable cpupower

To build ATLAS you should set the governor for your CPU to performance to get the best optimization:

sudo cpupower frequency-set -g performance
sudo cp /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor /sys/devices/system/cpu/cpu1/cpufreq/scaling_governor

Basically copy the scaling_governor to all cpus (cpu0, cpu1, cpu2 ...) as shown in the last line above. When you set the governor back to e.g. ondemand, follow the same steps.
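
Copying the file one cpu at a time gets tedious on a machine with many cores, so here is a sketch of the same thing as a loop. The helper name set_governor is made up, it is not part of cpupower, and it has to be run as root when it writes to the real /sys tree. The optional second argument is only there so that you can try the loop on a scratch directory first.

```shell
# Sketch only: set_governor is a hypothetical helper, not part of cpupower.
# Usage (as root): set_governor performance
# The second argument is the sysfs cpu directory and defaults to
# /sys/devices/system/cpu.
set_governor() {
    sg_gov="$1"
    sg_root="${2:-/sys/devices/system/cpu}"
    for sg_file in "$sg_root"/cpu[0-9]*/cpufreq/scaling_governor; do
        # skip the unexpanded pattern if no cpu exposes cpufreq
        [ -e "$sg_file" ] || continue
        echo "$sg_gov" > "$sg_file"
    done
}
```

Afterwards, cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor should show the same governor for every core.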

sudo mkdir /opt/ATLAS
sudo chown ${USER} /opt/ATLAS
mkdir -p ~/tmp/atlas
cd ~/tmp/atlas
wget http://www.netlib.org/lapack/lapack-3.4.2.tgz
wget http://downloads.sourceforge.net/project/math-atlas/Stable/3.10.1/atlas3.10.1.tar.bz2
tar xvf atlas3.10.1.tar.bz2
mkdir build/
cd build/
../ATLAS/./configure --prefix=/opt/ATLAS -Fa alg '-fPIC' --with-netlib-lapack-tarfile=$HOME/tmp/atlas/lapack-3.4.2.tgz --shared
make
make install

Simple as that. You can now change the governor back:
sudo cpupower frequency-set -g ondemand
sudo cp /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor /sys/devices/system/cpu/cpu1/cpufreq/scaling_governor
sudo cp /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor /sys/devices/system/cpu/cpu2/cpufreq/scaling_governor
...


netlib BLAS and lapack
sudo pacman -S wget base-devel gcc-fortran cmake
sudo mkdir /opt/netlib
sudo chown $USER /opt/netlib
mkdir /opt/netlib/blas/lib -p
mkdir -p ~/tmp/blas
cd ~/tmp/blas
wget http://www.netlib.org/blas/blas.tgz
tar xvf blas.tgz
cd BLAS/

Edit make.inc and set OPTS as follows:
OPTS = -O3 -shared -m64 -march=native -fPIC
make all
gfortran -shared -Wl,-soname,libnetblas.so -o libblas.so.1.0.1 *.o -lc
ln -s libblas.so.1.0.1 libnetblas.so
cp lib*blas* /opt/netlib/blas/lib
cd ../
wget http://www.netlib.org/lapack/lapack-3.4.2.tgz
tar xvf lapack-3.4.2.tgz
mkdir /opt/netlib/lapack
mkdir build/
cd build/
ccmake ../lapack-3.4.2/ -DCMAKE_INSTALL_PREFIX=/opt/netlib/lapack -DBUILD_SHARED_LIBS=ON -DUSE_OPTIMIZED_BLAS=ON 

Hit c twice to configure, then g to generate.
Edit CMakeCache.txt and add the following lines at the beginning:
########################
# EXTERNAL cache entries
########################
 BLAS_FOUND:STRING=TRUE
 BLAS_GENERIC_FOUND:BOOL=TRUE
 BLAS_GENERIC_blas_LIBRARY:FILEPATH=/opt/netlib/blas/lib/libnetblas.so
 BLAS_LIBRARIES:PATH=/opt/netlib/blas/lib/libnetblas.so
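
If you'd rather not open CMakeCache.txt in an editor, the manual edit above can be mimicked with a few lines of shell. This is only a sketch: prepend_blas_entries is a made-up helper name, the default library path is the one used in this post, and all it does is put the four entries at the top of the file.

```shell
# Sketch only: prepend_blas_entries is a hypothetical helper.
# Usage: prepend_blas_entries CMakeCache.txt [path/to/libnetblas.so]
prepend_blas_entries() {
    pbe_cache="$1"
    pbe_lib="${2:-/opt/netlib/blas/lib/libnetblas.so}"
    pbe_tmp="$(mktemp)"
    {
        # the four entries from the post, followed by the existing cache
        echo "BLAS_FOUND:STRING=TRUE"
        echo "BLAS_GENERIC_FOUND:BOOL=TRUE"
        echo "BLAS_GENERIC_blas_LIBRARY:FILEPATH=${pbe_lib}"
        echo "BLAS_LIBRARIES:PATH=${pbe_lib}"
        cat "$pbe_cache"
    } > "$pbe_tmp"
    # write back in place so the file keeps its original permissions
    cat "$pbe_tmp" > "$pbe_cache" && rm -f "$pbe_tmp"
}
```

Run it from the build/ directory once ccmake has generated the cache, then carry on with the steps in the post.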

Do
ccmake ../lapack-3.4.2/

again, then hit c once, then g.
Next,
make
make install

Done.

Openblas
Copied from here: http://verahill.blogspot.com.au/2013/02/334-compiling-nwchem-with-openmpi-and.html

Download from http://github.com/xianyi/OpenBLAS/tarball/v0.1.1

sudo pacman -S wget base-devel gcc-fortran
sudo mkdir /opt/openblas
sudo chown $USER /opt/openblas
tar xvf xianyi-OpenBLAS-v0.1.1-0-g5b7f443.tar.gz
cd xianyi-OpenBLAS-e6e87a2/
make all BINARY=64 CC=/usr/bin/gcc FC=/usr/bin/gfortran USE_THREAD=0 INTERFACE64=1 1> make.log 2>make.err
make PREFIX=/opt/openblas install
cp lib*.* /opt/openblas/lib