20 September 2012

243. My own personal benchmarks for NWChem, gromacs with atlas, openblas, acml on AMD and intel

Update: you can compile against acml on intel as well, and against mkl on amd. Still need to do some performance testing to see how well it works. The artificial penalty of running mkl on AMD is well-publicised and led to a lawsuit, but I don't know how acml performs on mkl.


The title says it all, really. Since I'm back to exploring ways of improving performance for my little cluster I figured I'd break this out as a separate post. Most of this data was found here before: http://verahill.blogspot.com.au/2012/09/new-compute-node-using-amd-fx-8150.html

All units are running up-to-date debian testing (wheezy).

Configuration:
Boron (B): Phenom II X6 2.8 GHz, 8Gb RAM (2.8*6=16.8 GFLOPS predicted)
Neon (Ne): FX-8150 X8 3.6 GHz, 16 Gb RAM (3.6*8=28.8 GFLOPS predicted (int), 3.6*4=14.4 GFLOPS (fpu))
Tantalum (Ta): Quadcore i5-2400 3.1 GHz, 8 Gb RAM (3.1*4=12.4 GFLOPS predicted)
Vanadium (V):  Dual socket 2x Quadcore Xeon X3480 3.06 GHz, 8Gb. CentOS (ROCKS 5.4.3)/openblas.

Results

Gromacs --double (1 ns 6x6x6 nm tip4p water box; dynamic load balancing, double precision, 500k steps)
B  :  10.662 ns/day (11.8  GFLOPS, runtime 8104 seconds)***
B  :    9.921 ns/day ( 10.9 GFLOPS, runtime 8709 seconds)**
Ne:  10.606 ns/day (11.7  GFLOPS, runtime 8146 seconds) *
Ne:  12.375 ns/day (13.7  GFLOPS, runtime 6982 seconds)**
Ne:  12.385 ns/day (13.7  GFLOPS, runtime 6976 seconds)****
Ta:  10.825 ns/day (11.9  GFLOPS, runtime 7981 seconds)***
V :   10.560 ns/dat (11.7  GFLOPS, runtime 8182 seconds)***
*no external blas/lapack.
**using ACML libs
*** using openblas
**** using ATLAS

Gromacs --single (1 ns 6x6x6 nm tip4p water box; dynamic load balancing, single precision, 500 k steps)
B  :   17.251 ns/day (19.0 GFLOPS, runtime 5008 seconds)***
Ne:   21.874 ns/day (24.2 GFLOPS, runtime  3950 seconds)**
Ne:   21.804 ns/day (24.1 GFLOPS, runtime 3963  seconds)****
Ta:   17.345 ns/day (19.2 GFLOPS, runtime  4982 seconds)***
V :   17.297 ns/day (19.1 GFLOPS, runtime 4995 seconds)***
*no external blas/lapack.
**using ACML libs
*** using openblas
**** using ATLAS

NWChem (opt biphenyl cation, cp-md/pspw):
B  :   5951 seconds**
B  :   4084 seconds ***
B  :   5782 seconds ***xy
Ne:    3689 seconds**
Ta :   4102 seconds***
Ta :   4230 seconds***xy
V :    5396 seconds***

*no external blas/lapack.
**using ACML libs
*** using openblas
x Reconfigured using getmem.nwchem

NWChem (opt biphenyl cation, geovib, 6-31G**/ub3lyp):
B  :  2841 seconds **
B  :  2410 seconds***
B  :  2101 seconds ***x
B  :  2196 seconds ***xy
Ne: 1665 seconds **
Ta : 1785 seconds***
Ta : 1789 seconds***xy
V  : 2600 seconds***

*no external blas/lapack.
**using ACML libs
*** using openblas
x Reconfigured using getmem.nwchem
y NWChem 6.1.1

A Certain Commercial Ab Initio Package (Freq calc of pre-optimised H14C19O3 at 6-31+G*/rb3lyp):
B  :    2h 00 min (CPU time 10h 37 min)
Ne:   1h 37 min (CPU time: 11h 13 min)
Ta:   1h 26 min (CPU time: 5h 27 min)
V  :   2h 15 min (CPU time 15h 50 min)
Using precompiled binaries.


Gamess:
(I'm still working on learning how to run gamess efficiently, so take these values with a huge saucer of salt for now). bn.inp does a geometry optimisation of a biphenyl cation (mult 2) at ub3lyp/6-31G**. bn.inp has no $STATPT card while bn3.inp does and it makes a huge difference -- but is this because it does 20 steps (nsteps=20), then kills the run? The default is 50 steps and it does seem like all the runs do the maximum number of steps, then exit.

 Again, still learning. See below for input files. Will fix this post as I learn what the heck I'm doing. The relative run times on each node are still comparable though, but just don't use the numbers to compare the run speed of e.g. nwchem vs gamess.

Gamess using bn.inp with atlas
B:    9079 seconds
Ne: 7252 seconds
Ta:  9283 seconds

Gamess using bn.inp with openblas
B:   9071 seconds
Ta: 9297 seconds

Gamess using bn.inp with acml
Ne: 7062 seconds

Gamess using bn3.inp with atlas. 
B: 4016 seconds
Ne: 3162 seconds
Ta: 4114 seconds

MPQC:
Here I've used the version in the debian repos. I've created a hostfile
neon slots=8 max_slots=8
tantalum slots=4 max_slots=4
boron slots=6 max_slots=6

and then just looked at changing the order and slots assignment as well as total number of cores assigned using mpirun.

Simple test case looking at number of cores/distribution:
n cores:  Seconds: Configuration(cores,exec nodes)
4    :   11   : 4(Ta)
4    :   17   : 4(Ne)
4    :   17   : 4(B)
4    :   42   : 2(Ta)+2(B)
6    :   12   : 6(B)
6    :   13   : 6(Ne)
6    :   74   : 2(Ta)+2(B)+2(Ne)
8    :   12   : 8(Ne)
10  :   43   : 4(Ta)+6(B)
12  :   47   : 4(Ta)+8(Ne)
14  :   55   : 6(B)+8(Ne)
18  :   170 :  4(Ta)+6(B)+8(Ne)

My beowulf cluster doesn't seem to be much of a super computer. All in all, this looks like a pretty good argument in favour of upgrading to infiniband...


bn.inp:
 $CONTRL 
COORD=CART UNITS=ANGS scftyp=uhf dfttyp=b3lyp runtyp=optimize 
ICHARG=1 MULT=2 maxit=100
$END
 $system mwords=2000 $end
 $BASIS gbasis=n31 ngauss=6 ndfunc=1 npfunc=1 $END
 $guess guess=huckel $end

 $DATA
biphenyl
C1
C      6.0      0.0000000000   -3.5630100000    0.0000000000 
C      6.0     -1.1392700000   -2.8592800000   -0.3938400000 
C      6.0     -1.1387900000   -1.4654500000   -0.3941500000 
C      6.0      0.0000000000   -0.7428100000    0.0000000000 
C      6.0      1.1387900000   -1.4654500000    0.3941500000 
C      6.0      1.1392700000   -2.8592800000    0.3938400000 
C      6.0      0.0000000000    0.7428100000    0.0000000000 
C      6.0      1.1387900000    1.4654500000   -0.3941500000 
C      6.0      1.1392700000    2.8592800000   -0.3938400000 
C      6.0     -1.1387900000    1.4654500000    0.3941500000 
C      6.0      0.0000000000    3.5630100000    0.0000000000 
C      6.0     -1.1392700000    2.8592800000    0.3938400000 
H      1.0      0.0000000000   -4.6489600000    0.0000000000 
H      1.0     -2.0282700000   -3.3966200000   -0.7116100000 
H      1.0     -2.0214800000   -0.9282700000   -0.7279300000 
H      1.0      2.0282700000   -3.3966200000    0.7116100000 
H      1.0      2.0282700000    3.3966200000   -0.7116100000 
H      1.0     -2.0214800000    0.9282700000    0.7279300000 
H      1.0      0.0000000000    4.6489600000    0.0000000000 
H      1.0     -2.0282700000    3.3966200000    0.7116100000 
H      1.0      2.0214800000    0.9282700000   -0.7279300000 
H      1.0      2.0214800000   -0.9282700000    0.7279300000 
 $END


bn3.inp:
$CONTRL 
COORD=CART UNITS=ANGS scftyp=uhf dfttyp=b3lyp runtyp=optimize 
ICHARG=1 MULT=2 maxit=100
$END
 $system mwords=2000 $end
 $BASIS gbasis=n31 ngauss=6 ndfunc=1 npfunc=1 $END
 $STATPT OPTTOL=0.0001 NSTEP=20 HSSEND=.TRUE. $END
 $guess guess=huckel $end

 $DATA
biphenyl
C1
C      6.0      0.0000000000   -3.5630100000    0.0000000000 
C      6.0     -1.1392700000   -2.8592800000   -0.3938400000 
C      6.0     -1.1387900000   -1.4654500000   -0.3941500000 
C      6.0      0.0000000000   -0.7428100000    0.0000000000 
C      6.0      1.1387900000   -1.4654500000    0.3941500000 
C      6.0      1.1392700000   -2.8592800000    0.3938400000 
C      6.0      0.0000000000    0.7428100000    0.0000000000 
C      6.0      1.1387900000    1.4654500000   -0.3941500000 
C      6.0      1.1392700000    2.8592800000   -0.3938400000 
C      6.0     -1.1387900000    1.4654500000    0.3941500000 
C      6.0      0.0000000000    3.5630100000    0.0000000000 
C      6.0     -1.1392700000    2.8592800000    0.3938400000 
H      1.0      0.0000000000   -4.6489600000    0.0000000000 
H      1.0     -2.0282700000   -3.3966200000   -0.7116100000 
H      1.0     -2.0214800000   -0.9282700000   -0.7279300000 
H      1.0      2.0282700000   -3.3966200000    0.7116100000 
H      1.0      2.0282700000    3.3966200000   -0.7116100000 
H      1.0     -2.0214800000    0.9282700000    0.7279300000 
H      1.0      0.0000000000    4.6489600000    0.0000000000 
H      1.0     -2.0282700000    3.3966200000    0.7116100000 
H      1.0      2.0214800000    0.9282700000   -0.7279300000 
H      1.0      2.0214800000   -0.9282700000    0.7279300000 
 $END

19 September 2012

242. Briefly: Compiling NWChem 6.1.1 with Python on Debian Testing (Wheezy)

Back at the end of June a minor version of NWChem (bug fixes) was released.

There isn't much difference between compiling 6.1.1 and 6.1. Mainly, the difference is in what line to edit for python compatibility (NWChem 6.1 here:http://verahill.blogspot.com.au/2012/05/building-nwchem-61-on-debian.html )

1. Install dev packages
sudo apt-get install libopenmpi-dev openmpi-bin python2.7-dev zlib1g-dev libssl-dev

2. Compile openblas  or download e.g. acml.
(compiles fine, but doesn't run, with atlas)

3. NWchem goodness:

sudo mkdir /opt/nwchem
sudo chown $USER /opt/nwchem
cd /opt/nwchem
wget http://www.nwchem-sw.org/images/Nwchem-6.1.1-src.2012-06-27.tar.gz
(or go here http://www.nwchem-sw.org/index.php/Download)

tar xvf Nwchem-6.1.1-src.2012-06-27.tar.gz
cd nwchem-6.1.1-src/src/config/

Edit makefile.h and hange (line numbers are just for convenience -- don't add them)
1956 #   EXTRA_LIBS += -ltk -ltcl -L/usr/X11R6/lib -lX11 -ldl
1957      EXTRA_LIBS +=    -lnwcutil  -lpthread -lutil -ldl
1958   LDOPTIONS = -Wl,--export-dynamic

to
1956 #   EXTRA_LIBS += -ltk -ltcl -L/usr/X11R6/lib -lX11 -ldl
1957      EXTRA_LIBS +=    -lnwcutil  -lpthread -lutil -ldl -lssl -lz
1958   LDOPTIONS = -Wl,--export-dynamic

cd to /opt/nwchem/nwchem-6.1.1-src/ Create a file called buildconf.sh with the following content:
export LARGE_FILES=TRUE
export TCGRSH=/usr/bin/ssh
export NWCHEM_TOP=`pwd`
export NWCHEM_TARGET=LINUX64
export NWCHEM_MODULES="all python"
export PYTHONVERSION=2.7
export PYTHONHOME=/usr
export BLASOPT="-L/opt/openblas/lib -lopenblas"
#export BLASOPT="-L/opt/acml/acml5.2.0/gfortran64_int64/lib -lacml"
#export BLASOPT="-L/opt/ATLAS/lib -lsatlas -ltatlas"
export USE_MPI=y
export USE_MPIF=y
export USE_MPIF4=y
export MPI_LOC=/usr/lib/openmpi/lib
export MPI_INCLUDE=/usr/lib/openmpi/include
export LIBRARY_PATH=$LIBRARY_PATH:/usr/lib/openmpi/lib:/opt/openblas/lib
#export LIBRARY_PATH=$LIBRARY_PATH:/usr/lib/openmpi/lib:/opt/acml/acml5.2.0/gfortran64_int64/lib
#export LIBRARY_PATH=$LIBRARY_PATH:/usr/lib/openmpi/lib:/opt/ATLAS/lib
export LIBMPI="-lmpi -lopen-rte -lopen-pal -ldl -lmpi_f77 -lpthread"
cd $NWCHEM_TOP/src
make clean
make nwchem_config
make FC=gfortran 1> make.log 2>make.err
export FC=gfortran
cd ../contrib
./getmem.nwchem

Start the compilation:
time sh buildconf.sh 

 On a quadcore i5-2400 it took 18 minutes.

241. pKa, part 3: ccCA in NWChem. Doing something wrong?

First of all, I'm having problems reproducing the output from 'task ccca' by following the methods described in J. Chem. Theory Comput 2008, 4, 328-334 (scaling 0.9854) or Mol. Phys. 2009,107(8-12),1107-1121. The discrepancies are the energies reported for the MP2/cc-pVTZ-DK and CCSD(T)/cc-PVTZ which leads to a difference in calculated relativistic and correlation corrections. More about that at some other time.

Here's using ccCA in NWChem on acetic acid/acetate and formic acid/formate.
More about how it works in another post.

Basically, the way I am using it the results are very, very poor with  ccCA. All I can think is that I must be doing something wrong.


INPUT files 

Acetic acid input:

Title "aceticacid"
Start  aceticacid

echo

charge 0

geometry autosym units angstrom
 C     -0.312051     -1.36877     0.00000
 H     -0.929226     -1.55822     -0.878253
 H     -0.929226     -1.55822     0.878253
 H     0.548700     -2.02934     0.00000
 C     0.150590     0.0606620     0.00000
 O     -0.897092     0.922315     0.00000
 H     -0.521850     1.81528     0.00000
 O     1.29371     0.435169     0.00000
end

basis
* library "cc-pvtz"
end

dft
  mult 1
  direct
  noio
  XC b3lyp
  grid fine
  mulliken
end

driver
  default
end

ccca
  optimize
end

task ccca

Acetate input:

Title "acetate_ccca"
Start  acetate_ccca
echo
charge -1

geometry autosym units angstrom
 C     -0.0311237     -1.36218     0.00000
 H     0.501926     -1.73727     0.878691
 H     0.501926     -1.73727     -0.878691
 H     -1.05131     -1.75101     0.00000
 C     -0.00500996     0.204086     0.00000
 O     1.14247     0.706045     0.00000
 O     -1.12049     0.771493     0.00000
end

basis
* library "cc-pvtz"
end

dft
  mult 1
  direct
  noio
  XC b3lyp
  grid fine
  mulliken
end

driver
  default
end

ccca
   optimize
end

task ccca


Formic acid input:
Title "formicacid"
Start  formicacid

echo

charge 0

geometry autosym units angstrom
 C     0.410955     -0.132154     0.00000
 H     1.50430     -0.0475164     0.00000
 O     -0.134104     1.09718     0.00000
 H     -1.09846     0.988665     0.00000
 O     -0.203188     -1.15938     0.00000
end

basic
* library "cc-pvtz"
end

dft
  mult 1
  direct
  noio
  XC b3lyp
  grid fine
  mulliken
end

driver
end

ccca
   optimize
end

task ccca


Formate input:
Title "formate"
Start  formate

echo

charge -1

geometry autosym units angstrom
 C     0.00000     0.00000     0.329396
 H     0.00000     0.00000     1.47310
 O     -1.13532     0.00000     -0.189103
 O     1.13532     0.00000     -0.189103
end

basis "ao basis" spherical print
* library "cc-pvtz"
END

dft
  mult 1
  direct
  noio
  XC b3lyp
  grid fine
  mulliken
end

driver
end

ccca
   optimize
end

task ccca


OUTPUT 

Acetic acid
 Temperature                      =   298.15K
 frequency scaling parameter      =   0.9889

 Zero-Point correction to Energy  =   38.155 kcal/mol  (  0.060805 au)
 Thermal correction to Energy     =   41.060 kcal/mol  (  0.065434 au)
 Thermal correction to Enthalpy   =   41.653 kcal/mol  (  0.066378 au)

 Total Entropy                    =   69.467 cal/mol-K
   - Translational                =   38.179 cal/mol-K (mol. weight =  60.0211)    - Rotational                   =   23.830 cal/mol-K (symmetry #  =        1)
   - Vibrational                  =    7.458 cal/mol-K

 Cv (constant volume heat capacity) =   14.439 cal/mol-K
   - Translational                  =    2.979 cal/mol-K
   - Rotational                     =    2.979 cal/mol-K
   - Vibrational                    =    8.480 cal/mol-K


 ccCA: calculations done, now printing results
 
 ccCA-P  reference energy =   -228.82035086993045     
 ccCA-S3 reference energy =   -228.82800135561300     
 ccCA-S4 reference energy =   -228.82030423449530     
 ccCA-PS3 reference energy=   -228.82417611277174     
 DK correction            =  -0.13322049012506909     
 CCSD(T) correction       =   -4.8762979862686961E-002
 CV correction            =  -0.20936881324035994     
 ---------------------------
 Total ccCA-P   energy    =   -229.21170315315857     
 Total ccCA-S3  energy    =   -229.21935363884111     
 Total ccCA-S4  energy    =   -229.21165651772341     
 Total ccCA-PS3 energy    =   -229.21552839599985     
 
 Thermochemistry available:
            ZPE   =   6.0851792771826778E-002
 ccCA-P   E+ZPE   =  -229.15085136038675     
 ccCA-S3  E+ZPE   =  -229.15850184606930     
 ccCA-S4  E+ZPE   =  -229.15080472495160     
 ccCA-PS3 E+ZPE   =  -229.15467660322804     
 Wrote ccCA-P    energy to the RTDB 
 Leaving ccCA module...

 Task  times  cpu:     5565.6s     wall:     5650.7s

Acetate

 Temperature                      =   298.15K
 frequency scaling parameter      =   0.9889

 Zero-Point correction to Energy  =   29.591 kcal/mol  (  0.047157 au)
 Thermal correction to Energy     =   31.853 kcal/mol  (  0.050762 au)
 Thermal correction to Enthalpy   =   32.446 kcal/mol  (  0.051706 au)

 Total Entropy                    =   64.067 cal/mol-K
   - Translational                =   38.129 cal/mol-K (mol. weight =  59.0133)
   - Rotational                   =   23.739 cal/mol-K (symmetry #  =        1)
   - Vibrational                  =    2.199 cal/mol-K

 Cv (constant volume heat capacity) =   11.235 cal/mol-K
   - Translational                  =    2.979 cal/mol-K
   - Rotational                     =    2.979 cal/mol-K
   - Vibrational                    =    5.276 cal/mol-K

 ccCA: calculations done, now printing results
 
 ccCA-P  reference energy =   -228.25857936176124     
 ccCA-S3 reference energy =   -228.26625083689740     
 ccCA-S4 reference energy =   -228.25849407721080     
 ccCA-PS3 reference energy=   -228.26241509932930     
 DK correction            =  -0.13318127658752132     
 CCSD(T) correction       =   -4.4728554700242285E-002
 CV correction            =  -0.20921905251765338     
 ---------------------------
 Total ccCA-P   energy    =   -228.64570824556665     
 Total ccCA-S3  energy    =   -228.65337972070282     
 Total ccCA-S4  energy    =   -228.64562296101622     
 Total ccCA-PS3 energy    =   -228.64954398313472     
 
 Thermochemistry available:
            ZPE   =   4.7193435242008613E-002
 ccCA-P   E+ZPE   =  -228.59851481032464     
 ccCA-S3  E+ZPE   =  -228.60618628546081     
 ccCA-S4  E+ZPE   =  -228.59842952577421     
 ccCA-PS3 E+ZPE   =  -228.60235054789271     
 Wrote ccCA-P    energy to the RTDB 
 Leaving ccCA module...

 Task  times  cpu:     3859.1s     wall:     3910.2s


Formic acid

 Temperature                      =   298.15K
 frequency scaling parameter      =   0.9889

 Zero-Point correction to Energy  =   20.909 kcal/mol  (  0.033320 au)
 Thermal correction to Energy     =   22.902 kcal/mol  (  0.036497 au)
 Thermal correction to Enthalpy   =   23.495 kcal/mol  (  0.037441 au)

 Total Entropy                    =   59.329 cal/mol-K
   - Translational                =   37.387 cal/mol-K (mol. weight =  46.0055)    - Rotational                   =   21.008 cal/mol-K (symmetry #  =        1)
   - Vibrational                  =    0.934 cal/mol-K

 Cv (constant volume heat capacity) =    8.703 cal/mol-K
   - Translational                  =    2.979 cal/mol-K
   - Rotational                     =    2.979 cal/mol-K
   - Vibrational                    =    2.744 cal/mol-K

 ccCA: calculations done, now printing results

 ccCA-P  reference energy =   -189.56748775122853
 ccCA-S3 reference energy =   -189.57364633780318
 ccCA-S4 reference energy =   -189.56733835209894
 ccCA-PS3 reference energy=   -189.57056704451585
 DK correction            =  -0.11856238070660652
 CCSD(T) correction       =   -3.0831132609506540E-002
 CV correction            =  -0.16057296161548607
 ---------------------------
 Total ccCA-P   energy    =   -189.87745422616013
 Total ccCA-S3  energy    =   -189.88361281273478
 Total ccCA-S4  energy    =   -189.87730482703054
 Total ccCA-PS3 energy    =   -189.88053351944745

 Thermochemistry available:
            ZPE   =   3.3346398704552728E-002
 ccCA-P   E+ZPE   =  -189.84410782745556
 ccCA-S3  E+ZPE   =  -189.85026641403022
 ccCA-S4  E+ZPE   =  -189.84395842832598
 ccCA-PS3 E+ZPE   =  -189.84718712074289
 Wrote ccCA-P    energy to the RTDB
 Leaving ccCA module...

 Task  times  cpu:     1369.3s     wall:     1407.5s

Formate
 Temperature                      =   298.15K
 frequency scaling parameter      =   0.9889

 Zero-Point correction to Energy  =   12.385 kcal/mol  (  0.019737 au)
 Thermal correction to Energy     =   14.252 kcal/mol  (  0.022713 au)
 Thermal correction to Enthalpy   =   14.845 kcal/mol  (  0.023656 au)

 Total Entropy                    =   56.927 cal/mol-K
   - Translational                =   37.321 cal/mol-K (mol. weight =  44.9976)
   - Rotational                   =   19.229 cal/mol-K (symmetry #  =        2)
   - Vibrational                  =    0.377 cal/mol-K

 Cv (constant volume heat capacity) =    7.310 cal/mol-K
   - Translational                  =    2.979 cal/mol-K
   - Rotational                     =    2.979 cal/mol-K
   - Vibrational                    =    1.351 cal/mol-K


 ccCA: calculations done, now printing results
 
 ccCA-P  reference energy =   -189.01189831122844     
 ccCA-S3 reference energy =   -189.01808726799254     
 ccCA-S4 reference energy =   -189.01171261173857     
 ccCA-PS3 reference energy=   -189.01499278961049     
 DK correction            =  -0.11851294005154500     
 CCSD(T) correction       =   -2.6430727035545942E-002
 CV correction            =  -0.16057463040127118     
 ---------------------------
 Total ccCA-P   energy    =   -189.31741660871680     
 Total ccCA-S3  energy    =   -189.32360556548090     
 Total ccCA-S4  energy    =   -189.31723090922694     
 Total ccCA-PS3 energy    =   -189.32051108709885     
 
 Thermochemistry available:
            ZPE   =   1.9751889903778755E-002
 ccCA-P   E+ZPE   =  -189.29766471881302     
 ccCA-S3  E+ZPE   =  -189.30385367557713     
 ccCA-S4  E+ZPE   =  -189.29747901932316     
 ccCA-PS3 E+ZPE   =  -189.30075919719508     
 Wrote ccCA-P    energy to the RTDB 
 Leaving ccCA module...

Solvation energy

Solvation energy may seem easy to calculate, but difficult to calculate accurately using implicit methods, in particular for ions. I used the optimized structures from above, and then did a single-point COSMO (rsolv 0. Not ideal)  at RDFT(b3lyp)/cc-pVTZ/

Acetic acid: -8.59 kcal/mol
Acetate: -72.33 kcal/mol
Formic acid: -8.90
Formate: -72.59 kcal/mol
H+: -624.61 kcal/mol (lit. value)


Free energies:  
G: ccCA-P+(Hcorr-T*Scorr)
 G(acetic acid): -229.21170315315857*627.503+(41.653-298.15*69.467/1000)-8.59
 G(acetate): -228.64570824556665*627.503+(32.446-298.15*64.067/1000)-72.33
 G(formic acid): -189.87745422616013*627.503+(23.495-298.15*59.329/1000)-8.90
 G(formate):  -189.31741660871680*627.503+(14.845-298.15*56.927/1000)-72.59
 G(H+): -6.28 kcal/mol (lit. value) -264.61 (lit. value)=-270.89 kcal/mol


Results: Direct approach:
Don't forget to account for the standard state. R=1.9858775(34)×10−3 kcal/(K.mol)
DG(acetic/acetate)=G(acetate)+G(H+)-G(acetic)+RT*ln(1/24.46)=
(-228.64570824556665*627.503+(32.446-298.15*64.067/1000)-72.33-270.89)-(-229.21170315315857*627.503+(41.653-298.15*69.467/1000)-8.59)+1.9858775*298.15*log(1/24.46)/1000=
11.0435795929699 kcal/mol=46.2063370169861 kJ/mol =>
pKa=DG*log10(e)/RT=
11.0435795929699*log10(e)/(1.9858775*10**(-3)*298.15)
=8.1 (which is very bad -- it should be close to 4.75)

DG(formic/formate)=G(formate)+G(H+)-G(formic)-RT*ln(1/24.46)=
(-189.31741660871680*627.503+(14.845-298.15*56.927/1000)-72.59-270.89)-(-189.87745422616013*627.503+(23.495-298.15*59.329/1000)-8.90)+1.9858775*298.15*log(1/24.46)/1000=
7.01850845285312 kcal/mol=29.3654393667375 kJ/mol
pKa=5.1 (which is quite bad -- it should be close to 3.75)


Results: Isodesmic approach
From an older post:
"Assuming that we know that formic acid has a pKa of 3.75, then DG_solution=pKa*RT/log(e)=3.75*8.314*298.15/log10(e)/1000=21.404 kJ/mol. The reverse reaction is -21.404 kJ/mol."
That's about 5.116 kcal/mol

DG(acetate)+DG(formic)-(DG(acetic)+DG(formate))+DG(ref)=
((-228.64570824556665*627.503+(32.446-298.15*64.067/1000)-72.33)+(-189.87745422616013*627.503+(23.495-298.15*59.329/1000)-8.90))-((-229.21170315315857*627.503+(41.653-298.15*69.467/1000)-8.59)+(-189.31741660871680*627.503+(14.845-298.15*56.927/1000)-72.59))+5.116=
4.02507114014588 kcal/mol+5.116 kcal/mol =9.14107114014588 kcal/mol <=> pKa= 6.7
Which is better, but still not as good as here.

Using the E+zpe energies doesn't help much:
((-228.59851481032464*627.503+(32.446-298.15*64.067/1000)-72.33)+(-189.84410782745556*627.503+(23.495-298.15*59.329/1000)-8.90))-((-229.15085136038675*627.503+(41.653-298.15*69.467/1000)-8.59)+(-189.29766471881302*627.503+(14.845-298.15*56.927/1000)-72.59))+5.116=
9.10100587110175 kcal/mol <=> pKa=6.68

I really have no idea why the results are so bad when I had reasonable results with DFT/b3lyp/6-31++G** which should be worse than the E(CBS)+E(CC)+E(CV)+E(DK) approach for calculating electronic energies.

Solvation energies are a bit different and could explain some of the difference. Using the solvation energies from here I got:
((-228.64570824556665*627.503+(32.446-298.15*64.067/1000)-73.23)+(-189.87745422616013*627.503+(23.495-298.15*59.329/1000)-9.99))-((-229.21170315315857*627.503+(41.653-298.15*69.467/1000)-9.32)+(-189.31741660871680*627.503+(14.845-298.15*56.927/1000)-72.47))+5.116=
7.76107114008302 kcal/mol <=> pKa=5.69. Not 4.75, but closer.

Using rsolv 1.3 I get
((-228.64570824556665*627.503+(32.446-298.15*64.067/1000)-71.09)+(-189.87745422616013*627.503+(23.495-298.15*59.329/1000)-6.37))-((-229.21170315315857*627.503+(41.653-298.15*69.467/1000)-6.53)+(-189.31741660871680*627.503+(14.845-298.15*56.927/1000)-71.90))+5.116=
10.1610711401063 kcal/mol which is bad.


More thinking.
  This paper says that the gas phase free energy for the deprotonation of acetic acid should be 341.1 kcal/mol
(-228.64570824556665*627.503+(32.446-298.15*64.067/1000)-6.82)-(-229.21170315315857*627.503+(41.653-298.15*69.467/1000))=340.75 kca/mol
We're within 1 kcal/mol 

The same paper states 338.5 kcal/mol for formic acid:
(-189.31741660871680*627.503+(14.845-298.15*56.927/1000)-6.82)-(-189.87745422616013*627.503+(23.495-298.15*59.329/1000))=336.67 kca/mol

For our direct solution phase pKa calculation formic acid was off by about 1.9 kcal/mol which is similar to the error here.