
05 June 2013

439. Calculate frequencies from a hessian file from NWChem: example in Octave (matlab)

I wanted to calculate normal modes (frequencies) for specific atoms in a calculation, and so I had to write my own code.

This Octave code calculates frequencies for the first N atoms, where N is given in the input.mass file.

Background
The format that NWChem uses for the Hessian is that of a flat, triangular matrix i.e. a triangular matrix such as
1  
2 3 
4 5 6
is represented as
1
2
3
4
5
6

The Hessian is symmetric around the diagonal, so the full Hessian matrix is
1 2 4
2 3 5
4 5 6
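
As a quick illustration (separate from the main script further down), here's that unpacking done in Octave:

% toy example: unpack the flat triangular vector [1 2 3 4 5 6]
% into the full symmetric 3x3 matrix shown above
v = [1 2 3 4 5 6];
n = 3;              % matrix dimension
H = zeros(n);
k = 1;
for i = 1:n
 for j = 1:i
  H(i,j) = v(k);    % fill the lower triangle...
  H(j,i) = v(k);    % ...and mirror it into the upper
  k = k + 1;
 end
end
disp(H)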

The Hessian is independent of the masses of the atom pairs, while the frequencies are heavily dependent on the masses (isotope effects are quite visible for light elements).

To get the mass-weighted matrix we divide each element by the square root of the product of the masses of the atom pair, i.e. H_ij/sqrt(m_i*m_j). Note that the matrix reported in the nwchem output ("MASS-WEIGHTED NUCLEAR HESSIAN (Hartree/Bohr/Bohr/Kamu)") is multiplied by 1,000.

Once you have the mass-weighted hessian you need to calculate the eigenvalues, sort them and convert them to cm-1 using a scaling factor.
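
In other words (my notation), for each eigenvalue lambda of the mass-weighted Hessian, in Hartree/Bohr/Bohr/amu:

freq [cm-1] = sqrt(lambda)/(2*pi*c)

once everything has been converted to atomic units. The combined conversion factor works out to roughly 5140.5 cm-1 per sqrt(Hartree/Bohr/Bohr/amu), which is what the scaling variable in the code below evaluates to; a negative eigenvalue gives an imaginary frequency.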

That's it.

The code:
See below for example input.mass and input.hess

%% prepare
clear;
format long

%%Calculate conversion factor from H/B/B/amu to cm-1

%% csi=299792458; %speed of light, m/s 
%% t2au=2.418884326505E-17; % seconds per a.u.
%% Better to do it by hand to avoid rounding errors:
cau=(2.99792458 * 2.418884326505)*1E-9; %c in metres per t(a.u.)

%% 1 electron (au)=9.10938291E-31 kg
%% 1 amu = 1.66053892E-27 kg
%% Better to do by hand to avoid rounding errors:
amu2au=(1.66053892/9.10938291)*1E4;% 1 amu in a.u. (via kgs)
%% For clarity
cmtom=1/100; %m per cm
%% And finally we get our scaling factor:
scaling=cmtom*(1/(2*pi*cau*sqrt(amu2au))); %( m/cm * 1/((m/au) * au) = m/cm * 1/m = 1/cm)


%%read masses
% The mass file contains the masses of the atoms
% The first line is the number of atoms in the file
% The remaining lines are the atom masses in the same order
% as the atoms are given in the nwchem input
protomasses=fopen("input.mass");
natoms=str2num(fgetl(protomasses));
mass=[]; % initialise, so that mass(end+1) below works
for i = 1:natoms
 % swap the Fortran D-exponent (e.g. 1.60D+01) for E, which str2num understands
 mass(end+1)=str2num(strrep(fgetl(protomasses),"D","E"));
end
fclose(protomasses);

%% Read and construct hessian from flat hessian in .hess file
%% The .hess file provided by nwchem is flat (i.e. one
%% dimensional) and is the triangular form (i.e half) of 
%% the full hessian. We use fgetl/str2num so that we can deal 
%% with instances of scientific notation in the hessian file.
%% While we"re at it we construct the mass-weighted force matrix too.
protohessian=fopen("input.hess"); 
hessian=zeros(3*natoms);
massweighted=zeros(3*natoms);

for i = 1:3*natoms
 for j=1:i
  % again, swap Fortran D-exponents for E before parsing
  hessian(i,j)=str2num(strrep(fgetl(protohessian),"D","E"));
  massweighted(i,j)=hessian(i,j)/sqrt( mass(ceil(i/3))*mass(ceil(j/3)));
 end
end
fclose(protohessian);

for i=1:3*natoms
 for j=1:i
  hessian(j,i)=hessian(i,j);
  massweighted(j,i)=massweighted(i,j);
 end
end

%% Diagonalize and compute frequencies in cm^{-1}
eigen=sort(eig(massweighted));
freqs=sqrt(eigen).*scaling;

%% Make imaginary frequencies negative and store them
%% in a new array
frequencies=[]; % initialise, so that frequencies(end+1) below works
for n=1:size(freqs,1)
 if imag(freqs(n))==0
  frequencies(end+1,1)=real(freqs(n));
 else
  frequencies(end+1,1)=-imag(freqs(n));
 end
end

%% Echo frequencies to stdout
printf("%10.4f \n",frequencies)
%% Save the frequencies to normal.out as well
outfile=fopen("normal.out","w");
fprintf(outfile,"%i \n",natoms);
fprintf(outfile,"%10.10f \n",frequencies);
fclose(outfile);
%save 'modes.out' -ascii  frequencies
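
To use it, save the script as e.g. freq.m (the name is arbitrary) in the same directory as input.mass and input.hess, then run:

octave freq.m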

input.mass (for water):
3
1.5994910D+01
1.0078250D+00
1.0078250D+00

input.hess (this one has imaginary frequencies as well):
     6.6177469151D-01
    -5.8658669668D-12
    -1.0013075598D-05
     1.0754299967D-09
     4.5060920407D-10
     3.6644723357D-01
    -3.3088202114D-01
     2.1099357839D-10
     1.6617441386D-01
     3.6163164885D-01
     2.5270659061D-12
     4.0920019206D-06
     3.2209366184D-11
     1.6382988861D-11
     8.3427731090D-07
     2.3904755566D-01
    -2.2311539742D-10
    -1.8322029567D-01
    -2.0261099118D-01
     1.1292349908D-10
     1.7796238990D-01
    -3.3088202212D-01
    -2.4469194991D-10
    -1.6617441477D-01
    -3.0749615389D-02
     1.2368245322D-10
    -3.6436594678D-02
     3.6163164980D-01
     2.5272503844D-12
     4.0920029550D-06
     3.2022391582D-11
     1.6289326095D-11
    -4.9229359909D-06
    -1.5407535297D-11
    -1.8816632580D-11
     8.3427660670D-07
    -2.3904755666D-01
    -2.2750774181D-10
    -1.8322029575D-01
     3.6436523006D-02
     1.1512005611D-10
     5.2580053385D-03
     2.0261099171D-01
     1.1238793371D-10
     1.7796238961D-01

Output:
 
  -11.0036 
   -1.6327 
    3.1676 
    3.9298 
    7.5811 
   12.2862 
 1619.0207 
 3616.0904 
 3781.1341

c.f.
 ----------------------------------------------------------------------------
 Normal Eigenvalue ||                 Infra Red Intensities
  Mode   [cm**-1]  || [atomic units] [(debye/angs)**2] [(KM/mol)] [arbitrary]
 ------ ---------- || -------------- ----------------- ---------- -----------
    1      -11.004 ||    0.426523           9.840       415.796      59.477
    2       -1.633 ||    0.000029           0.001         0.028       0.004
    3        3.168 ||    0.000003           0.000         0.003       0.000
    4        3.930 ||    0.000700           0.016         0.682       0.098
    5        7.581 ||    0.134394           3.101       131.014      18.741
    6       12.286 ||    0.000000           0.000         0.000       0.000
    7     1619.021 ||    0.070174           1.619        68.409       9.786
    8     3616.091 ||    0.004517           0.104         4.404       0.630
    9     3781.135 ||    0.009065           0.209         8.837       1.264
 ----------------------------------------------------------------------------

19 May 2013

421. NWChem 6.3 on ROCKS 5.4.3/CentOS 5.6

Update 23 May 2013: The execution times are pretty much the same as for 6.1.1 with a new patch. I've updated the instructions below to incorporate this new patch (http://www.nwchem-sw.org/images/Iswtch.patch.gz)

Update 21 May 2013:
The execution times can be improved considerably by setting
ARMCI_NETWORK=SOCKETS

They are still ca 30% longer than 6.1.1 though due to slower SCF convergence.
See http://www.nwchem-sw.org/index.php/Special:AWCforum/st/id834/Nwchem_6.3_running_2-5_times_slo....html

UPDATE 20 May 2013:
Nwchem 6.3 is very slow compared to 6.1.1. A six-core run (out of eight cores available) took 121 s using 6.1.1 but 254 s using 6.3!

I observed this on debian as well: 6.3 on debian is five times slower than 6.1.1 (190 s vs 40 s at 8 cores, for example, in http://verahill.blogspot.com.au/2013/05/414-frequency-vs-cores-crude.html). Not sure why that is.

Original:
NWChem 6.3 is out now. Here's how to build it on ROCKS 5.4.3 (based on Centos 5.6) for CPU-based calculations (currently only CCSD(T) can take advantage of GPU/CUDA anyway).

To build on debian, see http://verahill.blogspot.com.au/2013/05/424-nwchem-63-on-debian-wheezy.html

This assumes that you've got a proper build environment (gcc, fortran, openmpi) installed.

Openblas:
I've added all users who do computations to the group compchem.
sudo mkdir /share/apps/openblas
sudo chown $USER:compchem /share/apps/openblas
cd ~/tmp
wget http://nodeload.github.com/xianyi/OpenBLAS/tarball/v0.1.1
tar xvf v0.1.1
cd xianyi-OpenBLAS-e6e87a2/
wget http://www.netlib.org/lapack/lapack-3.4.1.tgz
make all BINARY=64 CC=/usr/bin/gcc FC=/usr/bin/gfortran USE_THREAD=0 INTERFACE64=1 1> make.log 2>make.err

make PREFIX=/share/apps/openblas install
cp lib*.*  /share/apps/openblas/lib
sudo chmod 755 /share/apps/openblas -R

For later use with nwchem and ecce, add /share/apps/openblas/lib to /etc/ld.so.conf and do
sudo ldconfig
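
e.g. (one way of appending the path from the shell):

echo '/share/apps/openblas/lib' | sudo tee -a /etc/ld.so.conf
sudo ldconfig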

Put
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/share/apps/openblas/lib
in ~/.bashrc and/or queue files.

NWChem
I've added all users who do computations to the group compchem.
sudo mkdir /share/apps/nwchem/
sudo chown $USER:compchem /share/apps/nwchem/

cd /share/apps/nwchem
wget http://www.nwchem-sw.org/download.php?f=Nwchem-6.3-src.2013-05-17.tar.gz
tar xvf Nwchem-6.3-src.2013-05-17.tar.gz 
cd nwchem-6.3-src.2013-05-17/
cd src/
wget http://www.nwchem-sw.org/images/Iswtch.patch.gz
gzip -d Iswtch.patch
patch -p0 < Iswtch.patch
cd ../
export LARGE_FILES=TRUE
export TCGRSH=/usr/bin/ssh
export NWCHEM_TOP=`pwd`
export NWCHEM_TARGET=LINUX64
export NWCHEM_MODULES="all python"
export PYTHONHOME=/opt/rocks
export PYTHONVERSION=2.4
export USE_MPI=y
export USE_MPIF=y
export USE_MPIF4=y
export MPI_LOC=/opt/openmpi
export MPI_INCLUDE=/opt/openmpi/include
export LIBRARY_PATH=$LIBRARY_PATH:/opt/openmpi/lib:/share/apps/openblas/lib
export LIBMPI="-lmpi -lopen-rte -lopen-pal -ldl -lmpi_f77 -lpthread"
export BLASOPT="-L/share/apps/openblas/lib -lopenblas -lopenblas_nehalem-r0.1.1 -lopenblas_nehalemp-r0.1.1"

export ARMCI_NETWORK=SOCKETS

cd $NWCHEM_TOP/src
export FC=gfortran
make clean
make  nwchem_config
make  FC=gfortran
cd ../contrib
./getmem.nwchem
 sudo chmod 755 /share/apps/nwchem/nwchem-6.3-src.2013-05-17 -R

Create a default.nwchemrc in /share/apps/nwchem
nwchem_basis_library /share/apps/nwchem/nwchem-6.3-src.2013-05-17/src/basis/libraries/
ffield amber
amber_1 /share/apps/nwchem/nwchem-6.3-src.2013-05-17/src/data/amber_s/
amber_2 /share/apps/nwchem/nwchem-6.3-src.2013-05-17/src/data/amber_x/
amber_3 /share/apps/nwchem/nwchem-6.3-src.2013-05-17/src/data/amber_q/
amber_4 /share/apps/nwchem/nwchem-6.3-src.2013-05-17/src/data/amber_u/
amber_5 /share/apps/nwchem/nwchem-6.3-src.2013-05-17/src/data/custom/
spce /share/apps/nwchem/nwchem-6.3-src.2013-05-17/src/data/solvents/spce.rst
charmm_s /share/apps/nwchem/nwchem-6.3-src.2013-05-17/src/data/charmm_s/
charmm_x /share/apps/nwchem/nwchem-6.3-src.2013-05-17/src/data/charmm_x/
and put symlinks to it in the users' home directories, e.g.
cd ~
ln -s /share/apps/nwchem/default.nwchemrc .nwchemrc
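
If there are many users, a quick loop over /home does the same thing. A sketch -- this assumes all home directories live under /home and that no one already has a .nwchemrc:

for u in /home/*
do
 sudo ln -s /share/apps/nwchem/default.nwchemrc $u/.nwchemrc
done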

15 May 2013

414. Frequency vs cores? Crude benchmarking on AMD FX 8150

I'm thinking about building my next computational node, and one issue which is preoccupying me is whether to go for lots of cores (e.g. a dual-socket mobo with two 16-core 2.1 GHz cpus) or for a balance of cores and frequency (e.g. a single-socket mobo with a 3.8 GHz 8-core cpu). Remember, this is built with private money -- not research grants -- so the budget is tight.

I mean, I can't look at something like this without wanting to buy it: http://www.newegg.com/Product/Product.aspx?Item=N82E16819113036. The question is whether I'm better off buying another one or two fx8150 for the price of 16x2 down-clocked cores.

Benchmarking with the FX 8150 actually makes some sense here if one of the newegg reviewers is to be believed, since the Opteron 6272 is described as two 8150s glued together and down-clocked.

The system: 32 gb ram, fx 8150, nwchem 6.1.1 with acml 5.3.1 (gfortran,int64, fma4) and openmpi.

Short of finding benchmarks for the type of applications that interest me (nwchem, mostly), I figure I could get a rough idea by throttling the frequency of my eight-core FX8150 and compare with unthrottled runs where the number of cores is limited.

A few things to take into account when looking at the times below:
  • modern processors are complex beasts -- I don't claim to fully understand threads vs virtual threads and integer vs FPU. In the FX8150 there are four fpus but eight cores. What this really means in practical terms when doing these particular test calculations, I don't know.
  • This isn't my job, and I need my nodes for running job-related calcs, so by necessity I had to use a short test job. There's inevitably some variability in the results, and using longer test jobs might affect the results somewhat.
  • The execution times vary A LOT for 'identical' conditions (see raw data), which is why I repeated the runs in bold ten times at 3.6 GHz to get reasonably solid comparison values. Still not perfect, since the distribution isn't properly gaussian.

The specific question I wanted answered is:
Are 8 threads at 2.1 GHz significantly better than 4 threads at 3.6 GHz?
Short answer: No.
Looks like I won't be investing in 2 x 16 core 2.1 GHz cpus after all.


Optimization
c/f     3.60    3.30    2.70    2.10    1.40
8       44/3    49/6    58/1    75/6    110/5
7       48/3                    72
6       52/1                    106
5       59/4            85      97
4       67/8            93      113/10  156
3       85/7
2       117/10
1       237/24
c=number of cores; f= frequency in GHz.

(times in seconds. 44/3 means 44 s +/- 3 s)

The way I read this is that it's better to have a 4-core 3.6 GHz cpu than an 8-core 2.1 GHz CPU. The whole 4 FPU/8 cores has me confused though, so I'm not sure whether that's affecting the results in a significant way.

The other thing to take into account is that there isn't normally a linear relationship between number of cores and execution times anyway -- doubling the number of cores doesn't normally lead to a halving of the execution time, so 16 cores at 2.10 GHz wouldn't necessarily be anywhere near 75/2=37 s. (again, that's ignoring the 2 cores/1 fpu issue)
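
To put numbers on that, here's the speedup and parallel efficiency at 3.6 GHz computed from the mean times in the table above (Octave):

t = [237 117 85 67 59 52 48 44];   % mean times for 1..8 cores at 3.6 GHz
cores = 1:8;
speedup = t(1)./t;                 % ca 5.4x at 8 cores, not 8x
efficiency = speedup./cores        % drops from 1 to ca 0.67 at 8 cores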

-------------
c/f: raw data
--------------
8/3.6: 37.7,47.4,46.9,38.8, 46.8, 42.4,46.6, 43.9,44.7,42.8 => 44+/-3 s
7/3.6: 41.3,48.7,47.9,48.8,47.0,48.8,50.8,42.4,52.1,47.9 => 48+/-3 s
6/3.6: 49.5,53.4,50.5,53.4,52.4,53.3,51.3,53.4,52.5,53.55 => 52+/-1 s
5/3.6: 54.1,57.1, 67.7,52.2,59.6,58.4,59.8,57.6,59.4,58.6 => 59+/-4 s
4/3.6: 83.1,63.5,73.7,70.0,68.6,58.1,58.1,67.2,69.9,58.2 => 67 +/-8 s
3/3.6: 89.5, 86.0, 82.8, 97.9, 74.4,86.2,89.7, 86.3, 74.5, 86.2 => 85 +/-7 s
2/3.6: 114.1,137.4, 118.6, 108.3, 116.3, 123.6, 104.4,124.3,104.7, 120.6 => 117+/-10 s
1/3.6: 242.6,201.9,232.9,242.7, 233.2,202.0,233.1,265.2, 278.9,233.5 => 237+/- 24
8/3.3: 51.9, 42.4,42.7,55.3,43.3,55.8,54.6,48.1,42.4,48.1 => 49+/-6 s
8/2.7: 59.4, 57.3,59.1,57.8,58.9,56.8,59.0,58.5,59.2,56.9 => 58+/-1
8/2.1: 75.6,82.9,73.7,65.1,76.9,84.3,65.4,73.9,76.4,78.1 => 75+/-6 s
8/1.4: 112.5,110.5,112.1,108.6,113.1,114.4,112.4,109.1,97.9 => 110+/-5
4/2.1: 124.9,103.7,104.1, 92.4, 117.6,115.5,117.5,120.1,115.6,120.2 => 113+/-10 s

An alternative would be to report the fastest time (out of e.g. 10 tries) since it represents maximum capacity.



optimization input
scratch_dir /scratch
start benzeneopt 

geometry units angstroms
C  0.100  1.396  0.000
C  1.209  0.698  0.000
C  1.209 -0.698  0.000
C  0.000 -1.396  0.000
C -1.209 -0.698  0.000
C -1.209  0.698  0.000
H  0.000  2.479  0.000
H  2.147  1.240  0.000
H  2.147 -1.240  0.000
H  0.000 -2.479  0.000
H -2.147 -1.240  0.000
H -2.147  1.240  0.000
end

basis
 H library "6-31+g*" 
 c library "6-31+g*"
end
dft
 direct
end

task dft optimize



Setting frequency
The following script was called with the frequency in GHz, e.g. sudo setfreq 3.6

setfreq
#!/bin/sh
# set each of the eight cores to the userspace governor,
# then pin it to the requested frequency ($1, in GHz)
for c in 0 1 2 3 4 5 6 7
do
    /usr/bin/cpufreq-set -c $c -g userspace
    /usr/bin/cpufreq-set -c $c -f ${1}G
done
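
To verify that the cores actually switched, cpufreq-info (from the same cpufrequtils package as cpufreq-set) can be grepped, e.g.:

cpufreq-info | grep 'current CPU frequency'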

10 May 2013

411. Attempt at OPENMP enabled NWChem 6.1.1 -- not successful...

Update 4 June 2013:
I might return to this later and have a look at how to make the parallel executable in the bin/LINUX64 folder.

Original post:
This is another addition to my growing list over unsuccessful, abandoned or only partially successful builds.
(see e.g.
http://verahill.blogspot.com.au/2013/05/409-failed-attempt-at-compiling-gamess_10.html
http://verahill.blogspot.com.au/2013/05/409a-failed-attempt-at-compiling-gamess.html
http://verahill.blogspot.com.au/2012/08/compiling-dalton-qm-on-debian-in.html
http://verahill.blogspot.com.au/2012/07/quantum-espresso-on-rocks-543-centos-56.html)

In other words -- yes, it builds. But no, it is unusable.

I can build nwchem with openmp support, and it does run in parallel -- but the wall time is enormous since most of the time only a single thread is running.

Maybe someone will read this and see what's missing, or feel inspired to make their own attempt

What I did
ACML libraries were installed as shown in e.g. http://verahill.blogspot.com.au/2013/05/409-failed-attempt-at-compiling-gamess_10.html

Nwchem was downloaded:
sudo mkdir /opt/nwchem
sudo chown $USER:$USER /opt/nwchem
cd /opt/nwchem
wget http://www.nwchem-sw.org/download.php?f=Nwchem-6.1.1-src.2012-06-27.tar.gz
tar xvf Nwchem-6.1.1-src.2012-06-27.tar.gz
cd nwchem-6.1.1-src/

Next I edited src/config/makefile.h
2363 ifdef OPTIMIZE
2364     FFLAGS = $(FOPTIONS) $(FOPTIMIZE)
2365     CFLAGS = $(COPTIONS) $(COPTIMIZE) -fopenmp
2366 else
2367 # Need FDEBUG after FOPTIONS on SOLARIS to correctly override optimization
2368     FFLAGS = $(FOPTIONS) $(FDEBUG)
2369     CFLAGS = $(COPTIONS) $(CDEBUG) -fopenmp
2370 endif
2371     INCLUDES = -I. $(LIB_INCLUDES) -I$(INCDIR) $(INCPATH)
2372     CPPFLAGS = $(INCLUDES) $(DEFINES) $(LIB_DEFINES)
2373     LDFLAGS = $(LDOPTIONS) -L$(LIBDIR) $(LIBPATH)
2374     LIBS = $(NW_MODULE_LIBS) $(CORE_LIBS) -lgomp
2375
I then built using the following build script:
export LARGE_FILES=TRUE
export TCGRSH=/usr/bin/ssh
export NWCHEM_TOP=`pwd`
export NWCHEM_TARGET=LINUX64
export NWCHEM_MODULES="all"
export PYTHONVERSION=2.7
export PYTHONHOME=/usr
export BLASOPT="-L/opt/acml/acml5.3.1/gfortran64_fma4_mp_int64/lib -lacml_mp -lpthread"
export USE_OPENMP=y
export LIBRARY_PATH="$LIBRARY_PATH:/opt/acml/acml5.3.1/gfortran64_fma4_mp_int64/lib"
cd $NWCHEM_TOP/src
make clean
make nwchem_config
make FC=gfortran 2> make.err 1>make.log
cd $NWCHEM_TOP/contrib
export FC=gfortran
./getmem.nwchem
So far so good.

Where it fails
A picture is probably in order:
Note that while this is a short run, it is perfectly representative of what I'm seeing with 'real' jobs too -- I get eight threads auto-spawning (as seen by top), but only one thread is active most of the time.

Basically, most of the time only one core is running at 100% (i.e. showing as 12.5 % here since I have 8 cores), with the other cores occasionally kicking in (the 'spikes').

The wall time is 63 seconds, and the 'cpu time' is 83.1 seconds. Ideally, for a fully parallel shared run like this, the cpu time should be as close as possible to the wall time multiplied by eight (but it is always smaller).

As a comparison, here's an mpi-enabled binary:
Here all cores are active over most of the (short) run. The cpu time was 9.9 seconds and the wall time 11.8 seconds. For an mpi run the wall time should be as close to the cpu time as possible (but is always larger).

So it's not particularly 'parallel' in the OMP case -- but I don't know why. Maybe nwchem 6.1.1 isn't quite ready for OMP yet? I've noticed that it's one of the areas where the upcoming release is supposed to have been improved.


'profiling' with sar -- how-to
sudo apt-get install sysstat

Edit /etc/default/sysstat:
8 # will be overwritten by debconf!
9 ENABLED="true"
10
sudo service sysstat restart

Before launching the run, start sar in one window so that it is already collecting data, then immediately launch the run you want to monitor in a different window:

sar 1 180 >> run.log
collects data every second, 180 times (i.e. 180 seconds), and appends it to run.log.
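
If you want a single utilisation figure out of run.log, sar appends an Average: line at the end which you can grep for, or you can average the idle column yourself. A rough sketch (the field layout depends on whether your sar prints AM/PM timestamps, so check the header first):

awk '/all/ {sum += 100-$NF; n++} END {print sum/n, "% average utilisation"}' run.log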

17 April 2013

390. NWChem: "Fix collapse/expand in xc_nucder_gen" when using actlist and frequency calc

I ended up with the error in the title when running a job today -- the job does a calc on a complex system, where I'm only interested in optimising and doing normal mode analysis on part of the system. I (am supposed to) achieve this using actlist.

Googling shows the following two posts:
Question: http://www.emsl.pnl.gov/docs/nwchem/nwchem-support/2004/03/0010.Problem_with_Vibrational_frequencies
Answer: http://www.emsl.pnl.gov/docs/nwchem/nwchem-support/2004/03/0012.RE:_Problem_with_Vibrational_frequencies

There doesn't seem to be a follow-up between then and now (nine years later).

Note that this is completely different from this post, which seems to be due to the OP not putting the actlist in deck A2 (i.e. s/he computes the frequencies for different systems in A and B).

The issue:
The smallest job I can use to trigger this is the following:
scratch_dir /scratch
Title "trigger"

Start trigger

echo

charge 0

geometry autosym units angstrom
 O     -1.79757     -0.236404     0.264744
 H     -2.77094     -0.198938     0.271156
 H     -1.44498     0.380299     -0.412669
 O     -0.0458786     1.77247     -0.0605460
 H     -0.317322     1.32556     0.781197
 H     0.894700     1.49387     -0.199468
end

ecce_print ecce.out

basis "ao basis" cartesian print
  H library "6-31G"
  O library "6-31G"
END

dft
  mult 1
  direct
  XC b3lyp
  grid fine
  iterations 999
  mulliken
end

task dft energy

set geometry:actlist 1:3
task dft freq

Here's the error message:
 HESSIAN: the one electron contributions are done in 0.1s

 Fix collapse/expand in xc_nucder_gen                  0
 ------------------------------------------------------------------------
 ------------------------------------------------------------------------
  current input line :
     0:
 ------------------------------------------------------------------------
 ------------------------------------------------------------------------
  This error has not yet been assigned to a category
 ------------------------------------------------------------------------
 For more information see the NWChem manual at
 http://www.nwchem-sw.org/index.php/NWChem_Documentation
I can't trigger it with just one water and an actlist of 1:3 -- it seems like I need inactive atoms. However, I've run similar calcs in the past without issue.

The solution:
Do a numerical instead of analytical frequency calculation.

This works:
scratch_dir /scratch
Title "trigger"

Start trigger

echo

charge 0

geometry autosym units angstrom
 O     -1.79757     -0.236404     0.264744
 H     -2.77094     -0.198938     0.271156
 H     -1.44498     0.380299     -0.412669
 O     -0.0458786     1.77247     -0.0605460
 H     -0.317322     1.32556     0.781197
 H     0.894700     1.49387     -0.199468
end

ecce_print ecce.out

basis "ao basis" cartesian print
  H library "6-31G"
  O library "6-31G"
END

dft
  mult 1
  direct
  XC b3lyp
  grid fine
  iterations 999
  mulliken
end

task dft energy

set geometry:actlist 1:3
task dft freq numerical

15 April 2013

389. Patches for NWChem 6.1.1 on Debian Wheezy/Testing

There are a couple of issues with the current version of NWChem (27th of June 2012):
* PSPW is broken when NWChem is compiled with/run on systems with gcc 4.7 (here)
* Python support requires patching to include -lz -lssl (here)
* for GabEdit to work more detail needs to be printed (here)


To fix all those issues in one go, do the following:

1. Copy the text at the end of the post, and paste it into a file, e.g. diff.patch.

2. Put the patch file in NWCHEM_TOP (i.e. the root of the source code) e.g. /opt/nwchem/nwchem-6.1.1
If you are patching a previously compiled version of nwchem then do

patch -p0 < diff.patch
export LARGE_FILES=TRUE
export TCGRSH=/usr/bin/ssh
export NWCHEM_TOP=`pwd`
export NWCHEM_TARGET=LINUX64
export NWCHEM_MODULES="all python"
export PYTHONVERSION=2.7
export PYTHONHOME=/usr
export BLASOPT="-L/opt/openblas/lib -lopenblas"
export USE_MPI=y
export USE_MPIF=y
export USE_MPIF4=y
export MPI_LOC=/usr/lib/openmpi/lib
export MPI_INCLUDE=/usr/lib/openmpi/include
export LIBRARY_PATH=$LIBRARY_PATH:/usr/lib/openmpi/lib:/opt/openblas/lib
export LIBMPI="-lmpi -lopen-rte -lopen-pal -ldl -lmpi_f77 -lpthread"
export FC=gfortran
cd $NWCHEM_TOP/src/ddscf
make
cd $NWCHEM_TOP/src/nwdft/scf_dft
make
cd $NWCHEM_TOP/src/mcscf
make
cd $NWCHEM_TOP/src
make link
cd $NWCHEM_TOP/contrib
./getmem.nwchem

If it's a freshly extracted source, do a full compile instead -- see http://verahill.blogspot.com.au/2012/09/briefly-compiling-nwchem-611-with.html

Patch:
diff -rupN src.original/config/makefile.h src/config/makefile.h
--- src.original/config/makefile.h	2013-04-15 12:41:45.016853322 +1000
+++ src/config/makefile.h	2013-04-15 12:38:44.933319544 +1000
@@ -1169,7 +1169,7 @@ endif
        FOPTIONS = -Wextra #-Wunused #-ffast-math
        FOPTIMIZE = -O2 -ffast-math -Wuninitialized
        DEFINES += -DGFORTRAN
-       _GCC46= $(shell gfortran -dumpversion 2>&1|awk ' /4.6./ {print "Y";exit};/4.7./ {print "Y";exit};{print "N"}')
+       _GCC46= $(shell gfortran -dumpversion 2>&1|awk ' /4.6/ {print "Y";exit};/4.7/ {print "Y";exit};{print "N"}')
        ifeq ($(_GCC46),Y)
        DEFINES += -DGCC46
        endif
@@ -1298,7 +1298,7 @@ endif
        FVECTORIZE=-O3 -ffast-math -mtune=native -mfpmath=sse -msse3 -ftree-vectorize -ftree-vectorizer-verbose=1 -fprefetch-loop-arrays -funroll-all-loops
 #      FOPTIMIZE=-O1
 #      FVECTORIZE=-O1
-       _GCC46= $(shell gfortran -dumpversion 2>&1|awk ' /4.6./ {print "Y";exit};/4.7./ {print "Y";exit};{print "N"}')
+       _GCC46= $(shell gfortran -dumpversion 2>&1|awk ' /4.6/ {print "Y";exit};/4.7/ {print "Y";exit};{print "N"}')
        ifeq ($(_GCC46),Y)
        DEFINES += -DGCC46
        endif
@@ -1890,7 +1890,7 @@ endif
        FOPTIONS += -ff2c -fno-second-underscore
        endif
        DEFINES += -DCHKUNDFLW -DGCC4
-       _GCC46= $(shell gfortran -dumpversion 2>&1|awk ' /4.6./ {print "Y";exit};/4.7./ {print "Y";exit};{print "N"}')
+       _GCC46= $(shell gfortran -dumpversion 2>&1|awk ' /4.6/ {print "Y";exit};/4.7/ {print "Y";exit};{print "N"}')
        ifeq ($(_GCC46),Y)
        DEFINES += -DGCC46
        endif
@@ -1954,7 +1954,7 @@ endif
      ifeq ($(BUILDING_PYTHON),python)
 #   EXTRA_LIBS += -ltk -ltcl -L/usr/X11R6/lib -lX11 -ldl
-     EXTRA_LIBS += -lnwcutil -lpthread -lutil -ldl
+     EXTRA_LIBS += -lnwcutil -lpthread -lutil -ldl -lssl -lz
   LDOPTIONS = -Wl,--export-dynamic
      endif
      ifeq ($(NWCHEM_TARGET),CATAMOUNT)
diff -rupN src.original/ddscf/movecs_pr_anal.F src/ddscf/movecs_pr_anal.F
--- src.original/ddscf/movecs_pr_anal.F	2013-04-15 12:41:45.036852381 +1000
+++ src/ddscf/movecs_pr_anal.F	2013-04-15 12:23:28.100409225 +1000
@@ -195,7 +195,7 @@ c
  22      format(1x,2(' Bfn. Coefficient Atom+Function ',5x))
          write(LuOut,23)
  23      format(1x,2(' ----- ------------ ---------------',5x))
-         do klo = 0, min(n-1,9), 2
+         do klo = 0, min(n-1,199), 2
            khi = min(klo+1,n-1)
            write(LuOut,2) (
      $          int_mb(k_list+k)+1,
diff -rupN src.original/ddscf/rohf.F src/ddscf/rohf.F
--- src.original/ddscf/rohf.F	2013-04-15 12:41:45.036852381 +1000
+++ src/ddscf/rohf.F	2013-04-15 12:23:28.100409225 +1000
@@ -153,7 +153,7 @@ c
          ilo = 1
          ihi = nmo
       endif
-      call movecs_print_anal(basis, ilo, ihi, 0.15d0, g_movecs,
+      call movecs_print_anal(basis, ilo, ihi, 0.01d0, g_movecs,
      $     'ROHF Final Molecular Orbital Analysis',
      $     .true., dbl_mb(k_eval), oadapt, int_mb(k_irs),
      $     .true., dbl_mb(k_occ))
diff -rupN src.original/ddscf/scf_vec_guess.F src/ddscf/scf_vec_guess.F
--- src.original/ddscf/scf_vec_guess.F	2013-04-15 12:41:45.036852381 +1000
+++ src/ddscf/scf_vec_guess.F	2013-04-15 12:23:28.100409225 +1000
@@ -505,19 +505,19 @@ c
          nprint = min(nclosed+nopen+30,nmo)
          if (scftype.eq.'RHF' .or. scftype.eq.'ROHF') then
             call movecs_print_anal(basis, 1,
-     &           nprint, 0.15d0, g_movecs,
+     &           nprint, 0.01d0, g_movecs,
      &           'ROHF Initial Molecular Orbital Analysis',
      &           .true., dbl_mb(k_eval), oadapt, int_mb(k_irs),
      &           .true., dbl_mb(k_occ))
          else
             nprint = min(nalpha+20,nmo)
             call movecs_print_anal(basis, max(1,nbeta-20),
-     &           nprint, 0.15d0, g_movecs,
+     &           nprint, 0.01d0, g_movecs,
      &           'UHF Initial Alpha Molecular Orbital Analysis',
      &           .true., dbl_mb(k_eval), oadapt, int_mb(k_irs),
      &           .true., dbl_mb(k_occ))
             call movecs_print_anal(basis, max(1,nbeta-20),
-     &           nprint, 0.15d0, g_movecs(2),
+     &           nprint, 0.01d0, g_movecs(2),
      &           'UHF Initial Beta Molecular Orbital Analysis',
      &           .true., dbl_mb(k_eval+nbf), oadapt, int_mb(k_irs+nmo),
      &           .true., dbl_mb(k_occ+nbf))
diff -rupN src.original/ddscf/uhf.F src/ddscf/uhf.F
--- src.original/ddscf/uhf.F	2013-04-15 12:41:45.036852381 +1000
+++ src/ddscf/uhf.F	2013-04-15 12:23:28.096409414 +1000
@@ -144,11 +144,11 @@ C
          enddo
          ihi = max(ihi-1,1)
  9611    continue
-         call movecs_print_anal(basis, ilo, ihi, 0.15d0, g_movecs,
+         call movecs_print_anal(basis, ilo, ihi, 0.01d0, g_movecs,
      $        'UHF Final Alpha Molecular Orbital Analysis',
      $        .true., dbl_mb(k_eval), oadapt, int_mb(k_irs),
      $        .true., dbl_mb(k_occ))
-         call movecs_print_anal(basis, ilo, ihi, 0.15d0, g_movecs(2),
+         call movecs_print_anal(basis, ilo, ihi, 0.01d0, g_movecs(2),
      $        'UHF Final Beta Molecular Orbital Analysis',
      $        .true., dbl_mb(k_eval+nbf), oadapt, int_mb(k_irs+nmo),
      $        .true., dbl_mb(k_occ+nbf))
diff -rupN src.original/mcscf/mcscf.F src/mcscf/mcscf.F
--- src.original/mcscf/mcscf.F	2013-04-15 12:41:45.000854073 +1000
+++ src/mcscf/mcscf.F	2013-04-15 12:23:23.748613695 +1000
@@ -680,7 +680,7 @@ c
       if (util_print('final vectors analysis', print_default))
      $     call movecs_print_anal(basis,
      $     max(1,nclosed-10), min(nbf,nclosed+nact+10),
-     $     0.15d0, g_movecs, 'Analysis of MCSCF natural orbitals',
+     $     0.01d0, g_movecs, 'Analysis of MCSCF natural orbitals',
      $     .true., dbl_mb(k_evals), .true., int_mb(k_sym),
      $     .true., dbl_mb(k_occ))
 c
diff -rupN src.original/nwdft/scf_dft/dft_mxspin_ovlp.F src/nwdft/scf_dft/dft_mxspin_ovlp.F
--- src.original/nwdft/scf_dft/dft_mxspin_ovlp.F	2013-04-15 12:41:45.604825677 +1000
+++ src/nwdft/scf_dft/dft_mxspin_ovlp.F	2013-04-15 12:23:28.228403211 +1000
@@ -184,14 +184,14 @@ c
       call ga_sync()
 c
       call movecs_print_anal(basis,int_mb(k_non),int_mb(k_non)
-     & ,0.15d0,g_alpha,'Alpha Orbitals without Beta Partners',
+     & ,0.01d0,g_alpha,'Alpha Orbitals without Beta Partners',
      &   .false., 0.0 ,.false., 0 , .false., 0 )
 c
       if (nct.GE.2) then
       do i = 2,nct
       ind = int_mb(k_non+i-1)
       call movecs_print_anal(basis,ind,ind
-     & ,0.15d0,g_alpha,' ',
+     & ,0.01d0,g_alpha,' ',
      &   .false., 0.0 ,.false., 0 , .false., 0 )
       enddo
       endif
@@ -350,7 +350,7 @@ c
       endif
 c      endif
 c 9990 format(/,18x,'THERE ARE',i3,1x,'UN-PARTNERED ALPHA ORBITALS')
 c
-      call movecs_print_anal(basis, 1, nalp, 0.15d0, g_ualpha,
+      call movecs_print_anal(basis, 1, nalp, 0.01d0, g_ualpha,
      & 'Alpha Orb. w/o Beta Partners (after maxim. alpha/beta overlap)',
      &   .false., 0.0 ,.false., 0 , .false., 0 )
diff -rupN src.original/nwdft/scf_dft/dft_scf.F src/nwdft/scf_dft/dft_scf.F
--- src.original/nwdft/scf_dft/dft_scf.F	2013-04-15 12:41:45.608825490 +1000
+++ src/nwdft/scf_dft/dft_scf.F	2013-04-15 12:23:28.228403211 +1000
@@ -1733,7 +1733,7 @@ c
          else
             blob='DFT Final Beta Molecular Orbital Analysis'
          endif
-         call movecs_print_anal(ao_bas_han, ilo, ihi, 0.15d0,
+         call movecs_print_anal(ao_bas_han, ilo, ihi, 0.01d0,
      &        g_movecs(ispin),
      &        blob,
      &        .true., dbl_mb(k_eval(ispin)), oadapt,
diff -rupN src.original/nwdft/scf_dft_cg/dft_cg_solve.F src/nwdft/scf_dft_cg/dft_cg_solve.F
--- src.original/nwdft/scf_dft_cg/dft_cg_solve.F	2013-04-15 12:41:45.612825303 +1000
+++ src/nwdft/scf_dft_cg/dft_cg_solve.F	2013-04-15 12:23:28.220403588 +1000
@@ -164,7 +164,7 @@ c
             blob = 'DFT Final Beta Molecular Orbital Analysis'
          endif
          call movecs_fix_phase(g_movecs(ispin))
-         call movecs_print_anal(basis, ilo, ihi, 0.15d0,
+         call movecs_print_anal(basis, ilo, ihi, 0.01d0,
      &        g_movecs(ispin),blob,
      &        .true., dbl_mb(k_eval+(ispin-1)*nbf),
      &        oadapt, int_mb(k_irs+(ispin-1)*nbf),

388. NWChem, PSPW and Fortran runtime error

I'm following this post: http://www.emsl.pnl.gov/docs/nwchem/nwchem-support/2012/12/0024.Re:_NWCHEM_errors_in_running_nwchem-6.1.

The issue:
If, on debian testing/wheezy 64 bit with gcc 4.7,  you run the following using NWChem 6.1.1

Title "boric acid" Start boric echo charge 0 geometry autosym units angstrom B 0.00000 0.00000 0.00000 O -4.93432e-17 1.55000 0.00000 H 1.06537 1.92667 0.00000 O 1.34234 -0.775000 0.00000 H 1.13586 -1.88597 0.00000 O -1.34234 -0.775000 0.00000 H -2.20123 -0.0406974 0.00000 end ecce_print ecce.out nwpw mult 1 xc pbe96 cutoff 90.0 np_dimensions -1 -1 tolerances 1e-7 1e-7 car-parrinello nose-hoover 1.000000e+03 2.981500e+02 1.000000e+03 2.981500e+02 time_step 0.500000e+00 fake_mass 1.000000e+02 loop 10 100 scaling 1.000000e+00 1.000000e+00 end end task pspw energy task pspw car-parrinello
You'll end up with
          ****************************************************
          *                                                  *
          *               NWPW PSPW Calculation              *
          *                                                  *
          *  [ (Grassman/Stiefel manifold implementation) ]  *
          *                                                  *
          *      [ NorthWest Chemistry implementation ]      *
          *                                                  *
          *            version #5.10   06/12/02              *
          *                                                  *
          *    This code was developed by Eric J. Bylaska,   *
          *    and was based upon algorithms and code        *
          *    developed by the group of Prof. John H. Weare *
          *                                                  *
          ****************************************************
          >>> JOB STARTED AT Mon Apr 15 10:05:42 2013 <<<

          ================ input data ========================
 library name resolved from: compiled reference
 NWCHEM_NWPW_LIBRARY set to:
 Generating 1d pseudopotential for B

At line 649 of file psp_generator_input.F (unit = 99, file = './junk.inp')
Fortran runtime error: Sequential READ or WRITE not allowed after EOF marker, possibly use REWIND or BACKSPACE
--------------------------------------------------------------------------
mpirun has exited due to process rank 0 with PID 28100 on node neon exiting
without calling "finalize". This may have caused other processes in the
application to be terminated by signals sent by mpirun (as reported here).
--------------------------------------------------------------------------


The fix
/opt/nwchem/nwchem-6.1.1-src is the NWCHEM_TOP of my particular nwchem source. See e.g. here for build instructions in general, and note that e.g. the BLASOPT settings correspond to what I need on my system. Modify as necessary.

cd /opt/nwchem/nwchem-6.1.1-src/
wget http://www.nwchem-sw.org/images/Makefile.h.gcc46.patch.gz
gunzip Makefile.h.gcc46.patch.gz
patch -p0 < Makefile.h.gcc46.patch
cd src/nwpw/
touch `egrep -l GCC46 */*/*F`

export LARGE_FILES=TRUE
export TCGRSH=/usr/bin/ssh
export NWCHEM_TOP=/opt/nwchem/nwchem-6.1.1-src
export NWCHEM_TARGET=LINUX64
export NWCHEM_MODULES="all"
export PYTHONHOME=/usr
export BLASOPT="-L/opt/acml/acml5.2.0/gfortran64_fma4_int64/lib -lacml"
export USE_MPI=y
export USE_MPIF=y
export USE_MPIF4=y
export MPI_LOC=/usr/lib/openmpi/lib
export MPI_INCLUDE=/usr/lib/openmpi/include
export LIBRARY_PATH="$LIBRARY_PATH:/usr/lib/openmpi/lib:/opt/acml/acml5.2.0/gfortran64_fma4_int64/lib"
export LIBMPI="-lmpi -lopen-rte -lopen-pal -ldl -lmpi_f77 -lpthread"
export FC=gfortran
make 
cd ../
make link
./getmem.nwchem

Because we just rebuild the affected module and relink instead of recompiling from scratch, this is fairly quick (2-3 minutes in total).

Testing

It now continues as it should:
          ================ input data ========================
 library name resolved from: compiled reference
 NWCHEM_NWPW_LIBRARY set to:
 Generating 1d pseudopotential for B

and everything is good from there-on.


NOTE: you need to delete the output files from any previous run, or you might end up with errors during the Car-Parrinello portion (e.g.
At line 170 of file MOTION.F (unit = 19, file = './boric.ion_motion')
Fortran runtime error: Sequential READ or WRITE not allowed after EOF marker, possibly use REWIND or BACKSPACE
)



Stay tuned...
What remains for me is to start collecting the patches that are needed for nwchem to work properly, so that installing and upgrading nwchem becomes less painful (yes, I've been patching by hand up till now...). From my perspective the job is easier than that of the nwchem devs: I only have to support myself and my debian boxes, not a heterogeneous set of users with equally heterogeneous hardware.

The issues that need to be patched are:
* including libz and libssl on debian when building with python support (e.g. step 1 in this post)
* changing the details in the output to support gabEdit (e.g. this post)
* patching for GCC 4.7 (i.e. what you've just read)

Update: The post is here now: http://verahill.blogspot.com.au/2013/04/389-patches-for-nwchem-611-on-debian.html

08 April 2013

380. Modifying NWChem code without a full recompile

This isn't a secret trick or anything, but may still not be immediately obvious to most people.

A full compilation of NWChem can easily take 20-40 minutes, depending on your build machine.

Sometimes you need to make changes to the source code, e.g. if you want to use GabEdit to analyse output -- ECCE is a fantastic piece of software and is great for managing computations, but GabEdit has implemented some pretty interesting routines for orbital analyses. In order for you to be able to reliably import NWChem output into GabEdit you need to modify a handful of fortran files.

See e.g. http://verahill.blogspot.com.au/2013/02/3xx-modifying-nwchem-611-to-work-with.html for patching NWChem, and http://verahill.blogspot.com.au/2012/11/visualising-nwchem-output-with-gabedit.html for how to run your nwchem jobs and how to visualize them in gabEdit (you need to explicitly define your basis sets).

Angelo Rossi made this comment:
Hello:

Thank you so much for this valuable information.

But under the heading of "Compilation" above, the directions lead to a pointer to compile the entire NWChem source. But only one or two subroutines are modified. Shouldn't there be a more surgical way of proceeding after making the suggested changes? Actually this would make a great separate post. That is, provide a procedure to recompile and link when small changes to NWChem are made. I've done this, but I can't remember.

Kind regards,

Angelo

The answer was on this page: http://xray.isc.kharkov.com/ext_docs/NWChem/prog/node12.html
So here we go:

Partial recompile/relinking of NWChem

1. Environmental variables
Define your environmental variable like you would during a normal compile
export LARGE_FILES=TRUE
export TCGRSH=/usr/bin/ssh
export NWCHEM_TOP=`pwd`
export NWCHEM_TARGET=LINUX64
export NWCHEM_MODULES="all python"
export PYTHONVERSION=2.7
export PYTHONHOME=/usr
export BLASOPT="-L/opt/openblas/lib -lopenblas"
export USE_MPI=y
export USE_MPIF=y
export USE_MPIF4=y
export MPI_LOC=/usr/lib/openmpi/lib
export MPI_INCLUDE=/usr/lib/openmpi/include
export LIBRARY_PATH=$LIBRARY_PATH:/usr/lib/openmpi/lib:/opt/acml/acml5.2.0/gfortran64_int64/lib:/opt/openblas/lib
export LIBMPI="-lmpi -lopen-rte -lopen-pal -ldl -lmpi_f77 -lpthread"
export FC=gfortran
2. Run make in modified directories
In the GabEdit example we made changes to files in src/ddscf, src/nwdft/scf_dft, and src/mcscf
cd $NWCHEM_TOP/src/ddscf
make
cd $NWCHEM_TOP/src/nwdft/scf_dft
make
cd $NWCHEM_TOP/src/mcscf
make

3. Run make link in the src directory
cd $NWCHEM_TOP/src
make link

4. Do your usual post-compilation memory magic
cd $NWCHEM_TOP/contrib
./getmem.nwchem

Done!

This all in all took around 6 seconds instead of 30-odd minutes.

18 February 2013

340. Issues when compiling Nwchem 6.1.1 -- missing *.fh in src/include

The problem:
If you modify the nwchem sources, build, do a make realclean and then pack them up to export your patched sources to other nodes you might find that compiling doesn't work that well, yielding errors such as (I'll list them all to help google):

make[1]: *** No rule to make target `/opt/nwchem/nwchem-6.1.1-src_mod/src/include/stdio.fh', needed by `/opt/nwchem/nwchem-6.1.1-src_mod/lib/LINUX64/libnwcutil.a(basis.o)'.  Stop.
make: *** [libraries] Error 1

and

make[1]: *** No rule to make target `/opt/nwchem/nwchem-6.1.1-src_mod/src/include/nwc_const.fh', needed by `/opt/nwchem/nwchem-6.1.1-src_mod/lib/LINUX64/libnwcutil.a(basis.o)'.  Stop.
make: *** [libraries] Error 1
nwchem.F:3:0: fatal error: errquit.fh: No such file or directory
compilation terminated.
stubs.F:15:0: fatal error: errquit.fh: No such file or directory

and

make[1]: *** No rule to make target `/opt/nwchem/nwchem-6.1.1-src_mod/src/include/errquit.fh', needed by `/opt/nwchem/nwchem-6.1.1-src_mod/lib/LINUX64/libnwcutil.a(basis.o)'.  Stop.
make: *** [libraries] Error 1

and

make[1]: *** No rule to make target `/opt/nwchem/nwchem-6.1.1-src_mod/src/include/util.fh', needed by `/opt/nwchem/nwchem-6.1.1-src_mod/lib/LINUX64/libnwcutil.a(bas_input.o)'.  Stop.

and

/opt/nwchem/nwchem-6.1.1-src_mod/src/include/util.fh:1:0: fatal error: printlevels.fh: No such file or directory
compilation terminated.
make[1]: *** [/opt/nwchem/nwchem-6.1.1-src_mod/lib/LINUX64/libnwcutil.a(bas_input.o)] Error 1
make[1]: *** Waiting for unfinished jobs....

and

basisP.F: In function 'nbf_from_ucont':
basisP.F:427:0: warning: '__result_nbf_from_ucont' may be used uninitialized in this function [-Wmaybe-uninitialized]
make[2]: warning: -jN forced in submake: disabling jobserver mode.
make[1]: warning: -jN forced in submake: disabling jobserver mode.
make[1]: *** No rule to make target `/opt/nwchem/nwchem-6.1.1-src_mod/src/include/msgids.fh', needed by `/opt/nwchem/nwchem-6.1.1-src_mod/lib/LINUX64/libnwcutil.a(geom.o)'.  Stop.

and

make[1]: *** No rule to make target `/opt/nwchem/nwchem-6.1.1-src_mod/src/include/bitops.fh', needed by `/opt/nwchem/nwchem-6.1.1-src_mod/lib/LINUX64/libnwcutil.a(pstat_alloc.o)'.  Stop.

and

In file included from pstat_alloc.F:12:0:
/opt/nwchem/nwchem-6.1.1-src_mod/src/include/bitops.fh:11:0: fatal error: bitops_decls.fh: No such file or directory
compilation terminated.
make[1]: *** [/opt/nwchem/nwchem-6.1.1-src_mod/lib/LINUX64/libnwcutil.a(pstat_alloc.o)] Error 1

and

In file included from pstat_alloc.F:12:0:
/opt/nwchem/nwchem-6.1.1-src_mod/src/include/bitops.fh:12:0: fatal error: bitops_funcs.fh: No such file or directory
compilation terminated.
make[1]: *** [/opt/nwchem/nwchem-6.1.1-src_mod/lib/LINUX64/libnwcutil.a(pstat_alloc.o)] Error 1

and

make[2]: warning: -jN forced in submake: disabling jobserver mode.
make[2]: *** No rule to make target `/opt/nwchem/nwchem-6.1.1-src_mod/src/include/itri.fh', needed by `/opt/nwchem/nwchem-6.1.1-src_mod/lib/LINUX64/libnwcutil.a(sym_sh_pair.o)'.  Stop.
make[2]: *** Waiting for unfinished jobs....

and

make[2]: *** No rule to make target `/opt/nwchem/nwchem-6.1.1-src_mod/src/include/bgj.fh', needed by `/opt/nwchem/nwchem-6.1.1-src_mod/lib/LINUX64/libnwints.a(exactd_mem.o)'.  Stop.

and

make[1]: *** No rule to make target `/opt/nwchem/nwchem-6.1.1-src_mod/src/include/numerical_constants.fh', needed by `/opt/nwchem/nwchem-6.1.1-src_mod/lib/LINUX64/librimp2.a(rimp2_v_e2.o)'.  Stop.

and

make[2]: *** No rule to make target `/opt/nwchem/nwchem-6.1.1-src_mod/src/include/util_sgroup.fh', needed by `/opt/nwchem/nwchem-6.1.1-src_mod/lib/LINUX64/libdntmc.a(gibbs.o)'.  Stop.

and

nwchem.F:11:0: fatal error: bgj_common.fh: No such file or directory

Solution:
This has been mentioned before on the nwchem forum, but not in explicit enough detail.
The solution is to copy a series of files from src/util and to remove make realclean from your build instructions (or at least do the copying after the make realclean step).

Copy these files
cp src/util/stdio.fh src/include/
cp src/util/nwc_const.fh src/include/
cp src/util/errquit.fh src/include/
cp src/util/util.fh src/include/
cp src/util/printlevels.fh src/include/
cp src/util/msgids.fh src/include/
cp src/util/bitops.fh src/include/
cp src/util/bitops_decls.fh src/include/
cp src/util/bitops_funcs.fh src/include/
cp src/util/itri.fh src/include/
cp src/util/bgj.fh src/include/
cp src/util/numerical_constants.fh src/include/
cp src/util/util_sgroup.fh src/include/
cp src/util/bgj_common.fh src/include/
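
Or, equivalently, as a loop:

for f in stdio nwc_const errquit util printlevels msgids bitops bitops_decls bitops_funcs itri bgj numerical_constants util_sgroup bgj_common
do
 cp src/util/$f.fh src/include/
done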

Then build, using e.g.
export LARGE_FILES=TRUE
export TCGRSH=/usr/bin/ssh
export NWCHEM_TOP=`pwd`
export NWCHEM_TARGET=LINUX64
export NWCHEM_MODULES="all python"
export PYTHONVERSION=2.7
export PYTHONHOME=/usr
export BLASOPT="-L/opt/openblas/lib -lopenblas"
export USE_MPI=y
export USE_MPIF=y
export USE_MPIF4=y
export MPI_LOC=/usr/lib/openmpi/lib
export MPI_INCLUDE=/usr/lib/openmpi/include
export LIBRARY_PATH=$LIBRARY_PATH:/usr/lib/openmpi/lib:/opt/openblas/lib
export LIBMPI="-lmpi -lopen-rte -lopen-pal -ldl -lmpi_f77 -lpthread"
cd $NWCHEM_TOP/src
make nwchem_config
make FC=gfortran 1>make.log 2>make.err
export FC=gfortran
cd ../contrib
./getmem.nwchem

12 February 2013

337. Modifying Nwchem 6.1.1 to work with GabEdit

Karol Strutynski left the following comment on a post about NWChem and Gabedit:

Hello,
I have one important comment:
The vectors coefficients in the nwchem output are incomplete!
The default behaviour of nwchem is to print 10 first coefficients with value bigger than 0.15. For systems with many atoms it is not enough, usually its not even close.

This behaviour is hard-coded in the nwchem source.
To change this you must search each instance of movecs_print_anal in the source code and replace 0.15d0 for smaller value in appropriate calls.
Furthermore you must change one loop in the src/ddscf/movecs_pr_anal.F file and around 200 line there will be loop:
do klo = 0, min(n-1,9), 2
You must increase the range of this loop, for something more reasonable like:
do klo = 0, min(n-1,199), 2

After recompiling the nwchem will print more coefficients and the gabedit will produce more reliable orbitals.

Best regards,
Karol Strutynski

So let's modify NWChem. I'll be modifying the 27th of June release of NWChem 6.1.1, which you'll obtain as Nwchem-6.1.1-src.2012-06-27.tar.gz from http://www.nwchem-sw.org/index.php/Download.


Change the 0.15d0 threshold (highlighted in red in the original post) to something smaller (I tried 0.01d0) in the following files:
 /src/ddscf/uhf.F
 146  9611    continue
 147          call movecs_print_anal(basis, ilo, ihi, 0.15d0, g_movecs,
 148      $        'UHF Final Alpha Molecular Orbital Analysis',
 149      $        .true., dbl_mb(k_eval), oadapt, int_mb(k_irs),
 150      $        .true., dbl_mb(k_occ))
 151          call movecs_print_anal(basis, ilo, ihi, 0.15d0, g_movecs(2),
 152      $        'UHF Final Beta Molecular Orbital Analysis',
 153      $        .true., dbl_mb(k_eval+nbf), oadapt, int_mb(k_irs+nmo),
 154      $        .true., dbl_mb(k_occ+nbf))

/src/ddscf/scf_vec_guess.F
506          if (scftype.eq.'RHF' .or. scftype.eq.'ROHF') then
507             call movecs_print_anal(basis, 1,
508      &           nprint, 0.15d0, g_movecs,
509      &           'ROHF Initial Molecular Orbital Analysis',
510      &           .true., dbl_mb(k_eval), oadapt, int_mb(k_irs),
511      &           .true., dbl_mb(k_occ))
512          else
513             nprint = min(nalpha+20,nmo)
514             call movecs_print_anal(basis, max(1,nbeta-20),
515      &           nprint, 0.15d0, g_movecs,
516      &           'UHF Initial Alpha Molecular Orbital Analysis',
517      &           .true., dbl_mb(k_eval), oadapt, int_mb(k_irs),
518      &           .true., dbl_mb(k_occ))
519             call movecs_print_anal(basis, max(1,nbeta-20),
520      &           nprint, 0.15d0, g_movecs(2),
521      &           'UHF Initial Beta Molecular Orbital Analysis',
522      &           .true., dbl_mb(k_eval+nbf), oadapt, int_mb(k_irs+nmo),
523      &           .true., dbl_mb(k_occ+nbf))

/src/ddscf/rohf.F
155          endif
156          call movecs_print_anal(basis, ilo, ihi, 0.15d0, g_movecs,
157      $        'ROHF Final Molecular Orbital Analysis',
158      $        .true., dbl_mb(k_eval), oadapt, int_mb(k_irs),
159      $        .true., dbl_mb(k_occ))

/src/mcscf/mcscf.F
680       if (util_print('final vectors analysis', print_default))
681      $     call movecs_print_anal(basis,
682      $     max(1,nclosed-10), min(nbf,nclosed+nact+10),
683      $     0.15d0, g_movecs, 'Analysis of MCSCF natural orbitals',
684      $     .true., dbl_mb(k_evals), .true., int_mb(k_sym),
685      $     .true., dbl_mb(k_occ))
686 c

/src/nwdft/scf_dft_cg/dft_cg_solve.F
166           call movecs_fix_phase(g_movecs(ispin))
167           call movecs_print_anal(basis, ilo, ihi, 0.15d0,
168      &         g_movecs(ispin),blob,
169      &         .true., dbl_mb(k_eval+(ispin-1)*nbf),
170      &         oadapt, int_mb(k_irs+(ispin-1)*nbf),
171      &         .true., dbl_mb(k_occ+(ispin-1)*nbf))
172         enddo

/src/nwdft/scf_dft/dft_scf.F
1736             call movecs_print_anal(ao_bas_han, ilo, ihi, 0.15d0,
1737      &           g_movecs(ispin),
1738      &           blob,
1739      &           .true., dbl_mb(k_eval(ispin)), oadapt,
1740      &           int_mb(k_ir+(ispin-1)*nbf_ao),
1741      &           .true., dbl_mb(k_occ+(ispin-1)*nbf_ao))

/src/nwdft/scf_dft/dft_mxspin_ovlp.F
186       call movecs_print_anal(basis,int_mb(k_non),int_mb(k_non)
187      & ,0.15d0,g_alpha,'Alpha Orbitals without Beta Partners',
188      &   .false., 0.0 ,.false., 0 , .false., 0 )
189 c
190       if (nct.GE.2) then
191       do i = 2,nct
192       ind = int_mb(k_non+i-1)
193       call movecs_print_anal(basis,ind,ind
194      & ,0.15d0,g_alpha,' ',
195      &   .false., 0.0 ,.false., 0 , .false., 0 )
196       enddo
197       endif

352 c
353        call movecs_print_anal(basis, 1, nalp, 0.15d0, g_ualpha,
354      & 'Alpha Orb. w/o Beta Partners (after maxim. alpha/beta overlap)',
355      &   .false., 0.0 ,.false., 0 , .false., 0 )
356 c
Otherwise one could presumably edit the header in ./src/ddscf/movecs_pr_anal.F directly and substitute thresh. At a minimum you should edit that file according to Karol's instructions: change the 9 in the loop below to e.g. 199.

/src/ddscf/movecs_pr_anal.F
198             do klo = 0, min(n-1,9), 2
199                khi = min(klo+1,n-1)
200                write(LuOut,2) (
201      $              int_mb(k_list+k)+1,
202      $              dbl_mb(k_vecs+int_mb(k_list+k)),
203      $              (byte_mb(k_tags+int_mb(k_list+k)*16+m),m=0,15),
204      $              k = klo,khi)
205  2             format(1x,2(i5,2x,f12.6,2x,16a1,4x))
206             enddo
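
If you'd rather not edit every file by hand, a blunt sed sketch does the same threshold substitution -- note that it replaces every 0.15d0 in the listed files, so run grep -n '0.15d0' on them first to convince yourself that's what you want:

cd $NWCHEM_TOP/src
sed -i 's/0\.15d0/0.01d0/g' ddscf/uhf.F ddscf/scf_vec_guess.F ddscf/rohf.F mcscf/mcscf.F nwdft/scf_dft_cg/dft_cg_solve.F nwdft/scf_dft/dft_scf.F nwdft/scf_dft/dft_mxspin_ovlp.F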

Compilation
At this point you should be able to follow post 242. Briefly: Compiling NWChem 6.1.1 with Python on Debian Testing (Wheezy) and compile nwchem with python etc. Don't forget to edit /src/config/makefile.h for python support as shown in that post. Once you're done with that you can compare the GabEdit plots with and without the modification.

Alternatively, if you're simply making changes to a copy of nwchem that you've compiled before, you can speed things up by a factor of ca 300 by following this post:
http://verahill.blogspot.com.au/2013/04/380-modifying-nwchem-code-without-full.html



The difference:
I ran a job on benzene as described in post 281. Visualising NWChem output with GabEdit. I chose to use the ELF (electron localisation function) on output from the unmodified and modified nwchem binaries. It's a pretty big difference:

Original

Modified

07 February 2013

334. Compiling nwchem with openmpi and python on Arch linux

Here's the reason why I gave my virtual machine 30 Gb in post 333 -- to be able to evaluate whether I can figure out how to build all the software that I need on Arch.

Behold my surprise when I realised that there's no need for separate -dev packages as there is on Debian, i.e. the headers are generally installed together with the package (so e.g. python is enough -- you don't need python-dev as well).

While debian is probably the best choice for my nodes (I want stability -- not the latest flashiest stuff), nwchem is a good test case since I've been playing with it for years, and it's not available in the pacman or AUR repos.

Your mileage with openblas will vary depending on your hardware. ACML is an alternative on e.g. FX-8150. ATLAS doesn't seem to work with NWChem when I try it, but I'm not sure what I'm doing wrong. See the original post for examples on how to link to other math libs.

I'm mainly looking at this post: http://verahill.blogspot.com.au/2012/09/briefly-compiling-nwchem-611-with.html


Dependencies:
pacman -S wget base-devel gcc-fortran tcsh openmpi

Openblas:
Download from http://github.com/xianyi/OpenBLAS/tarball/v0.1.1

sudo mkdir /opt/openblas
sudo chown $USER /opt/openblas
tar xvf xianyi-OpenBLAS-v0.1.1-0-g5b7f443.tar.gz
cd xianyi-OpenBLAS-e6e87a2/
make all BINARY=64 CC=/usr/bin/gcc FC=/usr/bin/gfortran USE_THREAD=0 INTERFACE64=1 1> make.log 2>make.err
make PREFIX=/opt/openblas install
cp lib*.*  /opt/openblas/lib

Nwchem:
sudo mkdir /opt/nwchem
sudo chown $USER /opt/nwchem
cd /opt/nwchem
wget http://www.nwchem-sw.org/images/Nwchem-6.1.1-src.2012-06-27.tar.gz
tar xvf Nwchem-6.1.1-src.2012-06-27.tar.gz
cd nwchem-6.1.1-src/

Edit nwchem-6.1.1-src/src/config/makefile.h and edit line 1957 as shown in this post.

Then continue:
export LARGE_FILES=TRUE
export TCGRSH=/usr/bin/ssh
export NWCHEM_TOP=`pwd`
export NWCHEM_TARGET=LINUX64
export NWCHEM_MODULES="all python"
export PYTHONVERSION=2.7
export PYTHONHOME=/usr
export BLASOPT="-L/opt/openblas/lib -lopenblas"
export USE_MPI=y
export USE_MPIF=y
export USE_MPIF4=y
export MPI_LOC=/usr/lib/openmpi
export MPI_INCLUDE=/usr/include
export LIBRARY_PATH=$LIBRARY_PATH:/usr/lib/openmpi:/opt/openblas/lib
export LIBMPI="-L/usr/lib/openmpi -lmpi -lopen-rte -lopen-pal -ldl -lmpi_f77 -lpthread"
cd $NWCHEM_TOP/src
make clean
make nwchem_config
make FC=gfortran 1> make.log 2>make.err
export FC=gfortran
cd ../contrib
./getmem.nwchem

Note that some of the locations are a little bit different from debian.

Edit your ~/.bashrc and add:
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/opt/openblas/lib
export PATH=$PATH:/opt/nwchem/nwchem-6.1.1-src/bin/LINUX64
You can now test your new binary by running a job, e.g. co.nw:
title "co nmr" geometry c 0 0 0 o 0 0 1.13 end basis * library "6-311+g*" end dft direct grid fine mult 1 xc HFexch 0.05 slater 0.95 becke88 nonlocal 0.72 vwn_5 1 perdew91 0.81 end task dft optimize
Run:

nwchem co.nw

or

mpirun -n 2 nwchem co.nw

23 January 2013

325. Compiling ECCE 6.4 on Debian Testing

!NOTE! If you provide ECCE with 'localhost' as the hostname, be aware that this will block outside access: http://www.nwchem-sw.org/index.php/Special:AWCforum/st/id858/#post_3178

There's a new release of ECCE 6.4 out which fixes the following bug: http://verahill.blogspot.com.au/2012/11/minor-bug-in-ecce.html


A note on Java: I've only had luck with the openjdk packages -- not with the Oracle/sun java ones (see here: http://verahill.blogspot.com.au/2012/06/building-ecce-on-debian-testingwheezy.html).

This build was tested in a chrooted Testing/Wheezy i.e. I should've caught most of the necessary packages.


Compiling:
Download the source to ecce from http://ecce.emsl.pnl.gov/using/download.shtml, and put it in e.g. ~/tmp

sudo apt-get install bzip2 build-essential autoconf libtool ant pkg-config libgtk2.0-dev libxt-dev csh gfortran openjdk-7-jdk python-dev libjpeg-dev imagemagick xterm
cd ~/tmp
tar xvf ecce-v6.4-src.tar.bz2
cd ecce-v6.4/
export ECCE_HOME=`pwd`
cd build/
./build_ecce
Hit return if xterm was found...

The /home/sandbox/tmp/ecce-v6.4/scripts/sysdir script identifies the build
platform directory as: empty

Because this value is empty no platform-specific parent directory will be
created for ECCE executables, libraries, etc. This works fine unless your
site needs support for multiple platforms.

Finished checking prerequisites for building ECCE.

Do you want to skip these checks for future build_ecce invocations (y/n)? Y
./build_ecce
Xerces built
./build_ecce
Mesa OpenGL built
./build_ecce
wxWidgets built
./build_ecce
running build_ext
wxPython built
./build_ecce
Apache HTTP server built
./build_ecce
Making combined tar file ecce.v6.4.tar
Copying NWChem binary distribution nwchem-6.1.1-binary-rhel5-gcc4.1.2-m64.tar.bz2
Copying NWChem common distribution nwchem-6.1.1-binary-common.tar.bz2
Concatenating install script and combined tar file ecce.v6.4.tar
create_ecce_bin finished
ECCE built and distribution created in /home/sandbox/tmp/ecce-v6.4
cd ../
./install_ecce.v6.4.csh

And see the next section for the installation steps.
The compilation steps take progressively longer and longer, so be patient during the build.

Installing:
Digital ink is cheap, so I'll show the whole process:
./install_ecce.v6.4.csh
Main ECCE installation menu
===========================
1) Help on main menu options
2) Prerequisite software check
3) Full install
4) Full upgrade
5) Application software install
6) Application software upgrade
7) Server install
8) Server upgrade

IMPORTANT: If you are uncertain about any aspect of installing or
running ECCE at your site, please refer to the detailed ECCE
Installation and Administration Guide at
http://ecce.pnl.gov/docs/installation/2864B-Installation.pdf

Hit <return> at prompts to accept the default value in brackets.

Selection: [1] 3

Host name: [beryllium] localhost

Application installation directory: [/home/sandbox/tmp/ecce-v6.4/ecce-v6.4/apps] /home/sandbox/.ecce/apps

Server installation directory: [/home/sandbox/.ecce/server]

ECCE v6.4 will be installed using the settings:

  Installation type: [full install]
  Host name: [localhost]
  Application installation directory: [/home/sandbox/.ecce/apps]
  Server installation directory: [/home/sandbox/.ecce/server]

Are these choices correct (yes/no/quit)? [yes]

Installing ECCE application software in /home/sandbox/.ecce/apps...
  Extracting application distribution...
  Extracting NWChem binary distribution...
  Extracting NWChem common distribution...
  Extracting client WebHelp distribution...
  Configuring application software...
  Configuring NWChem...

Installing ECCE server in /home/sandbox/.ecce/server...
  Extracting data server in /home/sandbox/.ecce/server/httpd...
  Extracting data libraries in /home/sandbox/.ecce/server/data...
  Extracting Java Messaging Server in /home/sandbox/.ecce/server/activemq...
  Configuring ECCE server...

ECCE installation succeeded.

***************************************************************
!! You MUST perform the following steps in order to use ECCE !!

-- Unless only the user 'sandbox' will be running ECCE,
   start the ECCE server as 'sandbox' with:
     /home/sandbox/.ecce/server/ecce-admin/start_ecce_server

-- To register machines to run computational codes, please see
   the installation and compute resource registration manuals at
   http://ecce.pnl.gov/using/installguide.shtml

-- Before running ECCE each user must source an environment
   setup script. For csh/tcsh users add this to ~/.cshrc:
     if ( -e /home/sandbox/.ecce/apps/scripts/runtime_setup ) then
       source /home/sandbox/.ecce/apps/scripts/runtime_setup
     endif
   For sh/bash users, add this to ~/.profile or ~/.bashrc:
     if [ -e /home/sandbox/.ecce/apps/scripts/runtime_setup.sh ]; then
       . /home/sandbox/.ecce/apps/scripts/runtime_setup.sh
     fi
***************************************************************

Instead of following the instructions above I normally do:
echo 'export ECCE_HOME=/home/sandbox/.ecce/apps' >> ~/.bashrc
echo 'PATH=$PATH:/home/sandbox/.ecce/server/ecce-admin/:/home/sandbox/.ecce/apps/scripts/' >> ~/.bashrc
source ~/.bashrc

You can now start ECCE either by running
ecce

or if that complains, do
start_ecce_server

then waiting a little while (10 s), followed by
ecce
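
The same sequence as a one-liner (assuming start_ecce_server and ecce are both on your PATH, as set up above; the 10 s sleep is just a rough grace period while the server starts):
start_ecce_server && sleep 10 && ecce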


Appendix:

Selecting Java version
sudo update-alternatives --config java
There are 7 choices for the alternative java (providing /usr/bin/java).

  Selection    Path                                              Priority   Status
------------------------------------------------------------
  0            /usr/lib/jvm/java-6-openjdk-amd64/jre/bin/java    1061       auto mode
  1            /usr/bin/gij-4.4                                  1044       manual mode
  2            /usr/bin/gij-4.6                                  1046       manual mode
  3            /usr/bin/gij-4.7                                  1047       manual mode
  4            /usr/lib/jvm/j2re1.6-oracle/bin/java              314        manual mode
  5            /usr/lib/jvm/j2sdk1.6-oracle/jre/bin/java         315        manual mode
  6            /usr/lib/jvm/java-6-openjdk-amd64/jre/bin/java    1061       manual mode
* 7            /usr/lib/jvm/java-7-openjdk-amd64/jre/bin/java    1051       manual mode

I've got ECCE running fine with openjdk 7 as well as openjdk 6.

24 November 2012

281. Visualising NWChem output with GabEdit

Update: please read Karol's comment below. I will put a link here once I've written up a post on how to modify nwchem.

Update 2: Here's the post: http://verahill.blogspot.com.au/2013/02/3xx-modifying-nwchem-611-to-work-with.html. The conclusion is that you MUST edit nwchem. Luckily, it's easy.

Original post:
I've never liked Gabedit much (looks a bit dated, tries to do 'too much') -- until today. Suddenly I have a newfound respect for the developer(s) behind it. It doesn't actually try to do 'too much' -- it simply does A LOT, and does it in a pretty transparent way.

Long story short -- you can do things with gabedit which you can't do (easily) with ECCE, and as such it has become an important ally. Besides, it's always nice to have alternatives.

GabEdit is in the Debian repos.

Running your calculations
There are some restrictions:
1. NOTE: you must run your nwchem job with explicit basis sets (i.e. entered as text) -- to do that in ECCE, tick the box as shown in the figure below. If you're running 'pure' nwchem, you (probably) have to cut and paste from the basis set directory -- see e.g. section 7.2 here, and the sketch after this list. It's a minor inconvenience in exchange for access to what GabEdit has to offer.


2. You can only open Single point/Energy calculations i.e. Optimizations won't work. So do a single point calculation on your optimized structure.

3. Also, you need to rename/copy your output file so that it ends with .out -- gabedit won't read it otherwise.
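
To illustrate restriction 1: below is a minimal sketch of the difference between a basis library reference and an explicit (text) basis block in an nwchem input. The exponents/coefficients shown are the standard STO-3G values for hydrogen, but paste yours from the basis set directory rather than copying them from here.

Library reference (gabedit can't use this):
basis
  H library sto-3g
end

Explicit basis (what gabedit needs):
basis
  H S
    3.42525091    0.15432897
    0.62391373    0.53532814
    0.16885540    0.44463454
end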

GabEdit
It's fairly straightforward -- just point and click. One thing you will want to play with is the iso-surface settings; the defaults are rarely good.

Anyway, I'll let the screenshots do the talking:

Go straight to the Output viewer -- Geometry/Orbital/Density
Click on the M, or right-click anywhere in the window, and load your renamed nwchem output file.


Here's triplet oxygen. The alpha, beta orbitals are listed in the right window

You can do electron localisation

Look at spin density (the unpaired electrons are in the anti-bonding pi orbitals)

Contour plots are neat -- here showing spin density

Electrostatic potential. 


There's a lot to explore. GabEdit can obviously also prepare and submit jobs, but I'm happy with ECCE in this respect, and content with using GabEdit for post-processing.

31 October 2012

273. NWChem and COSMO: custom radii

There are two approaches described in the nwchem manual for using custom radii in COSMO calculations:

geometry
  H  0 0 0
  H  0 1 0
  O  1 0 0
end
cosmo
       radius 1.1
              1.1
              1.8
end

and
cosmo
end
set cosmo:map custom.par

where custom.par looks like this:
H 1.1
O 1.8
The downside to the first example is that it's a PITA to use -- you need to enter each vdw radius in the same order as the atoms appear in the geometry section. That means that for a 50-atom geometry you need to enter 50 values, even if all 50 atoms are the same element.

The downside with the second example is that you need to create the run folder and put custom.par in it before you can submit the job.

An easier approach is to create the custom.par on the fly using task shell:
task shell "echo -e 'H 1.1 \n O 1.8' > mycosmopars.par"
cosmo
end
set cosmo:map mycosmopars.par
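
For context, here's a minimal sketch of how this fits into a complete input file -- the geometry, basis and theory level are just placeholders. Note that the task shell line must come before the energy task so that the parameter file exists by the time COSMO reads it:

title "cosmo with custom radii"
geometry
  O  0.000  0.000  0.000
  H  0.000  0.757  0.587
  H  0.000 -0.757  0.587
end
basis
  * library 6-31g*
end
task shell "echo -e 'H 1.1 \n O 1.8' > mycosmopars.par"
cosmo
end
set cosmo:map mycosmopars.par
task dft energy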

30 October 2012

272. Compiling NWChem 6.1.1.1 on ROCKS 5.4.3/CentOS 5.6

Nothing weird about this one -- it's all but identical to the build on debian -- but here's a step-by-step anyway to help those who are computational chemists, but not sysadmins.

Preparations:
First compile openblas according to http://verahill.blogspot.com.au/2012/05/building-nwchem-61-on-debian.html 

Next, create e.g. /share/apps/nwchem, like this
sudo mkdir /share/apps/nwchem
sudo chmod 755 /share/apps/nwchem

This gives you (the owner) read, write and execute permissions, while group members and 'world' can read and execute, but not write.
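
You can check that the permissions came out as intended with ls -ld (the listing should start with drwxr-xr-x, i.e. owner rwx, group r-x, world r-x):
ls -ld /share/apps/nwchem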

If you've already built earlier versions of nwchem you can skip the steps above.

NWChem:
You will need to go to http://www.nwchem-sw.org/index.php/Download and download version 6.1.1. Using the direct link (http://www.nwchem-sw.org/images/Nwchem-6.1.1-src.2012-06-27.tar.gz) with wget isn't working for me anymore.

Put your Nwchem-6.1.1-src.2012-06-27.tar.gz in /share/apps/nwchem and expand it.
tar xvf Nwchem-6.1.1-src.2012-06-27.tar.gz
cd nwchem-6.1.1-src/

Create buildconf.sh
export LARGE_FILES=TRUE
export TCGRSH=/usr/bin/ssh
export NWCHEM_TOP=`pwd`
export NWCHEM_TARGET=LINUX64
export NWCHEM_MODULES="all python"
export PYTHONHOME=/opt/rocks
export PYTHONVERSION=2.4
export USE_MPI=y
export USE_MPIF=y
export USE_MPIF4=y
export MPI_LOC=/opt/openmpi
export MPI_INCLUDE=/opt/openmpi/include
export LIBRARY_PATH=$LIBRARY_PATH:/opt/openmpi/lib:/share/apps/openblas
export LIBMPI="-lmpi -lopen-rte -lopen-pal -ldl -lmpi_f77 -lpthread"
export BLASOPT="-L/share/apps/openblas/lib -lopenblas -lopenblas_nehalem-r0.1.1 -lopenblas_nehalemp-r0.1.1"
cd $NWCHEM_TOP/src
export FC=gfortran
make clean
make  nwchem_config
make  FC=gfortran |tee make.log
cd ../contrib
./getmem.nwchem

Before running it, edit src/config/makefile.h and change line 1957:
1957      EXTRA_LIBS +=    -lnwcutil  -lpthread -lutil -ldl -lz -lssl
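
If you'd rather script the edit, the sed one-liner below replaces line 1957 wholesale with the version shown above -- but check that the line numbering in your copy of makefile.h matches first:
sed -i '1957s/.*/      EXTRA_LIBS +=    -lnwcutil  -lpthread -lutil -ldl -lz -lssl/' src/config/makefile.h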
You are now ready to build.
time sh buildconf.sh

It took about 15 minutes to build -- a clear improvement over 6.1 for me (30 min+)

Create a default.nwchemrc in your /share/apps/nwchem/nwchem-6.1.1-src/ folder
nwchem_basis_library /share/apps/nwchem/nwchem-6.1.1-src/src/basis/libraries/
ffield amber
amber_1 /share/apps/nwchem/nwchem-6.1.1-src/src/data/amber_s/
amber_2 /share/apps/nwchem/nwchem-6.1.1-src/src/data/amber_x/
amber_3 /share/apps/nwchem/nwchem-6.1.1-src/src/data/amber_q/
amber_4 /share/apps/nwchem/nwchem-6.1.1-src/src/data/amber_u/
amber_5 /share/apps/nwchem/nwchem-6.1.1-src/src/data/custom/
spce /share/apps/nwchem/nwchem-6.1.1-src/src/data/solvents/spce.rst
charmm_s /share/apps/nwchem/nwchem-6.1.1-src/src/data/charmm_s/
charmm_x /share/apps/nwchem/nwchem-6.1.1-src/src/data/charmm_x/
Then each user can do
ln -s /share/apps/nwchem/nwchem-6.1.1-src/default.nwchemrc ~/.nwchemrc

You might also want to add nwchem to path -- add
export PATH=$PATH:/share/apps/nwchem/nwchem-6.1.1-src
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/opt/openmpi/lib:/share/apps/openblas
to your ~/.bashrc

20 September 2012

243. My own personal benchmarks for NWChem, gromacs with atlas, openblas, acml on AMD and intel

Update: you can compile against acml on intel as well, and against mkl on amd. I still need to do some performance testing to see how well that works. The artificial penalty of running mkl on AMD is well-publicised and led to a lawsuit, but I don't know how acml performs on intel.


The title says it all, really. Since I'm back to exploring ways of improving performance for my little cluster I figured I'd break this out as a separate post. Most of this data was found here before: http://verahill.blogspot.com.au/2012/09/new-compute-node-using-amd-fx-8150.html

All units are running up-to-date debian testing (wheezy).

Configuration:
Boron (B): Phenom II X6 2.8 GHz, 8Gb RAM (2.8*6=16.8 GFLOPS predicted)
Neon (Ne): FX-8150 X8 3.6 GHz, 16 Gb RAM (3.6*8=28.8 GFLOPS predicted (int), 3.6*4=14.4 GFLOPS (fpu))
Tantalum (Ta): Quadcore i5-2400 3.1 GHz, 8 Gb RAM (3.1*4=12.4 GFLOPS predicted)
Vanadium (V):  Dual socket 2x Quadcore Xeon X3480 3.06 GHz, 8Gb. CentOS (ROCKS 5.4.3)/openblas.

Results

Gromacs --double (1 ns 6x6x6 nm tip4p water box; dynamic load balancing, double precision, 500k steps)
B  :  10.662 ns/day (11.8  GFLOPS, runtime 8104 seconds)***
B  :    9.921 ns/day ( 10.9 GFLOPS, runtime 8709 seconds)**
Ne:  10.606 ns/day (11.7  GFLOPS, runtime 8146 seconds) *
Ne:  12.375 ns/day (13.7  GFLOPS, runtime 6982 seconds)**
Ne:  12.385 ns/day (13.7  GFLOPS, runtime 6976 seconds)****
Ta:  10.825 ns/day (11.9  GFLOPS, runtime 7981 seconds)***
V :   10.560 ns/day (11.7  GFLOPS, runtime 8182 seconds)***
*no external blas/lapack.
**using ACML libs
*** using openblas
**** using ATLAS

Gromacs --single (1 ns 6x6x6 nm tip4p water box; dynamic load balancing, single precision, 500 k steps)
B  :   17.251 ns/day (19.0 GFLOPS, runtime 5008 seconds)***
Ne:   21.874 ns/day (24.2 GFLOPS, runtime  3950 seconds)**
Ne:   21.804 ns/day (24.1 GFLOPS, runtime 3963  seconds)****
Ta:   17.345 ns/day (19.2 GFLOPS, runtime  4982 seconds)***
V :   17.297 ns/day (19.1 GFLOPS, runtime 4995 seconds)***
*no external blas/lapack.
**using ACML libs
*** using openblas
**** using ATLAS

NWChem (opt biphenyl cation, cp-md/pspw):
B  :   5951 seconds**
B  :   4084 seconds ***
B  :   5782 seconds ***xy
Ne:    3689 seconds**
Ta :   4102 seconds***
Ta :   4230 seconds***xy
V :    5396 seconds***

*no external blas/lapack.
**using ACML libs
*** using openblas
x Reconfigured using getmem.nwchem
y NWChem 6.1.1

NWChem (opt biphenyl cation, geovib, 6-31G**/ub3lyp):
B  :  2841 seconds **
B  :  2410 seconds***
B  :  2101 seconds ***x
B  :  2196 seconds ***xy
Ne: 1665 seconds **
Ta : 1785 seconds***
Ta : 1789 seconds***xy
V  : 2600 seconds***

*no external blas/lapack.
**using ACML libs
*** using openblas
x Reconfigured using getmem.nwchem
y NWChem 6.1.1

A Certain Commercial Ab Initio Package (Freq calc of pre-optimised H14C19O3 at 6-31+G*/rb3lyp):
B  :    2h 00 min (CPU time 10h 37 min)
Ne:   1h 37 min (CPU time: 11h 13 min)
Ta:   1h 26 min (CPU time: 5h 27 min)
V  :   2h 15 min (CPU time 15h 50 min)
Using precompiled binaries.


Gamess:
(I'm still working on learning how to run gamess efficiently, so take these values with a huge saucer of salt for now). bn.inp does a geometry optimisation of a biphenyl cation (mult 2) at ub3lyp/6-31G**. bn.inp has no $STATPT card while bn3.inp does, and it makes a huge difference -- but is this because it does 20 steps (NSTEP=20) and then kills the run? The default is 50 steps, and it does seem like all the runs do the maximum number of steps and then exit.

Again, still learning. See below for input files. I will fix this post as I learn what the heck I'm doing. The relative run times on each node are still comparable, but don't use the numbers to compare the run speed of e.g. nwchem vs gamess.

Gamess using bn.inp with atlas
B:    9079 seconds
Ne: 7252 seconds
Ta:  9283 seconds

Gamess using bn.inp with openblas
B:   9071 seconds
Ta: 9297 seconds

Gamess using bn.inp with acml
Ne: 7062 seconds

Gamess using bn3.inp with atlas. 
B: 4016 seconds
Ne: 3162 seconds
Ta: 4114 seconds

MPQC:
Here I've used the version in the debian repos. I've created a hostfile
neon slots=8 max_slots=8
tantalum slots=4 max_slots=4
boron slots=6 max_slots=6

and then just looked at changing the order and slots assignment as well as total number of cores assigned using mpirun.
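
A representative invocation looks something like the line below, assuming the hostfile above was saved as ~/hostfile and the input file is input.in (both names are placeholders):
mpirun --hostfile ~/hostfile -np 12 mpqc input.in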

Simple test case looking at number of cores/distribution:
n cores : seconds : configuration (cores, exec nodes)
 4      :  11     : 4(Ta)
 4      :  17     : 4(Ne)
 4      :  17     : 4(B)
 4      :  42     : 2(Ta)+2(B)
 6      :  12     : 6(B)
 6      :  13     : 6(Ne)
 6      :  74     : 2(Ta)+2(B)+2(Ne)
 8      :  12     : 8(Ne)
10      :  43     : 4(Ta)+6(B)
12      :  47     : 4(Ta)+8(Ne)
14      :  55     : 6(B)+8(Ne)
18      : 170     : 4(Ta)+6(B)+8(Ne)

My beowulf cluster doesn't seem to be much of a supercomputer. All in all, this looks like a pretty good argument in favour of upgrading to infiniband...


bn.inp:
 $CONTRL 
COORD=CART UNITS=ANGS scftyp=uhf dfttyp=b3lyp runtyp=optimize 
ICHARG=1 MULT=2 maxit=100
$END
 $system mwords=2000 $end
 $BASIS gbasis=n31 ngauss=6 ndfunc=1 npfunc=1 $END
 $guess guess=huckel $end

 $DATA
biphenyl
C1
C      6.0      0.0000000000   -3.5630100000    0.0000000000 
C      6.0     -1.1392700000   -2.8592800000   -0.3938400000 
C      6.0     -1.1387900000   -1.4654500000   -0.3941500000 
C      6.0      0.0000000000   -0.7428100000    0.0000000000 
C      6.0      1.1387900000   -1.4654500000    0.3941500000 
C      6.0      1.1392700000   -2.8592800000    0.3938400000 
C      6.0      0.0000000000    0.7428100000    0.0000000000 
C      6.0      1.1387900000    1.4654500000   -0.3941500000 
C      6.0      1.1392700000    2.8592800000   -0.3938400000 
C      6.0     -1.1387900000    1.4654500000    0.3941500000 
C      6.0      0.0000000000    3.5630100000    0.0000000000 
C      6.0     -1.1392700000    2.8592800000    0.3938400000 
H      1.0      0.0000000000   -4.6489600000    0.0000000000 
H      1.0     -2.0282700000   -3.3966200000   -0.7116100000 
H      1.0     -2.0214800000   -0.9282700000   -0.7279300000 
H      1.0      2.0282700000   -3.3966200000    0.7116100000 
H      1.0      2.0282700000    3.3966200000   -0.7116100000 
H      1.0     -2.0214800000    0.9282700000    0.7279300000 
H      1.0      0.0000000000    4.6489600000    0.0000000000 
H      1.0     -2.0282700000    3.3966200000    0.7116100000 
H      1.0      2.0214800000    0.9282700000   -0.7279300000 
H      1.0      2.0214800000   -0.9282700000    0.7279300000 
 $END


bn3.inp:
 $CONTRL 
COORD=CART UNITS=ANGS scftyp=uhf dfttyp=b3lyp runtyp=optimize 
ICHARG=1 MULT=2 maxit=100
$END
 $system mwords=2000 $end
 $BASIS gbasis=n31 ngauss=6 ndfunc=1 npfunc=1 $END
 $STATPT OPTTOL=0.0001 NSTEP=20 HSSEND=.TRUE. $END
 $guess guess=huckel $end

 $DATA
biphenyl
C1
C      6.0      0.0000000000   -3.5630100000    0.0000000000 
C      6.0     -1.1392700000   -2.8592800000   -0.3938400000 
C      6.0     -1.1387900000   -1.4654500000   -0.3941500000 
C      6.0      0.0000000000   -0.7428100000    0.0000000000 
C      6.0      1.1387900000   -1.4654500000    0.3941500000 
C      6.0      1.1392700000   -2.8592800000    0.3938400000 
C      6.0      0.0000000000    0.7428100000    0.0000000000 
C      6.0      1.1387900000    1.4654500000   -0.3941500000 
C      6.0      1.1392700000    2.8592800000   -0.3938400000 
C      6.0     -1.1387900000    1.4654500000    0.3941500000 
C      6.0      0.0000000000    3.5630100000    0.0000000000 
C      6.0     -1.1392700000    2.8592800000    0.3938400000 
H      1.0      0.0000000000   -4.6489600000    0.0000000000 
H      1.0     -2.0282700000   -3.3966200000   -0.7116100000 
H      1.0     -2.0214800000   -0.9282700000   -0.7279300000 
H      1.0      2.0282700000   -3.3966200000    0.7116100000 
H      1.0      2.0282700000    3.3966200000   -0.7116100000 
H      1.0     -2.0214800000    0.9282700000    0.7279300000 
H      1.0      0.0000000000    4.6489600000    0.0000000000 
H      1.0     -2.0282700000    3.3966200000    0.7116100000 
H      1.0      2.0214800000    0.9282700000   -0.7279300000 
H      1.0      2.0214800000   -0.9282700000    0.7279300000 
 $END