Friday, September 25, 2009

Quantum Espresso: a nice drink!


Hello,

Just now, I installed Quantum Espresso 4.1 on an IBM supercomputer [POWER6, 32 cores/node, SuSE 10, 64-bit ...]

For that, I was forced to install these supporting packages locally, since I couldn't locate all the nasty supporting libraries on the system (a minimal install sketch follows the list):


1. MPICH2 [to fetch some static libraries !]
2. fftw3 - [ditto.]
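
Roughly, the two local builds went like the sketch below (from memory; the --prefix directories are the ones referenced later in make.sys, while the source directory names are just placeholders):

cd mpich2-1.x        # MPICH2 source tree (placeholder name)
./configure --prefix=/home/mohan/Learning/mpich/allbin
make && make install

cd ../fftw-3.x       # FFTW3 source tree (placeholder name)
./configure --prefix=/home/mohan/Learning/fftw/allfft
make && make install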


Compilation was OK; the main flags are shown below [from make.sys]:
---------------------------------------------------------------
mohan@p6012 espresso-4.1> more make.sys
# make.sys. Generated from make.sys.in by configure.

# compilation rules

.SUFFIXES :
.SUFFIXES : .o .c .f .f90

# most fortran compilers can directly preprocess c-like directives: use
# $(MPIF90) $(F90FLAGS) -c $<
# if explicit preprocessing by the C preprocessor is needed, use:
# $(CPP) $(CPPFLAGS) $< -o $*.F90
# $(MPIF90) $(F90FLAGS) -c $*.F90 -o $*.o
# remember the tabulator in the first column !!!

.f90.o:
	$(MPIF90) $(F90FLAGS) -c $<

# .f.o and .c.o: do not modify

.f.o:
	$(F77) $(FFLAGS) -c $<

.c.o:
	$(CC) $(CFLAGS) -c $<

# DFLAGS = precompilation options (possible arguments to -D and -U)
# used by the C compiler and preprocessor
# FDFLAGS = as DFLAGS, for the f90 compiler
# See include/defs.h.README for a list of options and their meaning
# With the exception of IBM xlf, FDFLAGS = $(DFLAGS)
# For IBM xlf, FDFLAGS is the same as DFLAGS with separating commas

DFLAGS = -D__XLF -D__FFTW3 -D__MASS -D__MPI -D__PARA
FDFLAGS = -D__XLF,-D__FFTW3,-D__MASS,-D__MPI,-D__PARA

# IFLAGS = how to locate directories where files to be included are
# In most cases, IFLAGS = -I../include

IFLAGS = -I../include -I/home/mohan/Learning/mpich/allbin/include

# MODFLAGS = flag used by f90 compiler to locate modules
# You need to search for modules in ./, in ../iotk/src, in ../Modules
# Some applications also need modules in ../PW and ../PH

MODFLAGS = -I./ -I../Modules -I../iotk/src \
-I../PW -I../PH -I../EE -I../GIPAW

# Compilers: fortran-90, fortran-77, C
# If a parallel compilation is desired, MPIF90 should be a fortran-90
# compiler that produces executables for parallel execution using MPI
# (such as for instance mpif90, mpf90, mpxlf90,...);
# otherwise, an ordinary fortran-90 compiler (f90, g95, xlf90, ifort,...)
# If you have a parallel machine but no suitable candidate for MPIF90,
# try to specify the directory containing "mpif.h" in IFLAGS
# and to specify the location of MPI libraries in MPI_LIBS

MPIF90 = xlf90_r
#F90 = xlf90_r
CC = xlc_r
F77 = xlf_r

# C preprocessor and preprocessing flags - for explicit preprocessing,
# if needed (see the compilation rules above)
# preprocessing flags must include DFLAGS and IFLAGS

CPP = cpp
CPPFLAGS = -P -traditional $(DFLAGS) $(IFLAGS)

# compiler flags: C, F90, F77
# C flags must include DFLAGS and IFLAGS
# F90 flags must include MODFLAGS, IFLAGS, and FDFLAGS with appropriate syntax

CFLAGS = -O3 $(DFLAGS) $(IFLAGS)
F90FLAGS = $(FFLAGS) -qfree=f90 -WF,$(FDFLAGS) $(IFLAGS) $(MODFLAGS)
FFLAGS = -O4 -qsuffix=cpp=f90 -qdpc -qalias=nointptr -Q

# compiler flags without optimization for fortran-77
# the latter is NEEDED to properly compile dlamch.f, used by lapack

FFLAGS_NOOPT = -O0

# Linker, linker-specific flags (if any)
# Typically LD coincides with F90 or MPIF90, LD_LIBS is empty

LD = xlf90_r
LDFLAGS =
LD_LIBS =

# External Libraries (if any) : blas, lapack, fft, MPI

# If you have nothing better, use the local copy : ../flib/blas.a

BLAS_LIBS = /home/mohan/Learning/Pwscf/BLAS/libblas.a

# The following lapack libraries will be available in flib/ :
# ../flib/lapack.a : contains all needed routines
# ../flib/lapack_atlas.a: only routines not present in the Atlas library
# For IBM machines with essl (-D__ESSL): load essl BEFORE lapack !
# remember that LAPACK_LIBS precedes BLAS_LIBS in loading order

LAPACK_LIBS = ../flib/lapack.a

# nothing needed here if the the internal copy of FFTW is compiled
# (needs -D__FFTW in DFLAGS)

FFT_LIBS = /home/mohan/Learning/fftw/allfft/lib/libfftw3.a

# For parallel execution, the correct path to MPI libraries must
# be specified in MPI_LIBS (except for IBM if you use mpxlf)

MPI_LIBS = /home/mohan/Learning/mpich/allbin/lib/libmpich.a /home/mohan/Learning/mpich/allbin/lib/libfmpich.a /home/mohan/Learning/mpich/allbin/lib/libmpe_f2cmpi.a

# IBM-specific: MASS libraries, if available and if -D__MASS is defined in FDFLAGS

MASS_LIBS = -lmassvp4_64 -lmass_64

# pgplot libraries (used by some post-processing tools)

PGPLOT_LIBS =

# ar command and flags - for most architectures: AR = ar, ARFLAGS = ruv
# ARFLAGS_DYNAMIC is used in iotk to produce a dynamical library,
# for Mac OS-X with PowerPC and xlf compiler. In all other cases
# ARFLAGS_DYNAMIC = $(ARFLAGS)

AR = ar
ARFLAGS = ruv
ARFLAGS_DYNAMIC= ruv

# ranlib command. If ranlib is not needed (it isn't in most cases) use
# RANLIB = echo

RANLIB = ranlib

# all internal and external libraries - do not modify

LIBOBJS = ../flib/ptools.a ../flib/flib.a ../clib/clib.a ../iotk/src/libiotk.a ../Multigrid/mglib.a
LIBS = $(LAPACK_LIBS) $(BLAS_LIBS) $(FFT_LIBS) $(MPI_LIBS) $(MASS_LIBS) $(PGPLOT_LIBS) $(LD_LIBS)

------------------------------- END --------------------------

Furthermore, I did make pwall

and it was a smooth compilation, albeit a long one [1-2 hrs!].
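
After make pwall finished, a quick sanity check (just my habit) is to confirm that the executables used later in this post actually showed up:

ls bin/     # should contain at least pw.x and ph.x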

Note:

Their 'configure' script is fine! But IF its log prints,

...
Parallel environment not detected (is this a parallel machine?).
Configured for compilation of serial executables.

...

then you should worry, because it will build only serial [1-CPU] binaries!!!

That's why I made a local MPICH2 installation and ran

mohan@p6012 espresso-4.1> ./configure BLAS_LIBS=/home/mohan/Learning/Pwscf/BLAS/libblas.a MPI_LIBS="/home/mohan/Learning/mpich/allbin/lib/libmpich.a /home/mohan/Learning/mpich/allbin/lib/libfmpich.a /home/mohan/Learning/mpich/allbin/lib/libmpe_f2cmpi.a"

to get a happy log,

...
Parallel environment detected successfully.
Configured for compilation of parallel executables.

...
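Besides the configure log, another quick check is make.sys itself; the MPI preprocessor flags should be present (compare with the DFLAGS/FDFLAGS lines listed above):

grep DFLAGS make.sys    # expect -D__MPI -D__PARA; here MPI comes from MPI_LIBS rather than an mpif90 wrapper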

NOTE:

If you don't find something like the following in your QE output:
-------------------------------------------
Program PWSCF v.4.1 starts ...
Today is 25Sep2009 at 10: 1:41

Parallel version (MPI)

Number of processors in use: 32
R & G space division: proc/pool = 32

For Norm-Conserving or Ultrasoft (Vanderbilt) Pseudopotentials or PAW


-------------------------------------------

something went wrong, and it's time to figure out what failed in your make/job-submission process (nice exercise, eh?).
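A quick way to check this without scrolling through the whole output (a sketch; substitute your own output file name):

grep "Parallel version (MPI)" ch4.scf.out
grep "Number of processors in use" ch4.scf.out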

Now it's time to comment on QEspresso!



Timing/parallelizing:

Using an example input, co.rx.in [a relaxation calc.], I noted the timings from PWSCF's final 'CPU time / wall time' line (in an interactive session, e.g.: mpirun -np 8 pw.x -in co.rx.in):

 np    CPU time    wall time
 32     2.48 s      30.31 s
 16     2.74 s      10.62 s
  8     3.35 s       8.31 s
  4     4.46 s       8.68 s
  2     6.49 s       8.82 s
  1    10.55 s      12.88 s

(Also, please see the picture on the top of this post)
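The scan can be reproduced with a simple loop along these lines (a sketch; the output file naming is just my choice, and pw.x prints the 'CPU time / wall time' line at the end of each run):

for np in 1 2 4 8 16 32 ; do
    mpirun -np $np pw.x -in co.rx.in > co.rx.out.$np
    grep "wall time" co.rx.out.$np
done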

Normal modes and Symmetry

Since my work is really based on the normal modes of CH4, I was curious about a vibrational calculation for it (see QEspresso's example files ch4.scf.in and ch4.nm.in).

MAIN OUTPUTS:

omega( 1) = -0.006518 [THz] = -0.217422 [cm-1]
omega( 2) = -0.006372 [THz] = -0.212561 [cm-1]
omega( 3) = -0.006327 [THz] = -0.211048 [cm-1]
omega( 4) = 0.954034 [THz] = 31.823361 [cm-1]
omega( 5) = 0.954131 [THz] = 31.826612 [cm-1]
omega( 6) = 0.954171 [THz] = 31.827933 [cm-1]
omega( 7) = 36.532096 [THz] = 1218.587644 [cm-1]
omega( 8) = 36.532102 [THz] = 1218.587842 [cm-1]
omega( 9) = 36.532103 [THz] = 1218.587893 [cm-1]
omega(10) = 43.470865 [THz] = 1450.041624 [cm-1]
omega(11) = 43.470871 [THz] = 1450.041803 [cm-1]
omega(12) = 87.787181 [THz] = 2928.284610 [cm-1]
omega(13) = 91.590826 [THz] = 3055.161407 [cm-1]
omega(14) = 91.590828 [THz] = 3055.161485 [cm-1]
omega(15) = 91.590833 [THz] = 3055.161645 [cm-1]

___________________________________________________

Mode symmetry, T_d (-43m) point group:

omega( 7 - 9) = 1218.6 [cm-1] --> T_2 G_15 P_4 I+R
omega( 10 - 11) = 1450.0 [cm-1] --> E G_12 P_3 R
omega( 12 - 12) = 2928.3 [cm-1] --> A_1 G_1 P_1 R
omega( 13 - 15) = 3055.2 [cm-1] --> T_2 G_15 P_4 I+R

___________________________________________________

From my experience with plane-wave (PW) codes, this was a SHOCKING result; I never expected such symmetry-adapted output from a PW code.

Congrats QEspresso team !!!

I will post another article on symmetry-adapted PW-DFT calcs

_______________________Bottomline______________________


I faced some problems with the vibrational calculations at the beginning, since the phonon run is basically a restart calculation that reads the scf wavefunctions and k-points from a predefined directory.

So care must be taken at compilation time; otherwise it will tease you :| (i.e., if it fails to read the scf information).
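Concretely, ph.x looks for the wavefunctions under the same prefix and outdir that the scf run wrote to, so the two input files must agree on them. A quick check (a sketch):

grep -i -E "prefix|outdir" ch4.scf.in ch4.nm.in    # both files should point to the same directory and prefix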

I ran the programs:

mpirun -np 32 /home/mohan/Learning/Pwscf/espresso-4.1/bin/pw.x -in ch4.scf.in
mpirun -np 32 /home/mohan/Learning/Pwscf/espresso-4.1/bin/ph.x -in ch4.nm.in

Maybe you have to define the parallel-environment variables (in the job script) like this:



export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/home/mohan/Learning/mpich/allbin/lib:/home/mohan/Learning/fftw/allfft/lib:/home/mohan/Learning/fftw/allfft/bin

export PATH=$PATH:/home/mohan/Learning/mpich/allbin/bin:/home/mohan/Learning/mpich/allbin/lib:/home/mohan/Learning/fftw/allfft/lib:/home/mohan/Learning/fftw/allfft/bin

export MP_HOSTFILE=$HOME/myhostfile
export MP_PROCS=32
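# truncate the hostfile, then list this node's name once per MPI process: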
> $MP_HOSTFILE
for i in `seq $MP_PROCS` ; do
    hostname >> $MP_HOSTFILE
done

mpd &
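# mpd is MPICH2's process manager daemon; it has to be running before mpirun is called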


Please look at http://www2.hlrn.de/doc/espresso/index.html for setting up a job file.