Re: Issue regarding the speed of QM/MM

From: Marcelo C. R. Melo (melomcr_at_gmail.com)
Date: Mon Sep 07 2020 - 13:47:09 CDT

Hi Alexander,
The situation you mention certainly makes sense; I have just never seen it
happen before :)

In practice, oversubscribing CPU cores with NAMD or ORCA has never given me
better results, and the performance drop was always much worse than 50%.
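
As a rough back-of-the-envelope sketch of the break-even point in Alexander's
example below (56 physical cores, 112 jobs, T = the time one job takes when
running alone on a core; the 2.5x row is only an illustrative value, not a
measurement):

  two batches of 56 jobs, 1 job per core:  2.0 * T total
  112 jobs at once, each 1.5x slower:      1.5 * T total  (oversubscribing wins)
  112 jobs at once, each 2.5x slower:      2.5 * T total  (oversubscribing loses)

Oversubscribing only pays off as long as the per-job slowdown stays below 2x.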

Still, I would agree that benchmarking is always the best solution,
particularly in QM/MM, which tends to vary a lot on a "case-by-case" basis.

Best,
Marcelo

On Mon, 7 Sep 2020 at 14:36, Alex Balaeff <abalaeff_at_polarisqb.com> wrote:

> Thanks a lot for your comments, Marcelo. Throwing in my 2 cents (in the
> hope of being corrected if these are the wrong cents :) ): there are
> situations where using every thread makes sense.
>
> For example, say I need to run 112 similar jobs on the CPU cores in
> question, and let's say the performance of 1 job per thread is 50% worse
> than that of 1 job per core.
>
> In that case, option 1 is to run two successive batches of 56 jobs
> each. If a job takes time T, my whole simulation takes 2T.
>
> Option 2 is to run all 112 jobs simultaneously. They will finish in
> 1.5*T -- still better than the 2T timing of option 1??
>
> Best,
>
> Alexander.
>
> On Mon, Sep 7, 2020 at 2:13 PM Marcelo C. R. Melo <melomcr_at_gmail.com>
> wrote:
> >
> > Hi Zhihong,
> >
> > The performance of a QM/MM simulation will (almost always) be determined
> > by the performance of the QM calculation itself. In this case, you are
> > using ORCA to run DFT on 4 CPU cores (by asking for "PAL4").
> >
> > In QM calculations, it is important to know the size of the QM region,
> > that is, how many atoms it contains: 10 atoms? 100 atoms? This makes a
> > gigantic difference in performance.
> >
> > The best bet for you is to balance the number of cores dedicated to NAMD
> with the number of cores dedicated to ORCA, and absolutely never overlap
> the CPU cores for both.
> > Something else that has been discussed extensively on this list is the
> > use of hyperthreading. In your case, since you have two 28-core CPUs, you
> > should allocate a total of no more than 56 processes between NAMD and
> > ORCA. Using all 112 threads will probably lead to terrible performance.
> >
> > I would suggest starting with 10 cores for NAMD and 46 for ORCA. (I am
> assuming based on your performance that you have many atoms in your QM
> region, which will benefit from more CPU cores).
> > You will need to use ORCA's long format for parallelism instead of
> > "PAL4" (see the sketch below), and I see you already have a commented-out
> > line like that in your NAMD config file, asking for 10 cores.
> > Try benchmarking the ratio of NAMD/ORCA CPU cores, and do not exceed 56
> > in total (or maybe 54, to leave a couple of cores for the OS, since you
> > are running on a workstation).
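> >
> > As a minimal sketch of what that 10/46 split could look like (the exact
> > core counts are precisely what you should benchmark; the file names below
> > are just the ones from your original command):
> >
> > # in the NAMD config: drop PAL4 and request ORCA's cores in a %pal block
> > qmConfigLine "! B3LYP 6-31G Grid4 EnGrad TightSCF"
> > qmConfigLine "%%pal nprocs 46 end"
> >
> > # launch NAMD on only the remaining cores:
> > charmrun ++local +p10 +isomalloc_sync namd2 YZZ-config.ORCA-1.namd | tee YZZ-config.ORCA-1.namd.log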
> >
> > Best,
> > Marcelo
> >
> > On Mon, 7 Sep 2020 at 04:42, 辛志宏 <xzhfood_at_njau.edu.cn> wrote:
> >>
> >> Dear all,
> >>
> >> I am running a QM/MM molecular dynamics simulation of an enzyme complex
> >> (298 amino acids, 1 ligand, and 90 thousand water molecules) with NAMD,
> >> but it is very slow: only 25 steps are completed per day (24 hours) of a
> >> minimization run (minimize 100, run 2000). I wonder if there are issues
> >> with the parameters in my config file; any suggestion to improve the
> >> speed of the QM/MM run would be much appreciated.
> >>
> >>
> >> The hardware of my computer (an 8173M workstation) is fine, with 384 GB
> >> of memory and two physical CPUs (28 cores per CPU, 112 threads in total);
> >> the command is as follows:
> >>
> >>
> >> charmrun ++local +p20 +isomalloc_sync namd2 YZZ-config.ORCA-1.namd | tee YZZ-config.ORCA-1.namd.log
> >>
> >>
> >> Thank you in advance.
> >>
> >>
> >> Zhihong Xin,
> >>
> >>
> >>
> >> The config file is as follows:
> >>
> >> ## Single QM region with MM water box
> >>
> >> structure ionized.psf
> >>
> >> coordinates ionized.pdb
> >>
> >> #Continuing a job from the restart files
> >>
> >> if {1} {
> >>
> >> set inputname YZZ_equil_MM
> >>
> >> binCoordinates $inputname.coor
> >>
> >> extendedSystem $inputname.xsc
> >>
> >> }
> >>
> >> cellBasisVector1 64.945 0 0
> >>
> >> cellBasisVector2 0 65.353 0
> >>
> >> cellBasisVector3 0 0 67.919
> >>
> >> cellOrigin 55.318 57.874 55.561
> >>
> >> seed 7910881
> >>
> >> # Output Parameters
> >>
> >> binaryoutput no
> >>
> >> outputname YZZ-QM-min-out
> >>
> >> outputenergies 1
> >>
> >> outputtiming 1
> >>
> >> outputpressure 1
> >>
> >> binaryrestart yes
> >>
> >> dcdfile YZZ-QM-min-out.dcd
> >>
> >> dcdfreq 1
> >>
> >> XSTFreq 1
> >>
> >> restartfreq 100
> >>
> >> restartname YZZ-QM-min-out.restart
> >>
> >> # mobile atom selection:
> >>
> >> constraints on
> >>
> >> consexp 2
> >>
> >> consref YZZ-restraint.pdb
> >>
> >> conskfile YZZ-restraint.pdb
> >>
> >> conskcol B
> >>
> >> constraintScaling 2.0
> >>
> >> # PME Parameters
> >>
> >> PME on
> >>
> >> PMEGridspacing 1
> >>
> >> set temperature 300
> >>
> >> temperature $temperature
> >>
> >> # Thermostat Parameters
> >>
> >> langevin on
> >>
> >> langevintemp $temperature
> >>
> >> langevinHydrogen on
> >>
> >> langevindamping 50
> >>
> >> # Barostat Parameters
> >>
> >> usegrouppressure yes
> >>
> >> useflexiblecell no
> >>
> >> useConstantArea no
> >>
> >> langevinpiston on
> >>
> >> langevinpistontarget 1.01325
> >>
> >> langevinpistonperiod 200
> >>
> >> langevinpistondecay 100
> >>
> >> langevinpistontemp $temperature
> >>
> >> surfacetensiontarget 0.0
> >>
> >> strainrate 0. 0. 0.
> >>
> >> wrapAll on
> >>
> >> wrapWater on
> >>
> >> # Integrator Parameters
> >>
> >> timestep 0.5
> >>
> >> firstTimestep 0
> >>
> >> fullElectFrequency 1
> >>
> >> nonbondedfreq 1
> >>
> >> # Force Field Parameters
> >>
> >> paratypecharmm on
> >>
> >> parameters ../CHARMpars/toppar_all36_carb_glycopeptide.str
> >>
> >> parameters ../CHARMpars/toppar_water_ions_namd.str
> >>
> >> parameters ../CHARMpars/toppar_all36_na_nad_ppi_gdp_gtp.str
> >>
> >> parameters ../CHARMpars/par_all36_carb.prm
> >>
> >> parameters ../CHARMpars/par_all36_cgenff.prm
> >>
> >> parameters ../CHARMpars/par_all36_lipid.prm
> >>
> >> parameters ../CHARMpars/par_all36_na.prm
> >>
> >> parameters ../CHARMpars/par_all36_prot.prm
> >>
> >> parameters ../common/DMP_ABD769.prm
> >>
> >> #printExclusions on
> >>
> >> exclude scaled1-4
> >>
> >> 1-4scaling 1.0
> >>
> >> rigidbonds none
> >>
> >> cutoff 12.0
> >>
> >> pairlistdist 14.0
> >>
> >> switching on
> >>
> >> switchdist 10.0
> >>
> >> stepspercycle 1
> >>
> >> # Turns ON or OFF the QM calculations
> >>
> >> qmForces on
> >>
> >> qmParamPDB "YZZ-namd-QM-0.pdb"
> >>
> >> qmColumn "beta"
> >>
> >> qmBondColumn "occ"
> >>
> >> #Link Atoms
> >>
> >> qmBondDist on
> >>
> >> # Number of simultaneous QM simulations per node
> >>
> >> QMSimsPerNode 20
> >>
> >> QMElecEmbed on
> >>
> >> QMSwitching on
> >>
> >> QMSwitchingType shift
> >>
> >> QMPointChargeScheme none
> >>
> >> QMBondScheme "cs"
> >>
> >> #qmBaseDir "/dev/shm/YZZ-NAMD_MIN"
> >>
> >> # Directory where QM calculations will be run.
> >>
> >> qmBaseDir "/dev/shm/NAMD_Example1"
> >>
> >> ## ORCA
> >>
> >> qmConfigLine "! B3LYP 6-31G Grid4 PAL4 EnGrad TightSCF"
> >>
> >> qmConfigLine "%%output PrintLevel Mini Print\[ P_Mulliken \] 1
> Print\[P_AtCharges_M\] 1 end"
> >>
> >> #qmConfigLine "%%pal nprocs 10 end"
> >>
> >> # construction of ORCA's input file.
> >>
> >> qmMult "1 2"
> >>
> >> qmCharge "1 -1"
> >>
> >> qmSoftware "orca"
> >>
> >> qmExecPath "/home/xzhfood/software/orca_4_1_2_linux_x86-64_openmpi313/orca"
> >>
> >> QMOutStride 1
> >>
> >> QMPositionOutStride 1
> >>
> >> # Number of steps in the QM/MM simulation.
> >>
> >> minimize 100
> >>
> >> run 2000
> >>
> >>
>
>
> --
> -----
> Dr. Alexander Balaeff
> Polaris Quantum Biotech
> www.PolarisQB.com
> (919)-270-5772
>
