From: René Hafner TUK (hamburge_at_physik.uni-kl.de)
Date: Mon Nov 23 2020 - 07:22:36 CST
Dear all,
I am trying to get an (e)ABF simulation running with the multi-copy
algorithm on a multi-GPU node.
I tried the command described in
http://www.ks.uiuc.edu/Research/namd/2.13/notes.html :
charmrun ++local namd2 myconf_file.conf +p16 +replicas 2 +stdout logfile%d.log
I am using the precompiled binaries from the Download page: NAMD 2.13
Linux-x86_64-netlrts-smp-CUDA (Multi-copy algorithms, single process per
copy)
With both NAMD 2.13 and NAMD 2.14 I get the following error:
FATAL ERROR: Number of devices (2) is not a multiple of number of
processes (8). Sharing devices between processes is inefficient.
Specify +ignoresharing (each process uses all visible devices) if not
all devices are visible to each process, otherwise adjust number of
processes to evenly divide number of devices, specify subset of devices
with +devices argument (e.g., +devices 0,2), or multiply list shared
devices (e.g., +devices 0,1,2,0).
Even when I specify +devices 0,1 I get the same error. Why should the
number of devices have to be a multiple of the number of processes at all?
Shouldn't it be the other way around? In my example I want 8 cores + 1 GPU
per replica.
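
To spell out the arithmetic as I read it, here is a small Python sketch. It only mirrors the wording of the error message and my command line; it is not taken from the NAMD source, and the function name devices_divide_evenly is made up for illustration. I am also assuming that the "8 processes" in the message comes from +p16 being split over 2 replicas, which I have not verified.

```python
def devices_divide_evenly(num_devices: int, num_processes: int) -> bool:
    """Return True if num_devices is a multiple of num_processes.

    This mirrors the wording of the error message only; it is an
    assumption about the check, not NAMD's actual implementation.
    """
    return num_devices % num_processes == 0


# Values from my command line: +p16 +replicas 2
total_pes = 16
replicas = 2
pes_per_replica = total_pes // replicas  # 8, the number the message reports

# What the error message reports: 2 devices, 8 processes.
reported = devices_divide_evenly(2, pes_per_replica)

# What I expected from a "single process per copy" build:
# one process per replica, so 2 processes sharing 2 GPUs.
expected = devices_divide_evenly(2, replicas)

print(pes_per_replica, reported, expected)
```

With the reported numbers the condition fails, while with one process per replica it would hold, which is the layout I was expecting.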
Can anyone give me some support here?
Kind regards
René Hafner
--
Dipl.-Phys. René Hafner
TU Kaiserslautern
Germany