142.dmilc (su3imp)
Submitted by Steven Gottlieb for the MILC collaboration
Steven Gottlieb <sg [at] fuji.physics.indiana.edu>
Department of Physics SW117
Indiana University
Bloomington IN 47405
Physics / Quantum Chromodynamics (QCD)
The MILC code is a set of C programs developed by the MIMD Lattice Computation (MILC) collaboration to simulate four-dimensional SU(3) lattice gauge theory on MIMD parallel machines. The code is used for millions of node hours at DOE and NSF supercomputer centers.
The program generates a gauge field, and is used in lattice gauge theory applications involving dynamical quarks. Lattice gauge theory involves the study of some of the fundamental constituents of matter, namely quarks and gluons. In this area of quantum field theory, traditional perturbative expansions are not useful. Introducing a discrete lattice of space-time points is the method of choice.
The benchmark uses the MPI functions MPI_Isend and MPI_Irecv, along with several variations of MPI_Bcast and MPI_Allreduce. The input grid is partitioned into subgrids, choosing an optimal configuration for the requested number of ranks.
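To make the partitioning step concrete, here is a minimal sketch of one simple way to split a four-dimensional grid evenly across ranks. It is an illustrative heuristic, not the benchmark's actual layout routine, and it does not claim to find the optimal configuration: it takes the prime factors of the rank count, largest first, and each time cuts the longest dimension that the factor divides evenly. The function name `partition_lattice` and its parameters are hypothetical.

```c
#include <assert.h>

/* Hypothetical helper, not taken from the MILC sources.
 * dims: global lattice size (nx, ny, nz, nt)
 * sub:  on success, the per-rank subgrid size
 * grid: on success, the number of ranks along each dimension
 * Returns 0 on success, -1 if this heuristic finds no even split. */
int partition_lattice(const int dims[4], int nranks, int sub[4], int grid[4])
{
    int factors[32];   /* an int has fewer than 32 prime factors */
    int nf = 0;

    assert(nranks > 0);
    for (int d = 0; d < 4; d++) {
        assert(dims[d] > 0);
        sub[d] = dims[d];
        grid[d] = 1;
    }

    /* Collect the prime factors of nranks in ascending order. */
    int rest = nranks;
    for (int p = 2; p <= rest / p; p++) {
        while (rest % p == 0) {
            factors[nf++] = p;
            rest /= p;
        }
    }
    if (rest > 1)
        factors[nf++] = rest;

    /* Largest factor first: cut the longest dimension it divides evenly.
     * Ties go to the lowest dimension index. */
    for (int i = nf - 1; i >= 0; i--) {
        int best = -1;
        for (int d = 0; d < 4; d++) {
            if (sub[d] % factors[i] == 0 && (best < 0 || sub[d] > sub[best]))
                best = d;
        }
        if (best < 0)
            return -1;
        sub[best] /= factors[i];
        grid[best] *= factors[i];
    }
    return 0;
}
```

Under this heuristic, a 30 x 30 x 56 x 84 grid on 4 ranks is cut once along T and once along Z, giving each rank a 30 x 30 x 28 x 42 subgrid.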
A SPEC MPI2007 sample input file with comments explaining the different parameters is included below. The only difference between the three data sets, test, train and ref, is in the grid size.
    prompt 0 [set to 1 for interactive running and you will be prompted for input]
    nflavors 2 [defines number of quarks]
    nx 30 [size of X dimension of grid]
    ny 30 [size of Y dimension of grid]
    nz 56 [size of Z dimension of grid]
    nt 84 [size of T dimension of grid]
    iseed 1234 [random number seed, if you change all the output will too!]
    warms 0 [warmup "trajectories" before measurements start]
    trajecs 1 ["trajectories" to run with measurements]
    traj_between_meas 1 [how often to measure]
    beta 5.6 [strength of strong coupling]
    mass 0.125 [mass of quark]
    u0 1. [no need to worry about this unless you are a physicist]
    microcanonical_time_step 0.2 [how big each simulation step is]
    steps_per_trajectory 1 [how many steps in each trajectory]
    max_cg_iterations 100 [number of iterations before restarting conjugate gradient routine]
    error_per_site .125 [desired accuracy during updating]
    error_for_propagator .125 [desired accuracy during measurements]
    fresh
    forget
    warms 0
    trajecs 1
    traj_between_meas 1
    beta 5.6
    mass 0.0125
    u0 1.
    microcanonical_time_step 0.01
    steps_per_trajectory 1
    max_cg_iterations 300
    error_per_site 1.e-5
    error_for_propagator 1.e-5
    continue
    forget
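The input format above is a sequence of parameter names, most of them followed by a numeric value. As a rough sketch of how one such entry could be read, the function below splits a line into a name and a value with the standard `sscanf`. The name `parse_param` and its return convention are assumptions made for illustration; this is not the benchmark's own input reader.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper, not taken from the MILC sources.
 * Reads "name value" from one line of input; anything after the value
 * (such as a bracketed comment) is ignored.
 * Returns 2 if a name and a numeric value were read,
 *         1 if only a name was read (a bare keyword),
 *         0 if the line is empty or blank. */
int parse_param(const char *line, char name[64], double *value)
{
    assert(line != NULL && name != NULL && value != NULL);

    /* %63s leaves room for the terminating NUL in name[64]. */
    int n = sscanf(line, "%63s %lf", name, value);

    /* sscanf returns EOF (negative) when nothing could be read. */
    return n < 0 ? 0 : n;
}
```

With this sketch, the line `nx 30 [size of X dimension of grid]` yields the name `nx` and the value 30, and values written as `1.`, `.125`, or `1.e-5` are accepted because `%lf` follows the usual C floating-point syntax.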
Non-timing sections of output have been left untouched and are used to verify correctness. Here is a partial example:
    PLAQ: 2.562772 2.562844
    P_LOOP: 6.842201e-02 -6.267143e-03
    G_LOOP: 0 0 4 2.562751e+00 ( 0 1 7 6 )
    G_LOOP: 0 1 4 2.562708e+00 ( 0 2 7 5 )
    G_LOOP: 0 2 4 2.562641e+00 ( 0 3 7 4 )
    G_LOOP: 0 3 4 2.562858e+00 ( 1 2 6 5 )
    G_LOOP: 0 4 4 2.563030e+00 ( 1 3 6 4 )
    G_LOOP: 0 5 4 2.562859e+00 ( 2 3 5 4 )
    G_LOOP: 1 0 6 2.336636e+00 ( 0 0 1 7 7 6 )
    G_LOOP: 1 1 6 2.336620e+00 ( 0 0 2 7 7 5 )
    G_LOOP: 1 2 6 2.336534e+00 ( 0 0 3 7 7 4 )
    G_LOOP: 1 3 6 2.336838e+00 ( 1 1 0 6 6 7 )
    G_LOOP: 1 4 6 2.336661e+00 ( 1 1 2 6 6 5 )
    G_LOOP: 1 5 6 2.337294e+00 ( 1 1 3 6 6 4 )
    G_LOOP: 1 6 6 2.336563e+00 ( 2 2 0 5 5 7 )
    G_LOOP: 1 7 6 2.336893e+00 ( 2 2 1 5 5 6 )
    G_LOOP: 1 8 6 2.336724e+00 ( 2 2 3 5 5 4 )
    G_LOOP: 1 9 6 2.336737e+00 ( 3 3 0 4 4 7 )
    G_LOOP: 1 10 6 2.336931e+00 ( 3 3 1 4 4 6 )
    G_LOOP: 1 11 6 2.336876e+00 ( 3 3 2 4 4 5 )
    G_LOOP: 2 0 6 2.344903e+00 ( 0 1 2 7 6 5 )
    G_LOOP: 2 1 6 2.345119e+00 ( 0 1 5 7 6 2 )
    G_LOOP: 2 2 6 2.344987e+00 ( 0 6 2 7 1 5 )
    G_LOOP: 2 3 6 2.344909e+00 ( 0 6 5 7 1 2 )
    G_LOOP: 2 4 6 2.344858e+00 ( 0 1 3 7 6 4 )
    G_LOOP: 2 5 6 2.344861e+00 ( 0 1 4 7 6 3 )
    G_LOOP: 2 6 6 2.344654e+00 ( 0 6 3 7 1 4 )
    G_LOOP: 2 7 6 2.344803e+00 ( 0 6 4 7 1 3 )
    G_LOOP: 2 8 6 2.344888e+00 ( 0 2 3 7 5 4 )
    G_LOOP: 2 9 6 2.344783e+00 ( 0 2 4 7 5 3 )
    G_LOOP: 2 10 6 2.344791e+00 ( 0 5 3 7 2 4 )
    G_LOOP: 2 11 6 2.344803e+00 ( 0 5 4 7 2 3 )
    G_LOOP: 2 12 6 2.344863e+00 ( 1 2 3 6 5 4 )
    G_LOOP: 2 13 6 2.344841e+00 ( 1 2 4 6 5 3 )
    G_LOOP: 2 14 6 2.344769e+00 ( 1 5 3 6 2 4 )
    G_LOOP: 2 15 6 2.345116e+00 ( 1 5 4 6 2 3 )
    GACTION: 2.225217e+00
In addition, the original code can be compiled with the portability flags -DCGTIME -DGFTIME -DFFTIME to print timing information about several phases of its work, which helps focus attention on the most time-consuming parts. These flags have been turned off in SPEC MPI2007 for validation reasons. A sample of the timing output:
    CONGRADwall time = 7.670021e+00 iters = 4 mflops = 6.190335e+00
    CONGRAD5: time = 5.900000e-01 iters = 4 mflops = 8.047458e+01
    GFTIME: 2.026000e+01
    FFTIME: 4.017000e+01
Programming language: C
Last updated: 16 March 2007