Avoiding Hot-Spots on Two-Level Direct Networks
International Conference for High Performance Computing, Networking, Storage and Analysis (SC) 2011
Publication Type: Talk
Summary
A low-diameter, fast interconnection network will be a prerequisite for
building exascale machines. A two-level direct network has been proposed by
several groups as a scalable design for future machines. IBM's PERCS topology
and the dragonfly network discussed in the DARPA exascale hardware study are
examples of this design. The presence of multiple levels in this design leads
to hot-spots on a few links when processes are grouped together at the lowest
level to minimize total communication volume. This is especially true for
communication graphs with a small number of neighbors per task. Routing and
mapping choices can impact the communication performance of parallel
applications running on a machine with a two-level direct topology. This paper
explores intelligent topology-aware mappings of different communication
patterns to the physical topology to identify cases that minimize link
utilization. We also analyze the trade-offs between using direct and indirect
routing with different mappings. We use simulations to study communication and
overall performance of applications since there are no installations of
two-level direct networks yet. This study raises interesting issues regarding
the choice of job scheduling, routing and mapping for future machines.
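
The hot-spot effect and the routing trade-off described above can be illustrated with a small sketch. It is not the simulator used in the paper; it is a toy model under stated assumptions: a hypothetical machine with 16 groups of 4 nodes, exactly one global link per ordered pair of groups, local links ignored, one task per node, and a non-periodic 2D 4-point stencil as the communication pattern. All names (`SIDE`, `GROUPS`, `NODES_PER_GROUP`, `blocked_group`, `direct_loads`, `indirect_loads`) are invented for this illustration. "Indirect" here means a detour through a uniformly random intermediate group, and the sketch reports expected loads rather than sampling routes.

```python
from collections import Counter

# Hypothetical toy parameters, not taken from the paper.
SIDE = 8             # tasks form a SIDE x SIDE grid (2D 4-point stencil)
GROUPS = 16          # number of groups at the second level
NODES_PER_GROUP = 4  # nodes per group, one task per node


def stencil_messages(side):
    """Yield (src_rank, dst_rank) pairs for a non-periodic 4-point stencil."""
    for r in range(side):
        for c in range(side):
            for dr, dc in ((0, 1), (0, -1), (1, 0), (-1, 0)):
                nr, nc = r + dr, c + dc
                if 0 <= nr < side and 0 <= nc < side:
                    yield r * side + c, nr * side + nc


def blocked_group(rank):
    """Blocked mapping: consecutive ranks fill one group before the next."""
    return rank // NODES_PER_GROUP


def direct_loads(mapping):
    """Messages per directed global link when every inter-group message
    uses the single link between its source and destination groups."""
    loads = Counter()
    for src, dst in stencil_messages(SIDE):
        a, b = mapping(src), mapping(dst)
        if a != b:
            loads[(a, b)] += 1
    return loads


def indirect_loads(mapping):
    """Expected load per link when every inter-group message detours
    through a uniformly random intermediate group (two global hops)."""
    loads = Counter()
    for (a, b), count in direct_loads(mapping).items():
        share = count / (GROUPS - 2)
        for mid in range(GROUPS):
            if mid != a and mid != b:
                loads[(a, mid)] += share
                loads[(mid, b)] += share
    return loads


direct = direct_loads(blocked_group)
indirect = indirect_loads(blocked_group)
total_links = GROUPS * (GROUPS - 1)

print("direct:   max load", max(direct.values()),
      "| links used", len(direct), "of", total_links)
print("indirect: max expected load", round(max(indirect.values()), 2),
      "| total traffic", round(sum(indirect.values())))
```

In this toy setting the blocked mapping concentrates traffic on a small fraction of the global links under direct routing, while the indirect variant lowers the maximum expected load at the cost of doubling the total global-link traffic. Passing a different mapping function in place of `blocked_group` gives a quick way to compare placements under the same assumptions.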