Highly Scalable Parallel Sorting
IEEE International Parallel and Distributed Processing Symposium (IPDPS) 2010
Publication Type: Paper
Repository URL: 200905_ParallelSorting
Abstract
Sorting is a commonly used process with a wide breadth of
applications in the high performance computing field. Early
research in parallel processing has provided us with comprehensive
analysis and theory for parallel sorting algorithms. However,
modern supercomputers have advanced rapidly in size and changed
significantly in architecture, forcing new adaptations to these
algorithms. To fully utilize the potential of highly parallel
machines, tens of thousands of processors are used. Efficiently
scaling parallel sorting on machines of this magnitude is inhibited
by the communication-intensive problem of migrating large amounts
of data between processors. The challenge is to design a highly
scalable sorting algorithm that uses minimal communication,
maximizes overlap between computation and communication, and uses
memory efficiently. This paper presents a scalable extension of the
Histogram Sorting method, making fundamental modifications to the
original algorithm in order to minimize message contention and
exploit overlap. We implement Histogram Sort, Sample Sort, and
Radix Sort in Charm++ and compare their performance. The choice of
algorithm as well as the importance of the optimizations is
validated by performance tests on two predominant modern
supercomputer architectures: XT4 at ORNL (Jaguar) and Blue Gene/P
at ANL (Intrepid).
TextRef
Edgar Solomonik and Laxmikant V. Kale, Highly Scalable Parallel Sorting, IPDPS, 2010
People
Research Areas