MPI (Message Passing Interface) is the dominant programming model
for highly scalable computational science applications. Applications
are currently using as many as 131072 processes and will be using
163840 by the end of 2008 and over 250000 within the next few years.
MPI provides a way to describe parallel programs in terms of
communicating processes in a distributed memory environment.
Internally, the MPI implementation must
maintain information sufficient to allow the processes to communicate.
A naive (but efficient) implementation will maintain tables that are
as large as the number of processes. As large-scale systems
approach a million cores, these tables can consume significant
space; further, simply reading these tables can take significant time
relative to a typical MPI communication, which has a latency of less
than 2 microseconds. This project has several aims:
- Analyze the memory needs of a high-performance MPI implementation
as a function of the number of processes/processors/cores
- Develop and prototype alternatives to the internal tables,
exploiting implicit representations and/or compression (such as sparse
tables)
- Evaluate the memory/performance tradeoffs
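
To make the tradeoff concrete, the following is a minimal sketch in C, not code from any MPI implementation. It assumes a hypothetical 8-byte per-peer entry (peer_entry) and a regular block placement with a fixed number of ranks per node; all names here are illustrative. Under those assumptions, the explicit table grows linearly with the number of processes, while the implicit representation stores one integer and computes each entry on demand.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical per-peer entry; real implementations store more state. */
typedef struct {
    uint32_t node_id;   /* node that hosts the peer process */
    uint32_t local_id;  /* index of the peer on that node */
} peer_entry;

/* Bytes one process needs for an explicit table with one entry per peer. */
size_t explicit_table_bytes(size_t nprocs) {
    return nprocs * sizeof(peer_entry);
}

/* Explicit representation: allocate and fill one entry per rank.
 * Returns NULL if the allocation fails; the caller frees the table. */
peer_entry *build_explicit_table(size_t nprocs, uint32_t ranks_per_node) {
    assert(ranks_per_node > 0);
    peer_entry *table = malloc(nprocs * sizeof *table);
    if (table == NULL) {
        return NULL;
    }
    for (size_t r = 0; r < nprocs; r++) {
        table[r].node_id = (uint32_t)(r / ranks_per_node);
        table[r].local_id = (uint32_t)(r % ranks_per_node);
    }
    return table;
}

/* Implicit representation: constant space, valid only while the
 * assumed block placement (ranks_per_node consecutive ranks per node) holds. */
typedef struct {
    uint32_t ranks_per_node;
} implicit_map;

/* Compute the entry for a rank instead of reading it from memory. */
peer_entry implicit_lookup(const implicit_map *map, uint32_t rank) {
    assert(map->ranks_per_node > 0);
    peer_entry e;
    e.node_id = rank / map->ranks_per_node;
    e.local_id = rank % map->ranks_per_node;
    return e;
}
```

With this 8-byte entry, explicit_table_bytes(1000000) is 8,000,000 bytes per table; if every one of a million processes held its own copy, the aggregate would be about 8 TB. The implicit map trades that memory for a division and a remainder per lookup, and it breaks down as soon as the placement is irregular, which is where compressed or sparse tables would come in.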
References:
- MPI Forum Web Site, including copies of the MPI standards.
- Blue Waters will be NSF's most powerful computing platform for
computational science and is part of NCSA. Many applications will use
MPI and rely on over 100,000 processes.
- MPICH2 is a widely used, high-performance, open source implementation
of MPI. MPICH2 is the basis of the MPI used by the IBM Blue Gene and
Cray XT systems.