Redesigned AWP-ODC, a popular earthquake simulation application and a finalist for the 2010 Gordon Bell Prize, using various MPI-2 RMA communication and synchronization operations. Conducted scalability studies on 8192 cores and profiled the application with mpiP. Analyzed the impact of process-grid mapping on application performance. This was an NSF HECURA-funded project, carried out in collaboration with TACC and SDSC.
Currently involved in the design and development of MVAPICH/MVAPICH2, an open-source, high-performance, and scalable MPI (Message Passing Interface) implementation for clusters with modern interconnects.
- Design of MPI for heterogeneous clusters with NVIDIA GPU accelerators and Intel MIC co-processors
- Design of MPI one-sided communication model for clusters with InfiniBand and 10GigE/iWARP interconnects