Research Interests
I am a member of the Network-Based Computing Laboratory. My advisor is
Prof. D. K. Panda. My research interests include high performance computing
and network file systems. I have been working on two projects: MVAPICH and
NFS/RDMA.
MVAPICH: High Performance MPI over InfiniBand and other RDMA-enabled Networks
InfiniBand Architecture and iWARP are emerging as open interconnect and
protocol standards for HPC clusters, data centers, and file/storage systems.
MVAPICH is a high performance MPI library over InfiniBand and other
RDMA-enabled interconnects. The MVAPICH project includes both MVAPICH (an
MPI-1 implementation) and MVAPICH2 (an MPI-2 implementation). The
MVAPICH/MVAPICH2 software is used by more than 465 organizations worldwide to
extract the potential of InfiniBand and other RDMA-enabled interconnect
technologies for designing high-end computing systems and servers. The
software is also distributed by many InfiniBand and RDMA-enabled interconnect
vendors in their software distributions. Both versions (MVAPICH and MVAPICH2)
are also available with the OpenFabrics stack.
I have been working on the intra-node communication component of MVAPICH and
on MVAPICH over the uDAPL interface. Typical clusters are built from SMP
nodes, so I am exploring optimizations for MPI intra-node communication. Our
current approach is to use user-space shared memory and to organize the
buffers so that they use the L2 cache efficiently. I have also designed and
implemented MVAPICH over the uDAPL interface. uDAPL is a set of
transport-independent, platform-independent user-space APIs that exploit the
RDMA (remote direct memory access) capabilities of next-generation
interconnect technologies such as InfiniBand and iWARP. uDAPL is defined by
the DAT Collaborative. By building MVAPICH on top of uDAPL, we have
essentially allowed MVAPICH to run on any RDMA-enabled network.
Current work and progress
Multi-core processors are emerging as a growing industry trend as single-core
processors have rapidly reached their speed limit. A multi-core processor
places two or more computing cores on a single chip. Multi-core processors
bring many novel architectural designs, such as a shared L2 cache. However, to
take advantage of the multi-core architecture efficiently, applications and
middleware should be designed in a multi-core aware manner. With the emergence
of multi-core systems, more communication will take place within a node than
before, which makes intra-node communication performance more critical.
Currently I am working on optimizing MPI intra-node communication for
multi-core systems. We have designed a user-space shared memory based scheme
that uses the L2 cache more efficiently than the traditional design. We are
currently exploring novel approaches, such as dedicating a processing core to
intra-node communication, to further improve performance.
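To make the shared-memory idea above concrete, here is a minimal sketch of a single-producer, single-consumer queue of fixed-size cells laid out in one shared buffer. It is an illustration only, not the MVAPICH design: the names `CellRing`, `CELL_PAYLOAD`, and `NUM_CELLS`, the cell sizes, and the buffer layout are all assumptions made for this example. The only idea it tries to show is that a small ring of cells is reused continuously, so recently touched cells are more likely to still be in cache when the receiver reads them.

```python
import struct

CELL_PAYLOAD = 60                 # hypothetical: 4-byte length + 60-byte payload per cell
NUM_CELLS = 8                     # hypothetical: small ring, so cells are reused quickly
U64 = struct.Struct("<Q")         # counters: head at offset 0, tail at offset 8
U32 = struct.Struct("<I")         # per-cell payload length
HEADER_SIZE = 2 * U64.size
CELL_SIZE = U32.size + CELL_PAYLOAD
RING_BYTES = HEADER_SIZE + NUM_CELLS * CELL_SIZE


class CellRing:
    """Single-producer, single-consumer ring of fixed-size cells over a shared buffer."""

    def __init__(self, buf, init=False):
        self.buf = memoryview(buf)
        if init:                                  # only the creating side zeroes the counters
            U64.pack_into(self.buf, 0, 0)
            U64.pack_into(self.buf, U64.size, 0)

    def _head(self):
        return U64.unpack_from(self.buf, 0)[0]

    def _tail(self):
        return U64.unpack_from(self.buf, U64.size)[0]

    def _cell(self, index):
        # Byte offset of the cell that a monotonically increasing counter maps to.
        return HEADER_SIZE + (index % NUM_CELLS) * CELL_SIZE

    def send(self, payload):
        """Copy payload into the next free cell; False if it does not fit or the ring is full."""
        head, tail = self._head(), self._tail()
        if len(payload) > CELL_PAYLOAD or tail - head == NUM_CELLS:
            return False
        off = self._cell(tail)
        U32.pack_into(self.buf, off, len(payload))
        self.buf[off + U32.size: off + U32.size + len(payload)] = payload
        U64.pack_into(self.buf, U64.size, tail + 1)   # publish after the copy; sender writes tail only
        return True

    def recv(self):
        """Return the oldest message as bytes, or None if the ring is empty."""
        head, tail = self._head(), self._tail()
        if head == tail:
            return None
        off = self._cell(head)
        (length,) = U32.unpack_from(self.buf, off)
        data = bytes(self.buf[off + U32.size: off + U32.size + length])
        U64.pack_into(self.buf, 0, head + 1)          # free the cell; receiver writes head only
        return data
```

To share the ring between two processes, the buffer could come from `multiprocessing.shared_memory.SharedMemory(create=True, size=RING_BYTES)`, with the second process attaching by name and passing `shm.buf` to `CellRing`. A production implementation in C would additionally need memory barriers and cache-line alignment, which this sketch leaves out.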
Publications
• L. Chai, Q. Gao and D. K. Panda, "Understanding the Impact of Multi-Core Architecture in Cluster Computing: A Case Study with Intel Dual-Core System", The 7th IEEE International Symposium on Cluster Computing and the Grid (CCGrid 2007), May 2007.
• L. Chai, A. Hartono and D. K. Panda, "Designing High Performance and Scalable MPI Intra-node Communication Support for Clusters", The IEEE International Conference on Cluster Computing (Cluster 2006), September 2006.
• L. Chai, R. Noronha and D. K. Panda, "MPI over uDAPL: Can High Performance and Portability Exist Across Architectures?", The 6th IEEE International Symposium on Cluster Computing and the Grid (CCGrid 2006), May 2006.
• S. Sur, L. Chai, H.-W. Jin and D. K. Panda, "Shared Receive Queue Based Scalable MPI Design for InfiniBand Clusters", International Parallel and Distributed Processing Symposium (IPDPS 2006), April 2006.
• S. Sur, H.-W. Jin, L. Chai and D. K. Panda, "RDMA Read Based Rendezvous Protocol for MPI over InfiniBand: Design Alternatives and Benefits", Symposium on Principles and Practice of Parallel Programming (PPOPP 2006), March 2006.
• L. Chai, S. Sur, H.-W. Jin and D. K. Panda, "Analysis of Design Considerations for Optimizing Multi-Channel MPI over InfiniBand", Proceedings of Workshop on Communication Architecture for Clusters (CAC 2005), in conjunction with the International Parallel and Distributed Processing Symposium (IPDPS 2005), April 2005.
Links to Related Websites
OpenSolaris NFS over RDMA Project
The aim of this project is to design and develop a high performance
RDMA-enabled RPC transport for the NFS client and server in OpenSolaris.
Existing transports based on TCP and UDP are limited by copying overhead,
which results in low aggregate bandwidth and high CPU utilization. In
addition, the current RDMA-Read based design (the Read-Read design) has
performance and security limitations. The objectives of this project include
security, performance, and interoperability. The prototype is being
incorporated into the OpenSolaris kernel code base.
Current work and progress
I am currently working on a scalability study of NFS/RDMA with multiple
clients. We are also exploring pNFS (parallel NFS) design and optimization in
OpenSolaris to eliminate the single-server bottleneck.
Technical Reports
• Ranjit Noronha, Lei Chai, Thomas Talpey and Dhabaleswar K. Panda, "Better NFS through RDMA and Efficient Memory Registration". OSU-CISRC-1/07--TR06. (pdf)
• Weikuan Yu, Ranjit Noronha, Lei Chai, Shuang Liang and Dhabaleswar K. Panda, "Optimizing OpenSolaris NFS over RDMA". OSU-CISRC-4/060TR43. (pdf)
Links to Related Work