Designing High-End Computing Systems with InfiniBand and High-speed Ethernet

A Tutorial to be presented at The 24th International Conference on Supercomputing (2010)
by
Dhabaleswar K. (DK) Panda (The Ohio State University), Pavan Balaji (Argonne National Laboratory) and Sayantan Sur (The Ohio State University)


When: June 1, 2010 (Afternoon)
Where: Tsukuba, Japan


InfiniBand (IB) and high-speed Ethernet interconnects are generating considerable excitement for building next-generation High Performance Computing (HPC) systems and enterprise datacenters. This tutorial will provide an overview of these emerging interconnects, the features they offer, their current market standing, and their suitability for prime-time HPC. It will start with a brief overview of IB, high-speed Ethernet, and their architectural features. It will then present the emerging OpenFabrics stack, which encapsulates both IB and Ethernet in a unified manner, and hardware technologies such as Virtual Protocol Interconnect (VPI), RDMA over Ethernet, and RDMA over Converged Enhanced Ethernet that aim at converged hardware solutions. IB and high-speed Ethernet hardware/software solutions and market trends will be highlighted. Finally, sample performance numbers will show what these technologies can achieve in different environments such as MPI, Sockets, Parallel File Systems, Multi-tier Datacenters, and Virtual Machines.

Targeted Audience and Scope

This tutorial is targeted at people working in the areas of high performance communication and I/O architecture, storage, networking, middleware, virtualization, and applications related to high-end systems.

The tutorial content is planned for half a day. The content level is broken down as 20% beginner, 50% intermediate, and 30% advanced. There is no fixed prerequisite. Attendees with a general knowledge of high performance computing, networking, and storage will be able to understand and appreciate it. The tutorial is designed so that attendees are introduced to the topics in a smooth and progressive manner.

Brief Biography of Speakers

Dr. Dhabaleswar K. (DK) Panda is a Professor of Computer Science at the Ohio State University. He obtained his Ph.D. in computer engineering from the University of Southern California. His research interests include parallel computer architecture, high performance computing, communication protocols, file systems, network-based computing, and Quality of Service. He has published over 260 papers in major journals and international conferences related to these research areas. Dr. Panda and his research group members have been doing extensive research on modern networking technologies including InfiniBand and 10GigE/iWARP. His research group is currently collaborating with National Laboratories and leading InfiniBand and 10GigE/iWARP companies on designing various subsystems of next-generation high-end systems. The MVAPICH/MVAPICH2 (High Performance MPI over InfiniBand and iWARP) open-source software packages, developed by his research group, are currently being used by more than 1,150 organizations worldwide (in 59 countries). This software has enabled several InfiniBand clusters (including the 5th ranked one) to get into the latest TOP500 ranking. These software packages are also available with the OpenFabrics stack for network vendors (InfiniBand and iWARP), server vendors, and Linux distributors. Dr. Panda's research is supported by funding from the US National Science Foundation, the US Department of Energy, and several industry partners including Intel, Cisco, SUN, Mellanox, QLogic, and NetApp. He is an IEEE Fellow and a member of ACM. More details about Dr. Panda, including a comprehensive CV and publications, are available here.

Dr. Pavan Balaji holds a joint appointment as an Assistant Computer Scientist at the Argonne National Laboratory and as a research fellow of the Computation Institute at the University of Chicago. He received his Ph.D. from the Computer Science and Engineering department at the Ohio State University. His research interests include high-speed interconnects, efficient protocol stacks, parallel programming models and middleware for communication and I/O, and job scheduling and resource management. He has more than 60 publications in these areas and has delivered nearly 75 talks and tutorials at various conferences and research institutes. He has received several awards for his research activities, including an Outstanding Researcher award at the Ohio State University, the Director's Technical Achievement award at Los Alamos National Laboratory, and several best paper and other awards. Dr. Balaji has also served as a chair or editor for more than a dozen journals, conferences, and workshops, including ICPP 2010, Hot Interconnects 2010, P2S2 workshop 2010, ICCCN 2010, CCGrid 2010, and JHPCA 2010, and as a technical program committee member for numerous conferences and workshops. He is a member of the IEEE and ACM. More details about Dr. Balaji are available here.
Last Updated: Jun 1, 2010