InfiniBand and Ethernet Architectures for Scientific and Enterprise Computing: Opportunities and Challenges

A Tutorial to be presented at The 16th IEEE International Symposium on High-Performance Computer Architecture (2010)
by
Dhabaleswar K. (DK) Panda (The Ohio State University) and Pavan Balaji (Argonne National Laboratory)


When: January 10, 2010 (Afternoon)
Where: Bangalore, India


Abstract

The InfiniBand Architecture (IB) and 10-Gigabit Ethernet (10GE) are generating considerable excitement as building blocks for next-generation High Performance Computing (HPC) systems and enterprise datacenters. This tutorial will provide an overview of these emerging interconnect architectures, the features they offer, their current market standing, and their suitability for prime-time HPC. It will start with a brief overview of IB, 10GE, and their architectural features. An overview of the emerging OpenFabrics stack, which encapsulates both IB and 10GE in a unified manner, will then be presented. IB and 10GE hardware/software solutions and market trends will be highlighted. Finally, sample performance numbers will be shown for what these technologies can achieve in different environments such as MPI, Sockets, Parallel File Systems, Multi-tier Datacenters, and Virtual Machines.

Targeted Audience and Scope

This tutorial is targeted at various categories of people working in the areas of high performance communication and I/O architecture, storage, networking, middleware, virtualization, and applications related to high-end systems.

The tutorial content is planned for half a day. The content level is broken down as 50% beginner, 30% intermediate, and 20% advanced. There is no fixed prerequisite: an attendee with a general knowledge of high performance computing, networking, and storage will be able to understand and appreciate it. The tutorial is designed so that attendees are exposed to the topics in a smooth and progressive manner.

Outline of the Tutorial

Brief Biography of Speakers

Dr. Dhabaleswar K. (DK) Panda is a Professor of Computer Science at the Ohio State University. He obtained his Ph.D. in computer engineering from the University of Southern California. His research interests include parallel computer architecture, high performance computing, communication protocols, file systems, network-based computing, and Quality of Service. He has published over 250 papers in major journals and international conferences related to these research areas. Dr. Panda and his research group members have been doing extensive research on modern networking technologies including InfiniBand and 10GigE/iWARP. His research group is currently collaborating with National Laboratories and leading InfiniBand and 10GigE/iWARP companies on designing various subsystems of next-generation high-end systems. The MVAPICH/MVAPICH2 (High Performance MPI over InfiniBand and iWARP) open-source software packages, developed by his research group, are currently being used by more than 1,000 organizations worldwide (in 53 countries). This software has enabled several InfiniBand clusters (including the 5th-ranked one) to get into the latest TOP500 ranking. These software packages are also available with the OpenFabrics stack for network vendors (InfiniBand and iWARP), server vendors, and Linux distributors. Dr. Panda's research is supported by funding from the US National Science Foundation, the US Department of Energy, and several industry partners including Intel, Cisco, SUN, Mellanox, QLogic, and NetApp. He is an IEEE Fellow and a member of ACM. More details about Dr. Panda, including a comprehensive CV and publications, are available here.

Dr. Pavan Balaji holds a joint appointment as an Assistant Computer Scientist at the Argonne National Laboratory and as a research fellow of the Computation Institute at the University of Chicago. He received his Ph.D. from the Computer Science and Engineering department at the Ohio State University. His research interests include high-speed interconnects, efficient protocol stacks, parallel programming models and middleware for communication and I/O, and job scheduling and resource management. He has more than 60 publications in these areas and has delivered nearly 75 talks and tutorials at various conferences and research institutes. He has received several awards for his research activities, including an Outstanding Researcher award at the Ohio State University, the Director's Technical Achievement award at Los Alamos National Laboratory, and several best paper and other awards. Dr. Balaji has also served as a chair or editor for more than a dozen journals, conferences, and workshops including ICPP 2010, Hot Interconnects 2010, P2S2 workshop 2010, ICCCN 2010, CCGrid 2010, and JHPCA 2010, and as a technical program committee member for numerous conferences and workshops. He is a member of the IEEE and ACM. More details about Dr. Balaji are available here.
Last Updated: December 18, 2009