Designing High-End Computing Systems with InfiniBand and High-speed Ethernet
A Tutorial to be presented at
The 24th International Conference on Supercomputing (2010)
by
Dhabaleswar K. (DK) Panda (The Ohio State University),
Pavan Balaji (Argonne National Laboratory), and
Sayantan Sur (The Ohio State University)
When: June 1, 2010 (Afternoon)
Where: Tsukuba, Japan
InfiniBand (IB) and high-speed Ethernet interconnects are
generating considerable excitement for building next-generation High
Performance Computing (HPC) systems and enterprise datacenters. This
tutorial will provide an overview of these emerging interconnects,
their offered features, their current market standing, and their
suitability for prime-time HPC. It will start with a brief overview of
IB, high-speed Ethernet, and their architectural features. An overview
will then be presented of the emerging OpenFabrics stack, which
encapsulates both IB and Ethernet in a unified manner, and of hardware
technologies such as Virtual Protocol Interconnect (VPI), RDMA over
Ethernet, and RDMA over Converged Enhanced Ethernet, which aim at
converged hardware solutions. IB and high-speed Ethernet hardware/software
solutions and the market trends will be highlighted. Finally, sample
performance numbers will be shown to illustrate what these technologies
can achieve in different environments such as MPI, Sockets, Parallel
File Systems, Multi-tier Datacenters, and Virtual Machines.
Targeted Audience and Scope
This tutorial is intended for people working in the areas of high
performance communication and I/O architecture, storage, networking,
middleware, virtualization, and applications related to high-end systems:
- Newcomers to the field of high-speed networking architecture who are
looking to get familiar with IB and high-speed Ethernet technologies.
- Scientists, engineers, and researchers working on the design and
development of next-generation high-end systems, including clusters,
data centers, and storage centers.
- System administrators of large-scale clusters.
- Developers of next-generation networked computing middleware and
applications.
- Managers and administrators responsible for setting up next-generation
high-end systems and facilities in their organizations/laboratories.
The tutorial content is planned for half a day. The content level is
broken down as 20% beginner, 50% intermediate, and 30% advanced. There
are no fixed prerequisites. Attendees with a general knowledge of high
performance computing, networking, and storage will be able to
understand and appreciate the material. The tutorial is designed so
that attendees are exposed to the topics in a smooth and progressive
manner.
Brief Biography of Speakers
Dr. Dhabaleswar K. (DK) Panda is a Professor of Computer Science at the
Ohio State
University. He obtained his Ph.D. in computer engineering from the
University of Southern California. His research interests include
parallel computer architecture, high performance computing,
communication protocols, file systems, network-based computing, and
Quality of Service. He has published over 260 papers in major journals
and international conferences related to these research
areas. Dr. Panda and his research group members have been doing
extensive research on modern networking technologies including
InfiniBand and 10GigE/iWARP. His research group is currently
collaborating with National Laboratories and leading InfiniBand and
10GigE/iWARP companies on designing various subsystems of
next-generation high-end systems. The
MVAPICH/MVAPICH2 (High Performance MPI over InfiniBand and iWARP)
open-source software packages, developed by his research group, are
currently being used by more than 1,150 organizations worldwide (in 59
countries). This software has enabled several InfiniBand clusters
(including the 5th ranked one) to get into the latest TOP500
ranking. These software packages are also available with the
OpenFabrics stack for network vendors (InfiniBand and iWARP), server
vendors, and Linux distributors. Dr. Panda's research is supported by
funding from the US National Science Foundation, the US Department of
Energy, and several industry partners including Intel, Cisco, SUN,
Mellanox, QLogic, and NetApp. He is an IEEE Fellow and a member of ACM.
More details about Dr. Panda, including a comprehensive CV
and publications, are available here.
Dr. Pavan Balaji holds a joint appointment as an Assistant Computer
Scientist at the Argonne National Laboratory and as a research fellow
of the Computation Institute at the University of Chicago. He
received his Ph.D. from the Computer Science and Engineering
department at the Ohio State University. His research interests
include high-speed interconnects, efficient protocol stacks, parallel
programming models and middleware for communication and I/O, and job
scheduling and resource management. He has more than 60 publications
in these areas and has delivered nearly 75 talks and tutorials at
various conferences and research institutes. He has received several
awards for his research activities including an Outstanding Researcher
award at the Ohio State University, the Director's Technical
Achievement award at Los Alamos National Laboratory, and several best
paper and other awards. Dr. Balaji has also served as a chair or
editor for more than a dozen journals, conferences, and workshops
including ICPP 2010, Hot Interconnects 2010, P2S2 workshop 2010, ICCCN
2010, CCGrid 2010, and JHPCA 2010, and as a technical program
committee member in numerous conferences and workshops. He is a member
of the IEEE and ACM. More details about Dr. Balaji are available
here.
Last Updated: Jun 1, 2010