Designing High-End Computing Systems and Programming Models with
InfiniBand and High-Speed Ethernet
When: May 31, 2011 afternoon
Where: Tucson, Arizona, USA
InfiniBand (IB) and High-Speed Ethernet (HSE) technologies
are generating a lot of excitement for building next generation
High-End Computing (HEC) systems and enterprise datacenters. This
tutorial will provide an overview of these emerging interconnect
architectures, their offered features, their current market standing,
and their suitability for prime-time HEC and data centers. It will
start with a brief overview of IB, HSE, and their architectural
features. An overview of the emerging OpenFabrics stack which
encapsulates both IB and HSE in a unified manner will be presented. IB
and HSE hardware/software solutions and the market trends will be
highlighted. Finally, sample performance numbers for these
technologies in different environments, such as MPI, UPC, Global
Arrays, Sockets, Parallel File Systems, Multi-tier Datacenters, and
Virtual Machines, will be shown.
Targeted Audience and Scope
This tutorial is targeted at people working in
the areas of high performance communication and I/O, storage,
networking, middleware, programming models and applications related to
high-end systems. The specific audiences this tutorial is aimed at
include:
- Newcomers to the field of high-speed networking
architecture and programming models who are looking
to get familiar with IB and 10GE technologies.
- Scientists, engineers, and researchers working on the design and
development of next generation high-end systems and programming
models.
- System administrators of large-scale clusters
- Developers of next generation networked computing middleware and
applications
- Managers and administrators responsible for setting up next
generation high-end systems and facilities in their
organizations/laboratories.
The tutorial content is planned for half a day. The content level is
broken down as 20% beginner, 50% intermediate, and 30% advanced. There
is no fixed prerequisite. As long as the attendee has a general
knowledge of high performance computing, cluster computing,
networking, and storage, he/she will be able to understand and
appreciate it. The tutorial is designed in such a way that an attendee
gets exposed to the topics in a smooth and progressive manner.
Outline of the Tutorial
- What are InfiniBand and 10GigE/iWARP?
- TCP vs. User-level communication protocols
- Requirements (communication, I/O, performance, cost, RAS) from
the perspective of designing next generation high-end systems and
scalable data centers
- Overview of InfiniBand Architecture
- Architecture and Basic Hardware Components
- Novel Features (Hardware Protocol Offload and
Communication Semantics)
- IB Verbs Interface
- Management and Services (Subnet Management and
Hardware Support for Scalable Network Management)
- Overview of High-Speed Ethernet Family
- Architecture and Basic Components (Similarity
and Differences with InfiniBand)
- Out-of-Order Data Placement
- Dynamic and Fine-grained Data Rate Control
- Existing Implementations of 10GE/iWARP
- InfiniBand/High-Speed Ethernet Convergence Technologies
- Overview of InfiniBand and High-Speed Ethernet Products
(hardware and software)
- Vendors, Switches, and Host Channel Adapters
- Overview of ConnectX architecture
- Overview of OpenFabrics Architecture and Convergence
- Pointers to IBA and 10GigE installations
- Designing High-end Systems with InfiniBand and High-Speed Ethernet:
Research Challenges, Case Studies and Performance Evaluation
- High-end clusters with MPI-1 and MPI-2
- PGAS Models (Global Arrays and UPC)
- Sockets (SDP and IPoIB)
- File Systems (Lustre and NFS over RDMA)
- Multi-tier Datacenters
- Virtualization Support (Xen-IB)
- Conclusions, Final Q&A, and Discussion
Brief Biography of Speakers
Dr. Dhabaleswar K. (DK) Panda is a Professor of Computer Science at
the Ohio State University. He obtained his Ph.D. in computer
engineering from the University of Southern California. His research
interests include parallel computer architecture, high performance
computing, communication protocols, file systems, network-based
computing, and Quality of Service. He has published over 285 papers in
major journals
and international conferences related to these research
areas. Dr. Panda and his research group members have been doing
extensive research on modern networking technologies including
InfiniBand and 10GigE/iWARP. His research group is currently
collaborating with National Laboratories and leading InfiniBand and
10GigE/iWARP companies on designing various subsystems of next
generation high-end systems. The
MVAPICH/MVAPICH2 (High Performance MPI over InfiniBand and iWARP)
open-source software packages, developed by his research group, are
currently being used by more than 1,550 organizations worldwide (in 60
countries). This software has enabled several InfiniBand clusters
(including the 6th and 7th ranked ones) to get into the latest TOP500
ranking. These software packages are also available with the
OpenFabrics stack for network vendors (InfiniBand and iWARP), server
vendors and Linux distributors (RedHat and SUSE).
Dr. Panda's research is supported by
funding from the US National Science Foundation, the US Department of
Energy, and several industry partners including Intel, Cisco, SUN,
Mellanox, QLogic and NetApp.
He is an IEEE Fellow and a member of ACM.
More details about Dr. Panda, including a comprehensive CV
and publications, are available here.
Dr. Sayantan Sur
Dr. Sayantan Sur is a Research Scientist at the
Department of Computer Science at The Ohio State University. His research
interests include high speed interconnection networks, high performance
computing, fault tolerance and parallel computer architecture. He has published
more than 20 papers in major conferences and journals related to these research
areas. He is a member of the Network-Based Computing Laboratory led by Dr. D.
K. Panda. He is currently collaborating with National Laboratories,
Supercomputer Centers, and leading InfiniBand companies on designing various
subsystems of next generation high performance computing platforms. He has
contributed significantly to the MVAPICH/MVAPICH2 (High Performance MPI over
InfiniBand and 10GigE/iWARP) open-source software packages. The software
developed as a part of this effort is currently used by over 1,550
organizations in 60 countries. In the past, he has held the position of
Post-doctoral researcher at IBM T. J. Watson Research Center, Hawthorne and
Member Technical Staff at Sun Microsystems. Dr. Sur received his Ph.D. degree
from The Ohio State University in 2007.
More details about Dr. Sur are
available here.
Last Updated: May 7, 2011