InfiniBand and High-Speed Ethernet Architectures for Scientific and
Enterprise Computing: Opportunities and Challenges
When: February 25, 2012 (1:00pm - 4:30pm)
Where: New Orleans, Louisiana, USA
Abstract
InfiniBand (IB) and High-Speed Ethernet (HSE) interconnects
are generating a lot of excitement for building next generation
High Performance Computing (HPC) systems, enterprise
datacenters, and cloud computing systems.
This tutorial will provide an overview of these emerging
interconnects, their offered features, their current market standing,
and their suitability for prime-time HPC. It will start with a brief
overview of IB, HSE, and their architectural features. An overview of
the emerging OpenFabrics stack, which encapsulates both IB and 10GE in
a unified manner, will be presented. IB and HSE hardware/software
solutions and market trends will be highlighted. Finally, sample
performance numbers illustrating what these technologies can achieve
in environments such as MPI, Sockets, Parallel File Systems,
Multi-tier Datacenters, Virtual Machines, and Cloud Computing will be
shown.
Targeted Audience and Scope
This tutorial is targeted at various categories of people working in
the areas of high performance communication and I/O architecture,
storage, networking, middleware, virtualization, and applications
related to high-end systems:
- Newcomers to the field of high-speed networking architecture who are
interested in familiarizing themselves with IB and HSE.
- Scientists, engineers, and researchers working on the design and
development of next generation high-end systems including clusters,
data centers, storage centers and cloud computing systems.
- System administrators of large-scale clusters.
- Developers of next generation networked computing middleware and
applications.
- Managers and administrators responsible for setting up next
generation high-end systems and facilities in their
organizations/laboratories.
The tutorial content is planned for half a day. The content level is
broken down as 20% beginner, 50% intermediate, and 30% advanced. There
are no fixed prerequisites: as long as attendees have a general
knowledge of high performance computing, cluster computing,
networking, and storage, they will be able to understand and
appreciate the tutorial. The material is organized so that attendees
are exposed to the topics in a smooth and progressive manner.
Outline of the Tutorial
- What are InfiniBand and HSE?
  - TCP vs. User-level communication protocols
  - Requirements (communication, I/O, performance, cost, RAS) from the
    perspective of designing next generation high-end systems and
    scalable data centers
- Overview of InfiniBand Architecture
  - Architecture and Basic Hardware Components
  - Novel Features (Hardware Protocol Offload and
    Communication Semantics)
  - IB Verbs Interface (see the minimal verbs sketch after this outline)
  - Management and Services (Subnet Management and
    Hardware Support for Scalable Network Management)
- Overview of High-Speed Ethernet Family
  - Architecture and Basic Components (Similarities
    and Differences with InfiniBand)
  - Out-of-Order Data Placement
  - Dynamic and Fine-grained Data Rate Control
  - Existing Implementations of HSE (10GigE with/without iWARP,
    40GigE, RoCE, proprietary stacks, etc.)
- Overview of IB and HSE Convergence
- Overview of IB and HSE Products (hardware and software)
  - Vendors, Switches, and Host Channel Adapters
  - Overview of OpenFabrics Architecture and Convergence
  - Pointers to IB and HSE installations
- Sample Performance Numbers with IB and HSE (see the ping-pong
  sketch after this outline)
  - High-end clusters with MPI-1 and MPI-2
  - Sockets (SDP and IPoIB)
  - File Systems (Lustre and NFS over RDMA)
  - Multi-tier Datacenters
  - Virtualization Support (Xen-IB)
  - Cloud Computing Systems (Hadoop, HBase, and Memcached)
- Conclusions, Final Q&A, and Discussion
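To give a flavor of the IB Verbs interface listed in the outline,
below is a minimal sketch in C (an illustration only, not part of the
tutorial materials) that enumerates the RDMA devices on a host and
queries a few of their attributes through the OpenFabrics libibverbs
API. The file name and the printed fields are arbitrary choices, and
error handling is kept minimal.

    /* Enumerate RDMA devices and query basic attributes via libibverbs.
     * Compile (illustrative): gcc -std=c99 verbs_query.c -libverbs */
    #include <stdio.h>
    #include <infiniband/verbs.h>

    int main(void)
    {
        int num_devices;
        struct ibv_device **dev_list = ibv_get_device_list(&num_devices);
        if (!dev_list) {
            perror("ibv_get_device_list");
            return 1;
        }
        for (int i = 0; i < num_devices; i++) {
            /* Open each device and ask the HCA for its capability limits */
            struct ibv_context *ctx = ibv_open_device(dev_list[i]);
            if (!ctx)
                continue;
            struct ibv_device_attr attr;
            if (ibv_query_device(ctx, &attr) == 0)
                printf("%s: max_qp=%d max_cq=%d max_mr=%d\n",
                       ibv_get_device_name(dev_list[i]),
                       attr.max_qp, attr.max_cq, attr.max_mr);
            ibv_close_device(ctx);
        }
        ibv_free_device_list(dev_list);
        return 0;
    }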
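The MPI-level performance numbers mentioned in the outline are of the
kind produced by ping-pong microbenchmarks. The sketch below, again an
illustration rather than the tutorial's own benchmark, measures
average one-way latency between two MPI processes; the message size
and iteration count are placeholder values.

    /* Minimal MPI ping-pong latency sketch; run with two processes:
     *   mpicc pingpong.c -o pingpong && mpirun -np 2 ./pingpong */
    #include <stdio.h>
    #include <mpi.h>

    #define MSG_SIZE 1      /* bytes (placeholder) */
    #define ITERS    1000   /* round trips (placeholder) */

    int main(int argc, char **argv)
    {
        int rank;
        char buf[MSG_SIZE];

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Barrier(MPI_COMM_WORLD);

        double start = MPI_Wtime();
        for (int i = 0; i < ITERS; i++) {
            if (rank == 0) {        /* initiator: send, then wait for echo */
                MPI_Send(buf, MSG_SIZE, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
                MPI_Recv(buf, MSG_SIZE, MPI_CHAR, 1, 0, MPI_COMM_WORLD,
                         MPI_STATUS_IGNORE);
            } else if (rank == 1) { /* echoer: receive, then send back */
                MPI_Recv(buf, MSG_SIZE, MPI_CHAR, 0, 0, MPI_COMM_WORLD,
                         MPI_STATUS_IGNORE);
                MPI_Send(buf, MSG_SIZE, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
            }
        }
        double elapsed = MPI_Wtime() - start;

        if (rank == 0)  /* one-way latency = half the round-trip time */
            printf("avg one-way latency: %.2f us\n",
                   elapsed * 1e6 / (2.0 * ITERS));

        MPI_Finalize();
        return 0;
    }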
Brief Biography of Speakers
Dr. Dhabaleswar K. (DK)
Panda is a Professor of Computer Science at the Ohio State
University. He obtained his Ph.D. in computer engineering from the
University of Southern California. His research interests include
parallel computer architecture, high performance computing,
communication protocols, file systems, network-based computing, and
Quality of Service. He has published over 300 papers in major journals
and international conferences related to these research
areas. Dr. Panda and his research group members have been doing
extensive research on modern networking technologies including
InfiniBand, HSE and RDMA over
Converged Enhanced Ethernet (RoCE). His research group is currently
collaborating with National Laboratories and leading InfiniBand and
10GigE/iWARP companies on designing various subsystems of next
generation high-end
systems. The
MVAPICH/MVAPICH2 (High Performance MPI over InfiniBand and iWARP)
open-source software packages, developed by his research group, are
currently being used by more than 1,840 organizations worldwide (in 65
countries). This software has enabled several InfiniBand clusters
(including the 5th- and 7th-ranked systems) to appear in the latest
TOP500 ranking. These software packages are also available with the
OpenFabrics stack for network vendors (InfiniBand and iWARP), server
vendors and Linux distributors. Dr. Panda's research is supported by
funding from the US National Science Foundation, the US Department of
Energy, and several industry partners, including Intel, Cisco, Sun,
Mellanox, QLogic, NVIDIA, and NetApp.
He is an IEEE Fellow and a member of ACM.
More details about Dr. Panda, including a comprehensive CV
and publications, are available here.
Last Updated: February 4, 2012