Network-Based Computing for HPC, Big Data, Deep Learning, and Cloud

Instructor: Prof. Dhabaleswar K. (DK) Panda

Credits: 3

Course Description:

The objective of this course is to understand the principles and practice of the emerging network-based computing paradigm. We will study advances in networking, computing, accelerator, and storage technologies as well as the computational/networking demands of current and emerging applications. A brief overview of emerging network-based computing applications will be given. System architectures for different kinds of network-based computing systems (HPC, Big Data, Deep Learning, and Cloud) will be analyzed. The course will focus on the role of computing technologies (multi-core/many-core), networking technologies (InfiniBand, High-Speed Ethernet, Omni-Path, proprietary), accelerators (GPGPUs and FPGAs), storage technologies (SSDs, NVMe-based SSDs, Burst Buffers), programming models and runtimes (MPI, PGAS, Hybrid MPI+PGAS, Hadoop MapReduce, Spark, Caffe, CNTK, and TensorFlow), storage/file systems (parallel file systems, HDFS, and Ceph), fault tolerance, virtualization, and power-/energy-aware issues in designing next-generation exascale systems. Limitations of current solutions, the impact of next-generation technologies on the design of future exascale computing systems and applications, and future research challenges will be discussed.

Topics to be Covered (Tentative)

The objective of this course is to understand the principles and practice of the emerging network-based computing paradigm. We will study advances in networking and computing technologies as well as the computational/networking demands of current and emerging applications. A brief overview of emerging network-based computing applications will be given. System architectures for different kinds of network-based computing systems will be analyzed. The computational, communication, networking, I/O, and QoS requirements of emerging network-based computing applications will be analyzed together with emerging trends such as GPUs/accelerators, Partitioned Global Address Space (PGAS), Hadoop/MapReduce/HBase/Spark, Deep Learning, and virtualization. Challenges and research issues in designing network-based computing systems and applications will be discussed in detail. Limitations of current solutions, the impact of next-generation networking technologies on the design of future network-based computing systems and applications, and future research challenges will be discussed.


Last Updated: August 5, 2018