High Performance Computing
Objectives
Knowledge and understanding goals:
- Parallel architectures
- Parallel programming paradigms
- General parallel algorithm design methodologies
- Problem-type specific parallel algorithmic strategies
- Techniques to optimize the performance of parallel algorithms in the studied parallel architectures
Know how to:
- Optimize sequential programs
- Implement parallel algorithms to handle compute-intensive or data-intensive applications on GPU(s) (CUDA or OpenCL) and/or on distributed-memory architectures (MPI or Spark).
- Measure and analyze the performance of a parallel computation.
Soft skills:
- Reason about and critically evaluate the algorithmic and technological alternatives available for solving a problem.
- Design a solution for a problem before going into the implementation phase.
- Teamwork.
- Structure and write project reports.
General characterization
Code
11165
Credits
6.0
Responsible teacher
Hervé Miguel Cordeiro Paulino, Vítor Manuel Alves Duarte
Hours
Weekly - 4
Total - 56
Teaching language
Portuguese
Prerequisites
Students should have knowledge about computer architecture, computer networks, and operating systems, and good programming skills. The exercises will use the C/C++ and Java programming languages.
Bibliography
As usual in HPC courses, there is no required textbook. There are several books that cover the fundamental concepts of HPC. A list follows:
- P Pacheco. An Introduction to Parallel Programming. 2nd Ed. Morgan Kaufmann, 2022.
- T Sterling, M Brodowicz, M Anderson. High Performance Computing - Modern Systems and Practices. Morgan Kaufmann, 2017.
- T Rauber and G Rünger. Parallel Programming for Multicore and Cluster Systems. Springer, 2013.
- I Foster. Designing and Building Parallel Programs. Addison-Wesley, 1995.
GPU computing:
- NVIDIA documentation
- J Sanders and E Kandrot. CUDA by Example: An Introduction to General-Purpose GPU Programming. Addison-Wesley, 2010.
Data-intensive computing:
- A Shook and D Miner. MapReduce Design Patterns. O’Reilly Media, 2012.
- H Karau and R Warren. High Performance Spark. O’Reilly Media, 2017.
Teaching method
The lectures present the course's concepts and topics and discuss the most relevant questions.
The lab sessions take place in a general-purpose lab whose multi-core PCs support the execution of the shared-memory programming exercises. For programming GPUs and distributed-memory architectures, a cluster of machines equipped with multi-core processors and NVIDIA GPUs is available.
All lab exercises are available from a Git repository and include all software dependencies (other than the required programming languages). They also include automated tests to help students assess the correctness of their implementations.
Evaluation method
Evaluation elements
Two intermediate tests or a final exam (60% of the final grade). The tests have the same relative weight.
Two programming assignments (40% of the final grade). The assignments have the same relative weight.
All grades are rounded to the nearest tenth.
Frequency
Average of the grades of the programming assignments >= 9.
Final grade
NTP = average of the grades of the tests, or the grade of the final exam
NP = average of the grades of the programming assignments
if NTP < 9 then final grade = NTP
else final grade = NTP*0.6 + NP*0.4
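The grading rule above can be sketched as a short Python function. This is only an illustration of the formula as stated. The function names, the sample grades, and the choice to round once at the end are assumptions, since the text does not say at which step the rounding to the nearest tenth is applied.

```python
def average(grades):
    """Arithmetic mean of a non-empty list of grades (equal weights)."""
    return sum(grades) / len(grades)


def has_frequency(np_grade):
    """Frequency requirement: assignment average (NP) of at least 9."""
    return np_grade >= 9


def final_grade(ntp, np_grade):
    """Apply the rule: NTP alone if NTP < 9, else 0.6*NTP + 0.4*NP."""
    if ntp < 9:
        return ntp
    return 0.6 * ntp + 0.4 * np_grade


if __name__ == "__main__":
    # Hypothetical grades, for illustration only.
    ntp = average([11.0, 13.0])       # two intermediate tests
    np_grade = average([15.0, 17.0])  # two programming assignments
    if has_frequency(np_grade):
        # Assumption: rounding to the nearest tenth is applied at the end.
        print(round(final_grade(ntp, np_grade), 1))
```

Note that when NTP is below 9 the assignment grades do not contribute to the final grade at all.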
Subject matter
Motivation
- Why Parallel Computing?
- Why High Performance Computing?
- The convergence of the Big Compute and Big Data trends of thought
Fundamentals of Parallel Computing
- Parallel Architectures
- Parallel Performance
- Parallel Programming Paradigms
- Designing Parallel Algorithms
Parallel Computing
- Parallel Programming Patterns and Strategies
- Shared Memory Processing
- GPU Computing
- Message-passing programming
Data-Centric High Performance Computing
- MapReduce
- Apache Spark
Parallel Algorithms (Putting it All Together)
- Graph processing algorithms
- Machine learning algorithms
The Future of High Performance Computing
- Challenges in the industry
- Open research topics