Instructor Guide

Course Overview

This course covers Python for High-Performance Computing (HPC). It is designed for researchers and developers who want to optimize their computational workflows on HPC systems.

Course Structure

The course is organized into two days:

Day 1: Python Fundamentals & HPC Basics

  • D1_01: Python Essentials

  • D1_02: Virtual Environment Setup

  • D1_03: HPC Modules

  • D1_04: Benchmarking and Profiling

  • D1_05: NumPy

Day 2: Advanced Optimization & Parallelization

  • D2_01: Cython for Performance

  • D2_02: Dask for Distributed Computing

  • D2_03: Numba for JIT Compilation

  • D2_04: SLURM and MPI for Parallel Processing

  • D2_05: Containerization and Distributed Computing

Learning Outcomes

By the end of this course, participants should be able to:

  1. Write efficient Python code optimized for HPC systems

  2. Understand and use parallel programming paradigms (Dask, MPI)

  3. Leverage low-level optimization techniques (Numba, Cython)

  4. Profile and benchmark Python applications

  5. Utilize container technologies for reproducible HPC workflows

  6. Submit and manage jobs on HPC clusters

Prerequisites

  • Basic Python programming experience

  • Familiarity with command-line interfaces

  • Optional: Basic understanding of linear algebra

Resources

  • Karolina Supercomputer access for hands-on sessions

  • Jupyter Notebook environment for interactive learning

  • Supporting code examples in content/episodes/code/

  • Apptainer/Singularity container definitions in content/resources/

Technical Setup

Dependencies

See content/resources/ for environment specifications:

  • apptainer-demo-env.yaml: Docker-based environment

  • apptainer-mpi-env.yaml: MPI-enabled environment

  • requirements.txt: Conda requirements

Running Locally

  1. Create a virtual environment using one of the provided YAML files

  2. Install dependencies from requirements.txt

  3. Launch JupyterLab and open the notebooks

  4. For MPI examples, run the provided shell scripts through the appropriate job scheduler (for example, SLURM)
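
After the steps above, a quick sanity check can confirm that the environment contains the libraries the episodes rely on. The following is a minimal sketch, not part of the course materials: the helper name check_packages and the list COURSE_PACKAGES are hypothetical, and the import names are assumed from the episode titles rather than taken from requirements.txt, so adjust them to match the actual file.

```python
from importlib.util import find_spec

# Assumption: import names guessed from the episode titles
# (NumPy, Dask, Numba, Cython, MPI); edit to match requirements.txt.
COURSE_PACKAGES = ["numpy", "dask", "numba", "Cython", "mpi4py"]


def check_packages(names):
    """Map each top-level module name to True if Python can locate it."""
    return {name: find_spec(name) is not None for name in names}


if __name__ == "__main__":
    # Print one status line per package so missing ones stand out.
    for name, found in check_packages(COURSE_PACKAGES).items():
        status = "ok" if found else "MISSING"
        print(f"{name:10s} {status}")
```

Running the script inside the activated environment before a session starts gives participants a fast way to spot an incomplete installation. It only checks that a module can be located, not that it imports cleanly or that an MPI runtime is present.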

Lab Exercises

Solutions are available in content/solution/ for reference. Encourage participants to attempt exercises before reviewing solutions.

Assessment

Each episode includes:

  • Conceptual understanding through explanations and examples

  • Hands-on code exercises

  • Benchmarking comparisons to demonstrate improvement
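
A benchmarking comparison of the kind listed above could be sketched as follows. This is an illustrative example using only the standard library, not code from the episodes: the function names sum_squares_loop, sum_squares_builtin, and best_time are hypothetical, and the two implementations stand in for whatever baseline and optimized versions an episode compares.

```python
import timeit


def sum_squares_loop(n):
    """Baseline: accumulate the squares of 0..n-1 in an explicit loop."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def sum_squares_builtin(n):
    """Alternative: let the built-in sum() drive the iteration."""
    return sum(i * i for i in range(n))


def best_time(func, *args, repeat=5, number=20):
    """Return the best observed time per call, in seconds."""
    timings = timeit.repeat(lambda: func(*args), repeat=repeat, number=number)
    # The minimum is the measurement least disturbed by other system activity.
    return min(timings) / number


if __name__ == "__main__":
    n = 100_000
    # Always confirm both versions agree before comparing their speed.
    assert sum_squares_loop(n) == sum_squares_builtin(n)
    for func in (sum_squares_loop, sum_squares_builtin):
        print(f"{func.__name__:22s} {best_time(func, n) * 1e3:.3f} ms per call")
```

Checking that both versions return the same result before timing them is worth emphasizing to participants, since a faster but incorrect implementation is not an improvement. Timings will vary between machines, so the sketch prints measurements rather than asserting which version wins.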

Additional Resources

  • Original VSC materials: https://gitlab.tuwien.ac.at/vsc-public/training/python4hpc

  • Karolina Documentation: https://docs.it4i.cz/

  • Python Documentation: https://docs.python.org/

  • NumPy Guide: https://numpy.org/doc/

  • Dask Documentation: https://docs.dask.org/

  • Numba Guide: https://numba.readthedocs.io/

  • MPI4PY Documentation: https://mpi4py.readthedocs.io/