CS 411/511

From CS Wiki

CS 411/511: Parallel Programming

Catalog Description: Analysis, mapping, and application of parallel programming software to high-performance systems; the principles of spatial and temporal locality in data memory hierarchies for performance tuning; architectural considerations in the design and implementation of a parallel program; and the tradeoff between threaded (shared-memory) and message-passing (distributed-memory) programming styles and their performance. Additional projects/assignments required for graduate credit.

Type: CS 411 is a technical elective for CS majors.

Total Credits: 3

Course Coordinator: Robert Hiromoto

URL: None

Syllabus: CS 411/511 Syllabus

Prerequisites: Analysis of Algorithms, Operating Systems, Concurrent Systems and Computer Architectures, or instructor permission.

Recommended preparation: Proficiency in programming using a modern language such as C or C++.

Textbook: An Introduction to Parallel Programming, Peter Pacheco, Morgan Kaufmann, ISBN-10: 0123742609

Detailed Description

This course teaches the principles of parallel programming to upper-division and graduate-level computer science students. Topics include programming symmetric multi-core processors using Pthreads and OpenMP, and programming distributed-memory workstation clusters using the MPI message-passing communication library. Programming tools such as gdb and gprof, parallel programming semantics, and parallel program performance issues will also be covered. Several programming assignments will be used to reinforce the concepts learned in the classroom, and a final parallel programming project will be required.

Major Topics Covered

  1. Spatial and temporal locality
  2. Parallel programming concepts
    1. Task and data decomposition
    2. Race conditions
    3. Critical sections (atomic updates)
    4. Process starvation (livelock)
    5. Deadlock
    6. Performance evaluation
  3. Primitives to control shared resources
    1. Busy-wait
    2. Mutual exclusion (mutexes)
    3. Semaphores
    4. Thread-safe system libraries
  4. Shared memory programming using threads (Pthreads and OpenMP)
    1. Shared memory architectural model
    2. Asynchronous computations
    3. Parallel programming advantages/disadvantages
  5. Message passing using MPI
    1. Block algorithms
    2. Single program, multiple data (SPMD)
  6. Performance issues
    1. Bottlenecks and speedups
    2. Impact of communication
    3. Tuning programs

Course Outcomes

  1. Understand the concepts of spatial and temporal locality, and their implications for program efficiency
  2. Be able to decompose an algorithm into parallelizable partitions based on task or data
  3. Understand factors that limit performance of parallel programs
  4. Understand mechanisms that can cause program failure: race conditions, deadlock
  5. Be able to implement primitives to synchronize access to shared resources: busy-waiting, mutexes, semaphores
  6. Understand shared-memory architectures and their use
  7. Be able to write simple message passing programs using MPI
  8. Describe how to tune a program to mitigate performance issues such as communication bottlenecks