Parallel Curriculum Development
The Intel Parallel Computing Center at the University of Oregon has as its goal the development of an undergraduate parallel computing course to be offered each year in the Department of Computer and Information Science. However, the larger objective is to share our experiences and materials with others in the parallel computing community. In this spirit, offered below are lecture and lab curriculum components of a prototypical 10-week course.
|Introduction||Structured Parallel Programming (Ch. 1)||PDF / PPT|
|Parallel computer architecture||Structured Parallel Programming (Ch. 2)||PDF / PPT|
|Parallel performance theory (1)||Structured Parallel Programming (App. B, C)||PDF / PPT|
|Parallel performance theory (2)||Structured Parallel Programming (App. B, C)||PDF / PPT|
|Parallel programming patterns - Map||Structured Parallel Programming (Ch. 3, 4)||PDF / PPT|
|Parallel programming patterns - Collective||Structured Parallel Programming (Ch. 5)||PDF / PPT|
|Parallel programming patterns - Data reorganization||Structured Parallel Programming (Ch. 6)||PDF / PPT|
|Parallel programming patterns - Stencil and recurrence||Structured Parallel Programming (Ch. 7)||PDF / PPT|
|Parallel programming patterns - Fork-join||Structured Parallel Programming (Ch. 8)||PDF / PPT|
|Parallel programming patterns - Pipeline||Structured Parallel Programming (Ch. 9)||PDF / PPT|
|Distributed memory message passing||MPI programming tutorials (online)||PDF / PPT|
|Parallel algorithms||Designing Parallel Algorithms (Ch. 4 and 5)||PDF / PPT|
|Parallel performance tools (1)||Methods for parallel performance analysis (online tutorials)||PDF / PPT|
|Parallel performance tools (2)||TAU Performance System tutorial||PDF / PPT|
|MPC: Multi-Processor Computing Framework (guest lecture)||MPC website||PDF / PPT|
|Computer Architecture and Structured Parallel Programming (guest lecture)||Intel Xeon Phi Coprocessor High-Performance Programming||PDF / PPT|
|Manycore computing and GPUs||NVIDIA CUDA training courses (online)||PDF / PPT|
A group term project can serve as a major source for student evaluation. Projects should be non-trivial to implement and should include more than one of the parallel patterns from the course. For the University of Oregon 2014 delivery, we elected to have students design and pitch their own projects. The course project has 4 deliverables paced over the duration of the course. The deliverables and pacing are designed to improve project success for the students and to provide documentation that supports the basis for grades.
The project proposal component aims to provide some practice for pitching projects. Proposals are expected to be short but provide enough detail to show that the project has sufficient parallel complexity and that the students have plotted a path to success. The proposal has 5 sections and an example is provided.
Executive Summary - A short (two-paragraph) description of the context of the project. This section can be used to evaluate whether the students can frame the motivation for developing a software system.
Project Description - A description of what the program will do. This section can be used to evaluate if the student project meets the functional requirements and check for project complexity.
High-level Architecture - A technical description of the components involved in implementing the solution, the functionality of each component, and the interconnections between components. This section can be used to verify that there are sufficient pieces to be distributed among the students in the group for implementation.
Parallel Plan - A description of parallelization targets within the project. This section can be used to verify that students are thinking about opportunities to apply parallel patterns and that there are enough parallel regions for each student to implement one.
Project Schedule - A breakdown of work targets at a granularity of one week or finer. This section gives the students an opportunity to practice project estimation and can help them self-identify projects that are too large or too small. This section is also valuable for following up on group progress during the term.
Each group should maintain a repository that the instructional staff can access. The instructional staff can check the repository for updates and log entries to verify that groups are making progress. Access to the repository can also simplify assisting students during office hours or via email. Repository history can also help support the basis for grades.
Each student should independently provide feedback to help the instructional staff understand issues that affected student and project success. Team evaluation results can also help identify unbalanced student effort. For the 2014 University of Oregon delivery, the following questions were used near the end of the project:
- How well have your time estimates and project plan worked out? Did you have big surprises?
- What part of the project do you think came together particularly well (e.g. module, integration point, design decision/compromise, performance experimentation, group meetings)? Please tell us briefly about that.
- Please tell us briefly about the part of the project you worked on. What design/implementation challenges did you face and how did you overcome them?
For each of your group members, please answer the following three questions:
- Team member name?
- What part of the project did he/she work on?
- What do you feel his/her largest positive contribution to the success of the project was? This could be clever code but also includes things like code review, design/architecture, organizing meetings/documentation, etc. Please provide some details/examples of how their contribution benefitted the project/team.
Project Report and Presentation
The group project report and presentation are primary tools used to support grading. The report details the project's final design and implementation responsibilities, interesting implementation details, experimental and performance observations, and lessons learned during the project. The presentations are expected to be short and provide an opportunity for students to present their projects to the rest of the class. The presentation period also supports instructional staff asking questions about the project.
To give the labs and project context, we introduced a fictitious organization. The Office of Strategic National Alien Planning (OSNAP) is charged with planning for and executing plans related to real and imagined extraterrestrial encounters. Students are free to ignore OSNAP completely.
The pattern labs are designed to be primarily coding activities for the students. Labs use a short slide deck to provide some context for the code to be parallelized and a brief refresher regarding the pattern to be used. Students are then expected to work with the code and ask questions. Spontaneous discussion around performance and design implications is encouraged.
An archive file containing the pattern labs' starting code and lab slides can be found here.
|Map||OSNAP Executives need a tool to recover their PIN from the associated hash. You will write a program to do so in each of the three technologies.||slides|
|Collective||OSNAP handles a large amount of sensitive data. Some of the data is classified such that only specific sets of people should be able to access it, and it should only be accessed when all of them are present (e.g. alien autopsy footage).
Our security consultants have devised an XOR-based encryption strategy to support the requirements and have provided a serial reference implementation. XOR is associative and commutative, so there may be an opportunity to improve performance with a parallel collective. OSNAP needs parallel implementations with which to conduct performance studies.
|Data Reorganization||Following the public release of the HeartBleed exploit against OpenSSL, OSNAP executive management no longer trusts software written and maintained by outside development organizations. A large number of OSNAP projects require efficient linear algebra code.
In this lab, you are to improve the performance of the matrix multiplication code using data reorganization. You may need to modify the algorithm and/or add additional logic to move the data. Performance measurement should only be done for the actual multiply logic. Time spent outside of the matrix_multiply function need not be measured.
|Stencil||OSNAP is investigating computer vision to aid in detection and classification of UFOs. One of the common first steps for computer vision is to compute edges. Counterintuitively, we get better edges if the image is blurred before applying a gradient filter.
You have been provided code for a serial Gaussian blur filter and code for Prewitt kernels to help highlight edges. Please improve the performance using parallelism and add the Prewitt gradient filters to the image processing.
|Fork/Join||OSNAP is investigating faster-than-light travel using a ring-shaped device and exotic particles. You have been provided a serial simulation implementation that we would like to have accelerated.||slides|
|Pipeline||OSNAP has intercepted a video signal of interest. We would like to view the signal in real time; however, several filters must be applied first. The provided serial implementation cannot keep up with the frame rate.
Due to space constraints, the sequence of video frames has not been included in the archive.
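The Collective lab's observation that XOR is associative and commutative can be illustrated with a small chunked reduction. The following is a minimal Python sketch, not the lab's reference implementation or one of its target technologies: the function names `xor_serial` and `xor_parallel`, the `workers` parameter, and the sample data are all hypothetical. It uses a thread pool only to show the shape of the decomposition; in CPython, pure-Python work on threads is not expected to run faster than the serial version.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce
from operator import xor


def xor_serial(values):
    """Serial reference (hypothetical): fold XOR over the values left to right."""
    return reduce(xor, values, 0)


def xor_parallel(values, workers=4):
    """Chunked reduction: reduce each chunk independently, then combine."""
    if not values:
        return 0  # 0 is the identity element for XOR
    chunk = -(-len(values) // workers)  # ceiling division
    chunks = [values[i:i + chunk] for i in range(0, len(values), chunk)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(xor_serial, chunks))
    # Regrouping the operands is safe because XOR is associative.
    return xor_serial(partials)


# Made-up sample data: 1000 byte-sized integers.
data = [(i * 2654435761) & 0xFF for i in range(1000)]
assert xor_parallel(data) == xor_serial(data)
```

Because the partial results are combined with the same operator, the grouping of the chunks does not change the answer. That property is what lets a reduction be split across workers in the first place.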