This is the home page for Peter Chapin's CIS-4230/5230 course notes for the Spring 2018 semester. Here you will find electronic versions of class slides, homework assignments, program samples, and links to other references of interest. If you are a student taking Parallel Programming you should bookmark this page.

• The course syllabus gives an overview of the course and its content, lists course resources, and describes the grading policy and related issues.
• The homework submission area and gradebook are on Moodle (graduate version), but all other course resources are here.
• I have prepared a short document on using Git to help you get familiar with the system. Some of the samples I will use in this course are on GitHub.
• Solarium is a solar system simulator that we will use as an example of a parallel programming application. It is a particular instance of a general style of problem called the "n-body problem."
• Spica is a library of my own construction. Clone this repository into a folder that is a sibling of the folder where Solarium is cloned; this configuration is assumed by the various Solarium project files.

## Topics

• 2018-01-15. Course introduction and overview.
• 2018-01-17. Detailed description of the Sum sample program that adds up all the items in an array.
• 2018-01-22. Discussed the dynamic processor version of the Sum sample program. Described the Voltage problem (Homework #2).
• 2018-01-24. Described Solarium (serial version), and sketched how to parallelize Solarium using pthreads in a manner similar to what Homework #2 requires.
• 2018-01-29. Amdahl's Law and Flynn's Taxonomy.
• 2018-01-31. Demonstrated Solarium (parallel version with and without barriers).
• 2018-02-05. Demonstrated Solarium (parallel version using thread pools), and introduced the Gaussian elimination algorithm.
• 2018-02-07. No class due to weather.
• 2018-02-12. More on Gaussian elimination using pthreads. Discussed Homework #3.
• 2018-02-14. No class.
• 2018-02-26. Introduced recursive parallel decomposition using the QuickSort sample.
• 2018-02-28. Discussed the Sudoku sample and Homework #4.
• 2018-03-05. Reviewed the QuickSort example and detailed the steps required for solving Homework #4.
• 2018-03-07. Introduced OpenMP. Demonstrated the OpenMP version of Solarium.
• 2018-03-12. Introduced the Barnes-Hut algorithm for solving the n-body problem.
• 2018-03-14. Showed the OpenMP demonstration program.
• 2018-03-19. Discussed the organization of the VTC cluster and demonstrated how to execute MPI programs on it. Showed the MPI demonstration program.
• 2018-03-21. Demonstrated Solarium on the cluster and started coverage of MPI specifics.
• 2018-03-26. More details on the MPI version of Solarium (creating user-defined datatypes). Gave an overview of the coming homework assignment.
• 2018-03-28. Discussed general multi-machine programming concepts (see slides).
• 2018-04-09. No class.
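
As a quick companion to the 2018-01-29 topic, Amdahl's Law bounds the speedup of a program by 1 / ((1 - p) + p / n), where p is the fraction of the run time that can be parallelized and n is the number of workers. The sketch below is only an illustration: the function name `amdahl_speedup`, its parameters, and the 90% parallel fraction are assumptions chosen for the example, not values taken from the course samples.

```python
def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Upper bound on speedup when only part of a program runs in parallel."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    # The serial part takes the same time; the parallel part is split evenly.
    return 1.0 / (serial_fraction + parallel_fraction / workers)


# Assumed example: a program that is 90% parallelizable.
for workers in (1, 2, 4, 8, 16):
    print(workers, round(amdahl_speedup(0.9, workers), 2))
```

Even with 16 workers, the assumed 10% serial portion keeps the bound well under 16, which is the main point of the law.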

## Homework

1. Homework #1 Parallel Introduction. In this assignment you will experiment with a simple parallel program that adds elements in an array. Due: 2018-01-22
2. Homework #2 Voltage Fields. In this assignment you will implement a parallel program that computes the voltages inside a square metal box. Due: 2018-01-29
3. Homework #3 Gaussian Elimination. In this assignment you will enhance the Gaussian elimination program for better performance. Due: 2018-02-28
4. Homework #4 Sudoku Solver. In this assignment you will write a parallel Sudoku solver. Due: 2018-03-14
5. Homework #5 Barnes-Hut. In this assignment you will enhance the serial version of Solarium's Barnes-Hut implementation to use MPI and OpenMP. Due: 2018-04-18
6. Homework #6 Prime Counting Function. In this assignment you will write a parallel program that computes the number of primes less than a specified value.
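
Several of the assignments above share one pattern: split a range into chunks, process each chunk independently, and combine the partial results. The sketch below shows that pattern for counting primes less than a given value. It is a minimal illustration under stated assumptions, not the approach any assignment requires: the names `is_prime`, `count_primes_in`, and `count_primes_below` are hypothetical, trial division is used only for brevity, and Python threads are used only to show the decomposition (CPython's global interpreter lock prevents a real speedup for this CPU-bound work).

```python
from concurrent.futures import ThreadPoolExecutor
from math import isqrt


def is_prime(n: int) -> bool:
    """Trial division; adequate for small illustrative inputs."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    for d in range(3, isqrt(n) + 1, 2):
        if n % d == 0:
            return False
    return True


def count_primes_in(lo: int, hi: int) -> int:
    """Count primes p with lo <= p < hi."""
    return sum(1 for n in range(lo, hi) if is_prime(n))


def count_primes_below(limit: int, workers: int = 4) -> int:
    """Count primes strictly less than limit, one chunk of [0, limit) per task."""
    if limit <= 2:
        return 0
    chunk = -(-limit // workers)  # ceiling division
    ranges = [(lo, min(lo + chunk, limit)) for lo in range(0, limit, chunk)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Each task returns a partial count; the sum is the final reduction.
        return sum(pool.map(lambda r: count_primes_in(*r), ranges))


print(count_primes_below(100))
```

Equal-sized chunks are not equal amounts of work here, because testing larger numbers costs more. Balancing the load across workers is one of the design questions a real implementation would need to address.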

## Machine Information

• You will be using lemuria extensively in this course. This document provides information on configuring your lemuria account for effective access to the cluster and for running MPI programs on the cluster.
• Intel Xeon E5620 is the processor in lemuria. It is a 64-bit quad-core processor with hyperthreading, clocked at 2.40 GHz, with a 12 MiB L3 cache. Lemuria is a dual-processor machine (with two E5620 processors for a total of 16 hardware threads).
• Intel Xeon W3520 is the processor in the VTC cluster nodes. It is a 64-bit quad-core processor with hyperthreading, clocked at 2.67 GHz, with an 8 MiB L3 cache. The formal datasheets for the 3500 series Xeon processors are here: Volume 1, Volume 2.
• The GPU on the cluster nodes is NVIDIA's GTX 285. This GPU is CUDA-capable.
• Intel64 and Intel IA-32 documentation direct from Intel.

Last Revised: 2018-04-25