Programming Assignment 1

The aim of this programming Assignment is to introduce you to parallel programming. You are going to use the CAC cluster computing environment. Please apply for an account immediately, if you have not done so already. You would write a sequential program without any parallelism overheads and a parallel implementation with pthreads.

Due date: 11:59pm 28th Sep, 2009

PThreads

Pthreads POSIX threads or Pthreads is a library implementation of a standardized C threads API that enables shared memory programming. In particular, the API provides two major functionalities -
  • Thread Management: Functions for creating, detaching, joining threads etc...
  • Synchronization: Mutexes, Conditional variables etc.
For more information on the Pthreads, check this tutorial from Lawrence Livermore National Labs.

Programming Task

In this assignment, you would implement a serial and a parallel prefix sum calculator. The prefix sum problem is stated as follows: Given a sequence of numbers x1, x2, ..., xn, the prefix sums are the partial sums
      s1 = x1
      s2 = x1 + x2
        ...
      sn = x1 + x2 + ... + xn

Program Specifications:

  • Number of processors: This is a user-specified variable (either user-input or command line parameter). You can assume that the input would be in the range of 1-16.

  • Number of integers: This is a user-specified variable. Say the user inputs n.

  • Integers: The integers in the input sequence should be randomly generated inside your program. The program will randomly generate n integers.

  • Output: Your program should print out (either to the standard output or to a file) the prefix sums computed.

  • Timing: Print timing statistics for your program's execution. The important metric is the time taken to compute the prefix sums (exclude the time taken to generate the input sequence or account for it separately.)
You would write 2 programs:
  • Implement a sequential version of your algorithm for computing the prefix sums. This algorithm CANNOT be one of your parallel algorithms using one processor. It should be an algorithm devoid of the overhead of parallelism.

  • Next, implement a parallel version using pthreads. Discussion on parallel prefix computation could be found here.

Analysis and Report

Submit a brief report with the following sections:

  • Theoritical Analysis: Describe your sequential and parallel algorithms. Provide a concise description and a pseudo-code for each algorithm. Provide an analysis of the time and work complexity of your algorithms using asympotitic notation.

  • Experimental Setup: In this section, describe some specifics of your program implementation, which should include the following:
    • Machine specification.
    • How did you generate input sets? Which random function did you use?
    • What is your timing mechanism? Use of Unix's gethrtime() is recommended for Pthread programs.
    • How many times did you repeat each experiment?

  • Experimental Results: In this section, you should compare the performance (running time) of the experiments to their theoretical complexity. Here are a few suggested results that you can show:
    • Speedup (y-axis) vs. #processors (x-axis): For a reasonble data size, show the speedup for the parallel version as you vary the number of processors (1,2,4,8,..). The speedup should be calculated with respect to your sequential implementation. Also, report the absolute execution time of your sequential implementation.
    • Speedup (y-axis) vs. data size: For eight processor node, show the speedup over sequential implementation by varying n.
    • Comment on theoretically expected peformance and compare it to the observed performance.

Collaboration:

All assignments are to be completed on your own. We expect adherence to the Engineering Honor Code in all assignments.

Late Policy:

25% penalty for each late day.

Resources

A sample parallel program using pthread is posted here.. (use "save as.." to download it). You can go throught it and use any construct that you like.

Introduction to Parallel Computing
POSIX Thread Tutorial
MPI Tutorial
Old Tutorial: Introduction to Programming with Threads
CAC Cluster FAQs

Acknowledgment:

This assignment is derived from the CPSC 626: Parallel Algorithm Design and Analysis course at Texas A&M.
We thank Mark Hill from University of Wisconsin-Madison for the sample pthread program.