Due Monday, March 12th, 6pm, EECS 380 box in room 2420 EECS

You are to turn this in with all parts *stapled* together.
Please use two-sided printing to reduce the amount of paper in homeworks,
if you can (graders are having trouble carrying all homeworks!).
Make sure to mention the following on the first page:
your name, uniquename and the number/time of your discussion section.

- Racing STL's sort() versus the qsort() function from stdlibc
- Improving the worst-case time of the quicksort algorithm
- Incremental sorting
- Visualization of bubble sort and shaker sort

A sample call to STL's sort is given in the STL's
documentation.
A sample call to qsort is given in Sedgewick on p. 303.
Additionally, all functions from the C standard library have
man pages (try `man qsort`

, a caveat in this man page
is discussed in the newsgourp).

You need to use the following comparison methodology:

- For a given N, populate an array with N randomly generated large
integers. For this use the srand() and rand() calls (or srandom()
and random()). Type
`man rand`

and`man random`

for details on those two (here's an optional journal article on rand() and random()). It is recommended that you print your random numbers when testing the code to make sure they "look random". - You will need two copies of the array --- one for sort and the
other for qsort. Run sort and qsort once, and measure the cpu time
for each. Make sure that N is large enough to give reliable readings
without repeated runs. You must compile your code with the compiler
flag
`-O3`

- Change the random seed (otherwise, you will always generate the same
"random" numbers), repeat the experiment for the same N,
compute the average runtime over all experiments done so far
for the same N. You can stop when the average stabilizes to within 10%.
- For a given N, you need to figure out which sorting call is faster and by how much on average (by which factor). Use three values N>10000, each at least 2x apart from each other.

`-O3`

, and are very likely to get a wrong answer otherwise.
Feel free to explain your results, but explanations will not be graded.

`n`

is half the size of the container).
For this, build a plot of its average run time as
a function of the input size for at least 6 (significantly
different) values of N. Every datapoint needs to be averaged as
explained in part1, but you don't need to compare it to anything.
Assume that you are given a function `median()`

(with whatever arguments you wish) that computes the median
in *worst-case* linear time (STL's implementation
apparently does not). A very non-trivial algorithm for this
is explained in detail in the CLR book (pp.189-196), but you are
not required to read that --- just assume that you can use its
implementation.

- Given the example of worst-case input for the original quick sort
algorithm described in the Sedgwick book (page 321); explain why
the median-of-three pivoting and choosing a pivot at random, each,
do not improve the worst-case big-theta complexity of qsort.
- Consider modifying the original qsort algorithm using linear-time median selection so that the modified algorithm will have better worst-case complexity. Consider two cases: (i) each element in the array can only repeat O(1) times, (ii) the general case. You need to provide a proof of its worst-case complexity (it can refer to the proof of worst case complexity of qsort and simply analyze modifications).

`qsort()`

may not be competitive
with original `qsort`

on average, despite better
asymptotic worst-case performance and asymptotically equivalent
average-case performance.
- Implement a generic sorting algorithm that takes linear worst-case time on inputs generated as explained in the previous paragraph --- you are allowed to paste any code from Chapter 6 of our text. Empirically show that it takes linear time on average (similar to experiments in part 2) and compare its runtime to the faster of sort()/qsort() (similar to the experiments in part 1). Remember that all inputs must be almost-sorted ranges from 3a).

- You need to have a working implementation of bubble sort for arrays
(can paste from Sedgewick). Shaker sort is a variant of bubble sort
in which passes are performed left-to-right and right-to-left in turns,
rather than in one direction as in bubble sort (why would this be useful?).
You need a shaker sort implementation as well. Both implementations
should terminate once the range is sorted (rather than perform "dummy"
passes).
- Using the index-sorting implementation in the
handout,
write a function plot() that takes a range and saves its sorted index
so that the index can be plotted by gnuplot (i.e., two unsigned's
per line).
- Use N=200 for your experiments, use your student ID as random seed. Generate an array of N random numbers as in part 1. Find out how many passes bubble sort and shaker sort perform on this array. Modify the codes of bubble sort and shaker sort so that plot() is called half way through (e.g., if there are 90 passes total, plot() should be called at pass 45). You need to produce two pictures. They will probably look similar to Fig 6.7 in Sedgewick.

Part 1: * Source code (one file). * A table with 9 numbers. * A recommendation in English as to whether sort() or qsort() should be used when speed matters. Part 2: * A plot showing that STL's median selection runs in linear time on average. * A description of worst-case inputs for qsort. * An explanation of why both median-of-three and random pivoting do not change worst-case big-theta. * A [theoretical] proof of the worst-case run time when using the median as the pivot. [You do not need to implement this modified quicksort] Part 3: * Source code (one file). * One plot showing that your algorithm has linear average-case complexity on the almost-sorted ranges considered (plot as in part 2) * A comparison between your algorithm and the faster of sort()/sqort() --- table as in part 1. Part 4: * Source code for bubble and shaker sort. * An explanation why it makes sense to perform bubble-passes in both directions. * Two plots. (Plots produced by different students must be different. If plots submitted by different students are the same, both homeworks may be disqualified (0 score for the whole homework)).