Solution for the afternoon midterm

1. Complexity

 

a. Give the formal definition of big-O and big-theta

 

 

b. Prove or disprove that for any three functions f(x), g(x) and s(x): s(x)=O(f(x)) and s(x)=O(g(x)) imply s(x) = O(f(x) + g(x)), but does not imply (f(x) + g(x)) = O(s(x))

Note: This is FAR more than was needed to get full credit on this problem. We just wanted to be as complete as possible!

 

Suppose:  and

 

                         (1)

                          (2)

 

(1) and (2) imply that we can introduce the two real number C1 and C2 and the two integers n1 and n2 such that:

 

          (3)

(4)

 

Let:

 and

 

Then from (3) and (4), we have:

         (5)

          (6)

 

Summing term to term the equations (5) and (6) leads to:

Which basically implies that:

 

Which means:

This is what we wanted to prove.

 

 

 

Now suppose again:  and

The following values are possible for the functions s, f and g:

Note that:

 

These function respect the hypothesis, but the following relation does not hold:

 

Therefore s(x)=O(f(x)) and s(x)=O(g(x)) does not imply (f(x) + g(x)) = O(s(x)).

 

 

 

c. Consider  and prove or disprove the these two statements:

h(x)=O(k(x)) and k(x)=O(h(x))

 

We know:

          (1)

              (2)

   (3)

 

We have:

            (4)

 

The same way, we know:

          (5)

          (6)

  (7)

 

Then have:

           (8)

 

Finally, (4) and (8) imply:

 

Which means:

  • 25 pts Search
    1. 15 pts: You are given two sorted arrays of doubles, a and b. There are N non-repeating doubles in a and M non-repeating doubles in b. Importantly, M3<N (for example N=1001, M=10).

      Implement the function

      unsigned countMissing(double a[], double b[], unsigned M, unsigned N);
      to count the doubles in a that are missing from b. No function calls should be performed in the body of this function. Your goal is to achieve as low asymptotic complexity as possible.
      #include <iostream>
      
      unsigned countMissing(double a[], double b[], unsigned M, unsigned N)
      // N is the size of a
      // M is the size of b
      // M is much smaller than N
      // Find: the number of elements in a which are not found in b
      // Approach: binary-search for each b[i] in a[]
      //           count elements found (K)
      //           return N-K
      {
         unsigned i,K=0;
         for(i=0; i!=M; i++)
         {
            // binary-search for b[i] in a[]; K++ if found
            unsigned left=0, right=N-1;
            if (a[left]==b[i] || a[right]==b[i]) { K++; continue; }
            while (left+1 < right) // binary search for b[i] in a[], takes log N
            {
               unsigned mid=(left+right)/2;
               if       (a[mid] < b[i]) left =mid;
               else if  (a[mid] > b[i]) right=mid;
               else     { /* cout << " found " << endl; */ K++;  break; }
            }
         }
         return N-K;
      }
      
      int main()
      {
         const unsigned N=10;
         const unsigned M=2;
         double a[N]={2.3, 3.2, 4.81, 5.16, 6.0, 6.1, 6.35, 100.1, 131.2, 131.3 };
         double b[M]={4.81, 4.82};
         cout <<  countMissing(a,b,M,N) << endl;
      }
      
      Note: the testing program, i.e., the main() function is given here so that you can paste this code into a file, compile and run. It was not needed on the exam.

    2. 5 pts: Give the tightest possible big-O complexity estimate for your program in terms of N; explain your answer. Refer to the known complexity results for binary search.

      Answer: O(N1/3log N)
      The inner while loop performs binary search in a[], and its complexity is that of binary search - O(log N). This loop is invoked M times, hence the algorithm takes O(M log N) time. For M< N1/3 this translates into O(N1/3log N).

    3. 5 pts: Describe an algorithm that is faster in the case M=N/10. Justify your answer by tightest possible big-O complexity of this algorithm.

      Linear-time set-difference, as in homework 2, will compute the answer in time O(N+M), which for M=N/10 translates into O(N). Note that this is very different from the above expression for the case M< N1/3.

  • 25 pts Sorted lists
    1. Your employer, Microserf, tracks the users of their product code-named IX.
      Each user is represented by the structure:
          struct User
          {
            char firstName[40]; 
            char lastName[40];  
          }; 

       

      Implement the function

      void printIllegalUsers(User* owners, User* renters, User* suspects);
      whose arguments are assumed to be sorted arrays without repetitions:
      • array owners contains the legitimate owners of IX.
      • array renters contains legitimate renters of IX.
      • array suspects contains suspected illegal users of IX.
      All arrays end with a sentinel element, bearing the name _Gill _Bates (the leading underscores make this name last in the sorted order).

      The function must print first names and last names of all suspected illegal users who do not own or rent a copy of IX.

      The asymptotic complexity of your algorithm will affect the grade. Your code should be compilable with g++. You may assume that operator<() and operator==() are available for the User structure. You cannot use STL for this question.

      /*void printillegal
       *
       *function to print out users who are:
       * in the suspect list, but not in the owner or the renter list
       *
       * Notes:  1) operator< has been overloaded for struct user
       *         2) all lists have entry "_Gill _Bates" as last entry (sentinel)
       * Credits: Vivek Shende and Kevin Lochner
       */
      
      void printillegal(user* rent, user* own, user* suspect)
      {
              user last;
              strcpy(last.first,"_Gill");
              strcpy(last.last,"_Bates");
              
              while(*suspect < last) {
                      while ((*own < *suspect) && (*own < last)) 
                              own++;
                      while ((*rent < *suspect) && (*rent < last))
                              rent++;
                      if ((*suspect < *own) && (*suspect < *rent))
                              cout << suspect->last << ", " << suspect->first << endl;
                      suspect++;
              }
      }
      

  • 25 pts Algorithms for sorting
    1. 10 pts: Examine the following sequence of ranges that gradually become sorted, mention all sorting algorithms (out of selection sort, bubble sort, insertion sort, shaker sort, quick sort and merge sort) that are consistent with this sequence.
      0 7 2 4 5 3 9 1
      0 7 2 4 3 5 1 9
      0 2 4 7 1 3 5 9
      0 1 2 3 4 5 7 9
      
      Answer: merge sort.

      Can't be selection or insertion sort because they would take 8 or 7 passes (some of passes in insertion sort may take constant time). Also, the last transition cannot be done by selection or insertion (they would have all but 1 number sorted at the last step).

      The last transition cannot be done by shaker or bubble sort either because 1 moves more than one position to the left, and 7 moves more than one position to the right.

      quicksort cannot produce that sequence because there is no consistent set of pivots.

    2. 15 pts: Implement shaker sort and prove its worst-case and best-case big-theta complexity (without referring to Chapter 6 in Sedgewick). Show best-case and worst-case inputs.

      See solutions to HWK #3 when they are posted.