CSE 373 Assignment 4 solutions

Problem 1

(a) The BuildHeap function in the book treats the initial array as a heap and then uses PercolateDown on the first N/2 elements to lower them into their correct position. This algorithm runs in O(N) time, and is different from simply doing N Inserts on an empty heap, which is slower. You needed to use the correct BuildHeap algorithm as given in the book.

Initial organization

Step 1,2,3: PercolateDown on 83, then 9, then 17

Step 4,5: PercolateDown on 40, then 12

Step 6: PercolateDown on 35

Final array: 2 5 35 6 9 36 40 17 66 41 12 83

(b) Each time an element is tested to move down one step in PercolateDown, two comparisons are needed. One to figure out which child is smaller, and one to figure out if the element is larger than its smallest child. The total number of element comparisons is 16.

The element at index 1 of the original array is 35. When it is percolated down (in step 6 in the diagram), the following comparisons are made: 2-5, 35-2, 36-40, 35-36.

(c) We put 1 in the next free slot and let it rise to its proper position. The final array looks like:

1 5 2 6 9 35 40 17 66 41 12 83 36

Problem 2

(a)

(b)

(c)

Problem 3

Some confusion was caused by the leaf nodes. The C and C++ books slightly vary in how many values a leaf node can hold. One says M/2 to M, the other says L/2 to L, for any arbitrary L. I was using yet another definition, where leaves can hold 1 to L values. All three are possible; there are many variations of B-trees out there. For the purposes of this assignment, some people may have felt that 48 should be combined with the 25,26 node. That is ok, since there were these various definitions floating around.

(a) Root has max 6 children, each child has max 6 leaf children. Total: 36.

(b) Root has min 2 children, each child has min 3 leaf children. Total: 6.

(c)

Deleting 59 empties out a leaf node, and its sibling doesn't have extra values to lend. It is removed, leaving the rightmost interior node with only one child. Again, it can't borrow values from its sibling, so it is merged with its sibling. This reduces the root to only 2 children, which is ok.

(d) You don't have to do anything, since all the interior navigation values still work. However, the book implies that the navigation values should be equal to the smallest value in that subtree, not just less-than-or-equal. In that case, the root's value would need to change to 24. Either answer is acceptable.

Problem 4

(a) T(N) = 3 T(N/3) + 1
T(0) = T(1) = T(2) = 1

(b) T(N) = 3 T(N/3) + N

(c) T(N) = T(N/3) + 1
T(0) = 1

(d) T(N) = T(N/3) + 1 = T(N/9) + 1 + 1 = T(N/27) + 1 + 1 + 1 ...
Therefore, T(N) = O(log3 N) = O(log N)

Problem 5

(a) If the input is sorted, then the middle element of the array will always be the median. Ignoring the issue of duplicate elements, or better yet handling duplicate elements correctly (i.e. dividing them evenly among the two partitions), this will lead to a perfect split of the array into two equal halves at each step. Quicksort's best case is O(N log N), which we achieve here.

(b)  At each step, we want the partitioned halves to have either the maximum or minimum element in the middle, so that each partition results in a size 0 half and a size N-1 half. After selecting the pivot, we swap it with the last element of the array to get it out of the way

2 4 6 8 1 5 3 7
2 4 6 7 1 5 3 8
2 4 6 3 1 5 7 8
2 4 5 3 1 6 7 8
2 4 1 3 5 6 7 8
2 3 1 4 5 6 7 8
2 1 3 4 5 6 7 8
1 2 3 4 5 6 7 8

Problem 6

Most people found that Insertion sort was only fastest on N=20, although Quicksort was often close or even better. This is because we used a naive cost function that treats all operations the same. Flipping through an old reference manual, I estimate the following true costs (these vary depending on processor, and the true details of instruction timing are much more complex than this simple summary):

Comparison: 1 clock cycle
Assignment: 1-2 clock cycles
Math: 2-20 clock cycles, depending on whether a multiply/divide is involved
Function call: 20-35 clock cycles

If we had used these weightings, we'd see that Insertion sort does quite well for small N since it does no math or function calls. The others are significantly more expensive than they appear, since they do so many function calls.