MIT6 Solution File
(b) T F [2 points] Radix sort runs correctly when using any correct sorting algorithm to
sort each digit.
(c) T F [2 points] Given an array A[1 . . n] of integers, the running time of Counting Sort
is polynomial in the input size n.
Solution: False. Counting Sort's running time depends on the size of the numbers
in the input, so it is pseudo-polynomial.
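To see where the dependence on the key size comes from, here is a minimal counting sort sketch. It assumes non-negative integer keys, and the name `counting_sort` is ours. The `counts` array has max(A) + 1 entries, so the work grows with the largest value k as well as with n.

```python
def counting_sort(a):
    """Sort a list of non-negative integers in O(n + k) time, where k = max(a)."""
    if not a:
        return []
    k = max(a)
    counts = [0] * (k + 1)      # size depends on the largest VALUE, not on n
    for x in a:
        counts[x] += 1
    out = []
    for value, c in enumerate(counts):
        out.extend([value] * c)  # emit each value as many times as it occurred
    return out
```

Sorting the two-element array [0, 2**30] with this sketch allocates about a billion counters, which is the pseudo-polynomial behavior the solution refers to.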
(d) T F [2 points] Given an array A[1 . . n] of integers, the running time of Heap Sort is
polynomial in the input size n.
Solution: True. Heap Sort runs in O(n log n) time on a RAM machine.
(e) T F [2 points] Any n-node unbalanced tree can be balanced using O(log n) rotations.
Solution: False. The worst-case unbalanced tree is a list, and balancing it
requires Ω(n) rotations.
(f) T F [2 points] If we augment an n-node AVL tree to store the size of every rooted
subtree, then in O(log n) we can solve a range query: given two keys x and y,
how many keys are in the interval [x, y]?
Solution: True. AVL trees can be used to sort N numbers in O(N log N) time,
by inserting all the numbers in the tree, and iteratively calling NEXT-LARGEST
N times.
Solution: False. The level of a vertex only provides the length of the shortest
path from s.
(i) T F [2 points] Depth-first search will take Θ(V^2) time on a graph G = (V, E)
represented as an adjacency matrix.
Solution: True. In this case, finding the neighbors of a vertex takes O(V) time,
which makes the total running time Θ(V^2).
6.006 Final Exam Solutions Name 3
Solution: False. The adjacency list structure needs to be traversed to find the
incoming edges for each vertex. This structure has total size Θ(V + E), so this
takes Θ(V + E) time to compute.
Solution: True. We need to first perform a modification like the one seen in the
recitation notes.
(n) T F [2 points] For every dynamic program, we can assign weights to edges in the
directed acyclic graph of dependences among subproblems, such that finding a
shortest path in this DAG is equivalent to solving the dynamic program.
(a) [3 points] You are running a library catalog. You know that the books in your col-
lection are almost in sorted ascending order by title, with the exception of one book
which is in the wrong place. You want the catalog to be completely sorted in ascending
order.
1. Insertion Sort
2. Merge Sort
3. Radix Sort
4. Heap Sort
5. Counting Sort
1. INIT(N): Initialize the data structure for N empty rooms numbered 1, 2, . . . , N, in
polynomial time.
2. COUNT(l, h): Return the number of available rooms in [l, h], in O(log N) time.
3. CHECKIN(l, h): In O(log N) time, return the first empty room in [l, h] and mark it occupied,
or return NIL if all the rooms in [l, h] are occupied.
4. CHECKOUT(x): Mark room x as not occupied, in O(log N) time.
(a) [6 points] Describe the data structure that you will use, and any invariants that your
algorithms need to maintain. You may use any data structure that was described in a
6.006 lecture, recitation, or problem set. Don't give algorithms for the operations of
your data structure here; write them in parts (b) through (e) below.
Solution: We maintain a range tree, where the nodes store the room numbers of the
rooms that are not occupied.
Recall from Problem Set 3 that a range tree is a balanced Binary Search Tree, where
each node is augmented with the size of the node's subtree.
Footnote 1: Conferences often reserve a contiguous block of rooms, and attendees want to stay
next to people with similar interests.
(b) [3 points] Give an algorithm that implements INIT(N). The running time should be
polynomial in N.
Solution: All the rooms are initially empty, so all their numbers (1 . . . N) must be
inserted into the range tree.
INIT(N)
    for i ∈ 1 . . . N
        INSERT(i)
Solution: The COUNT method in range trees returns the desired answer. The number
of tree nodes between l and h is exactly the number of unoccupied rooms in the
[l, h] interval.
(d) [5 points] Give an algorithm that implements CHECKIN(l, h) in O(log N) time.
Solution: Finding the first available room with number ≥ l is equivalent to finding
the successor of l - 1 in the BST. If that successor does not exist or is larger than h,
every room in [l, h] is occupied.
CHECKIN(l, h)
    r = NEXT-LARGEST(l - 1)
    if r == NIL or r.key > h
        return NIL
    DELETE(r.key)
    return r.key
(e) [3 points] Give an algorithm that implements CHECKOUT(x) in O(log N) time.
Solution: When a guest checks out of a room, the room becomes unoccupied, so its
number must be inserted into the range tree.
CHECKOUT(x)
    INSERT(x)
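The four operations can be tried out with a short runnable sketch. Note the substitution: instead of the size-augmented balanced BST used in the solution, this sketch keeps a Fenwick (binary indexed) tree over a 0/1 "room is free" array, which gives the same O(log N) bounds with much less code. The class name `Hotel` and its method names are ours, not part of the exam.

```python
class Hotel:
    """Rooms 1..N; self.tree is a Fenwick tree over free[i] (1 = free room)."""

    def __init__(self, n):                      # INIT(N), O(N) in this sketch
        self.n = n
        self.free = [0] + [1] * n
        self.tree = [0] * (n + 1)
        for i in range(1, n + 1):               # linear-time Fenwick build
            self.tree[i] += 1
            parent = i + (i & -i)
            if parent <= n:
                self.tree[parent] += self.tree[i]

    def _add(self, i, delta):
        while i <= self.n:
            self.tree[i] += delta
            i += i & -i

    def _prefix(self, i):                       # number of free rooms in [1, i]
        total = 0
        while i > 0:
            total += self.tree[i]
            i -= i & -i
        return total

    def count(self, l, h):                      # COUNT(l, h)
        return self._prefix(h) - self._prefix(l - 1)

    def _kth_free(self, k):                     # smallest i with _prefix(i) >= k
        pos = 0
        step = 1 << self.n.bit_length()
        while step:
            nxt = pos + step
            if nxt <= self.n and self.tree[nxt] < k:
                pos = nxt
                k -= self.tree[nxt]
            step >>= 1
        return pos + 1

    def check_in(self, l, h):                   # CHECKIN(l, h)
        before = self._prefix(l - 1)
        if self._prefix(h) == before:
            return None                         # NIL: no free room in [l, h]
        room = self._kth_free(before + 1)       # first free room numbered >= l
        self.free[room] = 0
        self._add(room, -1)
        return room

    def check_out(self, x):                     # CHECKOUT(x)
        if not self.free[x]:
            self.free[x] = 1
            self._add(x, +1)
```

The design mirrors the solution: "first free room ≥ l" plays the role of NEXT-LARGEST(l - 1), and the prefix counts play the role of subtree sizes.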
(a) [5 points] Assume that there are exactly k slots in the table that are completely full.
What is the probability s(k) that the first probe is successful, given that there are
exactly k full slots?
Solution: There are m - k slots in which the first probe can land successfully, out
of m slots in total. The probability of landing in any given slot is 1/m. Therefore, the
success probability is s(k) = (m - k)/m.
(b) [5 points] Assume that p(k) is the probability that there are exactly k slots in the table
that are completely full, given that there are already n keys in the table. What is the
probability that the first probe is successful in terms of p(k)?
Solution:
\[ \sum_{k=0}^{n/2} \frac{m-k}{m} \, p(k) \]
Solution: p(0) is essentially the probability that no keys collide. The probability
that the first element doesn't collide with any previous keys is 1. The probability that
the second element doesn't collide with any previous keys is 1 - 1/m. In general, the
probability that the i-th element doesn't collide with any previous keys, conditioned
on the assumption that previous keys did not collide and thus occupy i - 1 slots, is
1 - (i - 1)/m. Therefore the overall probability is the product
\[ \prod_{i=1}^{n} \left( 1 - \frac{i-1}{m} \right) = \frac{m!}{(m-n)!} \cdot \frac{1}{m^n}. \]
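As a quick numerical sanity check of the product and its closed form, the following sketch evaluates both sides. The function names `p0_product` and `p0_closed_form` are ours, and the check assumes n ≤ m.

```python
from math import factorial, prod

def p0_product(n, m):
    """Product over i = 1..n of (1 - (i - 1)/m)."""
    return prod(1 - (i - 1) / m for i in range(1, n + 1))

def p0_closed_form(n, m):
    """m! / ((m - n)! * m^n), valid for n <= m."""
    return factorial(m) / (factorial(m - n) * m ** n)
```

For example, with n = 3 keys and m = 10 slots both expressions evaluate to 1 · 0.9 · 0.8 = 0.72, up to floating-point rounding.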
Solution: There are two solutions to this problem. The first is the direct application of Newton's
method.
The second way is to use the formula for the roots of a quadratic equation and compute
√3 to d digits of precision.
For the first method, we use Newton's formula:
\[ x_{i+1} = x_i - \frac{f(x_i)}{f'(x_i)} \]
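A minimal sketch of the Newton iteration follows. The problem statement is not reproduced above, so the target function f(x) = x^2 - 3 (whose positive root is √3) is an assumption chosen for illustration, as are the names `newton`, `tol`, and `max_iter`. The sketch uses ordinary floats rather than d-digit arithmetic.

```python
def newton(f, f_prime, x0, tol=1e-12, max_iter=100):
    """Iterate x_{i+1} = x_i - f(x_i) / f'(x_i) until successive iterates agree."""
    x = x0
    for _ in range(max_iter):
        x_next = x - f(x) / f_prime(x)
        if abs(x_next - x) < tol:
            return x_next
        x = x_next
    return x

# Assumed example: the positive root of x^2 - 3, starting from x0 = 2.
root = newton(lambda x: x * x - 3, lambda x: 2 * x, x0=2.0)
```

Each iteration roughly doubles the number of correct digits once the iterate is close to the root, which is why Newton's method is attractive when d digits of precision are required.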
(a) [10 points] Suppose you are also given a lookup table T where T[u] for u ∈ V is
a list of guests that u knows. If u knows v, then v knows u. You are required to
arrange the seating such that any guest at a table knows every other guest sitting at the
same table either directly or through some other guests sitting at the same table. For
example, if x knows y, and y knows z, then x, y, z can sit at the same table. Describe
an efficient algorithm that, given V and T , returns the minimum number of tables
needed to achieve this requirement. Analyze the running time of your algorithm.
NUM-TABLES(V, T)
    visited = {}
    n = 0
    for s ∈ V
        if s ∉ visited
            n = n + 1
            add s to visited
            DFS-VISIT(s, T, visited)
    return n
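The pseudocode counts connected components of the "knows" graph. A runnable sketch, assuming `T` is a dict mapping each guest to a list of acquaintances and using an explicit stack in place of the recursive DFS-VISIT (the name `num_tables` is ours):

```python
def num_tables(V, T):
    """Return the number of connected components of the acquaintance graph."""
    visited = set()
    tables = 0
    for s in V:
        if s not in visited:
            tables += 1                 # s starts a new component, hence a new table
            visited.add(s)
            stack = [s]
            while stack:                # iterative depth-first traversal from s
                u = stack.pop()
                for v in T[u]:
                    if v not in visited:
                        visited.add(v)
                        stack.append(v)
    return tables
```

Every guest is pushed at most once and every adjacency list is scanned once, so the sketch runs in O(V + E) time.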
(b) [10 points] Now suppose that there are only two tables, and you are given a different
lookup table S where S[u] for u ∈ V is a list of guests who are on bad terms with u.
If v is on bad terms with u, then u is on bad terms with v. Your goal is to arrange the
seating such that no pair of guests sitting at the same table are on bad terms with each
other. Figure 1 below shows two graphs in which we present each guest as a vertex
and an edge between two vertices means these two guests are on bad terms with each
other. Figure 1(a) is an example where we can achieve the goal by having A, C sitting
at one table and B, E, D sitting at another table. Figure 1(b) is an example where we
cannot achieve the goal. Describe an efficient algorithm that, given V and S, returns
TRUE if you can achieve the goal or FALSE otherwise. Analyze the running time of
your algorithm.
[Figure 1: two conflict graphs on guests A, B, C, D, E, shown as panels (a) and (b).]
Solution: Let G = (V, E) be the undirected graph where V is the set of guests
and (u, v) ∈ E if u and v are on bad terms. S represents the adjacency lists. We
can achieve the goal if and only if there is no cycle with odd length in the graph. We can
check this by iterating through s ∈ V. If s is not visited, color it WHITE, and
call DFS-VISIT(s, S) or BFS(s, S). During the traversal, if v is not visited, mark it
as visited and color it BLACK if its parent is WHITE and vice versa. If v is visited,
and the color we want to apply is different from its current color, we find a conflict
(Figure 1(b)), and we can terminate and return FALSE. If there is no conflict after
iterating through all the vertices (Figure 1(a)), return TRUE. The running time is again
O(V + E). Below is the pseudocode.
CAN-SEPARATE(V, S)
    color = {}
    WHITE = 0
    for s ∈ V
        if s ∉ color    // s is not visited
            if DFS-VISIT(s, S, WHITE, color) == FALSE
                return FALSE
    return TRUE
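The two-coloring test can be sketched as follows, using the BFS variant that the solution mentions as an alternative to DFS-VISIT. `S` is assumed to be a dict of adjacency lists, and the name `can_separate` and the 0/1 color encoding are ours.

```python
from collections import deque

def can_separate(V, S):
    """Return True iff the conflict graph is bipartite (two tables suffice)."""
    color = {}
    for s in V:
        if s in color:
            continue                    # already colored by an earlier traversal
        color[s] = 0
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in S[u]:
                if v not in color:
                    color[v] = 1 - color[u]   # opposite table from u
                    queue.append(v)
                elif color[v] == color[u]:
                    return False              # odd cycle: u and v clash
    return True
```

A path of three guests is separable, while a triangle of mutual conflicts is not, since it is an odd cycle.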
(a)
\[ DP(i, X) = \max \begin{cases} DP(i+1, X) + x_i, \\ DP(i+1, X - x_i) + x_i^2 & \text{if } X \ge x_i \end{cases} \]
1. Exponential
2. Polynomial
3. Pseudo-polynomial
4. Infinite
Solution: Pseudo-polynomial
(b)
\[ DP(i, X) = \max \begin{cases} DP(i+1, S) + x_i, \\ DP(0, X - x_i) + x_i^2 & \text{if } X \ge x_i \end{cases} \]
1. Exponential
2. Polynomial
3. Pseudo-polynomial
4. Infinite
Solution: Infinite
(c)
\[ DP(i, X) = \max \begin{cases} DP(i+1, 0) + x_i, \\ DP(0, X - x_i) + x_i^2 & \text{if } X \ge x_i \end{cases} \]
1. Exponential
2. Polynomial
3. Pseudo-polynomial
4. Infinite
(d)
\[ DP(i, X) = \max \begin{cases} DP(i+1, X) + x_i, \\ DP(i+1, 0) + x_i^2 \end{cases} \]
1. Exponential
2. Polynomial
3. Pseudo-polynomial
4. Infinite
Solution: Polynomial
(e)
\[ DP(i, X) = \max \left\{ DP\Bigl(i+1,\; X - \sum S\Bigr) + \Bigl(\sum S\Bigr)^2 \right\} \quad \text{for every subset } S \subseteq \{x_0, x_1, \ldots, x_{n-1}\} \]
1. Exponential
2. Polynomial
3. Pseudo-polynomial
4. Infinite
Solution: Exponential
Solution: (a) pseudo-polynomial, (b) infinite, (c) pseudo-polynomial, (d) polynomial,
(e) exponential.
x_1 = 13, x_2 = 93, x_3 = 86, x_4 = 50, x_5 = 63, x_6 = 4
Then DP(5, TRUE) = 4, because the longest possible alternating sequence ending in x_5 with an
increase at the end is x_1, x_2, x_4, x_5 or x_1, x_3, x_4, x_5. However, DP(5, FALSE) = 3, because if the
sequence has to decrease at the end, then x_4 cannot be used.
(a) [4 points] Compute all values of DP(i, b) for the above sequence. Place your answers
in the following table:
              i=1   i=2   i=3   i=4   i=5   i=6
b = TRUE
b = FALSE
The second mistake was over the definition of DP(i, b). In the problem, we explicitly
define DP(i, b) to be the length of the longest subsequence that ends on x_i and is
increasing iff b is TRUE. As a result, DP(6, TRUE) is equal to 1, not 4, because the
only ascending subsequence ending on the value x_6 = 4 is the subsequence ⟨x_6⟩.
(b) [4 points] Give a recurrence relation to compute DP(i, b).
The most common mistake for this problem involved confusion over the definition of
DP(i, b). Many people gave or attempted to give the following recurrence:
\[ DP(i, \text{TRUE}) = \max \begin{cases} DP(i-1, \text{TRUE}) \\ DP(i-1, \text{FALSE}) + 1 & \text{if } x_i > x_{i-1} \end{cases} \]
\[ DP(i, \text{FALSE}) = \max \begin{cases} DP(i-1, \text{FALSE}) \\ DP(i-1, \text{TRUE}) + 1 & \text{if } x_i < x_{i-1} \end{cases} \]
Unfortunately, this recurrence does not compute the value that we asked for. The value
DP(i, b) is specifically defined as the length of the longest alternating subsequence
that ends with x_i, and ends in an ascending pair if and only if b is TRUE. The above
recurrence relation instead computes the length of the longest alternating subsequence
of x_1, . . . , x_i, not necessarily ending on x_i, that ends in an ascending pair if and only
if b is TRUE.
(c) [4 points] Give the base cases of your recurrence relation.
Solution: The base cases matching the recurrence relation above are:
DP(i, TRUE) = 1 if x_i = min{x_1, . . . , x_i}
DP(i, FALSE) = 1 if x_i = max{x_1, . . . , x_i}
Solution: The correct order is to iterate through the values of i in increasing order,
and compute DP(i, TRUE) and DP(i, FALSE) for each i. The recurrence relation has
DP(i, b) dependent only on values DP(j, b') for j < i, so increasing order will give us
what we want.
(e) [3 points] If you were given the values of DP(i, b) for all 1 ≤ i ≤ n and all b ∈
{TRUE, FALSE}, how could you use those values to compute the length of the longest
alternating subsequence of x_1, x_2, . . . , x_n?
Solution: There were multiple acceptable answers here. It's sufficient to either take
the maximum of DP(n, TRUE) and DP(n, FALSE), or to take the maximum over all
values in the table.
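The pieces discussed in parts (b) through (e) can be sketched in code. The recurrence used here is our reading of the stated definition, not a quotation of the official answer: DP(i, TRUE) is 1 plus the best DP(j, FALSE) over j < i with x_j < x_i (or 1 if no such j exists), and symmetrically for FALSE. The name `alternating_table` and the 0-based lists `up` and `down` are ours.

```python
def alternating_table(x):
    """Return (up, down) where up[i] ~ DP(i+1, TRUE) and down[i] ~ DP(i+1, FALSE)."""
    n = len(x)
    up = [1] * n        # longest alternating subsequence ending at x[i] with a rise
    down = [1] * n      # same, but ending with a fall
    for i in range(n):                  # increasing i, as in part (d)
        for j in range(i):
            if x[j] < x[i]:
                up[i] = max(up[i], down[j] + 1)
            elif x[j] > x[i]:
                down[i] = max(down[i], up[j] + 1)
    return up, down

up, down = alternating_table([13, 93, 86, 50, 63, 4])
longest = max(up + down)                # part (e): maximum over the whole table
```

On the example sequence this sketch reproduces the values quoted in the text, DP(5, TRUE) = 4, DP(5, FALSE) = 3, and DP(6, TRUE) = 1. The doubly nested loop does Θ(i) work for each i.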
(f) [2 points] When combined, parts (b) through (e) can be used to write an algorithm
such as the following:
Solution: Computing the recurrence for DP(i, b) takes time Θ(i). When we sum
this up over the values of i ranging from 1 to n, we get Θ(n^2) for our running time.
Note, however, that what mattered for this question was correctly analyzing the runtime
for the recurrence relation you gave, so answers of O(n^2) would be marked wrong
(asymptotically loose) if the recurrence relation given actually resulted in a runtime of
Θ(n).
Wrong answer: 6 + (0 × 6) = 6 + 0 = 6.       Wrong answer: 0.1 × (0.1 + 0.1) = 0.1 × 0.2 = 0.02.
Right answer: (6 + 0) × 6 = 6 × 6 = 36.      Right answer: (0.1 × 0.1) + 0.1 = 0.01 + 0.1 = 0.11.
To save yourself from tedium, but still impress your friends, you decide to implement an algorithm
to solve these puzzles. The input to your algorithm is a sequence x_0, o_0, x_1, o_1, . . . , x_{n-1}, o_{n-1}, x_n
of n + 1 real numbers x_0, x_1, . . . , x_n and n operators o_0, o_1, . . . , o_{n-1}. Each operator o_i is either
addition (+) or multiplication (×). Give a polynomial-time dynamic program for finding the optimal
(maximum-outcome) parenthesization of the given expression, and analyze the running time.
Solution: The following dynamic program is the intended correct answer, though it ignores a
subtle issue detailed below (which only three students identified, and received bonus points for).
It is similar to the matrix-multiplication parenthesization dynamic program we saw in lecture, but
with a different recurrence.
1. For subproblems, we use substrings x_i, o_i, . . . , o_{j-1}, x_j, for each 0 ≤ i ≤ j ≤ n. Thus there are
Θ(n^2) subproblems.
2. To solve DP[i, j], we guess which operation o_k is outermost, where i ≤ k < j. There are
j - i = O(n) choices for this guess.
3. The resulting recurrence relation is
\[ DP[i, j] = \max_{k=i}^{j-1} \bigl( DP[i, k] \;\, o_k \;\, DP[k+1, j] \bigr). \]
The subtle issue is that this dynamic program assumes that, in order to maximize the sum or product
of two numbers, we aim to maximize the two arguments. This assumption is true if the numbers
are all nonnegative, as in the examples. If some numbers can be negative, however, then it is not so
easy to maximize the product of two numbers. If both of the numbers are negative, so the product is
positive, then the goal is to minimize both numbers (i.e., maximizing their absolute values); but if
exactly one of the numbers is negative, so the product is negative, then maximization is equivalent
to maximizing the negative number and minimizing the positive number (i.e., minimizing their
absolute values).
To deal with this issue, we can define two subproblems: DP_max[i, j] is the maximum possible
value for the substring x_i, . . . , x_j, as above, while DP_min[i, j] is the minimum possible value for
the same substring. Instead of working out which of the two subproblems we need, we can simply
guess among the four possibilities, and choose the best. The recurrence relation thus becomes, for
m ∈ {max, min},
\[ DP_m[i, j] = \mathop{m}\limits_{k=i}^{j-1} \;\; \mathop{m}\limits_{m_1, m_2 \in \{\max, \min\}} \bigl( DP_{m_1}[i, k] \;\, o_k \;\, DP_{m_2}[k+1, j] \bigr). \]
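The max/min recurrence can be sketched bottom-up as below. The function name `best_parenthesization`, the string encoding of operators as "+" and "*", and the base case DP[i, i] = x_i are our assumptions for illustration. With Θ(n^2) subproblems and O(n) guesses each, the sketch performs on the order of n^3 table updates.

```python
import operator

OPS = {"+": operator.add, "*": operator.mul}

def best_parenthesization(xs, ops):
    """Maximum value over all parenthesizations of xs[0] ops[0] xs[1] ... xs[-1]."""
    count = len(xs)                                   # n + 1 numbers
    hi = [[None] * count for _ in range(count)]       # DP_max[i][j]
    lo = [[None] * count for _ in range(count)]       # DP_min[i][j]
    for i in range(count):
        hi[i][i] = lo[i][i] = xs[i]                   # single number: only one value
    for length in range(2, count + 1):
        for i in range(count - length + 1):
            j = i + length - 1
            candidates = [
                OPS[ops[k]](a, b)
                for k in range(i, j)                  # guess the outermost operator
                for a in (hi[i][k], lo[i][k])         # guess max or min on the left
                for b in (hi[k + 1][j], lo[k + 1][j]) # guess max or min on the right
            ]
            hi[i][j] = max(candidates)
            lo[i][j] = min(candidates)
    return hi[0][count - 1]
```

On the two examples from the problem statement, this returns 36 for 6 + 0 × 6 and approximately 0.11 for 0.1 × 0.1 + 0.1.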
A magic balance scale with 3 pans. When given 3 balls of fluff, the scale will point out the
ball with the median weight. The scale only works reliably when each pan has exactly 1 ball
of fluff in it. Let MEDIAN(x, y, z) be the result of weighing balls x, y and z, which is the
ball with the median weight. If MEDIAN(x, y, z) = y, that means that either x < y < z or
z < y < x.
A high-precision classical balance scale. This scale takes 2 balls of fluff, and points out which
ball is lighter; however, because fluff is very light, the scale can only distinguish between the
overall lightest and the overall heaviest balls of fluff. Comparing any other balls will not
yield reliable results. Let LIGHTEST(a, b) be the result of weighing balls a and b. If a is the
lightest ball and b is the heaviest ball, LIGHTEST(a, b) = a. Conversely, if a is the heaviest
ball and b is the lightest ball, LIGHTEST(a, b) = b. Otherwise, LIGHTEST(a, b)'s return value
is unreliable.
On the bright side, you can assume that all N balls have different weights. Naturally, you want to
sort the balls using as few weighings as possible, so you can escape your dream quickly and wake
up before 4:30pm!
To ponder this challenge, you take a nap and enter a second dream within your first dream. In the
second dream, a fairy shows you the lightest and the heaviest balls of fluff, but she doesn't tell you
which is which.
(a) [2 points] Give a quick example to argue that you cannot use MEDIAN alone to
distinguish between the lightest and the heaviest ball, but that LIGHTEST can let you
distinguish.
(b) [4 points] Given l, the lightest ball pointed out by the fairy, use O(1) calls to
MEDIAN to implement LIGHTER(a, b), which returns TRUE if ball a is lighter than
ball b, and FALSE otherwise.
Solution:
LIGHTER(a, b)
    if a == l
        return TRUE
    if b == l
        return FALSE
    if MEDIAN(l, a, b) == a
        return TRUE
    else
        return FALSE
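A runnable sketch of LIGHTER follows. Since the magic scale is not available, `make_median` simulates it from a hypothetical dict of ball weights; that helper, the dict, and the explicit `l` and `median` parameters are assumptions made for illustration.

```python
def make_median(weight):
    """Simulate the 3-pan scale for a hypothetical dict mapping ball -> weight."""
    def median(x, y, z):
        return sorted((x, y, z), key=weight.get)[1]
    return median

def lighter(a, b, l, median):
    """Return True iff ball a is lighter than ball b, given the lightest ball l."""
    if a == l:
        return True
    if b == l:
        return False
    # With l the lightest, the order is l < a < b exactly when a is the median.
    return median(l, a, b) == a
```

Each call to `lighter` makes at most one call to the simulated scale, matching the O(1) requirement.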
After waking up from your second dream and returning to the first dream, you realize that there is
no fairy. Solve the problem parts below without the information that the fairy would have given
you.
(c) [6 points] Give an algorithm that uses O(N) calls to MEDIAN to find the heaviest
and lightest balls of fluff, without identifying which is the heaviest and which is the
lightest.
Solution: The pseudo-code below starts out by weighing the first 3 balls, and repeatedly
replaces the median with a new ball, until the balls run out. The two remaining
balls must be the extremes, because an extreme will never be a median, and therefore
will never be eliminated.
EXTREMES(b, N)
    x, y = b_1, b_2
    for i ∈ 3 . . . N
        z = b_i
        m = MEDIAN(x, y, z)
        // Set x and y to non-median balls
        if x == m
            x, y = y, z
        if y == m
            y = z
    return (x, y)
It is not sufficient to call MEDIAN on every group of 3 adjacent balls and hope that it
will rule out all the balls except for the two extremes. Example: given 4 balls with
weights 3, 4, 7, 1, MEDIAN would point at the 2nd ball twice.
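The elimination loop can be exercised with a small simulation. In this sketch the balls are represented directly by their (distinct) weights and `median_of` stands in for the magic scale; both choices, and the function names, are assumptions for illustration.

```python
def median_of(x, y, z):
    """Simulated MEDIAN scale: balls are represented by their weights."""
    return sorted((x, y, z))[1]

def extremes(balls, median=median_of):
    """Return the lightest and heaviest balls, in unknown order, using N - 2 weighings."""
    x, y = balls[0], balls[1]
    for z in balls[2:]:
        m = median(x, y, z)
        if m == x:
            x, y = y, z         # drop x, keep y and the new ball
        elif m == y:
            y = z               # drop y, keep x and the new ball
        # otherwise the new ball z is the median and is discarded
    return x, y
```

On the weights 3, 4, 7, 1 from the counterexample above, this keeps the pair (7, 1), i.e., the two extremes.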
(d) [2 points] Explain how the previous parts should be put together to sort the N balls
of fluff using O(N log N) calls to MEDIAN and O(1) calls to LIGHTEST.
Solution: Call EXTREMES (the answer to part c) to obtain the lightest and heaviest
balls, then call LIGHTEST to obtain the lightest ball. Last, use LIGHTER (the answer
to part b) as the comparison operator in a fast (O(N log N) time) comparison-based
sorting algorithm.
Out of the algorithms taught in 6.006, insertion sort with binary search makes the
fewest comparisons. Other acceptable answers are merge-sort and heap-sort, as they
all use O(N log N) comparisons.
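Putting the parts together, a hedged end-to-end sketch: balls are indices into a hypothetical list `weights`, both scales are simulated from that list, and Python's built-in `sorted` (Timsort, which uses O(N log N) comparisons) stands in for the comparison sort rather than the binary insertion sort mentioned above. The name `sort_balls` is ours.

```python
from functools import cmp_to_key

def sort_balls(weights):
    """Return ball indices ordered from lightest to heaviest (distinct weights assumed)."""
    balls = list(range(len(weights)))
    if len(balls) < 2:
        return balls

    def median(x, y, z):            # simulated MEDIAN scale
        return sorted((x, y, z), key=lambda b: weights[b])[1]

    def lightest(a, b):             # simulated LIGHTEST scale, used only on the extremes
        return a if weights[a] < weights[b] else b

    # Part (c): find the two extremes with N - 2 MEDIAN calls.
    x, y = balls[0], balls[1]
    for z in balls[2:]:
        m = median(x, y, z)
        if m == x:
            x, y = y, z
        elif m == y:
            y = z

    l = lightest(x, y)              # the single LIGHTEST call

    def compare(a, b):              # part (b), written as a three-way comparator
        if a == b:
            return 0
        if a == l:
            return -1
        if b == l:
            return 1
        return -1 if median(l, a, b) == a else 1

    return sorted(balls, key=cmp_to_key(compare))
```

The comparator never reads `weights` directly; all ordering information comes from the two simulated scales, as in the intended solution.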
(e) [6 points] Argue that you need at least Ω(N log N) calls to MEDIAN to sort the N
fluff balls.
Solution: The argument below closely follows the proof of the Ω(N log N) lower
bound for comparison-based sorting.
A call to MEDIAN has 3 possible outcomes, so a decision tree based on MEDIAN
calls would have a branching factor of 3. There are N! possible ball permutations,
so the decision tree needs Ω(log_3 N!) = Ω(log N!) = Ω(N log N) levels to cover all
possible N! permutations.
LIGHTEST provides useful information only the first time it is called, and it reduces the
possible permutations to N!/2. This doesn't change the result above, because the constant
factor gets absorbed by the asymptotic notation.
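The step from log N! to N log N can be checked directly. The following short derivation is a sketch for even N, using only the fact that the largest N/2 factors of N! are each at least N/2:

```latex
\log_3\!\left(\frac{N!}{2}\right)
  = \log_3 N! - \log_3 2
  \;\ge\; \log_3\!\left(\frac{N}{2}\right)^{N/2} - \log_3 2
  = \frac{N}{2}\,\log_3\frac{N}{2} - \log_3 2
  = \Omega(N \log N).
```

The subtracted constant log_3 2 is the only effect of halving the number of permutations, which is why it disappears in the asymptotic bound.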
The lower bound obtained from comparison-based sorting cannot be used without
argument, because it is not obvious that this problem is harder than comparison-based
sorting. To use this bound correctly, a solution would have to prove that comparison-based
sorting can be reduced to this problem, by implementing MEDIAN and LIGHTEST
with O(1) comparisons each.
MIT OpenCourseWare
http://ocw.mit.edu
For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.