CSE 548: (Design and) Analysis of Algorithms


Amortized Analysis

R. Sekar


Amortized Analysis

Amortization
The spreading out of capital expenses for intangible assets over a
specific period of time (usually over the asset’s useful life) for
accounting and tax purposes.

A clever trick used by accountants to average large one-time costs over time.
In algorithms, we use amortization to spread out the cost of
expensive operations.
Example: Re-sizing a hash table.


Topics

1. Intro
   Motivation
2. Aggregate
3. Charging
4. Potential
5. Table resizing
   Amortized Rehashing
   Vector and String Resizing
6. Disjoint sets
   Inverted Trees
   Union by Depth
   Threaded Trees
   Path compression


Summation or Aggregate Method

Some operations have high worst-case cost, but we can show that the worst case does not occur every time. In this case, we can average the costs to obtain a better bound.

Summation
Let T (n) be the worst-case running time for executing a sequence of
n operations. Then the amortized time for each operation is T (n)/n.

Note: We are not making an “average case” argument about inputs. We are still talking about worst-case performance.


Summation Example: Binary Counter

Incr(B[0..])
  i = 0
  while B[i] = 1
    B[i] = 0
    i++
  B[i] = 1

What is the worst-case runtime of incr?
  Simple answer: O(log n), where n = # of incr's performed.
What is the amortized runtime for n incr's?
  It is easy to see that an incr will touch B[i] once every 2^i operations.
  The total number of bit operations for n incr's is thus

    n · Σ_{i=0}^{log n} 1/2^i ≤ 2n

Thus, amortized cost per incr is O(1)
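The slides give only pseudocode; below is a minimal Python rendering (not part of the original deck) in which incr also returns the number of bits it touched, so the aggregate bound can be checked empirically.

```python
def incr(B):
    """Increment a binary counter stored as a list of bits, LSB first.
    Returns the cost of this call: the number of bits flipped."""
    i = 0
    while i < len(B) and B[i] == 1:
        B[i] = 0            # each 1-to-0 flip was prepaid by an earlier 0-to-1 flip
        i += 1
    if i == len(B):
        B.append(0)         # grow the counter when all bits were 1
    B[i] = 1                # the single 0-to-1 flip of this call
    return i + 1

# Aggregate check: the total cost of n incr's stays below 2n.
B, n, total = [0], 10_000, 0
for _ in range(n):
    total += incr(B)
assert total <= 2 * n
```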


Charging Method
Certain operations are charged more than their actual cost, so that the excess pays for other operations. The total cost can then be calculated while ignoring the second category of operations.

In the counter example, we charge 2 units for each operation that changes a 0-bit to a 1-bit.
  This pays for the cost of later flipping that 1-bit back to a 0-bit.
  Important: ensure you have charged enough.
    We have satisfied this: a bit can be flipped from 1 to 0 only once after it is flipped from 0 to 1.
Now we can ignore the costs of 1-to-0 flips in the algorithm.
  There is only one 0-to-1 bit flip per call of incr!
  So, incr costs only 2 units for each invocation!

Stack Example

Consider a stack with two operations:
  push(x): push a value x onto the stack
  pop(k): pop off the top k elements
What is the cost of a mix of n push and pop operations?
  Key problem: the worst-case cost of a single pop is O(n)!
Solution:
  Charge 2 units for each push: this covers the cost of the push, and also the cost of a subsequent pop.
  A pushed item can be popped only once, so we have charged enough.
  Now, ignore pops altogether, and trivially arrive at O(1) amortized cost for the sequence of push/pop operations!
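A small Python sketch of this charging argument (not in the original slides; the class and counter names are illustrative): each push deposits 2 units, and the final assertion checks that the actual work never exceeds the total charge.

```python
class ChargedStack:
    """Stack with push(x) and pop(k), tracking charged vs. actual cost."""
    def __init__(self):
        self.items = []
        self.charged = 0   # units charged: 2 per push
        self.actual = 0    # actual work: 1 per element pushed or popped

    def push(self, x):
        self.items.append(x)
        self.charged += 2  # 1 for the push + 1 prepaying this item's pop
        self.actual += 1

    def pop(self, k):
        k = min(k, len(self.items))
        popped = [self.items.pop() for _ in range(k)]
        self.actual += k   # paid from the units deposited by the pushes
        return popped

s = ChargedStack()
for i in range(100):
    s.push(i)
    if i % 10 == 9:
        s.pop(7)
assert s.actual <= s.charged   # total work is within the total charge
```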

Potential Method

Define a potential Φ for a data structure that is initially zero and is always non-negative. The amortized cost of an operation is the actual cost of the operation plus the change in potential, ΔΦ = Φ_after − Φ_before.

Analogy with “potential” energy: potential is prepaid cost that can be used subsequently, as the data structure changes and “releases” stored energy.

A more sophisticated technique that allows “charges” or “taxes” to be stored within nodes of a data structure and used at a later time.


Potential Method: Illustration

Stack:
  Each push costs 2 units, because a push increases the potential (the stack size) by 1.
  Pops can use the energy released by the reduction in stack size!
Counter:
  Define the potential as the number of 1-bits.
  Changing a 0 to a 1 costs 2 units: one for the operation and one to pay for the increase in potential.
  Changes of 1 to 0 are paid for by the released potential.


Hash Tables

To provide expected constant-time access, collisions need to be limited.
This requires resizing the hash table when it becomes too full.
But resizing requires all entries to be deleted from the current table and inserted into a larger table — a very expensive operation.
Options:
  1. Try to guess the table size right; if you guessed wrong, put up with the pain of low performance.
  2. Quit complaining, bite the bullet, and rehash as needed.
  3. Amortize: rehash as needed, and prove that it does not cost much!


Amortized Rehashing

Amortize the cost of rehashing over other hash table operations.

Approach 1: Rehash after a large number (say, 1K) of operations.
  Total cost of 1K ops = 1K for the ops + 1K for the rehash = 2K.
  Note: we may have at most 1K elements in the table after 1K operations, so we need to rehash at most 1K elements.
  So, the amortized cost is just 2!
  Are we done?


Amortized Rehash (2)

Are we done? Consider the total cost after 2K, 3K, and 4K operations:

  T(2K) = 2K + 1K (first rehash) + 2K (second rehash) = 5K
  T(3K) = 3K + 1K (1st rehash) + 2K (2nd rehash) + 3K (3rd rehash) = 9K
  T(4K) = 4K + 1K + 2K + 3K + 4K = 14K

Hmm. This is growing like n², so the amortized cost will be O(n).
We need to try a different approach.


Amortized Rehash (3)

Approach 2: Double the size of the hash table whenever it gets full.
  Say you start with an empty table of size N. For simplicity, assume only insert operations.
  You invoke N insert operations, then rehash to a table of size 2N:
    T(N) = N + N (rehashing N entries) = 2N
  Now you can insert N more before needing a rehash:
    T(2N) = T(N) + N + 2N (rehashing 2N entries) = 5N
  Now you can insert 2N more before needing a rehash:
    T(4N) = T(2N) + 2N + 4N (rehashing 4N entries) = 11N
  The general recurrence is T(n) = T(n/2) + 1.5n, which is linear.
  So, the amortized cost is constant!
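To make the accounting concrete, here is a Python sketch (not from the slides) of table doubling with an explicit cost counter; a plain list stands in for the hash table, and a resize costs one unit per element moved.

```python
def insert_all(n, initial_capacity=8):
    """Insert n elements with table doubling; return the total cost,
    counting 1 per insert and 1 per element moved during a resize."""
    table, capacity, cost = [], initial_capacity, 0
    for x in range(n):
        if len(table) == capacity:
            capacity *= 2          # double when full ...
            cost += len(table)     # ... paying 1 per rehashed element
        table.append(x)
        cost += 1
    return cost

n = 1 << 16
assert insert_all(n) <= 3 * n      # matches the 3-units-per-insert charging
```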

Amortized Rehash (4)


Alternatively, we can think in terms of charging. Each insert operation is charged 3 units of cost:
  one for the insert operation itself,
  one for rehashing this element at the end of the current run of inserts, and
  one for rehashing an element that was already in the hash table when this run began.
A run contains as many inserts as there were elements in the hash table at the beginning of the run — so we have accounted for all costs.
Thus, rehashing increases the cost of insertions by a factor of 3; lookup costs are unchanged.

Amortized Rehash (5)


Alternatively, we can think in terms of potential.
  Think of the hash table as a spring: as more elements are inserted, the spring has to be compressed to make room.
  Let |H| denote the capacity and α the occupancy (load factor) of H.
  Define the potential as 0 when α ≤ 0.5, and as 2(α − 0.5)|H| otherwise.
  Immediately after a resize, let the table capacity be k; note that α ≤ 0.5, so the potential is 0.
  Each insert (after α reaches 0.5) costs 3 units: one for the operation, and 2 for the increase in potential.
  When α reaches 1, the potential is k. After resizing to capacity 2k, the potential falls to 0, and the released k units pay for rehashing the k elements.

Amortized Rehash (6)

What if we increase the size by a factor less than 2?
  Is there a threshold t > 1 such that expansion by a factor less than t won't yield amortized constant time?
What happens if we want to support both deletes and inserts, and want to make sure that the table never uses more than k times the space needed for the actual number of elements?
  Is there a minimum value of k for which this can be achieved?
  Do you need different thresholds for expansion and contraction? Are there any constraints on the relationship between these two thresholds to ensure amortized constant time?


Amortized performance of Vectors vs Lists


Linked lists: the data structure of choice if you don't know the total number of elements in advance.
  Space-inefficient: 2x or more memory overhead for very small objects.
  Poor cache performance: pointer chasing is cache-unfriendly.
  Sequential access: no fast access to the kth element.
Vectors: dynamically sized arrays have none of these problems, but resizing is expensive.
  Is it possible to achieve good amortized performance?
  When should the vector be expanded/contracted?
  Which operations can we support in constant amortized time? Inserts? Inserts at the end? Concatenation?
Strings: we can raise questions similar to those for vectors.
Disjoint Sets

Represent disjoint sets as “inverted trees”: the elements of a set are arranged in no particular order, and each has a parent pointer π that eventually leads up to the root of the tree. The root is a convenient representative, or name, for the set; it is distinguished from the other elements by the fact that its parent pointer is a self-loop. In addition to the parent pointer π, each node also has a rank that, for the time being, should be interpreted as the height of the subtree hanging from that node.

procedure makeset(x)
  π(x) = x
  rank(x) = 0

To compute the union of set A with B, simply make B’s root the parent of A’s root.

Figure 5.5: A directed-tree representation of two sets {B, E} and {A, C, D, F, G, H}.
Disjoint Sets (2)

function find(x)
  while x ≠ π(x): x = π(x)
  return x

procedure union(x, y)
  r_x = find(x)
  r_y = find(y)
  π(r_y) = r_x

Complexity:
  makeset takes O(1) time.
  find takes time proportional to the depth of the element in its tree: O(n) in the worst case.
  union takes O(1) time on root elements; in the worst case, its complexity matches that of find.

Amortized complexity: Can you construct a worst-case example where N operations take O(N²) time? Can we improve on this?
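A direct Python transcription of this pseudocode (not part of the slides), using dicts for the parent pointers (π) and ranks:

```python
parent, rank = {}, {}   # pi and rank from the pseudocode

def makeset(x):
    parent[x] = x       # the root's parent pointer is a self-loop
    rank[x] = 0

def find(x):
    while x != parent[x]:
        x = parent[x]   # walk up until the self-loop
    return x

def union(x, y):        # naive union: y's root becomes a child of x's root
    parent[find(y)] = find(x)

for v in "ABCDEFG":
    makeset(v)
union("A", "D")
assert find("D") == find("A") == "A"
```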
Disjoint Sets with Union by Depth

Merging two sets is easy: make the root of one tree point to the root of the other. But we have a choice here: if the representatives (roots) of the sets are r_x and r_y, do we make r_x point to r_y, or the other way around? Since tree height is the main impediment to efficiency, a good strategy is to make the root of the shorter tree point to the root of the taller tree. This way, the overall height increases only if the two trees being merged are equally tall. Instead of explicitly computing heights of trees, we use the rank values stored at the nodes — which is why this scheme is called union by rank.

procedure union(x, y)
  r_x = find(x)
  r_y = find(y)
  if r_x = r_y: return
  if rank(r_x) > rank(r_y):
    π(r_y) = r_x
  else:
    π(r_x) = r_y
    if rank(r_x) = rank(r_y): rank(r_y) = rank(r_y) + 1

See Figure 5.6 for an example. By design, the rank of a node is exactly the height of the subtree rooted at that node.
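The same routine in Python (again, not from the slides), reusing parent, rank, and find from the earlier sketch:

```python
def union_by_rank(x, y):
    rx, ry = find(x), find(y)
    if rx == ry:
        return                  # already in the same set
    if rank[rx] > rank[ry]:
        parent[ry] = rx         # shorter tree hangs under the taller root
    else:
        parent[rx] = ry
        if rank[rx] == rank[ry]:
            rank[ry] += 1       # equal heights: the merged tree grows by 1
```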
Disjoint Sets with Union by Depth (2)

Figure 5.6: A sequence of disjoint-set operations. Superscripts denote rank.

After makeset(A), makeset(B), ..., makeset(G):

  A⁰  B⁰  C⁰  D⁰  E⁰  F⁰  G⁰

After union(A, D), union(B, E), union(C, F):

  D¹    E¹    F¹    G⁰
  └ A⁰  └ B⁰  └ C⁰

After union(C, G), union(E, A):

  D²        F¹
  ├ E¹      ├ C⁰
  │ └ B⁰    └ G⁰
  └ A⁰

After union(B, G):

  D²
  ├ E¹
  │ └ B⁰
  ├ F¹
  │ ├ C⁰
  │ └ G⁰
  └ A⁰

Complexity of disjoint sets w/ union by depth

The asymptotic complexity of makeset is unchanged.
union has become a bit more expensive, but only modestly.
What about find?
  A sequence of N operations can create at most N elements.
  So, the maximum set size is O(N).
  With union by rank, each increase in a node's rank can occur only after a doubling of the number of elements in its set.

Observation
The number of nodes of rank k never exceeds N/2^k.

So, the height of trees is bounded by log N.


Complexity of disjoint sets w/ union by depth (2)

The height of trees is bounded by log N.
  Thus we have a complexity of O(log N) for find.
  Question: Is this bound tight?
From here on, we limit union operations to root nodes only, so their cost is O(1).
  This requires find to be moved out of union into a separate operation, and hence the total number of operations increases, but only by a constant factor.


Improving find performance

Idea: Why not force the depth to be 1? Then find will have O(1) complexity!

Approach: Threaded trees. We “thread” a linked list through each set, starting at the set’s leader: every element points directly to the leader, and two threads can be spliced together by union in constant time.

(Figure: merging two sets stored as threaded trees. Bold arrows point to leaders; lighter arrows form the threads. Shaded nodes have a new leader.)

Problem: the worst-case complexity of union becomes O(n).
Solution:
  Merge the smaller set with the larger set.
  Amortize the cost of union over other operations.

Sets w/ threaded trees: Amortized analysis

Other than the cost of updating parent (leader) pointers, union costs O(1).
  Idea: charge the cost of updating a parent pointer to the element itself.
  Key observation: each time an element's parent pointer changes, the element is in a set that is at least twice as large as before.
  So, over n operations, there can be at most O(log n) parent-pointer updates per element.
Thus, the total cost of n operations, consisting of some mix of makeset, find, and union, is at most O(n log n) — that is, O(log n) amortized per operation.
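A minimal Python sketch of this scheme (not in the slides; the names and the explicit member lists standing in for the threads are illustrative):

```python
leader, members = {}, {}

def t_makeset(x):
    leader[x] = x
    members[x] = [x]

def t_find(x):
    return leader[x]            # depth is always 1: O(1) find

def t_union(x, y):
    lx, ly = t_find(x), t_find(y)
    if lx == ly:
        return
    if len(members[lx]) < len(members[ly]):
        lx, ly = ly, lx         # always merge the smaller set into the larger
    for z in members[ly]:
        leader[z] = lx          # each element re-pointed O(log n) times total
    members[lx] += members.pop(ly)
```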


Further improvement
Can we combine the best elements of the two approaches?
  Threaded trees employ an eager approach to union, while the original approach used a lazy one.
  The eager approach is better for find, while being lazy is better for union.
  So, why not use a lazy approach for union and an eager approach for find?
Path compression: retains lazy union, but when find(x) is called, eagerly promotes x to the level below the root.
  Actually, we promote x, π(x), π(π(x)), π(π(π(x))), and so on.
  As a result, subsequent calls to find on x or its former ancestors become cheap.
From here on, we let rank be defined by the union algorithm:
  For a root node, rank is the same as the height of its tree.
  But once a node becomes a non-root, its rank stays fixed, even when path compression decreases its depth.
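In Python (not from the slides), a common two-pass variant of path compression, reusing parent from the earlier sketches:

```python
def find_compress(x):
    """find with path compression: first locate the root, then promote
    every node on the path (x, pi(x), pi(pi(x)), ...) to point at it."""
    root = x
    while root != parent[root]:
        root = parent[root]      # pass 1: walk up to the root
    while x != root:             # pass 2: promote each node on the path
        nxt = parent[x]
        parent[x] = root         # ranks are left untouched
        x = nxt
    return root
```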

Disjoint sets w/ Path compression: Illustration

Figure 5.7: The effect of path compression: find(I) followed by find(K). Initially, the path from I to the root A³ is I⁰ → F¹ → E² → A³; find(I) makes I⁰ and F¹ (and E²) direct children of A³. find(K) then promotes K⁰ and J⁰ in the same way. Superscripts denote ranks, which are unchanged by the compression.

Sets w/ Path compression: Amortized analysis

The amortized cost per operation of n set operations is O(log* n), where

  log* x = the smallest k such that log(log(⋯ log(x) ⋯)), with k nested logs, is ≤ 1.

Note: log*(n) ≤ 5 for virtually any n of practical relevance. Specifically,

  log*(2^65536) = log*(2^(2^(2^(2^2)))) = 5

Note that 2^65536 is approximately a 20,000-digit decimal number. We will never be able to store an input of that size, at least not in our universe. (The universe contains maybe 10^100 elementary particles.) So, we might as well treat log*(n) as O(1).
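A quick Python check (not part of the slides), using an integer floor-log so that numbers as large as 2^65536 pose no problem:

```python
def log_star(x):
    """Iterated logarithm: how many times floor(log2) must be applied
    to bring x down to at most 1."""
    k = 0
    while x > 1:
        x = x.bit_length() - 1   # floor(log2(x)) for a positive int
        k += 1
    return k

assert log_star(2**16) == 4
assert log_star(2**65536) == 5   # still just 5 at this astronomical scale
```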


Path compression: Amortized analysis (2)


For n operations, the rank of any node falls in the range [0, log n].
Divide this range into the following groups: [1], [2], [3–4], [5–16], [17–2^16], [2^16+1 – 2^65536], ...
  Each range is of the form [k – 2^(k−1)].
  Let G(v) be the group that rank(v) belongs to: G(v) = log*(rank(v)).
  Note: once a node becomes a non-root, its rank never changes.

Key Idea
Give an “allowance” to a node when it becomes a non-root. This allowance will be used to pay the costs of path-compression operations involving this node. For a node whose rank is in the range [k – 2^(k−1)], the allowance is 2^(k−1).

Total allowance handed out


Recall that the number of nodes of rank r is at most n/2^r.
Recall that a node whose rank is in the range [k – 2^(k−1)] is given an allowance of 2^(k−1).
The total allowance handed out to nodes with ranks in the range [k – 2^(k−1)] is therefore

  2^(k−1) · (n/2^k + n/2^(k+1) + ⋯ + n/2^(2^(k−1))) ≤ 2^(k−1) · n/2^(k−1) = n

Since the total number of ranges is log* n, the total allowance granted to all nodes is at most n·log* n.
We spread this cost across all n operations, thus contributing O(log* n) to each operation.

Paying for all find’s

The cost of a find equals the number of parent pointers followed.
Each pointer followed is updated to point to the root of the current set.
Key idea: charge the cost of updating π(p) as follows:
  Case 1: If G(π(p)) ≠ G(p), charge it to the current find operation.
    This can apply only log* n times per find: G-values never decrease along the path, a leaf's G-value is at least 1, and the root's G-value is at most log* n.
    This adds only O(log* n) to the cost of each find.
  Case 2: Otherwise, charge it to p's allowance.
    We need to show that there is enough allowance to pay each time this case occurs.

Paying for all find’s (2)


If π(p) is updated, then the rank of p's parent increases.
Let p be involved in a series of finds, with q_i being its parent after the ith find. Note that

  rank(p) < rank(q_0) < rank(q_1) < rank(q_2) < ⋯

Let m be the number of such operations before p's parent has a higher G-value than p, i.e., G(p) = G(q_m) < G(q_{m+1}).
Recall that if G(p) = r, then r corresponds to a range [k – 2^(k−1)] where k ≤ rank(p) ≤ 2^(k−1). Since G(p) = G(q_m), rank(q_m) ≤ 2^(k−1).
  The allowance given to p is also 2^(k−1), so there is enough allowance to pay for all promotions up to the mth.
After the (m+1)th find, the find operation itself pays for p's pointer updates, since G(π(p)) > G(p) from then on.
