# Amortized Analysis of Algorithms

1. Best, Average and Worst-case Analysis of Algorithms

## Introduction

For some data structures, worst case analysis for some operations might be too pessimistic especially when we are interested in the running time of sequences of operations rather than a single operation. Consider an example of a dynamic array (ArrayList in Java). Most of the time the insertion to the dynamic array takes $O(1)$ time in the worst case. But sometimes (when the array is full) it can take $O(n)$ time to insert. If we do the worst-case analysis of the insert operation, we would say that the running time to insert an item in the dynamic array is $O(n)$. In this case, where occasional operations are slow and most of the other operations are fast, we do the amortized analysis and give the average running time per operation in the worst case.

Amortized analysis is concerned with the overall cost of the operations. It doesn’t say about the complexity of the specific operation in the sequence. There are three approaches to amortized analysis.

1. The Aggregate Method
2. The Accounting Method
3. The Potential Method

I will explain the details of each approach with the help of the Dynamic Array Example.

First of all, let us discuss briefly dynamic arrays.

## Dynamic Arrays

Dynamic arrays are just like the regular static arrays with one major difference i.e. dynamic arrays can change the size. We do not specify the size of the dynamic array at the time of instantiation. The add (or insert) operation inserts a new item to the end of all the item previously inserted. If the array has an empty cell, it inserts the item into the cell. Otherwise, a new array of double the size is allocated and all the items from the old array are copied to this newly allocated array. The copying operation takes $O(n)$ time. The insertion process is illustrated in the figure below.

If you are interested in the implementation, following c++ code is the rough implementation of the add method.

Now we compute the amortized cost of the dynamic array using the three techniques mentioned above.

## The Aggregate Method

The Aggregate technique is the simplest technique for amortized analysis. In this method, we find the cost of all the operations and divide it with the number of operations to get the average cost per operation. If $C_k$ is the total cost of $k$ operations then using the aggregate method, the amortized cost per operation can be computed as
$$\text{amortized cost per operation} = \frac{C_k}{k}$$

In the dynamic array, the cost ($c_i$) of $i^{th}$ insertion can be written as
$$c_i = \begin{cases}i & \text{if i - 1 is power of 2} \\ 1 & \text{ otherwise }\end{cases}$$
The cost of first few operations are tabulated below.

$i$123456789
size1244888816
$c_i$123151119

The sum of all the operations that have $c_i = 1$ is $O(n)$ because there are $n - \log_2 n$ such operations.
$$\text{cost 1} = n - \log_2 n$$
The sum of all the operation that have $c_i = i$ can be calculated as follows.
\begin{align}\text{cost 2} &= (2^0 + 1) + (2^1 + 1) + … + (2^{\log_2 (n - 1)} + 1) \\ &= 2^0 + 2^1 + … + 2^{\log_2 (n - 1)} + \log_2 n \\ &= \frac{2(2^{\log_2 (n - 1)} - 1)}{2 - 1} + \log_2 n \\ &= 2n - 2 + \log_2 n - 1 \end{align}
The total cost is $2n - \log_2 n + n + \log_2 n - 3 = 3n - 3$.
This shows that the overall cost of $n$ insertion operations is $O(n)$. Cost per operation is $\frac{O(n)}{n} = O(1)$. Using the aggregate method, we computed the amortized cost of insertions to be $O(n)$ and amortized cost per insertion to be $O(1)$.

## The Accounting Method

This method is also called a banking method because the method can be viewed from the banker’s perspective. Suppose you have a bank account that never goes into debt i.e. the balance is always $>= 0$. Every time when you do a low-cost operation, you charged it a little bit more than its actual cost and you deposit the surplus to the bank account. In this way when you do the cheap operations, you accumulate the amount in your account. When you do the expensive operation, you charge less than its actual cost and the deficit is paid for by the savings in the bank account in such a way that the balance never goes less than 0. In this way, if you can do all of your expensive operations then the amount you charged for the cheap operation will be your amortized cost.

Let $c_i$ be the actual cost of $i^{th}$ operation and let $\hat{c_i}$ be the amortized cost. If $\hat{c_i} \ge c_i$, the $i^{th}$ operation leaves some positive amount (balance = $\hat{c_i} - c_i$) that can be used later by future operations. To make sure the credit never goes negative, following relation must hold
$$\sum_{i = 1}^n\hat{c_i} \ge \sum_{i = 1}^n c_i$$

i.e. the sum of amortized costs will be an upper bound on the sum of actual cost.

Let us use this method in our example of the dynamic array. The cheap insertion operation in dynamic array costs 1 unit. We charge this operation 3 units so that we can save 2 units for later use. The cost of few operations is tabulated below.

$i$123456789
size1244888816
$c_i$123151119
$\hat{c_i}$333333333
balance233535793

We can clearly see from the table that the balance is always positive. It looks like charging 3 units per operation gives the upper bound on the actual cost (you can charge 2 units and see if that is sufficient).

Therefore, using the accounting method, the amortized cost per operation is $O(1)$.

## The potential method

The accounting method focuses on the accumulation of credits for the future use. The potential method focuses on the particular operation at a particular point in time and depends only upon the current state. Let $D_i$ represents the data structure at time $i$. Potential $\Phi(D_i)$ maps $D_i$ to some real value. The amortized cost $\hat{c_i}$ is given by the following relation.
$$\hat{c_i} = c_i + \Phi(D_i) - \Phi(D_{i - 1})$$
For the cheap operation, the change in potential is positive and for expensive operation the change is negative.

Let us consider a sequence of $n$ operations with cost $c_1$, $c_2$, …, $c_n$. The corresponding data structures are $D_1$, $D_2$, …, $D_n$. The total amortized cost can be calculated as
$$\sum_{i = 1}^n \hat{c_i} = \sum_{i = 1}^n c_i + \Phi(D_n) - \Phi(D_0)$$
The potential at $\Phi(D_0)$ is 0 and any other potential should be $\ge$ 0. Therefore, the total amortized cost is upper bounded on the total actual cost.

In the dynamic array example, we need to search the potential function that never gives a negative value. One possible such function would be,
$$\Phi(D_n) = 2n - m$$

Where $n$ is the number of items in the array and $m$ is the size of the array. To calculate the amortized cost for insertion, we need to consider two cases

1. If the array is not full (i.e. $m > n$), insertion will change $n$ and $m$ will be fixed. The change in potential will be $2(n + 1) - m - 2n - m = 2$. The actual cost of insertion in this case is $1$. So total amortized cost is 2 + 1 = 3.
2. If the array is full (i.e. $m = n$), insertion will change $n$ to $n + 1$ and $m$ to $2m$. So the change in potential will be $2(n + 1) - 2m - 2n - m = 2 - m = 2 - n$. The cost of moving $n$ items and inserting one item is $n + 1$. So the amortized cost is $n + 1 + 2 - n = 3$.

Therefore, using the potential method, the amortized cost per operation is $O(1)$.

## References

1. Zabih, R. (n.d.). Amortized Analysis. Lecture. Retrieved September 2, 2018, from http://www.cs.cornell.edu/courses/cs3110/2011fa/supplemental/lec20-amortized/amortized.htm
2. Fiebrink, R. (n.d.). Amortized Analysis Explained. Lecture. Retrieved September 2, 2018, from https://www.cs.princeton.edu/~fiebrink/423/AmortizedAnalysisExplained_Fiebrink.pdf
3. Brassard, G., & Bratley, P. (2008). Fundamentals of Algorithmics. New Delhi: PHI Learning Private Limited.
4. Cormen, T. H., Leiserson, C. E., Rivest, R. L., & Stein, C. (n.d.). Introduction to algorithms (3rd ed.). The MIT Press.