
Best, Average and Worst case Analysis of Algorithms

Before reading this article, I strongly recommend that you read the following article first.

  1. Running Time, Growth of Functions and Asymptotic Notations

Introduction

We all know that the running time of an algorithm increases (or remains constant, in the case of constant running time) as the input size ($n$) increases. Sometimes, even if the size of the input is the same, the running time varies among different instances of the input. In that case, we perform best-, average- and worst-case analysis. The best-case running time gives the minimum time, the worst-case running time gives the maximum time, and the average-case running time gives the time required on average to execute the algorithm. I will explain all these concepts with the help of two examples: (i) linear search and (ii) insertion sort.

Best Case Analysis

Consider the example of linear search, where we search for an item in an array. If the item is in the array, we return the corresponding index; otherwise, we return -1. The code for linear search is given below.

int search(int a[], int n, int item) {
    int i;
    for (i = 0; i < n; i++) {
        if (a[i] == item) {
            return i;    /* found: return the index */
        }
    }
    return -1;           /* not found */
}

Variable a is the array, n is the size of the array, and item is the item we are looking for. When the item we are looking for is in the very first position of the array, the function returns the index immediately: the body of the for loop runs only once. So the complexity in this case is $O(1)$. This is called the best case.
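
As a quick, concrete check, here is a minimal test harness (my own addition, not part of the original article; the array contents are made up for illustration) that triggers the best case by placing the item in the first position.

#include <stdio.h>

/* Same linear search as above, repeated here so the example compiles on its own. */
int search(int a[], int n, int item) {
    int i;
    for (i = 0; i < n; i++) {
        if (a[i] == item) {
            return i;
        }
    }
    return -1;
}

int main(void) {
    int a[] = {42, 7, 19, 3, 88};

    /* Best case: the item sits at index 0, so the loop body runs only once. */
    printf("%d\n", search(a, 5, 42));   /* prints 0 */
    return 0;
}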

Consider another example: insertion sort. Insertion sort sorts the items in the input array in ascending (or descending) order. It maintains a sorted part and an unsorted part within the array, takes items from the unsorted part one at a time, and inserts each into its appropriate position in the sorted part. Here is one snapshot of the insertion operation.

Suppose items [1, 4, 7, 11, 53] are already sorted and now we want to place 33 in its appropriate place. The item to be inserted is compared with the sorted items from right to left, one by one, until we find an item that is smaller than the item we are trying to insert. We first compare 33 with 53; since 53 is bigger, we move one position to the left and compare 33 with 11. Since 11 is smaller than 33, we place 33 just after 11 and shift 53 one step to the right. Here we did 2 comparisons. If the item were 55 instead of 33, we would have performed only one comparison. That means if the array is already sorted, only one comparison is necessary to place each item in its appropriate place, and one scan of the array sorts it. The code for the insertion operation is given below.

void sort(int a[], int n) {
    int i, j, key;
    for (i = 1; i < n; i++) {
        key = a[i];          /* the item to insert into the sorted part a[0..i-1] */
        j = i - 1;
        while (j >= 0 && a[j] > key) {
            a[j+1] = a[j];   /* shift larger items one position to the right */
            j = j - 1;
        }
        a[j+1] = key;        /* place the item in its appropriate position */
    }
}

When the items are already sorted, the while loop condition fails on the first comparison for each item, so each of the $n$ items requires only one comparison. The running time is therefore $O(n)$. So the best-case running time of insertion sort is $O(n)$.
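
To see this concretely, here is a sketch (my own instrumentation, not from the original article) that counts the element comparisons made by the sort above on an already-sorted array. For $n = 8$ sorted items it reports $n - 1 = 7$ comparisons.

#include <stdio.h>

long comparisons = 0;   /* counts the a[j] > key element comparisons */

void sort(int a[], int n) {
    int i, j, key;
    for (i = 1; i < n; i++) {
        key = a[i];
        j = i - 1;
        /* the comma operator counts each comparison just before it is evaluated */
        while (j >= 0 && (comparisons++, a[j] > key)) {
            a[j+1] = a[j];
            j = j - 1;
        }
        a[j+1] = key;
    }
}

int main(void) {
    int a[8] = {1, 2, 3, 4, 5, 6, 7, 8};   /* already sorted: the best case */
    sort(a, 8);
    printf("comparisons: %ld\n", comparisons);   /* prints 7, i.e. n - 1 */
    return 0;
}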

The best case gives us a lower bound on the running time. If the best case of the algorithm is $O(n)$, then we know that the program needs at least linear time to run on any input. In reality, we rarely care about the best case of an algorithm, and we never design an algorithm based on the best-case scenario.

Worst Case Analysis

In real life, most of the time we do the worst-case analysis of an algorithm. The worst-case running time is the longest running time over all inputs of size $n$.

In linear search, the worst case happens when the item we are searching for is in the last position of the array, or is not in the array at all. In both cases, we need to go through all $n$ items in the array. The worst-case running time is therefore $O(n)$. For linear search, worst-case performance is more important than best-case performance for the following reasons (a short demonstration follows the list).

  1. The item we are searching for is rarely in the first position. If the array has 1000 items, numbered 1 to 1000, and the item we look up is equally likely to be any of them, then the chance that it is in the first position is only 0.1% (a probability of 0.001).
  2. Most of the time the item is not in the array (or database, in general) at all, in which case the search takes the worst-case running time.
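
The sketch below (again my own instrumentation, with made-up array contents) counts how many slots the search examines when the item is absent: all $n$ of them.

#include <stdio.h>

long probes = 0;   /* counts how many array slots are examined */

int search(int a[], int n, int item) {
    int i;
    for (i = 0; i < n; i++) {
        probes++;
        if (a[i] == item) {
            return i;
        }
    }
    return -1;
}

int main(void) {
    int a[5] = {42, 7, 19, 3, 88};

    /* Worst case: 100 is absent, so every slot is examined before returning -1. */
    search(a, 5, 100);
    printf("probes: %ld\n", probes);   /* prints 5, i.e. n */
    return 0;
}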

Similarly, in insertion sort, the worst case occurs when the items are in reverse sorted order. Each new item must then be compared with every item already sorted, so the total number of comparisons is $1 + 2 + \dots + (n-1) = n(n-1)/2$, which is in the order of $n^2$. The worst-case running time is therefore $O(n^2)$.
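
Running the same comparison counter on a reverse-sorted array confirms the arithmetic: for $n = 8$ it reports $8 \cdot 7 / 2 = 28$ comparisons. As before, this harness is my own illustration.

#include <stdio.h>

long comparisons = 0;   /* counts the a[j] > key element comparisons */

void sort(int a[], int n) {
    int i, j, key;
    for (i = 1; i < n; i++) {
        key = a[i];
        j = i - 1;
        while (j >= 0 && (comparisons++, a[j] > key)) {
            a[j+1] = a[j];
            j = j - 1;
        }
        a[j+1] = key;
    }
}

int main(void) {
    int a[8] = {8, 7, 6, 5, 4, 3, 2, 1};   /* reverse sorted: the worst case */
    sort(a, 8);
    /* 1 + 2 + ... + 7 = 8*7/2 = 28 comparisons */
    printf("comparisons: %ld\n", comparisons);
    return 0;
}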

Knowing the worst-case performance of an algorithm gives us a guarantee: the algorithm will never take longer than that.

Average Case Analysis

Sometimes we perform an average-case analysis of an algorithm. Most of the time the average case is roughly as bad as the worst case. In the case of insertion sort, when we try to insert a new item into its appropriate position, we compare the new item with half of the sorted items on average. The total is then about $n^2/4$ comparisons, which is still in the order of $n^2$, the same as the worst-case running time.
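
A quick way to check this claim is by simulation. The sketch below is my own addition; it assumes that every permutation of the input is equally likely (the input model discussed in the next paragraph), sorts many random arrays of 100 items, and reports the average number of comparisons, which comes out close to $100^2/4 = 2500$.

#include <stdio.h>
#include <stdlib.h>

long comparisons = 0;   /* accumulated over all trials */

void sort(int a[], int n) {
    int i, j, key;
    for (i = 1; i < n; i++) {
        key = a[i];
        j = i - 1;
        while (j >= 0 && (comparisons++, a[j] > key)) {
            a[j+1] = a[j];
            j = j - 1;
        }
        a[j+1] = key;
    }
}

/* Fisher-Yates shuffle: makes every permutation equally likely, our input model. */
void shuffle(int a[], int n) {
    int i, j, tmp;
    for (i = n - 1; i > 0; i--) {
        j = rand() % (i + 1);   /* slight modulo bias; fine for a rough sketch */
        tmp = a[i]; a[i] = a[j]; a[j] = tmp;
    }
}

int main(void) {
    enum { N = 100, TRIALS = 1000 };
    int a[N], i, t;

    srand(42);   /* fixed seed so the experiment is reproducible */
    for (t = 0; t < TRIALS; t++) {
        for (i = 0; i < N; i++) a[i] = i;
        shuffle(a, N);
        sort(a, N);
    }
    /* Expect roughly N*N/4 = 2500 comparisons per trial on average. */
    printf("average comparisons: %.1f\n", (double)comparisons / TRIALS);
    return 0;
}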

It is usually harder to analyze the average behavior of an algorithm than its worst-case behavior, because it may not be apparent what constitutes an "average" input for a particular problem. A useful average-case analysis therefore requires prior knowledge of the distribution of the input instances, which is often an unrealistic requirement. For this reason, we usually assume that all inputs of a given size are equally likely and carry out a probabilistic analysis for the average case.