Merge Sort

Merge Sort is a Divide and Conquer algorithm. It divides input array in two halves, calls itself for the two halves and then merges the two sorted halves. The merge() function is used for merging two halves. The merge(arr, l, m, r) is key process that assumes that arr[l..m] and arr[m+1..r] are sorted and merges the two sorted sub-arrays into one. See following C implementation for details.

The following image shows the complete merge sort process for an example array {6, 5, 3, 1, 8, 7, 2, 4}. We can see that the array is recursively divided in two halves till the size becomes 1. Each element is compared with the adjacent list for sorting and merging the two adjacent lists. Finally all the elements are sorted and merged.

From the below animation it can be seen that Merge sort works fastest for nearly sorted array of elements. The time required to sort the elements increases in the order: few unique/reversed and then random. Reversed and few unique elements take about the same time to get sorted.

What are random, nearly sorted, reversed and few unique elements?

  1. Random: {6, 9, 5, 8, 7, 4, 2, 1, 3}
    Here all the elements are arranged in a random fashion.
  2. Nearly sorted: {5, 6, 7, 8, 9, 1, 2, 3, 4}
    The array is nearly sorted. For example, the above array if divided into two, are sorted individually.
  3. Reversed: {9, 8, 7, 6, 5, 4, 3, 2, 1}
    The array elements are reverse sorted and we are supported to sort them in ascending order.
  4. Few Unique: {1, 1, 2, 9, 4, 3, 2, 5, 9, 6}
    The array contains duplicate entries.
Random Nearly Sorted Reversed Few Unique

Key

  1. Black values are sorted.
  2. Gray values are unsorted.
  3. A red triangle marks the algorithm position.

Algorithm

MergeSort(arr[], l,  r)
If r > l
     1. Find the middle point to divide the array into two halves:  
             middle m = (l+r)/2
     2. Call mergeSort for first half:   
             Call mergeSort(arr, l, m)
     3. Call mergeSort for second half:
             Call mergeSort(arr, m+1, r)
     4. Merge the two halves sorted in step 2 and 3:
             Call merge(arr, l, m, r)


Implementation

/* Function to merge the two haves arr[l..m] and arr[m+1..r] of array arr[] */
void merge(int arr[], int l, int m, int r)
{
    int i, j, k, n1 = m-l+1, n2 = r-m;
 
    /* create temp arrays */
    int L[n1], R[n2];
 
    /* Copy data to temp arrays L[] and R[] */
    for(i = 0; i < n1; i++)
        L[i] = arr[l + i];
    for(j = 0; j < n2; j++)
        R[j] = arr[m + 1+ j];
 
    /* Merge the temp arrays back into arr[l..r]*/
    i = 0;
    j = 0;
    k = l;
    while (i < n1 && j < n2)
    {
        if (L[i] <= R[j])
        {
            arr[k] = L[i];
            i++;
        }
        else
        {
            arr[k] = R[j];
            j++;
        }
        k++;
    }
 
    /* Copy the remaining elements of L[], if there are any */
    while (i < n1)
    {
        arr[k] = L[i];
        i++;
        k++;
    }
 
    /* Copy the remaining elements of R[], if there are any */
    while (j < n2)
    {
        arr[k] = R[j];
        j++;
        k++;
    }
}
 
/* l is for left index and r is right index of the sub-array of arr to be sorted */
void mergeSort(int arr[], int l, int r)
{
    if (l < r)
    {
        int m = l+(r-l)/2; //Same as (l+r)/2, but avoids overflow for large l and h
        mergeSort(arr, l, m);
        mergeSort(arr, m+1, r);
        merge(arr, l, m, r);
    }
}

Analysis

In the worst case, the number of comparisons merge sort makes is equal to or slightly smaller than (n ⌈log n⌉ – 2⌈log n + 1), which is between (n log nn + 1) and (n log n + n + O(log n)). In the worst case, merge sort does about 39% fewer comparisons than quick sort does in the average case. In terms of moves, merge sort’s worst case complexity is O(n log n)—the same complexity as quick sort’s best case, and merge sort’s best case takes about half as many iterations as the worst case.

Merge sort is more efficient than quick sort for some types of lists if the data to be sorted can only be efficiently accessed sequentially, and is thus popular in languages such as Lisp, where sequentially accessed data structures are very common. Unlike some (efficient) implementations of quicksort, merge sort is a stable sort.

Merge sort’s most common implementation does not sort in place; therefore, the memory size of the input must be allocated for the sorted output to be stored in. Merge Sort is useful for sorting linked lists in O(nLogn) time. Other nlogn algorithms like Heap Sort, Quick Sort (average case nLogn) cannot be applied to linked lists.

Auxiliary Space: O(n)
Sorting In Place: No in a typical implementation
Stable: Yes

Merge sort also has some demerits. One is its use of 2n locations; the additional n locations are commonly used because merging two sorted sets in place is more complicated and would need more comparisons and move operations. But despite the use of this space the algorithm still does a lot of work: The contents of m are first copied into l(left) and r(right) and later into the list arr(result) on each invocation of mergesort (variable names according to the pseudocode above).

Source: Wikipedia