Understanding Counting Sort in Python

Introduction

Sorting is a fundamental operation in computer science that involves arranging elements in a specific order. Counting Sort is a simple and efficient sorting algorithm that works well when the input consists of integers drawn from a small range. In this article, we will explore the Counting Sort algorithm and its implementation in Python.

What is Counting Sort?

Counting Sort is a non-comparison sorting algorithm that runs in O(n + k) time, where n is the number of elements and k is the range of input values. It operates by counting the number of occurrences of each element in the input array and then using this information to determine the sorted order of the elements. It is particularly useful when the range of input values is relatively small compared to the size of the array.

Algorithm Steps

The Counting Sort algorithm follows these steps:

  1. Find the minimum and maximum values in the input array to determine the range.
  2. Create an auxiliary array of size (max - min + 1), initialized with zeros. This auxiliary array is used to count the occurrences of each element.
  3. Traverse the input array and increment the corresponding count in the auxiliary array for each element encountered.
  4. Modify the auxiliary array by adding the previous count to the current count. After this step, each entry holds the number of elements less than or equal to the corresponding value, which gives the final position of the last occurrence of that value in the sorted array.
  5. Create a sorted output array of the same size as the input array.
  6. Traverse the input array again, this time from the last element to the first, and for each element, find its sorted position from the auxiliary array and place it in the output array. Decrement the count in the auxiliary array for that element. Traversing in reverse keeps equal elements in their original relative order.
  7. The output array contains the sorted elements.

Visual Representation of Counting Sort

The following steps illustrate how Counting Sort works on an example array.

Step 1

[Figure: the initial state, showing the original array, a temporary array of the same length, and a count array sized to the range of values in the original array.]

Step 2

[Figure: the count array after counting, with the number of occurrences of each element stored in its corresponding cell.]

Step 3

[Figure: the count array after its entries are updated to the actual position of each element from the original array.]

Step 4

[Figure: a for loop iterating over the arrays, starting from the size minus one and decrementing to zero.]

Step 5

[Figure: the state of the temporary array after each iteration, eight states in total.]

Step 6

[Figure: the elements of the sorted temporary array being copied back to the main array.]
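
The stages described above can also be traced in code. The following is a minimal sketch, assuming an illustrative eight-element array; the array and the helper name `trace_counting_sort` are hypothetical and are not taken from the original figures. It records the count array, the cumulative count array, and the temporary array after each of the eight placement iterations.

```python
def trace_counting_sort(arr):
    """Return the intermediate arrays of counting sort for inspection."""
    low = min(arr)
    counts = [0] * (max(arr) - low + 1)

    # Count the occurrences of each value
    for value in arr:
        counts[value - low] += 1
    raw_counts = counts[:]  # snapshot of the plain counts

    # Turn the counts into cumulative counts (positions)
    for i in range(1, len(counts)):
        counts[i] += counts[i - 1]
    positions = counts[:]  # snapshot of the cumulative counts

    # Place elements, iterating from size minus one down to zero
    temp = [None] * len(arr)
    states = []
    for i in range(len(arr) - 1, -1, -1):
        value = arr[i]
        counts[value - low] -= 1
        temp[counts[value - low]] = value
        states.append(temp[:])  # state of the temporary array after this step
    return raw_counts, positions, states


# Hypothetical example input with eight elements
raw_counts, positions, states = trace_counting_sort([4, 2, 2, 8, 3, 3, 1, 6])
print("counts:   ", raw_counts)
print("positions:", positions)
for state in states:
    print(state)
```

Each printed state has one more slot filled than the previous one, and the last state is the fully sorted array.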

Python Implementation

Let’s now look at an example implementation of the Counting Sort algorithm in Python:


def counting_sort(arr):
    """Return a new list with the integers of arr in ascending order."""
    # An empty input has no minimum or maximum, so return early
    if not arr:
        return []

    # Find the range of input values
    min_value = min(arr)
    max_value = max(arr)
    range_of_values = max_value - min_value + 1

    # Create the auxiliary array
    count = [0] * range_of_values

    # Count the occurrences of each element
    for num in arr:
        count[num - min_value] += 1

    # Turn the counts into cumulative counts (prefix sums)
    for i in range(1, len(count)):
        count[i] += count[i - 1]

    # Create the sorted output array
    output = [0] * len(arr)

    # Place elements in sorted order; iterating in reverse keeps the sort stable
    for num in reversed(arr):
        output[count[num - min_value] - 1] = num
        count[num - min_value] -= 1

    return output

Usage Example

Now, let’s see how we can use the `counting_sort` function to sort an array:


array = [9, 4, 7, 2, 8, 3, 1, 5, 6]
sorted_array = counting_sort(array)
print(sorted_array)

Output


[1, 2, 3, 4, 5, 6, 7, 8, 9]
 

Complexity of Counting Sort

1. Time Complexity:

Counting Sort has a time complexity of O(n + k), where n is the number of elements in the input array and k is the range of input values.

  • Counting the occurrences of each element: This step requires traversing the input array once, resulting in a time complexity of O(n).
  • Modifying the auxiliary array: This step involves updating the count array based on the cumulative counts, which takes O(k) operations.
  • Creating the sorted output array: Traversing the input array in reverse and placing the elements in the output array takes O(n) operations.

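Therefore, the overall time complexity is O(n + k). The running time is linear in n only while k grows no faster than n (that is, k = O(n)); when the range is much larger than the array, the O(k) term dominates. This makes Counting Sort efficient for scenarios where the range of input values is small compared to the array size.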

2. Space Complexity:

Counting Sort has a space complexity of O(n + k), where n is the number of elements in the input array and k is the range of input values.

  • Auxiliary array: The space required for the auxiliary array is proportional to the range of input values, resulting in a space complexity of O(k).
  • Sorted output array: The space required for the sorted output array is equal to the size of the input array, giving a space complexity of O(n).

Therefore, the overall space complexity is O(n + k). The auxiliary array contributes to the additional space requirement, which can be significant when the range of input values is large.

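Note that the extra space can be reduced when the input consists of plain integers. Instead of building a separate output array, the input array can be overwritten directly from the counts, which leaves only the O(k) count array as auxiliary space. The trade-off is that this variant regenerates values from their counts rather than moving the original elements, so it cannot carry attached data or preserve stability for records sorted by a key.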
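
A minimal sketch of such an in-place variant is shown here, assuming the input holds plain integers with no attached data. The function name `counting_sort_in_place` is hypothetical. It keeps only the count array as extra storage and overwrites the input list from the counts.

```python
def counting_sort_in_place(arr):
    """Sort a list of plain integers in place using O(k) extra space."""
    if not arr:
        return

    low = min(arr)
    counts = [0] * (max(arr) - low + 1)

    # Count the occurrences of each value
    for value in arr:
        counts[value - low] += 1

    # Overwrite the input: write each value as many times as it occurred
    index = 0
    for offset, occurrences in enumerate(counts):
        for _ in range(occurrences):
            arr[index] = low + offset
            index += 1


data = [3, -1, 3, 0, 2]
counting_sort_in_place(data)
print(data)
```

Because the values are regenerated from the counts, this approach only suits cases where equal elements are indistinguishable.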

Drawbacks of Counting Sort

1. Limited Applicability to Integer Values:

Counting Sort is designed for sorting integer values (or items with integer keys) within a limited range, because each value is used as an index into an auxiliary array of counters. Floating-point numbers and strings cannot be sorted this way directly. If the range of values is large, the auxiliary array leads to excessive memory usage and inefficiency. For example, consider sorting the list “5, 3, 1, 7, 9, 8, 2001” using Counting Sort. The single large element ‘2001’ stretches the range to roughly 2,000 counters for only seven elements, so the algorithm is unsuitable for this scenario.
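
The cost of a wide range can be checked with a quick calculation. This is a small sketch, assuming the count array is offset by the minimum value as in the implementation earlier in this article; the helper name `count_array_size` is hypothetical.

```python
def count_array_size(values):
    """Number of counters needed when indices are offset by the minimum."""
    return max(values) - min(values) + 1


values = [5, 3, 1, 7, 9, 8, 2001]
slots = count_array_size(values)
print(f"{len(values)} elements need {slots} counters")
```

Comparing the two numbers before sorting is a simple way to decide whether Counting Sort is a reasonable choice for a given input.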

2. Non-Negative Integer Constraint:

In its basic form, Counting Sort uses each value directly as an index into the count array, which assumes non-negative integers. If the array contains negative values or a mix of positive and negative values, this basic form cannot be applied directly and a transformation step is needed. The usual transformation is to subtract the minimum value from every element so that all indices start at zero, which is what the `min_value` offset in the implementation above does.
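
One common transformation is to shift every value by the minimum so that the smallest value maps to index zero. The snippet below is a minimal sketch of that mapping only, using an illustrative list; the variable names are hypothetical.

```python
values = [3, -2, 0, -5, 3]
low = min(values)  # -5 in this example

# Shift each value so the smallest one maps to index 0
indices = [value - low for value in values]
print(indices)

# Adding the minimum back recovers the original values
restored = [index + low for index in indices]
print(restored)
```

All shifted indices are non-negative, so they can be used to address the count array safely.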

3. Extra Space Requirement:

Counting Sort requires extra space to create the auxiliary array for counting occurrences. The size of this auxiliary array is determined by the range of input values. If the range is significantly large, it can lead to a considerable memory overhead. This can be problematic when dealing with large input sizes or systems with limited memory resources.

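4. Stability Depends on the Implementation:

Counting Sort is stable, meaning it preserves the relative order of equal elements, only when it is implemented with cumulative counts and the input is traversed in reverse while placing elements, as in the implementation above. Simplified variants lose this property: traversing the input forward in the placement step reverses the order of equal elements, and variants that rebuild the array directly from the counts discard the original elements altogether. This matters, for example, when sorting an array of objects based on a specific attribute, where objects with equal attribute values are expected to stay in their original order.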
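
Whether equal elements keep their original order can be checked directly. The following is a minimal sketch, assuming records are sorted by an integer key using cumulative counts and a reverse placement pass; the function name `counting_sort_by_key` and the sample records are hypothetical.

```python
def counting_sort_by_key(items, key):
    """Return items sorted by an integer key, keeping ties in input order."""
    if not items:
        return []

    keys = [key(item) for item in items]
    low = min(keys)
    counts = [0] * (max(keys) - low + 1)

    # Count the occurrences of each key
    for k in keys:
        counts[k - low] += 1

    # Cumulative counts give the end position of each key's block
    for i in range(1, len(counts)):
        counts[i] += counts[i - 1]

    # Reverse pass: the last item with a given key takes the last slot
    output = [None] * len(items)
    for item, k in zip(reversed(items), reversed(keys)):
        counts[k - low] -= 1
        output[counts[k - low]] = item
    return output


records = [("pear", 2), ("fig", 1), ("kiwi", 2), ("plum", 1)]
result = counting_sort_by_key(records, key=lambda record: record[1])
print(result)
```

In the result, "fig" stays ahead of "plum" and "pear" stays ahead of "kiwi", matching their order in the input.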

5. Inefficient for Sparse Data:

Counting Sort performs well when the input contains a relatively small range of values with high frequencies. However, it becomes less efficient when the input has a sparse distribution, meaning that there are many gaps between the values. In such cases, most entries of the auxiliary array stay at zero, so memory is wasted on unused counters and time is wasted initializing and scanning them.
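
For sparse inputs, one possible workaround is to count with a dictionary instead of a dense array. The sketch below is not standard Counting Sort: it swaps the count array for `collections.Counter` and sorts the distinct values with the built-in `sorted`, so the cost depends on the number of distinct values rather than on the range. The function name `sparse_counting_sort` is hypothetical.

```python
from collections import Counter


def sparse_counting_sort(arr):
    """Sort by counting occurrences in a dictionary instead of a dense array."""
    counts = Counter(arr)  # one entry per distinct value, no gaps stored

    output = []
    # Sorting the distinct values is a comparison sort over the unique keys only
    for value in sorted(counts):
        output.extend([value] * counts[value])
    return output


print(sparse_counting_sort([5, 3, 1, 7, 9, 8, 2001, 3]))
```

This variant stores only as many counters as there are distinct values, at the price of a comparison sort over those values.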

Conclusion

Counting Sort is an efficient sorting algorithm that runs in O(n + k) time, making it useful for specific scenarios where the range of input values is small compared to the array size. By counting the occurrences of each element and using this information to determine the sorted order, Counting Sort provides a simple and effective solution for sorting arrays of integers. Understanding such algorithms is crucial for any programmer’s toolkit, as they can improve performance and efficiency in a variety of applications.

Subhankar Rakshit
Hey there! I’m Subhankar Rakshit, the brains behind PySeek. I’m a Post Graduate in Computer Science. PySeek is where I channel my love for Python programming and share it with the world through engaging and informative blogs.
