Understanding Radix Sort in Python

Introduction

Sorting algorithms play a crucial role in computer science and programming. They allow us to arrange data in a particular order, making it easier to search, analyze, and process information efficiently. Radix sort is a powerful algorithm that operates on the individual digits or characters of a number or string, providing an efficient solution for sorting tasks.

Understanding Radix Sort

Radix sort, sometimes called digital sort, is a non-comparative sorting algorithm. It is related to bucket sort, which it can use as a subroutine, but the two are distinct algorithms. Unlike other popular sorting algorithms such as quick sort or merge sort that compare elements to determine their relative order, radix sort sorts numbers or strings digit by digit or character by character. For n elements with d digits each, where every digit can take one of k values, this approach runs in O(d * (n + k)) time. When d and k are small and fixed, that is effectively linear in n, making radix sort a good choice for large datasets of short keys.

The core idea behind radix sort is to sort elements based on their most significant digit to the least significant digit (MSD to LSD) or from the least significant digit to the most significant digit (LSD to MSD). The algorithm uses stable sorting methods such as counting sort or bucket sort at each digit or character position to achieve the final sorted order.
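To make the idea of a "digit position" concrete, here is a minimal sketch of how a single digit can be extracted from a non-negative integer. The helper name `digit_at` and its `base` parameter are illustrative assumptions, not part of the implementation shown later in this article.

```python
def digit_at(value, position, base=10):
    """Return the digit of a non-negative integer at the given position.

    Position 0 is the least significant digit (LSD); higher positions
    move toward the most significant digit (MSD).
    """
    return (value // base ** position) % base


number = 802
# Digits in the order an LSD radix sort visits them: ones, tens, hundreds
lsd_order = [digit_at(number, p) for p in range(3)]
# An MSD radix sort visits the same digits in the opposite order
msd_order = lsd_order[::-1]
print(lsd_order)  # [2, 0, 8]
print(msd_order)  # [8, 0, 2]
```

Numbers with fewer digits simply yield 0 at the higher positions, which is why they can be sorted alongside longer numbers.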

Visual Representation of Radix Sort

The following steps show how radix sort works on an example array, using counting sort for each digit position.

Step 1

There is an integer array. The array size is depicted along with the information about the largest number present within the array.

Step 2

An illustration of an integer array being sorted using the counting sort algorithm. The sorting process is based on the least significant digit.

Step 3

An illustration of an integer array undergoing the second sorting iteration using the counting sort algorithm. The sorting process is based on the second least significant digit.

Step 4

An illustration of an integer array undergoing the final sorting iteration using the counting sort algorithm. The sorting process is based on the most significant digit. The array is then sorted in ascending order, completing the sorting process.
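The passes described in the steps above can also be traced in code. The sketch below is only an illustration: it uses Python's built-in `sorted()`, which is stable, as a stand-in for counting sort, and it assumes a non-empty list of non-negative integers. The function name `trace_lsd_passes` is hypothetical, and the input is the sample list used later in this article.

```python
def trace_lsd_passes(values):
    """Return a snapshot of the list after each least-significant-digit pass."""
    snapshots = []
    current = list(values)
    digit_position = 1
    while max(current) // digit_position > 0:
        # sorted() is stable, so ties keep the order from the previous pass
        current = sorted(current, key=lambda v: (v // digit_position) % 10)
        snapshots.append(list(current))
        digit_position *= 10
    return snapshots


for snapshot in trace_lsd_passes([170, 45, 75, 90, 802, 24, 2, 66]):
    print(snapshot)
# [170, 90, 802, 2, 24, 45, 75, 66]
# [802, 2, 24, 45, 66, 170, 75, 90]
# [2, 24, 45, 66, 75, 90, 170, 802]
```

Each printed line corresponds to one pass: ones digit, tens digit, then hundreds digit.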

Implementing Radix Sort in Python

Let’s delve into the implementation of radix sort in Python. We’ll focus on sorting a list of integers, but the algorithm can also be adapted to sort strings or other data types based on their individual characters.


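def radix_sort(arr):
    """Sort a list of non-negative integers in place."""
    # Nothing to do for an empty list (max() would raise ValueError)
    if not arr:
        return

    # Find the maximum element to determine the number of digits
    max_value = max(arr)

    # Perform counting sort for each digit
    digit_position = 1
    while max_value // digit_position > 0:
        counting_sort(arr, digit_position)
        digit_position *= 10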

def counting_sort(arr, digit_position):
    size = len(arr)
    output = [0] * size
    count = [0] * 10

    # Store the count of occurrences in count[]
    for i in range(size):
        index = arr[i] // digit_position
        count[index % 10] += 1

    # Update count[i] to contain the actual position
    # of this digit in the output array
    for i in range(1, 10):
        count[i] += count[i - 1]

    # Build the output array
    i = size - 1
    while i >= 0:
        index = arr[i] // digit_position
        output[count[index % 10] - 1] = arr[i]
        count[index % 10] -= 1
        i -= 1

    # Copy the output array to arr[]
    for i in range(size):
        arr[i] = output[i]

The `radix_sort` function serves as the entry point for the algorithm. It begins by finding the maximum value in the input list `arr` to determine the number of digits required for sorting. The `digit_position` variable keeps track of the current digit being processed, starting from the least significant digit.

The core sorting logic lies within the `counting_sort` function, which performs counting sort based on the `digit_position`. It initializes an auxiliary `output` array and a `count` array to store the occurrences of each digit. The algorithm iterates over the input list, calculating the digit at the current position and updating the corresponding count.

After counting the occurrences, the algorithm turns the count array into running totals, so that `count[d]` is the position just past the last slot reserved for digit `d` in the output array. The algorithm then builds the output array by iterating over the input list in reverse order and placing each element in the last free slot for its digit. It is this combination of running totals and reverse iteration that makes the sort stable, preserving the relative order of elements with equal digits.
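Stability is easier to see when the elements carry a label. The following sketch applies the same counting-sort steps to `(number, label)` pairs; the function name `counting_sort_by_digit` and the sample records are made up for illustration, and unlike the version above it returns a new list instead of modifying its input.

```python
def counting_sort_by_digit(records, digit_position):
    """Stable counting sort of (number, label) pairs by one decimal digit."""
    count = [0] * 10
    for number, _ in records:
        count[(number // digit_position) % 10] += 1

    # Running totals: count[d] becomes one past the last slot for digit d
    for d in range(1, 10):
        count[d] += count[d - 1]

    output = [None] * len(records)
    # Walking backwards fills each digit's slots from the end,
    # so records with equal digits keep their original order
    for record in reversed(records):
        d = (record[0] // digit_position) % 10
        count[d] -= 1
        output[count[d]] = record
    return output


records = [(45, "a"), (75, "b"), (15, "c"), (32, "d")]
print(counting_sort_by_digit(records, 1))
# [(32, 'd'), (45, 'a'), (75, 'b'), (15, 'c')]
```

The three records ending in 5 come out in their input order (a, b, c), which is the property each later pass of radix sort relies on.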

Finally, the `output` array is copied back to the original `arr` list, completing one iteration of counting sort for the current digit position. The process continues until all digits have been sorted, resulting in a fully sorted list.

Using Radix Sort

Using the radix sort algorithm is straightforward. Here’s an example that demonstrates how to sort a list of integers:


numbers = [170, 45, 75, 90, 802, 24, 2, 66]
radix_sort(numbers)
print(numbers)

The output will be:


[2, 24, 45, 66, 75, 90, 170, 802]

As you can see, the numbers are sorted in ascending order using radix sort. Feel free to modify the `numbers` list and experiment with different datasets to see the algorithm in action.
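One way to experiment is to compare a radix sort against Python's built-in `sorted()` on random inputs. To keep the snippet self-contained, the sketch below uses a compact bucket-list variant named `radix_sort_buckets` (a hypothetical name, and a different formulation from the counting-sort version above) that returns a new list and assumes non-negative integers.

```python
import random


def radix_sort_buckets(values):
    """Return a new sorted list of non-negative integers (LSD, base 10)."""
    result = list(values)
    if not result:
        return result
    max_value = max(result)
    digit_position = 1
    while max_value // digit_position > 0:
        buckets = [[] for _ in range(10)]
        for value in result:
            buckets[(value // digit_position) % 10].append(value)
        # Concatenating the buckets in order keeps each pass stable
        result = [value for bucket in buckets for value in bucket]
        digit_position *= 10
    return result


random.seed(0)  # fixed seed so the check is repeatable
for _ in range(100):
    data = [random.randrange(10_000) for _ in range(random.randrange(50))]
    assert radix_sort_buckets(data) == sorted(data)
print("all random checks passed")
```

The same loop could be pointed at any other implementation, as long as it is given a copy of `data` when the function sorts in place.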

Complexity Analysis of Radix Sort

Let’s analyze the time complexity and space complexity of the radix sort algorithm:

Time Complexity

The time complexity of radix sort depends on the number of digits or characters in the input elements. Let’s assume we have n elements, each containing d digits or characters.

The radix sort algorithm performs counting sort or bucket sort for each digit position from the least significant digit (LSD) to the most significant digit (MSD). Counting sort and bucket sort have a time complexity of O(n + k), where k is the range of possible values at each digit position.

For radix sort, we perform counting sort or bucket sort d times, once for each digit position. Hence, the overall time complexity of radix sort can be calculated as follows:

Time Complexity = O(d * (n + k))

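Here d is the number of digits in the largest element, since shorter elements are treated as if they had leading zeros. For non-negative integers with largest value max_value > 0, sorted in base k, the time complexity can therefore be written as:

Time Complexity = O((floor(log_k(max_value)) + 1) * (n + k))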

It’s important to note that the range of possible values at each digit position (k) should be relatively small compared to the number of elements (n) for radix sort to maintain its efficiency. If the range of values is significantly large, it can negatively impact the performance of radix sort.
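The interplay between k and the number of passes can be illustrated with a small calculation. This is only a sketch: `count_passes` is a hypothetical helper that counts how many digits a value has in a given base, which equals the number of passes a radix sort in that base would need.

```python
def count_passes(max_value, base):
    """Number of digit passes needed for the largest value in a given base."""
    passes = 0
    while max_value > 0:
        max_value //= base
        passes += 1
    return passes


max_value = 1_000_000
for base in (2, 10, 256, 65536):
    print(base, count_passes(max_value, base))
# 2 20
# 10 7
# 256 3
# 65536 2
```

A larger base means fewer passes but a larger count array per pass, so the best choice depends on how k compares with n.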

Space Complexity

The space complexity of radix sort is determined by the additional space required for the auxiliary arrays used during the sorting process. These arrays include the output array and the count array.

The size of the output array is equal to the number of elements (n), as it stores the sorted elements. The count array has a fixed size of 10 (assuming a decimal number system) or the range of possible values at each digit position.

Hence, the space complexity of radix sort can be expressed as:

Space Complexity = O(n + k)

Similar to the time complexity, the range of values (k) should be relatively small compared to the number of elements (n) to ensure efficient memory usage.

Overall, radix sort uses O(n + k) auxiliary space, which is linear in n when k is small. This is more than in-place algorithms such as heap sort require, so it’s important to consider the space requirements when dealing with large datasets or a wide range of digit values.

It’s worth noting that this analysis already accounts for the separate output and count arrays. Counting sort as implemented above is not an in-place algorithm: the output array is what contributes the O(n) term, and the count array contributes the O(k) term.

Advantages and Limitations of Radix Sort

Radix sort offers several advantages compared to other sorting algorithms:

  1. Linear Time Complexity: Radix sort has a time complexity of
    O(d * (n + k)), where n is the number of elements, d is the number of
    digits in the largest element, and k is the range of each digit. When d
    and k are small, this is effectively linear in n, which makes radix sort
    efficient for large datasets.
  2. Stable Sorting: The LSD variant of radix sort is stable, preserving the relative order of elements with equal keys, as long as the per-digit sort it uses is stable.
  3. Adaptability: Radix sort can be easily adapted to sort different
    data types, including integers, strings, or even custom objects, by
    considering the individual digits or characters.
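As a rough illustration of the adaptability point, the sketch below applies the same least-significant-position-first idea to strings. It is written under stated assumptions: the function name `radix_sort_strings` is hypothetical, the input is limited to ASCII characters, and shorter strings are treated as if padded with a character that sorts before all others.

```python
def radix_sort_strings(words):
    """Return a new list of ASCII strings in lexicographic order (LSD)."""
    result = list(words)
    if not result:
        return result
    max_length = max(len(word) for word in result)
    # Process character positions from last to first
    for position in range(max_length - 1, -1, -1):
        # Bucket 0 is for words too short to have this position;
        # buckets 1..128 hold ASCII codes 0..127
        buckets = [[] for _ in range(129)]
        for word in result:
            key = ord(word[position]) + 1 if position < len(word) else 0
            buckets[key].append(word)
        result = [word for bucket in buckets for word in bucket]
    return result


print(radix_sort_strings(["banana", "apple", "fig", "apricot", "app"]))
# ['app', 'apple', 'apricot', 'banana', 'fig']
```

Here k is 129 rather than 10, and d is the length of the longest string, so the O(d * (n + k)) bound applies with those values.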

However, radix sort also has some limitations:

  1. Extra Space: Radix sort requires additional space to store the
    output and count arrays, which can be a concern when dealing with
    limited memory.
  2. Limited Applicability: Radix sort is most effective when the
    keys are short and of bounded length. Sorting very large integers or
    long strings increases the number of passes (d), which reduces
    performance, and a large digit range (k) increases memory use.
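A further restriction of the integer implementation shown earlier is that it assumes non-negative values. One common workaround, sketched here under that assumption, is to sort the magnitudes of the negative numbers separately and then reverse and negate them. Both function names are hypothetical, and the helper is a compact bucket-list variant rather than the counting-sort version from this article.

```python
def radix_sort_non_negative(values):
    """Return a new sorted list of non-negative integers (LSD, base 10)."""
    result = list(values)
    if not result:
        return result
    max_value = max(result)
    digit_position = 1
    while max_value // digit_position > 0:
        buckets = [[] for _ in range(10)]
        for value in result:
            buckets[(value // digit_position) % 10].append(value)
        result = [value for bucket in buckets for value in bucket]
        digit_position *= 10
    return result


def radix_sort_signed(values):
    """Return a new sorted list of integers that may include negatives."""
    # Sort the magnitudes of the negatives, then reverse and negate them
    magnitudes = [-value for value in values if value < 0]
    non_negatives = [value for value in values if value >= 0]
    sorted_negatives = [-m for m in reversed(radix_sort_non_negative(magnitudes))]
    return sorted_negatives + radix_sort_non_negative(non_negatives)


print(radix_sort_signed([170, -45, 75, -90, 0, -2, 24]))
# [-90, -45, -2, 0, 24, 75, 170]
```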

Despite these limitations, radix sort remains a powerful sorting algorithm with its unique set of advantages, making it a valuable tool in various sorting scenarios.

Conclusion

Radix sort provides an efficient solution for sorting numbers or strings by focusing on individual digits or characters. By employing a stable sorting technique at each digit position, radix sort runs in O(d * (n + k)) time, which is effectively linear in the number of elements when the keys are short, making it a good choice for large datasets. Its adaptability and stable sorting properties further contribute to its usefulness in different scenarios. Understanding radix sort and implementing it in Python expands your arsenal of sorting algorithms and enhances your ability to handle diverse sorting tasks.

Subhankar Rakshit

Hey there! I’m Subhankar Rakshit, the brains behind PySeek. I’m a Post Graduate in Computer Science. PySeek is where I channel my love for Python programming and share it with the world through engaging and informative blogs.
