Sorting Algorithms — With Python

An experimental refresher on the why, what, and how of sorting algorithms

Amit Singh Rathore
Towards Data Science

--

Photo by Max Panamá on Unsplash

Art of Bringing Order

Sorting means putting items in a particular order. That order is determined by the comparison property of the elements. In the case of integers, we say the smaller number comes first and the bigger number appears later. Arranging items in a particular order speeds up searching for an element. Hence sorting is used heavily in computer science.

In this blog, we will go through common sorting algorithms. We will see the implementation of each in Python. To compare their runtimes I used the Leetcode question on sorting an array. The constraints for the question are as follows.

Constraints:

1 <= nums.length <= 50000
-50000 <= nums[i] <= 50000

I solved this problem with all common sorting algorithms. Below are the execution results.

[Image: execution results for each algorithm, by author]

Note: TLE is time limit exceeded.
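These numbers came from Leetcode's online judge. For a rough local comparison, any of the implementations below can be timed with a small harness like this one (my own sketch, not part of the original benchmark; the default size and bounds mirror the problem's constraints):

```python
import random
import timeit

def time_sort(sort_fn, size=10_000, lo=-50_000, hi=50_000):
    """Return wall-clock seconds for one run of sort_fn on random data."""
    data = [random.randint(lo, hi) for _ in range(size)]
    # copy the list each run so the function always sees unsorted input
    return timeit.timeit(lambda: sort_fn(list(data)), number=1)
```

For example, `time_sort(sorted)` times Python's built-in sort; any of the functions defined in this article can be passed in the same way.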

Comparison Based

Bubble Sort
Selection Sort
Insertion Sort
Quick Sort
Shell Sort
Heap Sort
Merge Sort

Non comparison Based

Counting Sort
Bucket Sort
Radix Sort

Bubble Sort

The simplest sorting algorithm. It iterates over the list; in each pass it compares adjacent elements and swaps them so that the larger element moves towards the end of the list.

Non recursive
Stable
In place
O(n²)

def bubbleSort(array):
    swapped = False
    for i in range(len(array) - 1, 0, -1):
        for j in range(i):
            if array[j] > array[j + 1]:
                array[j], array[j + 1] = array[j + 1], array[j]
                swapped = True
        if swapped:
            swapped = False
        else:
            break  # no swaps in this pass: the array is already sorted
    return array

Selection Sort

In this algorithm, we create two segments of the list, one sorted and the other unsorted. We repeatedly remove the smallest element from the unsorted segment and append it to the sorted segment. We don’t swap intermediate elements, so this algorithm sorts the array in the minimum number of swaps.

Non recursive
Unstable
In place
O(n²)

def selectionSort(array):
    for i in range(len(array) - 1):
        min_idx = i
        for idx in range(i + 1, len(array)):
            if array[idx] < array[min_idx]:
                min_idx = idx
        array[i], array[min_idx] = array[min_idx], array[i]
    return array

Insertion Sort

Like selection sort, this algorithm segments the list into sorted and unsorted parts. It then iterates over the unsorted segment and inserts each element into its correct position in the sorted part.

Non-recursive
Stable
In place
O(n²)

def insertionSort(array):
    for i in range(1, len(array)):
        key = array[i]
        j = i - 1
        # check j >= 0 first, so array[j] never wraps around to the last element
        while j >= 0 and array[j] > key:
            array[j + 1] = array[j]
            j -= 1
        array[j + 1] = key
    return array

Shell Sort

Shell sort is an optimization over insertion sort, achieved by repeatedly doing insertion sort on elements at fixed, decreasing intervals. In the last iteration the interval is 1; it then becomes a regular insertion sort, which guarantees the array will be sorted. By that point, however, the array is almost sorted, so the final pass is very fast.

Non-recursive
Unstable
In place
O(n²), depending on the interval (gap) sequence

import math

def shellSort(array):
    n = len(array)
    k = int(math.log2(n))
    interval = 2**k - 1  # Hibbard gap sequence: 2^k - 1
    while interval > 0:
        for i in range(interval, n):
            temp = array[i]
            j = i
            while j >= interval and array[j - interval] > temp:
                array[j] = array[j - interval]
                j -= interval
            array[j] = temp
        k -= 1
        interval = 2**k - 1
    return array

Heap Sort

Like the two previous algorithms, we create two segments of the list, one sorted and one unsorted. Here we use the heap data structure to efficiently get the max element from the unsorted segment. The heapify method uses recursion to move the max element to the top.

Recursive (the heapify helper recurses)
Unstable
In place
O(nlogn)

def heapify(array, n, i):
    # sift array[i] down so the subtree rooted at i is a max-heap
    largest = i
    l = 2 * i + 1
    r = 2 * i + 2

    if l < n and array[i] < array[l]:
        largest = l
    if r < n and array[largest] < array[r]:
        largest = r

    if largest != i:
        array[i], array[largest] = array[largest], array[i]
        heapify(array, n, largest)

def heapSort(array):
    n = len(array)
    # build a max-heap
    for i in range(n // 2, -1, -1):
        heapify(array, n, i)
    # repeatedly move the max to the end and shrink the heap
    for i in range(n - 1, 0, -1):
        array[i], array[0] = array[0], array[i]
        heapify(array, i, 0)
    return array

Merge Sort

This is a divide-and-conquer algorithm. We split the list in half and keep halving until each piece has a single element. Then we merge the sorted sublists, repeating until we get one sorted list containing all the elements of the unsorted input list.

Recursive
Stable
Needs extra space
O(nlogn)

def mergeSort(nums):
    if len(nums) <= 1:  # base case; also handles an empty input
        return nums
    mid = (len(nums) - 1) // 2
    lst1 = mergeSort(nums[:mid + 1])
    lst2 = mergeSort(nums[mid + 1:])
    return merge(lst1, lst2)

def merge(lst1, lst2):
    lst = []
    i = j = 0
    while i < len(lst1) and j < len(lst2):
        if lst1[i] < lst2[j]:
            lst.append(lst1[i])
            i += 1
        else:
            lst.append(lst2[j])
            j += 1
    # append whatever remains in either list
    lst.extend(lst1[i:])
    lst.extend(lst2[j:])
    return lst

Quick Sort

In this algorithm, we partition the list around a pivot element. In my solution I used the last element of the list as the pivot value. The best performance is achieved when the pivot splits the list into two almost equal halves.

Recursive
Not in place (this implementation builds new lists)
Unstable
O(nlogn) on average, O(n²) in the worst case

def quickSort(array):
    if len(array) > 1:
        pivot = array.pop()
        grtr_lst, equal_lst, smlr_lst = [], [pivot], []
        for item in array:
            if item == pivot:
                equal_lst.append(item)
            elif item > pivot:
                grtr_lst.append(item)
            else:
                smlr_lst.append(item)
        return quickSort(smlr_lst) + equal_lst + quickSort(grtr_lst)
    else:
        return array

Counting Sort

This algorithm does not compare elements. Instead, it uses the mathematical properties of integers: we count how many times each number occurs and store the count in an array whose index maps to the key’s value.

Non-recursive
Needs extra space for the count array
Stable
O(n + k), where k is the range of the values

def countingSort(nums):
    i_lower_bound, upper_bound = min(nums), max(nums)
    lower_bound = i_lower_bound
    if i_lower_bound < 0:
        # shift every value so the smallest becomes non-negative
        lb = abs(i_lower_bound)
        nums = [item + lb for item in nums]
        lower_bound, upper_bound = min(nums), max(nums)

    counter_nums = [0] * (upper_bound - lower_bound + 1)
    for item in nums:
        counter_nums[item - lower_bound] += 1
    pos = 0
    for idx, count in enumerate(counter_nums):
        num = idx + lower_bound
        for _ in range(count):
            nums[pos] = num
            pos += 1
    if i_lower_bound < 0:
        # undo the shift
        lb = abs(i_lower_bound)
        nums = [item - lb for item in nums]
    return nums
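Bucket sort appears in the non-comparison list above but is not implemented in the article. A minimal sketch of my own (the bucket count of 10 is an arbitrary choice; values are spread into buckets by range, and each bucket is sorted separately):

```python
def bucketSort(nums, bucket_count=10):
    if not nums:
        return nums
    lo, hi = min(nums), max(nums)
    # width of each bucket's value range; guard against a zero range
    width = (hi - lo) / bucket_count or 1
    buckets = [[] for _ in range(bucket_count)]
    for n in nums:
        # clamp so the maximum value lands in the last bucket
        idx = min(int((n - lo) / width), bucket_count - 1)
        buckets[idx].append(n)
    result = []
    for bucket in buckets:
        result.extend(sorted(bucket))  # any sort works per bucket
    return result
```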

I would also like to mention radix sort, which can be implemented with counting sort or bucket sort as a subroutine.
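Since no radix sort implementation was given, here is a minimal LSD (least-significant-digit) sketch of my own, using a stable per-digit bucket pass; negatives are handled by shifting all values to be non-negative first:

```python
def radixSort(nums):
    if not nums:
        return nums
    # shift values so everything is non-negative
    offset = min(nums)
    nums = [n - offset for n in nums]
    max_val = max(nums)
    exp = 1
    while max_val // exp > 0:
        # stable pass: distribute by the current decimal digit, then collect
        digit_buckets = [[] for _ in range(10)]
        for n in nums:
            digit_buckets[(n // exp) % 10].append(n)
        nums = [n for bucket in digit_buckets for n in bucket]
        exp *= 10
    return [n + offset for n in nums]
```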

Putting it all together:

After trying all of these different algorithms, out of curiosity I submitted my solution with the default sort method. It was pretty fast, with a runtime of 152 ms. Python’s default sort uses Tim Sort, a hybrid of merge sort and insertion sort. Its implementation will be covered in another article.
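For completeness, this is how the built-in sort is used in practice; being Tim Sort underneath, it is also stable (this snippet is my own illustration, not the original submission):

```python
nums = [5, 2, 9, 1]
assert sorted(nums) == [1, 2, 5, 9]  # returns a new sorted list
nums.sort(reverse=True)              # sorts the list in place
assert nums == [9, 5, 2, 1]

# stability: equal keys keep their original relative order
pairs = [(1, 'b'), (0, 'x'), (1, 'a')]
assert sorted(pairs, key=lambda p: p[0]) == [(0, 'x'), (1, 'b'), (1, 'a')]
```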

I also found an amazing playlist where sorting algorithms are demonstrated with dance. You may want to watch it; it is worth the time.

Courtesy: https://www.youtube.com/channel/UCIqiLefbVHsOAXDAxQJH7Xw

With our little experiment, we got to know different algorithms, their runtimes, and their memory consumption. We now know parameters like memory, CPU time, and stability, and we need to evaluate them against the given problem to zero in on an algorithm. We can also combine these basic sorting algorithms into more powerful ones like Tim Sort.

Happy sorting!!
