The world’s leading publication for data science, AI, and ML professionals.

3 Must-Know Sorting Algorithms

Explained with visualizations and Python code

Photo by Soraya Irving on Unsplash
Photo by Soraya Irving on Unsplash

An algorithm is a finite set of instructions that are designed and ordered in a sequence to perform a task or computation. Algorithms along with data structures are the building blocks of programming languages.

An algorithm is not only supposed to perform a task, but also do it efficiently. The time and space complexity are two critical measures when evaluating an algorithm.

Let’s say you need to implement an algorithm to perform a task. You cannot just look it up like syntax of a Programming language. You need to think analytically and computationally. Thus, designing an algorithm is not a skill that can be learned by reading documentation.

However, learning and understanding the common algorithms definitely help to grow an analytical mindset. I think of it as mind exercise. Just like we play football to learn and get better at it, practice on solving algorithms improves our ability to design new ones.

In this post, I will try to explain 3 different types of sorting algorithms. We use them a lot in data analysis but don’t know what is going on under the hood. For instance, we just call the sort method in Python to sort a list.

The algorithms we will cover are:

  • Bubble sort
  • Selection sort
  • Merge sort

Bubble sort

In bubble sort algorithm, we go over a sequence of numbers by comparing consecutive numbers. If a number is smaller than its previous number, we swap them. We keep going over the list until all numbers are sorted.

The following figure shows how the bubble sort algorithm works. We start with the list [3,2,8,4,1,5]. After the forth iteration, the list sorted in ascending order. I also put a note showing that how many swaps are done during each iteration.

Bubble sort algorithm (image by author)
Bubble sort algorithm (image by author)

As you might expect, the program that executes this algorithm includes while and for loops. We also need an if statement for comparison.

Here is the python code for the bubble sort algorithm.

def bubble_sort(L):
   swap = False
   while not swap:
      swap = True
      for j in range(1, len(L)):
         if L[j-1] > L[j]:
         swap = False
         L[j], L[j-1] = L[j-1], L[j]

We need the swap variable to stop the algorithm when the sorting is completed. The while loop is executed as long as the swap variable is False.

For loop iterates over the numbers in the list. We compare each number with its previous number. If it is smaller than the previous one, we swap them. As soon as the condition to swap is met and we get in the if statement, the swap variable is change to False so that we can start over.

The complexity of an algorithm is very important. It’s not enough just to do a task. We should design algorithms as simple as possible.

The complexity of bubble sort algorithm is proportional to the square of the length of a list. Thus, the length of a list increases, the time it takes for the algorithm to sort the list increase quadratically.

At worst case, we do n-1 comparisons and n-1 passes where n is the length of the list. With big-O notation, the time complexity of bubble sort algorithm is O(n²).


Selection sort

The selection sort algorithm takes the first number and starting from the second one, all other numbers are compared with the first number. If a number is smaller than the first number, they are swapped.

When the first iteration is complete, we make sure the smallest number is at the beginning of the list. Then, we take the second number and repeat the same process.

After the execution is completed, the list is sorted in ascending order.

The following figure shows how the selection sort algorithm works. As we just mentioned, after the first iteration, the smallest number is in the first position. After the second iteration, the second smallest number is in the second position and so on.

Selection sort algorithm (image by author)
Selection sort algorithm (image by author)

Here is the python code for the selection sort algorithm.

def selection_sort(L):
   for i in range(len(L)-1):
      for j in range(i+1, len(L)):
         if L[j] < L[i]:
            L[i], L[j] = L[j], L[i]

If you want to see the current version of the list after each iteration and count the number of swaps at each iteration, you can use the following code. It will be helpful in understanding how the algorithm works.

def selection_sort(L):
   for i in range(len(L)-1):
      count = 0
      for j in range(i+1, len(L)):
         if L[j] < L[i]:
            L[i], L[j] = L[j], L[i]
            count +=1
      print(L, count)

For each number in the list, we iterate over the entire list. Thus, the time complexity of the selection sort algorithm is also O(n²) where n is the length of the list.


Merge sort

The merge sort algorithm is a little more complicated to grasp but it is more efficient than the previous two algorithms. It performs better in terms of time complexity.

This algorithm has two parts. First part is the merge function that merges two sorted lists in a way that the resulting list is sorted. Please note that the lists must be sorted for the merge function to work.

The merge function first compares the smallest numbers of two lists. Then, it takes the smaller of the two and appends to a new list. The next step is to compare the second smallest from this list to the smallest in the other list. The process continues until all numbers are appended to a new list in ascending order.

The following figure shows how the merge function works.

Merge (image by author)
Merge (image by author)

The first part which consists of the merge function does not actually serve for our purpose by itself. We are trying to sort a list but the merge function takes merged lists as arguments. We will solve this issue with the second part.

The second part is to split a list. It keeps splitting a list into two until all the parts have a length of 1. We will do that recursively. Once we get lists of length 1, we call the merge function explained in the previous step.

We can now build our way towards the top by merging sorted lists. The end result is the original list sorted in ascending order.

The following figure illustrates how merge sort algorithm works.

Merge sort algorithm (image by author)
Merge sort algorithm (image by author)

To be more accurate, it is better to show merge function calls from down to bottom because it is called recursively. I have created the figure in this way for demonstration purposes.

Here is the python code for merge sort algorithm.

def merge(left, right): #left and right are sorted lists
   result = []
   i, j = 0, 0
   while i < len(left) and j < len(right):
      if left[i] < right[j]:
         result.append(left[i])
         i +=1
      else:
         result.append(right[j])
         j +=1
   while (i < len(left)):
      result.append(left[i])
      i +=1
   while (j < len(right)):
      result.append(right[j])
      j +=1

   return result
def merge_sort(L):
   if len(L) < 2:
      return L[:]
   else:
      middle = len(L) // 2
      left = merge_sort(L[:middle])
      right = merge_sort(L[middle:])
      return merge(left, right)

The time complexity of the merge sort algorithm is O(n log(n)). The n comes from the merging part and log(b) comes from the splitting in half part.

Let’s test our merge_sort function.

lst = [4,1,3,11,2,1,8,4]
merged = merge_sort(lst)
print(merged)
[1, 1, 2, 3, 4, 4, 8, 11]

It works as expected.


Conclusion

We have covered three essential sorting algorithms used in computer science. The merge sort is a better choice in term of the time complexity.

When designing an algorithm, one should consider both time and space complexities as well as successively completing the given task.

Practicing and learning the commonly used algorithms will definitely help you gain experience in designing algorithms. You will also have a chance to have a mind exercise.

Thank you for reading. Please let me know if you have any feedback.


Related Articles