
What Is Big O Notation and Why You Should Care

Choose the most efficient algorithm, according to your situation.


Photo by Junior REIS on Unsplash

As programmers, we often have to do some prototyping before starting a complex project. During this process, we might not write the most efficient code, and that’s OK: it’s more important to reach a working state in the shortest time possible. Inevitably, though, the time comes to start coding a deployable implementation of the project idea. This means any technical debt that may have been left behind must now be addressed.

Moreover, if you are a data scientist, you often find yourself working with large amounts of data at once. Since processing loads of data is a resource-intensive and time-consuming operation, it’s crucial to find the most efficient approach to the problem. In this article, I’ll walk you through big O notation and its applications.

Before choosing an algorithm

Given a problem, there are countless possible ways to solve it. However, only a few of them are actually good. How do you choose between algorithm A and algorithm B? You have to analyze the problem first.

  • Does your code have to run multiple times per second?
  • How much do you plan to scale your system?
  • What type of data will you be working with?

In case your program calls the same functions over and over again, you definitely need to make sure they don’t become a bottleneck for the whole system. Here, even a small speed increase can make a difference. For instance, assume you have written a function check_collisions() for your physics engine and it takes 20 milliseconds to run. That means you can calculate only around 50 physics frames per second. Now imagine you manage to shave 3 milliseconds off its execution time: that’s already about 59 frames per second.

Scalability is another important factor to take into account. If your application has to process an increasing amount of data, you had better choose an algorithm able to process it in a reasonable amount of time and that doesn’t need excessive memory to operate.

Talking about data, are you using appropriate types and structures for your specific problem? You can save large amounts of RAM just by knowing in advance what data you are dealing with. For example, if you are storing a person’s age, you don’t have to use a 4-byte signed integer. You would be better off using just an unsigned byte, since ages are never negative and don’t exceed 255 years (the maximum number that can be stored in a single byte).
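As a quick illustration, here is a sketch using Python’s built-in array module (the exact savings are platform-dependent; the 'i' type code is typically, but not guaranteed to be, 4 bytes):

```python
from array import array

# The same three ages stored two ways.
ages_int = array('i', [25, 40, 78])   # signed integers, typically 4 bytes each
ages_byte = array('B', [25, 40, 78])  # unsigned bytes, range 0-255, 1 byte each

print(ages_int.itemsize, "bytes per age")
print(ages_byte.itemsize, "byte per age")
```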

So, how do you compare different algorithms in terms of performance? This is where big O notation comes in handy.

What is big O notation?

According to Wikipedia, big O notation is a mathematical notation that describes the limiting behavior of a function when the argument tends towards a particular value or infinity. In computer science, big O notation is used to classify algorithms according to how their run time or space requirements grow as the input size grows. In other words, it measures a function’s time or space complexity. This means we can know in advance how well an algorithm will perform in a specific situation.

Before starting with the code, here is a list of simple rules to follow when calculating time or space complexity of an algorithm:

  1. Divide your algorithm into single operations or functions.
  2. Calculate the complexity of each operation.
  3. Add up the resulting big Os.
  4. Remove any constant and the lower order terms. Keep only the highest order term (the fastest growing one), which will be the big O of your algorithm.

To make it all clearer, let’s look at some concrete examples. For the code examples, I’ll be using Python, since it has a simple, English-like syntax anyone can understand. Also, I’m going to focus on time complexity rather than space complexity to keep the article concise; the same rules apply to both, so discussing space in detail would be redundant.

Constant time complexity

Let’s start with the simplest case in big O notation: constant time complexity. Take a look at this code:
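A minimal sketch of such a constant-time calculation (the function name is illustrative; any single arithmetic operation on the input would do):

```python
def double(number):
    # One multiplication, regardless of the value of number: O(1)
    return number * 2
```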

The time it takes to complete this calculation doesn’t depend on the input size. Whether the number is 2, 60, 100 or 1000, the run time wouldn’t increase or decrease, at least not significantly. This is classified in big O notation as O(1), meaning constant time complexity.
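For instance, consider a chain of six constant-time operations on input_number (a sketch; the variable names are illustrative):

```python
def process(input_number):
    a = input_number + 1   # O(1)
    b = a * 2              # O(1)
    c = b - 3              # O(1)
    d = c ** 2             # O(1)
    e = d % 10             # O(1)
    return e               # O(1) -> six O(1) operations in total
```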

The same applies to this code snippet. Changing input_number won’t have an impact on the algorithm’s performance. The sum of all the operations is O(1+1+1+1+1+1), or O(6), but it can be approximated to just O(1), since 6 is a constant. It surely takes longer to execute than the previous example, but big O doesn’t measure the precise time it takes to complete a certain task. Big O is about measuring how an algorithm’s performance varies as the input size grows.

Mathematical operations, assignments, logical statements and function calls all approximate to O(1), unless their implementations are themselves dependent on the input size. Here’s a list of common operations that are considered constant-time:

  • Arithmetic operations: a + b, a - b, a * b, a++ ...
  • Variable assignments: a = 2, a += 3, a -= 4 ...
  • Array indexing: a[0], a[i] ... where a is an array and i an integer.
  • Function calls: foo(), bar(arg) ... as long as their run time isn’t significantly dependent on the input size.
  • Logical statements: a && b, a and b, !a, not a, a || b, a or b ...
  • Member accessing: a.member, a.foo() ...
  • Bitwise operations: a << b, a | b, a & b, a >> b ...

Hash table indexing also scores, on average, O(1), like array indexing. Nonetheless, in rare worst cases it can reach O(n).

Iteration time complexity

Assume you have to iterate over every element of an array, for example to count the number of occurrences of a particular value.
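A sketch of such a counting function might look like this (assuming a simple linear scan):

```python
def count_occurrences(array, target):
    count = 0
    for element in array:      # n iterations: O(n)
        if element == target:  # O(1) comparison
            count += 1         # O(1) increment
    return count
```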

The time it takes to complete the task depends on the number n of elements in the array; more precisely, it is directly proportional to it. The time complexity of this algorithm can be classified as O(n), meaning that, as n increases, the run time increases linearly.

We can ignore the if statement inside the for loop, as its time complexity of O(1) isn’t strongly dependent on the input size. Note that its performance depends on the length of the strings being compared, but this isn’t as significant as the length of the array, so it can be ignored. Remember that big O is about giving an approximation of an algorithm’s behavior as its input size tends to infinity; it’s not about precise measurements.

Constantly nested iteration, polynomial time complexity

Say you have to iterate over an array multiple times, for example to calculate all the possible password combinations for a length of two characters and using just digits from 0 to 9.
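A sketch of this brute-force generation with two nested loops over the digits:

```python
digits = "0123456789"  # n = 10 possible characters
combinations = []
for char1 in digits:        # n iterations
    for char2 in digits:    # n iterations for each char1: n * n = n**2 total
        combinations.append(char1 + char2)
print(len(combinations))  # 100 two-character combinations
```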

This algorithm’s run time obviously depends on the number of possible characters. Precisely, it generates n² combinations, where n is the number of possible characters. If you were to add one more character, the program would have to iterate over the array n more times. The time complexity of this approach is therefore O(n²).

If you were to add more for loops to increase the length of the password, the time complexity would be O(nᶩ), where n is the number of available characters and l is the number of for loops, i.e. the password length. For example, three loops score O(n³), four loops score O(n⁴) and so on.

The statement combinations.append(char1 + char2) can generally be approximated to a time complexity of O(1), even though its run time may also be influenced by the input size, depending on the function’s implementation. As for Python’s lists, they are actually dynamic arrays, so they have to be resized as they grow. That topic is beyond the scope of this article, though, so I won’t focus on it.

Iteration over two arrays, linear time complexity

Imagine you now have to iterate over two arrays in order to find how many elements they have in common.
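A sketch of this comparison with two nested loops (assuming a naive approach that counts matching pairs):

```python
def count_common_elements(array1, array2):
    counter = 0
    for x in array1:       # a iterations (a = len(array1))
        for y in array2:   # b iterations for each x: a * b in total
            if x == y:
                counter += 1
    return counter
```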

This algorithm’s time requirements also depend on the number n of elements to check. For every element of one array, it has to iterate over the entire other array. For example, if we added an element to array1, the program would have to iterate over every element of array2 one more time.

Since the total number of iterations is the length of the first array times the length of the second, adding an element to one array increases the number of iterations by the length of the other: the partial derivative of f(a, b) = a·b with respect to a is b, and with respect to b is a. The big O notation for this algorithm would be O(a·b), but it can be simplified to just O(n) by treating one of the two arrays’ lengths as a constant.

As mentioned before, the if statement and the increment of counter can be ignored when calculating time complexity, as they don’t depend on the input size.

Nested iteration, exponential time complexity

The following algorithm generates all possible password combinations for the given character set, given a variable password length.
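A sketch of such a generator, using itertools.product for the outer loop and an explicit inner loop to build each password (names are illustrative):

```python
from itertools import product

def generate_passwords(charset, length):
    passwords = []
    # The outer loop runs k**n times (k = len(charset), n = length) ...
    for combo in product(charset, repeat=length):
        password = ""
        for char in combo:     # ... and the inner loop n times for each
            password += char
        passwords.append(password)
    return passwords
```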

Let n be the password length and k the constant number of available characters: this program performs n·kⁿ iterations. This is because for every one of the kⁿ iterations of the outer for loop, n more iterations are performed by the inner for loop. The other operations can be ignored.

The derivative of the number of iterations n·kⁿ with respect to n is kⁿ + n·kⁿ·ln(k), but the complexity can be approximated to O(n·kⁿ), often written as O(n·2ⁿ), as the input size grows towards infinity.

Logarithmic time complexity

Let’s now move to a very popular algorithm: binary search. Given a sorted array, it finds the index of the specified element, if it exists.
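A sketch of the classic iterative implementation (returning -1 when the element is absent, a common convention):

```python
def binary_search(array, target):
    low, high = 0, len(array) - 1
    while low <= high:
        mid = (low + high) // 2   # check the middle element
        if array[mid] == target:
            return mid
        elif array[mid] < target:
            low = mid + 1         # discard the lower half
        else:
            high = mid - 1        # discard the upper half
    return -1                     # target is not in the array
```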

Binary search works by discarding half of the remaining array in every iteration. This approach narrows down the elements to search very rapidly, regardless of the size of the given array. It scores O(log(n)) in time complexity, which makes it very efficient for large input sizes, thanks to its logarithmic run time.

Factorial time complexity

One of the worst possible cases when it comes to big O is factorial run time. Take this simple code snippet as an example:

For every iteration of the loop inside this function, factorial gets called again. Each recursive call results in the for loop being executed over and over, until number reaches zero. This algorithm is indeed very slow, especially for large input numbers. The time complexity of this function is O(n!).

How to choose the right algorithm

When it’s time to get your hands on the problem, it’s important to choose an algorithm that fits the situation properly. It may seem obvious that you should opt for the lowest time complexity possible, right? It’s actually not that easy.

Depending on the input size and the specific implementation, an O(n) algorithm might even be faster than an O(1) one. Take a look at this graph:

Graph of common big O notation cases.

In the long run, lower time complexities are always going to be more efficient than their higher counterparts. However, there is a region where some of the performance curves cross, and you should pay close attention to it if you intend to optimize your programs as much as possible.

Graph of common big O notation cases.

As you can see, below a certain threshold, which is totally dependent on the situation, some algorithms that score worse in big O actually have an advantage over others that do better with larger input sizes. This is to say that you should first analyse the problem and its environment in depth before jumping on the keyboard. Depending on the data you have to process, the graph for your algorithm may look different at this scale.

Conclusion

To wrap up, it’s important to know your problem and your data well in order to apply the most efficient algorithm for the situation. When planning to build an application, you have to take into account how it’s going to perform as the user base and the data grow, and adapt your code accordingly.

Unfortunately, there isn’t one tool that fits all purposes and situations. Optimization is all about using the most suitable tool for each specific case, rather than sticking with whatever you happen to have on hand.

Even though you can drive in a screw with a hammer, you might as well use a screwdriver.

I hope you enjoyed this article. If you have a question or would like to add something, please share your thoughts in a comment. I’d love to know your opinion.

Thanks for reading!

