The Math Behind “Big O” and Other Asymptotic Notations

The formal definitions of notations like “Big O”, “Big Omega”, and “Big Theta”.

CR Ferreira
Towards Data Science


Photo by Shubham Sharan on Unsplash.

Big O (pronounced “big oh”) is a mathematical notation widely used in computer science to describe the efficiency of algorithms, either in terms of computational time or memory space. The main purpose of this and other so-called asymptotic notations is to describe the behavior of mathematical functions by comparing their “orders of growth”.

In this article, I first give a quick overview of how the Big O notation is used in practice to describe the efficiency of algorithms. Later, I present its mathematical definition, as well as those of other similar notations like “Big Omega” (Ω) and “Big Theta” (Θ). Additionally, I also provide a couple of formal methods you can use to classify mathematical functions according to those definitions.

“Big O” in a Nutshell

Note: If you are used to working with algorithms, chances are you’ve used or seen the Big O notation before. If you believe that you already have a general understanding of what Big O means and how it is used to describe the efficiency of an algorithm, you can probably skip to the next section, where you’ll find the mathematical definitions for Big O and other similar notations.

The Big O notation is commonly used to group algorithms into a few basic efficiency classes, like O(log(n)), O(n), O(n*log(n)), O(n²), and so on. We say that a standard linear search algorithm runs in O(n) because its running time is expected to increase linearly with its input size. Similarly, we know that binary searches are O(log(n)), and the most efficient comparison-based sorting algorithms run in O(n*log(n)).
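
To make these classes more concrete, here is a minimal Python sketch of the two searches mentioned above. The function names and structure are my own illustration, not taken from any particular library: the linear search may inspect all n elements, while the binary search halves the remaining interval at every step, which is where the log(n) comes from.

    # Minimal sketches of a linear search (O(n)) and a binary search (O(log(n))).
    # Function names and structure are illustrative, not from any specific library.

    def linear_search(items, target):
        """Scan the list from left to right: up to n comparisons."""
        for i, value in enumerate(items):
            if value == target:
                return i
        return -1

    def binary_search(sorted_items, target):
        """Halve the search interval at each step: about log2(n) comparisons."""
        low, high = 0, len(sorted_items) - 1
        while low <= high:
            mid = (low + high) // 2
            if sorted_items[mid] == target:
                return mid
            elif sorted_items[mid] < target:
                low = mid + 1
            else:
                high = mid - 1
        return -1

    data = list(range(1_000_000))
    print(linear_search(data, 999_999))  # scans about a million elements
    print(binary_search(data, 999_999))  # needs only about twenty halvings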

However, saying that an algorithm is O(n²) doesn’t mean that it performs exactly n² operations for a given input of size n. Imagine that Algorithm A always performs f(n)=2n²+n+1 operations. What people usually do with this function is to drop its non-dominant terms (like +n and +1) and its constants (like the 2 in 2n²) to obtain its asymptotic notation O(n²). If there is an Algorithm B that always performs f(n)=4n²+3n+3 operations, it can also be described as O(n²), although it performs more than twice as many operations as Algorithm A for any value of n.

But there’s nothing wrong with that. When we use an asymptotic notation like Big O, we are interested in comparing orders of growth, instead of caring about exact numbers. The main goal of this analysis is to describe how the running time or memory-space required by an algorithm increases depending on the input size n, especially when n is considerably large.

If an algorithm is O(n²), that means that if we double the input size, the number of operations performed will be roughly quadrupled. Going from n=100 to n=200, the number of operations performed by Algorithm A will go from 20,101 to 80,201, while for Algorithm B this number will go from 40,303 to 160,603. In both cases, the number of operations gets multiplied by roughly four (although not exactly).

On the other hand, algorithms that are O(2ⁿ) or O(n!) have much higher orders of growth, and that can be noticed even for small values of n. For instance, simply going from n=10 to n=20 will cause an O(2ⁿ) algorithm to perform roughly 1,024 times as many operations. And, for an O(n!) algorithm, that would mean more than 670 billion times as many operations as before.
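
The figures quoted above are easy to reproduce. Here is a short sketch of my own, assuming nothing beyond the Python standard library, that evaluates both operation counts and the exponential and factorial ratios:

    import math

    # Operation counts assumed for the two hypothetical algorithms above.
    def ops_a(n):
        return 2 * n**2 + n + 1       # Algorithm A

    def ops_b(n):
        return 4 * n**2 + 3 * n + 3   # Algorithm B

    print(ops_a(100), ops_a(200))   # 20101 80201  (ratio of about 3.99)
    print(ops_b(100), ops_b(200))   # 40303 160603 (ratio of about 3.98)

    print(2**20 / 2**10)                             # 1024.0 times as many operations
    print(math.factorial(20) // math.factorial(10))  # 670442572800, about 670 billion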

Formal Definitions of Asymptotic Notations

If you weren’t already familiar with the Big O notation, I hope that the quick introduction above was enough to give you a general understanding of how an asymptotic notation like Big O works, and why it makes so much sense to apply this formalism when evaluating the efficiency of algorithms. If you feel like you still don’t fully understand this concept, you may want to check other articles with additional examples (like this one) before getting into the more formal mathematical definitions discussed below.

In the following, I will go deeper into how Big O is mathematically defined. I will also introduce a few similar asymptotic notations that are likewise used in computer science to evaluate the efficiency of algorithms.

Big O: Upper Bounds

So far, I’ve been using the Big O notation the way it’s most commonly used: as an exact description of an algorithm’s efficiency. However, that is not technically correct — as I will show below, Big O only provides a way to represent upper bounds. So keep in mind that from now on I will start referring to Big O according to how it’s mathematically defined, instead of how it’s normally used in practice. (If you are interested in a deeper discussion about this very common misconception, you can check this previous article of mine.)

Informally, O(g(n)) can be understood as the set of all functions that don’t grow faster than g(n). For instance, O(n²) is the set that contains n²+3n, 3n, n, log(n), and infinitely many other functions that do not grow faster than f(n)=n² when n → ∞.

Now, here’s how Big O is formally defined:

f(n)∈ O(g(n)) if and only if there exist some positive constant c and some nonnegative integer nₒ such that f(n) ≤ cg(n) for all n ≥ nₒ.

For instance, let’s check if 3n∈ O(n²). Taking c=1 and plotting the functions f(n)=3n and cg(n)=1n², we can see that both intersect at nₒ=3:

Plotting f(n)=3n and cg(n)=1n². Note that n∈ℕ, but I plotted the function domain as ℝ for clarity. Created with Matplotlib.

Looking at the plot, we can easily tell that 3n ≤ 1n² for all n≥3. But that’s not enough, as we need to actually prove that. We can use mathematical induction to do it. It goes like this:

  1. We need a base case nₒ for which we know that f(nₒ) ≤ cg(nₒ). But we already have one, as we know that we can use nₒ=3.
  2. Then we assume that f(n)≤cg(n) for a given n≥nₒ. In our case, we assume that 3n ≤ 1n² for a given n≥3.
  3. Now we need to show that if f(n)≤cg(n), then f(n+1) ≤ cg(n+1). In our case, we need to prove that if our assumption (2) above is true, then 3(n+1) ≤ (n+1)² is also true. This can also be written as 3n+3 ≤ n² + 2n + 1. But we already know from (2) that 3n ≤ n², and since n≥3 we also have 3 ≤ 2n+1; adding these two inequalities side by side gives 3n+3 ≤ n²+2n+1, which is exactly what we wanted. This completes the proof that 3n ≤ 1n² for all n≥3, and therefore 3n ∈ O(n²).
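
The induction argument above is the actual proof. Still, before committing to a candidate pair (c, nₒ), a quick numerical check over a finite range of n can help you spot a bad choice early. The helper below is my own sketch; since it only inspects finitely many values of n, it can refute a candidate bound but never prove it:

    # Finite-range sanity check for a Big O witness (c, n0): evidence, not a proof.
    def holds_up_to(f, g, c, n0, n_max=10_000):
        return all(f(n) <= c * g(n) for n in range(n0, n_max + 1))

    print(holds_up_to(lambda n: 3 * n, lambda n: n**2, c=1, n0=3))  # True
    print(holds_up_to(lambda n: 3 * n, lambda n: n**2, c=1, n0=1))  # False (fails at n=1 and n=2)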

Note: There’s a much more convenient way to check if f(n)∈ O(g(n)) for two given functions f(n) and g(n). This will be discussed later in this article, after we’re done presenting the formal definitions of the other asymptotic notations.

Big Omega (Ω): Lower Bounds

Big Omega is defined in a similar manner to Big O, except that it represents a lower bound instead of an upper bound. Thus, its formal definition only reverses the relation between f(n) and cg(n). In other words, “≤” becomes “≥”:

f(n)∈ Ω(g(n)) if and only if there exist some positive constant c and some nonnegative integer nₒ such that f(n)≥cg(n) for all n ≥ nₒ.

As an example, consider that n³∈ Ω(2n³+n). To prove this statement, you may try c=1/3 and nₒ=1, and use the induction approach as described above. Here’s a graphical representation:

Plotting f(n)=n³ and cg(n)=(1/3)(2n³+n). Created with Matplotlib.

Big Theta (Θ): Tight Bounds

Big Theta is used to represent tight bounds for functions. Saying that f(n)∈ Θ(g(n)) means that f(n) has exactly the same order of growth as g(n).

Basically, Big Theta is the intersection of Big O and Big Omega. Here are two simple definitions for Big Theta based on that fact:

1) f(n) ∈ Θ(g(n)) if and only if f(n) ∈ O(g(n)) and f(n) ∈ Ω(g(n)).

2) Θ(g(n))=O(g(n)) ⋂ Ω(g(n)).

But Big Theta can also be defined similarly to how we defined Big O and Big Omega above, by merging both definitions together:

f(n)∈ Θ(g(n)) if and only if there exist some positive constants c₁ and c₂ and some nonnegative integer nₒ such that c₁g(n) ≤ f(n) ≤ c₂g(n) for all n ≥ nₒ.

Just like we did for Big Omega, we can also show that n³∈ Θ(2n³+n). For that, consider c₁=1/3, c₂=1/2, and nₒ=1:

Plotting c₂g(n)=(1/2)(2n³+n), f(n)=n³, and c₁g(n)=(1/3)(2n³+n). Created with Matplotlib.
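
As with Big O, a finite-range check can provide quick (though non-conclusive) evidence that these constants work before you write down the proof. The sketch below is my own illustration; it uses exact fractions for c₁ and c₂ to avoid floating-point noise:

    from fractions import Fraction

    # Finite-range check of c1*g(n) <= f(n) <= c2*g(n): evidence, not a proof.
    def theta_holds_up_to(f, g, c1, c2, n0, n_max=10_000):
        return all(c1 * g(n) <= f(n) <= c2 * g(n) for n in range(n0, n_max + 1))

    print(theta_holds_up_to(lambda n: n**3, lambda n: 2 * n**3 + n,
                            c1=Fraction(1, 3), c2=Fraction(1, 2), n0=1))  # True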

Little Omega (ω) and Little O (o): Loose Bounds

That is right, asymptotic notations are not always Big. There are also these two Little notations: Little Omega (ω) and Little O (o). They are not used as much as the Big ones, but I like to mention them for the sake of completeness.

These two notations are used to represent lower or upper bounds that are not “tight”. More specifically, for the kinds of well-behaved functions normally compared in algorithm analysis, ω(g(n)) can be thought of as containing the functions that are in Ω(g(n)) but not in Θ(g(n)), that is, ω(g(n)) = Ω(g(n)) – Θ(g(n)). Likewise, o(g(n)) = O(g(n)) – Θ(g(n)).

For instance, n²∈ o(n³), but n³∉ o(n³).

Here are the formal definitions for these two Little notations:

f(n)∈ o(g(n)) if and only if, for any positive constant c, there exists a nonnegative integer nₒ such that f(n) ≤ cg(n) for all n ≥ nₒ.

f(n)∈ ω(g(n)) if and only if, for any positive constant c, there exists a nonnegative integer nₒ such that f(n) ≥ cg(n) for all n ≥ nₒ.

Did you get the difference? While for Big Omega and Big O a single value of c is enough, Little Omega and Little O require the property to be valid for any value of c (with the threshold nₒ allowed to depend on c).
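
To see how the threshold depends on c, take f(n)=n² and g(n)=n³: the inequality n² ≤ c·n³ holds exactly when n ≥ 1/c, so the smaller the constant, the later the bound starts to hold. The sketch below, my own illustration, finds that threshold numerically for a few values of c:

    # For f(n) = n**2 and g(n) = n**3, find the smallest n0 with f(n) <= c*g(n)
    # for all n >= n0. Since n**2 <= c*n**3 is equivalent to n >= 1/c, the
    # threshold grows without bound as c shrinks, yet one always exists.
    def smallest_threshold(c, n_max=10**7):
        for n0 in range(1, n_max + 1):
            if n0**2 <= c * n0**3:   # once true, it stays true for larger n
                return n0
        return None

    for c in (1, 0.1, 0.01, 0.001):
        print(c, smallest_threshold(c))  # thresholds 1, 10, 100, 1000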

Using Limits To Compare Two Functions

In the examples above I’ve shown how you can use mathematical induction to prove whether f(n)∈ O(g(n)) for two arbitrary functions f(n) and g(n). But there’s a much handier way to systematically establish how two functions compare to each other when it comes to their orders of growth.

For that, you need to compute the limit L = lim(f(n)/g(n)) as n → ∞. If such a limit exists, its value tells you right away which asymptotic notations are valid when comparing f(n) with g(n):

  1. If L=0, then f(n)∈ O(g(n)) and f(n)∈ o(g(n)).
  2. If L → ∞, then f(n)∈ Ω(g(n)) and f(n)∈ ω(g(n)).
  3. If L is any other (positive, finite) constant, then f(n)∈ Θ(g(n)).

As an example, consider again f(n)=n³ and g(n)=2n³+n. For these functions, you will find that L=1/2, which means that n³∈ Θ(2n³+n). (But note that, by definition, it also means that n³∈ Ω(2n³+n) and n³∈ O(2n³+n).)
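
If you have SymPy installed, this limit test takes only a few lines. The snippet below is a sketch that assumes the sympy package is available; it reproduces the L=1/2 result and illustrates the other two cases:

    from sympy import symbols, limit, oo

    n = symbols('n', positive=True)

    # L = 1/2, a positive constant, so n**3 is in Theta(2*n**3 + n)
    print(limit(n**3 / (2 * n**3 + n), n, oo))

    # L = 0, so 3*n is in O(n**2) and o(n**2)
    print(limit(3 * n / n**2, n, oo))

    # L -> infinity, so 2**n is in Omega(n**2) and omega(n**2)
    print(limit(2**n / n**2, n, oo))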

Final Thoughts

In order to use asymptotic notations like Big O, you don’t really need to have an in-depth understanding of the math behind them. Nevertheless, it may be useful to know what they mean and how they are defined, especially when the relationship between two functions is not so obvious. Is Θ(log(n)) faster than Θ(√n)? Is O(2ⁿ) the same as O(3ⁿ)? What about Ω(log₂(n)) and Ω(ln(n))? Now that you know how these notations are mathematically defined, you can try to find answers to such questions by yourself.

