Big O notation is probably something you have come across if you have ever done programming or looked at mathematical algorithms. Many algorithms (e.g. Ramanujan's formula) list a Big O notation. Some examples of these notations are: $$ \Large \begin{gather} O(1) \\ O(n) \\ O(n^2) \\ O(n \log(n)) \end{gather} $$ But what do these things actually mean?
Generally speaking, Big O notation can be treated as the "speed" of the algorithm. But that isn't entirely correct: it should instead be seen as "how much the time taken grows as the difficulty of the problem grows".
Now that statement might not instantly make sense, but stick with me here: using these coding examples, I hope to help you understand.
def get_item(l, n):
    # Return the nth item of the list l.
    return l[n]

x = [0, 1, 2, 3]
get_item(x, 2)
We say the algorithm get_item has difficulty \( O(1) \). This is because, even if the list is tens of thousands of items long, accessing the "nth" item is no more complicated. In other words, finding item 1 and finding item 10000 are equally difficult.
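If you want to convince yourself of this, here's a minimal sketch (the list size and indices are arbitrary choices of mine) that times both lookups with Python's built-in timeit module:

import timeit

big_list = list(range(100_000))  # a list with 100,000 items

# Time accessing an early item vs a late item, a million times each.
early = timeit.timeit(lambda: big_list[1], number=1_000_000)
late = timeit.timeit(lambda: big_list[99_999], number=1_000_000)

print(f"index 1:     {early:.3f}s")
print(f"index 99999: {late:.3f}s")  # roughly the same: indexing is O(1)

Both timings should come out almost identical, whichever index you pick.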
Now let us take a slightly different problem: what if I asked you to (non-pythonically) find an item in a list that's equal to 1?
x = [0, 1, 2, 3]
for item in x:
    if item == 1:
        break
We say this has a worst case scenario of \(O(n)\). But what does "worst case scenario" mean?
We say this because, in the worst case, the list looks like this:
x = [..., 1]
When the final item of the list is 1, the computer has to look through the whole list to find this item.
In the best case scenario, this problem has complexity \(O(1)\):
x = [1, ...]
When the algorithm finds the value "1" as the very first item, the best case only needs to do a single operation. This is no different to our first example of get_item.
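To make "best case" and "worst case" a bit more concrete, here's a small sketch of mine (the helper comparisons_to_find_one is hypothetical, not part of the examples above) that counts how many items the loop inspects:

def comparisons_to_find_one(lst):
    # Count how many items we inspect before we find a 1.
    count = 0
    for item in lst:
        count += 1
        if item == 1:
            break
    return count

print(comparisons_to_find_one([1, 0, 2, 3]))  # best case: 1 comparison
print(comparisons_to_find_one([0, 2, 3, 1]))  # worst case: 4 comparisons, the whole list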
Let's take an example of a 2-dimensional (square) list of values, and try to do the same as before, where we try to find the value that's equal to 1.
def find_one(grid):
    # Look at every row, then every cell within that row.
    for row in grid:
        for cell in row:
            if cell == 1:
                return True
    return False

x = [
    [4, 3, 7, 3],
    [9, 4, 3, 2],
    [2, 1, 3, 5],
    [6, 7, 1, 5],
]
find_one(x)
So, as you can see, we now have two for-loops inside each other. This means it has to go through every single row, and then also go through each cell (column) of those rows.
This problem has a worst case scenario of \(O(n^2)\) complexity, because if the value 1 were in the bottom-right corner of this table, it would have to iterate over every single item in the list to reach it.
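As a quick sanity check, here's a throwaway sketch (the helper steps_to_find_one is my own, hypothetical addition) that counts the iterations for an \(n \times n\) grid whose only 1 sits in that worst-case corner:

def steps_to_find_one(grid):
    # Count every cell we look at before hitting a 1.
    steps = 0
    for row in grid:
        for cell in row:
            steps += 1
            if cell == 1:
                return steps
    return steps

n = 4
grid = [[0] * n for _ in range(n)]
grid[n - 1][n - 1] = 1  # worst case: the 1 sits in the bottom-right corner
print(steps_to_find_one(grid))  # 16, i.e. n^2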
We can find the complexity of a problem by simply looking at the formula that describes it. For example, let's look at the following formula:
$$ y = x^2 + 2x + 1 $$
As the value of \(x\) increases, the \(x^2\) term grows much faster than the \(2x\) term, and the value of \(1\) will remain the same.
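A quick way to see this without a graph is to tabulate the three terms for growing \(x\) (a throwaway snippet of mine, with arbitrary sample values):

for x in [1, 10, 100, 1000]:
    # The x**2 column quickly dwarfs the 2*x and constant columns.
    print(f"x={x:>4}  x^2={x**2:>8}  2x={2*x:>5}  1=1")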
In order to understand this better, let's plot it on a graph:
If you can't see this graph, try plotting it on a bit of paper; trust me, it'll help.
As we zoom out on this graph, we can see that the \(y = x^2\) curve (blue) grows much more quickly than the \(y = 2x\) (red) one. For this equation, we call its difficulty just \(O(n^2)\), because as the value of \(x\) heads towards infinity the \(x^2\) term dominates all the others.
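One way to justify dropping the smaller terms: divide the whole expression by the dominant term and let \(x\) grow.
$$ \lim_{x \to \infty} \frac{x^2 + 2x + 1}{x^2} = \lim_{x \to \infty} \left( 1 + \frac{2}{x} + \frac{1}{x^2} \right) = 1 $$
For large \(x\), the polynomial behaves just like \(x^2\), which is exactly why we keep only the \(n^2\) when writing \(O(n^2)\).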
If you want to read up more about this, I'd thoroughly recommend looking at limits of fractions at infinity.
This section of the guide is inspired by A-Level questions. Even if you aren't taking A-Levels, it is still hopefully interesting!
Example problem:
$$ \begin{gather} \text{A list of length 100 takes 5 s to be sorted using Bubble Sort.} \\ \text{Given that Bubble Sort is of complexity } O(n^2)\text{,} \\ \text{calculate the time taken for a list of length 500.} \end{gather} $$
First we need to look at the multiplier increase in size: \(\frac{500}{100} = 5\), a \(5\) times increase. Because the problem has a difficulty of \(O(n^2)\), we will assume the worst case and square this \(5\): \(5^2 = 25\). Then multiply this by the original time taken: \(25 \times 5 = 125\) seconds.
This gives an overall formula for \(O(n^2)\):
$$
\text{new time taken} = \left (\frac{\text{new length}}{\text{old length}} \right )^2 \times
\text{old time taken}
$$
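If you'd like to check the arithmetic, here's a minimal sketch of that formula in Python (the function name predicted_time is my own invention):

def predicted_time(old_time, old_length, new_length, power=2):
    # Scale the old time by the size ratio raised to the complexity's power.
    return (new_length / old_length) ** power * old_time

print(predicted_time(5, 100, 500))  # 125.0 seconds, matching the worked answer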
What if it were for a difficulty of \(O(n!)\) (something we haven't looked at yet)? The time now scales with the factorial of the length itself, so the ratio becomes a ratio of factorials rather than a factorial of the ratio:
$$ \text{new time taken} = \frac{(\text{new length})!}{(\text{old length})!} \times \text{old time taken} $$
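For example (numbers invented purely for illustration): if a list of length 5 takes 1 s under an \(O(n!)\) algorithm, a list of length 6 should take
$$ \frac{6!}{5!} \times 1\,\text{s} = 6 \times 1\,\text{s} = 6\,\text{s} $$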
Let's take an example formula as a reference point:
$$ \Large f(x) = \frac{x}{x^2-100}, x \neq -10, 10 $$
REMEMBER: \(x\) cannot equal \(10\) or \(-10\), because either value would cause a division by 0.
Now let's say we start increasing the value of \(x\) all the way up to \(\infty\). In mathematics we write this as:
$$\Large\begin{gather} \lim_{x \to +\infty} f(x) \\ \text{or} \\ \lim_{x \to \infty} f(x) \end{gather}$$
So let's rewrite our formula using this notation:
$$\Large\begin{gather} \lim_{x \to \infty} f(x) = \lim_{x \to \infty} \frac{x}{x^2-100} \\ x \neq -10, 10 \end{gather}$$
Let's now plot this function, to give an idea of what it looks like!
Can you see that, as \(x\) gets closer to infinity (as its value increases), the curve starts getting closer and closer to the x-axis?
This means the \(y\) value (\(f(x)\) value) is approaching 0. We can write this as:
$$\Large\begin{gather} \lim_{x \to \infty} f(x) = \lim_{x \to \infty} \frac{x}{x^2-100} = 0 \\ x \neq -10, 10 \end{gather}$$
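You can verify this numerically with a few lines of Python (a throwaway check of mine, not part of the original examples):

def f(x):
    return x / (x ** 2 - 100)

for x in [100, 1_000, 10_000, 100_000]:
    # The printed value shrinks towards 0 as x grows.
    print(x, f(x))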