- March 24, 2023
Misunderstanding Big O Notation can hurt how your systems scale.
Algorithms that increase in operation at a larger rate than linear algorithms are inefficient. This can result in systems that scale at a costly rate.
As a Software Engineer, it is vital you can identify all of the inefficient notations, and this article will cover the first three: N Log N, Quadratic and Cubic notation.
Big O Notation — How algorithms grow in complexity as the input grows in size
If you are not aware of what Big O Notation is, check out the first article from the Big O Notation series — ‘What is Big O Notation?’
The Wikipedia definition of Big O Notation is — ‘In computer science, big O notation is used to classify algorithms according to how their run time or space requirements grow as the input size grows.’
To classify these algorithms we use a growth hierarchy which is ranked in order of efficiency. A good analogy is the energy efficiency rating. Where A is the most efficient rating, and as you move down the alphabet, the lower the efficiency.
In our previous article, we covered the most efficient algorithm notations — Constant, Logarithmic and Linear. These notations grow at a small or fixed rate as the input grows.
Today our concern is the notations which grow at a larger rate than a linear increase as the input gets bigger. With a linear fixed rate, we would expect to see a fixed increase.
For example, a linear algorithm that grows by a 1-second operation time as the input increases by 1, will take 100 seconds to process an input of 100.
This is a linear algorithm, but the following notations will grow at a larger rate, as the input gets bigger:
N Log N Notation — O (n log n)
N Log N involves an algorithm that has an operation of n multiplied by log n.
(n x log n)
For a recap on the log, n notation check out my previous article on the notations.
This occurs in algorithms that step through the input in a linear fashion (a fixed rate), and for each input, a log n operation is used.
Typically we see this in a Merge Sort algorithm, where the input (a list), is split up repeatedly into groups of two until each item has a separate group (linear operation). And then each pair of groups are sorted into one larger group. Repeating the sorting and merging of two groups, until there is just one group left.
So how does that work?
Real Life Scenario — Clothing Store
Christopher is working at a clothes shop and has been tasked with sorting all the trousers in the store.
There is only one size of each item ranging from size 10 to size 17.
Now he has come up with a system to order these efficiently so he takes the following steps:
- He gets the 8 pairs of trousers and keeps dividing the groups of trousers until he has 8 different groups (groups 1–8).
- Christopher then puts two groups together and sorts them in the process. Comparing the size of group 1 and group 2. Putting the lowest on the left in group 9 and the largest on the right. He repeats this process until he has four groups (groups 9–12).
3. Christopher repeats this process and selects the next two groups. He compares the left-hand side of both groups. Whichever is lowest will be moved out of the group and placed on the left-hand side of the new group. He repeats the process of comparing the most left items in each group until there are no more items left.
So for group 9 and group 10. Christopher compares the most left items (13 for group 9, and 10 for group 10). As 10 is the lowest number, it will be the first number in group 13. Now he compares 13 for group 9 and 11 for group 10 (the most left remaining number). As 11 is the lowest, this goes into group 13 next. Now he is just left with group 9 intact, and as that has already been sorted, these trousers can go at the end of group 13.
This process is repeated for groups 11 and 12 resulting in two new groups:
There is one more step, merging the two groups and sorting all the trousers together.
He repeats the process. Comparing the left-most size of group 13 (size 10), with the left-most size of group 14 (size 12). As 10 is the lowest Christopher adds it to group 15 first, and compares the next most left item from group 13 with the most left of group 14. As size 11 is still the lowest, he adds the size to group 14 next and he moves up group 13 again. Comparing size 13 with size 12.
Christopher continues this process until he has a sorted group of trousers.
Christopher used a merge sort to order his trousers
This is an example of a merge sort using n log n.
In step 1, Christopher uses a linear approach to divide the trousers into their own group. By halving the group of 8 trousers continuously until each set of trousers is in their own group.
He compares two groups of trousers and sorts them into a new larger group. Repeating this process until there is one sorted group left. He uses divide and conquer to reduce the number of groups from (eight separate groups to one group), which is a log n notation.
As step 1 was O(n) and the remaining steps were O(log n), we can put the notations together to get O(n x log n), or shortened: O(n log n).
A merge sort contains more complex code than the typical notation:
For this merge sort, there are two separate functions, one function to split the list of numbers up (mergeSort()) and another function to sort the list (merge()).
Merge Sort Function
This function looks to see if there is a list of more than one integer and splits the list into two groups.
As the function is recursive, it calls itself until the stop condition is met. Splitting out the list until the numbers are individual items, using linear notation O(n).
Once this stop condition is met the second function gets called for each iteration of the split.
So if we have a list of [4, 2, 3, 1], we will have multiple levels:
1. [4, 2, 3, 1]
2. [4, 2] [3, 1]
3.    
The merge function runs the output closest to the stop condition and compares two groups.
As 4 and 2 would be the first numbers to reach the stop condition we would run the merge function as so:
And the merge function would return an array with the numbers sorted in size [2, 4].
The next comparison is run:
This sorting continues to occur until there is one group of sorted integers left:
By running a linear function to divide the groups and a log n function to sort and merge the groups, we have an O(n log n) result.
Big O Notation — Quadratic Notation — O(n²)
Quadratic notation is a simple one to remember.
As the input gets larger, so does the amount of operation by a squared rate.
The amount of operation required is the input squared.
If the input is 4, we expect there to be 12 steps (4 x 4). Or if the input is 2, there should be 4 steps (2 x 2).
Real Life Scenario- Creating a guest invitation list
Mike sells a guest invitation service.
Part of his service involves writing handwritten invitations to his customers. For every invitation package, he must make an invitation for each guest. And as part of his design, he includes every guest’s name on the invitation.
The larger the guest list, the more work there is for Mike by a quadratic amount.
If just three people are on the guest list, he just needs to create three invitations and add three names to each invitation.
3 Invitations x 3 Names
But if there are twenty people on the guest list, he must write twenty invitations and include twenty names on each invitation.
20 Invitations x 20 Names
This is not a great business model for Mike. As he starts accepting larger events, the amount of work scales at a squared rate.
Making the notation quadratic.
Squared notations are normally seen when you have a for loop inside a for loop.
In Software Engineering we call this a nested for loop, meaning we have a loop inside another for loop.
Here’s an example:
A nested for loop gets larger as the input gets larger. As for every iteration of the outside loop, the inside loop must loop to a completion.
Meaning if the input is 3, for every step of the outside loop, the inside loop must perform 3 steps. Meaning the outside loop ends up printing the below statement three times:
echo('Invitation: ' . $i);
While the statement of the inside loop gets printed nine times:
echo('Guest: ' . $j);
Now this only works, if we loop through the input for both the outside loop and the inside loop, but you can see how this grows as the input gets bigger.
Cubic Notation — O(n3)
This notation is similar to a Quadratic Notation.
The input of an algorithm with a cubic notation will have the complexity of the input cube.
If the input of an algorithm with a cubic notation is 3, the complexity in time or space will be:
(3 x 3 x 3) = 27.
For a cubic time complexity, this would result in 27 total steps just for an input of 3.
This notation should be avoided, due to its inefficiency.
A cubic notation usually occurs when there are three levels of a nested for loop.
Each for loop, iterates through the input, resulting in the following equation:
(n x n x n)
This could be seen in PHP like this:
A better way to paint the picture of this function is through a flow chart.
Now imagine the input is 2:
The above diagram shows how many steps the cubicNotation() function would have to make, just to process an input of 2.
Algorithms which grow above a linear rate as the input grows are inefficient
The ideal scenario would be to always use linear, log n, and constant notation-based algorithms.
But sometimes using inefficient algorithms for more complex operations is unavoidable.
As a Software Engineer, if you can identify an algorithms notation and improve it by one rank, you can save a lot of money on operation costs.