## Sunday, March 13, 2011

### The simple Big-O Notation Post.

I make no claim to be a "computer scientist" or a software "engineer" (those titles alone can spark some debate); I regard myself as a software developer, and I generally don't study the math and science behind everything I do. I learn what is relevant and useful to my day-to-day work, and only rarely go deeper and dabble in the theory behind it. This is one of those occasions, so I decided to scour the internet and see what I could pick up. I hope to keep this simple, practical and to the point.

Big-O:
• Describes how an algorithm scales and performs, in terms of either the execution time required or the space used.
• Is a relative representation of complexity. This allows you to reduce an algorithm to a variable, which in turn allows you to easily compare it to another.
• Describes an upper bound on the growth of a function; in other words, the "worst-case scenario".

There is also Big-Omega notation, which looks at the lower bound (the "best-case scenario"), stating that the algorithm will take at least X amount of time, and Big-Theta, which is a tight bound on both the lower and upper bounds (the "average case").

Some quick observations in determining Big-O:
• A sequence of statements, or things like conditional checks, is constant: O(1).
• A loop of statements results in O(n), n being the number of loop executions.
• Nested loops are multiplied together: O(n * m), where n is the number of times the outer loop executes and m is the number of times the inner loop executes.
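As a quick sketch, these three cases might look like the following (the method names are my own, purely illustrative):

```java
public class GrowthExamples {

    // O(1): a fixed sequence of statements, no loops.
    static int constantTime(int[] data) {
        return data.length == 0 ? -1 : data[0];
    }

    // O(n): a single loop, one pass per element.
    static long linearSum(int[] data) {
        long sum = 0;
        for (int value : data) {
            sum += value;
        }
        return sum;
    }

    // O(n * m): nested loops multiply together.
    static long pairCount(int[] outer, int[] inner) {
        long count = 0;
        for (int i = 0; i < outer.length; i++) {      // runs n times
            for (int j = 0; j < inner.length; j++) {  // runs m times per outer pass
                count++;                              // executes n * m times in total
            }
        }
        return count;
    }

    public static void main(String[] args) {
        int[] a = {1, 2, 3};
        int[] b = {4, 5};
        System.out.println(constantTime(a)); // 1
        System.out.println(linearSum(a));    // 6
        System.out.println(pairCount(a, b)); // 6, i.e. 3 * 2
    }
}
```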

Comparing the common notation examples:
(Thanks to Algorithms: Big-Oh Notation.)

| n | Constant O(1) | Logarithmic O(log n) | Linear O(n) | Linear Logarithmic O(n log n) | Quadratic O(n²) | Cubic O(n³) |
|---|---|---|---|---|---|---|
| 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| 2 | 1 | 1 | 2 | 2 | 4 | 8 |
| 4 | 1 | 2 | 4 | 8 | 16 | 64 |
| 8 | 1 | 3 | 8 | 24 | 64 | 512 |
| 16 | 1 | 4 | 16 | 64 | 256 | 4,096 |
| 1,024 | 1 | 10 | 1,024 | 10,240 | 1,048,576 | 1,073,741,824 |
| 1,048,576 | 1 | 20 | 1,048,576 | 20,971,520 | ~10¹² | ~10¹⁸ |

Java code examples, illustrating some of the notations in the table above:
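The following is a sketch of my own, not a definitive implementation: linear search as the classic O(n) case, and binary search as the classic O(log n) case, since it halves the sorted range on every pass. The mid-index is computed as `startIndex + (endIndex - startIndex) / 2`, which avoids integer overflow on very large arrays.

```java
public class NotationExamples {

    // O(n): linear search may scan every element in the worst case.
    static int linearSearch(int[] data, int target) {
        for (int i = 0; i < data.length; i++) {
            if (data[i] == target) return i;
        }
        return -1;
    }

    // O(log n): binary search over a sorted array; endIndex is exclusive.
    static int binarySearch(int[] data, int target) {
        int startIndex = 0;
        int endIndex = data.length;
        while (startIndex < endIndex) {
            // (endIndex - startIndex) / 2 is computed first, then offset by startIndex
            int midIndex = startIndex + (endIndex - startIndex) / 2;
            int midValue = data[midIndex];
            if (midValue == target) return midIndex;
            if (midValue < target) {
                startIndex = midIndex + 1; // discard the lower half
            } else {
                endIndex = midIndex;       // discard the upper half
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        int[] sorted = {1, 3, 5, 7};
        System.out.println(binarySearch(sorted, 5)); // 2
    }
}
```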

Common data structures and their relevant functions. Lists and Sets:
| Structure | get | add | remove | contains |
|---|---|---|---|---|
| ArrayList | O(1) | O(1)* | O(n) | O(n) |
| LinkedList | O(n) | O(1) | O(1) | O(n) |
| HashSet | O(1) | O(1) | O(1) | O(1) |
| LinkedHashSet | O(1) | O(1) | O(1) | O(1) |
| TreeSet | O(log n) | O(log n) | O(log n) | O(log n) |
* ArrayList Notes:

Thanks to reader comments, the add method on an ArrayList should be: O(1) amortized (and O(n) in the worst case). Useful reference links:

Constant Amortized Time
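To illustrate where the amortized O(1) comes from, here is a toy sketch of the doubling idea (my own simplification, not the real ArrayList source): most adds are a single array store, and the occasional O(n) resize happens rarely enough that the average cost per add stays constant.

```java
import java.util.Arrays;

// Toy dynamic array: capacity doubles when full, so the expensive
// copy only happens after n cheap adds, giving amortized O(1) add.
public class ToyArrayList {
    private int[] elements = new int[2];
    private int size = 0;

    public void add(int value) {
        if (size == elements.length) {
            // O(n) resize and copy, but only when capacity is exhausted
            elements = Arrays.copyOf(elements, elements.length * 2);
        }
        elements[size++] = value; // the common O(1) path
    }

    public int get(int index) {
        return elements[index]; // O(1): direct index into the backing array
    }

    public int size() {
        return size;
    }

    public static void main(String[] args) {
        ToyArrayList list = new ToyArrayList();
        for (int i = 0; i < 100; i++) {
            list.add(i); // resizes only a handful of times along the way
        }
        System.out.println(list.size()); // 100
    }
}
```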

Maps:

| Structure | get | put | remove | containsKey |
|---|---|---|---|---|
| HashMap | O(1) | O(1) | O(1) | O(1) |
| LinkedHashMap | O(1) | O(1) | O(1) | O(1) |
| TreeMap | O(log n) | O(log n) | O(log n) | O(log n) |
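A small sketch of the trade-off behind this table: HashMap hashes straight to a bucket (O(1) on average), while TreeMap keeps its keys sorted and pays O(log n) per operation for that ordering.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

public class MapOperations {

    static String firstTreeKey() {
        Map<String, Integer> hashMap = new HashMap<>();
        hashMap.put("zebra", 1);                     // O(1) average: hash to a bucket
        boolean found = hashMap.containsKey("zebra"); // O(1) average lookup

        TreeMap<String, Integer> treeMap = new TreeMap<>();
        treeMap.put("zebra", 1);  // O(log n): descend the sorted tree
        treeMap.put("apple", 2);
        // The ordering is what you buy with the extra log n cost:
        return treeMap.firstKey(); // smallest key, regardless of insertion order
    }

    public static void main(String[] args) {
        System.out.println(firstTreeKey()); // apple
    }
}
```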

#### 13 comments:

1. Hi,
thanks for the post. It was nice to realize that I remember at least something from my CS course :).

<nit-picker>
I have an observation to some values in the last 2 tables though.

It is not 100% true to say that add to ArrayList is O(1). In case underlying arrays requires extension, it will take O(n), where n is the number of elements already present in the array.

As for the maps, O(1) is the case only in approximation. In fact, it's a bit more complex and depends on load factor, hash function distribution and length of the chain per hash value.
</nit-picker>

On the other side, one of key words in post title is "simple", so these comments are really some nerdy stuff :).

2. Thanks for the comments, and from what I read, you're right.

but

:) yeah I tried to keep it simple, practical and easy to remember, so wanna stay away from the painfully technical details

3. You might want to star the ArrayList entry for 'add', and point out the following: add is O(1) amortized, but O(n) worst-case since the array must be resized and copied.

See
http://download.oracle.com/javase/6/docs/api/java/util/ArrayList.html
http://stackoverflow.com/questions/322715/when-to-use-linkedlist-over-arraylist
http://stackoverflow.com/questions/200384/constant-amortized-time

4. Thanks for the input Philip. Yeah, I'll make mention and add the links you mentioned, very helpful.

5. Great post mate, one of the best I have read on Big O notation. The beauty of this post is that it's simple, clean and concise, and the relevant examples from Java and Collections make it more useful. I would recommend any Java developer read this post to get a clear idea of Big O, and that will certainly help them analyze their algorithms as well. Hope to see some more articles from you.

Thanks
Javin
Why wait() and notify() method must be called from synchronized context

6. Very nice post. The cleanest I've ever seen so far.
Thanks.

Alin

7. Nice summary. While many implementations like to capture the theory of big-O; the associated cost of O(1) is often ignored.

The big-O really depends on the parameters passed to the method/function. For example, if you're dealing with the actual data points it is incorrect to say that any of the LinkedList's operations are O(1). Since given the data, a search to find the list entries is still required.

Have you given thought to the run-time resource usage to get the 'most optimal' performance?

I previously leveraged ArrayList--almost exclusively--due to perceived smaller size and 'constant' access times. I have since retreated from the white rabbit, absent any wounds. The default ArrayList sizes clash with resizing of arrays that kill both performance _and_ resource usage.

8. There's one part that is either wrong, or I am crazy.
while (startIndex < endIndex) {
int midIndex = (endIndex - startIndex / 2) + startIndex;
int midValue = data[midIndex];
The formula for retrieving the midIndex is the problem. The parenthesis don't actually do anything in this case, the equation will be solved the same without it. Did you mean to write this: int midIndex = (endIndex - startIndex)/2 + startIndex ??
Either way, I found this post extremely helpful; I've had trouble finding easy-to-understand documentation on Big O notation.

9. Well written, it was extremely helpful and easy to understand.

10. Thanks, nice and clearly explained. Another good one on Big O:

Big O Notation

11. Great article, thank you!
One question:
You wrote: "Nested loops are multiplied together: O(n2) where n is the times the outer loop executes and m is the times the inner loop executes."
Shouldn't it be: "Nested loops are multiplied together: O(n * m) where n is the times the outer loop executes and m is the times the inner loop executes."?