Victus Spiritus


Brushing up on Computer Science Part 1, Big O

14 Mar 2011

"Those who cannot remember the past are condemned to repeat it"
George Santayana

Intro

Memory is such a fragile construct, yet we are forced to cruelly mold it into submission through extreme repetition and clever tricks of association. And even then, detailed memories fade exponentially over time and are of little use without regular access and utilization.

Early this morning, blogging friend Denton Gentry inspired me to review the basics of computer science: object-oriented programming, data structures, common algorithms, and the characterization of complexity (Big O). I leave the source of my inspiration as an exercise in deductive reasoning.

These posts will serve as a study guide for myself, and anyone else who chooses to read or contribute to them. I'll add to each post as I gather more relevant material and a fresher understanding of the types of issues and real world problems that demand readily accessible knowledge. Here are the 5-ish posts I'm planning on writing up:

  1. Intro and Big O
  2. Object Oriented Programming
  3. Data Structures: arrays, lists, trees, hash tables
  4. Algorithms (searches, sorts, maths!)
  5. Graphs, Networks, and Operating Systems

Big O

Let's begin with a review of Big O, or Big Omicron. While you can count on Wikipedia to deliver a well-written description of Big O notation, I found William Shields' Plain English Explanation of Big O Notation to be a concise and welcome companion for brushing off the cobwebs (thanks William). William shares a down-to-earth description of Big O:

Big O notation seeks to describe the relative complexity of an algorithm by reducing the growth rate to the key factors when the key factor tends towards infinity. For this reason, you will often hear the phrase asymptotic complexity. In doing so, all other factors are ignored. It is a relative representation of complexity.

Before we get into the details, I think it's helpful to get a frame of reference for the Orders of Complexity for several different algorithms, some of which you may be intimately familiar with. Over the years I've come across specific examples of Big O notation for various algorithms and here I list a brief sample (see the references, in particular Time Complexity for more complete lists):
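  O(1): hash table lookup (average case)
  O(log n): binary search on a sorted array
  O(n): linear search through an unsorted list
  O(n log n): merge sort, heapsort (worst case)
  O(n^2): bubble sort, insertion sort (worst case)
  O(n!): brute-force traveling salesman, trying every permutation of cities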

For Big O notation we usually skip the Best Case and concentrate on the Expected or Worst Case performance. Best Cases are rare and wonderful, unless they happen all the time, in which case they're expected :).
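To ground those orders in something runnable, here's a minimal Python sketch of my own (not from any of the sources above), with one function per common complexity class; the comments note where the dominant cost comes from:

    def constant_lookup(items, i):
        # O(1): indexing a list takes the same time regardless of len(items)
        return items[i]

    def linear_search(items, target):
        # O(n): the worst case scans every element once
        for index, value in enumerate(items):
            if value == target:
                return index
        return -1

    def binary_search(sorted_items, target):
        # O(log n): each comparison halves the remaining search range
        lo, hi = 0, len(sorted_items) - 1
        while lo <= hi:
            mid = (lo + hi) // 2
            if sorted_items[mid] == target:
                return mid
            elif sorted_items[mid] < target:
                lo = mid + 1
            else:
                hi = mid - 1
        return -1

    def has_duplicate_pair(items):
        # O(n^2): compares every pair of elements in nested loops
        for i in range(len(items)):
            for j in range(i + 1, len(items)):
                if items[i] == items[j]:
                    return True
        return False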

Equality and arithmetic operators

Unlike most equality operators you've seen before, the "equality" in Big O notation is not symmetric, as the Wikipedia excerpt below explains.

Equals sign

The statement "f(x) is O(g(x))" as defined above is usually written as f(x) = O(g(x)). Some consider this to be an abuse of notation, since the use of the equals sign could be misleading as it suggests a symmetry that this statement does not have. As de Bruijn says, O(x) = O(x^2) is true but O(x^2) = O(x) is not.[2] Knuth describes such statements as "one-way equalities", since if the sides could be reversed, "we could deduce ridiculous things like n = n^2 from the identities n = O(n^2) and n^2 = O(n^2)."[3]

For these reasons, it would be more precise to use set notation and write f(x) ∈ O(g(x)), thinking of O(g(x)) as the class of all functions h(x) such that |h(x)| ≤ C|g(x)| for some constant C.[3] However, the use of the equals sign is customary. Knuth pointed out that "mathematicians customarily use the = sign as they use the word 'is' in English: Aristotle is a man, but a man isn't necessarily Aristotle."[4]

Big O notation can also be applied to functions directly. In particular, only the most complex or highest-order term dominates a function's Order: if f(x) = x^2 + x + 3, its complexity is O(x^2).
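To see where the constant comes from in that example, here's the bound worked out (my own arithmetic): for x ≥ 1 we have x ≤ x^2 and 3 ≤ 3x^2, so x^2 + x + 3 ≤ x^2 + x^2 + 3x^2 = 5x^2. Taking C = 5 gives |f(x)| ≤ C·x^2 for all x ≥ 1, which is exactly the membership f(x) ∈ O(x^2) described above.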

Big O notation can also be used in conjunction with other arithmetic operators in more complicated equations. For example, h(x) + O(f(x)) denotes the collection of functions having the growth of h(x) plus a part whose growth is limited to that of f(x). Thus,

g(x) = h(x) + O(f(x))

expresses the same as

g(x) - h(x) ∈ O(f(x)).
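Here's a small worked instance of my own (not from the Wikipedia article): take h(x) = x^2 and g(x) = x^2 + 3x + 1. Then g(x) - h(x) = 3x + 1, and |3x + 1| ≤ 4x for all x ≥ 1, so 3x + 1 ∈ O(x) and we may write g(x) = x^2 + O(x).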

I'll wrap up with an important excerpt from the Wikipedia page on the order of complexity for graph theory, a vital model for practical programming applications:

Graph theory

It is often useful to bound the running time of graph algorithms. Unlike most other computational problems, for a graph G = (V, E) there are two relevant parameters describing the size of the input: the number |V| of vertices in the graph and the number |E| of edges in the graph. Inside asymptotic notation (and only there), it is common to use the symbols V and E when someone really means |V| and |E|. We adopt this convention here to simplify asymptotic functions and make them easily readable. The symbols V and E are never used inside asymptotic notation with their literal meaning, so this abuse of notation does not risk ambiguity. For example, O(E + V log V) means O(|E| + |V|·log|V|). Another common convention, referring to the values |V| and |E| by the names n and m respectively, sidesteps this ambiguity.
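As a concrete illustration of this convention: Dijkstra's shortest-path algorithm is the classic example quoted as O(E + V log V) (achievable with a Fibonacci heap). Below is a minimal sketch of my own using Python's heapq, i.e. a binary heap, which instead runs in O((V + E) log V), since each edge may push one heap entry:

    import heapq

    def dijkstra(graph, source):
        # graph: dict mapping each vertex to a list of (neighbor, weight) pairs
        # Returns a dict of shortest distances from source to each reachable vertex.
        dist = {source: 0}
        heap = [(0, source)]                # (distance, vertex); at most E+1 entries
        while heap:
            d, u = heapq.heappop(heap)      # each pop costs O(log E) = O(log V)
            if d > dist.get(u, float('inf')):
                continue                    # stale entry: a shorter path was already found
            for v, w in graph.get(u, ()):
                candidate = d + w
                if candidate < dist.get(v, float('inf')):
                    dist[v] = candidate
                    heapq.heappush(heap, (candidate, v))
        return dist

For example, dijkstra({'a': [('b', 1), ('c', 4)], 'b': [('c', 2)], 'c': []}, 'a') returns {'a': 0, 'b': 1, 'c': 3}.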

References:
Complexity and Big O notation:

  1. Big O notation, Wikipedia
  2. Time complexity, Wikipedia
  3. Plain English Explanation of Big O Notation, by William Shields

The following are not just excellent tips for interviewing; they're practical skills any solid computer scientist should have at their disposal, and they helped guide this blog series' topic coverage.