How to prove Big Theta Notation for a program

Tags: asymptotics, computational-complexity

I've been struggling to prove a Big Theta bound for a function. I can understand how one would go about proving a Big Theta bound when an f(x) is explicitly given, but I don't know how to do that with a program.

Below I've written down what I thought was a common definition for Big Theta Notation.

f(n) ∈ Θ(g(n)) if and only if there exist constants c₁, c₂ > 0 and a threshold n₀ such that c₁·g(n) ≤ f(n) ≤ c₂·g(n) for all n ≥ n₀.

In a program's case, what would f(x) be and what would be a good way of proving its Big Theta Notation?

For example, when I look at a typical nested for loop, I imagine that its Big Theta Notation is Θ(n²), where n is the specified number of iterations. How do I prove this using the Big Theta definition, particularly when I have no f(x)?

Just to preface, I have homework where I have to prove the Big Theta Notation of a given program. I just want general advice on how I can prove the Big Theta Notation for any arbitrary program.

Best Answer

When studying the runtime of an algorithm or a program, you really do have an $f(n)$ hiding in there somewhere ($n$ being the size of the input). The point of big-$\Theta$ notation is to forget about annoying constant or insignificant terms and focus on the dominant asymptotic behaviour (since programs are often quite messy).

For instance, suppose we wanted to analyse the runtime of an algorithm which, given a list $A$ of $n$ elements, counts the number of pairs $(i,j)$ with $i<j$ for which $A[i]\neq A[j]$, and then returns this number. A naive approach would be:

  • $c:=0$
  • for $j=0,\dots,n-1$
    • for $i=0,\dots,j-1$
      • if $A[i]\neq A[j]$
        • $c \gets c+1$
  • print $c$
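As a concrete reference point, here is one possible direct translation of the pseudocode into Python (the language choice is just for illustration):

```python
def count_unequal_pairs(A):
    """Count pairs (i, j) with i < j and A[i] != A[j],
    mirroring the pseudocode above."""
    c = 0
    n = len(A)
    for j in range(n):       # j = 0, ..., n-1
        for i in range(j):   # i = 0, ..., j-1
            if A[i] != A[j]:
                c += 1
    return c
```

For example, `count_unequal_pairs([1, 2, 3])` returns 3, while `count_unequal_pairs([5, 5, 5])` returns 0.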

If we wanted to analyse its runtime precisely, we would have to account for a lot of annoying constants of computation (such as how much time it takes to modify a variable, access memory, jump between instructions). Since big-$\Theta$ is blind to constants, we can ignore these. So, we can determine $f(n)$ to be the number of instructions of the above algorithm in an idealised world where all instructions take the same amount of time.

Unfortunately, the number of computations also depends on $A$, sort of. Namely, because of the "if" clause, we only run the "$c\gets c+1$" instruction when $A[i]\neq A[j]$. However, big-$\Theta$ saves us again here: in the best case we only run the "if" check in each iteration (when $A$ is a list of $n$ copies of the same element), and in the worst case we always run both the "if" check and the "$c\gets c+1$" instruction (when $A$ is a list of $n$ distinct elements). The cost of each iteration is therefore bounded between two constants, so big-$\Theta$ allows us to say the difference doesn't matter. So, we can simplify $f(n)$ by counting the contents of the inner loop as a single instruction.

Back to counting instructions. We have two instructions sandwiching our loops, and the inner loop iterates $j$ times for each $j$ ranging from $0$ to $n-1$. This means that the precise number of times the inner loop body runs is $\sum_{j=0}^{n-1}j = \frac{n(n-1)}2$.
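That sum can be sanity-checked empirically by instrumenting the loops (a quick sketch, not part of the proof itself):

```python
def inner_iterations(n):
    """Count how many times the inner loop body of the
    pseudocode above is reached, for input size n."""
    count = 0
    for j in range(n):
        for i in range(j):
            count += 1
    return count

# For every n, this should agree with the closed form n*(n-1)/2.
```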

Already, in our idealised world ignoring constants of computation and the "if" branching, we get $f(n) = \frac{n(n-1)}2+2$ as our runtime. In most cases, this level of precision is more than we need, so big-$\Theta$ comes in again to simplify our lives and allow us to conclude $f(n)\in\Theta(n^2)$, meaning our algorithm runs in quadratic time.
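To connect this back to the definition, one can exhibit explicit constants (this is one possible choice; many others work): for all $n \ge 2$,
$$\tfrac14 n^2 \;\le\; \frac{n(n-1)}2 + 2 \;\le\; n^2.$$
The lower bound holds because $n \ge 2$ gives $n-1 \ge \frac n2$, so $\frac{n(n-1)}2 \ge \frac{n^2}4$; the upper bound holds because $\frac{n(n-1)}2 \le \frac{n^2}2$ and $2 \le \frac{n^2}2$ for $n\ge 2$. Taking $c_1=\frac14$, $c_2=1$, and $n_0=2$ in the definition proves $f(n)\in\Theta(n^2)$.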

At the end of the day, big-$\Theta$ (or just big-$O$ in general) allows us to simplify our runtime analyses to just "ballpark counting" how many iterations our loops make, and multiplying that by a "ballpark" on how long the instructions in the loop take to run. Of course, you have to be careful when making "ballpark" estimates: sometimes the inner loop depends so sensitively on the outer loop that the runtime is much better than $\Theta(n^2)$; the sieve of Eratosthenes, for instance, runs in $O(n\log\log n)$ time despite its nested loops. But with enough practice you get used to how to play the game, and learn to recognise when nested loops are $\Theta(n^2)$ and when to suspect otherwise.
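As a cautionary example of nested loops that are not $\Theta(n^2)$, here is a standard sieve of Eratosthenes sketch in Python. The inner loop shrinks rapidly as $p$ grows, which is why the total work is $O(n\log\log n)$ rather than quadratic:

```python
def sieve(n):
    """Return all primes up to n via the sieve of Eratosthenes.
    Nested loops, yet only O(n log log n) total work, because the
    inner loop runs roughly n/p times for each prime p."""
    is_prime = [True] * (n + 1)
    is_prime[0] = is_prime[1] = False
    for p in range(2, n + 1):
        if is_prime[p]:
            # Mark multiples of p, starting at p*p
            for multiple in range(p * p, n + 1, p):
                is_prime[multiple] = False
    return [p for p in range(2, n + 1) if is_prime[p]]
```

Counting the pattern of loop nesting alone would suggest $\Theta(n^2)$; only by estimating how often the inner loop actually fires do you see the true bound.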
