[Math] What does Big O actually tell you

algorithms, asymptotics, calculus, computational-complexity, real-analysis

Two days ago I felt very uncomfortable with Big O notation. I've already asked two questions:

  1. Why to calculate "Big O" if we can just calculate number of steps?
  2. The main idea behind Big O notation

And now almost everything has become clear. But there are a few things that I still can't understand:

  1. Suppose we have an algorithm that runs in $1000n$ steps. Why do people say that the coefficient $1000$ becomes insignificant when $n$ gets really large (and that's why we can throw it away)? This really confuses me, because no matter how large $n$ is, $1000n$ is still $1000$ times bigger than $n$. And that seems very significant (in my head). Any examples of why it is considered insignificant as $n$ tends to infinity would be appreciated.

  2. Why is Big O said to estimate the running time in the worst case? Given a running time of $O(n)$, how is that a worst-case bound? I mean, in this case we think that our algorithm is not slower than $n$, right? But in reality the actual running time could be $1000n$, and that is indeed slower than $n$.

  3. According to the definition, Big O gives us a scaled upper bound on $f$ as $n \to +\infty$, where $f$ is our function of time. But how do we interpret it? Given an algorithm running in $O(n)$, we will never be able to calculate the actual number of steps it takes. We just know that if we double the size of the input, we double the computation time as well, right? But if that $O(n)$ algorithm really takes $1000n$ steps, then it seems we would also need to multiply the size of the input by $1000$ to visualise how it grows, because $1000n$ and $n$ have very different slopes. So if you just double the computation time for a doubled input size, you're going to get the wrong idea about how the running time grows, right? How then do you visualise its growth rate? (A small numerical sketch of what I mean follows this list.)
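To show what I mean, here is a small numerical sketch (plain Python; the function names and the $1000n$ step count are just my made-up examples, not real algorithms) comparing how $n$, $1000n$ and $n^2$ respond when the input size is doubled.

```python
# Toy step-count functions (made-up examples, not real algorithms).
def steps_n(n):          # "O(n)" with constant factor 1
    return n

def steps_1000n(n):      # "O(n)" with constant factor 1000
    return 1000 * n

def steps_n_squared(n):  # "O(n^2)"
    return n * n

for n in [10, 100, 1000, 10_000]:
    for f in (steps_n, steps_1000n, steps_n_squared):
        ratio = f(2 * n) / f(n)  # how much slower on a doubled input?
        print(f"{f.__name__:15s} n={n:6d}  T(2n)/T(n) = {ratio:.1f}")
```

The ratio comes out to $2$ for both linear functions and $4$ for $n^2$, even though the absolute values of $1000n$ are huge, and that is exactly what I'm trying to reconcile.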

I want to add that I know the definition of Big O and how to calculate it, so there is no need to explain it. Also, I've already googled all these questions tons of times with no luck. I'm learning calculus at the moment, so I hope I asked this question in the right place. Thank you in advance!

Best Answer

Reading between the lines, I think you may be misunderstanding Big O analysis as a replacement for benchmarking. It's not. An engineer still needs to benchmark their code if they want to know how fast it is. And indeed, on real-world data sets an $\mathcal{O}(n)$ algorithm might well be slower than an $\mathcal{O}(n^2)$ algorithm.

But as $n$ approaches infinity, the $\mathcal{O}(n^2)$ algorithm will ultimately be slower. For the sake of example, if we allow constant factors inside Big-O notation, then an "$\mathcal{O}(10n)$" algorithm takes more "steps" than an $\mathcal{O}(n^2)$ algorithm whenever $n$ is less than $10$ (the two meet at $n = 10$, where $10 \cdot 10 = 10^2$).

But if $n$ is $100$, then the $\mathcal{O}(n^2)$ algorithm takes ten times as long, and if $n$ is $1000$, a hundred times as long. As $n$ grows, so does that gap. The manner in which the two algorithms pull apart as $n$ grows is exactly what Big O analysis describes.
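To put numbers on that (a quick sketch; the $10n$ step count is just the example above, not a measurement of any real algorithm):

```python
# Compare the example step counts 10*n (linear, constant factor 10)
# and n*n (quadratic) at a few input sizes.
for n in [5, 10, 100, 1_000, 10_000]:
    linear = 10 * n
    quadratic = n * n
    print(f"n={n:6d}  10n={linear:12d}  n^2={quadratic:12d}  "
          f"n^2 / 10n = {quadratic / linear:.1f}")
```

Below the crossover at $n = 10$ the linear algorithm is actually the slower one; past it, the ratio $n^2 / 10n = n/10$ keeps growing without bound.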

Hopefully that example also makes it clear why the constant factor is irrelevant and can be ignored. Whether it's ten, a hundred, a thousand, a million, or a quadrillion ultimately does not matter, because as $n$ approaches infinity, the $\mathcal{O}(n^2)$ algorithm is eventually going to be slower anyway. That's the power of quadratic (more generally, super-linear) growth.
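In symbols, this is a one-line version of the same point, using a generic constant $c$ in place of the $10$ above:

$$
c\,n < n^2 \quad\Longleftrightarrow\quad n > c \qquad (c > 0),
$$

so whatever the constant $c$ is, the $\mathcal{O}(n^2)$ algorithm overtakes the $c\,n$ one as soon as $n$ exceeds $c$. The constant only decides *where* the crossover happens, never *whether* it happens.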

The crux of it is that Big O analysis is a mathematical concept which does NOT tell an engineer how fast something is going to run or how many steps it's going to take. That's what benchmarking is for. Big O analysis is still a very helpful tool in algorithm design, but if you're interested in exactly how long something takes to run, you'll need to look elsewhere.
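For completeness, here is a minimal sketch of what such a benchmark might look like in Python. The two functions being timed are hypothetical placeholders, standing in for whatever implementations you are actually comparing.

```python
import time

def algorithm_a(data):
    # Hypothetical O(n) placeholder: a single pass over the input.
    total = 0
    for x in data:
        total += x
    return total

def algorithm_b(data):
    # Hypothetical O(n^2) placeholder: compares every pair of elements.
    count = 0
    for x in data:
        for y in data:
            count += x < y
    return count

data = list(range(2_000))
for fn in (algorithm_a, algorithm_b):
    start = time.perf_counter()
    fn(data)
    elapsed = time.perf_counter() - start
    print(f"{fn.__name__}: {elapsed:.4f} s on n={len(data)}")
```

The benchmark tells you how long these particular implementations take on this particular input and machine; the Big O analysis tells you how those numbers will scale as $n$ grows.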
