[Math] Using the substitution method for solving recurrences

recurrence-relations

I have a question.

In my book they have the following recurrence:

$T(n) = 3T(\lfloor n/4\rfloor )+\theta(n^2)$

They try to guess that $T(n) = O(n^2)$ and they then use the substitution method to verify the guess. But they don't show the base case? Isn't that necessary?

I think maybe it is because they don't know what happen with $T(n)$ when $n=1.$ ??

In my book they also have the reccurence $T(n)=2T(\lfloor n/2 \rfloor)+n$ and $T(1)=1$

They then guess that $T(n)=O(n \ln n)$ And they use the substitution method to verify it.

They assume that $T(n)=O(n \ln n)$ for all positive m

$T(n) \leq 2( c \lfloor n/2 \rfloor \ln( \lfloor n/2 \rfloor)+n$

$\phantom{T(n)} \leq cn \ln(n/2)+n$

$\phantom{T(n)} = cn \ln(n)-cn \ln(2)+n$

$\phantom{T(n)} = cn\ln(n)-cn+n$

$\phantom{T(n)} \leq cn \ln(n)$

where the last step holds as long as $c\geq 1$.

Ok. They then say: "Mathematical induction requires us to show that our solution holds for the boundary conditions"

$T(1)\leq cl\ln(1)=0$

which is at odds with $T(1)=1$

but then they take advantage of asymptotic notation requiring them only to prove $T(n)\leq c n \ln(n)$ for $n\geq n_0$ where they get to choose $n_0$

Then they replace $T(1)$ by $T(2)=4$ and $T(3)=5$ as base cases in the inductive proof letting $n_0=2$

And my question is:

Why do I have to replace the base case $T(1)$ with $T(2)$ AND $T(3)$? Why not just replace it with $T(2)=4 $

I can derive $T(2)=4$ from the recurrence and then say

$T(2)\leq c2 \ln(2) = c2$

Where $c \geq 1$ and I choose $c\geq 2$

Why do I have to consider $T(3)$ ?

Best Answer

There are two competing ideas here, that of induction, and that of asymptotics. Normally, induction requires a base case, but because of the way asymptotics work, the base case will be trivially satisfied, because we can always take $c$ to be large enough to outweigh what happens in the early cases. The fact that they did any base cases at all is purely pedagogical, so as to make things look like induction, but as we shall see the base case is not strictly necessary.

We can phrase what we are trying to prove in terms of induction as follows.

The recurrence $T(n)$ is $O(f(n))$ if there exists constants $c$ and $n_0$ such that $T(n+n_0)<cf(n+n_0)$ for every n>0$.

Here we have phrased things in terms of $n+n_0$ just so that the induction can start at $1$, but there is no harm in replacing $n$ with $n_0$ and starting the induction with $1+n_0$.

Assume $f(n)>0$ for all $n>k$. Since we don't care what $c$ is, if all we are concerned about is making the asymptotic statement true, we can take $n_0$ to be $k$, because we can push all the rest of the slack into our choice of the unspecified constant $c$. It might take many terms before $f$ starts growing as fast as $T$, but $c$ can take care of any finite number of terms before this happens.

For example. if $T(n)=10^{6n}$ for $n<10^6$ and $0$ afterwards, $T(n)$ is $O(1)$. We start with $n_0=1$ and we take $c=10^{10^{10^{10}}}$. Or bigger if we want. We take $c$ to be whatever it needs to be so that the base case is immediately satisfied without any checking. Note that we could take $c=10^6$ if we only cared about the base case, but a larger constant was required to make the statement true despite our poor choice of $n_0$.

Of course, we might still need $n_0$ to be large, because the growth of the sequence might depend on $n$ in such a way that it behaves differently for small $n$, and our choice of $c$ can only take care of the base case. We need $n_0$ to make sure that the induction step still holds. In the above example, if we want induction to hold, we want to take $n_0>10^6$.

Related Question