Why are the roots of this recursive defined polynomial bound by the roots of the discriminant of the characteristic polynomial

characteristic polynomialdiscriminantpolynomialsrecursionroots

Let's define a polynomial recursively as:

$$
\begin{align}
p_0(x) &= 1 \\
p_n(x) &= x \sum\limits_{k=1}^n a_k p_{n-k}(x)
\end{align}
$$

Let $a_k$ be an arithmetic progression.

Question: Why are the roots of $p_n(x)$ bound by the roots of the discriminant of the characteristic polynomial of $p_n(x)$?

Here some steps I did so far:

1. Simplification of the recursion

A general formula for the arithmetic progression is: $a_n = a_{n-1} + d = a_1 + (n-1) d$

$$
\begin{align}
p_n(x) &= x \sum\limits_{k=1}^n a_k p_{n-k}(x) = a_1 x p_{n-1}(x) + x \sum\limits_{k=2}^n a_k p_{n-k}(x) \\
&= x a_1 p_{n-1}(x) + x \sum\limits_{k=2}^n a_{k-1} p_{n-k}(x) + xd \sum\limits_{k=2}^n p_{n-k}(x) \\
&= (a_1x+1) p_{n-1}(x) + xd \sum\limits_{k=2}^n p_{n-k}(x) \\
&= (a_1x+1) p_{n-1}(x) + xd p_{n-2}(x) + xd \sum\limits_{k=3}^n p_{n-k}(x) \\
&= (a_1x+2) p_{n-1}(x) + xd p_{n-2}(x) -(a_1x+1)p_{n-2}(x) \\
&= (a_1x+2) p_{n-1}(x) + \left[(d-a_1)x-1\right] p_{n-2}(x) \\
\end{align}
$$

2. Root of the discriminant $\Delta_{\lambda}$ of the characteristic polynomial $\chi(\lambda)$

$$
\begin{align}
\chi(\lambda) &= \lambda^n – (a_1x+2) \lambda^{n-1} – \left[(d-a_1)x-1\right] \lambda^{n-2} = 0 \\
&= \lambda^2 – (a_1x+2) \lambda – \left[(d-a_1)x-1\right] = 0 \\
\Delta_{\lambda} &= (a_1x+2)^2 + 4\left[(d-a_1)x-1\right] = a_1^2x^2 + 4a_1x + 4 + 4dx – 4a_1x – 4 \\
&= (a_1^2 x + 4d)x \\
x_1 &= 0 \\
x_2 &= -\frac{4d}{g_1^2}
\end{align}
$$

Let $x_0$ be a root of $p_n(x)$ such that $p_n(x_0) = 0$ then

$$
-\frac{4d}{a_1^2} < x_0 \leq 0, \forall d, a_1, n
$$

EDIT: There was a mistake in the simplification process of the recurrence relation. I corrected it. It influenced the result for the calculation of the roots of the discriminant. Therefore the accepted answer was using my wrong calculation. But the answer is still applicable as the inequality was still holding.

Best Answer

The characteristic polynomial is crucial to understanding how the function behaves because of how the difference equation works. This answer assumes you are very comfortable with the basics of linear difference equations and that have a good theoretical grasp of how functions work.

We're going to work on getting a closed form for $p_n(x)$. The key realization here is that the value $p_n(1)$, for example, only depends on $p_1(1),p_2(1),p_3(1),\dots,p_{n-1}(1)$. In general, for a fixed $t$, $p_n(t)$ only depends on the values of the other functions at $t$. Therefore, if we fix $t$, we can explore the behavior of the sequence $p_1(t),p_2(t),p_3(t),\dots$ without worrying about how the function behaves in other places.

I'll take your equation, $p_n(x)=(x+2)a_1p_{n-1}(x)+[(d-a_1)x-a_1]p_{n-2}(x)$ as given. If we fix $x$, then we can write $p_n(x)-(x+2)a_1p_{n-1}(x)-[(d-a_1)x-a_1]p_{n-2}(x)=0$, which is a linear difference equation with constant coefficients (remember $x$ is fixed). This means that we can solve it the standard way; assume that $p_n=\lambda^n$ is a solution for some constant $\lambda$ (or, in the world of the function $p_n(x)$, $\lambda$ depends on $x$). This gives us the characteristic function $\lambda^2-(x+2)a_1\lambda-((d-a_1)x-a_1)=0$. This will, of course, give rise to two (either both real or complex conjugate) solutions $\lambda_{1,2}(x)$, and then for suitable constants $c_1,c_2$ we have that $p_n(x)=c_1(\lambda_1(x))^n+c_2(\lambda_2(x))^n$. (This assumes $\lambda_1\neq\lambda_2$: the case where the two are equal is unimportant to the overall explanation and adds extra complexity, so I won't be covering it.)

So far, we've used the same ideas that exist in any beginner's course on difference equations. The only level of abstraction to get your head around is that we are using functions of $x$ instead of sequences. This is important to understand in order to answer the question you posed.

Let's go back to that characteristic polynomial, $\lambda^2-(x+2)a_1\lambda-((d-a_1)x-a_1)=0$. Clearly, a choice of $x,a_1,d$ fixes constant values of $\lambda_{1,2}$. The discriminant of that characteristic polynomial is, as you said, $\Delta=a_1^2x^2+4[a_1(a_1-1)+d]x$, a function of $x$. But what does the discriminant of a quadratic mean? You'll remember that if the discriminant is positive, then there are two distinct real solutions for the quadratic. If the discriminant is negative, then there are two non-real, complex conjugate solutions of the quadratic.

So, if we are seeking to evaluate $p_n(x_1)$, for some fixed real $x_1$, then that choice of $x_1$ will make $\Delta>0$ or $\Delta<0$. (Again, $\Delta=0$ gives the double root, which you should explore in your own time.) If we have $\Delta(x_1)>0$, then the corresponding values $\lambda_{1,2}(x_1)$ will be real and distinct. In that case, we may be able to find a root to $p_n(x_1)=c_1\lambda_1(x_1)^n+c_2\lambda_2(x_1)^n$. But if $\Delta(x)<0$, then the corresponding values $\lambda_{1,2}(x_1)$ will be nonreal and complex conjugates. Let's explore this case in more detail:

Again, we've already chosen $x_1$, so we'll just write $p_n,\lambda_1,\lambda_2$ to save space and keep me sane. Now, we know that $\lambda_{1,2}$ are complex conjugates, and so $\lambda_1^n$ and $\lambda_2^n$ are also complex conjugates. So we can write: $\lambda_1^n=r+bi$ and $\lambda_2^n=r-bi$. So if want $p_n=c_1\lambda_1+c_2\lambda_2=0$, then $r(c_1+c_2)+b(c_1-c_2)i=0$. So either $\lambda_1=\lambda_2=0$ (which is false), or $c_1=c_2=0$ (which is false), or $\lambda_{1,2}$ are pure imaginary and $c_1=c_2$ (which turns out to never be true). So clearly, we can't have $p_n(x_1)=0$.

Let's summarize: when we choose $x_1$ so that $\Delta(x_1)<0$, then we must have complex conjugate solutions $\lambda_{1,2}$ and therefore $p_n(x_1)\neq0$. But if $\Delta(x_1)>0$, then $\lambda_{1,2}$ are real and we might have a root $p_n(x_1)=0$. So, every root $x_0$ of $p_n$ satisfies $\Delta(x_0)>0$. (Technically $\geq 0$, but we're ignoring $\Delta=0$.) But when is $\Delta(x_0)>0$? Why, whenever $x_0$ is in between the two roots $x_1,x_2$ of $\Delta$! So we know that if $p_n(x_0)=0$, then:

$$-4-\frac{4(d-a_1)}{a_1^2}<x_0<0$$

And then we add in the trivial solution $x_0=0$ to get the final inequality.

I hope this helped! Please let me know if you have any questions :)

Related Question