Slight nitpick before beginning
Contrary to what Zanzag suggests, you do in fact need second-order logic to characterize infinite structures (more specifically, by the Löwenheim–Skolem theorems, no first-order theory with an infinite model can pin down countable vs. uncountable structures).
TL;DR:
You need to unpack the formal definition of the limits (ie, the "epsilon-delta definitions") and distribute the logical negation of an existential quantifier like this:
$$\lnot \exists x\left[P(x)\right] \equiv \forall x \left[ \lnot P(x)\right]$$
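If you like, you can sanity-check this equivalence by brute force over a finite stand-in domain; here's a quick Python sketch (the domain and predicate are illustrative choices of mine, not anything from Spivak):

```python
# Check that ¬∃x P(x) ≡ ∀x ¬P(x) over a finite stand-in domain.
domain = range(-5, 6)
P = lambda x: x * x == 2   # illustrative predicate: "x squares to 2" (never true here)

lhs = not any(P(x) for x in domain)   # ¬∃x P(x)
rhs = all(not P(x) for x in domain)   # ∀x ¬P(x)
assert lhs == rhs
```

A finite domain can't prove the general law, of course; it just makes the quantifier bookkeeping tangible.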
The meaning of $\lim \limits_{x \to 0}f(x) \ne 0$ vs. $\lim \limits_{x \to 0}f(x) = \infty$
Notation disclaimer
The notation I'm working with may not be what Spivak uses, but I'm hoping it's really close since I borrowed the choice of variable names from this post about the infinity symbol in Spivak's calculus book.
Preface
I decided to write an "In English" version for those less versed in the predicate logic notation, so feel free to skip to the section "In predicate logic" if you've already got a grasp on the intuition of the formal meanings (aka, the delta-epsilon definitions) of these limits.
Definition of $\lim \limits_{x \to 0}f(x) \ne 0$
In English
So the statement, "$\lim \limits_{x \to 0} f(x) \ne 0$" really means "The limit of $f(x)$ as $x$ approaches $0$ exists (call it $L$) and it's not the number $0$." But the phrase "the limit of $f(x)$ ... exists" needs to be unpacked formally for the predicate logic you're using to make sense.
Formal definition of $\lim \limits_{x \to x_0}f(x) = L$
When we say,
The limit of $f(x)$ exists as the variable $x$ approaches the constant $x_0$ and is some number $L$.
we formally mean
For all positive real numbers $\varepsilon$, there exists some positive distance $\delta$ so that if $x$ is within the interval $(x_0 -\delta, x_0 +\delta)$ where $x \ne x_0$, then $f(x)$ is within the interval $(L - \varepsilon, L + \varepsilon).$
Intuition of $\lim \limits_{x \to x_0}f(x) = L$
Intuitively, you can think of $\varepsilon$ as the margin of error that your output $f(x)$ is allowed to be off from $L$. Likewise, you can think of $\delta$ as how close $x$ must be to the constant $x_0$ (but with $x \ne x_0$) to get $f(x)$ within the error bounds $L - \varepsilon$ and $L + \varepsilon$. We can also call the interval $(x_0 - \delta, x_0 + \delta)$ an open ball centered at $x_0$. Here's a picture which should give a better intuition than the words alone:
Fun fact, the reason we use the symbols $\varepsilon$ and $\delta$ is because they're the Greek letters for "e" and "d", respectively (for "error" and "distance" thanks to Cauchy).
We can think of $x \in \left( x_0 - \delta, x_0 + \delta \right)$ and $f(x) \in \left( L - \varepsilon, L + \varepsilon \right)$ as the following two 3-part inequalities:
$$ x_0 - \delta < x < x_0 + \delta \tag{A}$$
$$L - \varepsilon < f(x) < L + \varepsilon \tag{B}$$
We can then subtract $x_0$ from each part of (A), and we can subtract $L$ from each part of (B) to get these:
$$ - \delta < x - x_0 < \delta \tag{A1}$$
$$- \varepsilon < f(x) - L < \varepsilon \tag{B1}$$
If we recall that $- a < b < a$ is equivalent to $|b| < a$, then we can turn (A1) and (B1) into these two inequalities:
$$ \left| x - x_0 \right| < \delta \tag{A2}$$
$$ \left| f(x) - L \right| < \varepsilon \tag{B2}$$
Since the formal definition of the limit of a function is trying to capture the intuition that "$x$ approaches $x_0$ where $x \ne x_0$", we make (A2) into a 3-part inequality:
$$ 0 < \left| x - x_0 \right| < \delta \tag{A3}$$
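If it helps, here's a small numerical sketch of the quantifier game for the concrete (and purely illustrative) choice $f(x) = 2x$, $x_0 = 1$, $L = 2$, where the witness $\delta = \varepsilon/2$ always works:

```python
# ε-δ check for f(x) = 2x at x0 = 1 with L = 2: given ε, choose δ = ε/2,
# since 0 < |x - x0| < δ forces |f(x) - L| = 2|x - x0| < 2δ = ε.
f = lambda x: 2 * x
x0, L = 1.0, 2.0

for eps in (1.0, 0.1, 0.001):
    delta = eps / 2                      # our witness for "∃δ > 0"
    # sample a few x with 0 < |x - x0| < δ
    xs = [x0 + delta * t for t in (-0.9, -0.5, 0.5, 0.9)]
    assert all(0 < abs(x - x0) < delta for x in xs)
    assert all(abs(f(x) - L) < eps for x in xs)
```

Sampling finitely many points obviously doesn't prove the "$\forall \varepsilon$" claim; the point is just to see which quantity plays which role.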
In predicate logic: Applying this to $\lim \limits_{x \to 0}f(x) \ne 0$
So with this formal definition in mind, we see that the statement "$\lim \limits_{x \to 0}f(x) \ne 0$" really unpacks into this:
$$\forall f
\left[ \exists L \ne 0
\left[ \forall \varepsilon > 0
\left[ \exists \delta > 0
\left( \;
0 < \left| x - 0 \right| < \delta \implies
\left| f(x) - L \right| < \varepsilon
\right) \right] \right] \right]$$
Different kinds of "divergence" (ie, limits that don't exist)
I should note that when you use "$\nexists \lim \limits_{x\to 0}f(x)$", there are a couple ways this might happen. For instance, $f(x)$ might be an oscillating function like $\cos (\frac{1}{x})$ as shown here:
Also, it could be tending toward $\infty$ or $-\infty$ like $\frac{\cos(x)}{x^2}$ and $-\frac{\cos(x)}{x^2}$, respectively. Another picture from the Desmos graphing tool:
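To make the oscillation concrete: along the sequence $x_n = \frac{1}{n\pi}$ (which approaches $0$) we get $\cos(1/x_n) = \cos(n\pi) = (-1)^n$, so the outputs keep alternating between $1$ and $-1$ no matter how close $x_n$ gets to $0$. A quick numerical check:

```python
import math

# Along the inputs x_n = 1/(nπ) we have cos(1/x_n) = cos(nπ) = (-1)^n,
# so the outputs alternate between 1 and -1 while x_n → 0:
# no single number L can serve as the limit.
for n in range(1, 8):
    x_n = 1 / (n * math.pi)
    assert abs(math.cos(1 / x_n) - (-1) ** n) < 1e-9
```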
Formal definition of $\lim \limits_{x \to x_0}f(x) = \infty$
Since $\infty$ is not a real number (in other words, $\infty \notin \mathbb{R}$) we have to characterize the notion of approaching it without reference to the symbol itself. So in doing this with only real numbers, we get the following definition for "$\lim \limits_{x\to 0}f(x)=\infty$":
For all real numbers $M$, there exists some positive distance $\delta$ so that if $x$ is within the interval $(x_0 -\delta, x_0 +\delta)$ where $x \ne x_0$, then $f(x) > M$.
Intuition of $\lim \limits_{x \to x_0}f(x) = \infty$
The intuition here is that we can force the output $f(x)$ above any given real number $M$ by taking $x$ within some small enough distance $\delta$ of $x_0$. Another way to say this is that $f$ is unbounded from above on every punctured neighborhood of $x_0$.
In predicate logic: Applying this to $\lim \limits_{x \to x_0}f(x) = \infty$
So we can unpack this as
$$ \forall M \left[
\exists \delta > 0 \left(
\; 0 < \left| x - x_0 \right| < \delta \implies f(x) > M \;
\right) \right]$$
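Here's the same kind of numerical sketch for the (again purely illustrative) choice $f(x) = \frac{1}{x^2}$ at $x_0 = 0$: given $M > 0$, the witness $\delta = \frac{1}{\sqrt{M}}$ works, since $0 < |x| < \delta$ gives $x^2 < \frac{1}{M}$, ie $f(x) > M$:

```python
# M-δ check for f(x) = 1/x² at x0 = 0: given M > 0, choose δ = 1/√M,
# so 0 < |x| < δ implies x² < 1/M, ie f(x) = 1/x² > M.
f = lambda x: 1 / (x * x)

for M in (1.0, 100.0, 1e6):
    delta = 1 / M ** 0.5                 # our witness for "∃δ > 0"
    xs = [delta * t for t in (-0.9, -0.5, 0.5, 0.9)]
    assert all(0 < abs(x) < delta for x in xs)
    assert all(f(x) > M for x in xs)
```

(For $M \le 0$ any $\delta$ works, since $1/x^2$ is always positive, so checking positive $M$ is the interesting part.)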
Pulling all the predicate logic together to answer your question
So you're correct in thinking that the way you formulated your first predicate logic statement $(1)$ is a bit off, and that $(3)$ comes closer. However, you're fundamentally hiding the actual predicates within the shorthand notation of things like "$\lim \limits_{x\to0}g(x)$" and "$\lim \limits_{x\to0}f(x) = \infty$". Also, when you negate an existential quantifier "$\exists$" it becomes a universal quantifier "$\forall$", and vice versa. And when you say "the consequent of the biconditional is false ($\nexists \lim \limits_{x\to0}f(x)$)", it might be better to instead think "the consequent of the biconditional is negated ($\lnot (\exists \lim \limits_{x\to0}f(x))$)" and then expand the shorthand notation into the formal predicates I've given previously.
Another thing worth pointing out is that when I wrote statements like "$\exists L \neq 0$" and "$\forall \varepsilon > 0$", this is mild shorthand for "$\exists L \in \mathbb{R}\left( L \neq 0 \right)$" and "$\forall \varepsilon \in \mathbb{R} \left( \varepsilon > 0 \right)$", which follow the well-formed formulas "$\exists x \left(P(x)\right)$" and "$\forall x \left(P(x)\right)$" for some predicate $P(\_)$. Things get really long when you try to follow this syntax religiously.
You might be wondering why I used "$\exists L \in \mathbb{R}\left( L \neq 0 \right)$" instead of "$\exists L \left( L \neq 0 \right)$". This has to do with restricting the domain of discourse (ie, the set of considered individuals over which the logical variables range). In this context, we're only interested in real numbers at this level of calculus (as opposed to real numbers with a nonzero imaginary component like in the set of complex numbers $\mathbb{C} = \{ a + bi : a, b \in \mathbb{R} \}$). In fact, "$\forall \varepsilon \in \mathbb{R} \left( \varepsilon > 0 \right)$" hides some of the logical machinery as well; it can also be thought of as "$\forall \varepsilon \in \{\varepsilon \in \mathbb{R} : \varepsilon > 0 \}$" where the set $\{\varepsilon \in \mathbb{R}: \varepsilon > 0 \}$ can be the considered domain of discourse.
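For what it's worth, the standard way to unfold a bounded quantifier into the official syntax is $\forall x \in S \left( P(x) \right) \equiv \forall x \left( x \in S \implies P(x) \right)$ and $\exists x \in S \left( P(x) \right) \equiv \exists x \left( x \in S \land P(x) \right)$. Over a finite stand-in domain you can check both translations directly (the domain, set, and predicate below are just my illustrative choices):

```python
# Bounded quantifiers unfold into guarded ones:
#   ∀x∈S P(x) ≡ ∀x (x∈S → P(x))      ∃x∈S P(x) ≡ ∃x (x∈S ∧ P(x))
domain = range(-10, 11)                  # finite stand-in for the domain of discourse
S = {x for x in domain if x > 0}         # restricted domain, like {ε ∈ ℝ : ε > 0}
P = lambda x: 2 * x > x                  # illustrative predicate (true exactly when x > 0)

# bounded form vs. guarded form, for both quantifiers
assert all(P(x) for x in S) == all((x not in S) or P(x) for x in domain)
assert any(P(x) for x in S) == any((x in S) and P(x) for x in domain)
```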
Resources on logic (predicate and mathematical)
There's a really nice, concise handout from a Harvard lecture on predicate logic here that should help give a general idea of domains of discourse with some straightforward examples. Also, there's a great back and forth in the comments on this answer to a question on "Domain of discourse and quantifying in predicate logic".
Lastly, if you'd like a more thorough intro to how to use (and understand) mathematical logic, Peter Smith's "Teach Yourself Logic: A Study Guide (and other Book Notes)" is both free and wonderfully written. It can serve to bridge the gap that often exists between intro logic courses (usually taught in philosophy departments) and intro maths proof courses. Moreover, he takes incredible care to specify not only books but also section numbers for further study if the topics pique your interest.
A partial example of applying predicate logic
At this point, I feel like it'd be a good exercise for you to see what falls out of this and whether there's still the contradiction you previously thought there was. It would take a while for me to fully give an answer in predicate logic, but I'll demonstrate what doing so looks like for part of your equation $(3)$.
So what you've written as this:
$$\forall f
\left[ \forall g
\biggl[ \nexists \lim \limits_{x\to0}g(x) \implies \nexists \lim \limits_{x\to0}f(x)g(x)
\biggr] \iff \exists \lim \limits_{x\to0}f(x) \neq 0 \;
\right] \tag{3}$$
Is really the same as this:
$$ \forall f
\left[ \forall g
\biggl[ \lnot (\exists \lim \limits_{x\to0}g(x))
\implies
\lnot (\exists \lim \limits_{x\to0}f(x)g(x))
\biggr]
\iff \exists \lim \limits_{x\to0}f(x) \neq 0 \;
\right] \tag{3A}$$
By unpacking the antecedent "$\lnot ( \exists \lim \limits_{x\to0}g(x))$" on the left side of the biconditional, we get this:
$$ \forall f
\Biggl[ \forall g
\biggl[
\lnot
\biggl( \exists L_1
\Bigl(\forall \varepsilon > 0 \;
\bigl( \exists \delta_1 > 0 \:
( \; |x - 0 | < \delta_1 \implies |g(x) - L_1| < \varepsilon \;
)
\bigr)
\Bigr)
\biggr)
\implies
\lnot \exists \lim \limits_{x\to0}f(x)g(x)
\biggr]
\iff \exists \lim \limits_{x\to0}f(x) \neq 0 \; \Biggr] \tag{3B}$$
If we distribute the logical negation "$\lnot$", keeping in mind that it needs to pass through each level and that it swaps existential and universal quantifiers, we get this:
$$\forall f
\Biggl[ \forall g
\biggl[
\biggl( \forall L_1
\Bigl(\exists \varepsilon > 0 \;
\bigl( \forall \delta_1 > 0 \:
\lnot ( \; |x - 0 | < \delta_1 \implies |g(x) - L_1| < \varepsilon \;
)
\bigr)
\Bigr)
\biggr)
\implies
\lnot \exists \lim \limits_{x\to0}f(x)g(x)
\biggr]
\iff \exists \lim \limits_{x\to0}f(x) \neq 0 \; \Biggr] \tag{3C}$$
Note that an implication of the form "$p \implies q$" is equivalent to "$\lnot p \lor q$", so "$\lnot(p \implies q)$" is equivalent to "$\lnot(\lnot p \lor q)$". De Morgan's law yields "$p \land \lnot q$", so our negated implication "$\lnot (\; |x - 0 | < \delta_1 \implies |g(x) - L_1| < \varepsilon \; )$" is really "$|x - 0| < \delta_1 \land |g(x) - L_1| \geq \varepsilon$". With this in mind, (3C) becomes this:
$$ \forall f
\Biggl[ \forall g
\biggl[
\biggl( \forall L_1
\Bigl(\exists \varepsilon > 0 \;
\bigl( \forall \delta_1 > 0 \:
( \; |x - 0| < \delta_1 \; \land \;|g(x) - L_1| \geq \varepsilon \;
)
\bigr)
\Bigr)
\biggr)
\implies
\lnot \exists \lim \limits_{x\to0}f(x)g(x)
\biggr]
\iff \exists \lim \limits_{x\to0}f(x) \neq 0 \; \Biggr] \tag{3D}$$
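By the way, the propositional pieces used in that last step — $p \implies q \equiv \lnot p \lor q$ and $\lnot(p \implies q) \equiv p \land \lnot q$ — can be checked by a brute-force truth table:

```python
from itertools import product

# Brute-force truth table check of ¬(p → q) ≡ p ∧ ¬q, using p → q ≡ ¬p ∨ q.
for p, q in product((False, True), repeat=2):
    implies = (not p) or q
    assert (not implies) == (p and (not q))
```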
Concluding remarks
I'm going to stop there as I don't have the required time at the moment to finish this example, but I hope this has cleared up some of your confusion regarding how to apply predicate logic and distribute the negation across universal quantifiers. Feel free to comment, and I'll do my best to update this answer accordingly when I have some more time.
As a comment on your attempt, note that $R_{n,b,f}$ is not a polynomial. However, by hypothesis it can be written $R_{n,b,f}(x)=(x-b)^n r_{n,b,f}(x)$ where $\lim \limits_{x\to b} r_{n,b,f}(x)=0$.
The proof of (4) involves the corrected versions of both (c) and the remark at the end of (c). It doesn't require $a=0$ or $b=0$ but does assume $b=g(a)$.
Substitute $g(x)$ into $R_{n,b,f}(x) = (x-b)^n r_{n,b,f}(x)$ to obtain
$$B:=R_{n,b,f}(g(x)) = [g(x)-b]^n r_{n,b,f}(g(x)).$$ We can write
$$
[g(x)-b]^n=\big[P_{n,a,g}(x) + R_{n,a,g}(x)-b\big]^n = p\big(q(x) + R_{n,a,g}(x)\big)
$$ where $p$ and $q$ are the polynomials
$$p(x):=[x-b]^n,\qquad q(x):=P_{n,a,g}(x).$$
Since $R_{n,a,g}(x)/(x-a)^n\to0$ as $x\to a$, we can apply (c) to get
$$
[g(x)-b]^n=p(q(x)) + \overline R(x)
$$ where $\lim\limits_{x\to a}\frac{\overline R(x)}{(x-a)^n}=0$. Note now that $q$ is a polynomial in $x-a$, while $p$ is a polynomial in $x-b$ with $b:=g(a)=P_{n,a,g}(a)=q(a)$. So by the remark, $p(q(x))$ is a polynomial in $x-a$, all of whose terms have degree at least $n$.
Divide $B$ by $(x-a)^n$:
$$\frac B{(x-a)^n} = \left[
\frac{p(q(x))}{(x-a)^n} + \frac{\overline R(x)}{(x-a)^n}
\right]r_{n,b,f}(g(x))
$$
As $x\to a$, the first term in square brackets tends to a constant, the second term tends to zero, and $r_{n,b,f}(g(x))$ tends to zero because $g(x)$ tends to $g(a)=b$. This proves (4).
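If you want a quick numerical sanity check of (4) on a toy instance (my own illustrative choices, not Spivak's): take $a=0$, $g(x)=1+x$ so $b=g(0)=1$, $n=2$, and suppose $R_{2,1,f}(x)=(x-1)^3$, ie $r_{2,1,f}(x)=x-1$. Then $R_{2,1,f}(g(x))/(x-a)^2 = x^3/x^2 = x \to 0$:

```python
# Toy instance of (4): a = 0, g(x) = 1 + x (so b = g(0) = 1), n = 2,
# and R(x) = (x - 1)^3, ie r(x) = x - 1 with r(x) → 0 as x → 1.
# Then R(g(x)) / (x - a)^2 = x^3 / x^2 = x, which → 0 as x → 0.
g = lambda x: 1 + x
R = lambda x: (x - 1) ** 3

for x in (1e-2, 1e-4, -1e-4):
    assert abs(R(g(x)) / x ** 2 - x) < 1e-12
```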
As for (5), there's no need to invoke (c) again. The argument establishes two Taylor polynomials of degree $n$ for $f\circ g$ at $a$, both of whose remainders, when divided by $(x-a)^n$, tend to zero as $x\to a$, hence the two polynomials are equal.
Best Answer
There are many typographical errors in part (c). Looking at the intended application of part (c) and its remark, the most general true statement is:

If $p$ and $q$ are polynomials and $\lim\limits_{x\to a}\frac{R(x)}{(x-a)^n}=0$, then $p(q(x)+R(x))=p(q(x))+\overline R(x)$ where $\lim\limits_{x\to a}\frac{\overline R(x)}{(x-a)^n}=0$.
To prove (c), first write the Taylor expansion of the polynomial $p$ (of degree $N$) about a point $c$; it is exact, since $p$ is a polynomial: $$p(t) = \sum_{k=0}^N \frac{p^{(k)}(c)}{k!}(t-c)^k$$ Substitute $c=q(x)$ and $t=q(x)+R(x)$: $$p(q(x)+R(x))=\sum_{k=0}^N \frac{p^{(k)}(q(x))}{k!}(R(x))^k $$ Subtracting $p(q(x))$ (the $k=0$ term) starts the sum at $k=1$: $$\overline R(x):=p(q(x)+R(x))-p(q(x))=\sum_{k=1}^N \frac{p^{(k)}(q(x))}{k!}(R(x))^k $$ Divide through by $(x-a)^n$: $$\frac{\overline R(x)}{(x-a)^n}=\sum_{k=1}^N \frac{p^{(k)}(q(x))}{k!}\left[\frac{R(x)}{(x-a)^n}\right]^k(x-a)^{n(k-1)} $$ Now let $x\to a$. The quantity in square brackets tends to zero, while every other factor in the sum tends to a constant (possibly zero).
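Here's a toy instance of (c) you can check numerically (illustrative choices of mine): $a=0$, $n=2$, $p(t)=t^2$, $q(x)=1+x$, $R(x)=x^3$ (so $R(x)/x^2 = x \to 0$). Expanding gives $\overline R(x) = p(q(x)+R(x))-p(q(x)) = 2(1+x)x^3 + x^6$, and $\overline R(x)/x^2 = 2(1+x)x + x^4 \to 0$:

```python
# Toy instance of (c): a = 0, n = 2, p(t) = t^2, q(x) = 1 + x, R(x) = x^3,
# so R(x)/x^2 = x → 0.  Then
#   Rbar(x) = p(q(x) + R(x)) - p(q(x)) = 2(1 + x)x^3 + x^6,
# and Rbar(x)/x^2 = 2(1 + x)x + x^4 → 0 as x → 0.
p = lambda t: t * t
q = lambda x: 1 + x
R = lambda x: x ** 3

for x in (1e-1, 1e-3, -1e-3):
    Rbar = p(q(x) + R(x)) - p(q(x))
    assert abs(Rbar / x ** 2 - (2 * (1 + x) * x + x ** 4)) < 1e-8
    assert abs(Rbar / x ** 2) < 0.3
```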
The remark is easily proved: If $p(x)=\sum_{k=n}^N p_k(x-b)^k$ and $q(x)=\sum_{i=0}^Mq_i(x-a)^i$, then $$ p(q(x)) = \sum_{k=n}^N p_k(q(x)-b)^k. $$ But $b=q(a)=q_0$ so $$q(x)-b=q(x)-q_0=\sum_{i=1}^Mq_i(x-a)^i$$ is a polynomial in $x-a$ whose terms $(x-a)^i$ all have degree at least $1$. Therefore $p(q(x))$ is a polynomial in $x-a$, all of whose terms have degree at least $n$.
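And a toy instance of the remark (again my own illustrative choices): $a=0$, $q(x)=1+x+x^2$, so $b=q(0)=1$, and $p(y)=(y-1)^2$ with $n=2$. Then $p(q(x)) = (x+x^2)^2 = x^2(1+x)^2$, all of whose terms have degree at least $2$:

```python
# Toy instance of the remark: a = 0, q(x) = 1 + x + x^2, b = q(0) = 1,
# p(y) = (y - 1)^2 (a polynomial in y - b whose terms have degree ≥ n = 2).
# Then p(q(x)) = (x + x^2)^2 = x^2 (1 + x)^2: every term has degree ≥ 2 in x - a = x.
p = lambda y: (y - 1) ** 2
q = lambda x: 1 + x + x * x

for x in (0.5, 1e-2, -1e-2):
    assert abs(p(q(x)) - x ** 2 * (1 + x) ** 2) < 1e-12
```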