For every real $x \gt 0$ and every integer $n \gt 0$ there is one and only one real $y$ such that $y^n = x$.

real-analysis

I'm working through Rudin's proof of this theorem (1.21 in Principles of Mathematical Analysis) and I got stuck at a certain part. Here's the proof up until that part:

Proof.

That there is at most one such $y$ is clear since $0 \lt y_1 \lt y_2$ implies $y_1^n \lt y_2^n$.

Let $E$ be the set consisting of all positive real numbers $t$ such that $t^n \lt x$.

If $t = \frac{x}{1+x}$ then $0 \lt t \lt 1$ (I get this part since $1+x \gt x$ and $x \gt 0$). Hence $t^n \lt t \lt x$. (This is where Rudin lost me. First, how can $t$ be strictly greater than $t^n$ when that fails for $n=1$? And is $t \lt x$ because $x = t(1+x)$?)

Help with this is greatly appreciated, thank you.

Best Answer

I grabbed my copy of "Principles of Mathematical Analysis" and I believe your confusion is justified. Rudin made a (small) mistake in the proof.

Rudin is simply trying to show that $E$ is non-empty and bounded above in this first part of the proof, which is hinted at by the fact that $t$ plays no further part afterwards. To that end he is trying to construct a number $t$ which would qualify to live in $E$. He goes ahead and constructs this $t$ as follows: $$ t = \frac{x}{x+1}. $$ From this we already have that $t < x$, since $0 < \frac{1}{x+1} < 1$, but we want $t^n < x$. Luckily for us, from the way $t$ has been constructed we know $0 < t < 1$. This means (and here is where Rudin made his mistake), $t^n \leq t < x$. The equality happens exactly when $n = 1$ as you pointed out. We still have $t^n < x$ regardless, and so the conclusion is still valid: $E$ is non-empty.
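Written out step by step, the chain looks like this. The numerical check afterwards uses $x = 3$, which is my own illustrative choice and not a value from Rudin:

$$
\begin{aligned}
0 < t < 1 \ &\implies\ 0 < t^{n-1} \le 1 \quad (\text{with equality exactly when } n = 1) \\
&\implies\ t^{n} = t \cdot t^{n-1} \le t, \\
t = \frac{x}{1+x},\ 1 + x > 1 \ &\implies\ t < x, \\
\text{hence}\quad t^{n} \le t < x \ &\implies\ t \in E.
\end{aligned}
$$

For example, with $x = 3$ we get $t = \tfrac34$. For $n = 2$ this gives $t^2 = \tfrac{9}{16} < \tfrac34 < 3$, and for $n = 1$ it gives $t^1 = \tfrac34 = t < 3$, so in both cases $t^n < x$.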

He makes the same mistake, again forgetting about $n = 1$, when he goes about showing that $E$ is bounded above: a strict inequality must be replaced by a non-strict one. However, just as in the case above, we are rescued by the fact that $t^n \geq t > x$ (where now $t > 1 + x$) still implies $t^n > x$.
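The upper-bound step can be sketched the same way, assuming (as I read the proof) that it starts from an arbitrary $t > 1 + x$. The numbers in the check are again only illustrative:

$$
\begin{aligned}
t > 1 + x > 1 \ &\implies\ t^{n-1} \ge 1 \quad (\text{with equality exactly when } n = 1) \\
&\implies\ t^{n} = t \cdot t^{n-1} \ge t, \\
t > 1 + x \ &\implies\ t > x, \\
\text{hence}\quad t^{n} \ge t > x \ &\implies\ t \notin E.
\end{aligned}
$$

So no $t > 1 + x$ belongs to $E$, which means $1 + x$ is an upper bound of $E$. For instance, with $x = 3$ and $t = 5$: for $n = 2$ we have $25 \ge 5 > 3$, and for $n = 1$ we have $5 = 5 > 3$.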

However, there is also a much larger mistake: the theorem is only true if we also require $y > 0$. Without that requirement, uniqueness fails for even $n$, since $(-y)^n = y^n$. I was honestly surprised Rudin didn't specify this in his statement of the theorem, given that he uses it in the proof.
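A small counterexample shows why positivity is needed. The values $x = 4$, $n = 2$ are chosen purely for illustration:

$$
2^2 = 4 = (-2)^2, \qquad \text{so both } y = 2 \text{ and } y = -2 \text{ satisfy } y^n = x.
$$

Only $y = 2$ is positive, so uniqueness is restored once the statement is restricted to $y > 0$. This matches the uniqueness argument in the proof, which compares $0 \lt y_1 \lt y_2$ and therefore only ever considers positive candidates.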