[Math] Integrating powers without much calculus

ho.history-overview quadrature teaching

I'll jump into the question and then back off into qualifications and context.

Using the definition of a definite integral as the limit of Riemann sums, what is the best way (or the very good ways) to establish the results $\int_a^bx^pdx=\frac{b^{p+1}-a^{p+1}}{p+1}$ without building a general theory of integrals?

Context: For better or worse, a common sequence in teaching integral calculus for the first or second time is to define the definite integral as a limit of Riemann sums. One then notes that this (like the limit definition of derivatives) is effective for proving theorems but not very practical for specific calculations. So soon one demonstrates or implies that the definition is valid and then gets to the Fundamental Theorem of Calculus. However it is traditional to use the definition to evaluate $\int_a^bx^pdx$ for $p=0,1,2$ and perhaps $p=3$ using the lovely sum of cubes formula.

It might be tempting to prove the formula in greater generality without using the fundamental theorem (most students are less excited at this prospect than one might expect!). In my early days as a TA I came up with an approach which I thought was great. The students were not impressed and I have not used it since. Anyway, I have not seen it elsewhere although I am confident it is nothing novel. I am not going to reveal it right away just to see if it shows up. I realize that is questionable manners here on MO but I will put it up in a day or two, I just want to see what shows up first.

Here is a very brief sketch of two approaches I have seen:

‍1) Let $S_p(n)=\sum_{k=0}^nk^p$. The explicit formulas for $p=0,1,2$ and also $p=3$ are attractive and not bad to prove by induction. Given an explicit formula for $S_p$ one easily evaluates the usual equal subinterval sums for $\int_0^1x^pdx$ and then extends to $\int_a^b$. But $S_p$ gets more tedious for larger $p$. A clever method of Pascal allows one to use strong induction, the binomial theorem and telescoping sums to derive explicit formulas for $S_{p}$ for larger values of $p$, limited only by one's patience and stamina:

Take $(k+1)^{p+1}-k^{p+1}=\sum_{j=1}^{p+1}\binom{p+1}{j}k^{p+1-j}$ and sum for $k$ from $0$ to $n$ to get $(n+1)^{p+1}-0^{p+1}=\sum_{j=1}^{p+1}\binom{p+1}{j}S_{p+1-j}(n)$. Since we know everything except $S_p$, the rest is algebra! This is quickly unpleasant and the final results are not as aesthetic as the first cases. HOWEVER, for the desired application we only need to establish that $S_p(n)=\frac{n^{p+1}}{p+1}+\frac{n^p}{2}+O(n^{p-1})$. That is not hard and shows that $n$ subdivisions yield $\frac{1}{p+1}-\frac{1}{2n} \lt \int_0^1x^pdx \lt \frac{1}{p+1}+\frac{1}{2n}$.

Notes: This is valid for $p$ a non-negative integer. Knowing enough about Bernoulli numbers allows explicit formulas but I am interested in fairly elementary methods. I think that one involves the series expansion for $e^x$.
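Pascal's recurrence can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `power_sums` and `riemann_bounds` are made up here, the convention $0^0=1$ is used so that $S_0(n)=n+1$, and `riemann_bounds` assumes $p \ge 1$.

```python
from fractions import Fraction
from math import comb


def power_sums(p_max, n):
    """Return [S_0(n), ..., S_{p_max}(n)], where S_p(n) = sum_{k=0}^n k^p.

    Uses (n+1)^(p+1) = sum_{j=1}^{p+1} C(p+1, j) * S_{p+1-j}(n) and solves
    for the j = 1 term, (p+1) * S_p(n). Convention: 0^0 = 1.
    """
    sums = []
    for p in range(p_max + 1):
        # Contribution of the already-known lower power sums (j = 2 .. p+1).
        known = sum(comb(p + 1, j) * sums[p + 1 - j] for j in range(2, p + 2))
        # The division by p + 1 is exact because S_p(n) is an integer.
        sums.append(((n + 1) ** (p + 1) - known) // (p + 1))
    return sums


def riemann_bounds(p, n):
    """Left and right equal-width Riemann sums for x^p on [0, 1], p >= 1."""
    lower = Fraction(power_sums(p, n - 1)[p], n ** (p + 1))
    upper = Fraction(power_sums(p, n)[p], n ** (p + 1))
    return lower, upper
```

For example, `riemann_bounds(3, 100)` returns two exact fractions that bracket $\frac14$ and differ by exactly $\frac{1}{100}$, since the two sums differ only by the term $n^p/n^{p+1}$.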

‍2) Due to Fermat: Partition into subintervals using points forming a geometric rather than arithmetic progression. I've seen this in two forms:

2.1) Choose $0 \lt a \lt b$ and divide using $a \lt ar \lt ar^2 \lt \cdots\lt ar^N=b$ so $r=\sqrt[N]{b/a}$. The widths of the intervals form a geometric progression of common ratio $r$. The values of $x^p$ at the points of division form a geometric progression of common ratio $r^p$. Thus the sum of rectangle areas using left endpoints gives as a lower bound for $\int_a^bx^p$ the geometric series with $N$ terms, first term $a^{p+1}(r-1)$ and ratio $r^{p+1}$. Using right-hand endpoints gives a similar upper bound with first term $a^{p+1}(r-1)r^p$. With very little effort one arrives at

$$ \frac{(b^{p+1}-a^{p+1})(r-1)}{r^{p+1}-1} \lt \int_a^bx^pdx \lt \frac{(b^{p+1}-a^{p+1})(r-1)r^p}{r^{p+1}-1}.$$

If $p+1$ is a positive integer we have

$$ \frac{b^{p+1}-a^{p+1}}{1+r+r^2+\cdots+r^p} \lt \int_a^bx^pdx \lt \frac{(b^{p+1}-a^{p+1})r^p}{1+r+r^2+\cdots+r^p}.$$

Now let $N$ go to infinity sending $r$ to $1$ and squeezing to $\int_a^bx^pdx=\frac{b^{p+1}-a^{p+1}}{p+1}$.

This particular approach requires $0 \lt a$. It is an easy extra step to extend the result to rational values of $p$ (except for the challenging $p=-1$) using $\frac{r^{p/q}-1}{r-1}=\frac{u^p-1}{u-1}/\frac{u^q-1}{u-1}$ for $u=r^{1/q}$.
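A rough numerical sketch of 2.1, assuming $p \gt 0$ so that the left-endpoint sum is the lower bound. The function names `fermat_sums` and `fermat_closed_form` are hypothetical, and floating point is used, so agreement is only up to rounding.

```python
def fermat_sums(p, a, b, N):
    """Left- and right-endpoint sums for x^p over the geometric partition
    a < a*r < ... < a*r^N = b, with 0 < a < b."""
    r = (b / a) ** (1.0 / N)
    points = [a * r ** i for i in range(N + 1)]
    left = sum(u ** p * (v - u) for u, v in zip(points, points[1:]))
    right = sum(v ** p * (v - u) for u, v in zip(points, points[1:]))
    return left, right


def fermat_closed_form(p, a, b, N):
    """The same two sums after summing the geometric series (p != -1)."""
    r = (b / a) ** (1.0 / N)
    base = (b ** (p + 1) - a ** (p + 1)) * (r - 1) / (r ** (p + 1) - 1)
    return base, base * r ** p


if __name__ == "__main__":
    for N in (10, 100, 1000):
        # Both values approach (2^3 - 1^3) / 3 as N grows.
        print(N, fermat_sums(2, 1.0, 2.0, N))
```

The same functions accept a rational exponent such as `p = 0.5`, which matches the remark that the method extends beyond integer $p$.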

2.2) Similar except now divide the interval $[0,b]$ using a value $0 \lt r \lt 1$ and infinitely many points $ \dotsb br^3 \lt br^2 \lt br \lt b$. Now one has infinite geometric series and the rest proceeds similarly to before letting $r$ increase to $1$.
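A minimal sketch of how 2.2 plays out, assuming $p \gt 0$ so that the right endpoint of each subinterval gives the larger height (the index $k$ is introduced here for illustration). On $[br^{k+1},br^k]$ the width is $br^k(1-r)$, so summing the two infinite geometric series gives

$$ \frac{b^{p+1}(1-r)r^p}{1-r^{p+1}} \lt \int_0^bx^pdx \lt \sum_{k=0}^{\infty}(br^k)^p\,br^k(1-r)=\frac{b^{p+1}(1-r)}{1-r^{p+1}}.$$

When $p$ is a positive integer, $\frac{1-r^{p+1}}{1-r}=1+r+\cdots+r^p$ tends to $p+1$ as $r$ increases to $1$, so both bounds tend to $\frac{b^{p+1}}{p+1}$.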

So that is the flavor of what I am asking about. I do not think this is a big list question unless there are a large number of approaches I have not seen.

CONTINUED To recap, we already know the answer, $\frac{b^{p+1}-a^{p+1}}{p+1}$, for the area $A$ of the region under $x^p$ for $a \le x \le b$; we just want to prove it. (Assume for ease that $0 \lt a$.) A partition $P$ of $[a,b]$ is a sequence $a=x_0 \lt x_1 \lt \cdots \lt x_n=b$. The mesh $m(P)$ of $P$ is $\max(x_{i}-x_{i-1}).$ (There is rarely a reason to have unequal intervals, but Fermat gave one.) We use the sub-intervals, in two ways, as the bases of an assemblage of rectangles with heights determined by the endpoints. Since $x^p$ is monotonic, one is covered by the region and the other covers it. So the two areas provide a lower and an upper bound.

Lower (left) and upper (right) Riemann sums

$$ \sum_1^nx_{i-1}^p(x_i-x_{i-1}) \lt A \lt \sum_1^nx_{i}^p(x_i-x_{i-1}).$$ If we manage to compute or bound these bounds and show that, when the mesh goes to zero, they have a common limit (the one we expect), we are done. The actual bounds we compute are of value only for the interesting, but secondary, topic of speed of convergence. And anyway, if $m(P) \lt \epsilon$ then the difference between the two bounds is less than $(b-a)(b^p-(b-\epsilon)^p),$ which converges to zero. (For $p \lt 0$ use $a^p-(a+\epsilon)^p$.)

So I propose to instead assign to each sub-interval $[u,v]$ the height $h(u,v)=\frac{v^{p+1}-u^{p+1}}{(p+1)(v-u)}$ and "compute" $\sum_1^nh(x_{i-1},x_i)(x_i-x_{i-1})$ which immediately collapses to, of course, $\frac{b^{p+1}-a^{p+1}}{p+1}$.
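The collapse can be checked exactly on an arbitrary partition. The sketch below assumes integer $p \ge 0$ and rational partition points so that `Fraction` arithmetic is exact; the names `h` and `telescoped_sum` are illustrative only.

```python
from fractions import Fraction
import random


def h(u, v, p):
    """Height assigned to the sub-interval [u, v] for the integrand x^p."""
    return (v ** (p + 1) - u ** (p + 1)) / ((p + 1) * (v - u))


def telescoped_sum(points, p):
    """Sum of h(x_{i-1}, x_i) * (x_i - x_{i-1}) over consecutive points."""
    return sum(h(u, v, p) * (v - u) for u, v in zip(points, points[1:]))


if __name__ == "__main__":
    random.seed(0)
    # A random, unequal partition of [1, 3] with rational points.
    interior = sorted({Fraction(random.randint(101, 299), 100) for _ in range(20)})
    points = [Fraction(1)] + interior + [Fraction(3)]
    for p in range(6):
        expected = Fraction(3 ** (p + 1) - 1, p + 1)
        print(p, telescoped_sum(points, p) == expected)
```

The result does not depend on how the interior points are chosen, which is the point of the construction.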

Midpoint Riemann sums

Establishing that this has any relevance requires showing that the height $h(u,v)$ is between $u^p$ and $v^p$. This is easy in practice if one simplifies. If you simplify first, then the whole thing looks like magic until you see what was done.

So for $p=5$, obviously $$u^5 \lt \frac{v^5+v^4u+v^3u^2+v^2u^3+vu^4+u^5}{6} \lt v^5.$$ OK, so what? Why not use the average, the geometric mean or $\left(\frac{u+v}{2}\right)^5$? Well, $(v-u)h(u,v)=\frac{v^6-u^6}{6}$ so $\sum_1^nh(x_{i-1},x_i)(x_i-x_{i-1})$ collapses to $\frac{b^6-a^6}{6}$.
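The pattern behind the $p=5$ case can be sketched in general, assuming $p$ is a positive integer and $0 \lt u \lt v$ (the index $j$ is introduced here for illustration). Factoring $v^{p+1}-u^{p+1}$ gives

$$ h(u,v)=\frac{v^{p+1}-u^{p+1}}{(p+1)(v-u)}=\frac{1}{p+1}\sum_{j=0}^{p}v^{p-j}u^{j}.$$

Each of the $p+1$ terms satisfies $u^p \le v^{p-j}u^j \le v^p$, the term $j=0$ is strictly larger than $u^p$ and the term $j=p$ is strictly smaller than $v^p$, so their average lies strictly between $u^p$ and $v^p$.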

About as easily

\begin{gather*}
\frac{1}{\sqrt{v}} \lt \frac{2}{\sqrt{u}+\sqrt{v}} \lt \frac{1}{\sqrt{u}} \\
\frac{1}{v^2} \lt \frac{1}{uv} \lt \frac{1}{u^2} \\
\frac{1}{v^4} \lt \left( \frac{1}{v^3u}+\frac{1}{v^2u^2}+\frac{1}{vu^3}\right)/3 \lt \frac{1}{u^4}.
\end{gather*}

It is slightly more fun to show that

$$\sqrt{u} \lt \frac{2(v+\sqrt{vu}+u)}{3(\sqrt{v}+\sqrt{u})} \lt \sqrt{v}.$$
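These four cases ($p=-\frac12,-2,-4,\frac12$) can be spot-checked numerically. The sketch below is only a floating-point sanity check, not a proof; the names `h`, `SIMPLIFIED` and `check` are made up for this example, and $0 \lt u \lt v$ is assumed.

```python
import math


def h(u, v, p):
    """Height (v^(p+1) - u^(p+1)) / ((p+1)(v-u)), for p != -1 and u != v."""
    return (v ** (p + 1) - u ** (p + 1)) / ((p + 1) * (v - u))


# Simplified forms of h(u, v) taken from the displayed inequalities.
SIMPLIFIED = {
    -0.5: lambda u, v: 2 / (math.sqrt(u) + math.sqrt(v)),
    -2: lambda u, v: 1 / (u * v),
    -4: lambda u, v: (1 / (v**3 * u) + 1 / (v**2 * u**2) + 1 / (v * u**3)) / 3,
    0.5: lambda u, v: 2 * (v + math.sqrt(v * u) + u) / (3 * (math.sqrt(v) + math.sqrt(u))),
}


def check(u, v):
    """Return True if every simplified form equals h and lies between u^p and v^p."""
    for p, form in SIMPLIFIED.items():
        low, high = sorted((u ** p, v ** p))
        value = h(u, v, p)
        if not math.isclose(value, form(u, v), rel_tol=1e-9):
            return False
        if not low < value < high:
            return False
    return True


if __name__ == "__main__":
    print(check(1.5, 2.0), check(0.3, 0.31))
```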

SO:

Is this line of argument valid? Is it interesting? Have you seen it before?

To its credit I'll say that it does not show preference to any particular partition and uses nothing more complex than the two historic treatments above (although maybe it benefits from a modern frame of reference). Also, rather than carefully converging to the correct answer as the partition evolves, it just starts there and stays unaffected. I don't immediately see that it can be applied to any other definite integrals. But the case of $x^p$ has a certain primary importance.

Best Answer

Here is a very simple proof for nonnegative integer $p$.

By elementary combinatorial reasoning, we have $$ \sum_{j=0}^{n-1} \binom{j}{p} = \binom{n}{p+1}, $$ which is the same as $$ \sum_{j=0}^{n-1} j(j-1)\cdots(j-p+1) = \frac{n(n-1)\cdots(n-p)}{p+1}.$$ After scaling that becomes a lower bound for $\int_0^1 x^p dx$.

Similarly, $$ \sum_{j=1}^{n} j(j+1)\cdots(j+p-1) = \frac{n(n+1)\cdots(n+p)}{p+1},$$ which scales to an upper bound.

Now let $n\to\infty$.
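A small exact check of this argument, written as a sketch rather than as the answer's own method. It assumes the rising-factorial sum is taken over $j=1,\dots,n$ so that it dominates the right Riemann sum, and the names `falling_sum`, `rising_sum` and `scaled_bounds` are hypothetical.

```python
from fractions import Fraction
from math import prod


def falling_sum(p, n):
    """sum_{j=0}^{n-1} j(j-1)...(j-p+1); each term is at most j^p."""
    return sum(prod(range(j - p + 1, j + 1)) for j in range(n))


def rising_sum(p, n):
    """sum_{j=1}^{n} j(j+1)...(j+p-1); each term is at least j^p."""
    return sum(prod(range(j, j + p)) for j in range(1, n + 1))


def scaled_bounds(p, n):
    """Divide both sums by n^(p+1) to bracket the integral of x^p on [0, 1]."""
    scale = n ** (p + 1)
    return Fraction(falling_sum(p, n), scale), Fraction(rising_sum(p, n), scale)


if __name__ == "__main__":
    for n in (10, 100, 1000):
        lower, upper = scaled_bounds(3, n)
        # Both values approach 1/4 as n grows.
        print(n, float(lower), float(upper))
```

The closed forms are $\frac{n(n-1)\cdots(n-p)}{p+1}$ for the falling sum and $\frac{n(n+1)\cdots(n+p)}{p+1}$ for the rising sum, so after scaling both are $\frac{1}{p+1}$ times a product of factors $1\pm\frac{i}{n}$.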
