Real Analysis – Generalisation of Cauchy’s Mean Value Theorem

real-analysis

I apologise in advance if this is an elementary question more fitted for Math Stack Exchange. The reason why I have decided to post here is that the questions I am used to seeing on that site are not of the open-ended format of the one I am asking.

It is now the second time I have been studying Calculus (first self-taught, now in school) and we are going over the proof of Cauchy's mean value theorem (the precursor to l'Hopital's rule). I do understand the proof, and the intuitive explanation about parametrised curves in a plane, but I still think the statement of the theorem looks relatively obscure. Why are we considering a ratio and not something else? This led me to try and generalise, and this is where we get to my question.

I first tried finding a function $h: \mathbb{R}^2 \rightarrow \mathbb{R}$ such that for any functions $f,g: D \subset \mathbb{R} \rightarrow \mathbb{R}$ satisfying the hypotheses of Cauchy's mean value theorem, and for any interval $[a,b] \subset D$, there exists $x \in [a,b]$ such that $h(f'(x), g'(x))=h(f(b)-f(a), g(b)-g(a))$. After making a few tries and finding a few counterexamples, I realised this wasn't really in the spirit of a mean value theorem: we are trying to make an analogy, if we may use this term, between $f'(x)$ and $f(b)-f(a)$, while in both Lagrange's and Cauchy's mean value theorems the analogy is made between $f'(x)$ and $\frac{f(b)-f(a)}{b-a}$. So I started looking for $h$ such that there exists $x$ such that
$$
h(f'(x), g'(x))= h\left(\frac{f(b)-f(a)}{b-a}, \frac{g(b)-g(a)}{b-a}\right).
$$

But this didn't really lead me anywhere.

The question I'm asking is precisely this: can we say anything more about functions $h(x,y) \neq \frac{x}{y}$ satisfying these statements? Suppose we simplify even further, and consider, for example, only the functions $h_{\alpha,\beta}(x,y) = x^{\alpha}y^{\beta}$. Can we maybe prove that only those with $\alpha = k, \beta = -k$ for some $k$ work? (in addition, clearly, to those with $\alpha\beta = 0$) Is this even interesting to investigate?

Thanks in advance for helping me. My knowledge doesn't really go far beyond Calculus and Linear Algebra (say, Spivak and Axler's books) but I will try to understand your replies.

Best Answer

When I reviewed this question a few days ago, I thought there was something familiar in it, but I did not remember what it was: now I have remembered. This problem was fully solved by Alessandro Faedo in paper [1]: in his ZBMath review, Peter Bullen says

The author determines all the functions $F(X,Y)$ such that, for every $f(x)$ and $g(x)$ continuous in $[a, b]$ with $f^\prime(x)$ and $g^\prime(x)$ defined for $a<x<b$, there exists a point $\xi$, with $a<\xi<b$ such that $$ F\left(\frac{f(b)-f(a)}{b-a}, \frac{g(b)-g(a)}{b-a}\right) = F(f^\prime (\xi), g^\prime(\xi)) $$ If $F(X,Y)={X/Y}$ then we have the Cauchy mean value theorem.

Synopsis of Faedo's paper

Following Willie Wong's comment, I am adding here a short synopsis of the paper. Indeed, apart from being written in Italian, the exposition does not follow the now common explicit definition/explicit theorem paradigm: the mathematics is "embedded" almost seamlessly in the prose, so it is not easily understood by a foreigner. This is a common feature of several writings by Italian mathematicians of that generation.

§1. Basic definitions (pp. 489-490).
In this section, the author gives the basic definitions he will use throughout the paper.

Let $\Phi(X,Y)\in C^0(\Bbb R^2\setminus E)$ where $E$ is a singular set without interior points such that $$ E= E_1 \cup E_2 $$ where

  • $E_1$ is the set of points of $\Bbb R^2$ such that $$ \lim_{(X,Y)\to (X_o,Y_o)} \Phi(X,Y)=\pm\infty \quad\forall (X_o,Y_o)\in E_1 $$
  • $E_2$ is the set of points of $\Bbb R^2$ such that $$ \lim_{(X,Y)\to (X_o,Y_o)} \Phi(X,Y)\;\text{ does not exist }\; \forall (X_o,Y_o)\in E_2 $$

Definition. $\Phi(X,Y)\in C^0(\Bbb R^2\setminus E)$ is a Cauchy function if, for all $f, g$ continuous on a given closed interval $[a,b]$ and differentiable in its interior, such that

  1. the points $$ X = \frac{f(b)-f(a)}{b-a},\; Y= \frac{g(b)-g(a)}{b-a} $$ do not belong to $E_1$ and
  2. for each $x\in ]a,b[$ the points $X=f^\prime(x)$ and $Y=g^\prime(x)$ do not belong to $E_2$,

there exists at least one point $\xi\in ]a,b[$ such that $$ \Phi\left(\frac{f(b)-f(a)}{b-a}, \frac{g(b)-g(a)}{b-a}\right) = \Phi(f^\prime (\xi), g^\prime(\xi)). \label{1}\tag{1} $$

§2. A slight generalization of Cauchy's mean value theorem, and a particular class of Cauchy functions (pp. 490-492).
In this section the author first proves the following theorem:

Theorem (Generalized Cauchy's mean value theorem). If $f, g$ are continuous on a given closed interval $[a,b]$ and differentiable in its interior, and $h, k\in\Bbb R$ are two constants such that

  • $g(b)-g(a)+ k(b-a)\neq 0$
  • the equations $f^\prime(x)+h=0$ and $g^\prime(x)+k=0$ do not hold simultaneously at any $x\in]a, b[$

then there exists at least one point $\xi\in ]a,b[$ such that $$ \begin{split} \left[\frac{f(b)-f(a)}{b-a} +h\right]\cdot\left[\frac{g(b)-g(a)}{b-a}+k\right]^{-1} & = \frac{f(b)-f(a) + h(b-a)}{g(b)-g(a) + k(b-a)} \\ & = \frac{f^\prime (\xi)+h}{g^\prime(\xi)+k}. \end{split} $$ The proof is a straightforward application of the standard Cauchy mean value theorem. $\blacksquare$
The above theorem shows that $$ \Phi(X,Y)=(X+h)(Y+k)^{-1} $$ is a Cauchy function, and the author immediately

  • notes that, for any given continuous function $F:\Bbb R \to \Bbb R$, the function $\Phi_1(X,Y)=F((X+h)(Y+k)^{-1})$ is likewise a Cauchy function,
  • describes the simple $E_1$ and $E_2$ sets of these functions, and
  • by abuse of notation, calls $\Phi_1$ the class of Cauchy functions generated by $(X+h)(Y+k)^{-1}$, which is thus called the generating function of the class.
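
The "straightforward application" behind the generalized theorem can be spelled out as follows. This is a sketch of one standard route, and the review does not say whether Faedo argues exactly this way; the names $F_h$ and $G_k$ are introduced here only for illustration. Put $$ F_h(x)=f(x)+hx,\qquad G_k(x)=g(x)+kx. $$ Both are continuous on $[a,b]$ and differentiable in $]a,b[$, so the standard Cauchy theorem, in its product form, gives a point $\xi\in]a,b[$ with $$ [F_h(b)-F_h(a)]\,G_k^\prime(\xi) = [G_k(b)-G_k(a)]\,F_h^\prime(\xi), $$ that is $$ [f(b)-f(a)+h(b-a)]\,(g^\prime(\xi)+k) = [g(b)-g(a)+k(b-a)]\,(f^\prime(\xi)+h). $$ If $g^\prime(\xi)+k$ were zero, the first hypothesis $g(b)-g(a)+k(b-a)\neq 0$ would force $f^\prime(\xi)+h=0$ as well, which the second hypothesis excludes. Hence both $g^\prime(\xi)+k$ and $g(b)-g(a)+k(b-a)$ are nonzero and dividing by them yields the quotient form of the statement.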

He then notes that another way of representing functions of class $\Phi_1$ is to use arbitrary zero-degree homogeneous functions: if $F^\ast(z_1,z_2)$ is such a function, then for each real $c\neq 0$ we have $$ F^\ast(c z_1, c z_2) = F^\ast(z_1,z_2) = F^\ast\!\left(\frac{z_1}{z_2}, 1\right) $$ and we can represent any function of the class $\Phi_1$ as $$ \Phi_1(X,Y)=F^\ast(X+h,Y+k). $$

§3 and §4. The class $\Phi_2$ and an example of homogeneous function which is not Cauchy (pp. 492-493).
The author shows that linear functions of the form $$ \Phi_2(X,Y)=c_1 X + c_2 Y $$ are also Cauchy functions, and again he denotes by $\Phi_2$ the class generated by such functions.
In paragraph 4 he shows that there exists a positively homogeneous function, precisely the function $\Phi^\ast(X,Y)=(X^2+Y^2)^{\alpha/2}$, $\alpha\neq 0$, which is not a Cauchy function, thus proving that there are no other classes of Cauchy functions that can be constructed in the same way as $\Phi_1$.
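
One concrete pair exhibiting the failure of $\Phi^\ast$ is the following. It is offered only as an illustration and is not necessarily the example used in the paper. Take $[a,b]=[0,1]$, $f(x)=\sin x$ and $g(x)=-\cos x$. Then $f^\prime(x)^2+g^\prime(x)^2=\cos^2 x+\sin^2 x=1$, so $\Phi^\ast(f^\prime(x),g^\prime(x))=1$ for every $x$, while $$ \overline X=\frac{f(1)-f(0)}{1-0}=\sin 1,\qquad \overline Y=\frac{g(1)-g(0)}{1-0}=1-\cos 1,\qquad \overline X^2+\overline Y^2 = 2-2\cos 1 = 4\sin^2\tfrac12. $$ Hence $$ \Phi^\ast(\overline X,\overline Y)=\left(2\sin\tfrac12\right)^{\alpha}\approx 0.959^{\alpha}\neq 1\quad\text{for every }\alpha\neq 0, $$ and no $\xi$ can satisfy \eqref{1}. Geometrically, the chord of a circular arc is strictly shorter than the arc itself, so the mean of the velocity vector falls strictly inside the unit circle on which the velocity lives.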

§5. A necessary condition for a function of the form $\Phi(X,Y)=X-\psi(Y)$ to be a Cauchy function (pp. 493-495).
The core result of the paper is the following lemma:

Lemma. Let $\psi\in C^2(\Bbb R)$. A necessary condition for a function $\Phi(X,Y)= X-\psi(Y)$ to be a Cauchy function is that $\psi(Y)$ is linear.
Proof. The proof goes by contradiction: assume that $\Phi$ is a Cauchy function and suppose that there exists at least one number $Y_0$ for which $\psi^{\prime\prime}(Y_0)\neq 0$. Moreover, without loss of generality, let's assume that $\psi^{\prime\prime}(Y_0)>0$. Then the function $$ F(Y)=\psi(Y)-Y\psi^\prime(Y_0) $$ has a strict local minimum at $Y=Y_0$ since $$ F^\prime(Y_0)=0\; \wedge \; F^{\prime\prime}(Y_0)>0. $$ Now choose a function $g\in C^1([a,b])$ satisfying the following properties:

  • its values at the endpoints of $[a,b]$ satisfy the following relation $$ \frac{g(b)-g(a)}{b-a} =Y_0, $$
  • $|Y_0 -g^\prime(x)|<\delta$ for all $x\in[a,b]$ and a sufficiently small $\delta>0$,
  • $Y_0\neq g^\prime(x)$ for all $x$ belonging to a subset of $[a,b]$ of positive (Lebesgue) measure.

For example, we can define an indexed family of such functions as $$ g(x) =Y_0 x +\varepsilon \sin\frac{2\pi x}{b-a},\quad 0<\varepsilon<\frac{\delta\,(b-a)}{2\pi}. $$
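
As a quick check that this sine family meets the three requirements (a verification added here, using only the formula for $g$): differentiating gives $$ g^\prime(x) = Y_0 + \frac{2\pi\varepsilon}{b-a}\cos\frac{2\pi x}{b-a}, \qquad |g^\prime(x)-Y_0|\le \frac{2\pi\varepsilon}{b-a}, $$ so the bound $|Y_0-g^\prime(x)|<\delta$ holds as soon as $2\pi\varepsilon/(b-a)<\delta$. The arguments of the sine at $x=b$ and at $x=a$ differ by exactly $2\pi$, hence $$ g(b)-g(a) = Y_0(b-a) + \varepsilon\left[\sin\frac{2\pi b}{b-a}-\sin\frac{2\pi a}{b-a}\right] = Y_0(b-a), $$ which is the endpoint condition. Finally, $g^\prime(x)=Y_0$ only where the cosine vanishes, that is at finitely many points of $[a,b]$, so $g^\prime(x)\neq Y_0$ on a set of full measure.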

Finally define $f^\prime(x)=\psi(g^\prime(x))$: this implies that $$\DeclareMathOperator{\Dm}{\operatorname{d}\!} f(x)= f(a) + \int\limits_a^x \psi(g^\prime(t))\Dm t \implies \frac{f(b)-f(a)}{b-a} = \frac{1}{b-a}\int\limits_a^b \psi(g^\prime(x))\Dm x. $$ Then, for any $g$ satisfying the above properties, we have $$ \begin{split} \frac{1}{b-a}\int\limits_a^b F(g^\prime(x))\Dm x &= \frac{1}{b-a}\int\limits_a^b \psi(g^\prime(x))\Dm x - \frac{\psi^\prime(Y_0)}{b-a}\int\limits_a^b g^\prime(x)\Dm x\\ & = \frac{1}{b-a}\int\limits_a^b \psi(g^\prime(x))\Dm x - \psi^\prime(Y_0)\frac{g(b)-g(a)}{b-a}\\ &> \frac{1}{b-a}\int\limits_a^b F(Y_0)\Dm x = F(Y_0) =\psi(Y_0) -Y_0 \psi^\prime(Y_0) \\ &= \psi\left(\frac{g(b)-g(a)}{b-a}\right) - \psi^\prime(Y_0) \frac{g(b)-g(a)}{b-a} \end{split} $$ where the strict inequality holds because $Y_0$ is a strict local minimum of $F$, $g^\prime$ stays within $\delta$ of $Y_0$, and $g^\prime(x)\neq Y_0$ on a set of positive measure. This implies that $$ \begin{eqnarray} \frac{1}{b-a}\int\limits_a^b \psi(g^\prime(x))\Dm x & > & \psi\left(\frac{g(b)-g(a)}{b-a}\right)\\ &\Updownarrow &\\ \frac{f(b)-f(a)}{b-a} & > &\psi\left(\frac{g(b)-g(a)}{b-a}\right)\label{2}\tag{2} \end{eqnarray} $$ while $f^\prime(x) -\psi(g^\prime(x)) =0 $ for each $x\in [a,b]$. Hence the left-hand side of \eqref{1} is strictly positive while its right-hand side vanishes for every $\xi$, so $\Phi (X,Y) =X - \psi(Y)$ does not satisfy \eqref{1} and is not a Cauchy function, contradicting the hypothesis: therefore it must be $\psi^{\prime\prime}(Y)=0$ for all $Y\in\Bbb R$. $\blacksquare$
I said this is the core result of the paper since the main theorem, proved in the following section of the paper, follows from an application of the ideas developed in this lemma.
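
The construction in the lemma can also be checked numerically. The following is a minimal sketch under assumptions that are not taken from the paper: the helper name `mean_value_gap`, the choice $\psi(Y)=Y^2$, $Y_0=1$, $\varepsilon=0.01$, the interval $[0,1]$, and the use of a midpoint rule for the integrals. It computes the mean of $f^\prime=\psi(g^\prime)$ and compares it with $\psi$ evaluated at the mean of $g^\prime$.

```python
import math

def mean_value_gap(psi, y0, eps, a=0.0, b=1.0, n=20000):
    """Return (mean of f', psi(mean of g')) for the lemma's construction.

    g(x) = y0*x + eps*sin(2*pi*x/(b-a)) and f'(x) = psi(g'(x)).
    Means over [a, b] are approximated with the midpoint rule.
    """
    w = 2.0 * math.pi / (b - a)
    step = (b - a) / n
    sum_f = 0.0
    sum_g = 0.0
    for i in range(n):
        x = a + (i + 0.5) * step
        g_prime = y0 + eps * w * math.cos(w * x)
        sum_f += psi(g_prime)   # f'(x) = psi(g'(x)) by construction
        sum_g += g_prime
    mean_f = sum_f / n          # approximates (f(b) - f(a)) / (b - a)
    mean_g = sum_g / n          # approximates (g(b) - g(a)) / (b - a) = y0
    return mean_f, psi(mean_g)

# Illustrative (hypothetical) choice: psi(Y) = Y**2, Y_0 = 1, eps = 0.01.
mean_f, psi_of_mean_g = mean_value_gap(lambda y: y * y, y0=1.0, eps=0.01)
gap = mean_f - psi_of_mean_g    # value of Phi at the two difference quotients
print(gap)
```

For this particular $\psi$ the gap can be computed by hand: the mean of $(1+2\pi\varepsilon\cos 2\pi x)^2$ over $[0,1]$ is $1+2\pi^2\varepsilon^2$, so the gap is $2\pi^2\varepsilon^2\approx 1.97\cdot 10^{-3}>0$, whereas $\Phi(f^\prime(x),g^\prime(x))=f^\prime(x)-\psi(g^\prime(x))$ is identically zero.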

§6. The main theorem (pp. 496-497).
The characterization of Cauchy functions given by Faedo is expressed by the following

Theorem. Let $\Phi(X,Y)\in C^2(\Bbb R^2\setminus E)$ where $E$ is defined as above. Then a necessary condition for $\Phi(X,Y)$ to be a Cauchy function is that each level curve defined by the equation $$ \Phi(X,Y)=c,\quad c=\text{const.}\label{3}\tag{3} $$ is a piecewise linear curve $\Gamma_c$ whose vertices are points $(X,Y)$ at which, simultaneously, $$ \frac{\partial\Phi}{\partial X}=0\;\wedge\;\frac{\partial\Phi}{\partial Y}=0. $$
Proof. Let $(X_0,Y_0)$ be a point in $\Bbb R^2\setminus E$ for which the partial derivatives $$ \dfrac{\partial\Phi}{\partial X}\bigg|_{(X_0,Y_0)}\text{ and }\dfrac{\partial\Phi}{\partial Y}\bigg|_{(X_0,Y_0)} $$ are not simultaneously equal to zero. Without loss of generality, let $$ \left.\frac{\partial\Phi}{\partial X}\right|_{(X_0,Y_0)}\neq 0. $$ Since $\Phi(X_0,Y_0)=c$, equation \eqref{3} defines a function $X=\psi(Y)$ with $X_0=\psi(Y_0)$ and $$ \Phi(\psi(Y),Y)-c =0 $$ at least in a neighborhood of $Y_0$, thus we can find a $\delta >0$ such that if $|Y-Y_0| <\delta$ then also $|X_0-\psi(Y)|<\delta$. Moreover, since $\frac{\partial\Phi}{\partial X}\neq 0$, the function $X\mapsto\Phi(X,Y_0)$, considered in a neighborhood of $X=X_0$, takes the value $c$ only at $X=X_0$. From here on it is possible to proceed as in the proof of the lemma in §5: assume that $\psi^{\prime\prime}(Y_0)>0$, choose $f$ and $g$ as in that proof, and put $$ \overline X = \frac{f(b)-f(a)}{b-a},\; \overline Y= \frac{g(b)-g(a)}{b-a}. $$ We have that $$ \frac{g(b)-g(a)}{b-a} = Y_0 $$ and moreover $$ \frac{f(b)-f(a)}{b-a} = \frac{1}{b-a}\int\limits_a^b \psi(g^\prime(x))\Dm x $$ with $|X_0-\psi(g^\prime(x))|<\delta$, and thus $$ \Bigg|\frac{f(b)-f(a)}{b-a} - X_0\Bigg|=\Bigg| \frac{1}{b-a}\int\limits_a^b [\psi(g^\prime(x))-X_0]\Dm x\Bigg|<\delta. $$ This implies $|\overline{X}-X_0|<\delta$ and $|\overline{Y}-Y_0|<\delta$ and, due to the arbitrariness of $\delta$, we have $$ \Phi(\overline{X},\overline{Y}) \neq \Phi(X_0,Y_0) = c. $$ Equation \eqref{2} of §5 thus proves that $$ \Phi\left(\frac{f(b)-f(a)}{b-a}, \frac{g(b)-g(a)}{b-a}\right) \neq \Phi(f^\prime (x), g^\prime(x))=c\quad\forall x\in[a,b], $$ thus it must be $\psi^{\prime\prime}(Y_0)=0$, for otherwise $\Phi(X,Y)$ is not a Cauchy function. Thus $\psi(Y)$ must be a linear function, and this implies that $$ \psi^\prime(Y) = \text{const.} = - \frac{{\partial \Phi}/{\partial Y}}{{\partial \Phi}/{\partial X}} $$ thus ${{\partial \Phi}/{\partial X}}\neq 0$ implies ${{\partial \Phi}/{\partial Y}}\neq 0$, therefore $X=\psi(Y)$ is defined for every $X$ except those for which ${\partial \Phi}/{\partial X} = {\partial \Phi}/{\partial Y} = 0$. $\blacksquare$

§7 and §8. Properties of the piecewise linear level curve $\Gamma_c$ and rational Cauchy functions.

In §7 Faedo proves that at the points $(X_0,Y_0)$ where the gradient of the Cauchy function $\Phi$ does not vanish, two level curves $\Gamma_c$ and $\Gamma_{c_0}$ with $c\neq c_0$ neither intersect each other nor self-intersect. In the last paragraph, the author shows that the only rational Cauchy functions are those belonging to the classes $\Phi_1$ defined in §2 and $\Phi_2$ defined in §3.

Reference

[1] Sandro Faedo, "Sul teorema di Cauchy degli incrementi finiti" [On Cauchy's theorem about finite increments] (Italian), Rendiconti di Matematica, VI Serie 10 (1977), 489-499 (1978), MR0480904, Zbl 0384.26002.