Step (B) isn't really by the induction hypothesis; it's the definition of the weak derivative $D^\gamma$ (a.k.a. "formal integration by parts").
Step (C) is by the induction hypothesis, distributing $D^\gamma$ according to the Leibniz rule.
To understand (D), split the sum in (C) in two, express the first one in terms of $\rho = \sigma + \gamma$, and then rename the index $\rho$ as $\sigma$:
$$\begin{aligned}
&\sum_{\sigma\le \beta} {\beta \choose \sigma} D^\rho \zeta D^{\alpha - \rho} u + \sum_{\sigma\le \beta} {\beta \choose \sigma}D^\sigma \zeta D^{\alpha - \sigma} u \\
&= \sum_{\gamma \le \rho\le \alpha} {\beta \choose \rho-\gamma} D^\rho \zeta D^{\alpha - \rho} u + \sum_{\sigma\le \beta} {\beta \choose \sigma}D^\sigma \zeta D^{\alpha - \sigma} u \\
&= \sum_{\gamma\le \sigma\le \alpha} {\beta \choose \sigma-\gamma} D^\sigma \zeta D^{\alpha - \sigma} u + \sum_{\sigma\le \beta} {\beta \choose \sigma}D^\sigma \zeta D^{\alpha - \sigma} u
\end{aligned}$$
It remains to use the aforementioned identity
$\displaystyle {\beta \choose \sigma-\gamma} + {\beta \choose \sigma} = { \alpha \choose \sigma}$, which can be proved by recalling that the multi-index binomial coefficient ${ \alpha \choose \sigma}$ is the coefficient of $x^\sigma$ in $$(1+x)^\alpha = (1+x)^\beta(1+x)^\gamma = (1+x)^\beta + x^\gamma (1+x)^\beta.$$
Here I am (ab)using notation, as is customary with multi-indices: e.g., $$(1+x)^\alpha = \prod_i (1+x_i)^{\alpha_i}.$$ Since $|\gamma|=1$, the factor $(1+x)^\gamma$ is linear: it is simply $1+x_i$, where $i$ is the coordinate with $\gamma_i=1$.
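For readers who want to convince themselves of the identity before proving it, here is a small brute-force check. It is only a sketch: the helper names `multi_binom` and `check_identity` are made up for this illustration, and it assumes the usual convention that a multi-index binomial coefficient is zero whenever some entry of the lower index is negative or exceeds the corresponding upper entry (which is what lets the two sums be combined into one sum over $\sigma \le \alpha$).

```python
from itertools import product
from math import comb


def multi_binom(top, bottom):
    """Product of ordinary binomial coefficients over the coordinates.

    Convention (an assumption of this sketch): the value is 0 if any
    entry of `bottom` is negative or larger than the matching entry of `top`.
    """
    result = 1
    for t, b in zip(top, bottom):
        if b < 0 or b > t:
            return 0
        result *= comb(t, b)
    return result


def check_identity(beta, i):
    """Check C(beta, sigma - gamma) + C(beta, sigma) == C(alpha, sigma)
    for every sigma <= alpha, where gamma = e_i and alpha = beta + gamma."""
    gamma = tuple(1 if j == i else 0 for j in range(len(beta)))
    alpha = tuple(b + g for b, g in zip(beta, gamma))
    for sigma in product(*(range(a + 1) for a in alpha)):
        shifted = tuple(s - g for s, g in zip(sigma, gamma))
        lhs = multi_binom(beta, shifted) + multi_binom(beta, sigma)
        if lhs != multi_binom(alpha, sigma):
            return False
    return True
```

Calling, say, `check_identity((2, 1, 3), 0)` loops over all $\sigma \le \alpha = (3,1,3)$; in the coordinate $i$ the identity is just Pascal's rule, and the other coordinates contribute the same factor on both sides.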
Here's the approach that I used on this exercise. I feel like there's probably a simpler way, but I didn't see one without using Fubini's Theorem.
First, use Hölder's inequality to show that $f_n \in L^p$, $f \in L^p$, and $f_n \to f$ in $L^p$ guarantee that $F_n \to F$ pointwise.
In addition, show that $f_1 \le f_2$ pointwise guarantees $F_1 \le F_2$ pointwise.
Verify using Hölder's inequality that $F(x)$ is continuous in $x$ for $x \gt 0$.
Rudin's hint for part (a) is to assume first that $f$ is continuous, non-negative, and compactly supported, and to show
$$
\int_0^{\infty}F^p\ dx = -p\int_0^{\infty}F^{p-1}(f-F) \ dx
$$
Moving the $p\int_0^{\infty}F^p\ dx$ term to the left-hand side and dividing by $p-1$, rewrite this as:
$$
\label{a}\tag{*} \int_0^{\infty}F^p\ dx = \frac{p}{p-1}\int_0^{\infty}F^{p-1}f \ dx
$$
Use Lebesgue Monotone Convergence to show ($\ref{a}$) holds when $f$ is the characteristic function of an open set with finite measure.
Show that ($\ref{a}$) holds for non-negative simple functions $s$ which are nonzero only on a set of finite measure as follows: Let $U$ be an open set of finite measure such that $s = 0$ outside of $U$. Use Lusin's theorem to get a sequence $g_n$ of continuous, compactly supported, non-negative functions whose supports are contained in $U$, and such that $g_n \to s$ pointwise a.e., $g_n \to s$ in $L^p$, and with $0 \le g_n \le K\chi_{U}$ for some $K > 0$. Apply Lebesgue Dominated Convergence to show that ($\ref{a}$) holds for $s$.
Show that ($\ref{a}$) holds for any non-negative $f \in L^p$ using Monotone Convergence on an increasing sequence of non-negative simple functions which converges pointwise to $f$.
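As a sanity check of ($\ref{a}$) in the simplest characteristic-function case, take $f = \chi_{(0,1)}$, so that $F(x) = 1$ for $x \le 1$ and $F(x) = 1/x$ for $x > 1$ (computed by hand). The sketch below approximates both sides numerically; the function name `hardy_sides`, the cutoff `X`, and the step count are illustrative choices of mine, not part of the exercise, and the truncation at `X` means the two sides only agree up to a small error.

```python
def hardy_sides(p, X=1000.0, steps=100_000):
    """Approximate both sides of (*) for f = indicator of (0, 1).

    The integral of F^p over (1, infinity) is truncated at X, so the
    left side is slightly underestimated (by about X**(1 - p) / (p - 1)).
    """
    def F(x):
        # Average of f over (0, x), worked out by hand for this f.
        return 1.0 if x <= 1.0 else 1.0 / x

    def f(x):
        return 1.0 if 0.0 < x < 1.0 else 0.0

    def midpoint(g, a, b, n):
        # Composite midpoint rule on [a, b] with n subintervals.
        h = (b - a) / n
        return h * sum(g(a + (i + 0.5) * h) for i in range(n))

    lhs = (midpoint(lambda x: F(x) ** p, 0.0, 1.0, 1000)
           + midpoint(lambda x: F(x) ** p, 1.0, X, steps))
    rhs = p / (p - 1) * midpoint(lambda x: F(x) ** (p - 1) * f(x), 0.0, 1.0, 1000)
    return lhs, rhs
```

For this $f$ both sides equal $\frac{p}{p-1}$ exactly, which also shows that ($\ref{a}$) by itself does not force $f = 0$; the contradiction later comes from the assumed equality of norms.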
Now let $f$ be an arbitrary non-negative function in $L^p$.
Suppose (for a contradiction) that $||F||_p =\frac{p}{p-1}||f||_p$ but that it is not the case that $f = 0$ almost everywhere.
Then $||f||_p \gt 0$, and since we are in the equality situation, we also have $||F||_p \gt 0$.
Using Hölder's inequality, show that
$$
\begin{aligned}
||F||_p^p = \int_0^{\infty}F^p\ dx &= \frac{p}{p-1}\int_0^{\infty}F^{p-1}f \ dx \\
&\le \frac{p}{p-1}\left\{ \int_0^{\infty}F^p \ dx \right\}^{1 - \frac{1}{p}} ||f||_p \\
&= \frac{p}{p-1} ||F||_p^{p-1}||f||_p
\end{aligned}
$$
Since $||F||_p = \frac{p}{p-1}||f||_p$ by assumption, the Hölder step cannot be strict. Thus Hölder's inequality must be an equality, and so there must be a non-negative constant $\alpha$ such that $F = \alpha f$ a.e. or $f = \alpha F$ a.e.
Verify that $\alpha$ must in fact be strictly positive.
The next part is what I wrestled with for some time.
We now have that WLOG $f = \alpha F$ a.e. for some $\alpha \gt 0$.
Verify using ($\ref{a}$) that $\alpha = \frac{p-1}{p}$.
From this we may conclude that
$$
F(x) = \frac{1}{x} \int_0^x \alpha F(t) \ dt
$$
Since $F(t)$ is continuous, it must be the case that $F(x)$ is differentiable for $x \gt 0$. Differentiating gives
$$
xF'(x) = (\alpha - 1)F(x) = -\frac{1}{p}F(x)
$$
Since $||F||_p > 0$, there must be a point $a$ with $F(a) > 0$.
But, the above equation shows that if $F(a) > 0$, then $F(x) > 0$ for $0 < x < a$.
So on $(0, a)$ we have $\frac{F'}{F} = -\frac{1}{px}$, and hence $\log F(x) = C -\frac{1}{p} \log(x)$ for some constant $C$.
So, near zero, we have $F(x) = e^{C} x^{-\frac{1}{p}}$. But this is a contradiction, since then $F(x)^p = e^{pC} x^{-1}$ is not integrable near $0$, so $F$ is not in $L^p$.
The case where $f$ is an arbitrary (complex-valued) function in $L^p$ follows from the case where $f$ is non-negative.
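Let $g = |f|$, and let $G(x) = \frac{1}{x}\int_0^x g(t) \ dt$ be the corresponding average, so that $|F| \le G$ pointwise.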
If $||F||_p =\frac{p}{p-1}||f||_p$, then
$$
\frac{p}{p-1}||f||_p = ||F||_p \le ||G||_p \le \frac{p}{p-1}||g||_p = \frac{p}{p-1}||f||_p
$$
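Every inequality in this chain is therefore an equality; in particular $||G||_p = \frac{p}{p-1}||g||_p$. So $g = 0$ almost everywhere by the non-negative case, and the same is true for $f$.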
I’m grateful for all the effort people put into solving my problem. With the help of my teacher, I finally found a solution. It also turns out that there is something wrong in the original problem.
Let $0=x_0<x_1<\cdots<x_n=T$ be such that $\|f\|_{L^\rho(x_{k-1},x_k)}=1/2$ for each $1\leq k\leq n-1$ and $\|f\|_{L^\rho(x_{n-1},x_n)}\leq1/2$. A simple calculation gives that $\|f\|_{L^\rho(0,x_k)}=\frac{k^{1/\rho}}2$ for $1\leq k\leq n-1$.
Next we do the iteration. By Minkowski's inequality and Hölder's inequality,
\begin{align*}
\|\varphi\|_{L^\gamma(0,x_{k+1})}&\leq\eta+\|f\varphi\|_{L^\beta(0,x_{k+1})}\\
&\leq\eta+\|f\varphi\|_{L^\beta(0,x_{k})}+\|f\varphi\|_{L^\beta(x_k,x_{k+1})}\\
&\leq\eta+\|f\|_{L^\rho(0,x_k)}\|\varphi\|_{L^\gamma(0,x_k)}+\|f\|_{L^\rho(x_k,x_{k+1})}\|\varphi\|_{L^\gamma(x_k,x_{k+1})}\\
&\leq\eta+\frac{k^{1/\rho}}2\|\varphi\|_{L^\gamma(0,x_k)}+\frac12\|\varphi\|_{L^\gamma(x_k,x_{k+1})}\\
&\leq\eta+\frac{k^{1/\rho}}2\|\varphi\|_{L^\gamma(0,x_k)}+\frac12\|\varphi\|_{L^\gamma(0,x_{k+1})}.
\end{align*}
Absorbing the last term into the left-hand side, and using $k^{1/\rho}\leq k$ (which holds since $1\leq \rho<\infty$), we thus have $$\|\varphi\|_{L^\gamma(0,x_{k+1})}\leq 2\eta+k^{1/\rho}\|\varphi\|_{L^\gamma(0,x_k)}\leq 2\eta+k\|\varphi\|_{L^\gamma(0,x_k)}.$$ By induction we can easily deduce that $$\|\varphi\|_{L^\gamma(0,x_k)}\leq 2\eta (k+1)!.$$
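The induction can be sanity-checked numerically. The sketch below iterates the extreme case of the recursion, $a_{k+1} = 2\eta + k\,a_k$ with $a_0 = 0$ (the norm over the empty interval $(0,x_0)$), and compares it with $2\eta\,(k+1)!$. The names `worst_case_bounds` and `satisfies_factorial_bound` are mine and purely illustrative; any sequence satisfying the inequality is dominated by this extreme one.

```python
from math import factorial


def worst_case_bounds(eta, n):
    """Return [a_0, ..., a_n] with a_0 = 0 and a_{k+1} = 2*eta + k*a_k.

    This is the largest sequence allowed by the recursive inequality,
    so bounding it bounds every admissible sequence.
    """
    a = [0.0]
    for k in range(n):
        a.append(2 * eta + k * a[k])
    return a


def satisfies_factorial_bound(eta, n):
    """Check a_k <= 2 * eta * (k + 1)! for k = 1, ..., n."""
    a = worst_case_bounds(eta, n)
    return all(a[k] <= 2 * eta * factorial(k + 1) for k in range(1, n + 1))
```

For instance, with $\eta = 1$ the first values are $a_1 = 2$, $a_2 = 4$, $a_3 = 10$, comfortably below $2\cdot 2!$, $2\cdot 3!$, $2\cdot 4!$; the factorial bound is far from sharp but is all the argument needs.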
Fix $t\in(0,T)$. Then there is some $1\leq k\leq n$ such that $t\in [x_{k-1},x_k)$, so \begin{align*} \|\varphi\|_{L^\gamma(0,t)}&\leq 2\eta (k+1)!\\ &=2\eta \Gamma(k+2)\\ &=2\eta \Gamma\left(3+\frac{k-1}2\cdot2\right). \end{align*} Since $\|f\|_{L^\rho(0,t)}\geq \|f\|_{L^\rho(0,x_{k-1})}=\frac{(k-1)^{1/\rho}}2$ and $\Gamma$ is increasing on $[2,\infty)$, we conclude that \begin{align*} \|\varphi\|_{L^\gamma(0,t)}&\leq 2\eta\Gamma\left(3+2(2\|f\|_{L^\rho(0,t)})^\rho\right)\\ &\leq\eta\Phi\left(2^\rho\|f\|_{L^\rho(0,t)}^\rho\right). \end{align*}