From my less-than-expert (where's Terry when you need him?) point of view, a possible reason seems to be the following (I wouldn't call it something going wrong or even a difficulty):
The statements of restriction estimates only give you estimates where the left hand side is an isotropic Lebesgue space, in the sense that you get an estimate in $L^q_tL^r_x$ with $q = r$. This naturally excludes the end-point, which requires $r > q$.
Why is this? The reason is that the restriction theorems only care about the local geometry of the hypersurface, and not its global geometry. (For example, the versions given in Stein's Harmonic Analysis require either that the hypersurface have non-vanishing Gaussian curvature, for a weaker version, or that the hypersurface be of finite type, for a slightly stronger version. Both of these conditions are assumptions on the geometry of the hypersurface locally as a graph over a tangent plane.) Now, on each local piece, you do have something more similar to the classical dispersive estimates with $r > q$, which is derived using the method of oscillatory integrals (see, for example, Chapter IX of Stein's book; the dispersive estimate (15) [which has, morally speaking, $q = r = \infty$ but with a weight "in $t$", so actually implies something with $q < \infty$] is used to prove Theorem 1, which is then used to derive the restriction theorem). But once you try to piece together the various "local" estimates to get an estimate on the whole function, you have no guarantee of what the "normal direction" is over the entire surface. (The normal direction, in the case of the application to PDEs, is the direction of the Fourier conjugate of the "time" variable.) So in the context of the restriction theorem, it is most natural to write the theorem using the $q = r$ version, since in the more general context of restriction theorems, there is no guarantee that you would have a globally preferred direction $t$.
(Note that Keel-Tao's contribution is not in picking out that time direction: that Strichartz estimates can be obtained from interpolation of a dispersive inequality and energy conservation is well known, and quite a bit of the non-endpoint cases are already available as intermediate consequences of the proof of restriction theorems. The main contribution is a refined interpolation method to pick out the end-point exponents.)
Edit: I realized that the explanation of the former second step of the proof below was a little bit obscure since, while entirely correct, it did not clarify enough why the choice of integrability exponents is not done by guessing. Therefore I decided to replace it with a similar but more direct procedure, and to move the former step 2 to the notes, together with a brief proof of their equivalence.
I have not seen the proof of Young's inequality you allude to: however, the answer to your question, i.e. whether it is possible to prove it in another way while remaining at the same level of knowledge, is yes. The proof given below is inspired by and follows the one in the nice monograph [1], pp. 26-27. It is based on the standard and generalized (i.e. involving three or more functions) Hölder's inequalities and on a judicious choice of the integrability exponents associated to two or three factors expressing $| f(y)g(x-y) |$: the introduction of an auxiliary function $h$ is not required.
Young's Inequality. Let $p,q,r\in\Bbb R$ be such that
$$
1\le p, q, r\le +\infty, \quad 1+\frac{1}{r}=\frac{1}{p}+\frac{1}{q},\label{1}\tag{1}
$$
and let $f\in L^p(\mathbf{R}^d)$ and $g\in L^q(\mathbf{R}^d)$: if
$$
f\ast g(x)=\int\limits_{\mathbf{R}^d} f(y)g(x-y)\mathrm{d}y,
$$
then
$$
\Vert f\ast g\Vert_{r}\le \Vert f\Vert_{p}\Vert g\Vert_q\label{2}\tag{2}
$$
Comment. The strategy of the proof goes as follows: first, in every range of values of $p, q, r$ defined by conditions \eqref{1}, we will express $|f(y)g(x-y)|$ as the product of three factors
$$
| f(y)g(x-y) |=\big(|f(y)|^p|g(x-y)|^q\big)^\frac{1}{s_1}|g(x-y)|^{\frac{q}{s_2}} |f(y)|^{\frac{p}{s_3}}.\label{step1}\tag{Step 1}
$$
Since we want to estimate the $L^r$ norm of the convolution, we assume $s_1=r$.
Now \ref{step1} implies that the coefficients $s_1, s_2, s_3$ must satisfy the following conditions
$$
p\left(\frac{1}{s_1}+\frac{1}{s_3}\right)=1\quad q\left(\frac{1}{s_1}+\frac{1}{s_2}\right)=1,\label{c1}\tag{C1}
$$
We thus have a non-homogeneous linear system in the variables $s_i^{-1}$, $i=1,2,3$, which is uniquely solvable provided $pq\neq0$, and the second step consists in solving it for the unknown exponents: explicitly
$$
\begin{pmatrix}
1 & 0 & 0\\
q & q & 0\\
p & 0 & p
\end{pmatrix}
\begin{pmatrix}
s_1^{-1}\\
s_2^{-1}\\
s_3^{-1}\\
\end{pmatrix}=
\begin{pmatrix}
{\frac{1}{r}}\\
{1}\\
{1}\\
\end{pmatrix}\iff
\begin{pmatrix}
s_1^{-1}\\
s_2^{-1}\\
s_3^{-1}\\
\end{pmatrix}=
\begin{pmatrix}
{\frac{1}{r}}\\
{\frac{1-\frac{q}{r}}{q}}\\
{\frac{1-\frac{p}{r}}{p}}\\
\end{pmatrix}\label{step2}\tag{Step 2}
$$
The third and final step is to estimate the $L^r$ norm of the convolution $f\ast g$ by applying to equation \eqref{step1} one of the various forms of Hölder's inequality. This of course can be done since it is easily verified that
$$
\frac{1}{s_1} + \frac{1}{s_2} + \frac{1}{s_3}=1.\label{c2}\tag{C2}
$$
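For completeness, here is a sketch of the "easily verified" check of \eqref{c2}: it uses only the solution found in \eqref{step2}, rewritten as $\frac{1}{s_2}=\frac{1}{q}-\frac{1}{r}$ and $\frac{1}{s_3}=\frac{1}{p}-\frac{1}{r}$, together with relation \eqref{1}.
$$
\frac{1}{s_1}+\frac{1}{s_2}+\frac{1}{s_3}
=\frac{1}{r}+\left(\frac{1}{q}-\frac{1}{r}\right)+\left(\frac{1}{p}-\frac{1}{r}\right)
=\frac{1}{p}+\frac{1}{q}-\frac{1}{r}
=1.
$$
As an illustrative instance (the numbers are my own choice, not taken from [1]): for $p=q=\frac{3}{2}$ relation \eqref{1} gives $r=3$, and then $s_1=s_2=s_3=3$.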
Proof. If $r=\infty$, then \eqref{2} is a direct consequence of the standard Hölder's inequality, since
$$
\frac{1}{p}+\frac{1}{q}=1.
$$
Assuming $r<+\infty$, \eqref{2} must be verified for the three ranges defined by conditions \eqref{1}, i.e.
- $1<p<r$ and $1<q< r$;
- $p=1<q=r$;
- $p=r$ and $q=1$.
- Case 1: this is the most general case. From \ref{step2} we have
$$
\begin{cases}
s_1=r\\
\\
s_2=\dfrac{q}{1-\frac{q}{r}}\\
s_3=\dfrac{p}{1-\frac{p}{r}}
\end{cases},
$$
and thus equation \eqref{step1} becomes
$$
| f(y)g(x-y) |=\big(|f(y)|^p|g(x-y)|^q\big)^\frac{1}{r}|g(x-y)|^{1-\frac{q}{r}} |f(y)|^{1-\frac{p}{r}}.\label{3}\tag{3}
$$
Estimating the convolution $f\ast g$ by using \eqref{3} and the generalized Hölder inequality gives
$$
|f\ast g(x)|\le \bigg(\int\limits_{\mathbf{R}^d} |f(y)|^p|g(x-y)|^q\mathrm{d}y\bigg)^{\!\frac{1}{r}}\Vert f\Vert_{p}^{1-\frac{p}{r}} \Vert g\Vert_q^{1-\frac{q}{r}}\label{4}\tag{4}
$$
and taking the $L^r$ norm of both sides of \eqref{4}, then using Tonelli's theorem to exchange the order of integration, finally gives
$$
\begin{split}
\Vert f\ast g\Vert_r &\le \Vert f\Vert_{p}^{1-\frac{p}{r}} \Vert g\Vert_q^{1-\frac{q}{r}}\bigg(\int\limits_{\mathbf{R}^d}\mathrm{d}x \int\limits_{\mathbf{R}^d} |f(y)|^p|g(x-y)|^q\mathrm{d}y\bigg)^{\frac{1}{r}}\\
& = \Vert f\Vert_{p}^{1-\frac{p}{r}} \Vert g\Vert_q^{1-\frac{q}{r}} \bigg(\int\limits_{\mathbf{R}^d}|f(y)|^p \mathrm{d}y \int\limits_{\mathbf{R}^d} |g(x)|^q\mathrm{d}x\bigg)^{\!\frac{1}{r}}\\
& = \Vert f\Vert_{p}\Vert g\Vert_q
\end{split}
$$
- Case 2 and Case 3: in these cases, the right side of equation \eqref{3} reduces to the product of two terms and inequality \eqref{2} is obtained by means of the standard Hölder inequality. Explicitly,
$$
| f(y)g(x-y)| =
\begin{cases}
\big(|f(y)||g(x-y)|^q\big)^\frac{1}{r}|f(y)|^{1-\frac{1}{r}}&\text{ in case 2}\\
\big(|f(y)|^p|g(x-y)|\big)^\frac{1}{r}|g(x-y)|^{1-\frac{1}{r}}&\text{ in case 3}
\end{cases}.\qquad\blacksquare
$$
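A quick numerical sanity check of \eqref{2} may help build confidence in the statement (it is of course not a proof). The sketch below assumes $d=1$, a uniform grid, and Riemann-sum approximations of the norms; the helper names `lp_norm` and `young_sides`, the grid, and the two sample functions are hypothetical choices of mine. Since Young's inequality also holds for sequences on $\mathbf{Z}$ and the powers of the grid step match on both sides by \eqref{1}, the discretized inequality should hold up to rounding.

```python
import numpy as np

def lp_norm(values, exponent, step):
    """Riemann-sum approximation of the L^p norm on a uniform grid."""
    if np.isinf(exponent):
        return float(np.max(np.abs(values)))
    return float((step * np.sum(np.abs(values) ** exponent)) ** (1.0 / exponent))

def young_sides(f, g, p, q, step):
    """Return (||f*g||_r, ||f||_p * ||g||_q), with r given by relation (1)."""
    inv_r = 1.0 / p + 1.0 / q - 1.0
    r = np.inf if inv_r <= 1e-15 else 1.0 / inv_r
    conv = step * np.convolve(f, g)  # full discrete convolution scaled by the step
    return lp_norm(conv, r, step), lp_norm(f, p, step) * lp_norm(g, q, step)

step = 0.01
x = np.arange(-5.0, 5.0, step)
f = np.exp(-x**2)                    # sample f: a Gaussian
g = 1.0 / (1.0 + np.abs(x)) ** 2     # sample g: an arbitrary integrable bump

# One pair per case of the proof, plus the r = infinity case p = q = 2.
exponent_pairs = [(1.5, 1.5), (1.0, 2.0), (2.0, 1.0), (2.0, 2.0)]
results = [young_sides(f, g, p, q, step) for p, q in exponent_pairs]
for (p, q), (lhs, rhs) in zip(exponent_pairs, results):
    print(f"p={p}, q={q}: {lhs:.6f} <= {rhs:.6f}")
```

Each printed line should show a left side not exceeding the right side; a violation would point to a bug in the discretization rather than in \eqref{2}.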
Final notes
- This proof is entirely elementary and does not require the "guessing" of the coefficients $s_1, s_2, s_3$, which are instead well defined and calculable.
- Why is the second step above entirely equivalent to the formerly proposed one? Because the non-homogeneous linear system obtained by considering directly, without assuming a priori $s_1=r$, the conditions implied by equation \eqref{step1} and by the necessity of using Hölder's inequality, i.e. \eqref{c1} and \eqref{c2}, is perfectly equivalent to \eqref{step2}. To see this, it is sufficient to write it down and solve it
$$
\begin{pmatrix}
p & 0 & p\\
q & q & 0\\
1 & 1 & 1
\end{pmatrix}
\begin{pmatrix}
s_1^{-1}\\
s_2^{-1}\\
s_3^{-1}\\
\end{pmatrix}=
\begin{pmatrix}
{1}\\
{1}\\
{1}\\
\end{pmatrix}\iff
\begin{pmatrix}
s_1^{-1}\\
s_2^{-1}\\
s_3^{-1}\\
\end{pmatrix}=
\begin{pmatrix}
{\frac{1}{p}+\frac{1}{q}-1}\\
{1-\frac{1}{p}}\\
{1-\frac{1}{q}}\\
\end{pmatrix},
$$
and then use \eqref{1} to express $s_1, s_2$ and $s_3$ respectively as functions of $r$, of $r$ and $q$, and of $r$ and $p$. Finally, it is worth noting that, in the approach above, the truth of \eqref{c2} is a consequence of the implicit use of \eqref{1}.
- As a final remark, let me say that Besov, Il'in and Nikol’skiĭ prove \eqref{2} ([1], pp. 27-28) first for $d=1$ (without showing explicitly \eqref{step1} and \eqref{step2}) and then for vector exponents and $d\ge 2$, i.e. $\mathbf{p}=(p_1,\ldots,p_d)$, $\mathbf{q}=(q_1,\ldots,q_d)$ and $\mathbf{r}=(r_1,\ldots,r_d)$, where for each $i$ the $i$-th components satisfy relation \eqref{1}: the result is used in the development of the theory of anisotropic function (Sobolev and Besov) spaces.
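The equivalence of the two linear systems discussed in the notes above can also be checked mechanically in exact rational arithmetic. This is only a sketch: the function names and the sample pairs $(p,q)$ are hypothetical choices of mine, and the check covers a few exponents in the range of Case 1 rather than proving the general statement.

```python
from fractions import Fraction

def inverse_exponents_step2(p, q):
    """Solve (Step 2): fix 1/s_1 = 1/r via relation (1), then use (C1)."""
    inv_r = 1 / p + 1 / q - 1
    return inv_r, 1 / q - inv_r, 1 / p - inv_r

def inverse_exponents_alternative(p, q):
    """Solve the system made of (C1) and (C2), without assuming s_1 = r."""
    return 1 / p + 1 / q - 1, 1 - 1 / p, 1 - 1 / q

def systems_agree(p, q):
    """Check that both solutions coincide and satisfy (C1) and (C2)."""
    first = inverse_exponents_step2(p, q)
    second = inverse_exponents_alternative(p, q)
    inv_s1, inv_s2, inv_s3 = first
    return (
        first == second
        and p * (inv_s1 + inv_s3) == 1   # first condition in (C1)
        and q * (inv_s1 + inv_s2) == 1   # second condition in (C1)
        and inv_s1 + inv_s2 + inv_s3 == 1  # condition (C2)
    )

# Sample exponents with 1 < p < r and 1 < q < r.
samples = [
    (Fraction(3, 2), Fraction(3, 2)),
    (Fraction(4, 3), Fraction(2)),
    (Fraction(5, 4), Fraction(3)),
]
all_agree = all(systems_agree(p, q) for p, q in samples)
print(all_agree)
```

Using `Fraction` avoids floating-point rounding, so the equalities are tested exactly.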
Bibliography
[1] Oleg V. Besov, Valentin P. Il’in, Sergei M. Nikol’skiĭ (1978), Integral representations of functions and imbedding theorems. Vol. I, Ed. by Mitchell H. Taibleson. Translation from the Russian. (English) Scripta Series in Mathematics. Washington, D.C.: V. H. Winston & Sons. New York-Toronto-London: John Wiley & Sons, ISBN: 0-470-26540-X, pp. VIII+345, MR0519341, Zbl 0392.46022.
Best Answer
You can take $f(x) = e^{- \alpha |x|^{2}}$ (Gaussians) as a "test function" in order to prove that one of the equivalences is not true.