One difference between the special case of the Lebesgue measure and the general case is translation-invariance.
Since the Lebesgue measure is translation-invariant, integrals of the form $\int \phi(x-y)\,dx$ do not actually depend on $y$. As a consequence, finiteness of this integral for some value of $y$ is equivalent to its being uniformly bounded with respect to $y$.
For a general measure $\mu$, finiteness and uniform boundedness of $\int |x-y|^{-n}\,d\mu(x)$ are different things. I claim that the crucial property is uniform boundedness. Specifically, the following are equivalent:
1. $\mathcal{I}_\alpha$ is a bounded operator from $L^1_\mu$ to $L^{n/(n-\alpha)}_\mu$.
2. There exists a constant $M$ such that for $\mu$-a.e. $y$ we have $\int |x-y|^{-n}\,d\mu(x)\le M$.
In the proofs, all Lebesgue norms are taken with respect to $\mu$ (the Lebesgue measure is never considered).
Proof of $2\implies 1$. By duality, $\|\mathcal{I}_\alpha f\|_{n/(n-\alpha)} = \sup\{\int ( \mathcal{I}_\alpha f)\,g : \|g\|_{n/\alpha}\le 1\}$.
For any such $g$, Hölder's inequality (with the conjugate exponents $n/\alpha$ and $n/(n-\alpha)$) and assumption 2 imply
$$\int \frac{|g(x)|}{|x-y|^{n-\alpha}}\,d\mu(x) \le \|g\|_{n/\alpha} \left(\int |x-y|^{-n}\,d\mu(x)\right)^{(n-\alpha)/n} \le M^{(n-\alpha)/n}$$
for $\mu$-a.e. $y$. Hence,
$$ \int (\mathcal{I}_\alpha f )\,g \le \int \int \frac{|g(x)||f(y)|}{|x-y|^{n-\alpha}}\,d\mu(x)\,d\mu(y)
\le M^{(n-\alpha)/n}\int |f(y)|\,d\mu(y) = M^{(n-\alpha)/n} \|f\|_1, $$
which establishes 1.
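The exponent bookkeeping in the Hölder step can be sanity-checked numerically. The sketch below is only an illustration under assumptions that are not in the answer: $\mu$ is replaced by a finite weighted point set in the plane (so $n=2$), $y$ is deliberately chosen away from the atoms so that all distances are positive, and the helper name `holder_sides` and the sample data are hypothetical.

```python
import math

def holder_sides(points, weights, g, y, n, alpha):
    """Return (lhs, rhs) of the Hölder step for a finite weighted point set.

    lhs = sum_i w_i |g_i| |x_i - y|^{-(n - alpha)}
    rhs = ||g||_{n/alpha} * (sum_i w_i |x_i - y|^{-n})^{(n - alpha)/n}
    """
    dists = [math.dist(x, y) for x in points]
    lhs = sum(w * abs(gv) * d ** (-(n - alpha))
              for w, gv, d in zip(weights, g, dists))
    g_norm = sum(w * abs(gv) ** (n / alpha)
                 for w, gv in zip(weights, g)) ** (alpha / n)
    kernel = sum(w * d ** (-n) for w, d in zip(weights, dists))
    return lhs, g_norm * kernel ** ((n - alpha) / n)

# Hypothetical sample data: four weighted points in the plane, y off the atoms.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (3.0, 1.0)]
weights = [0.5, 1.0, 0.25, 2.0]
y = (1.0, 1.0)
n, alpha = 2, 0.5

lhs, rhs = holder_sides(points, weights, [1.0, -2.0, 0.5, 3.0], y, n, alpha)
print(lhs <= rhs)

# Equality case of Hölder: |g|^{n/alpha} proportional to |x - y|^{-n}.
g_extremal = [math.dist(x, y) ** (-alpha) for x in points]
lhs_eq, rhs_eq = holder_sides(points, weights, g_extremal, y, n, alpha)
print(math.isclose(lhs_eq, rhs_eq))
```

The second call uses $g(x)=|x-y|^{-\alpha}$, for which the two sides coincide; this shows that the kernel factor really enters with the power $(n-\alpha)/n$.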
Proof of $1\implies 2$. Fix $y$ such that every neighborhood of $y$ has positive $\mu$-measure (this is true for $\mu$-a.e. point $y$). Let $f_k$ be a sequence of nonnegative functions such that $\int f_k(x)\,d\mu(x)=1$ and $f_k$ is supported in the $(1/k)$-neighborhood of $y$. As in your earlier question, pointwise convergence $\mathcal{I}_\alpha f_k(x)\to |x-y|^{-(n-\alpha)}$ together with Fatou's lemma implies that $$\int |x-y|^{-n}\,d\mu(x)\le \liminf_{k\to\infty} \int (\mathcal{I}_\alpha f_k)^{n/(n-\alpha)}\,d\mu \le C^{n/(n-\alpha)}$$
where $C$ is the operator norm of $\mathcal{I}_\alpha$. $\Box$
Notice first that by duality, your inequality is equivalent to the Hardy-Littlewood-Sobolev inequality
$$
\iint \frac{f(x)\,g(y)}{|x-y|^\alpha}\,\mathrm d x\,\mathrm d y \le C_{p,\alpha,n}\, \|f\|_{L^p}\,\|g\|_{L^{q'}},
$$
where $q' = \tfrac{q}{q-1}$.
The maximizers and the optimal constant are only explicitly known in the case $p=q'$. In this case $p=\frac{2\,n}{2\,n-\alpha}$ and, as proved by Lieb,
$$
C_{p,\alpha,n} = \pi^{\alpha/2} \,\frac{\Gamma(\tfrac{n}{2}-\tfrac{\alpha}{2})}{\Gamma(n-\tfrac{\alpha}{2})} \left(\frac{\Gamma(n)}{\Gamma(\tfrac{n}{2})}\right)^{1-\alpha/n}
$$
and the optimizers are the functions of the form
$$
f(x) = \frac{C}{(a^2+|x-b|^2)^{(2n-\alpha)/2}}
$$
See e.g. Theorem 4.3 in the book Analysis by Lieb and Loss (and the remarks after).
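For concrete values of $n$ and $\alpha$, the sharp constant in the diagonal case can be evaluated directly from the formula above. The following is a minimal sketch using only the standard library; the function names `lieb_hls_constant` and `diagonal_exponent` are hypothetical, and the code assumes $0<\alpha<n$.

```python
import math

def lieb_hls_constant(n, alpha):
    """Sharp HLS constant in the diagonal case p = q' = 2n / (2n - alpha)."""
    if not 0 < alpha < n:
        raise ValueError("need 0 < alpha < n")
    gamma_ratio = math.gamma(n / 2 - alpha / 2) / math.gamma(n - alpha / 2)
    scaling = (math.gamma(n) / math.gamma(n / 2)) ** (1 - alpha / n)
    return math.pi ** (alpha / 2) * gamma_ratio * scaling

def diagonal_exponent(n, alpha):
    """The exponent p for which the optimal constant is known explicitly."""
    return 2 * n / (2 * n - alpha)

# Example: n = 4, alpha = 2, where the formula reduces to pi * sqrt(6) / 2.
print(diagonal_exponent(4, 2), lieb_hls_constant(4, 2))
```

As a consistency check, the constant tends to $1$ as $\alpha\to 0$, matching the trivial bound $\iint f(x)\,g(y)\,\mathrm d x\,\mathrm d y\le \|f\|_{L^1}\|g\|_{L^1}$ for nonnegative $f$ and $g$.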
In the other cases, the optimizers are known to exist, but I think they are not known explicitly, and neither is the optimal constant. However, there are upper bounds for the constant, and the optimizers might be linked to the fast diffusion/porous media equation (see e.g. Hardy-Littlewood-Sobolev inequalities via fast diffusion flows).
Best Answer
There is a direct and self-contained proof of the HLS inequality in Analysis by Lieb and Loss, Theorem 4.3. It uses nothing but the layer cake representation, Hölder's inequality, and clever manipulation of integrals. A bit too long to reproduce here, though.
Also, the boundedness of the Hardy-Littlewood maximal function is much more straightforward than the general Marcinkiewicz interpolation theorem; it is presented in the textbooks as a consequence of the latter just because the authors would like it to be one. Stein proves it as Theorem 1.1.1 in Singular integrals and differentiability properties of functions. First, the covering lemma is used to prove the weak $(1,1)$ inequality
$$m(E_\alpha)\le C\alpha^{-1}\int_{\mathbb R^n} |f(x)|\,dx \tag{1}$$
where $E_\alpha = \{x:Mf(x)>\alpha\}$.
Now let $1<p<\infty$. Fix $\alpha>0$ and let $f_1=f\chi_{|f|\ge \alpha/2}$. Since $|f|\le |f_1|+\alpha/2$, we have $Mf\le Mf_1+\alpha/2$, and it follows that
$$\{x:Mf(x)>\alpha\}\subset \{x:Mf_1(x)>\alpha/2\}.$$
Apply $(1)$ to $f_1$ at level $\alpha/2$ and use the layer cake representation of $\int (Mf)^p$:
$$ \int_{\mathbb R^n} (Mf(x))^p\,dx = p\int_0^\infty \alpha^{p-1} m(E_\alpha)\,d\alpha \le p \int_0^\infty \alpha^{p-1}\, \frac{2C}{\alpha}\left( \int_{|f|\ge\alpha/2}|f(x)|\,dx\right)\,d\alpha. $$
Switch the order of integration on the right to get
$$ 2C p \int_{\mathbb R^n}|f(x)| \left(\int_0^{2|f(x)|} \alpha^{p-2} \,d\alpha\right) dx = C'\int_{\mathbb R^n}|f(x)|^p\,dx, \qquad C'=\frac{2^p\,C\,p}{p-1}, $$
as desired.
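The layer cake identity $\int g^p = p\int_0^\infty \alpha^{p-1}\, m(\{g>\alpha\})\,d\alpha$ used above can be checked exactly on a step function. This is a small sketch under the assumption that $g\ge 0$ takes finitely many values `values[i]` on sets of measure `weights[i]`; the function names and sample data are hypothetical.

```python
import math

def lp_integral(values, weights, p):
    """Direct evaluation of the integral of g^p for a nonnegative step function g."""
    return sum(w * v ** p for v, w in zip(values, weights))

def layer_cake(values, weights, p):
    """Evaluate p * integral of alpha^(p-1) * m({g > alpha}) d(alpha) exactly.

    Between consecutive distinct values of g the distribution function
    m({g > alpha}) is constant, and the integral of p * alpha^(p-1)
    over [a, b] equals b^p - a^p.
    """
    total, prev = 0.0, 0.0
    for level in sorted(set(values)):
        # For prev <= alpha < level, {g > alpha} coincides with {g >= level}.
        measure = sum(w for v, w in zip(values, weights) if v >= level)
        total += measure * (level ** p - prev ** p)
        prev = level
    return total

# Hypothetical sample step function.
values = [0.5, 2.0, 2.0, 3.0]
weights = [1.0, 0.25, 0.5, 2.0]
print(math.isclose(lp_integral(values, weights, 3), layer_cake(values, weights, 3)))
```

For this sample both sides equal $60.125$ when $p=3$.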
And now that I typed all this, I see that the Wikipedia article Hardy–Littlewood maximal function also gives this proof.