Prove properties of Dirac delta from the definition as a distribution


I want to prove some properties of the Dirac delta $\delta$ from its definition as a distribution, without using the pseudo-definition given in Classical Electrodynamics, Jackson, third edition, p. 26:

$$\delta(x-a) = 0 \text{ for } x \ne a \text{, and}$$
$$\int \delta(x-a) \, dx = 1$$

I want to prove the properties given in the book, where $f(x)$ is an arbitrary function:

$$\begin{align}
(1)\;\;\;\;\;& \int f(x) \delta(x-a) \, dx = f(a) \\
(2)\;\;\;\;\;& \int f(x) \delta'(x-a) \, dx = -f'(a) \\
(3)\;\;\;\;\;& \delta(f(x)) = \sum_i \frac{1}{\left| \frac{df}{dx}(x_i) \right|}\delta(x - x_i) \\
\end{align}$$

where in $(3)$ the sum runs over the simple zeros $x_i$ of $f$.
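For concreteness, here is a standard worked instance of $(3)$ (my example, not one from the book): taking $f(x) = x^2 - a^2$ with $a \neq 0$, the simple zeros are $x_i = \pm a$ and $\left|f'(\pm a)\right| = 2|a|$, so

$$\delta(x^2 - a^2) = \frac{1}{2|a|}\Big[\delta(x-a) + \delta(x+a)\Big].$$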

On Wikipedia, the Dirac delta is defined as a linear functional on the space of test functions:

$$\delta[\varphi] = \varphi(0)$$

where $\varphi$ is any test function.

It is easy to prove the three properties when $f(x)$ is a test function. What I do not see is how to extend the properties from test functions to arbitrary functions.
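For reference, here is that easy computation for $(1)$ and $(2)$ with a test function $\varphi$, using the standard shift and derivative rules for distributions (both are defined precisely in the answer below):

$$\begin{align}
\langle \delta(x-a), \varphi \rangle &= \varphi(a), \\
\langle \delta'(x-a), \varphi \rangle &= -\langle \delta(x-a), \varphi' \rangle = -\varphi'(a).
\end{align}$$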

Research

Equivalence between Dirac Delta definition as a measure and as a distribution explains the relationship between the definition as a measure and the definition as a distribution. I don't entirely understand it. As Wikipedia says,

Generally, when the term "Dirac delta function" is used, it is in the sense of distributions rather than measures,…

I think I should not try to prove the three properties from the definition as a measure.

Prove Dirac delta function equation is related to property $(2)$, but the result there contains an extra term, $-f(0)\delta'(0)$, which I don't know how to remove.

How to prove δ(kx)=δ(x)/|k| by using properties of a test function proves $\langle \delta(kx), \varphi(x) \rangle = \frac{1}{|k|} \langle \delta(x), \varphi(x) \rangle$, but it does not extend from the test function $\varphi$ to an arbitrary function $f(x)$.

Best Answer

If you want to work with the true definitions, then you have to understand first that since the Dirac delta is not a function, expressions like $\int f(x)\,\delta(x-a)\,\mathrm d x$ have a priori no meaning.

Prerequisites.

  • Indeed, the Dirac delta can be viewed as a distribution. But in general, if $T$ is a distribution, then $T(x)$ has no meaning, and so neither has $\int f(x)\,T(x)\,\mathrm d x$. Distributions were, however, created so as to extend functions. For example, any locally integrable function $f$ defines a distribution $T_f$ through its action on test functions $\varphi$: $$ \langle T_f,\varphi\rangle = \int f(x)\,\varphi(x)\,\mathrm d x $$ (where I use the notation $\langle T,\varphi\rangle$ instead of $T[\varphi]$). By abuse of notation, one identifies $T_f$ and $f$ as representing the same thing and just writes $f$ instead of $T_f$ (in the same way that one identifies the integers with a subset of the complex numbers, for example). Of course, this does not mean that you now have to throw away the meaning of $$ \int f(x)\,u(x)\,\mathrm d x $$ when $u$ is not a test function but the integral still makes sense!

  • If you want to work rigorously with the Dirac delta multiplied by more general functions and integrated, the measure version is the more logical. Indeed, using measure theory, there is a general meaning to $\int f(x)\,\mu(\mathrm d x)$, the integral against a measure $\mu$. As a measure, $\delta_a$ is defined by its action on sets (to every set it assigns a value, the measure of this set). For example, the Lebesgue measure of a segment $[a,b]$ is $|b-a|$, and the integral with respect to the Lebesgue measure is the usual integral. The Dirac delta at a point $a$ is then defined as a measure by $$ \delta_a(A) = \begin{cases} 1 & \text{if } a\in A \\ 0 & \text{if } a\notin A \end{cases} $$ and then it is indeed a theorem, and not just a notation, that for any continuous function $u$, $$ \int u(x)\,\delta_a(\mathrm d x) = u(a) $$ (i.e. your Equation $(1)$; a sketch of the proof follows this list). Notice that "$\delta_a(\mathrm d x)$" here denotes what could, by slight abuse of notation, be written "$\delta(x-a)\,\mathrm d x$". The point is that even if measures and distributions are not functions, they are built so as to generalize functions, and so it is often convenient to write results as if they were functions.

  • In the same way as for functions, any measure can be associated with a distribution by defining, for any test function $\varphi$, $$ \langle \mu,\varphi\rangle = \int \varphi(x)\,\mu(\mathrm d x). $$ As for functions, you could also replace $\varphi$ here by a continuous function $u$ and this would still make sense.

  • So why distributions, you will ask? Because the space of distributions contains elements that are even wilder, for which the only things you can multiply them by are smooth functions; in particular, this is what happens when taking a lot of derivatives. For example, the derivative of the Dirac delta, $\delta'$, can only be defined as a distribution, and not as a measure. In this case, the equation $$ \langle\delta_a',\varphi\rangle = -\varphi'(a) $$ (that is, your Equation $(2)$) is the definition of $\delta_a'$ as a distribution (a sketch of where the minus sign comes from also follows this list). There is no other rigorous meaning that I know of that one could give to "$\int u(x)\,\delta'(x-a)\,\mathrm d x$".
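Here is the promised sketch of the measure-theoretic identity $\int u(x)\,\delta_a(\mathrm d x) = u(a)$ (a standard argument, using only the definition of $\delta_a$ above): the set $\mathbb R\setminus\{a\}$ is $\delta_a$-null, so $u = u(a)$ holds $\delta_a$-almost everywhere, and therefore

$$\int u(x)\,\delta_a(\mathrm d x) = \int u(a)\,\delta_a(\mathrm d x) = u(a)\,\delta_a(\mathbb R) = u(a).$$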
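And here is the sketch of where the minus sign in $\langle\delta_a',\varphi\rangle = -\varphi'(a)$ comes from (the standard motivation, not an extra assumption): for a continuously differentiable function $f$ and a test function $\varphi$ (which has compact support, so the boundary terms vanish), integration by parts gives

$$\int f'(x)\,\varphi(x)\,\mathrm d x = -\int f(x)\,\varphi'(x)\,\mathrm d x, \quad\text{i.e.}\quad \langle T_{f'},\varphi\rangle = -\langle T_f,\varphi'\rangle,$$

so one defines $\langle T',\varphi\rangle := -\langle T,\varphi'\rangle$ for every distribution $T$; with $T = \delta_a$ this gives exactly your Equation $(2)$ on test functions.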

Answer to the question.

In the end, now that you might understand things better, the true answer to your question is the following theorem: if you can prove something for a distribution $T$ using test functions, and if this distribution satisfies $|\langle T,\varphi\rangle| \leq C\,\|\varphi\|_{L^\infty}$ for every test function $\varphi$, then it can be represented by a measure, and this measure is unique (this is the Riesz representation theorem).
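For the Dirac delta itself, the hypothesis of this theorem is a one-line check:

$$|\langle\delta,\varphi\rangle| = |\varphi(0)| \leq \|\varphi\|_{L^\infty},$$

so the theorem applies with $C = 1$.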

Applications.

As you say, suppose you want to define the Dirac delta as a distribution by the formula $$\tag{4}\label{4} \langle\delta,\varphi\rangle = \varphi(0) $$ for any test function $\varphi$. Since it satisfies the hypothesis of the Riesz theorem, there exists a unique measure $\mu$ such that, for any test function $\varphi$, $$ \varphi(0)= \langle\delta,\varphi\rangle = \int \varphi(x)\,\mu(\mathrm d x). $$ But we already know that this is true for the measure $\delta_0$ defined previously! Hence the unique measure extending $\delta$ is $\delta_0$, and it has the property that, for any continuous function $u$ (which might not be a test function), $$ \int u(x)\,\delta_0(\mathrm d x) = u(0). $$ This proves your Equation $(1)$ for any continuous $u$, starting from the distributional definition \eqref{4}.

Shifting from $\delta$ to $\delta(x-a)$ cannot be done unless we define what $\delta(x-a)$ means. By analogy with the simple change of variable for functions, for which $$ \int f(x-a)\,\varphi(x)\,\mathrm d x = \int f(x)\,\varphi(x+a)\,\mathrm d x, $$ we can define, for any distribution, $$ \langle T(x-a),\varphi\rangle := \langle T,\varphi(\cdot+a)\rangle. $$ Then it follows that $$ \langle \delta(x-a),\varphi\rangle = \varphi(0+a) = \varphi(a). $$ This holds only for test functions. But as previously, we deduce from the Riesz theorem that the unique measure associated with $\delta(x-a)$ is $\delta_a$, and so, identifying $\delta_a(\mathrm d x) = \delta(x-a)\,\mathrm d x$ by abuse of notation, it yields $$ \int u(x)\,\delta(x-a)\,\mathrm d x = u(a) $$ for any continuous function $u$.
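The same mechanism settles the scaling identity from the question you linked. If, by analogy with the substitution $y = kx$ for ordinary functions, one defines $\langle T(kx),\varphi\rangle := \frac{1}{|k|}\langle T,\varphi(\cdot/k)\rangle$ for $k\neq 0$, then a sketch of the computation is

$$\langle \delta(kx),\varphi\rangle = \frac{1}{|k|}\,\varphi(0) = \frac{1}{|k|}\,\langle\delta,\varphi\rangle,$$

so $\delta(kx) = \frac{1}{|k|}\,\delta(x)$ as distributions, and the Riesz theorem extends the identity to continuous functions exactly as before. Your Equation $(3)$ is the local version of this computation near each simple zero $x_i$ of $f$.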

Conclusion.

It is written on Wikipedia that $\delta$ is usually meant as a distribution. The reason is that the space of distributions contains more elements, all of which can be differentiated infinitely many times, hence it is convenient to think of every function, every measure, and all their derivatives as being part of the big set of distributions. Then, for people who know what they are doing, the convenient abuse of notation is to replace $\langle T,\varphi\rangle$ by the integral $\int T(x)\,\varphi(x)\,\mathrm d x$ without even checking whether $T$ is a measure or not, and more generally to use notation as if every object were a function (derivatives, shifts, changes of variable). Sometimes, when things become subtle, it is better to come back to the precise definitions.
