Functional Analysis – Square Integrable for Universal Approximation

approximation-theory, dense-subspaces, functional-analysis, hilbert-spaces, lp-spaces

Consider square-integrable functions $f \in L^2\left(I_n\right)$ together with the following definition of a $\textit{discriminatory}$ activation function:

$\textbf{Definition:}$ The activation function $\sigma$ is called discriminatory in the $L^2$ sense if:

(i) $0 \leq \sigma \leq 1$;

(ii) if $g \in L^2\left(I_n\right)$ satisfies
$$
\int_{I_n} \sigma\left(w^T x+\theta\right) g(x) d x=0, \quad \forall w \in \mathbb{R}^n, \theta \in \mathbb{R}
$$

then $g=0$ almost everywhere.
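Condition (ii) can be read in contrapositive form: for any non-zero $g$, some pair $(w, \theta)$ must give a non-zero integral. The following is a minimal numerical sketch of that reading, not a proof. It assumes $n = 1$, $I_1 = [0, 1]$, the logistic sigmoid as $\sigma$, and a test function $g(x) = \sin(2\pi x)$; the names `largest_pairing` and `g_example` are hypothetical and chosen only for illustration.

```python
import numpy as np


def sigmoid(t):
    """Logistic function; it satisfies 0 <= sigma <= 1 (condition (i))."""
    return 1.0 / (1.0 + np.exp(-t))


def largest_pairing(g, w_values, theta_values, num_grid=4001):
    """Largest |integral over [0, 1] of sigmoid(w x + theta) g(x) dx| on a grid.

    Assumption: I_1 = [0, 1]; the integral is approximated by a grid average.
    """
    x = np.linspace(0.0, 1.0, num_grid)
    gx = g(x)
    best = 0.0
    for w in w_values:
        for theta in theta_values:
            integrand = sigmoid(w * x + theta) * gx
            value = abs(float(np.mean(integrand)))  # approximates the integral
            best = max(best, value)
    return best


def g_example(x):
    """A non-zero element of L^2([0, 1]) used as the test function g."""
    return np.sin(2.0 * np.pi * x)


if __name__ == "__main__":
    grid = np.linspace(-20.0, 20.0, 41)
    print("largest |integral| found:", largest_pairing(g_example, grid, grid))
```

A clearly non-zero value here only shows that this particular $g$ is detected by some $(w, \theta)$ on the grid; the definition requires this for every non-zero $g \in L^2(I_n)$.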

$\textbf{Lemma:}$ Let $\sigma$ be a discriminatory function in the $L^2$ sense. Then the finite sums of the form
$$
G(x)=\sum_{j=1}^N \alpha_j \sigma\left(w_j^T x+\theta_j\right), \quad N \in \mathbb{N},\ w_j \in \mathbb{R}^n,\ \theta_j, \alpha_j \in \mathbb{R}
$$

are dense in $L^2\left(I_n\right)$.
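To get a feel for what the lemma claims, here is a small numerical sketch, offered as an illustration rather than evidence for the general statement. It assumes $n = 1$, $I_1 = [0, 1]$, the logistic sigmoid as $\sigma$, randomly drawn $w_j, \theta_j$, and a step function as the target; the helper names `best_fit_errors` and `step` are hypothetical. For each $N$ it finds the coefficients $\alpha_j$ by least squares and reports an approximate $L^2$ error.

```python
import numpy as np


def sigmoid(t):
    """Logistic function, taking values in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-t))


def best_fit_errors(f, sizes, seed=0, num_grid=2001, scale=30.0):
    """Approximate L^2([0, 1]) error of the least-squares fit of
    G(x) = sum_j alpha_j * sigmoid(w_j * x + theta_j) to f, for each N in sizes.

    Assumption: w_j and theta_j are drawn uniformly from [-scale, scale];
    the first N draws are reused for each N, so the feature sets are nested.
    """
    rng = np.random.default_rng(seed)
    x = np.linspace(0.0, 1.0, num_grid)
    n_max = max(sizes)
    w = rng.uniform(-scale, scale, size=n_max)
    theta = rng.uniform(-scale, scale, size=n_max)
    features = sigmoid(np.outer(x, w) + theta)  # shape (num_grid, n_max)
    target = f(x)
    errors = []
    for n in sizes:
        # Solve for alpha_1, ..., alpha_n minimising the discrete L^2 error.
        alpha, *_ = np.linalg.lstsq(features[:, :n], target, rcond=None)
        residual = features[:, :n] @ alpha - target
        errors.append(float(np.sqrt(np.mean(residual ** 2))))
    return errors


def step(x):
    """Discontinuous target: indicator of (0.5, 1]."""
    return (x > 0.5).astype(float)


if __name__ == "__main__":
    sizes = [2, 8, 32, 128]
    for n, err in zip(sizes, best_fit_errors(step, sizes)):
        print(f"N = {n:3d}: approximate L2 error = {err:.4f}")
```

Because the feature sets are nested, the error should not increase with $N$ up to rounding; the lemma is the much stronger statement that, with $w_j$ and $\theta_j$ also free, the error can be made arbitrarily small for every $f \in L^2(I_n)$.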

Proof

I want to prove this by contradiction. Assume that the set of such finite sums is not dense in $L^2(I_n)$. Then there exists a function $f \in L^2(I_n)$ which cannot be approximated arbitrarily closely in the $L^2$ norm by functions of the form $G(x)$.

By condition (i), each $G$ is bounded, so (assuming $\sigma$ is measurable) $G \in L^2(I_n)$, and the set of finite sums is a linear subspace of $L^2(I_n)$. Since $L^2(I_n)$ is a Hilbert space, the closure of this subspace is a closed subspace, and if it is not all of $L^2(I_n)$, the projection theorem gives a non-zero function $g \in L^2(I_n)$ orthogonal to it, in particular to every function of the form $G(x)$. Mathematically, this means:
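The function $g$ can be written down explicitly. As a sketch, write $S$ for the set of finite sums $G$ and $P$ for the orthogonal projection of $L^2(I_n)$ onto the closed subspace $\overline{S}$ (the symbols $S$ and $P$ are introduced here for illustration and do not appear in the original argument). If $f \notin \overline{S}$, then
$$
g := f - Pf \neq 0, \qquad \langle g, G \rangle = \int_{I_n} g(x)\, G(x)\, dx = 0 \quad \text{for all } G \in \overline{S}.
$$
Here $g \neq 0$ because $Pf \in \overline{S}$ while $f \notin \overline{S}$, and the orthogonality is the defining property of the projection onto a closed subspace of a Hilbert space.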
$$\int_{I_n} g(x) \sum_{j=1}^N \alpha_j \sigma(w_j^T x + \theta_j) dx = 0, \quad \forall w_j \in \mathbb{R}^n, \theta_j, \alpha_j \in \mathbb{R}.$$

Due to linearity of integration, this simplifies to:
$$\sum_{j=1}^N \alpha_j \int_{I_n} g(x) \sigma(w_j^T x + \theta_j) dx = 0, \quad \forall w_j \in \mathbb{R}^n, \theta_j, \alpha_j \in \mathbb{R}.$$

Since $N$ and the $\alpha_j$ are arbitrary, we may take $N = 1$ and $\alpha_1 = 1$, which shows that each integral is itself zero:
$$\int_{I_n} g(x) \sigma(w^T x + \theta) dx = 0, \quad \forall w \in \mathbb{R}^n, \theta \in \mathbb{R}.$$

However, by our assumption, $\sigma$ is discriminatory in the $L^2$ sense. This means that the only function $g(x)$ that satisfies the above condition for all $w$ and $\theta$ is the zero function, $g = 0$ almost everywhere in $I_n$.

This is a contradiction because we assumed that $g$ is non-zero in $L^2(I_n)$. Therefore, our initial assumption that the set of such finite sums is not dense in $L^2(I_n)$ must be false. This implies that the finite sums of the form $G(x)$ are indeed dense in $L^2(I_n)$, completing the proof.

I would like to know whether these assumptions and the logic of the proof are rigorous enough, or whether additional steps are needed. Any suggestions or detailed analysis would be much appreciated.

