Solved – Can a neural network learn a functional and its functional derivative?

Tags: derivative, function, machine-learning, neural-networks

I understand that neural networks (NNs) can be considered universal approximators to both functions and their derivatives, under certain assumptions (on both the network and the function to approximate). In fact, I have done a number of tests on simple, yet non-trivial functions (e.g., polynomials), and it seems that I can indeed approximate them and their first derivatives well (an example is shown below).

What is not clear to me, however, is whether the theorems that lead to the above extend (or perhaps could be extended) to functionals and their functional derivatives. Consider, for example, the functional:
\begin{equation}
F[f(x)] = \int_a^b dx ~ f(x) g(x)
\end{equation}

with the functional derivative:
\begin{equation}
\frac{\delta F[f(x)]}{\delta f(x)} = g(x)
\end{equation}

where the value of $F$ depends entirely, and non-trivially, on $g(x)$. Can a NN learn the above mapping and its functional derivative? More specifically, if one discretizes the domain $x$ over $[a,b]$ and provides $f(x)$ (at the discretized points) as input and $F[f(x)]$ as output, can a NN learn this mapping correctly (at least theoretically)? If so, can it also learn the mapping's functional derivative?
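One observation worth making explicit: the functional above is *linear* in $f$, so after discretization it is exactly representable by a single linear layer with no bias, whose weights are $g(x_i)\,\Delta x$. The gradient of that layer's output with respect to its inputs is just the weight vector, so dividing by $\Delta x$ recovers the functional derivative $g(x)$. The sketch below demonstrates this with an ideal (hand-set, not trained) layer; the choices $g(x) = \sin x$, $f(x) = x^2$, and $[a,b] = [0,1]$ are illustrative.

```python
# Sketch: F[f] = ∫_a^b f(x) g(x) dx is linear in f, so on a grid it is exactly
# a linear layer with weights w_i = g(x_i) * Δx. The input-gradient of the
# layer is w, and w_i / Δx recovers the functional derivative g(x_i).
# (g(x) = sin(x) and f(x) = x^2 are illustrative choices, not from the post.)
import math

a, b, n = 0.0, 1.0, 101
dx = (b - a) / (n - 1)
xs = [a + i * dx for i in range(n)]

g = [math.sin(x) for x in xs]       # the kernel the NN would have to learn
w = [gi * dx for gi in g]           # weights of the ideal linear layer

def F(f_vals):
    """Forward pass of the ideal linear layer: F[f] ≈ Σ_i w_i f(x_i)."""
    return sum(wi * fi for wi, fi in zip(w, f_vals))

f_vals = [x ** 2 for x in xs]
approx = F(f_vals)

# Closed form by integration by parts: ∫_0^1 x^2 sin(x) dx = 2 sin(1) + cos(1) - 2
exact = 2 * math.sin(1) + math.cos(1) - 2
print(abs(approx - exact))          # small quadrature error

# The gradient of F with respect to the inputs is w, so the recovered
# functional derivative is w_i / Δx = g(x_i):
recovered = [wi / dx for wi in w]
print(max(abs(r - gi) for r, gi in zip(recovered, g)))
```

A trained network would only approximate these weights, but this is the mapping it is being asked to find, and it shows what "reading off the functional derivative" means for the discretized problem.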

I have done a number of tests, and it seems that a NN can indeed learn the mapping $F[f(x)]$, to some extent. However, while the accuracy of this mapping is acceptable, it is not great; more troubling, the computed functional derivative is complete garbage (though both issues could be related to problems with training, etc.). An example is shown below.

If a NN is not suitable for learning a functional and its functional derivative, is there another machine learning method that is?

Examples:

(1) The following is an example of approximating a function and its derivative: A NN was trained to learn the function $f(x) = x^3 + x + 0.5$ over the range [-3,2]:
[plot: NN approximation of $f(x)$]

from which a reasonable approximation to $df(x)/dx$ is obtained:
[plot: NN approximation of $df(x)/dx$]

Note that, as expected, the NN approximations to $f(x)$ and its first derivative improve with the number of training points, with the NN architecture, and as better minima are found during training.
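For reference, the derivative in example (1) is obtained from the trained network analytically via the chain rule, not by differencing the training data. The sketch below shows this for a one-hidden-layer tanh network; the weights are arbitrary illustrative values (not the trained network from the plot), and a central finite difference confirms the chain-rule expression.

```python
# Sketch: input-derivative of net(x) = Σ_j v_j tanh(w_j x + b_j) + c via the
# chain rule, checked against a central finite difference.
# (Weights are illustrative placeholders, not a trained model.)
import math

w = [1.5, -0.7, 0.3]
b = [0.1, -0.2, 0.4]
v = [0.8, 1.2, -0.5]
c = 0.25

def net(x):
    return sum(vj * math.tanh(wj * x + bj)
               for vj, wj, bj in zip(v, w, b)) + c

def dnet(x):
    # d/dx tanh(u) = (1 - tanh(u)^2) * du/dx
    return sum(vj * (1 - math.tanh(wj * x + bj) ** 2) * wj
               for vj, wj, bj in zip(v, w, b))

x0, h = 0.6, 1e-6
fd = (net(x0 + h) - net(x0 - h)) / (2 * h)
print(abs(fd - dnet(x0)))   # finite difference agrees with the chain rule
```

In an autodiff framework (e.g. PyTorch or JAX) this input-derivative is what `grad` of the network output with respect to its input computes.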

(2) The following is an example of approximating a functional and its functional derivative: A NN was trained to learn the functional $F[f(x)] = \int_1^2 dx ~ f(x)^2$. Training data was obtained using functions of the form $f(x) = a x^b$, where $a$ and $b$ were randomly generated. The following plot illustrates that the NN is indeed able to approximate $F[f(x)]$ quite well:
[plot: NN approximation of $F[f(x)]$]

Calculated functional derivatives, however, are complete garbage; an example (for a specific $f(x)$) is shown below:
[plot: computed functional derivative for a specific $f(x)$]

As an interesting note, the NN approximation to $F[f(x)]$ seems to improve with the number of training points, etc. (as in example (1)), yet the functional derivative does not.
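For example (2) the target is known in closed form, $\delta F / \delta f = 2 f(x)$, which gives a concrete check on the derivative-extraction procedure itself. The sketch below hand-codes an exact "network" for the discretized $F[f] = \int_1^2 f(x)^2\,dx$ and shows that its input-gradient, divided by the quadrature weight $\Delta x$, recovers $2 f(x_i)$; the particular sample $f(x) = 1.3\,x^{0.5}$ is an illustrative instance of the $a x^b$ family.

```python
# Sketch: if a model represented F[f] = ∫_1^2 f(x)^2 dx exactly on a grid,
# the gradient of its scalar output w.r.t. the input vector f(x_i), divided
# by Δx, would be the discretized functional derivative 2 f(x_i).
# (f(x) = 1.3 * x**0.5 is one illustrative sample from the a*x**b family.)
a, b, n = 1.0, 2.0, 201
dx = (b - a) / (n - 1)
xs = [a + i * dx for i in range(n)]

def F(f_vals):
    """Quadrature 'forward pass': F[f] ≈ Σ_i f(x_i)^2 Δx."""
    return sum(fi * fi for fi in f_vals) * dx

def grad_F(f_vals):
    """Exact input-gradient: ∂F/∂f_i = 2 f_i Δx."""
    return [2 * fi * dx for fi in f_vals]

f_vals = [1.3 * x ** 0.5 for x in xs]
recovered = [gi / dx for gi in grad_F(f_vals)]   # discretized δF/δf
expected = [2 * fi for fi in f_vals]             # closed form 2 f(x)
print(F(f_vals))                                 # ≈ 1.69 * ∫_1^2 x dx = 2.535
print(max(abs(r - e) for r, e in zip(recovered, expected)))
```

A trained NN only approximates this quadratic form, and a good fit of the scalar output does not force a good fit of the input-gradient, which is consistent with the observation that $F[f(x)]$ improves with more data while the derivative does not.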

Best Answer

This is a good question. I think answering it would require a theoretical mathematical proof. I have been working with deep learning (essentially neural networks) for about a year, and based on the papers I have read, I have not yet seen such a proof. In terms of experimental evidence, however, I think I can offer some feedback.

Let's consider the example below:

[image not available]

In this example, I believe a multi-layer neural network should be able to learn both $f(x)$ and $F[f(x)]$ via back-propagation. Whether this extends to more complicated functions, or to all functions, would require further proof. Consider, though, the ImageNet competition, where the task is to classify 1000 object categories: very deep neural networks are typically used, and the best models achieve an impressive error rate of roughly 5%. Such deep NNs contain more than 10 non-linear layers, which is experimental evidence that complicated relationships can be represented by deep networks [building on the fact that a NN with one hidden layer can already separate data non-linearly].

But whether ALL derivatives can be learned requires more research.

I am not sure whether any machine learning method can learn a function and its derivative completely. Sorry about that.
