Solved – Sensitivity Analysis in Deep Neural Networks

feature selection, neural networks, python, sensitivity analysis

Following a question that has already been answered (Extracting weight importance from One-Layer feed-forward network), I am looking for ways to infer the relevance of inputs in neural networks.

Considering a deep network, where reconstructing input importance by going backward through the layers from the output node of interest may be difficult or time consuming, I was wondering whether there is a theoretical framework for performing sensitivity analysis on neural networks: basically, slightly change an input and observe how the output node of interest changes.

Is there a canonical way of performing some sort of sensitivity analysis in neural networks?

I would really welcome some Python code to do so, if there is any.

Best Answer

The sensitivity analysis you suggest corresponds to examining the partial derivatives of the outputs with respect to the inputs. Say the output vector $y \in \mathbb{R}^m$ is given by $y = f(x)$, where $x \in \mathbb{R}^d$ is the input vector and $f$ is the function the network implements. The Jacobian of the outputs w.r.t. the inputs is:

$$J_{ij}(x) = \frac{\partial}{\partial x_j} f_i(x)$$

The Jacobian gives the local rate of change of each output w.r.t. each input, so it tells us how $f$ will behave in response to infinitesimal perturbations. If we start with input $x$ and add an infinitesimal value $\Delta$ to the $j$th input, we expect the $i$th output to increase by $\Delta J_{ij}(x)$.
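Written out, this is just a first-order Taylor approximation (with $e_j$ denoting the $j$th standard basis vector):

$$f_i(x + \Delta\, e_j) \approx f_i(x) + \Delta\, J_{ij}(x)$$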

If $J_{ij}(x)$ has large magnitude, it means that output $i$ is sensitive to input $j$ in the vicinity of $x$. Because $f$ is, in general, nonlinear, this notion of sensitivity depends on the input; it may be large in some regions and near zero in others. If you want some kind of summary measure of how strongly the outputs depend on the inputs, you'd have to aggregate over multiple input values. For example, you could take the absolute value of the Jacobian, averaged over all inputs in the training set (which acts as a surrogate for the expected value w.r.t. the underlying distribution of inputs). Of course, this kind of summary will end up discarding information, so could be misleading in some circumstances.
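Concretely, one such summary is the mean absolute Jacobian over the $N$ training inputs $x^{(1)}, \dots, x^{(N)}$, which serves as a surrogate for the expectation under the input distribution:

$$S_{ij} = \frac{1}{N} \sum_{n=1}^{N} \left| J_{ij}\big(x^{(n)}\big) \right| \approx \mathbb{E}_x\big[\, |J_{ij}(x)|\, \big]$$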

You can use the chain rule to derive an expression for the Jacobian, similarly to how you'd derive the gradient of the loss function w.r.t. the parameters for use with backprop. You can also compute it using automatic differentiation, using a library like Theano, TensorFlow, etc. There's not much reason to perform finite differencing (i.e. actually simulate the perturbation and measure the change in output), unless the function your network implements is nondifferentiable (in which case the Jacobian doesn't exist).
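As a minimal sketch of the automatic-differentiation route, here is how it could look in TensorFlow 2 (the toy network, layer sizes, and random batch are made up for illustration; `tape.batch_jacobian` returns the per-example Jacobian):

```python
import numpy as np
import tensorflow as tf

# Hypothetical small network: d = 4 inputs, m = 3 outputs.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(16, activation="tanh"),
    tf.keras.layers.Dense(3),
])

# A batch of inputs (in practice, use your training set).
x = tf.constant(np.random.randn(32, 4), dtype=tf.float32)

with tf.GradientTape() as tape:
    tape.watch(x)       # x is not a trainable variable, so watch it explicitly
    y = model(x)        # shape (32, 3)

# J has shape (batch, m, d): J[n, i, j] = dy_i/dx_j evaluated at x[n]
J = tape.batch_jacobian(y, x)

# Summary sensitivity: mean absolute Jacobian over the batch, shape (m, d)
sensitivity = tf.reduce_mean(tf.abs(J), axis=0)
print(sensitivity.numpy())
```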

A couple of caveats: if the inputs have different units/scales from each other, the sensitivities will also have different units/scales and can't be compared directly. Standardizing/scaling the inputs is one possible solution. It's also important to keep in mind that this type of analysis tells us about the model itself, not necessarily about the underlying distribution that generated the data. For example, if two inputs are correlated, the model might end up using the first but not the second. In that case, we'd find that the sensitivity is high for the first input and low for the second, but we should not conclude that the first input is inherently more important for predicting the output in general.
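One rough alternative to standardizing the inputs before training (an assumption of this sketch, not something prescribed above): rescale each column of the sensitivity matrix by the standard deviation of the corresponding input, which by the chain rule corresponds to differentiating with respect to standardized inputs and makes the entries comparable across inputs with different units. This reuses the `x` and `sensitivity` tensors from the snippet above.

```python
# Rescale so each column reflects a one-standard-deviation change in that input.
input_std = tf.math.reduce_std(x, axis=0)                     # shape (d,)
scaled_sensitivity = sensitivity * input_std[tf.newaxis, :]   # shape (m, d)
```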

This article should be of interest.
