Solved – Neural network: restrict output range based on input

loss-functions, neural-networks

My goal is to implement a neural net that outputs a range of values, with an additional restriction: based on an input value selected from among the features, certain output values are forbidden.

For example, suppose an input value is in the range 1 to 100. The output would be forced to be greater than that input value but less than 100. Another example would be to restrict the output to the input ±10.

Would simply adding +Inf to the loss for output values in the forbidden range accomplish this?

Best Answer

One way to code such constraints is using something akin to an SVM hinge loss. For $f_i(x)$ to be greater than $x_i$, this is essentially equivalent to adding a loss term $\alpha_i\max(x_i-f_i(x),0)$, where $\alpha_i$ is a hyperparameter. To make this more stable, typically you'd add a separation (margin) hyperparameter: $\alpha_i\max(x_i-f_i(x)+\beta_i,0)$. You can add strict upper/lower bounds in a similar way.

In general this is a soft constraint: you are only encouraging your model to make the output bigger than the input, and sometimes it will fail to do so. By making $\alpha_i$ very large, the model will do its best to avoid such violations. In theory, if $\alpha_i$ is big enough and your neural network is big enough, it will find a way to satisfy the constraints exactly. But this heavily risks over-regularizing the network (are you more worried about accuracy, or about the output escaping your constraints?).
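If it helps to see that penalty in code, here is a minimal sketch in TensorFlow (the function name `constrained_loss` and the default values of `alpha` and `beta` are illustrative, not part of the answer; in Keras you would also need some mechanism, such as `model.add_loss` or packing `x` into `y_true`, to give the loss access to the input feature):

```python
import tensorflow as tf

# Sketch of the hinge-style soft constraint "output should exceed input".
# y_true: regression target, y_pred: network output f(x),
# x: the input feature that the output must stay above.
def constrained_loss(y_true, y_pred, x, alpha=10.0, beta=0.1):
    mse = tf.reduce_mean(tf.square(y_true - y_pred))  # base loss
    # Penalty is zero once f(x) >= x + beta, and grows linearly with the
    # size of the violation otherwise: alpha * max(x - f(x) + beta, 0).
    hinge = tf.reduce_mean(tf.maximum(x - y_pred + beta, 0.0))
    return mse + alpha * hinge
```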

If you want to strictly force your output to be bigger than the input, you might need an ad hoc activation function in your last layer that is also connected to your inputs. So, for example, if $z_i$ are the outputs of the prior layer, maybe something like:

$$a(Wz+x)$$

where the weight matrix $W$ is restricted to have only positive coefficients, and $a$ is monotonically increasing. Packages like Keras have options to impose such weight restrictions. If you want to further restrict the range of the output, you can play around with the activation function's range. For example, if $a(x)$ is a sigmoid, its values are in $(0,1)$, but $99a(x)+1$ takes values in $(1,100)$.
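Here is a sketch of how that last layer might be wired up in Keras, under the assumptions that there is a single scalar bounding feature `x` plus a handful of other features, and using the $(1,100)$ scaled sigmoid from the example (all layer names and sizes are illustrative):

```python
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

x_in = keras.Input(shape=(1,), name="x")       # the bounding input feature
feats = keras.Input(shape=(8,), name="feats")  # remaining features (width assumed)

# Hidden layer producing z, the inputs to the constrained output layer.
z = layers.Dense(16, activation="relu")(layers.concatenate([x_in, feats]))

# Wz with W constrained to non-negative entries via Keras's NonNeg
# constraint, then x added directly, giving the pre-activation Wz + x.
wz = layers.Dense(1, use_bias=False,
                  kernel_constraint=keras.constraints.NonNeg())(z)
pre = layers.add([wz, x_in])

# Monotone activation rescaled into (1, 100): 99 * sigmoid(t) + 1.
out = layers.Lambda(lambda t: 99.0 * tf.sigmoid(t) + 1.0)(pre)

model = keras.Model(inputs=[x_in, feats], outputs=out)
model.compile(optimizer="adam", loss="mse")
```

The NonNeg constraint together with the monotone activation makes the output non-decreasing in both $z$ and the input $x$, while the rescaled sigmoid pins the output into $(1,100)$.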