The curvature of the cost surface with these particular inputs and outputs makes this a somewhat pathological example. A 'good' solution exists that simply outputs 0.333 for every input, and a small step away from it in the correct direction for one of the inputs is likely cancelled out by a larger increase in cost for one of the other inputs.
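To see why a constant output of 0.333 is such a tempting resting place, here is a quick standalone check (an illustration only, separate from the model below): the mean cross-entropy over the three labels [0, 1, 0] is minimised by predicting their mean, 1/3, for every input.

import numpy as np

labels = np.array([0.0, 1.0, 0.0])

def mean_cross_entropy(p):
    # Same cost as in the code below: mean of -(y*log(p) + (1-y)*log(1-p))
    return np.mean(-(labels * np.log(p) + (1 - labels) * np.log(1 - p)))

# The best constant prediction for these labels is their mean, 1/3
for p in [0.2, 1/3, 0.5]:
    print(p, mean_cross_entropy(p))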
Still, if you make your weight initialisation more sensible (it should be centred at zero) and standardise your inputs, you can get this to work:
import tensorflow as tf
import numpy as np
sess = tf.InteractiveSession()
INPUTS_AMOUNT = 1
HIDDEN_NODES_AMOUNT = 10
HIDDEN_NODES_AMOUNT_2 = 10
OUTPUTS_AMOUNT = 1
# define placeholder for input and output
x_ = tf.placeholder(tf.float32, shape=[None, INPUTS_AMOUNT], name="x-input")
y_ = tf.placeholder(tf.float32, shape=[None,OUTPUTS_AMOUNT], name="y-input")
# Since we're using a relu, the weights are initialised appropriately to avoid dead (negative) neurons
### FIX WEIGHTS INITIALISATION
W = tf.Variable(tf.random_uniform([INPUTS_AMOUNT, HIDDEN_NODES_AMOUNT], -0.1, 0.1))
b = tf.Variable(tf.zeros([HIDDEN_NODES_AMOUNT]))
hidden = tf.nn.relu(tf.matmul(x_,W) + b)
### FIX WEIGHTS INITIALISATION
W1 = tf.Variable(tf.random_uniform([HIDDEN_NODES_AMOUNT, HIDDEN_NODES_AMOUNT_2], -0.1, 0.1))
b1= tf.Variable(tf.zeros([HIDDEN_NODES_AMOUNT_2]))
hidden1 = tf.nn.relu(tf.matmul(hidden,W1) + b1)
### FIX WEIGHTS INITIALISATION
W2 = tf.Variable(tf.random_uniform([HIDDEN_NODES_AMOUNT_2,OUTPUTS_AMOUNT], -0.1, 0.1))
b2 = tf.Variable(tf.zeros([OUTPUTS_AMOUNT]))
hidden2 = tf.matmul(hidden1, W2) + b2
y = tf.nn.sigmoid(hidden2)
# Training function allows for error calculations for value between 0 and 1
cost = tf.reduce_mean(( (y_ * tf.log(y)) + ((1 - y_) * tf.log(1.0 - y)) ) * -1)
train_step = tf.train.GradientDescentOptimizer(0.01).minimize(cost)
# Specify the data to go into the placeholders
INS = [[0.9], [1.0], [1.1]]
### Z-SCORE INPUTS
INS = np.array(INS)
INS = (INS-INS.mean())/INS.std()
OUTS = [[0], [1], [0]]
init = tf.global_variables_initializer()
sess.run(init)
# Train on the input data, one example at a time; it doesn't actually need 100000 iterations to converge
for i in range(100000):
    sess.run(train_step, feed_dict={x_: [INS[i % 3]], y_: [OUTS[i % 3]]})
    if i % 2000 == 0:
        print('Output for debugging', sess.run(y, feed_dict={x_: INS}))
Without knowing a lot more about the model, or the data used, it is hard to answer these questions with any rigour. That aside, the values you provide make me think it is a reasonable model that does not necessarily overfit the training data.
For your second question, my first course of action would always be to plot the training and test accuracy over each epoch (iteration), then look at how the curves develop. I generally hope to see a test curve that shadows the training curve, always a little lower. Here is a diagram with a short explanation, taken from the excellent CS231n course from Stanford.
Image source
Course Homepage
All the material and video lectures are freely available and would be a great place for you to improve your understanding whilst working on Deep Learning topics.
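As a minimal sketch of what I mean by plotting the two curves (the accuracy lists here are made-up placeholders; in practice you would record one value at the end of each epoch of your own training run):

import matplotlib.pyplot as plt

# Hypothetical per-epoch accuracies; replace with the values logged during training
train_acc = [0.60, 0.72, 0.80, 0.85, 0.88, 0.90, 0.91, 0.92]
test_acc  = [0.58, 0.69, 0.76, 0.80, 0.82, 0.83, 0.83, 0.84]

epochs = range(1, len(train_acc) + 1)
plt.plot(epochs, train_acc, label='training accuracy')
plt.plot(epochs, test_acc, label='test accuracy')
plt.xlabel('epoch')
plt.ylabel('accuracy')
plt.legend()
plt.show()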
Best Answer
A dirt-simple solution is to add a regularization term, so your loss function is $\text{loss} + \lambda \text{ReLU} (i_3 - O)$. This adds a penalty whenever your inequality is violated, so the model will tend to respect the constraint.
While this solution is inexact, solving the problem exactly would be more challenging, because constrained optimization is not something NN libraries are designed for.
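For instance, in the same TF1 style as the code above (a sketch only: i_3 here stands for whichever tensor holds the relevant input in your graph, y for the model output, and lam is a hand-tuned penalty weight; the constraint is assumed to be O >= i_3):

# Hypothetical: the input feature appearing in the constraint
i_3 = x_[:, 0:1]
lam = 10.0  # penalty strength; increase it if the constraint is still being violated

# ReLU(i_3 - y) is zero when y >= i_3 and grows linearly with the violation otherwise
penalty = tf.reduce_mean(tf.nn.relu(i_3 - y))
penalised_cost = cost + lam * penalty
train_step = tf.train.GradientDescentOptimizer(0.01).minimize(penalised_cost)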
Some related solutions:
Loss function in machine learning - how to constrain?