Solved – How to define multiple losses in machine learning

loss-functions, machine-learning, multitask-learning, tensorflow

I'm using TensorFlow to train a CNN for classification. In machine learning there are many different loss functions, and in general we pick one specific loss for the problem (e.g., binary cross-entropy for binary classification, hinge loss, IoU loss for semantic segmentation, etc.). Suppose I combine multiple losses in one problem, for example:

loss = loss1 + loss2

Does this belong to multi-task learning or to multi-objective optimization? And should we add some trainable linear weights to the losses, for example:

loss = a*loss1 + (1-a)*loss2

Best Answer

Using two losses means that you are interested in optimizing both of them. This can happen because you are doing two different tasks and sharing part of your model between them, or because you are genuinely optimizing in a multi-objective situation. Here are some examples of using multiple losses:

  • Training a classifier with a regularization term (e.g., an L1 or L2 penalty). In this case you are interested in classifying inputs, but you also don't want your weights to grow too large, so you optimize a linear combination of the two losses. (That combination weight is a hyperparameter, and you can tune it with standard hyperparameter-tuning techniques such as cross-validation; see the first sketch after this list.)

  • Training a network that takes one input (images of clothes) and classifies it in two ways: 1) its color, 2) its size. One might think it is better to train two disjoint models here, but in some situations sharing some layers of the neural network helps the model generalize. In that case you optimize the sum of the two losses, and it works nicely in most cases. (A weighting hyperparameter usually isn't necessary, since minimizing the sum of partially independent losses tends to optimize each of them well; see the second sketch after this list.)
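
For the first case, here is a minimal sketch of a classification loss combined with an L2 penalty; the input size, class count, and the value of lam are illustrative assumptions, not from the original post:

import tensorflow as tf  # TF 1.x API, matching the snippets below

x = tf.placeholder(tf.float32, [None, 784])       # illustrative input
labels = tf.placeholder(tf.float32, [None, 10])   # one-hot targets
logits = tf.layers.dense(x, 10)                   # stand-in for your CNN

# Data-fitting term: ordinary cross-entropy
data_loss = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits_v2(labels=labels, logits=logits))
# Regularization term: L2 penalty over all trainable weights
l2_penalty = tf.add_n([tf.nn.l2_loss(w) for w in tf.trainable_variables()])

lam = 1e-4  # combination weight; a hyperparameter to tune, e.g. by cross-validation
loss = data_loss + lam * l2_penalty
train_op = tf.train.AdamOptimizer().minimize(loss)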
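
For the second case, a sketch of a shared backbone with two task-specific heads; the image size and class counts are made up for illustration:

import tensorflow as tf  # TF 1.x API

x = tf.placeholder(tf.float32, [None, 64, 64, 3])  # images of clothes
color_labels = tf.placeholder(tf.int64, [None])
size_labels = tf.placeholder(tf.int64, [None])

# Shared layers
h = tf.layers.conv2d(x, 32, 3, activation=tf.nn.relu)
h = tf.layers.flatten(h)
h = tf.layers.dense(h, 128, activation=tf.nn.relu)

# Task-specific heads
color_logits = tf.layers.dense(h, 10)  # e.g. 10 colors
size_logits = tf.layers.dense(h, 4)    # e.g. 4 sizes

loss1 = tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(
    labels=color_labels, logits=color_logits))
loss2 = tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(
    labels=size_labels, logits=size_logits))
# loss1 and loss2 can now be combined exactly as shown below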

Doing this optimization in TensorFlow is a piece of cake :)

# Sum the two losses and minimize the combined objective
final_loss = tf.add(loss1, loss2)  # equivalently: loss1 + loss2
train_op = tf.train.AdamOptimizer().minimize(final_loss)
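
If you want the weighted combination from the question instead, it is just as direct; note that a here is a fixed hyperparameter rather than a trainable variable (0.7 is an arbitrary illustrative value):

a = 0.7  # trade-off between loss1 and loss2; tune like any other hyperparameter
final_loss = a * loss1 + (1 - a) * loss2
train_op = tf.train.AdamOptimizer().minimize(final_loss)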

Or you may prefer to keep two separate optimizers and run them in a single sess.run call, so both updates are computed from the same batch:

optimizer1 = tf.train.AdamOptimizer().minimize(loss1)
optimizer2 = tf.train.AdamOptimizer().minimize(loss2)
# in training:
_, _, l1, l2 = sess.run(fetches=[optimizer1, optimizer2, loss1, loss2], feed_dict={x: batch_x, y: batch_y})

Or maybe run them in separate calls, giving each loss its own forward pass:

optimizer1 = tf.train.AdamOptimizer().minimize(loss1)
optimizer2 = tf.train.AdamOptimizer().minimize(loss2)
# in training:
_, l1 = sess.run(fetches=[optimizer1, loss1], feed_dict={x: batch_x, y: batch_y})
_, l2 = sess.run(fetches=[optimizer2, loss2], feed_dict={x: batch_x, y: batch_y})

Hope that helps :)