Solved – Loss functions that act on real-valued output vectors (and NOT just on 1-hot vectors)

loss-functions, lstm, recurrent neural network, torch

I am trying to modify Andrej Karpathy's char-RNN code. As far as I understand, the loss function used in his code for an LSTM is the Softmax function (in the file model/LSTM.lua). I understand Softmax is the multi-class equivalent of the logistic loss function (used for 2-class classification).

The site here says that the Softmax function must have only ONE of its inputs high (i.e., the output of the network should be a 1-hot vector before applying Softmax).

I want to modify the code to train an LSTM that outputs a real-valued vector (i.e., if there are C classes, a C-dimensional vector of floats) and then apply an appropriate loss function on top of it.

My question is: which loss functions are suitable for real-valued vector inputs, and how can I modify the LSTM in the char-RNN implementation to use one?
(It would be helpful if such a function already exists in the Lua/Torch libraries. If it doesn't, it would be really helpful if you could provide the appropriate Lua code and integrate it with the LSTM.lua file. I am new to Torch/Lua programming and to deep learning in general, and would really appreciate you taking the time to help me. Please forgive me if some of my queries seem foolish.)

Best Answer

If you have a continuous output that is not restricted to a certain interval, e.g. in a regression problem, you could use one of the regression criterions. The mean squared error (MSE) is often used, and it is already implemented in Torch/Lua as nn.MSECriterion.
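
As a minimal sketch of the above, assuming Torch7 with the nn package installed: nn.MSECriterion compares a real-valued output vector with a real-valued target of the same size. The two tensors below are made-up stand-ins for one C-dimensional network output and its target (C = 3 here); they are not values taken from char-RNN.

```lua
require 'torch'
require 'nn'  -- Torch7 neural network package

-- Hypothetical stand-ins for a single network output and its target.
local output = torch.Tensor{0.5, -1.0, 2.0}
local target = torch.Tensor{1.0, -1.0, 1.0}

local criterion = nn.MSECriterion()

-- Forward pass: mean of the squared element-wise differences.
local loss = criterion:forward(output, target)

-- Backward pass: gradient of the loss with respect to the output,
-- which would then be backpropagated through the network.
local gradOutput = criterion:backward(output, target)

print(loss)
print(gradOutput)
```

To use this in char-RNN, the network would presumably have to emit raw values rather than log-probabilities (for example by dropping a final nn.LogSoftMax layer, if the model has one), and the criterion in the training script would be swapped for nn.MSECriterion. That is an assumption about how the code is laid out, not something verified here.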
