Solved – Best loss function for very sparse real-valued data

loss-functions, mathematical-statistics, predictive-models

Suppose the target output of my prediction model is an $M\times N$ matrix in which $95\%$ of the values are $0.0$ and the remaining values lie anywhere between $0.0$ and $1.0$. What would be a good loss function to use for this kind of data?

As long as my model outputs a lot of $0$'s, the MSE is really small even at the start (about $10^{-3}$), and the model has a hard time learning the values that are greater than $0$.
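A rough way to see why the MSE is so small: assume, purely for illustration, a model that predicts exactly $0$ everywhere. Writing $y_{ij}$ for the targets, $p$ for the fraction of non-zero entries, and $\overline{y^2_{\neq 0}}$ for the mean square of the non-zero targets (these symbols are introduced here, not part of the original setup):

$$\mathrm{MSE}_0 = \frac{1}{MN}\sum_{i=1}^{M}\sum_{j=1}^{N} y_{ij}^2 = p\,\overline{y^2_{\neq 0}}, \qquad p = 0.05$$

Since every $y_{ij} \le 1$, this can never exceed $0.05$. Under the same all-zero assumption, an MSE of about $10^{-3}$ would correspond to $\overline{y^2_{\neq 0}} \approx 0.02$, i.e. non-zero targets with a root-mean-square size of roughly $0.14$.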

Any ideas? Thanks!

Best Answer

Could you use an asymmetric loss, i.e. one where the cost of predicting zero when the target is non-zero differs from the cost of predicting non-zero when the target is zero?
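One possible reading of this suggestion is a weighted MSE in which errors on non-zero targets are penalised more heavily than errors on zero targets. The sketch below uses NumPy. The function name `asymmetric_mse`, the parameters `miss_weight` and `false_alarm_weight`, their default values, and the normalisation by the sum of weights are all assumptions made for illustration, not something specified in the answer.

```python
import numpy as np


def asymmetric_mse(y_true, y_pred, miss_weight=10.0, false_alarm_weight=1.0):
    """Weighted MSE: squared errors on non-zero targets are scaled by
    miss_weight, squared errors on zero targets by false_alarm_weight.
    (Hypothetical helper; names and defaults are illustrative.)"""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    # Per-entry weight chosen from the target, not from the prediction.
    weights = np.where(y_true != 0.0, miss_weight, false_alarm_weight)
    # Normalise by the total weight so the result stays on an MSE-like scale.
    return float(np.sum(weights * (y_true - y_pred) ** 2) / np.sum(weights))


# Toy 4x5 target in which only 2 of the 20 entries are non-zero.
y_true = np.zeros((4, 5))
y_true[0, 1] = 0.8
y_true[2, 3] = 0.4

# A degenerate model that predicts zero everywhere.
all_zero = np.zeros_like(y_true)

plain_mse = float(np.mean((y_true - all_zero) ** 2))
weighted = asymmetric_mse(y_true, all_zero)
print(plain_mse, weighted)
```

On this toy target the all-zero prediction scores noticeably worse under the weighted loss than under plain MSE, which is the intended effect. A natural starting point for the weight ratio is the inverse of the class frequencies (about 19:1 for 95% zeros), but the right value would need tuning on the actual data.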