Suppose the target output of my data prediction model is an $M\times N$ matrix in which $95\%$ of the values are $0.0$ and the remaining values lie anywhere between $0.0$ and $1.0$. What would be a good loss function to use for this kind of data?
As long as my model outputs mostly $0$s, the MSE is really small even at the start of training (about $10^{-3}$), and the model has a hard time learning the values that are greater than $0$.
Any ideas? Thanks!
Best Answer
Could you use an asymmetric loss, i.e. one where the cost of predicting zero when the target is non-zero differs from the cost of predicting non-zero when the target is zero?
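
One possible reading of that idea is a weighted MSE where the weight depends on whether the target entry is zero. The sketch below is only an illustration in NumPy: the function name `asymmetric_mse` and the weights `w_nonzero` and `w_zero` are made up for this example, and the default of 19 is just the 95:5 class ratio from the question, not a tuned value.

```python
import numpy as np

def asymmetric_mse(y_true, y_pred, w_nonzero=19.0, w_zero=1.0):
    """Weighted MSE: errors on non-zero targets cost more than errors on zero targets.

    w_nonzero and w_zero are hypothetical hyperparameters; 19 ~= 0.95 / 0.05
    roughly balances the two groups when 95% of the targets are zero.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    # Pick a per-element weight based on the target, not on the prediction.
    weights = np.where(y_true > 0.0, w_nonzero, w_zero)
    # Average the weighted squared errors over all M x N entries.
    return float(np.mean(weights * (y_pred - y_true) ** 2))
```

With this weighting, a model that predicts all zeros no longer starts out with a tiny loss, because the missed non-zero entries dominate the sum. Setting both weights to 1 gives back the ordinary MSE, so the weight ratio can be treated as a knob to tune.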