Solved – How to ensure that probabilities sum to 1 within a group when doing binary prediction on group members

boosting, machine learning, normalization, probability

Assume I predicted the probability of samples belonging to the positive class using logistic regression for binary targets.
The samples come in groups of 5, but I'm predicting individually, and I would like to ensure that the probabilities for the positive class within one group sum to one.

For example, the following rows are the predictions for one group:

[prob pos., prob neg.]
[0.1, 0.9]
[0.4, 0.6]
[0.2, 0.8]
[0.1, 0.9]
[0.3, 0.7]

so the sum for the positive label is 1.1.
I'm looking for a machine learning algorithm which naturally fixes this issue, or another method. Right now I'm just normalizing along the column axis, but I think I lose some predictive power doing just that.
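
For reference, here's a minimal sketch of that column-wise normalization, assuming the rows above sit in a NumPy array (variable names are illustrative):

    import numpy as np

    # Predicted [prob pos., prob neg.] rows for one group of 5 samples.
    probs = np.array([
        [0.1, 0.9],
        [0.4, 0.6],
        [0.2, 0.8],
        [0.1, 0.9],
        [0.3, 0.7],
    ])

    # Divide the positive-class column by its group sum so it sums to 1.
    pos_normalized = probs[:, 0] / probs[:, 0].sum()
    print(pos_normalized.sum())  # 1.0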

After doing some research on my own, I came across listwise learning to rank algorithms. Do they fix this? Or can I use the rank implementations in xgboost for that?

Best Answer

The softmax function generalizes the logistic function to multiple outputs: it turns a vector of scores into probabilities that sum to 1. In the same spirit, if you have several predicted probabilities from logistic models within one group, you can divide each probability by the sum of all of them; the adjusted probabilities are then guaranteed to sum to 1.
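
Here's a minimal sketch of both variants, assuming you work with the per-member positive-class probabilities (or their log-odds) from the logistic model; the array values are the example from the question:

    import numpy as np
    from scipy.special import softmax, logit

    # Positive-class probabilities for one group of 5, from the question.
    p = np.array([0.1, 0.4, 0.2, 0.1, 0.3])

    # Variant 1: divide each probability by the group sum.
    p_norm = p / p.sum()

    # Variant 2: softmax over the log-odds (logits) of each member.
    p_soft = softmax(logit(p))

    print(p_norm.sum(), p_soft.sum())  # both 1.0 (up to floating point)

Note that the two variants differ slightly: dividing by the sum preserves the ratios between the probabilities, while softmax over the log-odds preserves the ratios between the odds, so they produce slightly different group distributions.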