# Machine Learning – Validity of Unique True Labels for Given Log Loss Value

Tags: classification, log-loss, loss-functions, machine-learning, supervised-learning

Aggarwal's 2021 ICML paper "Label Inference Attacks from Log-loss Scores" seems to argue that the answer to the question in the title is "YES". The paper claims that, given predicted probability values (such as from a logistic regression or neural network) and the log-loss calculated from those predictions and the true labels, there is an algorithm that will produce the true labels.

This would mean that the label vector giving such a log-loss must be unique. If it were not, then $$\left\{dog, cat, dog, cat\right\}$$ and $$\left\{dog, dog, cat, cat\right\}$$ could result in the same log-loss value.

How would the algorithm be able to pick which set of labels is the correct one?

Coding $$dog$$ as $$0$$ and $$cat$$ as $$1$$, I believe that I have such an example.

library(ModelMetrics)
p  <- c(0.2, 0.8, 0.8, 0.2)
y0 <- c(0, 1, 0, 1) # dog, cat, dog, cat
y1 <- c(0, 0, 1, 1) # dog, dog, cat, cat
ModelMetrics::logLoss(y0, p) == ModelMetrics::logLoss(y1, p)


Both log-loss values are the same.
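(The equality can also be seen by hand from the usual binary log-loss formula: each observation contributes $$-\left[y_i \ln p_i + (1-y_i)\ln(1-p_i)\right]$$, and with this $$p$$ both labelings produce the same multiset of per-observation terms, $$\left\{-\ln 0.8,\, -\ln 0.8,\, -\ln 0.2,\, -\ln 0.2\right\}$$, so the two averages must agree.)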

The "Results on Real Binary Classification Datasets" section sure makes it look like we can just take a vector of labels and a vector of claimed probability values; calculate the log-loss from the claimed probability values and labels; and perfectly reconstruct the labels, as implied by the "As our attacks construct…" sentence. Yet that is impossible for the p that I gave: there are multiple plausible label vectors (y0 and y1).

ICML is one of the top conferences in machine learning, so I do not want to be too quick to dismiss the Aggarwal paper. Nonetheless, I seem to have developed a counterexample.

What's going on?

EDIT

After seeing some answers posted, the paper's claim sounds more plausible, but it still seems to have counterexamples. Consider the example after Theorem 1, where $$L_v(\sigma)$$ is the log-loss for true labels $$\sigma$$ and predictions $$v$$.

> As an example, for $$N = 5$$, let $$v = [2, 3, 5, 7, 11]$$. Suppose the true labeling is $$\sigma = [0, 1, 1, 0, 1]$$. Then, the adversary observes $$L_v(\sigma) = \frac{1}{5} \ln\left(2304/55\right)$$ (obtained by plugging in $$v$$ and $$\sigma$$ in Equation 2). For reconstructing the labels, observe that $$T = 3\times4\times6\times8\times12 = 6912$$, so that all we need is to compute the primes that divide $$T \exp(-N L_v(\sigma)) = 165 = 3 \times 5 \times 11$$. This tells us that only the labels for the second, third and fifth datapoints must be 1, which is indeed true.

However, I get multiple possible $$\sigma$$ values that correspond to the same log-loss value.

library(ModelMetrics)

p  <- c(2, 3, 5, 7, 11)
sigma0 <- c(0, 1, 1, 0, 1) # dog, cat, cat, dog, cat
sigma1 <- c(0, 0, 1, 1, 1) # dog, dog, cat, cat, cat
ModelMetrics::logLoss(sigma0, p) == ModelMetrics::logLoss(sigma1, p)


EDIT 2

Yes, I did make a mistake. This is the correct simulation, and it gives distinct loss values.

v  <- c(2, 3, 5, 7, 11) # The prime number code from the paper
p <- v/(1 + v) # Transform to probability values
sigma0 <- c(0, 1, 1, 0, 1) # dog, cat, cat, dog, cat
sigma1 <- c(0, 0, 1, 1, 1) # dog, dog, cat, cat, cat
ModelMetrics::logLoss(sigma0, p) == ModelMetrics::logLoss(sigma1, p) # Unequal!
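For completeness, the paper's reconstruction can be checked end to end once the scores are mapped to probabilities. A quick sketch (in Python rather than R, standard library only; the names `T` and `M` are just my labels for the quantities in the quoted example):

```python
import math

v = [2, 3, 5, 7, 11]          # prime scores from the paper's example
sigma = [0, 1, 1, 0, 1]       # true labels
N = len(v)

p = [vi / (1 + vi) for vi in v]   # probabilities implied by the scores

# Standard binary log-loss, averaged over the N observations
L = -sum(s * math.log(pi) + (1 - s) * math.log(1 - pi)
         for s, pi in zip(sigma, p)) / N

T = math.prod(1 + vi for vi in v)     # 3*4*6*8*12 = 6912
M = round(T * math.exp(-N * L))       # 6912 * (55/2304) = 165

# A label is 1 exactly when its prime divides M
recovered = [int(M % vi == 0) for vi in v]
print(M, recovered)   # 165 [0, 1, 1, 0, 1]
```

The recovered labels match $$\sigma$$ exactly, consistent with the paper's claim once valid probabilities are used.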