Probability – How to Determine Class Membership Probability Given Certain Properties

classification, distributions, probability

This question is a bit difficult to formulate, since I am not an expert in probability.

Assume we have objects with three binary properties, e.g. a ball can be black or white, hard or soft, heavy or light. One might conclude that there are eight possible types of ball, but that's not the case here, because not all combinations are possible. There are a total of four types: black-hard-heavy, black-soft-light, white-hard-light and white-soft-heavy. In other words, any two properties determine the value of the third.

Imagine our senses are a bit dulled so we can't say with certainty whether the ball is black or white, hard or soft, etc.
We take a random ball from a bag and try to tell which properties it has. For example, let's say we are 80% sure the ball is black, 70% sure that it's hard and 60% sure that it's heavy. What is the probability distribution among all four classes?

If all combinations were possible, this would be easy to calculate: for each combination, we would just multiply the three matching probabilities. In this case, however, we know that there are only four types in the bag, and the probabilities of those four classes should add up to 1.

Here are some test cases that I calculated by intuition.

  • 100% black, 50% hard, 50% heavy -> P(black,hard,heavy) = P(black,soft,light) = 50%, the rest is 0%
  • 100% black, 100% hard, >0% heavy -> P(black,hard,heavy) = 100% (or very close to it), the rest is 0%.
  • 100% black, 100% hard, 0% heavy -> P(black,hard,heavy) = P(black,soft,light) = P(white,hard,light) = 33%, P(white,soft,heavy) = 0%. Since we definitely know that there is no such ball as black-hard-light, we must have been wrong about one of the three properties, and each is equally likely to be the wrong one, hence the 33%.
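
One way to make the test cases above concrete is sketched below. It rests on assumptions that are not stated in the question: the three percentages are treated as independent, each allowed type is scored by multiplying the matching percentages, and the scores are then rescaled so they add up to 1. The names `ALLOWED` and `class_distribution` are made up for illustration.

```python
# The four ball types that exist, as (black, hard, heavy) flags.
ALLOWED = {
    "black-hard-heavy": (True, True, True),
    "black-soft-light": (True, False, False),
    "white-hard-light": (False, True, False),
    "white-soft-heavy": (False, False, True),
}


def class_distribution(p_black, p_hard, p_heavy):
    """Score each allowed type by the product of the matching beliefs,
    then rescale so the four scores sum to 1.

    Assumption: the three beliefs are independent of each other.
    Returns None when every allowed type gets a score of zero.
    """
    beliefs = (p_black, p_hard, p_heavy)
    scores = {}
    for name, flags in ALLOWED.items():
        score = 1.0
        for p, flag in zip(beliefs, flags):
            # Use p if the type has the property, otherwise 1 - p.
            score *= p if flag else 1.0 - p
        scores[name] = score
    total = sum(scores.values())
    if total == 0:
        return None  # the beliefs rule out every allowed type
    return {name: score / total for name, score in scores.items()}


# The example from the question: 80% black, 70% hard, 60% heavy.
for name, p in class_distribution(0.8, 0.7, 0.6).items():
    print(f"{name}: {p:.3f}")
```

Under these assumptions the first two test cases come out as expected. The third (100% black, 100% hard, 0% heavy) gives every allowed type a score of zero, so the sketch returns `None` rather than the 33% split; reproducing that split would need an explicit model of how often each sense is wrong.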

It would also be very helpful to generalise this in the following ways:

  • properties are non-binary. E.g. each property could take one of four possible values (e.g. colour could be black, white, blue or green). That would make a total of 64 combinations, but let's say only 16 are possible.
  • the bag might contain unequal numbers of each type, e.g. 50% of all balls are black-hard-heavy, 30% are black-soft-light, 15% are white-hard-light and 5% are white-soft-heavy
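
Both generalisations could be handled by a Bayes-style weighting, sketched here under stated assumptions: each "x% sure" figure is treated as evidence from the senses alone (not already including knowledge of the bag), the properties are perceived independently, and the bag proportions act as prior weights. The function `posterior` and its argument layout are hypothetical, chosen only for this illustration.

```python
def posterior(types, beliefs):
    """Distribution over the allowed types.

    types:   dict mapping a tuple of property values to its share of the bag
    beliefs: one dict per property, mapping each value to how sure we are
             (properties may have any number of values)

    Assumption: beliefs are independent evidence, so the score of a type is
    its share of the bag times the product of the matching beliefs.
    Returns None when every allowed type gets a score of zero.
    """
    scores = {}
    for values, share in types.items():
        score = share
        for belief, value in zip(beliefs, values):
            # A value we hold no belief for counts as ruled out.
            score *= belief.get(value, 0.0)
        scores[values] = score
    total = sum(scores.values())
    if total == 0:
        return None
    return {values: score / total for values, score in scores.items()}


# Bag proportions from the second bullet above.
bag = {
    ("black", "hard", "heavy"): 0.50,
    ("black", "soft", "light"): 0.30,
    ("white", "hard", "light"): 0.15,
    ("white", "soft", "heavy"): 0.05,
}
# The example beliefs from the question: 80% black, 70% hard, 60% heavy.
beliefs = [
    {"black": 0.8, "white": 0.2},
    {"hard": 0.7, "soft": 0.3},
    {"heavy": 0.6, "light": 0.4},
]

for values, p in posterior(bag, beliefs).items():
    print("-".join(values), f"{p:.3f}")
```

Because types and beliefs are plain dictionaries, a four-valued colour only needs more keys in the first belief dict and more entries in `bag`; the 16 possible combinations would simply be the 16 keys of `bag`.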

I hope the question is clear. If more information is required, I can provide it.

Best Answer

I'm not sure I fully understand the question, but aren't you just looking for conditional probabilities? That is, something of the form P(black | hard and heavy), read as "the probability of the ball being black given that it is hard and heavy". If this concept is new to you, I highly recommend reading up on it, as conditional probability is a cornerstone of statistics: https://en.wikipedia.org/wiki/Conditional_probability
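
As a small illustration of what a conditional probability looks like here, the sketch below uses the unequal bag proportions mentioned in the question (50%, 30%, 15%, 5%) and computes P(target | given) as P(target and given) / P(given). The helper `conditional` and the lambda predicates are hypothetical names for this example, and it covers only the case where the given properties are known exactly, not the "80% sure" case.

```python
# Share of each ball type in the bag, taken from the question's example.
BAG = {
    ("black", "hard", "heavy"): 0.50,
    ("black", "soft", "light"): 0.30,
    ("white", "hard", "light"): 0.15,
    ("white", "soft", "heavy"): 0.05,
}


def conditional(bag, target, given):
    """Return P(target | given) = P(target and given) / P(given).

    target and given are predicates taking a (colour, hardness, weight) tuple.
    """
    p_given = sum(p for t, p in bag.items() if given(t))
    if p_given == 0:
        raise ValueError("conditioning event has probability zero")
    p_both = sum(p for t, p in bag.items() if given(t) and target(t))
    return p_both / p_given


def is_black(t):
    return t[0] == "black"


# P(black | hard): only hardness is known.
p_black_given_hard = conditional(BAG, is_black, lambda t: t[1] == "hard")

# P(black | hard and heavy): two properties are known.
p_black_given_hard_heavy = conditional(
    BAG, is_black, lambda t: t[1] == "hard" and t[2] == "heavy"
)

print(round(p_black_given_hard, 3), p_black_given_hard_heavy)
```

Knowing two properties pins down the third, so P(black | hard and heavy) is 1, whereas knowing only that the ball is hard leaves some probability on white-hard-light.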
