Conditional Probability – What Does Conditional Independence Mean Semantically?

Tags: conditional-probability, conditional-independence, inference, intuition, probability

I've just spent the last 3 hours reading every post, question, Medium article, and textbook entry on conditional independence, and I still don't really understand it. Can somebody explain what it means in layman's terms? I think I'm getting tripped up on the semantics.

One example I frequently come upon is: a person's height and vocabulary are conditionally independent, given age. What I don't understand is, what are they when not given age? Dependent? Wikipedia says so.

This is preposterous to me. Height is not dependent upon vocabulary, nor vice versa. (This brings up another question: does dependent mean "caused by?" Does it mean "correlated with?") In any case, it seems to me that any two events are either empirically independent or they are empirically dependent. Their relationship to each other cannot magically change. Bart and Lisa are still brother and sister regardless of whether or not we know that Homer is father to both.

The existence of a notion called "conditional" independence doesn't make sense to me. And the semantics confuse me. Indeed, it seems that height and vocabulary are conditionally dependent–on age! In other words, age is a variable that binds these things (making them less independent).

My head is in knots.

Best Answer

It may be useful for you to initially think of 'dependence' and 'independence' here as statements about correlation or association, rather than as direct statements of causality. They are useful in developing a framework in which to think about causality, though this is easier to comprehend once the basic idea is in place.

In your example, once we have accounted for our knowledge about the age of an individual, height tells us nothing about their vocabulary and vice versa. In other words: at any particular age, being taller or shorter should have no association with vocabulary (I'm going to ignore rare medical conditions that may cause reductions in both). This makes height and vocabulary conditionally independent.
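Written out symbolically, this statement looks as follows. The labels are my own shorthand, not something from the question: A is height, B is vocabulary, and C is age. The second form assumes the conditioning events have nonzero probability.

```latex
% Conditional independence of A and B given C:
P(A, B \mid C) = P(A \mid C)\, P(B \mid C)

% Equivalent form (when P(A, C) > 0): once C is known,
% learning A does not change what we believe about B.
P(B \mid A, C) = P(B \mid C)
```

The second line is the one that matches the wording above: given age, height adds no further information about vocabulary.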

But now assume that we do NOT know anything about age. If I told you that the individual was 40 cm tall, you could probably make a better-than-random guess about their vocabulary - simply because (i) very short individuals are likely to be infants and (ii) infants are likely to have a smaller vocabulary than adults.

Similarly, if I told you that an individual knew ~100 words in total, you would probably guess that they are very short - because (i) individuals with a tiny vocabulary are likely to be children, and (ii) children are likely to be shorter than adults.

This makes height and vocabulary dependent on each other when age is not given, even though they are conditionally independent given age. In general terms: though A (height) and B (vocabulary) do not affect each other directly, both are affected by C (age), so information about one of A or B tells you something about C, and therefore something about the other.
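If it helps to see this with numbers, here is a small simulation sketch. Everything in it is made up for illustration: the age range, the growth formula for height, the words-per-year formula for vocabulary, and the noise levels are assumptions, not real data. The only thing that matters is the structure: age drives both height and vocabulary, and the two noise terms are drawn independently.

```python
import random


def pearson(xs, ys):
    """Sample Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate(n=20000, seed=0):
    """Generate (age, height, vocabulary) triples from a toy model.

    Assumed model (hypothetical numbers): age is a whole number of years
    from 1 to 18; height and vocabulary each depend on age plus their own
    independent noise, and never on each other.
    """
    rng = random.Random(seed)
    people = []
    for _ in range(n):
        age = rng.randint(1, 18)
        height_cm = 75 + 5 * age + rng.gauss(0, 5)
        vocab_words = 1000 * age + rng.gauss(0, 500)
        people.append((age, height_cm, vocab_words))
    return people


people = simulate()

# Age NOT given: use everyone. Height and vocabulary move together
# because both track age, so the correlation is strongly positive.
marginal = pearson([h for _, h, _ in people], [v for _, _, v in people])

# Age given: look only at 10-year-olds. Within this group the only
# variation left is the independent noise, so the correlation is near 0.
ten_year_olds = [(h, v) for a, h, v in people if a == 10]
conditional = pearson([h for h, _ in ten_year_olds],
                      [v for _, v in ten_year_olds])

print(f"correlation, age unknown:  {marginal:.3f}")
print(f"correlation, age == 10:    {conditional:.3f}")
```

Nothing about the "relationship" between height and vocabulary changes between the two printed numbers. What changes is the set of people you are looking at, and therefore how much one measurement can tell you about the other.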
