Solved – How to interpret a mosaic plot in R

data visualization, inference, interpretation, r

The mosaic plot shown below was produced by this code:

mosaicplot(table(survey$Air,survey$Satisfaction),
  main = "Satisfaction with Airconditioning and Overall Satisfaction")

Bear in mind that both satisfaction variables were coded as binary: satisfaction levels 1, 2, and 3 were coded as 0, and levels 4 and 5 were coded as 1. I do not really know how to interpret the mosaic plot. Any help is appreciated!

Best Answer

A mosaic plot helps to visualize the statistical association between two categorical variables, in your case between satisfaction with the air conditioning (AC; 0, 1) and overall satisfaction (0, 1).

  1. The frequency distribution of the first variable (AC) is represented on the horizontal axis: the width of each segment on the x-axis is proportional to the relative frequency of the corresponding factor level. In your case, most observations are 0, so most respondents were not satisfied with the air conditioning. In pseudo-code, you would obtain the corresponding values by prop.table(table(AC)).

  2. The joint frequency distribution (the relative frequencies of each combination of the two factors) is represented by the areas of the corresponding rectangles. In your case, many persons are dissatisfied both with the air conditioning and overall. You would get these relative frequencies by prop.table(table(AC, satisfaction)).

  3. Within each level of the first variable, the frequency distribution of the second variable (satisfaction) is shown vertically. If these conditional distributions look similar, there is no or only a weak association between the two factors. If they look very different, as in your case, you would speak of a clear association. In the group with AC = 0 (the left vertical bar), a clear minority (about 10%) is satisfied overall. In the group with AC = 1 (the right vertical bar), more than half of the persons are satisfied overall. In pseudo-code, you would get the corresponding proportions by prop.table(table(AC, satisfaction), margin = 1).
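The three quantities in the list above can be sketched in R as follows. The data are made up purely for illustration (the survey data from the question are not available here), and `AC` and `satisfaction` are just the placeholder names from the pseudo-code, both coded 0/1.

```r
# Hypothetical data, NOT the survey from the question
AC           <- c(rep(0, 70), rep(1, 30))
satisfaction <- c(rep(0, 63), rep(1, 7),    # AC = 0: 10% satisfied
                  rep(0, 12), rep(1, 18))   # AC = 1: 60% satisfied

tab <- table(AC, satisfaction)

# 1. Marginal distribution of the first variable (widths of the bars)
widths <- prop.table(table(AC))

# 2. Joint relative frequencies (areas of the rectangles)
areas <- prop.table(tab)

# 3. Conditional distribution of satisfaction within each AC level
#    (heights of the segments inside each vertical bar)
heights <- prop.table(tab, margin = 1)

print(widths)
print(areas)
print(heights)

# The plot itself: first variable on the x-axis, second one split vertically
mosaicplot(tab, main = "Hypothetical data")
```

Each area is the product of a width and a height, for example the share of observations with AC = 0 and satisfaction = 1 equals the width of the AC = 0 bar times the height of its satisfied segment.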

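The third piece of information is usually the reason to look at a mosaic plot. The order of the variables is essential: the plot conditions on the first variable, so swapping the two variables shows a different set of conditional distributions.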
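To see why the order matters, one could compare the row-wise and column-wise proportions of the same table. This is only a sketch with made-up counts (not the survey data), and the dimension names `AC` and `satisfaction` are placeholders.

```r
# Hypothetical counts, NOT the survey data
tab <- as.table(matrix(c(63, 12, 7, 18), nrow = 2,
                       dimnames = list(AC = c("0", "1"),
                                       satisfaction = c("0", "1"))))

# Satisfaction given AC: what mosaicplot(tab) shows inside each vertical bar
sat_given_ac <- prop.table(tab, margin = 1)

# AC given satisfaction: what mosaicplot(t(tab)) would show instead
ac_given_sat <- prop.table(tab, margin = 2)

print(sat_given_ac)
print(ac_given_sat)
```

The two tables answer different questions, so the variable you want to condition on should come first in the call to table().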