Solved – How to visualize both total counts of categories and proportions of subcategories in a plot

categorical data, data visualization, ggplot2, many-categories, r

Suppose I have samples drawn from categories A, B, and C. Within those categories, I have subcategories d, e, and f, which are found in all three categories. I want to visualize how many samples I have from each of A, B, and C, and the proportional composition of subcategories d, e, and f within each category.

One way to do this is a bar plot (I'm using ggplot2, not that it matters too much) with one bar each for A, B, and C, where each bar's height is proportional to that category's total number of samples. Each bar is then partitioned by fill color according to the composition of d, e, and f within the category. The problem is that since A, B, and C will be different heights, it's almost impossible to visually compare the proportions, for example the proportion of d in A with the proportion of d in B.

To see the proportions, I can renormalize the heights to 100% instead of the sample count so that bars for A, B, and C are now equal height. However, now I can't visualize the counts in A, B, and C.
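
The two variants described above can be sketched in ggplot2 as follows. This is only a minimal illustration: the data frame `counts` and its numbers are made up, and the column names `category`, `subcategory`, `n`, `total`, and `prop` are assumptions chosen for the example. The plotting part is skipped if ggplot2 is not installed.

```r
# Hypothetical sample counts: 3 categories x 3 subcategories (made-up numbers).
counts <- data.frame(
  category    = rep(c("A", "B", "C"), each = 3),
  subcategory = rep(c("d", "e", "f"), times = 3),
  n           = c(50, 30, 20, 10, 25, 15, 120, 40, 40)
)

# Total per category and each subcategory's share within its category.
counts$total <- ave(counts$n, counts$category, FUN = sum)
counts$prop  <- counts$n / counts$total

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)

  # Bar height = sample count; proportions are hard to compare across bars.
  p_counts <- ggplot(counts, aes(x = category, y = n, fill = subcategory)) +
    geom_col(position = "stack") +
    labs(y = "Number of samples")

  # Every bar rescaled to 100%; proportions are comparable, totals are lost.
  p_props <- ggplot(counts, aes(x = category, y = n, fill = subcategory)) +
    geom_col(position = "fill") +
    scale_y_continuous(labels = function(x) paste0(100 * x, "%")) +
    labs(y = "Share of samples")

  print(p_counts)
  print(p_props)
}
```

With these made-up numbers, category C has four times as many samples as B, which is visible only in the first plot, while the subcategory shares are easy to compare only in the second.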

Is there an elegant way to visualize both pieces of information simultaneously?

Best Answer

Embedded (layered) bar plots are one alternative. Each of the three main categories is drawn as its own bar, and the subcategory bars are embedded within it. The example linked below was created in ggplot2.
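
One possible way to build such a layered plot in ggplot2 is sketched below. It is not the code from the linked blog post, just a guess at the general idea: a wide, light bar for each category total, with narrower subcategory bars drawn on top of it. The data and the names `counts`, `totals`, and `p_layered` are hypothetical, and the plotting part is skipped if ggplot2 is not installed.

```r
# Hypothetical counts, same made-up shape as the question (not real data).
counts <- data.frame(
  category    = rep(c("A", "B", "C"), each = 3),
  subcategory = rep(c("d", "e", "f"), times = 3),
  n           = c(50, 30, 20, 10, 25, 15, 120, 40, 40)
)

# One row per category holding its total sample count.
totals <- aggregate(n ~ category, data = counts, FUN = sum)

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)

  p_layered <- ggplot() +
    # Outer layer: one wide, light bar per category (height = total count).
    geom_col(data = totals, aes(x = category, y = n),
             width = 0.9, fill = "grey85") +
    # Inner layer: narrower side-by-side bars for the subcategories.
    geom_col(data = counts, aes(x = category, y = n, fill = subcategory),
             position = position_dodge(width = 0.8), width = 0.7) +
    labs(y = "Number of samples")

  print(p_layered)
}
```

Because the inner bars share the count axis with the outer bar, a reader can judge each subcategory's share by comparing it against the grey total behind it.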

Blog Link (Learning R)