Solved – Visualizing multiple size distributions in one plot

Tags: distributions, r

I have 17 size distributions for different coral species, and I would like to be able to compare these distributions in one plot. However, the distributions are very different, so when I naively tried to overlay their density plots, many of the distributions were so small compared to the largest one that they were just crowded into the bottom-left corner.

Is there a better way to visualize these distributions in one plot that will let me compare relative sizes across species as well as see the distribution within each species?

Best Answer

Perhaps a joy plot would bring you happiness?

http://austinwehrwein.com/data-visualization/it-brings-me-ggjoy/

This plot shows 12 months of temperature data with a separate density curve for each month. The curves are stacked vertically so that neighbouring months partially overlap. For this example, you'll need to download the CSV of data from the link; the code is then as follows:

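library(ggjoy)
library(hrbrthemes)
# weather.raw is assumed to already hold the downloaded CSV (e.g. loaded with read.csv())
weather.raw$month <- months(as.Date(weather.raw$CST))
# Reverse only the level order (so January is drawn at the top); the values must stay aligned with their rows
weather.raw$months <- factor(weather.raw$month, levels = rev(unique(weather.raw$month)))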

#scales
mins<-min(weather.raw$Min.TemperatureF)
maxs<-max(weather.raw$Max.TemperatureF)

ggplot(weather.raw,aes(x = Mean.TemperatureF,y=months,height=..density..))+
  geom_joy(scale=3) +
  scale_x_continuous(limits = c(mins,maxs))+
  theme_ipsum(grid=FALSE)+
  theme(axis.title.y=element_blank(),
        axis.ticks.y=element_blank(),
        strip.text.y = element_text(angle = 180, hjust = 1))+
  labs(title='Temperatures in Lincoln NE',
       subtitle='Mean temperatures (Fahrenheit) by month for 2016\nData: Original CSV from the Weather Underground')


UPDATE

The necessary dataset is now included with the ggjoy package, so instead of downloading the CSV file, you can just run the following code to get a very similar plot:

library(ggjoy)
ggplot(lincoln_weather, aes(x = `Mean Temperature [F]`, y = `Month`)) +
  geom_joy(scale = 3, rel_min_height = 0.01) +
  scale_x_continuous(expand = c(0.01, 0)) +
  scale_y_discrete(expand = c(0.01, 0)) +
  labs(title = 'Temperatures in Lincoln NE',
       subtitle = 'Mean temperatures (Fahrenheit) by month for 2016\nData: Original CSV from the Weather Underground') +
  theme_joy(font_size = 13, grid = TRUE) + theme(axis.title.y = element_blank())
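
To connect this back to the original question, here is a minimal sketch of the same idea applied to size data whose scales differ a lot between groups. Everything about the data is an assumption: the three species names, the log-normal sizes, and the column names `species` and `size` are made up for illustration and are not the asker's data. It uses ggridges, the package that has since replaced ggjoy (`geom_density_ridges()` takes the place of `geom_joy()`), and puts the x axis on a log scale so that small-bodied species are not squashed against the left edge. Whether a log axis suits the real coral data is a judgment call.

```r
library(ggplot2)
library(ggridges)  # successor to ggjoy

set.seed(1)

# Simulated stand-in data (hypothetical): three species whose typical
# sizes differ by orders of magnitude.
species <- c("Species A", "Species B", "Species C")
meanlog <- c(0, 2, 4)
corals <- do.call(rbind, lapply(seq_along(species), function(i) {
  data.frame(species = species[i],
             size = rlnorm(200, meanlog = meanlog[i], sdlog = 0.5))
}))

# One density ridge per species; the log-scaled x axis keeps both the
# smallest and the largest species readable in the same panel.
p <- ggplot(corals, aes(x = size, y = species)) +
  geom_density_ridges(scale = 2, rel_min_height = 0.01) +
  scale_x_log10() +
  labs(x = "Colony size (log scale)", y = NULL) +
  theme_ridges()
print(p)
```

With 17 species, a smaller `scale` value reduces how much neighbouring ridges overlap, and ordering the `species` factor by median size can make the comparison across species easier to read.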