I have a question that seems very elementary, but I have found a lot of disagreement about it. I have a situation where we took several measurements of the same individuals. It was suggested that I make a plot like this:
Nevertheless, I have found some criticism of this approach: some reviewers say that it is not possible to "easily" see values such as the mean/median and the variation of the data. In this recently published paper (http://journals.plos.org/plosbiology/article?id=10.1371%2Fjournal.pbio.1002128#pbio.1002128.s007), it is suggested that both the scatter plot and the line plot could be used, but I would like to hear other opinions. If somebody knows the "best" or most accepted way to plot this kind of data, or knows an alternative, I would be very grateful!
Best Answer
I agree with the reviewer that it's very hard to read.
I personally love using heatmaps (aka pseudocolor plots aka checkerboard plots) for this kind of thing:
And small multiples can be very nice as well:
Andrew Gelman has blogged about these kinds of displays before, too. There's a time and place for plots like yours. He calls them "spaghetti plots", and as Nick Cox mentioned in the comments, they actually work better when the series start in one place and fan out, or when the lines don't overlap much: more like raw dried spaghetti than cooked. I tend to like the heat map (which a commenter on that post calls a "lasagna plot") better, because it scales almost arbitrarily. Prof. Gelman is also the one who turned me on to small multiples.
Note, however, that heatmaps tend to work better when they aren't constrained to greyscale. For instance, the one I made would greatly benefit from a red/blue diverging color scheme with white at zero.
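As a rough illustration of that diverging scheme, here is a minimal base R sketch. The data are simulated, and the names `y`, `lim`, `breaks` and `pal` are made up for this example; the key assumption is that the quantity plotted has a meaningful zero, so the color breaks are made symmetric around it.

```r
# Simulated repeated measures: 20 individuals (rows) x 8 occasions (columns).
# These numbers are invented purely for illustration.
set.seed(1)
n_id <- 20
n_time <- 8
y <- t(apply(matrix(rnorm(n_id * n_time), n_id, n_time), 1, cumsum))

# Symmetric breaks around zero, so the middle of the palette (white) sits at 0.
lim <- max(abs(y))
n_col <- 101  # odd number of colors, so there is a single central one
breaks <- seq(-lim, lim, length.out = n_col + 1)
pal <- colorRampPalette(c("blue", "white", "red"))(n_col)

# image() expects z with nrow = length(x) and ncol = length(y),
# so pass the transposed matrix (occasion x individual).
image(x = seq_len(n_time), y = seq_len(n_id), z = t(y),
      col = pal, breaks = breaks,
      xlab = "Measurement occasion", ylab = "Individual")
```

Fixing the breaks by hand, rather than letting the plotting function pick them from the data range, is what keeps white pinned to zero even when the data are skewed towards one sign.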
We make graphs to facilitate comparisons. Whenever you make a plot, you should ask yourself which comparisons it facilitates, and which comparisons it obfuscates.
The R code for these:
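As a minimal sketch (simulated data and base R graphics only, so not necessarily the exact code behind the figures above), a spaghetti plot with a median overlay and a small-multiples layout could look like this. The matrix `y` and the sizes `n_id` and `n_time` are assumptions made for the example.

```r
# Simulated repeated measures: 12 individuals (rows) x 8 occasions (columns).
set.seed(1)
n_id <- 12
n_time <- 8
y <- t(apply(matrix(rnorm(n_id * n_time), n_id, n_time), 1, cumsum))
occasion <- seq_len(n_time)

# Spaghetti plot: matplot() draws one line per column, hence the transpose.
matplot(occasion, t(y), type = "l", lty = 1, col = "grey40",
        xlab = "Measurement occasion", ylab = "Value")
# Overlay the per-occasion median so the typical value is easy to read off.
med <- apply(y, 2, median)
lines(occasion, med, lwd = 3)

# Small multiples: one panel per individual, all on the same y-axis
# so that panels can be compared directly.
op <- par(mfrow = c(3, 4), mar = c(2, 2, 1.5, 0.5))
for (i in seq_len(n_id)) {
  plot(occasion, y[i, ], type = "l", ylim = range(y),
       main = paste("ID", i), xlab = "", ylab = "")
  abline(h = 0, lty = 3)  # common reference line in every panel
}
par(op)  # restore the previous graphics settings
```

The shared `ylim` is the design choice that matters most here: with free axes per panel, differences in level between individuals would be hidden.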