Solved – What’s a good book or reference for data visualization

data visualization, references

I'm looking for some references on creating effective graphs/data visualizations.

I've found a bunch of books that show how to create data visualizations using certain tools (like R/ggplot vs Python/pandas) but that's not really what I'm looking for. I'm looking for a reference that explains different types of charts with respect to stats/math. I want more theory than process.

I want to know the different types of charts and how to use them. Anything helps!

Best Answer

I think that the work of William Cleveland is going to be closer to what you want than that of Tufte. Cleveland wrote two books:

  1. Visualizing Data (1993)
  2. The Elements of Graphing Data (1985)

The first book, in particular, may be what you want. Here is a publisher's description:

Visualizing Data is about visualization tools that provide deep insight into the structure of data. There are graphical tools such as coplots, multiway dot plots, and the equal count algorithm. There are fitting tools such as loess and bisquare that fit equations, nonparametric curves, and nonparametric surfaces to data. But the book is much more than just a compendium of useful tools. It conveys a strategy for data analysis that stresses the use of visualization to thoroughly study the structure of data and to check the validity of statistical models fitted to data. The result of the tools and the strategy is a vast increase in what you can learn from your data. The book demonstrates this by reanalyzing many data sets from the scientific literature, revealing missed effects and inappropriate models fitted to data.

An even more theoretical book is The Grammar of Graphics by Leland Wilkinson. The description:

This book was written for statisticians, computer scientists, geographers, researchers, and others interested in visualizing data. It presents a unique foundation for producing almost every quantitative graphic found in scientific journals, newspapers, statistical packages, and data visualization systems. While the tangible results of this work have been several visualization software libraries, this book focuses on the deep structures involved in producing quantitative graphics from data. What are the rules that underlie the production of pie charts, bar charts, scatterplots, function plots, maps, mosaics, and radar charts? Those less interested in the theoretical and mathematical foundations can still get a sense of the richness and structure of the system by examining the numerous and often unique color graphics it can produce. The second edition is almost twice the size of the original, with six new chapters and substantial revision. Much of the added material makes this book suitable for survey courses in visualization and statistical graphics.

This book is very theoretical.