Solved – Rules of thumb for “modern” statistics

I like G. van Belle's book Statistical Rules of Thumb and, to a lesser extent, Common Errors in Statistics (and How to Avoid Them) by Phillip I. Good and James W. Hardin. They address common pitfalls in interpreting results from experimental and observational studies, and they provide practical recommendations for statistical inference and exploratory data analysis. But I feel that "modern" guidelines are somewhat lacking, especially given the ever-growing use of computational and robust statistics in various fields, and the introduction of techniques from the machine learning community into, e.g., clinical biostatistics or genetic epidemiology.

Apart from computational tricks or common pitfalls in data visualization which could be addressed elsewhere, I would like to ask: What are the top rules of thumb you would recommend for efficient data analysis? (one rule per answer, please).

I am thinking of guidelines that you might give to a colleague, a researcher without a strong background in statistical modeling, or a student in an intermediate to advanced course. They might pertain to various stages of data analysis, e.g. sampling strategies, feature selection or model building, model comparison, post-estimation, etc.

Best Answer

Don't forget to do some basic data checking before you start the analysis. In particular, look at a scatter plot of every variable you intend to analyse against ID number, date / time of data collection or similar. The eye can often pick up patterns that reveal problems when summary statistics don't show anything unusual. And if you're going to use a log or other transformation for analysis, also use it for the plot.
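As a rough illustration of this check, here is a minimal sketch that draws a text scatter plot of one variable against its collection order, using only the Python standard library. The function name `text_scatter`, its parameters and the sample data are all made up for the example; in practice you would probably use your usual plotting library and simply put ID or collection time on the x-axis. The optional `transform` argument follows the last point above: if the analysis will use a log scale, plot on the log scale too.

```python
import math

def text_scatter(values, width=60, height=12, transform=None):
    """Return a rough text scatter plot of values against collection order.

    `values` is assumed to be listed in ID or collection-time order.
    `transform`, if given (e.g. math.log), is applied before plotting so
    the plot uses the same scale as the planned analysis.
    """
    ys = [transform(v) for v in values] if transform else list(values)
    lo, hi = min(ys), max(ys)
    span = (hi - lo) or 1.0  # avoid dividing by zero for constant data
    n = len(ys)
    grid = [[" "] * width for _ in range(height)]
    for i, y in enumerate(ys):
        col = round(i * (width - 1) / max(n - 1, 1))  # x-axis: position in order
        row = round((hi - y) / span * (height - 1))   # top row = largest value
        grid[row][col] = "*"
    return "\n".join("".join(r).rstrip() for r in grid)

# Hypothetical measurements in ID order: the level shifts upward after the
# 30th record, as might happen if an instrument were recalibrated midway.
values = ([10 + (i % 5) * 0.2 for i in range(30)]
          + [14 + (i % 5) * 0.2 for i in range(30)])

plot = text_scatter(values, transform=math.log)
print(plot)
```

In the printed plot the first half of the records sits near the bottom and the second half near the top, which is the kind of step change that is easy to see by eye once the data are laid out in collection order.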
