Data matrix, predictor matrix, observation matrix, model matrix, and design matrix: what do they mean?

Tags: matrix, modeling, regression, terminology

Is there a clear distinction between these terms? To the best of my knowledge:

Suppose we have $N$ observations and $p$ predictors.

  • predictor matrix $\in \mathbb{R}^{N\times p}$ is synonymous with observation matrix and data matrix. All three contain the raw, untreated data. Design matrix refers to the same concept in the context of a designed experiment.

  • model matrix is the result of applying some basis expansion* to the predictor matrix.
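
To make the second bullet concrete, here is a minimal sketch, assuming a degree-2 polynomial basis expansion with an intercept column. The function name `expand_basis` and the toy data are hypothetical, chosen only for illustration; many other basis expansions (splines, interactions, and so on) would fit the same description.

```python
def expand_basis(predictors, degree=2):
    """Map an N x p predictor matrix to an N x (1 + p * degree) model matrix.

    Each row [x_1, ..., x_p] becomes
    [1, x_1, ..., x_p, x_1^2, ..., x_p^2, ...] up to the given degree.
    This is only one of many possible basis expansions.
    """
    model = []
    for row in predictors:
        expanded = [1.0]  # intercept column
        for power in range(1, degree + 1):
            expanded.extend(x ** power for x in row)
        model.append(expanded)
    return model


# Hypothetical raw data: N = 3 observations, p = 2 predictors.
X = [[1.0, 2.0],
     [3.0, 4.0],
     [5.0, 6.0]]

H = expand_basis(X, degree=2)
for row in H:
    print(row)
```

Under this reading, `X` plays the role of the predictor matrix (3 x 2) and `H` the role of the model matrix (3 x 5).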

However, according to Wikipedia, design matrix and model matrix are synonymous:

In statistics, a design matrix, also known as regressor matrix or model matrix or data matrix, is…

Furthermore, MathWorks offers a function to

Convert predictor matrix to design matrix

* see Elements of Statistical Learning, chapter 5 and this question

Best Answer

I wouldn't get caught up in the terms. Just know they are referring to your data. Every discipline (engineering, CS, statistics) has different terms for the same thing.

However, to dive into the detail: if your data is all numerical (no categorical data), then the model matrix equals the design matrix, because there are no categorical values to expand (no contrasts). A design matrix will most likely contain categorical variables such as gender, race, or some other binary/categorical status. These categorical columns need to be coded as indicator (one-hot) columns to be numerically meaningful. Then, depending on your contrasts settings, you may see k-1 indicator columns for a variable with k categories.

Examples of these settings are given in R's documentation for contrasts.

Depending on your settings, you may see the following:

> warpbreaks =  warpbreaks[order(runif(dim(warpbreaks)[1])),] ## random shuffle
> head(model.matrix(breaks ~ wool, data = warpbreaks)) ## with R's default treatment contrasts: intercept plus k-1 = 1 indicator column
     (Intercept) woolB
30           1     1
39           1     1
32           1     1
16           1     0
6            1     0
7            1     0
> head(model.matrix(breaks ~ wool - 1, data = warpbreaks))
     woolA woolB
30     0     1
39     0     1
32     0     1
16     1     0
6      1     0
7      1     0

Python's patsy also has similar settings.
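
The two codings in the R output above can also be sketched in plain Python. This is not patsy's API; the helper `code_factor` is a hypothetical name, and the sketch assumes treatment coding with the alphabetically first level as the baseline, which matches the R output shown.

```python
def code_factor(name, values, intercept=True):
    """Return (column_names, rows) of indicator columns for one categorical variable.

    With an intercept, the first level (in sorted order) is the baseline and is
    dropped, leaving k-1 indicator columns. Without an intercept, all k levels
    get their own column (full one-hot coding).
    """
    levels = sorted(set(values))
    if intercept:
        kept = levels[1:]  # drop the baseline level
        columns = ["(Intercept)"] + [name + level for level in kept]
        rows = [[1] + [int(v == level) for level in kept] for v in values]
    else:
        columns = [name + level for level in levels]
        rows = [[int(v == level) for level in levels] for v in values]
    return columns, rows


# Wool types of the six rows printed in the R output above (30, 39, 32, 16, 6, 7).
wool = ["B", "B", "B", "A", "A", "A"]

cols_with, rows_with = code_factor("wool", wool, intercept=True)
cols_without, rows_without = code_factor("wool", wool, intercept=False)

print(cols_with, rows_with)
print(cols_without, rows_without)
```

With the intercept, the two-level variable yields one indicator column (`woolB`); without it, both `woolA` and `woolB` appear, mirroring `breaks ~ wool` versus `breaks ~ wool - 1`.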
