Solved – Simple, multiple, univariate, bivariate, multivariate – terminology

bivariate, multiple regression, multivariate analysis, multivariate regression, univariate

I do realise (some of) this has already been addressed here (e.g., Why do we need multivariate regression (as opposed to a bunch of univariate regressions)?, Explain the difference between multiple regression and multivariate regression, with minimal use of symbols/math, and Defining a univariate regression). Nevertheless, I hope I can get some more insight on this.

What are the differences among the terms in the title? Does the usage vary (and therefore cause confusion)? Or is there one proper terminology that some people just use incorrectly?

  1. For example, from what I understand, simple (linear) regression would be where we have one response and one explanatory variable. Multiple (linear) regression is when we have one response and multiple explanatory variables. So far so good – I'm not confused here yet.

  2. But when one says "multivariate", does this imply multiple RESPONSES, or multiple EXPLANATORY VARIABLES (or both)? I assume the former (also judging from the links provided), but people seem to use these interchangeably.
    By extension, "univariate" would likely mean a single response?

  3. Further, the term "multivariate analyses" (PCA, discriminant analysis, etc.) usually implies multiple responses, right?

  4. And finally, the term "bivariate". Zuur et al. 2009 (Mixed effects models and extensions in ecology with R) state that:

    The bivariate linear regression model is defined by: $$Y_i = \alpha + \beta \times X_i + \varepsilon_i, \quad \text{where } \varepsilon_i \sim N(0, \sigma^2)$$

    This seems to refer to one response and one predictor (i.e. simple linear regression). So, in this case, does the term "bivariate" refer to two variables in total (one response, one predictor)?

Is there some standard terminology or is it just a mess?

Best Answer

As for Question 1, you are correct with what you said.

As for Question 2, "multivariate" refers to an analysis involving more than one response variable. To my knowledge, the terminology does not distinguish cases by the number of predictor variables. To be consistent, one could perhaps say "simple multivariate regression" when there are multiple responses and one predictor variable, but I am not sure that term is actually in use.
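To make the distinction concrete, here is a minimal NumPy sketch that fits the three cases on simulated data. The helper `fit`, the variable names, and the simulated coefficients are all made up for illustration; the only point is how the shape of the coefficient array changes with the number of predictors and responses.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200

def fit(X, Y):
    """Ordinary least squares with an intercept (hypothetical helper)."""
    design = np.column_stack([np.ones(len(Y)), X])  # prepend intercept column
    coef, *_ = np.linalg.lstsq(design, Y, rcond=None)
    return coef

# Simulated data: two predictors, two responses (arbitrary coefficients).
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y1 = 1.0 + 2.0 * x1 + rng.normal(scale=0.1, size=n)
y2 = -1.0 + 0.5 * x1 + 3.0 * x2 + rng.normal(scale=0.1, size=n)

X = np.column_stack([x1, x2])
Y = np.column_stack([y1, y2])

simple = fit(x1, y1)       # one response, one predictor    -> shape (2,)
multiple = fit(X, y2)      # one response, two predictors   -> shape (3,)
multivariate = fit(X, Y)   # two responses, two predictors  -> shape (3, 2)

print(simple.shape, multiple.shape, multivariate.shape)
```

With plain least squares, each column of the multivariate coefficient matrix equals the fit of that response on its own; what a multivariate analysis adds is joint inference across the responses, which this sketch does not attempt.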

As for Question 3, I'd say you are right again.

As for Question 4, the term bivariate refers to a situation with two continuous variables in total, i.e. an analysis that can be visualized in a 2D scatter plot (simple linear regression and correlation, for example).
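As a small illustration of why correlation and simple linear regression are both described as bivariate, the sketch below computes each on the same pair of simulated variables. The data and variable names are assumptions made for the example; the identity it checks (slope = r × sd(y) / sd(x)) is the standard one for a least-squares line.

```python
import numpy as np

rng = np.random.default_rng(1)

# Two continuous variables (simulated, arbitrary relationship).
x = rng.normal(size=100)
y = 0.8 * x + rng.normal(scale=0.5, size=100)

# Pearson correlation between the two variables.
r = np.corrcoef(x, y)[0, 1]

# Simple linear regression of y on x; polyfit returns [slope, intercept].
slope, intercept = np.polyfit(x, y, deg=1)

# The regression slope is the correlation rescaled by the two standard deviations.
slope_from_r = r * y.std(ddof=1) / x.std(ddof=1)

print(np.isclose(slope, slope_from_r))
```

Both summaries describe the same 2D scatter of points; regression just treats one of the two variables as the response.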

So what does univariate refer to? I think (and I might be wrong) that it is the case where you have one response and one or more categorical predictors. For example, you measure the heights of trees descended from the same parent tree, or the weights of chickens fed different feeds. Data of this type would be analyzed with a t-test or analysis of variance.

The difference between univariate and bivariate can be seen when you visualize the data. If you plot something as a bar graph (or dot plot), it is univariate; if you plot it on a 2D scatter plot, it is bivariate. I might be wrong here, but if so I am sure someone will comment!
