Solved – Why did statisticians define random matrices

distributions, mathematical-statistics, random matrix, random variable

I studied mathematics a decade ago, so I have a math and stats background, but this question is killing me.

This question is still a bit philosophical to me. Why did statisticians develop all sorts of techniques to work with random matrices? I mean, didn't random vectors already solve the problem? If not, what is the mean of the different columns of a random matrix? Anderson (2003, Wiley) considers a random vector a special case of a random matrix with only one column.

I don't see the point of having random matrices (and I'm sure that's because I'm ignorant). But, bear with me. Imagine I have a model with 20 random variables. If I want to compute the joint probability function, why should I picture them as a matrix instead of a vector?

What am I missing?

ps: I'm sorry for the poorly tagged question, but there were no tags for random-matrix and I can't create one yet!

edit: changed matrix to matrices in the title

Best Answer

It depends on which field you're in, but one of the big initial pushes for the study of random matrices came out of nuclear physics and was pioneered by Wigner. You can find a brief overview here. Specifically, it was the eigenvalues of random matrices (which play the role of energy levels in the physics) that generated tons of interest, because the correlations between eigenvalues gave insight into the emission spectrum of nuclear decay processes.
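To make the eigenvalue point concrete, here is a minimal sketch, assuming a real symmetric Gaussian matrix as a stand-in for the kind of random Hamiltonian Wigner considered. The function name `wigner_eigenvalues`, the matrix size, and the seed are illustrative choices, not anything from the references. It only checks the bulk behaviour (Wigner's semicircle law on [-2, 2]); it does not reproduce any physical spectrum.

```python
import numpy as np

def wigner_eigenvalues(n, rng):
    """Eigenvalues of an n x n symmetric Gaussian matrix, scaled by sqrt(n)."""
    g = rng.standard_normal((n, n))
    h = (g + g.T) / np.sqrt(2.0)  # symmetric, off-diagonal variance 1
    return np.linalg.eigvalsh(h / np.sqrt(n))

rng = np.random.default_rng(0)  # fixed seed, chosen arbitrarily
eigs = wigner_eigenvalues(1000, rng)

# The semicircle law predicts support [-2, 2] and roughly 61% of the
# eigenvalues in [-1, 1] as n grows.
frac_inner = np.mean(np.abs(eigs) <= 1.0)
print("spectral range:", eigs.min(), eigs.max())
print("fraction in [-1, 1]:", frac_inner)
```

None of this structure is visible if you flatten the matrix into a vector of n² entries: the eigenvalues are a property of the matrix as a linear operator, which is a large part of why matrices get their own theory.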

More recently, there has been a large resurgence in this field with the advent of the Tracy-Widom distributions for the largest eigenvalues of random matrices, along with stunning connections to seemingly unrelated fields such as tiling theory, statistical physics, integrable systems, KPZ phenomena, random combinatorics and even the Riemann Hypothesis. You can find some more examples here.

For a more down-to-earth example, a natural question to ask about a data matrix whose rows are observation vectors is what its PCA components might look like. You can get heuristic estimates for this by assuming the data come from some distribution and then looking at the eigenvalues of the sample covariance matrix, which are predicted by random matrix universality: regardless (within reason) of the distribution of your vectors, the limiting distribution of the eigenvalues falls into one of a small set of known classes. You can think of this as a kind of CLT for random matrices. See this paper for examples.
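As a rough illustration of that universality claim, the sketch below compares sample covariance eigenvalues for Gaussian and uniform data with i.i.d. unit-variance entries (pure noise, no real signal). It assumes the Marchenko-Pastur law as the relevant limit, with bulk edges (1 ± sqrt(p/n))². The helper names, sizes, and seed are made up for the example.

```python
import numpy as np

def sample_cov_eigenvalues(data):
    """Eigenvalues of the sample covariance of an (n, p) data matrix."""
    n = data.shape[0]
    centered = data - data.mean(axis=0)
    return np.linalg.eigvalsh(centered.T @ centered / n)

def marchenko_pastur_edges(n, p, sigma2=1.0):
    """Limiting bulk edges for i.i.d. entries with variance sigma2."""
    ratio = p / n
    return (sigma2 * (1 - np.sqrt(ratio)) ** 2,
            sigma2 * (1 + np.sqrt(ratio)) ** 2)

rng = np.random.default_rng(0)  # fixed seed, chosen arbitrarily
n, p = 2000, 500

gaussian = rng.standard_normal((n, p))
# Uniform on [-sqrt(3), sqrt(3)] also has mean 0 and variance 1.
uniform = rng.uniform(-np.sqrt(3), np.sqrt(3), size=(n, p))

lower, upper = marchenko_pastur_edges(n, p)
eig_gauss = sample_cov_eigenvalues(gaussian)
eig_unif = sample_cov_eigenvalues(uniform)

print("predicted edges:", lower, upper)
print("gaussian range: ", eig_gauss.min(), eig_gauss.max())
print("uniform range:  ", eig_unif.min(), eig_unif.max())
```

In practice this is the baseline people use for PCA: eigenvalues that land well outside the predicted noise bulk are candidates for real structure, while those inside it are hard to distinguish from noise.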
