Statistics – Good Introduction from an Algebraic Point of View

ct.category-theory pr.probability reference-request st.statistics

There are already lots of questions on this subject like

Is there an introduction to probability theory from a structuralist/categorical perspective?

Is there a combinatorial/topological treatment of statistical independence?

What is the algebraic equivalent of independent elements?

as well as questions on a related field called ergodic theory, but these in fact study different things.

However, as a new category theorist with almost no statistics background, I don't aim to learn these advanced topics, but to understand very basic notions like random variable and expectation from an algebraic perspective.

For example, we can define a type family

Rand: Type->Type

and a real random variable can be defined as

randnum : Rand Real

and Expectation as

E : Rand a -> Real
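To make these signatures concrete, here is a minimal Haskell sketch under one strong assumption: `Rand a` is modelled as a finitely supported distribution, that is, a list of outcomes with weights. The names `outcomes`, `uniform`, `expectation`, `die` and `twoDice` are hypothetical and are not taken from the linked repository. Expectation is typed `Rand Double -> Double` rather than `Rand a -> Real`, since it only makes sense for a real-valued variable.

```haskell
-- Hypothetical finite-support model of the question's `Rand`:
-- a list of (outcome, probability) pairs. Illustration only.
newtype Rand a = Rand { outcomes :: [(a, Double)] }

-- Applying a function to a random variable keeps the weights.
instance Functor Rand where
  fmap f (Rand xs) = Rand [ (f x, p) | (x, p) <- xs ]

-- `pure` is the constant random variable; `<*>` combines two
-- independent random variables by multiplying their weights.
instance Applicative Rand where
  pure x = Rand [(x, 1)]
  Rand fs <*> Rand xs = Rand [ (f x, p * q) | (f, p) <- fs, (x, q) <- xs ]

-- Sequencing: the second distribution may depend on the first outcome.
instance Monad Rand where
  Rand xs >>= k = Rand [ (y, p * q) | (x, p) <- xs, (y, q) <- outcomes (k x) ]

-- Uniform distribution on a non-empty finite list.
uniform :: [a] -> Rand a
uniform xs = Rand [ (x, 1 / fromIntegral (length xs)) | x <- xs ]

-- Expectation of a real-valued random variable: the weighted sum.
expectation :: Rand Double -> Double
expectation (Rand xs) = sum [ x * p | (x, p) <- xs ]

-- A fair six-sided die and the sum of two independent dice.
die :: Rand Double
die = uniform [1 .. 6]

twoDice :: Rand Double
twoDice = (+) <$> die <*> die
```

In this model `expectation die` is 3.5 up to floating-point rounding, and `expectation twoDice` equals `expectation die + expectation die` up to the same rounding, which is linearity of expectation in a small case. The sketch says nothing about continuous distributions, where a list of weighted outcomes no longer suffices.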

It seems that statistics is one of the subjects most recalcitrant to an algebraic approach, but I don't think this has to be the case: we can treat it like any other abstract object and define axioms on this abstract random type. The notation and formulas in every introductory statistics book I have read soon become utterly ugly for lack of a proper foundation, which is really painful for someone ingrained with abstract algebra and functional programming. However, statistics is extremely useful for machine learning, for modelling the human brain, and for many other things.

I have created a repo for a basic type-directed understanding of statistics in Haskell: https://github.com/doofin/alg-statistics

Best Answer

Lucien Le Cam developed an approach to statistics that largely dispenses with measure-theoretic probability and replaces probability measures and random variables with certain Banach lattices. The approach can be found in Le Cam's book Asymptotic Methods in Statistical Decision Theory and in the more accessible Comparison of Statistical Experiments by Torgersen.

Statistical Decision Rules and Optimal Inference by Cencov keeps the traditional measure-theoretic approach to statistics but studies it with category-theoretic methods.

For basic material on linear regression, there is also The Coordinate-Free Approach to Linear Models by Wichura; linear models are an area where an approach likely to be more comfortable for an algebraist works well. This is the only book in the list that might be said to be introductory.

That being said, anyone who actually wants to work in statistics needs to be familiar with the standard literature and approach. Warts and all. Much of statistical theory is about inequalities; more analysis than algebra.
