[Math] What’s an efficient way to calculate covariance for a large data set?

na.numerical-analysis st.statistics

What is the best algorithm for computing covariance that would be accurate for a large number of values like 100,000 or more?

Best Answer

Check out "How to calculate correlation accurately". There are two common formulas that are algebraically equivalent, but one has much better numerical properties than the other.
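The linked post is not reproduced here, so the following is only a sketch of what the two formulas are usually taken to be, written for covariance in Python. It assumes the sample covariance (divisor n - 1) and equal-length sequences with at least two values; the function names are made up for illustration. The first function is the one-pass "sum of products" form, which subtracts two large, nearly equal numbers when the means are large compared to the spread. The second centers the data first. The third is a one-pass running update that may suit data too large to hold in memory.

```python
import math


def covariance_naive(xs, ys):
    """One-pass textbook form: (sum(x*y) - sum(x)*sum(y)/n) / (n - 1).

    Algebraically fine, but prone to cancellation when the means are
    large relative to the spread of the data.
    """
    n = len(xs)
    sum_x = sum_y = sum_xy = 0.0
    for x, y in zip(xs, ys):
        sum_x += x
        sum_y += y
        sum_xy += x * y
    return (sum_xy - sum_x * sum_y / n) / (n - 1)


def covariance_two_pass(xs, ys):
    """Two-pass form: compute the means, then sum products of deviations."""
    n = len(xs)
    mean_x = math.fsum(xs) / n
    mean_y = math.fsum(ys) / n
    co_moment = math.fsum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return co_moment / (n - 1)


def covariance_online(pairs):
    """One-pass running update (Welford-style), for streamed (x, y) pairs."""
    n = 0
    mean_x = mean_y = 0.0
    co_moment = 0.0
    for x, y in pairs:
        n += 1
        dx = x - mean_x              # deviation from the OLD mean of x
        mean_x += dx / n
        mean_y += (y - mean_y) / n
        co_moment += dx * (y - mean_y)  # times deviation from the NEW mean of y
    return co_moment / (n - 1)
```

For a quick comparison, try all three on the same data shifted by a large constant, for example `xs = [1e9 + v for v in (1, 2, 3, 4)]`. Shifting should not change the covariance, so any drift in a result is rounding error. With 100,000 or more values the two-pass or running-update version is the safer default, at the cost of a second pass or a little extra arithmetic per value.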