Solved – Bartlett’s theory

Tags: degrees-of-freedom, fisher-transform, neuroimaging, variance, z-statistic

This neuroimaging paper has been cited thousands of times. In it, a method is proposed for computing the correlations among several seed regions and all other brain voxels. Part of this method involves z-scoring Fisher z values:

To combine results across subjects and compute statistical significance, correlation coefficients were converted to a normal distribution by Fischer's [sic] z transform (25). These values were converted to z scores (i.e., zero mean, unit variance, Gaussian distributions) by dividing by the square root of the variance, computed as $1 / \sqrt{n - 3}$, where $n$ is the degrees of freedom in the measurement. Because individual time points in the BOLD signal are not statistically independent, the degrees of freedom must be corrected according to Bartlett's theory (25). The correction factor for independent frames was calculated to be 2.34, resulting in 318 / 2.34 = 135.9 df.

No other mention is made of Bartlett's theory; in particular, the authors don't explain how they arrived at the value of 2.34.
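For concreteness, the arithmetic in the quoted passage can be sketched as follows. This is only my reading of the passage, not the authors' code: the function name `fisher_z_score` and the example correlation `r=0.3` are made up for illustration, and I assume the "square root of the variance" means the standard deviation $1/\sqrt{n-3}$ of the Fisher z value.

```python
import math

def fisher_z_score(r, n_frames, correction):
    """Convert a correlation r to a z score using corrected degrees of freedom.

    Hypothetical helper: divides the number of frames by the correction
    factor, then scales the Fisher z value by its standard deviation.
    """
    df = n_frames / correction           # effective degrees of freedom
    fisher_z = math.atanh(r)             # Fisher z transform of r
    std = 1.0 / math.sqrt(df - 3.0)      # standard deviation of Fisher z
    return fisher_z / std, df

# Values from the quoted passage: 318 frames, correction factor 2.34.
z, df = fisher_z_score(r=0.3, n_frames=318, correction=2.34)
print(round(df, 1))  # 135.9, matching the df reported in the paper
print(z)
```

What this does not show is where 2.34 comes from, which is the actual question.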

Reference 25 is this textbook. Although the textbook seems excellent and reasonably priced, I would like to acquire a basic understanding of Bartlett's theory without having to buy it, so that I might replicate this method. Of course my first course of action was to perform several Google searches; they, however, turned up nothing.

What is Bartlett's theory, in this context?

Best Answer

I suspect that Fox et al 2005 are referring to Bartlett 1946, which is a more general form of the AR(1)-based variance estimator (Bartlett 1935). Bartlett's 1946 estimator was later adapted to bivariate time series by Quenouille 1947 as a DoF estimator.

Suppose $X$ and $Y$ are two time series of length $N$, where $\rho_{XX,k}$ and $\rho_{YY,k}$ are the autocorrelation coefficients of $X$ and $Y$, respectively, at lag $k$. Then Quenouille 1947 found the effective DoF to be

$$ \hat{N} = N \left(\sum_{k=-\infty}^{\infty} {\rho}_{XX,k} {\rho}_{YY,k}\right)^{-1}, $$

while Bayley & Hammersley 1946 found

$$ \hat{N} = N\left(1+2\sum_{k=1}^{N-1}\frac{N-k}{N}\rho_{XX,k}\rho_{YY,k}\right)^{-1}. $$
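As a rough illustration of the two formulas, here is a minimal sketch in plain Python. The function names are my own, the infinite sum is truncated at a finite maximum lag (an assumption; in practice the choice of truncation matters), and I use $\rho_{k} = \rho_{-k}$ with $\rho_0 = 1$ to fold the two-sided sum into $1 + 2\sum_{k \ge 1}$.

```python
def autocorr(x, max_lag):
    """Sample autocorrelation of x at lags 0..max_lag (biased estimator)."""
    n = len(x)
    mean = sum(x) / n
    d = [v - mean for v in x]
    c0 = sum(v * v for v in d)
    return [sum(d[t] * d[t + k] for t in range(n - k)) / c0
            for k in range(max_lag + 1)]

def dof_quenouille(rho_xx, rho_yy, n):
    """Effective DoF: N / sum_k rho_xx[k] * rho_yy[k] over the available lags.

    rho_xx and rho_yy hold autocorrelations at lags 0, 1, ..., K.
    """
    s = 1.0 + 2.0 * sum(a * b for a, b in zip(rho_xx[1:], rho_yy[1:]))
    return n / s

def dof_bayley_hammersley(rho_xx, rho_yy, n):
    """Same idea, with each lag k down-weighted by (N - k) / N."""
    s = 1.0 + 2.0 * sum((n - k) / n * rho_xx[k] * rho_yy[k]
                        for k in range(1, len(rho_xx)))
    return n / s

# Illustration with theoretical AR(1) autocorrelations rho_k = 0.5 ** k
# (an arbitrary choice, not taken from any paper).
n, max_lag = 318, 50
rho_xx = [0.5 ** k for k in range(max_lag + 1)]
rho_yy = [0.5 ** k for k in range(max_lag + 1)]
print(dof_quenouille(rho_xx, rho_yy, n))         # close to 190.8
print(dof_bayley_hammersley(rho_xx, rho_yy, n))  # slightly larger
```

With real data, `autocorr(x, max_lag)` and `autocorr(y, max_lag)` would replace the theoretical values. The ratio $N / \hat{N}$ then plays the role of a "correction factor" like the 2.34 in the question.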

There are many approximations of Bartlett's original estimator. One nice review of these variants can be found in Pyper and Peterman 1998.

It is, however, very important to note that all of the above estimators assume $X$ and $Y$ are uncorrelated ($\rho = 0$), which in neuroimaging is far from reality. Once that assumption is violated, these estimators substantially overestimate the variance because autocorrelation and crosscorrelation are confounded, a phenomenon also known as statistical aliasing; see Appendix D of Afyouni et al 2018.

So: applying no correction overestimates the DoF (underestimates the variance), while the corrections above underestimate the DoF (overestimate the variance). What can be done? See the estimator recently proposed in Afyouni et al 2018:

\begin{equation}
\begin{split}
\mathbb{V}({\hat\rho})&=N^{-2}\left[\vphantom{\sum_k^M}(N-1)(1-\rho^2)^2 \right. \\
&\quad +\rho^2 \sum_k^M w_k (\rho_{XX,k}^2 + \rho_{YY,k}^2 + \rho_{XY,k}^2 + \rho_{XY,-k}^2)\\
&\quad -2 \rho \sum_k^M w_k (\rho_{XX,k} + \rho_{YY,k}) (\rho_{XY,k} + \rho_{XY,-k}) \\
&\quad +2 \left.\sum_k^M w_k (\rho_{XX,k}\rho_{YY,k}+\rho_{XY,k}\rho_{XY,-k}) \right],
\end{split}
\label{Eq:fastMEIntro}
\end{equation}

where $w_k=N-2-k$. While this is an involved expression, we show that, with sensible regularisation of the autocorrelation and crosscorrelation functions, it gives accurate DoF and variance estimates over a range of settings. (See also Roy 1989 for an asymptotic derivation of the same result.)
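To make the expression less intimidating, here is a direct term-by-term transcription in plain Python. It is a sketch under assumptions, not the reference implementation from the paper: the function name and argument layout are hypothetical, I assume the sums run over lags $k = 1, \dots, M$, and the regularisation of the correlation functions mentioned above is left out entirely (the caller is expected to pass already-regularised values).

```python
def xdf_variance(rho, acf_x, acf_y, ccf_pos, ccf_neg, n):
    """Variance of the sample correlation, transcribed from the expression above.

    Each list holds values at lags k = 1..M (index 0 is lag 1):
      acf_x[k-1]   = rho_{XX,k}     acf_y[k-1]   = rho_{YY,k}
      ccf_pos[k-1] = rho_{XY,k}     ccf_neg[k-1] = rho_{XY,-k}
    Assumes the inputs are already regularised (not done here).
    """
    m = len(acf_x)
    w = [n - 2 - k for k in range(1, m + 1)]   # weights w_k = N - 2 - k
    t2 = sum(w[i] * (acf_x[i] ** 2 + acf_y[i] ** 2
                     + ccf_pos[i] ** 2 + ccf_neg[i] ** 2) for i in range(m))
    t3 = sum(w[i] * (acf_x[i] + acf_y[i]) * (ccf_pos[i] + ccf_neg[i])
             for i in range(m))
    t4 = sum(w[i] * (acf_x[i] * acf_y[i] + ccf_pos[i] * ccf_neg[i])
             for i in range(m))
    bracket = ((n - 1) * (1 - rho ** 2) ** 2
               + rho ** 2 * t2 - 2 * rho * t3 + 2 * t4)
    return bracket / n ** 2

# Sanity check: uncorrelated white noise (all lagged correlations zero)
# reduces to (N - 1) / N**2, i.e. roughly 1 / N.
n, m = 100, 10
zeros = [0.0] * m
print(xdf_variance(0.0, zeros, zeros, zeros, zeros, n))
```

Two limiting cases are easy to check by hand: with $\rho = 0$ and no crosscorrelation, only the first and last terms survive, which is essentially the Bartlett-type correction; and with $X = Y$ (so $\rho = 1$ and all four correlation functions coincide) the terms cancel and the variance is zero.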


Bartlett, M. S. (1946). On the Theoretical Specification and Sampling Properties of Autocorrelated Time-Series. Supplement to the Journal of the Royal Statistical Society, 8(1), 27. http://doi.org/10.2307/2983611

Bartlett, M. S. (1935). Some Aspects of the Time-Correlation Problem in Regard to Tests of Significance. Journal of the Royal Statistical Society, 98(3), 536. http://doi.org/10.2307/2342284

Quenouille, M. H. (1947). Notes on the Calculation of Autocorrelations of Linear Autoregressive Schemes. Biometrika, 34(3/4), 365. http://doi.org/10.2307/2332450

Bayley, G. V., & Hammersley, J. M. (1946). The “Effective” Number of Independent Observations in an Autocorrelated Time Series. Supplement to the Journal of the Royal Statistical Society, 8(2), 184. http://doi.org/10.2307/2983560

Pyper, B. J., & Peterman, R. M. (1998). Comparison of methods to account for autocorrelation in correlation analyses of fish data. Canadian Journal of Fisheries and Aquatic Sciences, 55(9), 2127–2140.

Afyouni, S., Smith, S. M., & Nichols, T. E. (2018). Effective Degrees of Freedom of the Pearson's Correlation Coefficient under Serial Correlation. bioRxiv, 453795. https://www.biorxiv.org/content/early/2018/10/25/453795

Roy, R. (1989). Asymptotic covariance structure of serial correlations in multivariate time series. Biometrika, 76(4), 824–827. http://doi.org/10.1093/biomet/76.4.824