Solved – Using sample standard deviation as the dependent variable in a Gamma regression

gamma-distribution, generalized-linear-model, regression

I need to run a regression with the sample standard deviation of a distribution as the dependent variable, and the number of trials as the independent variable.

My question is: what should I expect the relationship between these two variables to be? I'm assuming a square-root relationship ($Y = \sqrt{X}$), with Gamma-distributed residuals (so the variance is tied to the mean).

So, I decided to fit the data with a Gamma generalised linear model, but no matter which link function or weighting scheme I use, I always get residuals of larger magnitude in the "fewer trials" region, and the curve seems to underestimate the true relationship. What am I doing wrong?
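For reference, the symptom is easy to reproduce. Below is a minimal sketch (all names, and the choice of $\sigma_i = \sqrt{n_i}$ as the data-generating truth, are illustrative assumptions, not taken from the question): sample standard deviations have a larger *relative* spread at small $n$, so residuals around a fitted $\sqrt{X}$ curve look much worse at few trials.

```python
import numpy as np

rng = np.random.default_rng(1)
# 20 values of n from 3 to 60, 50 replicate samples each (illustrative)
n_trials = np.repeat(np.arange(3, 63, 3), 50)
# assumed truth: underlying scale grows like sqrt(n), so E[s] ~ sqrt(n)
s = np.array([rng.normal(0.0, np.sqrt(n), n).std(ddof=1) for n in n_trials])

# least-squares fit of s = a * sqrt(n) through the origin
a = np.sum(s * np.sqrt(n_trials)) / np.sum(n_trials)
# relative residuals, the natural residual scale for a Gamma model
rel_resid = s / (a * np.sqrt(n_trials)) - 1.0

print(rel_resid[n_trials == 3].std())   # large spread at few trials
print(rel_resid[n_trials == 60].std())  # much smaller spread at many trials
```

The relative spread of a sample SD is roughly $1/\sqrt{2(n-1)}$, so it shrinks as $n$ grows; a Gamma GLM, which assumes a *constant* coefficient of variation, cannot track that.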

Best Answer

One problem I see is that in Gamma regression the shape parameter is assumed constant across observations, and the scale parameter is the one that changes with the mean.

However, if your underlying variables are $N(\mu_i,\sigma^2_i)$, from which you draw a sample of size $n_i$ and compute the sample variance $s^2_i$, then $(n_i-1)s^2_i/\sigma^2_i$ will be $\chi^2_{n_i-1}$, and a $\chi^2_{\nu}$ is a $\text{Gamma}(\nu/2,2)$ (shape-scale parameterization). So for your data it seems that the shape parameter varies with $n_i-1$, but not the scale parameter.
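A quick simulation check of this fact (the values of $n$, $\sigma$ and the number of replicates are arbitrary): the scaled sample variance matches the moments of a $\text{Gamma}\big((n-1)/2,\,2\big)$, i.e. of a $\chi^2_{n-1}$.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, sigma, reps = 8, 3.0, 200_000          # arbitrary illustrative values
s2 = rng.normal(0.0, sigma, (reps, n)).var(axis=1, ddof=1)
q = (n - 1) * s2 / sigma**2               # should be chi-square, n-1 df

# chi2_{n-1} is Gamma(shape=(n-1)/2, scale=2): compare simulated moments
g = stats.gamma(a=(n - 1) / 2, scale=2)
print(q.mean(), g.mean())   # both near n-1 = 7
print(q.var(), g.var())     # both near 2(n-1) = 14
```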

Note that as $n-1$ increases the chi-square becomes less skewed, but the Gamma GLM modelled as you suggest does not become less skewed as $n-1$ increases, because its shape parameter is held fixed.
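This is visible in the closed-form skewnesses: a $\chi^2_{\nu}$ has skewness $\sqrt{8/\nu}$, which vanishes as $\nu$ grows, while a Gamma with fixed shape $\alpha$ has skewness $2/\sqrt{\alpha}$ regardless of its scale. A small check (the particular $\nu$ and $\alpha$ values are arbitrary):

```python
from scipy import stats

# chi-square skewness sqrt(8/df) falls as df grows ...
for df in (2, 10, 50):
    print(df, float(stats.chi2(df).stats(moments='s')))

# ... but changing only the scale of a fixed-shape Gamma, as a Gamma GLM
# does across observations, leaves the skewness 2/sqrt(shape) unchanged
print(float(stats.gamma(a=4, scale=1).stats(moments='s')),
      float(stats.gamma(a=4, scale=10).stats(moments='s')))
```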

You might even be better off working with an overdispersed Poisson model; at least it will become less skewed as $n-1$ increases, unlike the Gamma model.

Even better perhaps would be to model the likelihood more directly.
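As a sketch of what modelling the likelihood directly could look like: since $s^2_i \sim \text{Gamma}\big((n_i-1)/2,\ 2\sigma^2_i/(n_i-1)\big)$, you can maximize that exact likelihood with the shape tied to $n_i$. Purely for illustration, I assume a one-parameter mean model $\sigma^2_i = b\,n_i$ (the true $b$ below is 2); everything else here is the exact chi-square/Gamma likelihood.

```python
import numpy as np
from scipy import optimize, stats

rng = np.random.default_rng(2)
n_i = np.repeat(np.arange(3, 33, 3), 40)
# simulated data under the assumed truth sigma_i^2 = 2 * n_i
s2 = np.array([rng.normal(0.0, np.sqrt(2.0 * n), n).var(ddof=1) for n in n_i])

def negloglik(log_b):
    # exact likelihood: s_i^2 ~ Gamma(shape=(n_i-1)/2, scale=2*sigma_i^2/(n_i-1))
    sigma2 = np.exp(log_b) * n_i
    shape = (n_i - 1) / 2
    scale = 2 * sigma2 / (n_i - 1)
    return -stats.gamma.logpdf(s2, a=shape, scale=scale).sum()

res = optimize.minimize_scalar(negloglik, bounds=(-5, 5), method="bounded")
print(np.exp(res.x))   # should recover b close to 2.0
```

Unlike the Gamma GLM, the shape here grows with $n_i$, so both the shrinking variance and the shrinking skewness of $s^2_i$ at large $n_i$ are handled automatically.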
