I use the generalized form of the Student's-t distribution:

\begin{align*}
f(l|\nu ,\mu ,\beta) = \frac{\Gamma (\frac{\nu+1}{2})}{\Gamma (\frac{\nu}{2}) \sqrt{\pi \nu} \beta} \left(1+\frac{1}{\nu}\left(\frac{l - \mu}{\beta}\right)^2 \right)^{-\frac{1+\nu}{2}}
\end{align*}

I want a standardized version, i.e. one with mean zero and variance one. Therefore, I set $\mu=0$ and \begin{align*}
\beta=\sqrt{\frac{\nu-2}{\nu}}
\end{align*}

which ensures that the variance is equal to one (for $\nu>2$). Inserting this, I get after some derivations

\begin{align*}
f(l|\nu) =(\pi (\nu-2))^{-\frac{1}{2}}\Gamma \left(\frac{\nu}{2} \right)^{-1} \Gamma \left(\frac{\nu+1}{2} \right) \left(1+\frac{l^2}{\nu-2} \right)^{-\frac{1+\nu}{2}}
\end{align*}
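
The substitution steps, spelled out as a sketch (valid for $\nu>2$, and using the standard result that a Student's-t variable with scale $\beta$ has variance $\beta^2\nu/(\nu-2)$):

\begin{align*}
\operatorname{Var}(l) &= \beta^2\,\frac{\nu}{\nu-2} = \frac{\nu-2}{\nu}\cdot\frac{\nu}{\nu-2} = 1, \\
\sqrt{\pi\nu}\,\beta &= \sqrt{\pi\nu}\,\sqrt{\frac{\nu-2}{\nu}} = \sqrt{\pi(\nu-2)}, \\
\frac{1}{\nu}\left(\frac{l}{\beta}\right)^2 &= \frac{1}{\nu}\cdot\frac{\nu\, l^2}{\nu-2} = \frac{l^2}{\nu-2}.
\end{align*}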

Now my question is: What is the formula for the kurtosis?

Is it still $\frac{6}{\nu-4}$?

For example, consider my data (stored in `standresidsapewma`) and the following R code:

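```
pinumber <- pi   # built-in constant; the name is kept because the plotting code uses it
# Negative log-likelihood of the standardized Student's-t for the data standresidsapewma,
# written on the log scale (lgamma) for numerical stability
loglikstandardizedt <- function(par) {
  # the standardized density is only defined for par > 2 (finite variance)
  if (par <= 2) return(Inf)
  -sum(lgamma((par + 1) / 2) - lgamma(par / 2) - 0.5 * log(pinumber * (par - 2)) -
         (1 + par) / 2 * log(1 + standresidsapewma^2 / (par - 2)))
}
# One-dimensional problem: Brent with a lower bound strictly above 2.
# par is required by optim() but ignored by Brent. BFGS started at par = 2
# fails because the likelihood is not finite there.
fit <- optim(par = 5, fn = loglikstandardizedt, method = "Brent", lower = 2.01, upper = 250)
param <- fit$par
```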

To see how good the fit is, I plot the fitted density over a kernel density estimate of the data:

```
# control output
denstiystandtresid<-function (x) (pinumber*(param-2))^(-1/2)*gamma(param/2)^(-1)*gamma((param+1)/2)*(1+x^2/(param-2))^(-(1+param)/2)
plot(density(standresidsapewma),ylim=c(0,0.8))
curve(denstiystandtresid,col="red",add=TRUE)
```

This gives me the following plot:

As you can see, the fit is, let's say, fairly OK. Now I am interested in the excess kurtosis. The data have an excess kurtosis of

```
kurtosis(standresidsapewma)
```

which gives `0.6470055`

Since the fit is quite OK in the tails, I would expect the fitted distribution to have almost the same excess kurtosis. But if I calculate it as follows (the estimate of $\nu$ is 8.85009), I get

$\frac{6}{\nu-4}=\frac{6}{8.85009-4}=1.23709$

which is considerably more than 0.64. This seems wrong to me: since I believe the fit in the tails is quite OK, shouldn't the kurtosis be almost the same? Is my formula for the excess kurtosis of a standardized Student's-t distribution wrong? Or what is my mistake?

## Best Answer

Sample moments typically converge slowly to the true moments. This is the reason why you are observing such discrepancies between the two methods. For instance, run the following code several times
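
A minimal sketch of such an experiment, under assumptions of my own (sample size 1000, true $\nu = 8$; the names `negloglik`, `nu_hat`, `sample_exkurt` and `mle_exkurt` are hypothetical). It draws a standardized Student's-t sample, then compares the moment-based sample excess kurtosis with the value $6/(\hat\nu-4)$ implied by the maximum likelihood estimate:

```r
set.seed(1)          # remove this line to get a different sample on each run
n  <- 1000           # assumed sample size
nu <- 8              # assumed true degrees of freedom (must exceed 4)

# Standardized Student's-t sample: rescale rt() draws so the variance is one
x <- rt(n, df = nu) * sqrt((nu - 2) / nu)

# Negative log-likelihood of the standardized Student's-t density
negloglik <- function(par) {
  if (par <= 2) return(Inf)
  -sum(lgamma((par + 1) / 2) - lgamma(par / 2) - 0.5 * log(pi * (par - 2)) -
         (par + 1) / 2 * log1p(x^2 / (par - 2)))
}

# One-dimensional minimisation over the degrees of freedom
nu_hat <- optimize(negloglik, interval = c(2.01, 250))$minimum

# Sample excess kurtosis (plain moment estimator, no package needed)
sample_exkurt <- mean((x - mean(x))^4) / mean((x - mean(x))^2)^2 - 3

# Excess kurtosis implied by the fitted distribution (finite only if nu_hat > 4)
mle_exkurt <- if (nu_hat > 4) 6 / (nu_hat - 4) else Inf

print(c(sample = sample_exkurt, mle = mle_exkurt, true = 6 / (nu - 4)))
```

Dropping `set.seed(1)` and rerunning shows how much both estimates move from sample to sample, even though the data really come from the fitted family.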

In many cases the two estimators (sample kurtosis and MLE) differ. You got one of those samples where they differ.

Moreover (and maybe more importantly), the sample kurtosis converges to the true kurtosis, while the MLE kurtosis converges to the kurtosis of the member of the fitted family that best approximates the true distribution under the likelihood criterion. The two limits coincide only if the true distribution belongs to the fitted family.
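
As a side note, the formula $\frac{6}{\nu-4}$ itself is not at fault: kurtosis is unchanged by shifting and rescaling, so the standardized version keeps the usual value. A short derivation, writing $l=\beta T$ with $T$ a standard Student's-t variable with $\nu>4$ degrees of freedom (the symbol $T$ is my notation, not the question's), and using the standard moments $E[T^2]=\frac{\nu}{\nu-2}$ and $E[T^4]=\frac{3\nu^2}{(\nu-2)(\nu-4)}$:

\begin{align*}
\frac{E[l^4]}{E[l^2]^2} - 3
&= \frac{\beta^4\,E[T^4]}{\beta^4\,E[T^2]^2} - 3
= \frac{3\nu^2}{(\nu-2)(\nu-4)}\cdot\frac{(\nu-2)^2}{\nu^2} - 3 \\
&= \frac{3(\nu-2)}{\nu-4} - 3
= \frac{6}{\nu-4}.
\end{align*}

The scale $\beta$ cancels, so the result does not depend on the standardization.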

I agree with @whuber that the fit of your proposal distribution is pretty bad. You are unnecessarily restricting the distribution (a Student-t would provide a much better fit for almost the same computational cost). Check