Solved – Fitting t distribution to financial data

distributions, fitting, normal distribution, r

The data I am using can be found here: http://uploadeasy.net/upload/nwv0.rar
The variable is called "alvsloss".

I want to fit a distribution to my financial data. First of all, I started with a normal distribution.

The pdf is given by:

$\frac{1}{\sqrt{2\pi\sigma^2}}\operatorname{exp}\left\{-\frac{\left(x-\mu\right)^2}{2\sigma^2}\right\}$

and the log-likelihood, which is to be maximized, is given by

$ \ln\mathcal{L}(\mu,\sigma^2) = \sum_{i=1}^n \ln f(x_i;\,\mu,\sigma^2) = -\frac{n}{2}\ln(2\pi) - \frac{n}{2}\ln\sigma^2 - \frac{1}{2\sigma^2}\sum_{i=1}^n (x_i-\mu)^2. $

which gives as a solution

$ \hat{\mu} = \overline{x} \equiv \frac{1}{n}\sum_{i=1}^n x_i, \qquad \hat{\sigma}^2 = \frac{1}{n} \sum_{i=1}^n (x_i - \overline{x})^2. $

I can calculate this in R via mean(alvsloss) and sd(alvsloss), which gives the values
-0.0004270872 and 0.02304159, or I can run

library(MASS)
fitdistr(alvsloss, "normal")

and get the solution

       mean             sd      
  -0.0004270872    0.0230371244 
 ( 0.0004535429) ( 0.0003207033)
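
The small gap between the two sd values (0.02304159 from sd() versus 0.0230371244 from fitdistr) is expected: sd() divides by n - 1, whereas the maximum likelihood estimate above divides by n. A minimal sketch of this on simulated data, where the vector x is only a hypothetical stand-in for alvsloss:

```r
# Simulated stand-in for alvsloss (assumption: the real data are not reproduced here)
set.seed(1)
x <- rnorm(2500, mean = 0, sd = 0.023)
n <- length(x)

sd_sample <- sd(x)                        # divides by n - 1
sd_mle    <- sqrt(mean((x - mean(x))^2))  # divides by n, as in the formula above

# The two differ only by the factor sqrt((n - 1) / n)
all.equal(sd_mle, sd_sample * sqrt((n - 1) / n))
```

For a sample of a few thousand observations the factor is very close to 1, so the two estimates agree to several digits.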

OK, I understand this. My problem arises when I try the same with the t distribution. The pdf is given by:

$\textstyle\frac{\Gamma \left(\frac{\nu+1}{2} \right)} {\sqrt{\nu\pi}\,\Gamma \left(\frac{\nu}{2} \right)} \left(1+\frac{x^2}{\nu} \right)^{-\frac{\nu+1}{2}}\!$

the log likelihood can be found here:
http://de.scribd.com/doc/52587265/80/STUDENT%E2%80%99S-t-DISTRIBUTION formula (17.8). (There, x is denoted by $\epsilon$. I do not understand where the $h_i$ in the formula comes from, but this is not my main question.)

First question: It seems to me that there is no closed-form solution. Is that right?

Second question:
I do fitdistr(alvsloss, "t")

and get the output

  m               s               df      
  -0.0004919768    0.0130128873    2.6340459185 
 ( 0.0003182568) ( 0.0003453702) ( 0.1620424078)
There were 28 warnings (use warnings() to see them)

I do not understand why I have three parameters now. The df is the $\nu$, but what are the others? What is R doing? Since there is no closed-form solution, I guess it is using an iterative algorithm? Should I worry about the 28 warnings? They are:

Warning messages:

1: In dt((x - m)/s, df, log = TRUE) : NaNs produced
2: In dt((x - m)/s, df, log = TRUE) : NaNs produced
3: In dt((x - m)/s, df, log = TRUE) : NaNs produced
4: In dt((x - m)/s, df, log = TRUE) : NaNs produced
5: In dt((x - m)/s, df, log = TRUE) : NaNs produced
6: In dt((x - m)/s, df, log = TRUE) : NaNs produced
7: In dt((x - m)/s, df, log = TRUE) : NaNs produced
8: In log(s) : NaNs produced
9: In log(s) : NaNs produced
10: In log(s) : NaNs produced
11: In log(s) : NaNs produced
12: In log(s) : NaNs produced
13: In log(s) : NaNs produced
14: In log(s) : NaNs produced
15: In log(s) : NaNs produced
16: In log(s) : NaNs produced
17: In dt((x - m)/s, df, log = TRUE) : NaNs produced
18: In log(s) : NaNs produced
19: In log(s) : NaNs produced
20: In dt((x - m)/s, df, log = TRUE) : NaNs produced
21: In dt((x - m)/s, df, log = TRUE) : NaNs produced
22: In dt((x - m)/s, df, log = TRUE) : NaNs produced
23: In log(s) : NaNs produced
24: In log(s) : NaNs produced
25: In log(s) : NaNs produced
26: In log(s) : NaNs produced
27: In log(s) : NaNs produced
28: In log(s) : NaNs produced

And how can I plot this distribution with the estimated parameters?

Best Answer

The estimators correspond to $(\mu,\sigma,\nu)$, the parameters of a Student-$t$ distribution with location parameter $\mu\in{\mathbb R}$, scale parameter $\sigma>0$ and $\nu>0$ degrees of freedom. This density is simply given by

$$f(x;\mu,\sigma,\nu)=\dfrac{1}{\sigma}\dfrac{\Gamma \left(\frac{\nu+1}{2} \right)} {\sqrt{\nu\pi}\,\Gamma \left(\frac{\nu}{2} \right)} \left(1+\dfrac{(x-\mu)^2}{\nu \sigma^2} \right)^{-\frac{\nu+1}{2}}.$$
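
In R terms, this density is the standard dt evaluated at (x - m)/s and divided by s. A quick sanity check, sketched under the assumption that the rounded estimates from the question are used, and with dt_ls and pt_ls as hypothetical helper names that do not belong to any package:

```r
# Location-scale t density and CDF built from base R's standard-t functions.
# dt_ls and pt_ls are hypothetical helper names, not part of any package.
dt_ls <- function(x, m, s, df) dt((x - m) / s, df) / s
pt_ls <- function(q, m, s, df) pt((q - m) / s, df)

# Rounded estimates taken from the fitdistr output in the question
m <- -0.0005; s <- 0.013; df <- 2.63

# Probability of [-0.15, 0.15]: numerical integral of the density vs. the CDF
p_num <- integrate(dt_ls, -0.15, 0.15, m = m, s = s, df = df)$value
p_cdf <- pt_ls(0.15, m, s, df) - pt_ls(-0.15, m, s, df)
c(numerical = p_num, cdf = p_cdf)
```

The two numbers should agree up to integration error. Dropping the division by s in dt_ls would break the agreement, which is why the factor $1/\sigma$ appears in the density.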

There seems to be no closed-form expression for these estimators, so they have to be found numerically.
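
To see why, it may help to write out the log-likelihood implied by the density above for a sample $x_1,\dots,x_n$. This is a derivation sketch; the weights $w_i$ are notation introduced only for this illustration.

$$\ell(\mu,\sigma,\nu) = n\ln\Gamma\left(\frac{\nu+1}{2}\right) - n\ln\Gamma\left(\frac{\nu}{2}\right) - \frac{n}{2}\ln(\nu\pi) - n\ln\sigma - \frac{\nu+1}{2}\sum_{i=1}^n \ln\left(1+\frac{(x_i-\mu)^2}{\nu\sigma^2}\right)$$

Setting the derivative with respect to $\mu$ to zero gives

$$\frac{\partial \ell}{\partial \mu} = (\nu+1)\sum_{i=1}^n \frac{x_i-\mu}{\nu\sigma^2+(x_i-\mu)^2} = 0 \quad\Longleftrightarrow\quad \mu = \frac{\sum_{i=1}^n w_i x_i}{\sum_{i=1}^n w_i}, \qquad w_i = \frac{1}{\nu\sigma^2+(x_i-\mu)^2}.$$

The weights $w_i$ themselves depend on $\mu$ (and on $\sigma$ and $\nu$), so this is a fixed-point condition rather than an explicit formula, unlike the normal case where $\hat\mu=\overline{x}$.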

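The warnings in this case are harmless: they appear when the optimiser tries values with s ≤ 0 or df ≤ 0 along the way, for which log(s) and dt return NaN. You can check this by finding the MLE with the command optim, as follows: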

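# negative log-likelihood of the location-scale t distribution (optim minimises)
# par = c(m, s, df); returns Inf outside the parameter space (s > 0, df > 0)
loglik <- function(par){
  if(par[2] > 0 && par[3] > 0) return(-sum(dt((alvsloss - par[1])/par[2], df = par[3], log = TRUE) - log(par[2])))
  else return(Inf)
}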

# optimisation step
optim(c(0,0.1,2.5),loglik)
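
One way to avoid the invalid trial values altogether is to optimise over log(s) and log(df), so that every real-valued parameter vector maps to a valid scale and degrees of freedom. The following is a minimal sketch of that idea, not what fitdistr does internally. The data are simulated as a stand-in for alvsloss, and the names negloglik and theta are chosen for illustration only.

```r
# Simulated stand-in for alvsloss (assumption: the real data are not reproduced here)
set.seed(42)
x <- 0.001 + 0.013 * rt(2500, df = 3)

# Negative log-likelihood in theta = (m, log s, log df):
# exp() keeps s and df positive, so no NaNs can be produced
negloglik <- function(theta) {
  m  <- theta[1]
  s  <- exp(theta[2])
  df <- exp(theta[3])
  -sum(dt((x - m) / s, df = df, log = TRUE) - log(s))
}

# Rough starting values: median for location, half the IQR for scale, df = 10
start <- c(median(x), log(IQR(x) / 2), log(10))
fit <- optim(start, negloglik, control = list(maxit = 2000))  # Nelder-Mead by default

# Transform back to the original parametrisation
est <- c(m = fit$par[1], s = exp(fit$par[2]), df = exp(fit$par[3]))
est
```

On the simulated data the estimates should land near the values used to generate them (scale 0.013, 3 degrees of freedom), and no warnings are issued during the search.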

The following code shows how to plot the fitted density together with a kernel density estimator.

# fitted density
param <- optim(c(0,0.01,2.5),loglik)$par
fit.den <- function(x) dt((x-param[1])/param[2],df=param[3])/param[2]

# fitted density (black) with the kernel density estimate overlaid (red)
curve(fit.den,-0.15,0.15)
lines(density(alvsloss),col="red")