You are essentially looking for a multivariate measure of skewness and kurtosis. There are many; I would start with the most established ones, the multivariate skewness and kurtosis measures of Mardia (1970) [0].
It seems to me you are asking more for an implementation than about the measures themselves. I don't know of any Matlab implementation, but the R code below (from the R library psych) should be fairly easy to translate into Matlab:
mardia <- function(x, na.rm = TRUE, plot = TRUE) {
  cl <- match.call()
  x <- as.matrix(x)                  # in case it was a data frame
  if (na.rm) x <- na.omit(x)
  n <- dim(x)[1]
  p <- dim(x)[2]
  x <- scale(x, scale = FALSE)       # zero-center the columns
  S <- cov(x)
  S.inv <- solve(S)
  D <- x %*% S.inv %*% t(x)          # n x n matrix of Mahalanobis cross-products
  b1p <- sum(D^3)/n^2                # Mardia's multivariate skewness
  b2p <- tr(D^2)/n                   # Mardia's multivariate kurtosis; tr() is psych's trace, sum(diag(.))
  chi.df <- p*(p + 1)*(p + 2)/6
  k <- (p + 1)*(n + 1)*(n + 3)/(n*((n + 1)*(p + 1) - 6))
  small.skew <- n*k*b1p/6            # small-sample corrected skewness statistic
  M.skew <- n*b1p/6
  M.kurt <- (b2p - p*(p + 2))*sqrt(n/(8*p*(p + 2)))
  p.skew <- 1 - pchisq(M.skew, chi.df)
  p.small <- 1 - pchisq(small.skew, chi.df)
  p.kurt <- 2*(1 - pnorm(abs(M.kurt)))
  d <- sqrt(diag(D))                 # Mahalanobis distances, for the QQ plot
  if (plot) {
    qqnorm(d)
    qqline(d)
  }
  results <- list(n.obs = n, n.var = p, b1p = b1p, b2p = b2p, skew = M.skew,
                  small.skew = small.skew, p.skew = p.skew, p.small = p.small,
                  kurtosis = M.kurt, p.kurt = p.kurt, d = d, Call = cl)
  class(results) <- c("psych", "mardia")
  return(results)
}
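For reference, here is a minimal Python/NumPy sketch of the same computation (the function name and result keys mirror the R code above; it assumes complete cases and a non-singular covariance matrix, and skips the QQ plot and small-sample correction):

```python
import numpy as np
from scipy import stats

def mardia(x):
    """Mardia's multivariate skewness and kurtosis tests (sketch)."""
    x = np.asarray(x, dtype=float)
    n, p = x.shape
    x = x - x.mean(axis=0)                   # zero-center the columns
    S = np.cov(x, rowvar=False)              # p x p sample covariance
    D = x @ np.linalg.solve(S, x.T)          # n x n Mahalanobis cross-products
    b1p = np.sum(D ** 3) / n ** 2            # multivariate skewness
    b2p = np.trace(D ** 2) / n               # multivariate kurtosis (sum of squared diagonal / n)
    chi_df = p * (p + 1) * (p + 2) / 6
    M_skew = n * b1p / 6
    M_kurt = (b2p - p * (p + 2)) * np.sqrt(n / (8 * p * (p + 2)))
    p_skew = 1 - stats.chi2.cdf(M_skew, chi_df)
    p_kurt = 2 * (1 - stats.norm.cdf(abs(M_kurt)))
    return dict(n_obs=n, n_var=p, b1p=b1p, b2p=b2p,
                skew=M_skew, p_skew=p_skew, kurtosis=M_kurt, p_kurt=p_kurt)

rng = np.random.default_rng(0)
res = mardia(rng.normal(size=(200, 3)))      # multivariate normal: expect large p-values
print(res)
```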
- [0] K. V. Mardia (1970). Measures of multivariate skewness and kurtosis with applications. Biometrika, 57(3), 519-530.
Since the question was updated, I have updated my answer:
The first part (to compute the skewness, why not standardize both the mean and the variance?) is easy: that is precisely how it's done! See the definitions of skewness and kurtosis on Wikipedia.
The second part is both easy and hard. On one hand, we could say that it is impossible to normalize a random variable to satisfy three moment conditions, since a linear transformation $X \to aX + b$ allows only for two. But on the other hand, why should we limit ourselves to linear transformations? Sure, shift and scale are by far the most prominent (maybe because they are sufficient most of the time, say for limit theorems), but what about higher-order polynomials, taking logs, or convolving a variable with itself? In fact, isn't that what the Box-Cox transform is all about: removing skew?
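To illustrate the Box-Cox point, here is a quick sketch using SciPy's `boxcox` (which picks the exponent by maximum likelihood; the lognormal sample is just an illustrative choice of a strongly skewed variable):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
x = rng.lognormal(mean=0.0, sigma=1.0, size=5000)  # strongly right-skewed sample
y, lam = stats.boxcox(x)                           # ML estimate of the Box-Cox exponent

# the transformed sample has far less skew than the original
print(stats.skew(x), stats.skew(y), lam)
```

For a lognormal sample the estimated exponent comes out near zero, i.e. the transform is essentially a log, exactly the "remove the skew" behavior described above.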
But in the case of more complicated transformations, I think, the context and the transformation itself become important, so maybe that is why there are no more "moments with names". That does not mean that random variables are not transformed or that the moments are not calculated; on the contrary. You just choose your transformation, calculate what you need, and move on.
The old answer, about why centralized moments represent shape better than raw moments:
The keyword is shape. As whuber suggested, by shape we want to consider the properties of the distribution that are invariant to translation and scaling. That is, when you consider the variable $X + c$ instead of $X$, you get the same distribution function (just shifted to the right or left), so we would like to say that its shape stayed the same.
The raw moments do change when you translate the variable, so they reflect not only the shape but also the location. In fact, you can take any random variable and shift it, $X \to X + c$, appropriately to get any value for, say, its raw third moment.
The same observation holds for all odd moments and, to a lesser extent, for even moments (they are bounded from below, and the lower bound does depend on the shape).
Centralized moments, on the other hand, do not change when you translate the variable, which is why they are more descriptive of the shape. For example, if an even centralized moment is large, you know that the random variable has some mass not too close to the mean. And if an odd centralized moment is zero, you know that your random variable has some symmetry around the mean.
The same argument extends to scale, which is the transformation $X \to cX$. The usual normalization in this case is division by the standard deviation, and the corresponding moments are called standardized (or normalized) moments, at least on Wikipedia.
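The shift and scale invariance is easy to check numerically; a small sketch (the helper `standardized_moment` is mine, not a library function):

```python
import numpy as np

def standardized_moment(x, k):
    # z-scores are invariant under X -> a*X + b for a > 0,
    # so the k-th standardized moment depends only on shape
    x = np.asarray(x, dtype=float)
    z = (x - x.mean()) / x.std()
    return (z ** k).mean()

rng = np.random.default_rng(1)
x = rng.exponential(size=10_000)

# same third standardized moment before and after an affine transform
print(standardized_moment(x, 3), standardized_moment(3.0 * x + 7.0, 3))
```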
Best Answer
The moments are defined in terms of integrals.
For continuous random variables
$E(X)=\int_{-\infty}^\infty x f(x) dx$
More generally:
$E(X^k)=\int_{-\infty}^\infty x^k f(x) dx$
$E[(X-\mu)^k]=\int_{-\infty}^\infty (x-\mu)^k f(x) dx$
See Wikipedia on Moments (mathematics).
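For a concrete case where the integrals can be evaluated numerically, here is a sketch using SciPy's `quad` on a normal density (the parameter values $\mu = 2$, $\sigma = 1.5$ are arbitrary):

```python
import numpy as np
from scipy import stats
from scipy.integrate import quad

mu, sigma = 2.0, 1.5
pdf = lambda t: stats.norm.pdf(t, loc=mu, scale=sigma)

# E[X] = integral of t * f(t); E[(X - mu)^2] = integral of (t - mu)^2 * f(t)
mean, _ = quad(lambda t: t * pdf(t), -np.inf, np.inf)
var, _ = quad(lambda t: (t - mu) ** 2 * pdf(t), -np.inf, np.inf)

print(mean, var)   # close to 2.0 and 2.25
```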
If we can evaluate the relevant integral, yes.
The skewness of a random variable is not the third raw moment of that variable; it is the third central moment divided by the cube of the standard deviation (the third standardized moment).
Wikipedia on skewness
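A numerical check of this distinction: the third raw moment, the third central moment, and the skewness generally all differ, and the last one matches `scipy.stats.skew` (with its default population formula):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
x = rng.exponential(size=50_000)

mu = x.mean()
raw3 = (x ** 3).mean()               # E[X^3], the raw third moment
central3 = ((x - mu) ** 3).mean()    # E[(X - mu)^3], the third central moment
skewness = central3 / x.std() ** 3   # the third standardized moment

print(raw3, central3, skewness)
print(stats.skew(x))                 # agrees with `skewness`
```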
Variance is the second central moment, so it follows from the formula I gave above by putting $k=2$.
Yes, using basic properties of expectation, you can write $E[(X-\mu)^2]=E[X^2]-\mu^2$.
See Wikipedia on variance.
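This identity holds exactly for sample moments as well (with the sample mean in place of $\mu$), which makes it easy to sanity-check:

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.normal(loc=5.0, scale=2.0, size=1000)

mu = x.mean()
lhs = ((x - mu) ** 2).mean()      # E[(X - mu)^2]
rhs = (x ** 2).mean() - mu ** 2   # E[X^2] - mu^2

print(lhs, rhs)                   # identical up to rounding
```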
Strictly the integral is over the real line, but the pdf is only non-zero within its support, so effectively, yes.
\begin{eqnarray} E[(X-\mu)^3]&=&E[(X^3-3\mu X^2+3\mu^2 X - \mu^3)]\\ &=&E(X^3)-3\mu E(X^2)+3\mu^2 E(X) - E(\mu^3)\\ &=&E(X^3)-3\mu E(X^2)+2\mu^3 \end{eqnarray}
The general case is given by Wikipedia in the article on central moments.
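The same kind of numerical check works for the expansion above; with sample moments and the sample mean it holds exactly, up to floating point (the gamma sample is just an arbitrary skewed distribution):

```python
import numpy as np

rng = np.random.default_rng(5)
x = rng.gamma(shape=2.0, size=1000)

mu = x.mean()
lhs = ((x - mu) ** 3).mean()                                   # E[(X - mu)^3]
rhs = (x ** 3).mean() - 3 * mu * (x ** 2).mean() + 2 * mu ** 3

print(lhs, rhs)
```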