Lognormal Distribution – Is it Possible to Analytically Integrate x Multiplied by the Lognormal Probability Density Function

distributionslognormal distribution

Firstly, by analytically integrate, I mean, is there an integration rule to solve this as opposed to numerical analyses (such as trapezoidal, Gauss-Legendre or Simpson's rules)?

I have a function $\newcommand{\rd}{\mathrm{d}}f(x) = x g(x; \mu, \sigma)$ where
$$
g(x; \mu, \sigma) = \frac{1}{\sigma x \sqrt{2\pi}} e^{-\frac{1}{2\sigma^2}(\log(x) – \mu)^2}
$$
is the probability density function of a lognormal distribution with parameters $\mu$ and $\sigma$. Below, I'll abbreviate the notation to $g(x)$ and use $G(x)$ for the cumulative distribution function.

I need to calculate the integral
$$
\int_{a}^{b} f(x) \,\rd x \>.
$$

Currently, I'm doing this with numerical integration using the Gauss-Legendre method. Because I need to run this a large number of times, performance is important. Before I look into optimizing the numerical analyses/other pieces, I would like to know if there are any integration rules to solve this.

I tried applying the integration-by-parts rule, and I got to this, where I'm stuck again,

$\int u \,\mathrm{d}v = u v – \int v \mathrm{d}u$.
$u=x \implies \rd u = \rd x$
$\rd v = g(x) \rd x \implies v = G(x)$
$u v – \int v \rd x = x G(x) – \int G(x) \rd x$

I'm stuck, as I can't evaluate the $\int G(x) \rd x$.

This is for a software package I'm building.

Best Answer

Short answer: No, it is not possible, at least in terms of elementary functions. However, very good (and reasonably fast!) numerical algorithms exist to calculate such a quantity and they should be preferred over any numerical integration technique in this case.

Quantity of interest in terms of normal cdf

The quantity you are interested in is actually closely related to the conditional mean of a lognormal random variable. That is, if $X$ is distributed as a lognormal with parameters $\mu$ and $\sigma$, then, using your notation, $$ \newcommand{\e}{\mathbb{E}}\renewcommand{\Pr}{\mathbb{P}}\newcommand{\rd}{\mathrm{d}} \int_a^b f(x) \rd x = \int_a^b \frac{1}{\sigma \sqrt{2\pi}} e^{-\frac{1}{2\sigma^2}(\log(x) - \mu)^2} \rd x = \Pr(a \leq X \leq b) \e(X \mid a \leq X \leq b) \>. $$

To get an expression for this integral, make the substitution $z = (\log(x) - (\mu + \sigma^2))/\sigma$. This may at first appear a bit unmotivated. But, note that using this substitution, $x = e^{\mu + \sigma^2} e^{\sigma z}$ and by a simply change of variables, we get $$ \int_a^b f(x) \rd x = e^{\mu + \frac{1}{2}\sigma^2} \int_{\alpha}^{\beta} \frac{1}{\sqrt{2\pi}} e^{-\frac{1}{2} z^2} \rd z \> , $$ where $\alpha = (\log(a) - (\mu + \sigma^2))/\sigma$ and $\beta = (\log(b) - (\mu + \sigma^2))/\sigma$.

Hence, $$ \int_a^b f(x) \rd x = e^{\mu + \frac{1}{2}\sigma^2} \big( \Phi(\beta) - \Phi(\alpha) \big) \>, $$ where $\Phi(x) = \int_{-\infty}^x \frac{1}{\sqrt{2\pi}} e^{-z^2/2} \rd z$ is the standard normal cumulative distribution function.

Numerical approximation

It is often stated that no known closed form expression for $\Phi(x)$ exists. However, a theorem of Liouville from the early 1800's asserts something stronger: There is no closed form expression for this function. (For the proof in this particular case, see Brian Conrad's writeup.)

Thus, we are left to use a numerical algorithm to approximate the desired quantity. This can be done to within IEEE double-precision floating point via an algorithm of W. J. Cody's. It is the standard algorithm for this problem, and utilizing rational expressions of a fairly low order, it's pretty efficient, too.

Here is a reference that discusses the approximation:

W. J. Cody, Rational Chebyshev Approximations for the Error Function, Math. Comp., 1969, pp. 631--637.

It is also the implementation used in both MATLAB and $R$, among others, in case those make it easier to obtain example code.

Here is a related question, in case you're interested.

Related Solutions

Solved – Lognormal Distribution & Probability

Q1. No, you don't need to treansform your vector. But you do need to test if the lognormal is a good distribution to fit or not.

Q2. I will show you how you can assess visually by using a histogram and fit a lognormal distribution (i.e. estimating mean and variance). The required probability is the last line of the code. Here I have plugged in the estimated mean and standard deviation. If you want to find the probability that the number of units of cars sold is less that $z$ then it would be: $$P(X\leq z)=plnorm(z,meanlog=\hat{\mu},sdlog=\hat{\sigma}),$$ where $\hat{\mu}$ and $\hat{\sigma}$ are estimated mean and standard deviation. So $$P(750\leq X\leq800)=P(X\leq 800)-P(X<750).$$ For the log-normal distribution, the analytic formula for the $P(X\leq z)$ is: $$P(X\leq z)=\frac{1}{2}+\frac{1}{2}erf\Big[\frac{\ln(z)-\hat{\mu}}{\sqrt{2}\hat{\sigma}}\Big],$$ where $erf$ is the error function defined in here.

> cars <- c(4950,2475,2017,917,1100,825,1650,1283,1008,1283,642,550,788,825,715,1082,1118,770,605,825)
> require(MASS)
> hist(cars, freq=F)
> fit<-fitdistr(cars,"log-normal")$estimate
> lines(dlnorm(0:max(cars),fit[1],fit[2]), lwd=3)
> fit
  meanlog     sdlog 
6.9806382 0.5162654 
> plnorm(800,meanlog=fit[1],sdlog=fit[2])-plnorm(750,meanlog=fit[1],sdlog=fit[2])
[1] 0.04072665
>

Best Answer

Related Solutions

Solved – Lognormal Distribution & Probability

Related Question