Mean and Variance of a Logarithmic Distribution

expected value, probability, variance

I'm studying probability theory and ran into a question regarding the mean and variance of distributions that I'm having difficulty with. To be a bit more specific,

Question: (From Introduction to Probability (2e) – Blitzstein & Hwang)

Let $X$ have PMF

$$P(X=k)\ =\ c\frac{p^k}{k}\ (k = 1,\ 2,\ …)$$

where $p$ is a parameter with $0 \lt p \lt 1$ and $c$ is a normalizing constant. We have $c=\frac{-1}{\log(1-p)}$, as seen from the Taylor series

$$-\log(1-p) = p\ +\ \frac{p^2}{2}\ +\ \frac{p^3}{3}\ +\ …$$

This distribution is called the Logarithmic distribution. Find the mean and variance of $X$.
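
As a quick sanity check on the normalizing constant, here is a minimal Python sketch. The helper name `pmf` and the value $p = 0.3$ are illustrative choices, not from the book. It sums a truncated version of the PMF and should give a total very close to 1.

```python
import math

def pmf(k, p):
    """P(X = k) = c * p**k / k with c = -1 / log(1 - p), for k = 1, 2, ..."""
    c = -1.0 / math.log(1.0 - p)
    return c * p**k / k

p = 0.3  # arbitrary illustrative value in (0, 1)
# Truncate the infinite sum; the tail is negligible for this p.
total = sum(pmf(k, p) for k in range(1, 200))
print(total)  # should be very close to 1
```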

I was able to find $E(X)$ quite easily, but am having trouble finding $E(X^2)$.

\begin{align}
E(X)& = \sum_{k=1}^\infty k \times c\frac{p^k}{k} \\
& = c\sum_{k=1}^\infty k \times \frac{p^k}{k} \\
& = c\sum_{k=1}^\infty p^k \\
& = \frac{-1}{(1-p)\log(1-p)} \\
\end{align}

When finding $E(X^2)$, I'm using the same approach, but with $k^2$ instead of $k$. This leaves me with $\sum_{k=1}^\infty kp^k$. How do I deal with this kind of series?

EDIT

Apparently, my mean calculation is also wrong. According to this Wikipedia page, the expected value of a Logarithmic distribution is

$$E(X) = \frac{-p}{(1-p)\log(1-p)}$$

I don't understand how the $p$ appeared in the numerator. Would anybody be kind enough to give me a hint?

Thank you!

Best Answer

Apologies for the confusion; I found the answer thanks to a hint from Fede Poncio in a comment on the original question. The second part of my question, in the EDIT section, came from my initial misunderstanding of geometric series.


\begin{align} E(X) & = \sum_{k=1}^\infty k \times c \frac{p^k}{k}\\ & = c\sum_{k=1}^\infty p^k \\ & = c\sum_{k=1}^\infty p \times p^{k-1} \\ & = -\frac{p}{\log(1-p)}\sum_{k=1}^\infty p^{k-1} \\ & = -\frac{p}{\log(1-p)}\frac{1}{1-p} \\ & = -\frac{p}{(1-p)\log(1-p)} \end{align}


I had forgotten that the geometric series formula $\sum_{k=0}^\infty x^k = \frac{1}{1-x}$ starts from $x^0$, not $x^1$. Since the sum here starts at $k=1$, I factored out one $p$ so that the remaining series $\sum_{k=1}^\infty p^{k-1}$ starts from $p^0$.


\begin{align} E(X^2) & = \sum_{k=1}^\infty k^2 \times c\frac{p^k}{k} \\ & = cp\sum_{k=1}^\infty kp^{k-1} \\ & = -\frac{1}{\log(1-p)} \times \frac{p}{(1-p)^2} \\ & = -\frac{p}{(1-p)^2\log(1-p)} \end{align}


\begin{align} Var(X) & = E(X^2)\ -\ (E(X))^2 \\ & = -\frac{p}{(1-p)^2\log(1-p)}\ -\ \frac{p^2}{(1-p)^2(\log(1-p))^2} \\ & = -\frac{p\log(1-p)\ +\ p^2}{(1-p)^2(\log(1-p))^2} \\ & = -\frac{p^2\ +\ p\log(1-p)}{(1-p)^2(\log(1-p))^2} \end{align}


You can verify that these are correct from the Wikipedia page for the Logarithmic distribution.
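
The closed forms can also be checked numerically. The following is a minimal Python sketch, not a definitive implementation: the function names `logarithmic_moments` and `closed_form`, the truncation at 10,000 terms, and the test values of $p$ are my own illustrative assumptions. It compares truncated series sums against the formulas derived above.

```python
import math

def logarithmic_moments(p, terms=10_000):
    """Approximate E(X) and Var(X) by truncating the defining series."""
    c = -1.0 / math.log(1.0 - p)  # normalizing constant
    mean = 0.0
    second = 0.0
    pk = 1.0
    for k in range(1, terms + 1):
        pk *= p                   # pk == p**k
        prob = c * pk / k         # P(X = k)
        mean += k * prob
        second += k * k * prob
    return mean, second - mean * mean

def closed_form(p):
    """Mean and variance from the formulas derived above."""
    log_q = math.log(1.0 - p)
    mean = -p / ((1.0 - p) * log_q)
    var = -(p * p + p * log_q) / ((1.0 - p) ** 2 * log_q ** 2)
    return mean, var

for p in (0.1, 0.5, 0.9):
    print(p, logarithmic_moments(p), closed_form(p))
```

For each $p$, the two pairs printed should agree to many decimal places. A larger `terms` value may be needed as $p$ gets very close to 1, since the tail of the series decays more slowly there.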


For anybody who's also wondering how to deal with the series:

$$\sum_{k=1}^\infty kp^{k-1}$$

as Fede Poncio pointed out, it helps to observe that this is the term-by-term derivative, with respect to $p$, of the geometric series:

\begin{align} &\sum_{k=0}^\infty p^k = \frac{1}{1-p} \\ &\sum_{k=1}^\infty kp^{k-1} = \frac{1}{(1-p)^2} \\ \end{align}
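
To spell out that step (a sketch, assuming $|p| < 1$ so that the power series may be differentiated term by term), and to connect it to the series $\sum_{k=1}^\infty kp^k$ from the question:

\begin{align} \sum_{k=1}^\infty kp^{k-1} & = \frac{d}{dp}\sum_{k=0}^\infty p^k = \frac{d}{dp}\left(\frac{1}{1-p}\right) = \frac{1}{(1-p)^2} \\ \sum_{k=1}^\infty kp^k & = p\sum_{k=1}^\infty kp^{k-1} = \frac{p}{(1-p)^2} \end{align}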
