Generating Function Intuition

generating-functions, intuition, probability-theory, soft-question, taylor-expansion

I am trying to understand the use of generating functions. I understand that we can compress a sequence into a generating function, so that the coefficients of the power series it generates are the elements of the sequence. But I don't understand what changing the input does.

Let's say we have the generating function: $$G(x)=\sum^\infty_{k=0} p_k x^k$$

What happens when we give different values to $x$; what is changing, intuitively? I thought the $x^k$ term was there to encode each coefficient's location in the sequence, since we can't add $p_a x^a$ and $p_b x^b$ if $a \neq b$, so the terms stay distinct. But I saw that for a probability distribution the property $G(1)=1$ must hold. Is this the only case where giving a value to $x$ is useful?

Thanks a lot in advance for the explanations.

Best Answer

If $X$ is a discrete random variable taking values in the non-negative integers $\{0,1, \dots\}$, then the probability generating function of $X$ is defined as:

$$\color{blue}{\displaystyle G(z)=\mathbb{E} \left(z^{X}\right)=\sum_{x=0}^{\infty }p(x)\;z^{x}}$$

where $p$ is the probability mass function of $X$. The choice of $z$ instead of $x$ simply reflects the fact that what we are doing is a $z$-transform.

Notice in what follows that $z$ acts like a clothesline on which to hang the values of interest, which are recovered by differentiating and then evaluating at $0$ (to recover the PMF) or at $1$ (for the moments). This magic happens because evaluating at $z=0$ kills the entire tail of terms (leaving a single PMF value), while evaluating at $z=1$ makes the powers of $z$ disappear. In either case $z$ is not related to the random variable and contributes no information of its own; it is the equivalent of a dummy variable.
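To make the clothesline picture concrete, here is a minimal sketch in Python/SymPy (the fair six-sided die is just an assumed example, not something from the question) that hangs a PMF on the powers of $z$ and then evaluates the resulting function at a few points:

```python
import sympy as sp

z = sp.symbols('z')

# PMF of one roll of a fair six-sided die, hung on the "clothesline" z:
# G(z) = sum_k p_k z^k with p_k = 1/6 for k = 1..6.
G = sum(sp.Rational(1, 6) * z**k for k in range(1, 7))

print(sp.expand(G))   # z/6 + z**2/6 + ... + z**6/6
print(G.subs(z, 1))   # 1 -- total probability, the G(1) = 1 property
print(G.subs(z, 0))   # 0 -- the coefficient of z^0, i.e. P(X = 0)
```

The value plugged in for $z$ carries no meaning of its own; $0$ and $1$ are simply the two substitutions that make the unwanted terms drop out.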

CHARACTERISTICS:

  1. IT GIVES YOU PROBABILITIES by differentiating:

$$\color{blue}{\large p_i = \left. \frac{1}{i!}\quad\frac{d^i \, G(z)}{dz^i} \right|_{z=0}=\frac{1}{i!} \;G^{(i)}\;(0)}$$
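As a quick sanity check of this formula, here is a short SymPy sketch; the Binomial$(3, \tfrac14)$ distribution is an assumed example, and any pgf would do:

```python
import sympy as sp

z = sp.symbols('z')

# Assumed example: X ~ Binomial(n = 3, p = 1/4), with pgf (1 - p + p*z)^n.
p = sp.Rational(1, 4)
G = (1 - p + p*z)**3

# p_i = G^(i)(0) / i!  -- differentiate i times, evaluate at 0, divide by i!
for i in range(4):
    p_i = sp.diff(G, z, i).subs(z, 0) / sp.factorial(i)
    print(i, p_i)   # matches C(3, i) * (1/4)**i * (3/4)**(3 - i)
```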

  2. $G\,(1)=1$ because $$\displaystyle\sum_{i=0}^\infty p_i \; 1^i=1$$

  3. The first derivative:

$$G^{(1)}(z) =\frac{d}{dz}\mathbb E\left[z^X\right]=\mathbb E\left[X\,z^{X-1}\right]$$

  4. The first derivative evaluated at $1$ gives you the mean: $$G^{(1)}(1) =\left.\mathbb E\left[X\,z^{X-1}\right]\right|_{z=1}=\mathbb E\left[X\cdot 1^{X-1}\right]= \mathbb E[X].$$
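For a concrete check of the mean, here is a brief SymPy sketch assuming a Poisson$(\lambda)$ example, whose pgf is $e^{\lambda(z-1)}$:

```python
import sympy as sp

z, lam = sp.symbols('z lambda', positive=True)

# Assumed example: X ~ Poisson(lam), with pgf exp(lam*(z - 1)).
G = sp.exp(lam*(z - 1))

mean = sp.diff(G, z).subs(z, 1)   # G'(1) = E[X]
print(mean)                       # lambda
```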

  5. The second derivative evaluated at $1$ is the second factorial moment, and is NOT the variance, because the second term in the result, $\mathbb E[X]$, is not squared:

$$\begin{align}G^{(2)}\;(1) &=\frac{d^2}{dz^2}\; \left.\mathbb E\left[z^X\right]\right|_{z=1}\\[2ex]&=\mathbb E\left[X\;(X-1)\;z^{X-2}\right]\\[2ex]&=\mathbb E\left[X\;(X-1)\right]\\[2ex]&=\mathbb E\left [X^2-X\right ]\\[2ex]&=\mathbb E\left[X^2\right] - \mathbb E\left[X\right]\end{align}$$

  6. Generalizing, then, the $i$-th derivative evaluated at $1$ is the $i$-th factorial moment:

$$G^{(i)}\;(1)= \mathbb E\left[X\;(X-1)\;\cdots\;(X-i+1)\right]$$
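Continuing with the assumed Poisson$(\lambda)$ example, whose $i$-th factorial moment is $\lambda^i$, a SymPy sketch of this property:

```python
import sympy as sp

z, lam = sp.symbols('z lambda', positive=True)
G = sp.exp(lam*(z - 1))   # pgf of the assumed Poisson(lam) example

# i-th derivative at z = 1 gives the i-th factorial moment E[X(X-1)...(X-i+1)]
for i in (1, 2, 3):
    print(i, sp.diff(G, z, i).subs(z, 1))   # lambda, lambda**2, lambda**3
```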

  7. To get the variance,

$$\begin{align}\sigma^2 &= \mathbb E\left[X^2\right]-\mathbb E\left[X\right]^2 \\[2ex] &=G^{(2)}\;(1)+G^{(1)}\;(1)-\left[G^{(1)}\;(1)\right]^2 \end{align}$$
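A sketch of this variance recipe, again under the assumed Poisson$(\lambda)$ example (for which the mean and the variance should both come out to $\lambda$):

```python
import sympy as sp

z, lam = sp.symbols('z lambda', positive=True)
G = sp.exp(lam*(z - 1))   # pgf of the assumed Poisson(lam) example

G1 = sp.diff(G, z).subs(z, 1)      # G'(1)  = E[X]
G2 = sp.diff(G, z, 2).subs(z, 1)   # G''(1) = E[X(X-1)]

variance = sp.simplify(G2 + G1 - G1**2)
print(variance)   # lambda
```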

  8. We can get raw moments by repeatedly applying the operator $z\,\frac{d}{dz}$ to the pgf and then evaluating at $z=1$:

$$\mathbb E\left[X^i\right]= \left. \left(z\,\frac{d}{dz}\right)^{i} G(z)\right|_{z=1}$$
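Finally, a sketch of the raw-moment operator $\left(z\frac{d}{dz}\right)^i$, still using the assumed Poisson$(\lambda)$ example, for which $\mathbb E[X]=\lambda$ and $\mathbb E[X^2]=\lambda^2+\lambda$:

```python
import sympy as sp

z, lam = sp.symbols('z lambda', positive=True)
G = sp.exp(lam*(z - 1))   # pgf of the assumed Poisson(lam) example

def raw_moment(pgf, i):
    """Apply the operator (z * d/dz) i times, then set z = 1."""
    expr = pgf
    for _ in range(i):
        expr = z * sp.diff(expr, z)
    return sp.simplify(expr.subs(z, 1))

print(raw_moment(G, 1))   # lambda             = E[X]
print(raw_moment(G, 2))   # lambda**2 + lambda = E[X^2]
```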
