[Math] Mean and variance of Binomial Distribution.

I was reading a paper that gives a dynamic programming model of an R&D project. It said that the performance drift (the uncertainty in the performance of the product being developed) follows a binomial distribution. From period $t$ to the next period, the performance may unexpectedly improve with probability $p$, or it may deteriorate with probability $1-p$ because of unexpected adverse events. We generalize the binomial distribution by allowing the performance improvement and deterioration, respectively, to be “spread” over the next $N$ performance states with transition probabilities

$p_{ij} = \frac{p}{N}$ if $j \in \{i+\frac{1}{2}, \dots, i+\frac{N}{2}\}$, and $p_{ij} = \frac{1-p}{N}$ if $j \in \{i-\frac{1}{2}, \dots, i-\frac{N}{2}\}$

I understand everything up to this point.

Now it says that the mean of this distribution is $\frac{N+1}{4}(2p-1)+i$ and the variance is $\frac{N+1}{8}\left(\frac{N}{3}+(N+1)\left(\frac{1}{3}-\frac{(2p-1)^2}{2}\right)\right)$

I have no idea how they get this mean and variance. From what I know about the binomial distribution, the mean is $np$ and the variance is $npq$ (where $q=1-p$). Can anyone please show how they derived this mean and variance?

Edit 1: $p_{ij}$ is the probability of going from product performance $i$ to product performance $j$. So if product performance is $i$ at time $t$ and $j$ at time $t+1$, then $j>i$ means the product performance improved, and $j<i$ means it deteriorated. See the left section of this figure and the Note underneath to understand the meaning of "spread":

(figure not reproduced)

Best Answer

I'll answer how they derived the mean or expected value of the distribution, and then I'll leave it to you to derive the variance (it'll involve a similar sort of method).

This is assuming $i=0$ (you can easily prove that for an arbitrary $i$ it just amounts to adding $i$ at the end). Think of this as just 'shifting' the distribution $i$ units to the right (or left).

For a random variable $X$ we define the 'mean' or expected value of $X$ as

$E(X) = \sum_{x} x\,P(X=x)$, where $P(X=x)$ is the probability that the random variable $X$ takes on the value $x$ and the sum runs over all possible values $x$.

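Let's first define the random variable in this scenario. With the starting state set to $0$, the letter $i$ is free to be reused as an index, and $X$ takes the values

$X = \frac{i}{2}$ for $i \in \{\pm 1, \pm 2, \dots, \pm N\}$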

We can also define a probability function for $X$ as

$P(X=\frac{i}{2}) = \frac{p}{N}$ for $i \in \{1,2,\dots,N\}$

$P(X=\frac{i}{2}) = \frac{1-p}{N}$ for $i \in \{-1,-2,\dots,-N\}$
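As a quick numerical sanity check (not something taken from the paper), the probability function above can be tabulated in Python and its mean and variance compared with the closed forms quoted in the question. The helper names `drift_pmf` and `mean_and_variance` and the sample values $N=4$, $p=3/4$ are illustrative assumptions; exact fractions are used so the comparison is not affected by rounding.

```python
from fractions import Fraction


def drift_pmf(i, N, p):
    """Return {next_state: probability} for the 'spread' transition from state i.

    Hypothetical helper: improvement by k/2 has probability p/N and
    deterioration by k/2 has probability (1-p)/N, for k = 1..N.
    """
    pmf = {}
    for k in range(1, N + 1):
        step = Fraction(k, 2)
        pmf[i + step] = p / N
        pmf[i - step] = (1 - p) / N
    return pmf


def mean_and_variance(pmf):
    """Compute the mean and variance by direct summation over the pmf."""
    mean = sum(x * w for x, w in pmf.items())
    var = sum((x - mean) ** 2 * w for x, w in pmf.items())
    return mean, var


# Sample values chosen only for illustration.
i, N, p = 0, 4, Fraction(3, 4)
pmf = drift_pmf(i, N, p)
mean, var = mean_and_variance(pmf)

# Closed forms as quoted in the question.
mean_formula = Fraction(N + 1, 4) * (2 * p - 1) + i
var_formula = Fraction(N + 1, 8) * (
    Fraction(N, 3) + (N + 1) * (Fraction(1, 3) - (2 * p - 1) ** 2 / 2)
)

print(mean, mean_formula)
print(var, var_formula)
```

Changing `i` to a nonzero starting state shifts the mean by `i` and leaves the variance unchanged, which is the 'shifting' remark made earlier.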

So

$E(X) = \sum_{x} x\,P(X=x) = \sum_{i \in \{1,2,\dots,N\}}\frac{p}{N}\frac{i}{2} + \sum_{i \in \{-1,-2,\dots,-N\}}\frac{1-p}{N}\frac{i}{2}$

Also,

$\sum_{i \in \{1,2,\dots,N\}}\frac{p}{N}\frac{i}{2} = \frac{1}{2}\frac{p}{N} \sum_{i \in \{1,2,\dots,N\}}{i}$

But $\sum_{i \in \{1,2,\dots,N\}}{i}$ is just the sum of the first $N$ natural numbers, which is known to be $\frac{N(N+1)}{2}$

So $\sum_{i \in \{1,2,\dots,N\}}\frac{p}{N}\frac{i}{2} = \frac{1}{2}\frac{p}{N} \sum_{i \in \{1,2,\dots,N\}}{i} = \frac{1}{2}\frac{p}{N}\frac{N(N+1)}{2} = \frac{p(N+1)}{4}$

Similarly, because the indices are now negative, we can show that $\sum_{i \in \{-1,-2,\dots,-N\}}\frac{1-p}{N}\frac{i}{2} = -\frac{1}{2}\frac{1-p}{N}\frac{N(N+1)}{2} = \frac{(p-1)(N+1)}{4}$

And so $E(X) = \sum_{x} x\,P(X=x) = \sum_{i \in \{1,2,\dots,N\}}\frac{p}{N}\frac{i}{2} + \sum_{i \in \{-1,-2,\dots,-N\}}\frac{1-p}{N}\frac{i}{2} = \frac{p(N+1)}{4} + \frac{(p-1)(N+1)}{4} = \frac{(2p-1)(N+1)}{4}$

Adding the starting state $i$ back in gives the stated mean, $\frac{N+1}{4}(2p-1)+i$.
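If you want to check your work on the variance, here is a sketch along the same lines. It again takes the starting state to be $0$ (shifting by a constant does not change the variance) and relies on the standard identity $\sum_{i=1}^{N} i^2 = \frac{N(N+1)(2N+1)}{6}$.

$E(X^2) = \sum_{i=1}^{N}\frac{p}{N}\frac{i^2}{4} + \sum_{i=1}^{N}\frac{1-p}{N}\frac{i^2}{4} = \frac{1}{4N}\cdot\frac{N(N+1)(2N+1)}{6} = \frac{(N+1)(2N+1)}{24}$

Using $E(X) = \frac{(2p-1)(N+1)}{4}$ from the mean calculation,

$\operatorname{Var}(X) = E(X^2) - E(X)^2 = \frac{(N+1)(2N+1)}{24} - \frac{(N+1)^2(2p-1)^2}{16}$

Writing $2N+1 = N + (N+1)$ and factoring out $\frac{N+1}{8}$,

$\operatorname{Var}(X) = \frac{N+1}{8}\left(\frac{N}{3} + (N+1)\left(\frac{1}{3} - \frac{(2p-1)^2}{2}\right)\right)$

which agrees with the variance quoted in the question.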