Bayesian – Calculating Posterior Standard Deviation from Normal Sample with Discrete Prior in R

Suppose the sample $(7,2,6,12,10,9)$ is well approximated by a normal distribution with mean $\mu$ and standard deviation $6$. Use the discrete prior distribution $(8,9,10,11)$ of possible values for $\mu$ all equally likely. Find the posterior distribution and the posterior standard deviation.

So I computed the posterior using R as follows

> sample = c(7,2,6,12,10,9)
> sample.var = 6^2/length(sample)
> mu = c(8,9,10,11)
> prior = rep(0.25,4)
> likelihood= dnorm(mean(sample),mean = mu, sd = sqrt(sample.var))
> posterior = (likelihood*prior)/(sum(likelihood*prior))
> posterior
[1] 0.3434827 0.2989415 0.2202344 0.1373414
> posterior.mean = sum(posterior*mu)
> posterior.mean
[1] 9.151435

So I have the posterior probability of the values for $\mu$, and I computed the posterior mean.

But Im not sure what exactly the posterior standard deviation is supposed to be.

I have the formula, $\sqrt{\frac{\sum(x_i-\mu)^2}{n}}$ My assumption was the posterior mean should be $\mu$ in this formula, but I'm not sure if the $x_i$ values should be the sample, or the prior values.

My guess is it is this value:

> posterior.sd =  sqrt(sum((mu-posterior.mean)^2/4))
> posterior.sd
[1] 7.298286

rm(list = ls()) #create observations x<-c(7,2,6,12,10,9) #create prior, note this could also be done as #prior<-rep(.25,4) but it would change the equation a bit #prior mass prior<-.25 #parameter space mu<-seq(8,11) #create cells to form the calculations of the posterior from posterior_locus<-matrix(0,nrow = length(x),ncol = length(mu)) #note that the below method is "bad form" #I did not know if you knew the apply family of functions, so I used nested #loops #this would be very slow in a large data set for (i in 1:length(x)) { for (j in 1:length(mu)) { posterior_locus[i,j]<-dnorm(x[i],mean = mu[j],sd=6) } } # the "correct" way to solve the posterior would be to use # posterior<-apply(posterior_locus,2,sum) #this is the for-loop solution #this initializes posterior posterior<-rep(1,length(mu)) for (i in 1:length(x)) { for (j in 1:length(mu)) { posterior[j]<-posterior[j]*posterior_locus[i,j] } } posterior<-posterior*prior #normalize to unity posterior<-posterior/sum(posterior) #calculate the posterior mean #note this section could be skipped and the code could be embedded in the variance calculation posterior.mean<-sum(posterior*mu) #calculate the posterior variance #note this section could also be skipped since it could be embedded into the standard deviation calculation directly posterior.variance<-sum(posterior*((mu-posterior.mean)**2)) #calculate the posterior standard deviation posterior.sd<-sqrt(posterior.variance) plot(mu,posterior,type = "l")

Best Answer

There are a couple of types of errors in your code. I wrote the code using nested loops, but good form would recommend using a member of the apply family of functions instead. I also printed you a sanity graph. Your posterior mean is about 9.15 and your posterior standard deviation is about 1.04.

One thing to remember is that your posterior mean has to be inside your supported set of possible parameter values. Likewise, you posterior standard deviation should be sensible given the structure of your mass function.

It is better code to write the posterior as a function and the likelihood as a function and to use apply functions over the data and the parameter space. However, as you did not do that, I did not assume that you knew how to do that. Apply functions can be a bit difficult to read if you haven't used them.

Best Answer

Related Solutions

Solved – Why is square root taken for sample count “N” in standard deviation formula

Bayesian – Posterior for Discrete Prior from Normal Sample with Known Variance

Related Question