Bayesian Estimation – How to Estimate Distribution Parameters with Censored Data

Tags: bayesian, censoring, estimation, markov-chain-montecarlo, r

I am trying to understand the effect of censored data on parameter estimation, in particular with respect to Bayesian modeling. Here is an illustrative example with some code:

Assume you have $N = n + m$ i.i.d. observations from an exponential distribution with parameter $\lambda = 3$. We will also define a value $L$ such that any observation less than $L$ is censored (i.e., left-censored: we only know it fell below $L$). With this in mind, the likelihood for the data is
\begin{align}
\ell(x|\lambda) &= \prod_{i=1}^m F(L)\prod_{i=1}^n f(x_i)\\
&=\prod_{i=1}^m(1-e^{-\lambda L})\prod_{i=1}^n\lambda e^{-\lambda x_i}
\end{align}

where $m$ and $n$ are the numbers of censored and uncensored observations, respectively.

We also place a prior such that $\lambda\sim\text{Gamma}(0.01,0.01)$. Now, no matter what the level of censoring is, as long as I have a single uncensored data point my model seems to do well at recovering the true $\lambda$. However, as soon as I make the entire data set censored, I cannot estimate $\lambda$ well. This doesn't seem too shocking, given that you have no information other than the censoring point; however, I would have thought that the estimate of $\lambda$ would be whatever the prior mean is, which is not the case. Should you recover the prior if all of your data are censored?

Here is some code that estimates $\lambda$ with a random-walk Metropolis MCMC sampler. To change the percentage of censoring, you just need to change the value of probs in the quantile() call.

likelihood <- function(x, censor, lambda){
  
  # Censored observations contribute log F(L); x has been set to L for these
  llik1 <- log(1 - exp(-lambda * x[censor == 1])) 
  # Uncensored observations contribute log f(x_i)
  llik2 <- log(lambda) - lambda * x[censor == 0]
  lik  <- sum(llik1) + sum(llik2)
  
  return(lik)
}

prior <- function(lambda){
  
  prior <- dgamma(lambda, .01, .01, log = TRUE)
  
  return(prior)
}

posterior <- function(x, censor, lambda){
  
  # Log-posterior, up to an additive constant
  post <- likelihood(x, censor, lambda) + prior(lambda)
  
  return(post)
}


set.seed(4)
N = 100
x = rexp(N, 3)
L = quantile(x, probs = 1) # Censoring point (probs = 1 censors every observation)
censor = ifelse(x <= L, 1, 0) # Censoring indicator: 1 = censored

x[censor == 1] <- L # Censored values are recorded only as the censoring point

S <- 10000
lambda <- rep(NA, S)
lambda[1] <- 1

for(i in 2:S){
  
  # Random-walk Metropolis update for lambda
  lambda.star = lambda[i-1] + rnorm(1, 0, 1)
  
  # The posterior has no support below zero, so reject negative proposals
  if(lambda.star < 0){
    alpha = 0
  }else{
    ratio <- exp(posterior(x, censor, lambda.star) - posterior(x, censor, lambda[i-1]))
    alpha <- min(1, ratio)
  }
  
  if(runif(1) < alpha){
    lambda[i] <- lambda.star
  }else{
    lambda[i] <- lambda[i - 1]
  }
  
}


lambda <- lambda[5000:S] # Discard burn-in draws
mean(lambda) # Posterior mean estimate of lambda
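
When every observation is censored (probs = 1 above), the likelihood reduces to $(1-e^{-\lambda L})^N$, so the posterior can be evaluated directly on a grid and compared with the MCMC output. Here is a minimal sketch of that check, reusing N, L, and the lambda draws from above; the very wide grid is deliberate, since with a Gamma(0.01, 0.01) prior this posterior turns out to be extremely diffuse.

# Grid check for the fully censored case: the unnormalized log-posterior is
# log dgamma(lambda; 0.01, 0.01) + N * log F(L), with F(L) = 1 - exp(-lambda * L)
delta <- 0.01
grid  <- seq(delta, 2000, by = delta)
lpost <- dgamma(grid, .01, .01, log = TRUE) + N * log(1 - exp(-grid * L))
post  <- exp(lpost - max(lpost)) # subtract the max for numerical stability
post  <- post / sum(post * delta) # normalize as a Riemann sum over the grid

sum(grid * post * delta) # grid-based posterior mean
mean(lambda)             # MCMC estimate, for comparison; if the two disagree,
                         # the random walk is likely mixing poorly over this
                         # very diffuse posterior

Either way, the grid posterior is visibly not the Gamma(0.01, 0.01) prior, which is the question being asked.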

Best Answer

It is not the case that you should recover the prior if all your data is censored; there is still information in the data.
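
To see this in the exponential model from the question: with a $\text{Gamma}(a, b)$ prior and all $N$ observations censored at $L$, Bayes' rule gives
\begin{align}
p(\lambda \mid \text{data}) \propto \lambda^{a-1}e^{-b\lambda}\left(1-e^{-\lambda L}\right)^N,
\end{align}
which is the prior only if the factor $\left(1-e^{-\lambda L}\right)^N$ is constant in $\lambda$. It is not (it increases with $\lambda$), so even a fully censored sample shifts the posterior away from the prior.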

Consider the following situation. The data are i.i.d. Gaussian with unknown mean $\mu$ (whether or not we know the variance is irrelevant to this example). My prior on the mean is $\mu \sim N(0,10)$. My censoring point is $\gamma = 10$. I draw 100 observations; every one is above the censoring point. Without getting into the details of what my posterior should look like: given that I have observed 100 values $> 10$ and none $< 10$, and the underlying distribution is Normal, why would I hold to the belief that there's a 50-50 chance that the true mean is $< 0$ (which is a straightforward implication of the prior)? For any proposed value of $\mu \leq 10$, there is at most a $2^{-100}$ probability that all 100 observations would lie above the censoring point (thanks to the symmetry of the Gaussian); clearly our posterior should be substantially different from our prior if we want to capture this information.
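
Here is a minimal numerical sketch of this argument. It assumes the data standard deviation is known to be 1 and reads $N(0,10)$ as mean 0 and variance 10; both choices are mine, made only for illustration.

# Each of the 100 censored observations is known only to exceed gamma = 10,
# so it contributes P(X > gamma | mu) = pnorm(10, mu, 1, lower.tail = FALSE)
# to the likelihood. Evaluate the posterior for mu on a grid.
mu_grid <- seq(-10, 25, by = 0.01)
lpost   <- dnorm(mu_grid, 0, sqrt(10), log = TRUE) +
           100 * pnorm(10, mu_grid, 1, lower.tail = FALSE, log.p = TRUE)
post <- exp(lpost - max(lpost))
post <- post / sum(post * 0.01)

sum(mu_grid * post * 0.01)     # posterior mean: far above the prior mean of 0
sum(post[mu_grid < 0] * 0.01)  # posterior P(mu < 0): essentially zero,
                               # versus 0.5 under the prior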
