Solved – R fitting Poisson distribution with weighting

distributionspoisson distributionr

I am wanting to fit a Poisson distribution but I am wanting to weight some of the values as more important. Is there a way to do this in R with fitdistr() or some equivalent function? Here is what I have at the moment:

randoms <- rpois(15,10)
weighting <- seq(1, 100, by=1)
fit <- fitdistr(randoms, "poisson")

I would like to use the 'weighting' data as a way to emphasize the observation at the end of the data set.

Best Answer

Note that 'fitdistr does nothing but maximum likelihood estimation. That is to say, you can do it by yourself by writing down the likelihood. Below is an example for the poisson distribution in R. It can be adapted to upweight/downweight the contribution to the likelihood of each data.


density (as in R) $$f(x; \lambda) = \lambda^x \frac{\exp(-\lambda)}{x!}, \qquad \lambda > 0$$

likelihood $$L(\lambda; \mathbf{x}) = \prod_{i=1}^n \left\{ \lambda^{x_i} \frac{\exp(-\lambda)}{x_i!} \right\} $$

log-likelihood

$$\ell(\lambda; \mathbf{x}) = \sum_{i=1}^n \left\{ x_i \log(\lambda) - \lambda - \log(x_i!) \right\}$$

2nd derivatie of $\ell$:

$$ \frac{d^2 \ell}{d\lambda^2}(\lambda; \mathbf{x}) = \sum_{i=1}^n - \frac{x_i}{\lambda^2} = -\frac{1}{\lambda^2} n\bar{x}$$


#------data------
set.seed(730)
sample <- rpois(1000, 10)
#----------------

 ################################################################################
 # Using 'fitdistr'                                                             #
 ################################################################################

 library(MASS)
 fitdistr(x=sample, densfun="Poisson")
 lambda  
   10.1240000 
   ( 0.1006181)

 ################################################################################
 # writing down the log-likelihood explicitly                                   #
 ################################################################################

 #------minus log-likelihood------
 mloglik <- function(lambda2, sample) #lambda2 = log(lambda) in (-\infty, \infty)
 {
  - sum(sample * lambda2 - exp(lambda2) - log(factorial(sample)))
 }
 #--------------------------------

 #------optimisation------
 res <- nlm(f=mloglik, p=1, sample=sample)
 #------------------------

 #------recover lambda------
 lambda <- exp(res$estimate)
 round(lambda, 7)
 [1] 10.12399
 #--------------------------

 #------standard error------
 #square root of negative inverse second derivative of the log-likelihood
 se <- lambda / sqrt(length(sample) * mean(sample))
 round(se, 7)
 [1] 0.100618
 #--------------------------