R – How to Setup Panjer’s Recursion Correctly in Actuarial Science

actuarial-sciencebinningrself-study

I have a table of $k=(0,1,2,3,4,5,6)$ and $number=(40544,8082,1205,145,20,3,1)$

I need to fit data by a Compound Poisson-Gamma distribution and then make a discretization and compare results with observations.

My attemp is:

The compound Poisson-Gamma distribution can be realized as Negative Binomial distibution:

library(actuar)
library(fitdistrplus)

number <- c(rep(0, 40544), rep(1, 8082), rep(2, 1205), rep(3, 145), rep(4, 20), rep(5, 3), rep(6, 1))

fitnb <- fitdist(number, "nbinom")
shape = fitnb$estimate[1] # size = 2.080522         
lambda = fitnb$estimate[2] # mu = 0.2205776

As the result I know the lambda and shape parameters for the discretisation:

n = 6 # we have values from 0 to 6
fx <- discretize(pgamma(x, shape = shape, 1), 
                 method = "unbiased", from = 0, to = n, step = 1.0, 
                 lev = levgamma(x, shape = shape, 1))
sum(fx) # 0.9804088 < 1

Following the documentation and answer I appied the aggregateDist() function with changed parameters lambda and convolve:

The recursion will fail to start if the expected number of claims is
too large. One may divide the appropriate parameter of the frequency
distribution by $2^n$ and convolve the resulting distribution $n$ =
convolve times.

Fsc <- aggregateDist("recursive", 
                     model.freq = "poisson", 
                     model.sev = fx, 
                     lambda = lambda/(2^n), 
                     convolve = n, x.scale = 1)

But I seen again the

Warning message:
In panjer(fx = model.sev, dist = dist, p0 = p0, x.scale = x.scale,  :
  maximum number of recursions reached before the probability distribution was complete

I was surprised on the output:

summary(Fsc)
#    Aggregate Claim Amount Empirical CDF:
#            Min.      1st Qu.       Median         Mean      3rd Qu.         Max. 
#    0.000000e+00 0.000000e+00 0.000000e+00 4.261483e-01 0.000000e+00 3.200000e+04 

Finally, I have tried to compare obtained results (black) with original data (red):

plot(Fsc, do.points = FALSE, verticals = TRUE, xlim = c(0, n))
femp <- c(40544, 8082, 1205, 145, 20, 3, 1)/50000
sum(femp) # 1
plot(stepfun(0:6, diffinv(femp)), pch = 19, col = "red", add = TRUE)

enter image description here

Edit. After the jbowman's comment I have tried to use the "negative binomial":

# https://rdrr.io/cran/actuar/src/R/panjer.R
# r <- size     # size     
# p <- 1/(1+mu) # prob

Fsc_ng <- aggregateDist("recursive", 
                         model.freq = "negative binomial", 
                         model.sev = fx, 
                         size = shape, prob = 1/(1+lambda), echo = TRUE, maxit = 10)

# x Pr[S = x]   Cumulative probability
# 0 0.68415174  0.68415174
# 1 0.083957895 0.76810964
# 2 0.079291596 0.84740123
# 3 0.055756862 0.9031581
# 4 0.036292288 0.93945038
# 5 0.022994652 0.96244504
# 6 0.01271653  0.97516157
# 7 0.0066424692    0.98180404
# 8 0.0040147675    0.9858188
# 9 0.0023141156    0.98813292
# 10    0.0013060718    0.98943899

Question. How to setup the parameters of aggregateDist() function?

Best Answer

There is no need to discretize a gamma distribution; the negative binomial distribution is already discrete. Once you have the parameter estimates for the negative binomial, you are done, except for the plotting:

size <- 2.080522
mu <- 0.2205776

est_cdf <- pnbinom(0:6, mu=mu, size=size)
emp_cdf <- cumsum(c(40544, 8082, 1205, 145, 20, 3, 1)) / 50000

plot(emp_cdf ~ c(0:6), type = 'b', pch=17, xlab="X values", ylab="CDF")
lines(est_cdf ~ c(0:6), type = 'b', lty=2, pch=16, col=2)
legend("topleft", col=c(1,2), pch=c(17,16),
       legend = c("Empirical CDF",
                  "NB Estimated CDF")
)

which generates:

enter image description here

showing an almost perfect fit, with the red lines and points on top of the black ones. For a numerical comparison of the negative binomial fit (est_cdf) with the data (emp_cdf), we get:

est_cdf
[1] 0.8108675 0.9725817 0.9964581 0.9995712 0.9999502 0.9999944 0.9999994
emp_cdf
[1] 0.81088 0.97252 0.99662 0.99952 0.99992 0.99998 1.00000
Related Question