I have a table of $k=(0,1,2,3,4,5,6)$ and $number=(40544,8082,1205,145,20,3,1)$
I need to fit data by a Compound Poisson-Gamma distribution and then make a discretization and compare results with observations.
My attemp is:
The compound Poisson-Gamma distribution can be realized as Negative Binomial distibution:
library(actuar)
library(fitdistrplus)
number <- c(rep(0, 40544), rep(1, 8082), rep(2, 1205), rep(3, 145), rep(4, 20), rep(5, 3), rep(6, 1))
fitnb <- fitdist(number, "nbinom")
shape = fitnb$estimate[1] # size = 2.080522
lambda = fitnb$estimate[2] # mu = 0.2205776
As the result I know the lambda and shape parameters for the discretisation:
n = 6 # we have values from 0 to 6
fx <- discretize(pgamma(x, shape = shape, 1),
method = "unbiased", from = 0, to = n, step = 1.0,
lev = levgamma(x, shape = shape, 1))
sum(fx) # 0.9804088 < 1
Following the documentation and answer I appied the aggregateDist() function with changed parameters lambda and convolve:
The recursion will fail to start if the expected number of claims is
too large. One may divide the appropriate parameter of the frequency
distribution by $2^n$ and convolve the resulting distribution $n$ =
convolve times.
Fsc <- aggregateDist("recursive",
model.freq = "poisson",
model.sev = fx,
lambda = lambda/(2^n),
convolve = n, x.scale = 1)
But I seen again the
Warning message:
In panjer(fx = model.sev, dist = dist, p0 = p0, x.scale = x.scale, :
maximum number of recursions reached before the probability distribution was complete
I was surprised on the output:
summary(Fsc)
# Aggregate Claim Amount Empirical CDF:
# Min. 1st Qu. Median Mean 3rd Qu. Max.
# 0.000000e+00 0.000000e+00 0.000000e+00 4.261483e-01 0.000000e+00 3.200000e+04
Finally, I have tried to compare obtained results (black) with original data (red):
plot(Fsc, do.points = FALSE, verticals = TRUE, xlim = c(0, n))
femp <- c(40544, 8082, 1205, 145, 20, 3, 1)/50000
sum(femp) # 1
plot(stepfun(0:6, diffinv(femp)), pch = 19, col = "red", add = TRUE)
Edit. After the jbowman's comment I have tried to use the "negative binomial":
# https://rdrr.io/cran/actuar/src/R/panjer.R
# r <- size # size
# p <- 1/(1+mu) # prob
Fsc_ng <- aggregateDist("recursive",
model.freq = "negative binomial",
model.sev = fx,
size = shape, prob = 1/(1+lambda), echo = TRUE, maxit = 10)
# x Pr[S = x] Cumulative probability
# 0 0.68415174 0.68415174
# 1 0.083957895 0.76810964
# 2 0.079291596 0.84740123
# 3 0.055756862 0.9031581
# 4 0.036292288 0.93945038
# 5 0.022994652 0.96244504
# 6 0.01271653 0.97516157
# 7 0.0066424692 0.98180404
# 8 0.0040147675 0.9858188
# 9 0.0023141156 0.98813292
# 10 0.0013060718 0.98943899
Question. How to setup the parameters of aggregateDist() function?
Best Answer
There is no need to discretize a gamma distribution; the negative binomial distribution is already discrete. Once you have the parameter estimates for the negative binomial, you are done, except for the plotting:
which generates:
showing an almost perfect fit, with the red lines and points on top of the black ones. For a numerical comparison of the negative binomial fit (
est_cdf
) with the data (emp_cdf
), we get: