I have a large vector, $x$, of 111408 observations, each an integer from 0 to 5 giving the number of claims that occurred during a particular insurance contract.
I want to make a Q-Q plot to show that these data follow a Poisson distribution.
My code in R so far:
y <- rpois(111408, lambda = sum(x)/111408)
qqplot(qpois(ppoints(111408),lambda=sum(x)/111408), y, xlab = 'Theoretical quantiles', ylab = 'Empirical quantiles', main='Q-Q plot Poisson')
qqline(y, distribution = function(p) qpois(p, lambda = sum(x)/111408), col = 2)
But my plot looks weird and I get the following message:
Error in int_abline(a = a, b = b, h = h, v = v, untf = untf, …) :
'a' and 'b' must be finite
What am I doing wrong?
Best Answer
The function qqline does the following:
Let's focus on the computation of quantiles (that's where it starts to go wrong).
It will take your values y and compute the 1st and 3rd quartiles (this is the default probs value). But since you have so many zeros, these .25 and .75 quantiles are both zero, and this creates a division by zero further on in the computation. You can make this function work by changing probs. (But a Q-Q plot with so little variation in the y-values is not really a useful comparison. Instead of a plot, it would be better to create a table of theoretical and observed frequencies.)
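To see the division by zero concretely, the slope and intercept computation can be reproduced by hand. This is only a sketch of the arithmetic described above, not the actual source of qqline. The sample size and the rate lambda = 0.1 are assumptions chosen so that most values are zero; the real data and rate are not shown in the question.

```r
set.seed(1)
lambda <- 0.1                 # hypothetical rate, small enough that most counts are 0
y <- rpois(1e4, lambda)       # simulated stand-in for the real data

probs <- c(0.25, 0.75)        # the default probs used by qqline
qy <- quantile(y, probs, names = FALSE)  # empirical quartiles
qx <- qpois(probs, lambda)               # theoretical quartiles

# The line is drawn through the two quantile pairs, so its slope is a ratio
# of differences. Here both differences are 0, so the ratio is 0/0.
slope <- diff(qy) / diff(qx)
int <- qy[1] - slope * qx[1]

print(c(intercept = int, slope = slope))
```

Neither coefficient is finite in this situation, which is what abline complains about in the error message.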
Example of how to change probs:
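A minimal sketch, assuming simulated data in place of the real vector: the sample size, the rate 0.1, and the choice probs = c(0.5, 0.999) are all illustrative assumptions. Any pair of probabilities works as long as the two theoretical quantiles differ, so pick them to suit your own data.

```r
set.seed(1)
n <- 1e4
y <- rpois(n, lambda = 0.1)   # simulated stand-in for the real data (assumption)
lambda_hat <- mean(y)         # same estimate as sum(x)/length(x) in the question

# Probabilities chosen so that the two Poisson quantiles are not equal
probs <- c(0.5, 0.999)

qqplot(qpois(ppoints(n), lambda = lambda_hat), y,
       xlab = "Theoretical quantiles", ylab = "Empirical quantiles",
       main = "Q-Q plot Poisson")
qqline(y, distribution = function(p) qpois(p, lambda = lambda_hat),
       probs = probs, col = 2)
```

With so few distinct values, the plot will still consist of only a handful of points, so the line is of limited diagnostic use.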
Example of how to make a table (and what it would look like):
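A minimal sketch, again on simulated data, since the real vector is not available here. The rate 0.1 and the sample size are assumptions, and lumping everything from 5 upwards into the last class is one possible design choice, made so that the expected counts sum to the sample size.

```r
set.seed(1)
n <- 1e4
x <- rpois(n, lambda = 0.1)   # simulated stand-in for the real data (assumption)
lambda_hat <- mean(x)         # estimated Poisson rate

k <- 0:5
# Observed counts per class; values above 5 (if any) go into the last class
observed <- as.vector(table(factor(pmin(x, 5), levels = k)))

# Expected counts under Poisson(lambda_hat)
expected <- n * dpois(k, lambda_hat)
# Last class is "5 or more", so use the upper tail P(X > 4) instead of P(X = 5)
expected[length(k)] <- n * ppois(4, lambda_hat, lower.tail = FALSE)

tab <- data.frame(claims = c(0:4, "5+"),
                  observed = observed,
                  expected = round(expected, 1))
print(tab)
```

Replacing the simulated x with the real vector gives the observed and expected counts side by side, which is easier to read than a Q-Q plot with only six distinct values.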