Solved – Plotting the results of GLM in R

data visualization, generalized linear model, r, scatterplot

I have this data plotted as a scatter plot in Excel:

[Image: scatter plot of the data]

I ran a regression in Excel; the p-value was 2.14E-05 and the R value was 0.32. I was told that the R value was too low given how significant the p-value was, and that I should account for the dispersion of the data by fitting a GLM with quasipoisson errors in R.

This gave me

glm(formula = encno ~ temp, family = quasipoisson(link = log), 
    data = encnotemp)

Deviance Residuals: 
   Min      1Q  Median      3Q     Max  
-6.008  -2.431  -1.021   1.353   9.441  

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept) 2.005807   0.174628  11.486  < 2e-16 ***
temp        0.029065   0.006528   4.453 1.53e-05 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for quasipoisson family taken to be 10.19898)

    Null deviance: 1807.4  on 171  degrees of freedom
Residual deviance: 1620.1  on 170  degrees of freedom
AIC: NA

Number of Fisher Scoring iterations: 5

How do I analyse this output?

The problem is that the data in the scatterplot are too dispersed, and I would like to make a scatterplot from the quasipoisson GLM output that shows less dispersed (more closely fitted) data points. Is this possible?

Best Answer

"I was told the R value was too low compared to the significance of the p value" -- sounds like nonsense to me.

On the other hand, some form of glm may be a good idea (but it looks to me like the spread may be increasing more than you might expect with a quasipoisson).
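One rough way to check whether "variance proportional to the mean" is plausible is to look at the Pearson residuals from the quasipoisson fit. The following is only a sketch on simulated data: `sim`, `fit_qp` and the simulation settings are made up for illustration and are not the asker's `encnotemp`. The counts are drawn from a negative binomial so that the variance grows faster than the mean.

```r
# Simulated stand-in for encnotemp (hypothetical data, not the asker's)
set.seed(1)
sim <- data.frame(temp = runif(172, 9, 48))
mu <- exp(2 + 0.03 * sim$temp)
sim$encno <- rnbinom(nrow(sim), mu = mu, size = 1.5)  # variance = mu + mu^2 / 1.5

fit_qp <- glm(encno ~ temp, family = quasipoisson(link = "log"), data = sim)

# Dispersion estimate: Pearson chi-square divided by the residual degrees of freedom
pearson <- residuals(fit_qp, type = "pearson")
disp <- sum(pearson^2) / df.residual(fit_qp)
disp

# If the variance really is proportional to the mean, the size of the Pearson
# residuals should show no trend against the fitted values
plot(fitted(fit_qp), abs(pearson),
     xlab = "Fitted mean", ylab = "|Pearson residual|")
lines(lowess(fitted(fit_qp), abs(pearson)), col = 2)
```

The value of `disp` is the same quantity that `summary()` reports as the dispersion parameter. An upward trend in the smoothed line would suggest the spread is growing faster than a quasipoisson model assumes.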

Note that nothing about the glm changes the spread of the data; it only models how the spread changes (in a particular way). The data are still the data, and if you plot them they will still look as they do.

You can change the appearance of the data via a transformation. One that approximately stabilizes variance when the Poisson parameter is not very small is $\sqrt{y}$. If the Poisson parameter can take small values, you may like to try $\sqrt{y+\frac{3}{8}}$ or $\sqrt{y}+\sqrt{y+1}$ instead (it looks to me like you may well have small values).
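To see what "approximately stabilizes variance" means, here is a small simulation sketch. The values in `lambda` and the sample size `n` are arbitrary choices for illustration. For Poisson counts the raw variance equals the mean, while the variance of $\sqrt{y+\frac{3}{8}}$ should stay close to $\frac{1}{4}$ regardless of the mean.

```r
set.seed(42)
lambda <- c(5, 20, 50, 100)  # Poisson means to compare (arbitrary choices)
n <- 1e5                     # simulated counts per mean

# Variance of the raw counts: grows with the mean
raw_var <- sapply(lambda, function(l) var(rpois(n, l)))

# Variance after the sqrt(y + 3/8) transformation: roughly constant
vst_var <- sapply(lambda, function(l) var(sqrt(rpois(n, l) + 3/8)))

round(rbind(lambda, raw_var, vst_var), 3)
```

With a quasipoisson model the variance is a constant multiple of the mean, so the same transformation should give a roughly constant variance there too, just not equal to $\frac{1}{4}$.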

On the other hand, one that would linearize your fitted model would be a log (but that's only suitable if you don't have exact zeros).
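To make the "linearize" point concrete with the coefficient estimates reported in the question: with a log link, the fitted mean $\hat\mu$ of encno is a straight line in temp on the log scale and an exponential curve on the original scale:

$$\log \hat\mu = 2.005807 + 0.029065\,\text{temp} \quad\Longleftrightarrow\quad \hat\mu = e^{2.005807}\,e^{0.029065\,\text{temp}} \approx 7.43 \times 1.0295^{\text{temp}}$$

So each one-unit increase in temp multiplies the expected count by about 1.0295, i.e. an increase of roughly 3%.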

--

Although it won't be satisfying to you, you can plot the fitted curve via

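# Fit the model, then plot the data and overlay the fitted mean curve
encnoglm1 <- glm(formula = encno ~ temp, family = quasipoisson(link = log), 
                     data = encnotemp)
plot(encno ~ temp, data = encnotemp, xlim = c(0, 60))
newdat <- data.frame(temp = seq(9, 48, .5))  # grid of temp values to predict at
fit <- predict(encnoglm1, newdata = newdat, type = "response")  # fitted means on the count scale
lines(fit ~ temp, data = newdat, type = "l", col = 4)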

Or if you want to look at what would be a nearly constant variance if the quasipoisson were suitable:

# Same plot on the variance-stabilized scale
plot(sqrt(encno + 3/8) ~ temp, data = encnotemp, xlim = c(0, 60))
lines(sqrt(fit + 3/8) ~ temp, data = newdat, type = "l", col = 2)
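
If you also want some indication of the uncertainty in the fitted curve, one common approach is to compute standard errors on the link (log) scale and back-transform the limits. The following is only a sketch on simulated data: `sim` and `fit_qp` are made-up stand-ins for `encnotemp` and the fitted model, and the multiplier 1.96 is a normal approximation.

```r
# Simulated stand-in for encnotemp (hypothetical data, not the asker's)
set.seed(1)
sim <- data.frame(temp = runif(172, 9, 48))
sim$encno <- rnbinom(nrow(sim), mu = exp(2 + 0.03 * sim$temp), size = 1.5)

fit_qp <- glm(encno ~ temp, family = quasipoisson(link = "log"), data = sim)
newdat <- data.frame(temp = seq(9, 48, 0.5))

# Predict on the link (log) scale, build the limits there, then back-transform
pr <- predict(fit_qp, newdata = newdat, type = "link", se.fit = TRUE)
newdat$fit   <- exp(pr$fit)
newdat$lower <- exp(pr$fit - 1.96 * pr$se.fit)
newdat$upper <- exp(pr$fit + 1.96 * pr$se.fit)

plot(encno ~ temp, data = sim, xlim = c(0, 60))
lines(fit ~ temp, data = newdat, col = 4)
lines(lower ~ temp, data = newdat, col = 4, lty = 2)
lines(upper ~ temp, data = newdat, col = 4, lty = 2)
```

Note that this is a band for the fitted mean, not for individual observations, so most of the data points will still fall well outside it.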