Solved – Problems plotting GLM data of binomial proportional data

data visualizationgeneralized linear modelrscatterplot

This might be a question for the programmers, but I thought I would ask here first. Im comparing browsing pressure on plants between sites. Ive a value to indicate Browsing Pressure and count data of trees that have been damaged between locations. Ive been using Crawleys R example (page 574) regarding sex ratios and my issue is with the lines command to plot the information. I can't get a fitted regression line to the points to show up for my data. At least not using the commands Crawley suggested. Ive included Crawley's script for quick reference:

density = c(1,4,10,22,55,121,210,444)
females=c(1,3,7,18,22,41,52,79)
males=c(0,1,3,4,33,80,158,365)
numbers=as.data.frame(cbind(density,females,males))
attach(numbers)
par(mfrow=c(1,2))
p<-males/(males+females)
plot(log(density),p,ylab="Proportion male")
y<-cbind(males,females)
model<-glm(y~log(density),binomial)
xv<-seq(0,6,0.1)
plot(log(density),p,ylab="Proportion male")
lines(xv,predict(model,list(density=exp(xv)),type="response"))

My Script:

data<-####see the dput() data below to insert here
attach(data)
p1<-MS1/(MS1+M1)
y<-cbind(MS1,M1)
GM1<-glm(y~BPT,family=binomial (logit),data=data)
summary(GM1)
plot(BPT,p1,col="black",pch=1,main="Relationship a",xlab="Browsing pressure",
ylab="Moose Damage Survey")

This is where I am having difficulties interpreting Crawley's lines command. I can create a sequence for my data like his xv values – mine go from 0 to 0.7.
xv<-seq(0,0.7,0.01) But as Im not sure what is happening in the command line, blindly inserting my data results in no line showing on my graph. I have some success with regLine from the car library, but its not a line fit to my data. Any help explaining why this works for Crawleys but not my data would be appreciated

My DATA:

structure(list(S = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 
14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 
30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 
46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60), 
    BPT = c(0.195778643105884, 0.0651216427326909, 0.0199432997190648, 
    0.255717971810655, 0.21031730503132, 0.0380060076389563, 
    0.0592940237882193, 0.00425756881206537, 0.107351336677244, 
    0.353626952501855, 0.134358354834345, 0.0774003112699534, 
    0.0571275803730281, 0.102480049705623, 0.506048974065245, 
    0.637795327349053, 0, 0.00825369912299262, 0.123063816485164, 
    0.244214078077056, 0, 0.231982659809539, 0.0265824936284919, 
    0, 0, 0, 0.197470272555566, 0, 0.0481572367104686, 0.339072491473947, 
    0.0640004726042278, 0.099465609734403, 0, 0.153951519788959, 
    0.103791449528225, 0.22066571012681, 0.101286659375451, 0.0210055739071239, 
    0, 0, 0.00811999120846242, 0.0576334384710714, 0.0514027186783361, 
    0.00907843357263548, 0.0785740169740806, 0.0801612249152947, 
    0.0439884643248279, 0.189608233745472, 0.434349352638054, 
    0.188190501082836, 0.016381369054353, 0.215189432724754, 
    0.0948563822493123, 0, 0.0111745795351048, 0.0289751701436048, 
    0.0962515835377456, 0.199970627051051, 0.0306277369169669, 
    0), M1 = c(51, 123, 137, 23, 69, 98, 80, 59, 84, 87, 63, 
    88, 64, 75, 30, 19, 86, 152, 55, 22, 115, 66, 75, 124, 87, 
    179, 39, 66, 117, 37, 59, 103, 57, 65, 83, 68, 87, 104, 134, 
    89, 52, 61, 65, 75, 71, 97, 45, 37, 86, 82, 36, 139, 40, 
    56, 63, 64, 37, 79, 70, 121), M2 = c(108, 195, 255, 31, 104, 
    157, 135, 88, 117, 152, 96, 124, 102, 118, 37, 25, 135, 277, 
    97, 47, 174, 121, 134, 196, 163, 270, 75, 123, 214, 46, 109, 
    156, 92, 118, 127, 112, 134, 152, 194, 132, 107, 101, 105, 
    140, 107, 163, 87, 60, 122, 122, 66, 237, 66, 95, 108, 95, 
    50, 128, 128, 189)), .Names = c("S", "BPT", "M1", "M2"), row.names = c(NA, 
-60L), class = "data.frame")

Best Answer

Here is a solution.

First, I replaced MS1 with M2 since there is no varibale with the name MS1 in your example data.

attach(data)
p1<-M2/(M2+M1)
y<-cbind(M2,M1)

The model:

GM1<-glm(y~BPT,family=binomial (logit),data=data)

Note that there is one important difference between your model and Crawley's. His predictor varibale was log-transformed, your predictor variable BPT is used untransformed. By the way: Since BPT does include zeros, log-transformation would result in -Inf values.

Now the main plot:

plot(BPT,p1,col="black",pch=1,main="Relationship a",xlab="Browsing pressure",
     ylab="Moose Damage Survey")

Now the important differences. Crawley created a sequence xv containg values corresponding to his log-transformed predictor variable density. Note that the lengths of density and xv must be identical. Compare:

range(log(density))
#[1] 0.000000 6.095825
xv<-seq(0,6,0.1)
range(xv)
#[1] 0 6

Since your predictor variable is untransformed, you don't need this kind of sequence but can use the original variable BPT.

lines(BPT,predict(GM1,type="response"))

This is the plot:

enter image description here

If you would like to use your own sequence of values for the prediction, use this:

xv<-seq(0,0.7,length.out = length(BPT))
lines(xv,predict(GM1, list(BPT = xv), type="response"))

Best Answer

Related Solutions

Solved – GLM for proportional data

Solved – Getting the weights to compare AMORE NN models

Related Question