I'd like to model the response of three species functional groups (each expressed as a proportion of total abundance) to different environmental gradients. I thought a multiple linear regression could work well, but I have since heard that I should use multiple logistic regression, because my response variable (proportion of total counts) is bounded between 0 and 1. However, when I try to perform logistic regression in R with my data, I get the following error message:
In eval(family$initialize): non-integer #successes in a binomial glm!
More specifically, here is my very simple code:
test_clusters <- read.table("clusters.txt")  # file name must be a quoted string
head(test_clusters, 3)  # checking data
Cluster1 Cluster2 Cluster3 PC1_soil PC2_soil precip disturb
P2 0.8297214 0.01857585 0.1517028 2.200434 0.5114511 647 51.98126
P4 0.3196347 0.04109589 0.6392694 -1.016489 1.9255986 591 16.47774
P7 0.7352941 0.03361344 0.2310924 2.479751 0.6501704 516 20.30064
## test_clusters[,1:3] are the proportional abundance of each cluster, while [,4:7] are the predictor (environmental) variables
## Trying to perform multiple logistic regression to test the response of each cluster to the environmental gradients
model <- glm(Cluster1 ~ PC1_soil + PC2_soil + precip + disturb,
             data = test_clusters, family = binomial(link = "logit"))
Then I get the error message mentioned above:
In eval(family$initialize): non-integer #successes in a binomial glm!
Does anyone know what the problem is? Any other suggestions about a more appropriate test for this kind of data would be valuable.
Best Answer
You get the warning (not an error) because you did not use the `weights` argument to `glm` with the `binomial` family and a one-dimensional outcome variable that is in the $(0,1)$ range. Do you know the total population for each `Cluster1` fraction? If so, use this as the `weights` argument. Though, I may have misunderstood your problem.
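
As a rough sketch of what that could look like, the example below uses simulated data, since the table in the question does not include the totals. The column `total` (number of individuals counted per plot) and the count column `count1` are hypothetical names introduced only for illustration, and only two predictors are used to keep it short. It fits the proportion with `weights = total`, then fits the equivalent two-column (successes, failures) response, which should give the same coefficients.

```r
# Hypothetical data: 'total' and 'count1' are NOT in the original table;
# they stand in for the per-plot total count and the count in one cluster.
set.seed(1)
n <- 30
dat <- data.frame(
  PC1_soil = rnorm(n),
  precip   = runif(n, 400, 700),
  total    = sample(50:400, n, replace = TRUE)
)

# Simulate counts for one cluster, then convert them to a proportion
eta <- 0.5 + 0.8 * dat$PC1_soil - 0.001 * dat$precip
dat$count1   <- rbinom(n, size = dat$total, prob = plogis(eta))
dat$Cluster1 <- dat$count1 / dat$total

# Option 1: proportion as the response, totals passed as weights
m_w <- glm(Cluster1 ~ PC1_soil + precip, data = dat,
           family = binomial(link = "logit"), weights = total)

# Option 2: two-column response of (successes, failures)
m_c <- glm(cbind(count1, total - count1) ~ PC1_soil + precip, data = dat,
           family = binomial(link = "logit"))

# The two parameterisations should agree
all.equal(coef(m_w), coef(m_c))
```

With the totals supplied, `weights * Cluster1` recovers whole-number counts, so the "non-integer #successes" warning should no longer appear.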