Solved – Model to predict categorical outcome from continuous and categorical variables

Tags: categorical data, continuous data, generalized linear model, predictive-models, self-study

I have to fit a model to test whether Learning (1 = learned, 0 = failed) depends on lizard sex (M or F), lizard SVL (snout-vent length), or an interaction between the two.

I am new to both R and this website. Please explain each step fully.

This is random data given to us as part of a zoology/statistics assignment. It is related to lectures 'Generalized Linear Models' and 'Model Selection and Model Averaging'.

Snout-vent length is used as a measure of size (in millimeters).

Specifically, what R code is required to fit a GLM in which one categorical and one continuous predictor variable are used to predict a categorical dependent variable? In case it helps clarify what I am asking, the following is from a previous example we worked through, which had two categorical predictor variables:

plot(ProportionSurvived ~ Treatment, data=dat)
interaction.plot(dat$Sculpin, dat$Lake, response = dat$NSurvivors/dat$NSticklebackAdded)

glm(cbind(NSurvivors, NSticklebackAdded - NSurvivors) ~ Lake * Sculpin, family = binomial(link = "logit"), data=dat)

Treatment is one of four possible conditions (2 levels of each of 2 predictors: lake, and sculpin present or absent). This example tested the proportion of sticklebacks that survived (ProportionSurvived) in a pond.

Best Answer

Looks to me like you should use binary logistic regression. Your dependent variable is learning (yes/no) and all other variables are independent variables. Binary logistic regression is extremely flexible with regard to the scales of measurement used for the predictors: they can be categorical, continuous, or a mix of both, and interactions between them are allowed.
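As a rough sketch of what that could look like in R: the assignment's data file is not shown, so the data frame name lizards, the column names Learning, Sex and SVL, and all the values below are assumptions, simulated only so the code runs. Unlike the stickleback example, where each row held counts of survivors and non-survivors (hence the cbind), here each row is one lizard, so the 0/1 column can be used directly as the response.

```r
# Hypothetical data: names and values are assumptions, simulated for illustration.
set.seed(1)
n <- 60
lizards <- data.frame(
  Sex = factor(sample(c("F", "M"), n, replace = TRUE)),
  SVL = round(runif(n, min = 50, max = 90), 1)  # snout-vent length in mm
)
# Simulate a 0/1 outcome with an arbitrary dependence on SVL
lizards$Learning <- rbinom(n, size = 1,
                           prob = plogis(-3.5 + 0.05 * lizards$SVL))

# Binary logistic regression. Sex * SVL expands to Sex + SVL + Sex:SVL,
# i.e. both main effects plus their interaction.
fit_full <- glm(Learning ~ Sex * SVL,
                family = binomial(link = "logit"), data = lizards)
summary(fit_full)  # coefficients are on the log-odds scale

# Model without the interaction, for comparison
fit_add <- glm(Learning ~ Sex + SVL,
               family = binomial(link = "logit"), data = lizards)

# Likelihood-ratio test of the interaction term, and AIC for both models
anova(fit_add, fit_full, test = "Chisq")
AIC(fit_add, fit_full)

# Predicted probability of learning for a 70 mm female and a 70 mm male
newdat <- data.frame(Sex = factor(c("F", "M"), levels = levels(lizards$Sex)),
                     SVL = 70)
pred <- predict(fit_full, newdata = newdat, type = "response")
pred
```

With real data you would replace the simulated lizards data frame with your own (for example via read.csv) and make sure Sex is stored as a factor. If the interaction does not improve the fit, the simpler additive model is usually the one to interpret.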