I would like to know how the glm function in R actually works. Wouldn't it be possible to fit a logistic regression not on the raw columns of a dataset but on the four values you get in the contingency table (if you have two binary variables as outcome and predictor)? For example, take a random dataset which gives this contingency table:
          Right-handed   Left-handed   Total
Male                43             9      52
Female              44             4      48
Total               87            13     100
Is the glm function in R calculating them in the end or is it only working on the raw columns? Or is it important for the glm function to know which male is right-handed, which one left-handed and the same for the females?
Best Answer
The glm function works by maximizing the binomial log-likelihood. I suggest you read almost any book on GLMs if you are interested in learning more about how these models are fit. That being said, it is possible to re-arrange the 2x2 table in such a way that glm can be used.
Created on 2021-09-09 by the reprex package (v2.0.1)
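As a minimal sketch of such a re-arrangement (the data frame and column names below are my own, hypothetical choices), the cell counts can be passed to glm as a two-column response of the form cbind(successes, failures), with one row per group:

```r
# Cell counts taken from the contingency table in the question.
# "handed", "right" and "left" are illustrative names, not from the original post.
handed <- data.frame(
  sex   = factor(c("Male", "Female"), levels = c("Female", "Male")),
  right = c(43, 44),
  left  = c(9, 4)
)

# Two-column response: first column = "successes" (here: left-handed),
# second column = "failures" (here: right-handed).
fit_grouped <- glm(cbind(left, right) ~ sex, family = binomial, data = handed)
summary(fit_grouped)

# The exponentiated slope should agree with the odds ratio
# computed directly from the four cell counts.
exp(coef(fit_grouped)[["sexMale"]])
(9 / 43) / (4 / 44)
```

Treating left-handed as the "success" is an arbitrary choice; swapping the two columns only flips the sign of the coefficients.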
As to some of your questions:
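glm works on the raw columns you give it and does not calculate 2x2 tables itself. For a model with a single binary predictor, though, the likelihood depends only on the four cell counts, so it does not matter which individual male or female is left-handed.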
If you have replicates (e.g. you have 10 right-handed males) then you can use the weights argument to let glm know this, or you can repeat the row 10 times in your data.
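To illustrate the two options, here is a small sketch using the counts from the question (the object and column names are hypothetical). It fits the model once with one row per table cell plus weights, and once with each row repeated according to its count, then compares the coefficients:

```r
# One row per cell of the 2x2 table; n holds the number of replicates.
cells <- data.frame(
  sex  = factor(c("Male", "Male", "Female", "Female"),
                levels = c("Female", "Male")),
  left = c(0, 1, 0, 1),   # 1 = left-handed, 0 = right-handed
  n    = c(43, 9, 44, 4)
)

# Option 1: tell glm about the replicates through the weights argument.
fit_weighted <- glm(left ~ sex, family = binomial, data = cells, weights = n)

# Option 2: repeat each row n times to get one row per person (100 rows).
raw <- cells[rep(seq_len(nrow(cells)), times = cells$n), c("sex", "left")]
fit_raw <- glm(left ~ sex, family = binomial, data = raw)

# The estimates should agree up to numerical tolerance.
all.equal(coef(fit_weighted), coef(fit_raw), tolerance = 1e-6)
```

The coefficients and standard errors should match between the two fits; the reported residual degrees of freedom differ because glm counts rows (4 versus 100), not people.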