I could use some advice. I am trying to model the relationship between a binary outcome (impact present/impact absent) and a binary predictor (threat present/threat absent), and see if that relationship varies among threat categories. In other words, **does the correlation between threats and impacts vary significantly among threat categories**. Since I am modeling a binary outcome, it seems like logistic regression would be a reasonable approach, and I will discuss what I have so far below. If anyone has suggestions for other approaches they would be welcome. My end goal is to be able to say, **for each threat category, the likelihood of an impact being present if a threat present is x.**

My dataframe looks like this:

species = a categorical variable with 128 levels

threat_category = a categorical variable with 17 levels

impact.pres = a binomial variable with present = 1 and not_present = 0

threat.pres = a binomial variable with present = 1 and not_present = 0

Example data (smaller than actual dataset):

```
dat <- cbind(Species = rep(letters[1:10], each = 5),
threat_cat = rep(c("recreation", "climate", "pollution", "fire", "invasive_spp"), 10),
impact.pres = sample(0:1, size = 50, replace = T),
threat.pres = sample(0:1, size = 50, replace = T))
```

I am running a no-intercept model because I am interested in the true coefficients for each threat, not the difference between each threat and a reference threat.

My model and output looks something like this:

```
mod<- glm(impact.pres ~ 0 + threat.pres*threat_cat, data = dat, family = "binomial")
summary(mod)
Call:
glm(formula = impact.pres ~ 0 + threat.pres * threat_cat, family = "binomial",
data = dat)
Deviance Residuals:
Min 1Q Median 3Q Max
-1.66511 -0.90052 -0.00022 0.90052 1.89302
Coefficients:
Estimate Std. Error z value Pr(>|z|)
threat.pres -6.931e-01 1.323e+00 -0.524 0.600
threat_catclimate 6.931e-01 8.660e-01 0.800 0.423
threat_catfire 6.513e-16 1.000e+00 0.000 1.000
threat_catinvasive_spp 1.099e+00 1.155e+00 0.951 0.341
threat_catpollution -9.163e-01 8.367e-01 -1.095 0.273
threat_catrecreation 6.931e-01 8.660e-01 0.800 0.423
threat.pres:threat_catfire -9.163e-01 1.987e+00 -0.461 0.645
threat.pres:threat_catinvasive_spp -1.099e+00 1.958e+00 -0.561 0.575
threat.pres:threat_catpollution -1.596e+01 2.284e+03 -0.007 0.994
threat.pres:threat_catrecreation 1.099e+00 1.958e+00 0.561 0.575
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 69.315 on 50 degrees of freedom
Residual deviance: 56.785 on 40 degrees of freedom
AIC: 76.785
Number of Fisher Scoring iterations: 16
```

In my real data, threat.pres and most of the individual impacts are significant, but none of the interaction terms are. I am wondering **if I have specified the model correctly to be able to answer my question**, and if so, **how best to interpret these coefficients, and the significance levels associated with them**.

Thank you for your time.

## Best Answer

I think you actually need different syntax for your model. Here is the same model with a slightly different parameterization:

^{Created on 2022-03-02 by the reprex package (v2.0.1)}(Note that because you did not set a seed the results will differ slightly.)

With this parameterization, you can interpret the "main effect" of each category of the categorical variable as the log odds of

`impact.pres`

for that category when`threat.pres`

is 0. For example, the coefficient on`threat_catclimate`

(.5108) means that the log odds of`impact.pres`

for the`climate`

group of`threat_cat`

is .5108 when`threat.pres`

is 0, or, the odds of`impact.pres`

for the`climate`

group of`threat_cat`

is $\exp(.5108)=1.67$ when`threat.pres`

is 0.You can interpret the interaction term between

`threat.pres`

and each level of`threat_cat`

as the difference in the log odds of`impact.pres`

when`threat.pres`

is 1 vs. when`threat.pres`

is 0 for that group of`threat_cat`

. For example, the coefficient on`threat_catclimate:threat.pres`

(-.5108) means the log odds of`impact.pres`

for the`climate`

group of`threat_cat`

is -.5108 lower when`threat.pres`

is 1 than when`threat.pres`

is 0.From this model (but not the standard parameterization or your attempt), it is easy to see the "effect" of the binary variable at each level of the categorical variable without requiring a reference category.