Using some dummy data similar to what you presumably used:
set.seed(10)
dat <- data.frame(ART.conc = factor(sample(c("yes", "no"), 10, replace = TRUE),
                                    levels = c("yes", "no")),
                  Parity = factor(sample(1:3, 10, replace = TRUE)))
and, for simplicity, form the interaction variable and store it in `dat`:
dat <- transform(dat, interact = interaction(ART.conc, Parity))
OK, now `model.matrix()` will show us how the factor variables are converted into the terms that represent them in the model. First, the correct way:
> model.matrix( ~ ART.conc * Parity, data = dat)
   (Intercept) ART.concno Parity2 Parity3 ART.concno:Parity2 ART.concno:Parity3
1            1          1       1       0                  1                  0
2            1          0       1       0                  0                  0
3            1          0       0       0                  0                  0
4            1          1       1       0                  1                  0
5            1          0       1       0                  0                  0
6            1          0       1       0                  0                  0
7            1          0       0       0                  0                  0
8            1          0       0       0                  0                  0
9            1          1       1       0                  1                  0
10           1          0       0       1                  0                  0
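(Incidentally, `a * b` in a formula is just shorthand for `a + b + a:b`; a quick sketch, using the dummy `dat` above, confirming the two spellings give the same design matrix:)

m1 <- model.matrix( ~ ART.conc * Parity, data = dat)
m2 <- model.matrix( ~ ART.conc + Parity + ART.conc:Parity, data = dat)
all.equal(m1, m2)  # TRUE: `*` expands to the main effects plus the interaction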
And now the way you did it:
> model.matrix( ~ interact + ART.conc + Parity, data = dat)
   (Intercept) interactno.1 interactyes.2 interactno.2 interactyes.3
1            1            0             0            1             0
2            1            0             1            0             0
3            1            0             0            0             0
4            1            0             0            1             0
5            1            0             1            0             0
6            1            0             1            0             0
7            1            0             0            0             0
8            1            0             0            0             0
9            1            0             0            1             0
10           1            0             0            0             1
   interactno.3 ART.concno Parity2 Parity3
1             0          1       1       0
2             0          0       1       0
3             0          0       0       0
4             0          1       1       0
5             0          0       1       0
6             0          0       1       0
7             0          0       0       0
8             0          0       0       0
9             0          1       1       0
10            0          0       0       1
The obvious difference is that these two are not the same: there are far more terms in the design matrix from the `interaction()` method than from the correct way.
Now we can also see why, in the model with the `interaction()` term, you get `NA` values for some coefficients. Compare the `interactyes.3` column with the `Parity3` column in the design matrix from the `interaction()` model. Look familiar? So `interaction()` has created information that duplicates the main-effects terms you added to the formula alongside the `interaction()` variable. At the very least, you don't need to specify the main-effects terms alongside the `interaction()` variable in that version of the model. However, it would still be wrong even if you fixed that.
Consider
> with(dat, levels(interact))
[1] "yes.1" "no.1" "yes.2" "no.2" "yes.3" "no.3"
> with(dat, table(interact))
interact
yes.1  no.1 yes.2  no.2 yes.3  no.3
    3     0     3     3     1     0
What `interaction()` has done (by default, I'll add) is retain all possible combinations of the levels of the two factors; that's 6, even though we only observed 4 of them in this dummy data set. This shows up in the model matrix as a column of 0s for each of `interactno.1` and `interactno.3` if you include just the `interaction()` term in the model:
> model.matrix( ~ interact, data = dat)
   (Intercept) interactno.1 interactyes.2 interactno.2 interactyes.3
1            1            0             0            1             0
2            1            0             1            0             0
3            1            0             0            0             0
4            1            0             0            1             0
5            1            0             1            0             0
6            1            0             1            0             0
7            1            0             0            0             0
8            1            0             0            0             0
9            1            0             0            1             0
10           1            0             0            0             1
   interactno.3
1             0
2             0
3             0
4             0
5             0
6             0
7             0
8             0
9             0
10            0
Hence we can't identify coefficients for either of those all-0 columns, so you'll still get `NA`s.
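To see this in a fitted model, here's a quick sketch; `y` is a made-up response, purely for illustration:

dat$y <- rnorm(10)                  # hypothetical response variable
coef(lm(y ~ interact, data = dat))  # coefficients for interactno.1 and
                                    # interactno.3 come back as NA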
In summary, the two approaches you mention are not the same model, and you'll cause yourself a world of pain if you try to fit interactions using the `interaction()` function.
Hopefully by now you see that these forms aren't the same and why, and, more importantly, why it is a bad idea to circumvent R's formula interface.
Aside
I mentioned the default behaviour of `interaction()`. You can do better by adding `drop = TRUE` to the call:
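(The code creating `interact2` wasn't shown; presumably it was something like this:)

dat <- transform(dat, interact2 = interaction(ART.conc, Parity, drop = TRUE))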
> with(dat, levels(interact2))
[1] "yes.1" "yes.2" "no.2"  "yes.3"
Now we only have the four observed levels. Then you can do:
> model.matrix( ~ ART.conc + Parity + interact2, data = dat)
   (Intercept) ART.concno Parity2 Parity3 interact2yes.2 interact2no.2
1            1          1       1       0              0             1
2            1          0       1       0              1             0
3            1          0       0       0              0             0
4            1          1       1       0              0             1
5            1          0       1       0              1             0
6            1          0       1       0              1             0
7            1          0       0       0              0             0
8            1          0       0       0              0             0
9            1          1       1       0              0             1
10           1          0       0       1              0             0
   interact2yes.3
1               0
2               0
3               0
4               0
5               0
6               0
7               0
8               0
9               0
10              1
But this still isn't right, because you have one more term than needed; again, R can't do the right thing and remove one of the combinations of levels, which gets absorbed into the `(Intercept)` term. Note the `Parity3` and `interact2yes.3` terms: they are coded the same way. The `Parity3` term works with the `(Intercept)` term to give you the information in `interact2yes.3`.
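One quick way to confirm the aliasing is to compare the number of columns of the design matrix with its rank; a sketch:

X <- model.matrix( ~ ART.conc + Parity + interact2, data = dat)
ncol(X)     # number of columns (terms) in the design matrix
qr(X)$rank  # smaller than ncol(X) here, so some coefficients
            # cannot be estimated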
You know that in a logit:
$$Pr[y = 1 \vert x,z] = p = \frac{\exp (\alpha + \beta \cdot \ln x + \gamma z)}{1+\exp (\alpha + \beta \cdot \ln x + \gamma z )}. $$
After some tedious calculus and simplification, the partial of that with respect to $x$ becomes:
$$ \frac{\partial Pr[y=1 \vert x,z]}{\partial x} = \frac{\beta}{x} \cdot p \cdot (1-p). $$
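(The tedious part is just the chain rule: writing $\eta = \alpha + \beta \cdot \ln x + \gamma z$, we have $\partial p/\partial \eta = p(1-p)$ for the logistic function, and $\partial \eta/\partial x = \beta/x$.)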
This is (sort of) equivalent to
$$\frac{\Delta p}{\Delta x}=\frac{\beta}{x} \cdot p \cdot (1-p),$$
which can be re-written as
$$\frac{\Delta p}{100 \cdot \frac{ \Delta x}{x}}= \frac{\beta \cdot p \cdot (1-p)}{100}.$$
This is the definition of semi-elasticity, and can be interpreted as the change in probability for a 1% change in $x$.
Here's an example in Stata.* Note that I am using `margins` instead of the out-of-date `mfx` to get the average marginal effect of $x$, $\frac{1}{N}\sum_{i=1}^N\frac{\beta \cdot p_i \cdot (1-p_i)}{100}$:
. sysuse auto, clear
(1978 Automobile Data)

. gen ln_price = ln(price)

. logit foreign ln_price mpg weight, nolog

Logistic regression                               Number of obs   =         74
                                                  LR chi2(3)      =      57.69
                                                  Prob > chi2     =     0.0000
Log likelihood = -16.185932                       Pseudo R2       =     0.6406

------------------------------------------------------------------------------
     foreign |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    ln_price |   6.851215    2.11763     3.24   0.001     2.700737    11.00169
         mpg |  -.0880842   .1031317    -0.85   0.393    -.2902186    .1140503
      weight |  -.0062268   .0017269    -3.61   0.000    -.0096115   -.0028422
       _cons |  -41.32383   16.24003    -2.54   0.011    -73.15371   -9.493947
------------------------------------------------------------------------------

. margins, expression(_b[ln_price]*predict()*(1-predict())/100)

Predictive margins                                Number of obs   =         74
Model VCE    : OIM

Expression   : _b[ln_price]*predict()*(1-predict())/100

------------------------------------------------------------------------------
             |            Delta-method
             |     Margin   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       _cons |   .0046371   .0007965     5.82   0.000      .003076    .0061982
------------------------------------------------------------------------------
This means that for a 1% increase in price, the probability that a car is foreign increases by 0.005 on a [0,1] scale; a 10% increase in price gives you a 0.05 increase. In these data, about 0.3 of the cars are foreign, so these effects are economically and statistically significant.
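For comparison, here is a minimal R sketch of the same average semi-elasticity calculation; it assumes a data frame `auto` holding the same 1978 automobile data (variable names mirror the Stata example and are otherwise hypothetical):

# assumes `auto` has foreign (0/1), price, mpg, weight
fit  <- glm(foreign ~ log(price) + mpg + weight, family = binomial, data = auto)
p    <- fitted(fit)                 # predicted Pr(foreign = 1)
beta <- coef(fit)[["log(price)"]]
mean(beta * p * (1 - p) / 100)      # average semi-elasticity:
                                    # change in probability per 1% change in price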
Edit:
A good way to do this in Stata 10 is to install the user-written command `margeff`:
. margeff, dydx(ln_price) replace

Average partial effects after margeff
      y = Pr(foreign)

------------------------------------------------------------------------------
    variable |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    ln_price |   .4637103   .0796514     5.82   0.000     .3075964    .6198241
         mpg |  -.0059616    .006781    -0.88   0.379    -.0192522     .007329
      weight |  -.0004214   .0000417   -10.11   0.000    -.0005031   -.0003398
------------------------------------------------------------------------------

. lincom _b[ln_price]/100

 ( 1)  .01*ln_price = 0

------------------------------------------------------------------------------
    variable |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         (1) |   .0046371   .0007965     5.82   0.000      .003076    .0061982
------------------------------------------------------------------------------
*This is actually not a great empirical example since the relationship in the data has an inverted-U shape.
Best Answer
The interpretation you want to put on the covariate change makes sense in a simple model with one covariate. If you have two or more covariates, there could be some interaction, and the best you can say in general is that x% is the magnitude of the change when you change variable U by one standard deviation with the others held fixed at a particular value. At other places in the covariate space, the magnitude of the change could be different. If a variable has been log-transformed, you can still talk about this amount of change on the log scale, but if you want to make the claim on the original scale, you would have to work out how the change on the log scale translates into a change on the original scale.
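For a concrete illustration of the interaction point (the model and symbols here are made up for the example): if

$$\log y = \beta_0 + \beta_1 U + \beta_2 V + \beta_3 U V,$$

then a one-standard-deviation increase in $U$ changes $\log y$ by $(\beta_1 + \beta_3 V)\, s_U$, which depends on the value at which $V$ is held fixed.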