Solved – Visualizing an interaction effect of a Logistic Regression Analysis (in SPSS)

data visualization, interaction, regression, spss

What would be the best way to go about visualizing an interaction effect of a Logistic Regression Analysis (preferably using SPSS)?

There is a significant interaction between one interval (HDI) and one binary (before/after Fukushima) variable as well as between that same interval variable (HDI) and another interval variable (share of non-renewable electricity consumption). The dependent variable is binary (ecological benefits of solar power mentioned in news article about topic yes/no).

Failing any kind of more specific way of doing this, what would be more general steps to arrive at some visualization? Is a visualization even a good idea? (For context: Until now I have been looking at interactions using ANOVA where plotting interaction effects is straightforward. Those visualizations have been tremendously helpful and I would like something similar for this.)

Best Answer

Visualization is certainly a good idea if these findings are important to communicate. A narrative description of an interaction effect may be cumbersome to write and to absorb, and it is unlikely to make the same impact that a chart would.

I'd make a bubble chart. This is not entirely an SPSS solution, but if you use Excel or R it will work, especially for the continuous-continuous interaction and especially if you are not concerned about making the equivalent of partial plots, i.e., if you don't need to show how the dependent variable is a function of these independent variables while controlling for others.*

Start with a grid consisting of values of HDI on one axis and share of non-renewable electricity consumption on the other. Divide each predictor into some discrete regions (5? 10? 50? It depends on your judgment and your facility with SPSS recodes, or with SPSS's auto-recode commands). For each X-Y region, plot the bubble size as the mean of your dependent variable in that region, taking Yes to be 1 and No to be 0.

If you use R, instead of bubble size you have the option of varying the points using color or symbol.
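A minimal R sketch of the grid approach described above, under stated assumptions: the data are simulated, and the names `hdi`, `nonrenew`, and `mention` are hypothetical stand-ins for HDI, the share of non-renewable electricity consumption, and the yes/no dependent variable. Five bins per predictor is an arbitrary choice.

```r
# Simulated stand-in data; every name and coefficient here is an assumption.
set.seed(1)
n <- 500
hdi      <- runif(n, 0.4, 0.95)   # interval predictor 1
nonrenew <- runif(n, 0, 100)      # interval predictor 2 (percent)
eta <- -1 + 2 * (hdi - 0.7) + 0.01 * (nonrenew - 50) +
  0.1 * (hdi - 0.7) * (nonrenew - 50)
mention <- rbinom(n, 1, plogis(eta))   # binary DV coded 1 = yes, 0 = no

# Divide each predictor into 5 equal-width regions.
hdi_bin <- cut(hdi, breaks = 5)
nr_bin  <- cut(nonrenew, breaks = 5)

# Mean of the 0/1 DV (i.e., the proportion of "yes") in each X-Y region,
# plus the cell counts, which help to spot sparsely populated regions.
cell_mean <- tapply(mention, list(hdi_bin, nr_bin), mean)
cell_n    <- table(hdi_bin, nr_bin)

# One row per region; expand.grid() and as.vector() both run down the
# rows of the matrix first, so the values line up.
grid <- expand.grid(hdi = seq_len(nrow(cell_mean)),
                    nonrenew = seq_len(ncol(cell_mean)))
grid$prop <- as.vector(cell_mean)

# Bubble chart: circle area proportional to the proportion of "yes".
ok <- !is.na(grid$prop)
symbols(grid$hdi[ok], grid$nonrenew[ok], circles = sqrt(grid$prop[ok]),
        inches = 0.25, bg = "grey80",
        xlab = "HDI bin", ylab = "Non-renewable share bin")
```

If the bubbles grow along one axis only within certain bands of the other, that pattern is the interaction. To use color instead of size, the same `grid$prop` values can be mapped to a palette.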

The continuous-binary interaction could be done the same way; it'll just look a little simpler, perhaps a little simplistic.
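For the continuous-binary case, one possible sketch in R plots the proportion of "yes" per HDI bin as two lines, one for each level of the binary predictor. The data are simulated and the names `hdi`, `after`, and `mention` are assumptions made for illustration only.

```r
# Simulated stand-in data; names and coefficients are assumptions.
set.seed(4)
n <- 600
hdi   <- runif(n, 0.4, 0.95)
after <- rbinom(n, 1, 0.5)   # 0 = before, 1 = after Fukushima
mention <- rbinom(n, 1, plogis(-1 + (hdi - 0.7) + 0.3 * after +
                                 4 * (hdi - 0.7) * after))

# Proportion of "yes" per HDI bin, separately for before and after.
hdi_bin <- cut(hdi, breaks = 5)
prop <- tapply(mention, list(hdi_bin, after), mean)   # 5 x 2 matrix

# One line per group; non-parallel lines suggest an interaction.
matplot(seq_len(nrow(prop)), prop, type = "b", pch = 1:2, lty = 1:2,
        col = 1:2, xaxt = "n", ylim = c(0, 1), xlab = "HDI bin",
        ylab = "Proportion mentioning ecological benefits")
axis(1, at = seq_len(nrow(prop)), labels = rownames(prop))
legend("topleft", legend = c("Before", "After"),
       pch = 1:2, lty = 1:2, col = 1:2)
```

This is the closest analogue to the familiar ANOVA interaction plot, with binned proportions taking the place of cell means.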

*If you do want to incorporate such control, first regress Y on the control variables--those you're not interested in plotting. Then, instead of using the mean of Y as your plot variable, use the mean of the residuals from that regression. The tricky thing here will be what to express about the range of values for these residuals, since they won't be bounded by 0 and 1.
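The footnote's residual idea might look like the following in R. This is a sketch under assumptions: the data are simulated, `control` is a hypothetical control variable, and the first-stage regression is taken to be a logistic one with response-scale residuals (observed minus fitted probability), which the answer does not specify. Because the residuals can be negative, the cells are shown with a diverging color scale centered on zero rather than with bubble sizes.

```r
# Simulated stand-in data; names and coefficients are assumptions.
set.seed(2)
n <- 500
hdi      <- runif(n, 0.4, 0.95)
nonrenew <- runif(n, 0, 100)
control  <- rnorm(n)                    # hypothetical control variable
mention  <- rbinom(n, 1, plogis(-0.5 + 0.8 * control + 3 * (hdi - 0.7)))

# Step 1: regress Y on the control variable(s) only.
ctrl_fit <- glm(mention ~ control, family = binomial)
res <- residuals(ctrl_fit, type = "response")   # observed minus fitted

# Step 2: average the residuals within each X-Y region.
cell_res <- tapply(res, list(cut(hdi, 5), cut(nonrenew, 5)), mean)

# Step 3: heat map with a symmetric range so that white means zero.
lim <- max(abs(cell_res), na.rm = TRUE)
image(x = 1:5, y = 1:5, z = cell_res, zlim = c(-lim, lim),
      col = colorRampPalette(c("blue", "white", "red"))(11),
      xlab = "HDI bin", ylab = "Non-renewable share bin")
```

Using a symmetric `zlim` is one way to handle the range question raised above: response residuals always lie strictly between -1 and 1, and the plotted scale is set by the largest absolute cell mean.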

If I've left out some important step someone will correct me....

EDIT: You could make this entirely an SPSS operation if you discretized your Y variable and created a scatterplot of X1 with X2, either (a) using the "by" command to plot Y in multiple colors, or (b) using the "by [Y] (identify)" command to plot Y in multiple characters.

Related Solutions

Solved – Visualizing Likert responses using R or SPSS

Using `irutils`

I came across this package some months ago.

As of commit 0573195c07 on GitHub, the code won't work with a grouping= argument. Let's go for Friday's debugging session.

Start by downloading a zipped version from GitHub. You'll need to hack the R/likert.R file, specifically the likert and plot.likert functions. First, likert calls cast(), but the reshape package is never loaded (although there is an import(reshape) instruction in the NAMESPACE file); you can load it yourself beforehand. Second, the instruction that fetches item labels is incorrect: a stray i is left dangling around line 175. Fix this as well, e.g., by replacing all occurrences of likert$items[,i] with likert$items[,1]. Then install the package the way you usually do on your machine. On my Mac, I did

% tar -czf irutils.tar.gz jbryer-irutils-0573195
% R CMD INSTALL irutils.tar.gz

Then, with R, try the following:

library(irutils)
library(reshape)

# Simulate some data (82 respondents x 66 items)
resp <- data.frame(replicate(66, sample(1:5, 82, replace=TRUE)))
resp <- data.frame(lapply(resp, factor, ordered=TRUE, 
                          levels=1:5, 
                          labels=c("Strongly disagree","Disagree",
                                   "Neutral","Agree","Strongly Agree")))
grp <- gl(2, 82/2, labels=LETTERS[1:2]) # say equal group size for simplicity

# Summarize responses by group
resp.likert <- likert(resp, grouping=grp)

That should just work, but the visual rendering will be awful because of the high number of items. It works without grouping (e.g., plot(likert(resp))), though.

I would thus suggest reducing your dataset to smaller subsets of items. E.g., using 12 items,

plot(likert(resp[,1:12], grouping=grp))

I get a 'readable' stacked barchart. You can probably post-process the plots afterwards. (They are ggplot2 objects, but arranging them all on a single page with gridExtra::grid.arrange() would make them unreadable.)

Alternative solution

I would like to draw your attention to another package, HH, which can plot Likert scales as diverging stacked barcharts. We could reuse the above code as shown below:

resp.likert <- likert(resp)
detach(package:irutils)
library(HH)
plot.likert(resp.likert$results[,-6]*82/100, main="")

but that will complicate things a bit because we need to convert frequencies to counts, subset the likert object produced by irutils, detach package, etc. So let's start again with fresh (counts) statistics:

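# sapply(resp, table) keeps the factor level order; apply() would coerce the
# data frame to character and tabulate the labels in alphabetical order
plot.likert(t(sapply(resp, table)), main="", as.percent=TRUE,
            rightAxisLabels=NULL, rightAxis=NULL, ylab.right="", 
            positive.order=TRUE)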

To use a grouping variable, you'll need to work with an array of numerical values.

# compute response frequencies separately by grp
# (sapply(..., table) keeps the factor level order, unlike apply())
resp.array <- array(NA, dim=c(66, 5, 2))
resp.array[,,1] <- t(sapply(subset(resp, grp=="A"), table))
resp.array[,,2] <- t(sapply(subset(resp, grp=="B"), table))
dimnames(resp.array) <- list(NULL, NULL, group=levels(grp))
plot.likert(resp.array, layout=c(2,1), main="")

This will produce two separate panels, but it fits on a single page.

Edit 2016-6-3

As of now, likert is available as a separate package.
You no longer need the reshape library, nor do you need to detach irutils and reshape.

Solved – SPSS: Plot a multinomial logistic regression

In my opinion, a good way to understand a model is just to plot it. This is as true for logistic regression as for standard linear regression. If you don't have any interactions, you can present each variable independently. (After all, the lack of interactions means the model is assuming the effect of each variable is independent of each other variable.)

I don't know how to get SPSS to produce these plots, although I'm sure it can be done. Nonetheless, a good fallback is to be able to produce plots in Excel. You will want to start by entering the names of the variables into cells A1 through A6 (i.e., "intercept", "Market Cap", "RoA", "History", etc.), and entering the estimated values in the corresponding cells B1 through B6. You'll also want to enter the means and labels for each variable at the top somewhere.

Further down the worksheet, you'll have 2 columns for each variable. In the left column (e.g., A), enter a series of values that spans the range of a variable (e.g., market capitalization). In the column to its right, write a function that will output the predicted probability given the variable value to the left and your model. Remember that the logistic regression model is:

$$ \hat p_i=\frac{\exp\!\big(\beta_0+\beta_1\text{Mcap}+\beta_2\text{RoA}+\beta_3\text{hist}+\beta_4X_4+\beta_5X_5\big)}{1+\exp\!\big(\beta_0+\beta_1\text{Mcap}+\beta_2\text{RoA}+\beta_3\text{hist}+\beta_4X_4+\beta_5X_5\big)} $$

For the values of all the variables other than the one you are working on, use the mean of that variable. For instance, when you are getting predicted probabilities as a function of market capitalization, use the mean of RoA, etc.

Once you have two columns of corresponding values for X & Y, you can plot them. Use Excel's chart wizard, and select "scatterplot" $\rightarrow$ "smooth lines without markers".
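The same worksheet procedure can be sketched in R. This is an illustration under assumptions: the data are simulated, only three predictors are used (named `Mcap`, `RoA`, and `hist` after the formula above), and the model is a binary logistic regression, as in that formula. The predicted probability is computed twice, once with `predict()` and once by hand from the coefficients, to show that the hand calculation is exactly what the spreadsheet column would contain.

```r
# Simulated stand-in data; names and coefficients are assumptions.
set.seed(3)
n <- 300
dat <- data.frame(Mcap = rlnorm(n, 2, 0.5),
                  RoA  = rnorm(n, 0.05, 0.03),
                  hist = rpois(n, 10))
dat$y <- rbinom(n, 1, plogis(-2 + 0.15 * dat$Mcap + 10 * dat$RoA +
                               0.02 * dat$hist))

fit <- glm(y ~ Mcap + RoA + hist, data = dat, family = binomial)

# "Left column": values spanning the range of Mcap.
# All other predictors are held at their means.
newdat <- data.frame(Mcap = seq(min(dat$Mcap), max(dat$Mcap),
                                length.out = 100),
                     RoA  = mean(dat$RoA),
                     hist = mean(dat$hist))

# "Right column": predicted probability from the fitted model.
newdat$p_hat <- predict(fit, newdata = newdat, type = "response")

# The same quantity by hand: exp(eta) / (1 + exp(eta)).
b <- coef(fit)
eta <- b["(Intercept)"] + b["Mcap"] * newdat$Mcap +
  b["RoA"] * mean(dat$RoA) + b["hist"] * mean(dat$hist)
p_manual <- exp(eta) / (1 + exp(eta))

# Smooth line without markers, as in the Excel chart.
plot(newdat$Mcap, newdat$p_hat, type = "l", ylim = c(0, 1),
     xlab = "Market capitalization", ylab = "Predicted probability")
```

Repeating the `newdat` step for each predictor in turn gives one curve per variable, which matches the two-columns-per-variable layout described for the worksheet.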
