Solved – Fit multiple regression model with pairwise deletion (or on a correlation/covariance matrix) in R

Tags: covariance, covariance-matrix, missing data, multiple regression, r

I'm trying to fit a multiple regression model with pairwise deletion in the context of missing data. lm() uses listwise deletion, which I'd prefer not to use in my case. I'd also prefer not to use multiple imputation or FIML. How can I do multiple regression with pairwise deletion in R?

I have tried the mat.regress() function of the psych package, which fits regression models to correlation/covariance matrices (which can be obtained from pairwise deletion), but the regression model does not appear to include an intercept parameter.

Here's what I've tried (small example):

library(psych) #provides mat.regress()
set.seed(33333)
y <- rnorm(1000)
x1 <- y*2 + rnorm(1000, sd=.2)
x2 <- y*5 + rnorm(1000, sd=.5)

y[sample(1:1000, 10)] <- NA
x1[sample(1:1000, 10)] <- NA
x2[sample(1:1000, 10)] <- NA

mydata <- data.frame(y, x1, x2)
covMatrix <- cov(mydata, use="pairwise.complete.obs")

#Listwise Deletion
listwiseDeletion <- lm(y ~ x1 + x2, data=mydata)
observations <- nobs(listwiseDeletion) #complete cases used by lm(); length(listwiseDeletion$na.action) would count the deleted rows instead

coef(listwiseDeletion)
(Intercept)          x1          x2 
0.001995527 0.245372245 0.100001989

#Pairwise Deletion --- but missing intercept
pairwiseDeletion <- mat.regress(y="y", x=c("x1","x2"), data=covMatrix, n.obs=observations)
pairwiseDeletion$beta
       y
x1 0.1861277
x2 0.1251995

#Pairwise Deletion --- tried to add intercept, but received error when fitting model
mydata$intercept <- 0
covMatrixWithIntercept <- cov(mydata, use="pairwise.complete.obs")

pairwiseDeletionWithIntercept <- mat.regress(y="y", x=c("intercept","x1","x2"), data=covMatrixWithIntercept, n.obs=observations)
Something is seriously wrong the correlation matrix.
In smc, smcs were set to 1.0
Warning messages:
1: In cov2cor(C) :
  diag(.) had 0 or NA entries; non-finite result is doubtful
2: In cor.smooth(R) :
  I am sorry, there is something seriously wrong with the correlation matrix,
cor.smooth failed to  smooth it because some of the eigen values are NA.  
Are you sure you specified the data correctly?

So, how can I obtain an intercept parameter using mat.regress, or how can I obtain parameter estimates from pairwise deletion using another method or package in R? I've seen matrix calculations to do this, but, ideally, there'd be a package that also outputs regression diagnostics, fit stats, etc. Also, preferably, the method would be able to fit interaction terms.

Best Answer

The regtools package was just released; it supports pairwise deletion in multiple regression and principal components analysis. The package is introduced in this post and is accompanied by a JSM article that compares pairwise deletion (also known as the available-cases method) with listwise deletion and multiple imputation.
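If you would rather not depend on a package, the matrix calculation mentioned in the question can be sketched in base R. The slopes solve S_xx b = S_xy using the pairwise covariance matrix, and the intercept is recovered from the means as mean(y) - sum(mean(x) * b). This is only a rough sketch, not how regtools or psych implement it: `pairwise_lm` is a hypothetical helper name, and using each variable's available-case mean for the intercept is an assumption. It returns point estimates only, with no standard errors or diagnostics.

```r
# Simulated data from the question
set.seed(33333)
n  <- 1000
y  <- rnorm(n)
x1 <- y * 2 + rnorm(n, sd = 0.2)
x2 <- y * 5 + rnorm(n, sd = 0.5)
y[sample(n, 10)]  <- NA
x1[sample(n, 10)] <- NA
x2[sample(n, 10)] <- NA
mydata <- data.frame(y, x1, x2)

# Hypothetical helper: regression coefficients from pairwise-deleted moments.
pairwise_lm <- function(data, response, predictors) {
  vars <- c(response, predictors)
  # Each covariance uses every row where that pair of variables is observed
  S <- cov(data[vars], use = "pairwise.complete.obs")
  # Assumption: each mean uses every observed value of that variable
  m <- colMeans(data[vars], na.rm = TRUE)
  # Slopes: solve S_xx %*% b = S_xy
  slopes <- as.numeric(solve(S[predictors, predictors, drop = FALSE],
                             S[predictors, response]))
  # Intercept: the fitted plane passes through the vector of means
  intercept <- m[[response]] - sum(m[predictors] * slopes)
  c("(Intercept)" = intercept, setNames(slopes, predictors))
}

pairwise_fit <- pairwise_lm(mydata, "y", c("x1", "x2"))
print(pairwise_fit)

# Interaction term: build the product column first, then treat it as a predictor
mydata$x1x2 <- mydata$x1 * mydata$x2
interaction_fit <- pairwise_lm(mydata, "y", c("x1", "x2", "x1x2"))
print(interaction_fit)
```

On data with no missing values this reproduces the coef() of the corresponding lm() fit, which makes a convenient sanity check. With missing values, keep in mind that a pairwise covariance matrix is not guaranteed to be positive definite, so solve() can fail or give unstable estimates when the predictors are strongly collinear, as they are in this example.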