I think the difference is primarily a philosophical one when choosing Rasch/1PL models (the emphasis on what measurement means is slightly different in that literature, and hence researchers try their best to obtain these special items), and an empirical/design one when deciding between 2PL and 3PL models.
Since the slopes are all equal in 1PL models, determining a person's location amounts to finding the point where respondents have a P = 0.5 chance of answering correctly, simply by choosing the items with the best-matching intercepts to estimate $\theta$. In 2PL and 3PL models it's more complicated due to the unequal slopes and the lower-bound guessing parameters. As a consequence, 2PL and 3PL models often require more advanced adaptive item selection procedures, such as Kullback–Leibler or Fisher information criteria, to select the next best item for homing in on $\theta$; a rough sketch of the Fisher information idea is given below.
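For intuition, here is a minimal sketch of Fisher-information item selection (this is not mirtCAT's actual machinery, which automates all of this; `mod`, `theta_hat`, and `remaining` are hypothetical placeholders for a fitted unidimensional mirt model, a provisional ability estimate, and the indices of the items not yet administered):

```r
library(mirt)

# Pick the remaining item with maximal Fisher information at the
# current provisional theta estimate
next_item <- function(mod, theta_hat, remaining) {
    info <- sapply(remaining, function(i)
        iteminfo(extract.item(mod, i), Theta = matrix(theta_hat)))
    remaining[which.max(info)]
}

# e.g., next_item(mod, theta_hat = 0.3, remaining = c(2, 5, 9))
```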
Speaking purely from a design perspective, if the adaptive test items offer a finite set of response options (i.e., multiple choice, where guessing is possible) then the 3PL seems like the better option, but if the items are more fill-in-the-blank in style (e.g., 2 + 3 = __) then the 1PL and 2PL models would, at least theoretically, be more reasonable.
As a precursor, the IRT approach to this problem is computationally very demanding due to the higher dimensionality. It may be worthwhile to look into structural equation modeling (SEM) alternatives using the WLSMV estimator for ordinal data, since I imagine fewer issues will exist; plus, including external covariates is much easier within that framework. Both approaches I describe here are also possible in SEM.
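For reference, a short sketch of what that SEM alternative might look like in lavaan (the item and data names here are placeholders, not from the original analysis):

```r
library(lavaan)

# One-factor model for ordinal items; declaring the items as ordered
# together with estimator = "WLSMV" triggers the polychoric/WLSMV machinery
model <- ' theta =~ item1 + item2 + item3 + item4 '
fit <- cfa(model, data = dat,
           ordered = c("item1", "item2", "item3", "item4"),
           estimator = "WLSMV")
summary(fit, fit.measures = TRUE)
```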
There are two ways that I know of to estimate unidimensional longitudinal IRT models that are not Rasch in nature. The first approach requires a unique latent factor for each time block and a specific residual variation term for each item. A different approach, similar to what one would find in the SEM literature, is a latent growth curve model whereby only a fixed number of factors are estimated (three if the relationship over time is believed to be linear). Fixed loadings are used in this approach, so it may be computationally much more stable; I would tend to prefer the growth curve model for both the smaller dimensionality and the reduced number of estimated parameters.
The idea for both approaches is to set up latent time factors indicating how person-level $\theta$ values change over each test administration, and to constrain their loadings across time so that their hyper-parameters (i.e., the latent means and covariances) can be estimated. Item parameters must also be constrained to be invariant across time, so that person differences are captured only in the hyper-parameters. Since this approach can require a huge number of integration dimensions, you'll need to use something like the dimension reduction algorithm, which is available in mirt through the bfactor() function.
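To give the flavor of the first approach, here is a hedged sketch of the bfactor() setup (the variable layout is hypothetical; see the linked data-simulation script below for the full, runnable specification). Suppose the same 4 items are administered at 2 time points, stored as 8 columns with items 1-4 at time 1 and items 5-8 at time 2:

```r
library(mirt)

# Pair each item with its own specific factor so that bfactor()'s
# dimension reduction absorbs the item-specific residual variation
specific <- c(1, 2, 3, 4,   # time 1 administrations
              1, 2, 3, 4)   # the same items repeated at time 2

# Two 'general' time factors; free the second latent mean and variance
# (plus the covariance) so change over time lands in the hyper-parameters
time_structure <- mirt.model('
    Time1 = 1-4
    Time2 = 5-8
    COV   = Time1*Time2, Time2*Time2
    MEAN  = Time2')

# mod <- bfactor(dat, model = specific, model2 = time_structure)
# Equality constraints on the repeated item parameters across time are
# still required for invariance; the linked script shows these in full.
```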
Instead of going through a worked example here, which would take a lot of code, I'll simply point to worked versions of these analyses. A word of warning though: these are very computationally demanding and may take more than an hour to converge on your computer, since you have 4 dimensions of integration in the first case and 3 dimensions in the second. And if you don't have much RAM, you could experience issues when increasing the number of quadpts.
Data simulation script: https://github.com/philchalmers/mirt/blob/gh-pages/data-scripts/Longitudinal-IRT.R
Analysis output: http://philchalmers.github.io/mirt/html/Longitudinal-IRT.html
In the first example, if you save the factor scores using fscores() you'll obtain estimates for each time point showing how individual $\theta$ values are changing. In the second example, using the linear growth curve approach, the first column of the factor scores will represent the initial $\theta$ estimates while the second column will indicate the slope/change occurring on average over time. In the example I set up a constant mean change of .5, so the slope values from fscores() should all be around 0.5 for each individual. Both analyses give roughly the same conclusions but are somewhat different approaches to the problem; if you are familiar with longitudinal models in SEM, both should be fairly natural to interpret.
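For instance, a small sketch of extracting the scores (mod_time and mod_growth are placeholder names for the two fitted models from the linked analysis):

```r
library(mirt)

# Approach 1: one theta estimate per latent time factor
fs_time <- fscores(mod_time)
head(fs_time)

# Approach 2: growth curve scores
fs_growth <- fscores(mod_growth)
head(fs_growth)       # column 1 = initial theta, column 2 = slope
mean(fs_growth[, 2])  # should sit near the simulated mean change of 0.5
```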
Best Answer
As I stated in the comments above, missing data can be handled by either the ltm or mirt package when the data are MCAR; a sketch of how to use both on a dataset with missing values is given below. It's also possible to impute missing values given a good estimate of $\theta$ for obtaining things like model and item fit statistics (you should do this several times if the amount of missingness is non-trivial, and it's even better to jitter the $\hat{\theta}$ values as a function of the respective $SE_{\hat{\theta}}$ values for more reasonable imputation results).
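A minimal sketch of both ideas, assuming simulated dichotomous data (the dataset, seed, and amount of missingness below are illustrative, not from the original post):

```r
library(mirt)
library(ltm)

# Simulate 2PL data, then delete responses completely at random
set.seed(1234)
a <- matrix(rlnorm(20, meanlog = 0.2, sdlog = 0.3))
d <- matrix(rnorm(20))
dat <- simdata(a, d, N = 1000, itemtype = '2PL')
dat[sample(length(dat), 2000)] <- NA

# Both packages accept NA responses directly when the data are MCAR
mod_mirt <- mirt(dat, 1, itemtype = '2PL')
mod_ltm  <- ltm(as.data.frame(dat) ~ z1)

# Impute plausible responses given theta-hat, jittered by its standard
# error, then refit to obtain complete-data fit statistics (repeat
# several times when missingness is non-trivial)
fs <- fscores(mod_mirt, full.scores.SE = TRUE)
theta_jit <- matrix(rnorm(nrow(dat), mean = fs[, 'F1'], sd = fs[, 'SE_F1']))
complete <- imputeMissing(mod_mirt, theta_jit)
mod_full <- mirt(complete, 1, itemtype = '2PL')
M2(mod_full)       # overall model fit
itemfit(mod_full)  # item-level fit statistics
```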