First Steps with Item Response Theory

Model estimation, model comparison, assumptions check in R

Dr. Ottavia M. Epifania, ottavia.epifania@unitn.it

2007-02-06

Getting started

Create a new project

New Project
Choose the type
Assign a directory

(This is not mandatory by it is strongly suggested)

New file \(\rightarrow\) New project

install.packages("lavaan")

install.packages("TAM")

install.packages("mokken")

install.packages("difR")

install.packages("tidyverse")

install.packages("ggplot2")

library("lavaan")

library("TAM")

library("mokken")

library("difR")

library("tidyverse")

library("ggplot2")

Useful code

IRT
ICC
IIF

# probability of a correct response given theta and item parameters
IRT <- function(theta, a = 1, b = 0, c = 0,e = 1) {
  y <- c + (e - c) * exp(a * (theta - b)) / (1 + exp(a * (theta - b)))
  y[is.na(y)] = 1
  return(y)
}

# function for plotting the ICCs of the items 
# returns a list with a ggplot graph (icc_graph) and a data set with the data 
# used for plotting the ICC (icc_data)
irt_icc = function(model) {
  item_par = model$item
  est_theta = seq(-4,4, length.out=1000)
  item_prob = list()
  if (any(grep("guess", colnames(item_par))) == F) {
    for (i in 1:nrow(item_par)) {
      item_prob[[i]] = data.frame(theta = est_theta)
      item_prob[[i]]$it_p = IRT(item_prob[[i]]$theta, 
                          b = item_par[i, "xsi.item"], 
                          a = item_par[i, "B.Cat1.Dim1"])
      item_prob[[i]]$item = item_par[i, "item"]

}
  } else {
     for (i in 1:nrow(item_par)) {
      item_prob[[i]] = data.frame(theta = est_theta)
      item_prob[[i]]$it_p = IRT(item_prob[[i]]$theta, 
                          b = item_par[i, "AXsi_.Cat1"], 
                          a = item_par[i, "B.Cat1.Dim1"], 
                          c = item_par[i, "guess"])
      item_prob[[i]]$item = item_par[i, "item"]

}
  }
  p = do.call("rbind", item_prob)
  gp = ggplot(p, 
       aes(x = theta, y = it_p, group = item, col =
             item)) + geom_line(lwd = 1)
  object = list(icc_data = p, 
              icc_graph = gp)

return(object)
}

irt_iif = function(model) {
  est_theta = IRT.factor.scores(model, 
                              type = "EAP")$EAP
ii = IRT.informationCurves(model, theta = est_theta)

test_info = data.frame(theta = est_theta, 
               info = ii$test_info_curve, 
               se = ii$se_curve)

iif_info = list()
for(i in 1:nrow(ii$info_curves_item)) {
    iif_info[[i]] = data.frame(theta = est_theta)
    iif_info[[i]]$ii_item = ii$info_curves_item[i, ]
    iif_info[[i]]$item = dimnames(ii$info_curves_item)[[1]][i]
}

dat_info = do.call("rbind", iif_info)

info_tot = list(test_info = test_info, 
                item_info = dat_info)

return(info_tot)
}

Import data

Download the file dataClass.csv from this shared folder.

Store the downloaded file in the data sub folder of your newly created project and run this code:

data = read.csv("data/dataClass.csv", header = T, sep = ",")

Look at the data!

The items
The respondents

Show code

vis_data = data
vis_data$id = paste0("s", 1:nrow(data))

vis_data = vis_data[, c(ncol(vis_data), 1:ncol(data))]
vis_data = pivot_longer(vis_data, names_to = "item", cols = -1)

prop_data = vis_data %>% 
  group_by(item) %>%  
  summarise(proportion = mean(value))

ggplot(prop_data, 
       aes(x = reorder(item, proportion), y=proportion, 
           fill = item)) + geom_bar(stat = "identity") + 
  theme(axis.text = element_text(angle = 90)) + geom_hline(aes(yintercept=.50), linetype = 2) + ylab("Proportion correct") + xlab("") + 
  theme_light() + ylim(0, 1) + 
  theme( 
    legend.position = "none", 
    axis.title = element_text(size = 26), 
    axis.text = element_text(size = 22))

Show code

sbj_data = data
sbj_data$id = paste0("s", 1:nrow(data))
sbj_data = sbj_data[, c(ncol(sbj_data), 1:ncol(data))]

sbj_data$sum = rowSums(sbj_data[,-1])

boxplot(sbj_data$sum)

Assumptions

IRT assumptions

There are three main assumptions:

Monotonicity
Unidimensionality

Local independence

If these assumptions are violated, the models can still be fitted but their interpretation is meaningless

Monotonicity

The probability of responding correctly increases as the the level of the latent trait increases

Monotonicitiy is evaluated at the item level with the \(H\) coefficient by Mokken (1971):

\(H \leq .30\) \(\rightarrow\) The item is not monotone, should be discarded

\(.30 < H \leq .50\) \(\rightarrow\) The item is essentially monotone, this is fine

\(H > .50\) \(\rightarrow\) The item is monotone, this is optimal

If violated

Issues on the reliability and validity of the measure

Monotonicity in `R`

Monotonicity is checked with the mokken package in R:

    ItemH #ac #vi #vi/#ac maxvi  sum sum/#ac zmax #zsig crit
I01  0.45  10   0     0.0  0.00 0.00  0.0000 0.00     0    0
I02  0.62  10   0     0.0  0.00 0.00  0.0000 0.00     0    0
I03  0.52  10   1     0.1  0.05 0.05  0.0049 1.56     0   20
I04  0.55  10   0     0.0  0.00 0.00  0.0000 0.00     0    0
I05  0.56  10   1     0.1  0.03 0.03  0.0031 0.94     0   10

Show code

library(mokken)
mono_check = check.monotonicity(data)
summary(mono_check)

Unidimensionality

Unidimensionality \(\rightarrow\) There’s only one latent trait that is responsible for the observed responses

How

Confirmatory Factor Analysis models (CFA)

If violated

The parameter estimates might be distorted and/or could not be interpreted

Even if the estimates could be interpreted, they would have no meaning because the model has no meaning

There are IRT models that are designed for dealing with multidimensional latent traits

Unidimensionality - CFA

CFA models that force a one-factor solution to the data:

If the one-factor model has a good fit to the data \(\rightarrow\) it is reasonable to assume the unidimensionality
If the one-factor model does not have a good fit to the data \(\rightarrow\) unidimensionality is not supported, it is best to first understand the latent structure of the data

CFA fit indexes:

Comparative Fit Index (CFI) \(\geq .90\)

Standardized Root Mean Square Residual (SRMSR) \(\leq.08\)

Root Mean Square Error of Approximation (RMSEA) \(\leq. 08\)

Unidimensionality - Practice

The CFA model is fitted with the lavaan package:

Define the model
Fit the model

[1] "latent =~ I01 + I02 + I03 + I04 + I05"

Show code

item_lab = paste(colnames(data), collapse = " + ")
form = paste("latent =~", item_lab)

lavaan 0.6-19 ended normally after 9 iterations

  Estimator                                       DWLS
  Optimization method                           NLMINB
  Number of model parameters                        10

  Number of observations                          1000

Model Test User Model:
                                              Standard      Scaled
  Test Statistic                                 3.891       7.034
  Degrees of freedom                                 5           5
  P-value (Chi-square)                           0.565       0.218
  Scaling correction factor                                  0.561
  Shift parameter                                            0.098
    simple second-order correction                                

Model Test Baseline Model:

  Test statistic                              2264.236    1913.779
  Degrees of freedom                                10          10
  P-value                                        0.000       0.000
  Scaling correction factor                                  1.184

User Model versus Baseline Model:

  Comparative Fit Index (CFI)                    1.000       0.999
  Tucker-Lewis Index (TLI)                       1.001       0.998
                                                                  
  Robust Comparative Fit Index (CFI)                         0.988
  Robust Tucker-Lewis Index (TLI)                            0.977

Root Mean Square Error of Approximation:

  RMSEA                                          0.000       0.020
  90 Percent confidence interval - lower         0.000       0.000
  90 Percent confidence interval - upper         0.039       0.052
  P-value H_0: RMSEA <= 0.050                    0.990       0.939
  P-value H_0: RMSEA >= 0.080                    0.000       0.000
                                                                  
  Robust RMSEA                                               0.075
  90 Percent confidence interval - lower                     0.000
  90 Percent confidence interval - upper                     0.145
  P-value H_0: Robust RMSEA <= 0.050                         0.224
  P-value H_0: Robust RMSEA >= 0.080                         0.522

Standardized Root Mean Square Residual:

  SRMR                                           0.021       0.021

Parameter Estimates:

  Parameterization                               Delta
  Standard errors                           Robust.sem
  Information                                 Expected
  Information saturated (h1) model        Unstructured

Latent Variables:
                   Estimate  Std.Err  z-value  P(>|z|)
  latent =~                                           
    I01               0.607    0.036   16.961    0.000
    I02               0.952    0.026   37.134    0.000
    I03               0.613    0.039   15.629    0.000
    I04               0.730    0.033   22.298    0.000
    I05               0.806    0.030   26.863    0.000

Thresholds:
                   Estimate  Std.Err  z-value  P(>|z|)
    I01|t1           -0.181    0.040   -4.549    0.000
    I02|t1            0.143    0.040    3.602    0.000
    I03|t1           -0.610    0.042  -14.364    0.000
    I04|t1            0.634    0.043   14.854    0.000
    I05|t1            0.487    0.041   11.768    0.000

Variances:
                   Estimate  Std.Err  z-value  P(>|z|)
   .I01               0.631                           
   .I02               0.094                           
   .I03               0.624                           
   .I04               0.467                           
   .I05               0.350                           
    latent            1.000

Show code

model = cfa(form, data = data,  ordered = colnames(data), std.lv = T)
summary(model, fit.measures = T)

Local independence

The probability of observing a pattern of responses to a set of items is equal to the product of the probabilities associated with each response to each item:

\[P(x_{p1}, x_{p2}, \ldots, x_{pi}) = P(x_{p1})P(x_{p2}) \ldots P(x_{pi})\]

The correlation between the observed responses “disappears” once the effect of the latent variable has been taken into account

If violated

There is something else other than the latent variable, issues for the validity and reliability of the measure

Local indepedence - How

It is based on the correlations between the residuals of the IRT model \(\rightarrow\) the distance between the observed data and the data expected according to the model

Under the assumption of local independence, the residuals are expected not to be correlated

The usual statistics for interpreting the correlation between the residuals is the \(Q3\) statistics (Yen, 1984)

This statistics is computed between the residuls of every pair of items

\(Q3 > .20\) suggests an high correlation between the residuals of two items, indicating local dependence \(\rightarrow\) one of the item should be deleted

Model estimation

Model estimation in R

The TAM package allows for the estimation of IRT models in R

Multiple-choice items, coded as 0 (incorrect response) and 1 (correct response):

m1pl = tam.mml(data, verbose = F)

m2pl = tam.mml.2pl(data, irtmodel = "2PL", verbose = F)

m3pl = tam.mml.3pl(data, est.guess = colnames(data), verbose = F)

Model comparison

IRT.compareModels(m1pl, m2pl, m3pl)

$IC
  Model   loglike Deviance Npars Nobs      AIC      BIC     AIC3     AICc
1  m1pl -2762.350 5524.699     6 1000 5536.699 5566.146 5542.699 5536.784
2  m2pl -2721.966 5443.933    10 1000 5463.933 5513.010 5473.933 5464.155
3  m3pl -2720.248 5440.496    16 1000 5472.496 5551.020 5488.496 5473.050
      CAIC       GHP
1 5572.146 0.5536699
2 5523.010 0.5463933
3 5567.020 0.5472496

$LRtest
  Model1 Model2      Chi2 df            p
1   m1pl   m2pl 80.766596  4 1.110223e-16
2   m1pl   m3pl 84.203069 10 7.494005e-14
3   m2pl   m3pl  3.436473  6 7.524016e-01

attr(,"class")
[1] "IRT.compareModels"

2PL - Model fit

f.m2pl = tam.modelfit(m2pl, progress = F)
f.m2pl$statlist
f.m2pl$modelfit.test

statlist
modelfit.test

  X100.MADCOV       SRMR      SRMSR    MADaQ3 pmaxX2
1   0.3492573 0.01591987 0.01713347 0.0714379      1

fit_m2pl$modelfit.test

      maxX2 Npairs p.holm
1 0.9536516     10      1

Local dependence

Summary
LD – Items

fit_m2pl$Q3_summary

  type             M        SD        min         max     SGDDM    wSGDDM
1   Q3 -1.624306e-01 0.0916599 -0.3497230 -0.05557536 0.1624306 0.1624306
2  aQ3  1.030287e-17 0.0916599 -0.1872924  0.10685520 0.0714379 0.0714379

local_dep = function(model, cut = NULL) {
  fit_model = tam.modelfit(model, progress = FALSE)
  temp_local = fit_model$Q3.matr
  index = which( upper.tri(temp_local,diag=F) , arr.ind = TRUE )
  local = data.frame( col = dimnames(temp_local)[[2]][index[,2]] ,
                      row = dimnames(temp_local)[[1]][index[,1]] ,
                      val = temp_local[ index ] )

  if (is.null(cut) == TRUE) {
      summary_local = local
  } else {
     summary_local = local[local$val > cut, ]
  }
  return(summary_local)
}

local_dep(m2pl)

   col row         val
1  I02 I01 -0.18357094
2  I03 I01 -0.11032327
3  I03 I02 -0.16445409
4  I04 I01 -0.07012394
5  I04 I02 -0.30916375
6  I04 I03 -0.05557536
7  I05 I01 -0.14497861
8  I05 I02 -0.34972296
9  I05 I03 -0.12466460
10 I05 I04 -0.11172807

2PL - Item parameters

Model summary
Item Characteristics Curves (ICC)

summary(m2pl)

------------------------------------------------------------
TAM 4.2-21 (2024-02-19 18:52:08) 
R version 4.4.2 (2024-10-31 ucrt) x86_64, mingw32 | nodename=EPIFANIA | login=Ottavia 

Date of Analysis: 2025-02-05 18:00:28.496572 
Time difference of 0.09988594 secs
Computation time: 0.09988594 

Multidimensional Item Response Model in TAM 

IRT Model: 2PL
Call:
tam.mml.2pl(resp = data, irtmodel = "2PL", verbose = F)

------------------------------------------------------------
Number of iterations = 72 
Numeric integration with 21 integration points

Deviance = 5443.93 
Log likelihood = -2721.97 
Number of persons = 1000 
Number of persons used = 1000 
Number of items = 5 
Number of estimated parameters = 10 
    Item threshold parameters = 5 
    Item slope parameters = 5 
    Regression parameters = 0 
    Variance/covariance parameters = 0 

AIC = 5464  | penalty=20    | AIC=-2*LL + 2*p 
AIC3 = 5474  | penalty=30    | AIC3=-2*LL + 3*p 
BIC = 5513  | penalty=69.08    | BIC=-2*LL + log(n)*p 
aBIC = 5481  | penalty=37.28    | aBIC=-2*LL + log((n-2)/24)*p  (adjusted BIC) 
CAIC = 5523  | penalty=79.08    | CAIC=-2*LL + [log(n)+1]*p  (consistent AIC) 
AICc = 5464  | penalty=20.22    | AICc=-2*LL + 2*p + 2*p*(p+1)/(n-p-1)  (bias corrected AIC) 
GHP = 0.54639     | GHP=( -LL + p ) / (#Persons * #Items)  (Gilula-Haberman log penalty) 

------------------------------------------------------------
EAP Reliability
[1] 0.72
------------------------------------------------------------
Covariances and Variances
     [,1]
[1,]    1
------------------------------------------------------------
Correlations and Standard Deviations (in the diagonal)
     [,1]
[1,]    1
------------------------------------------------------------
Regression Coefficients
     [,1]
[1,]    0
------------------------------------------------------------
Item Parameters -A*Xsi
  item    N     M xsi.item AXsi_.Cat1 B.Cat1.Dim1
1  I01 1000 0.572   -0.381     -0.381       1.268
2  I02 1000 0.443    0.722      0.722       4.859
3  I03 1000 0.729   -1.323     -1.323       1.358
4  I04 1000 0.263    1.583      1.583       1.799
5  I05 1000 0.313    1.432      1.432       2.353

Item Parameters in IRT parameterization
  item alpha   beta
1  I01 1.268 -0.300
2  I02 4.859  0.149
3  I03 1.358 -0.974
4  I04 1.799  0.880
5  I05 2.353  0.609

Show code

icc_2pl = irt_icc(m2pl)$icc_data 
item = m2pl$item_irt
icc_2pl = merge(icc_2pl, item)
icc_2pl$item_label = paste0(icc_2pl$item, ", b =", 
                            round(icc_2pl$beta, 2), ", a = ", round(icc_2pl$alpha, 2))

ggplot(icc_2pl, 
       aes(x = theta, y = it_p, group = item_label, color = item_label)) +
  geom_line(linewidth = 1.2) + theme_classic() + 
    ylab(expression(paste("P(", x[ip], " = 1|", theta, ", ", b[i], ", ", a[i], ")"))) + xlab(expression(theta)) +
  theme(legend.position = c(.15, .80), 
        legend.title = element_blank(), 
        legend.text = element_text(size = 18), 
        axis.title = element_text(size = 26)) + 
  geom_hline(yintercept = .50, linetype = 2)

Information Functions

Show code

label_item = icc_2pl[, c("item", "item_label")]
label_item = label_item %>% distinct()

info2pl = irt_iif(m2pl)
item_info = info2pl$item_info
item_info = merge(item_info, label_item)

ggplot(item_info, 
       aes(x = theta, y = ii_item, group = item_label, color = item_label)) + geom_line(lwd = 1) + theme_light()+ xlab(expression(theta)) + ylab("Info")+
  theme(legend.position = c(.15, .80), 
        legend.title = element_blank(), 
        legend.text = element_text(size = 18), 
        axis.title = element_text(size = 26)) + 
  geom_hline(yintercept = .50, linetype = 2)

Show code

ggplot(info2pl$test_info, 
       aes(x = theta, y = info)) + geom_line(lwd = 2) + theme_light() + 
  xlab(expression(theta)) + ylab("Info") + 
  theme(axis.text = element_text(size = 18), 
        axis.title = element_text(size = 26))

Show code

ggplot(info2pl$test_info, 
       aes(x = theta, y = se, col = "SE")) + geom_line(lwd = 2) + + theme_light() + 
  xlab(expression(theta)) + ylab("Info") + 
  theme(legend.position = "none", 
        axis.text = element_text(size = 18), 
        axis.title = element_text(size = 26))

Your turn!

Download the data set “dataES_stima.csv” from the shared folder.

Store the downloaded file in the data sub folder:

Import the dataset
Look at the data! (Both items and respondents)
Check the monotonicity and unidimensionality assumptions
Fit of the IRT models & choose the best model
Check the local independence assumption
Obtain the ICCs, IIFs, and TIF of the test

First Steps with Item Response Theory

Getting started

Create a new project

Packages

Useful code

Import data

Look at the data!

Assumptions

IRT assumptions

Monotonicity

Monotonicity in R

Unidimensionality

Unidimensionality - CFA

Unidimensionality - Practice

Local independence

Local indepedence - How

Model estimation

Model estimation in R

Model comparison

2PL - Model fit

Local dependence

2PL - Item parameters

Information Functions

Information Functions

Your turn!

Monotonicity in `R`