Will the suffering ever end? An item response theory algorithm for shortening tests

Ottavia M. Epifania\(^{1,2}\), Livio Finos\(^{2,3}\), Luigi Lombardi\(^{1}\)

\(^1\) University of Trento, Rovereto (IT), \(^2\) Psicostat, Padova (IT), \(^3\) University of Padova, Padova (IT)

Short Test Forms

Why?

Many items \(\rightarrow\) good measurement precision, great reliability and so on

Not always!

People might get tired & frustrated

\[Q \subset B\]

Item Response Theory models for the win

Because they focus on item information and on how well each item measures different levels of the latent trait, IRT models provide an ideal framework for developing short test forms (STFs) (and not torturing people)

Automated test assembly and maxmin algorithms

AIM

Size matters: How well can we estimate the latent trait with fewer and fewer items?

The 4-Parameter Logistic Model (4-PL)

4-PL - Item Response Function

\[P(x_{pi}= 1| \theta_p, b_i, a_i, c_i, d_i) = c_i + (d_i -c_i) \dfrac{\exp[a_i(\theta_p - b_i)]}{1 + \exp[a_i(\theta_p - b_i)]}\]
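As a minimal sketch, the item response function above can be written in a few lines of Python. The function name `irf_4pl` and the example parameter values are illustrative assumptions, not part of the slides.

```python
import math

def irf_4pl(theta, b, a=1.0, c=0.0, d=1.0):
    """Probability of a correct response under the 4-PL model.

    theta: latent trait level, b: difficulty, a: discrimination,
    c: lower asymptote, d: upper asymptote.
    """
    z = a * (theta - b)
    logistic = 1.0 / (1.0 + math.exp(-z))  # same as exp(z) / (1 + exp(z))
    return c + (d - c) * logistic

# When theta equals b the logistic term is 0.5,
# so the probability sits halfway between c and d.
p_mid = irf_4pl(theta=0.0, b=0.0, a=1.5, c=0.2, d=0.9)
print(p_mid)
```

The logistic is written as `1 / (1 + exp(-z))`, which is algebraically identical to the form in the equation.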

4-PL - Item Information Function

\[ \text{IIF}_{i}(\theta) = \dfrac{a_i^2[P(\theta)-c_i]^2[d_i - P(\theta)]^2}{(d_{i}-c_i)^2 P(\theta)Q(\theta)}\]
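The information function can be sketched the same way. This assumes the usual convention \(Q(\theta) = 1 - P(\theta)\) (not to be confused with the item subset \(Q\)); the function names are illustrative.

```python
import math

def irf_4pl(theta, b, a=1.0, c=0.0, d=1.0):
    """4-PL probability of a correct response."""
    return c + (d - c) / (1.0 + math.exp(-a * (theta - b)))

def iif_4pl(theta, b, a=1.0, c=0.0, d=1.0):
    """4-PL item information at a given theta."""
    p = irf_4pl(theta, b, a, c, d)
    q = 1.0 - p  # assumed convention: Q(theta) = 1 - P(theta)
    numerator = a ** 2 * (p - c) ** 2 * (d - p) ** 2
    denominator = (d - c) ** 2 * p * q
    return numerator / denominator

# Lowering the upper asymptote d lowers the information at theta = b
info_full = iif_4pl(theta=0.0, b=0.0, a=1.0, d=1.0)
info_tired = iif_4pl(theta=0.0, b=0.0, a=1.0, d=0.8)
print(info_full, info_tired)
```

With \(c_i = 0\) and \(d_i = 1\) the expression reduces to \(a_i^2 P(\theta) Q(\theta)\), which is a quick sanity check for any implementation.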

4-PL - Test Information Function

\[\text{TIF} = \sum_{i = 1}^{||B||} \text{IIF}_i\] (\(B\): set of items in a test; \(||X||\): cardinality of set \(X\))
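A minimal sketch of the sum over items, evaluated on a grid of \(\theta\) values. The three-item test and the grid are made-up illustrations, and \(Q(\theta) = 1 - P(\theta)\) is assumed.

```python
import math

def iif_4pl(theta, b, a=1.0, c=0.0, d=1.0):
    """4-PL item information, assuming Q(theta) = 1 - P(theta)."""
    p = c + (d - c) / (1.0 + math.exp(-a * (theta - b)))
    return (a ** 2 * (p - c) ** 2 * (d - p) ** 2) / ((d - c) ** 2 * p * (1.0 - p))

def tif(theta, items):
    """Test information at theta: sum of the IIFs of all items in the test."""
    return sum(iif_4pl(theta, **item) for item in items)

# Hypothetical three-item test (parameter values are illustrative only)
test_items = [
    {"b": -1.0, "a": 1.2},
    {"b": 0.0, "a": 1.0},
    {"b": 1.5, "a": 1.8},
]
grid = [-3.0 + 0.5 * k for k in range(13)]  # theta from -3 to 3
tif_curve = [tif(theta, test_items) for theta in grid]
```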

Not a property of the item!

\(d\) depends on the rank \(r\) at which the item is presented during the administration, \(d_r\)

Competing algorithms

[Figure: item selection against the target \(\text{TIF}^*\), shown at iterations \(k = 0\) and \(k = 1\)]

Frank

At \(k = 0\): \(\text{TIF}^0(\theta) = 0 \, \forall \theta\), \(Q^0 = \emptyset\).

For \(k \geq 0\),

  1. \(A^k = B \setminus Q^k\)
  2. \(\forall i \in A^k\), \(\text{pTIF}_{i}^k = \frac{\text{TIF}^k + \text{IIF}_{i}}{||Q^k||+1}\), with \(d_i = 1\), \(\forall i\)
  3. \(i^* = \arg \min_{i \in A^k} (|\text{TIF}^* - \text{pTIF}_i^k|)\)
  4. Termination criterion: \(|\text{TIF}^* - \text{pTIF}_{i^*}^k| \geq |\text{TIF}_B - \text{TIF}^{k}|\):

    • FALSE: \(Q^{k+1} = Q^{k} \cup \{i^*\}\), \(\text{TIF}^{k+1} = \text{pTIF}_{i^*}^k\), repeat steps 1-4

    • TRUE: Stop, \(Q_{\text{Frank}} = Q^k\)
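The steps above can be sketched in Python under a few loudly stated assumptions, none of which is confirmed by the slides: the distance between two information curves is taken as the mean absolute difference over a grid of \(\theta\) values; \(\text{pTIF}\) is read as the mean information of the candidate form, tracked through a running sum of the selected IIFs; and \(\text{TIF}_B\) in the termination criterion is treated as equal to the target \(\text{TIF}^*\). The names `frank`, `distance`, and `iif` are illustrative.

```python
import math

def iif(theta, b, a, c=0.0, d=1.0):
    """4-PL item information, assuming Q(theta) = 1 - P(theta)."""
    p = c + (d - c) / (1.0 + math.exp(-a * (theta - b)))
    return (a ** 2 * (p - c) ** 2 * (d - p) ** 2) / ((d - c) ** 2 * p * (1.0 - p))

def distance(f, g):
    """Assumed distance: mean absolute difference over the theta grid."""
    return sum(abs(x - y) for x, y in zip(f, g)) / len(f)

def frank(bank, target, thetas):
    """Greedy selection ignoring response fatigue (d_i = 1 for every item).

    bank: list of (b, a) tuples; target: TIF* evaluated on thetas.
    Returns the indices of the selected items.
    """
    info = [[iif(t, b, a) for t in thetas] for (b, a) in bank]
    selected = []                      # Q^0 is empty
    total = [0.0] * len(thetas)        # running sum of selected IIFs
    current = [0.0] * len(thetas)      # TIF^0 = 0 for all theta
    while len(selected) < len(bank):
        available = [i for i in range(len(bank)) if i not in selected]
        n = len(selected) + 1
        # Provisional TIF for each candidate item (step 2)
        ptif = {i: [(s + x) / n for s, x in zip(total, info[i])]
                for i in available}
        # Candidate closest to the target (step 3)
        best = min(available, key=lambda i: distance(target, ptif[i]))
        # Stop when the best candidate does not improve on the current form
        if distance(target, ptif[best]) >= distance(target, current):
            break
        selected.append(best)
        total = [s + x for s, x in zip(total, info[best])]
        current = ptif[best]
    return selected
```

With a bank of identical items and a target equal to one item's IIF, the sketch selects a single item and stops, since a second item cannot bring the mean any closer to the target.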

Léon

At \(k = 0\): \(\text{TIF}^0(\theta) = 0 \, \forall \theta\), \(Q^0 = \emptyset\).

For \(k \geq 0\),

  1. \(A^k = B \setminus Q^k\)
  2. \(\forall i \in A^k\), \(\text{pTIF}_{i}^k = \frac{\text{TIF}^k + \text{IIF}_{i}}{||Q^k||+1}\), with \(r = \{0, 1, \ldots, ||Q^k||-1\}\)
  3. \(i^* = \arg \min_{i \in A^k} (|\text{TIF}^* - \text{pTIF}_i^k|)\)
  4. Termination criterion: \(|\text{TIF}^* - \text{pTIF}_{i^*}^k| \geq |\text{TIF}_B - \text{TIF}^{k}|\):

    • FALSE: \(Q^{k+1} = Q^{k} \cup \{i^*\}\), \(\text{TIF}^{k+1} = \text{pTIF}_{i^*}^k\), repeat steps 1-4

    • TRUE: Stop, \(Q_{\text{Léon}} = Q^k\)
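A rank-aware variant of the same greedy loop can be sketched as follows. The assumptions are again illustrative rather than taken from the slides: each candidate is evaluated at the next presentation rank \(r = ||Q^k||\); already selected items keep the rank at which they were chosen; the default fatigue function is a hypothetical exponential decay in \(r\); the distance is the mean absolute difference over a \(\theta\) grid; and \(\text{TIF}_B\) is treated as equal to \(\text{TIF}^*\).

```python
import math

def iif(theta, b, a, c=0.0, d=1.0):
    """4-PL item information, assuming Q(theta) = 1 - P(theta)."""
    p = c + (d - c) / (1.0 + math.exp(-a * (theta - b)))
    return (a ** 2 * (p - c) ** 2 * (d - p) ** 2) / ((d - c) ** 2 * p * (1.0 - p))

def distance(f, g):
    """Assumed distance: mean absolute difference over the theta grid."""
    return sum(abs(x - y) for x, y in zip(f, g)) / len(f)

def fatigue(r, rate=0.01):
    """Hypothetical upper asymptote d_r that decays with presentation rank r."""
    return math.exp(-rate * r)

def leon(bank, target, thetas, d_of_rank=fatigue):
    """Greedy selection that accounts for response fatigue through d_r.

    bank: list of (b, a) tuples; target: TIF* evaluated on thetas.
    Returns item indices in presentation order (list position = rank).
    """
    selected = []
    total = [0.0] * len(thetas)        # running sum of selected IIFs
    current = [0.0] * len(thetas)      # TIF^0 = 0 for all theta
    while len(selected) < len(bank):
        rank = len(selected)           # assumed: candidate takes the next rank
        d = d_of_rank(rank)
        available = [i for i in range(len(bank)) if i not in selected]
        # IIF of each candidate at this rank, then its provisional TIF
        cand = {i: [iif(t, bank[i][0], bank[i][1], d=d) for t in thetas]
                for i in available}
        ptif = {i: [(s + x) / (rank + 1) for s, x in zip(total, cand[i])]
                for i in available}
        best = min(available, key=lambda i: distance(target, ptif[i]))
        if distance(target, ptif[best]) >= distance(target, current):
            break
        selected.append(best)
        total = [s + x for s, x in zip(total, cand[best])]
        current = ptif[best]
    return selected
```

The returned list is ordered: the position of an item in the list is the rank that was used when its information was evaluated.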

Simulation Study

Design

1000 respondents with \(\theta \sim \mathcal{U}(-3,3)\)

Item bank \(B\) of 70 items:

  • \(b \sim \mathcal{U}(-3, 3)\)

  • \(a \sim \mathcal{U}(.90, 2.0)\)

  • \(c_i = 0\), \(\forall i \in B\)

  • \(d_r = \exp(-0.01 r)\), with \(r = \{0, \ldots, ||B|| -1\}\)

\(\text{TIF}^* = \sum_{i = 1}^{||B||} \frac{\text{IIF}_i}{||B||}\), with \(d_i = 1\), \(\forall i \in B\) 🥇
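One replication of the design above could be set up roughly as follows. The seed, the \(\theta\) grid on which the target is evaluated, and all variable names are assumptions made for illustration; only the distributions, the bank size, the number of respondents, and the form of \(d_r\) come from the slides.

```python
import math
import random

def iif(theta, b, a, c=0.0, d=1.0):
    """4-PL item information, assuming Q(theta) = 1 - P(theta)."""
    p = c + (d - c) / (1.0 + math.exp(-a * (theta - b)))
    return (a ** 2 * (p - c) ** 2 * (d - p) ** 2) / ((d - c) ** 2 * p * (1.0 - p))

rng = random.Random(2024)  # arbitrary seed, for reproducibility only

# 1000 respondents with theta ~ U(-3, 3)
respondents = [rng.uniform(-3, 3) for _ in range(1000)]

# Item bank B of 70 items: b ~ U(-3, 3), a ~ U(.90, 2.0), c_i = 0
n_items = 70
bank = [(rng.uniform(-3, 3), rng.uniform(0.90, 2.0)) for _ in range(n_items)]

# Upper asymptote by presentation rank: d_r = exp(-0.01 r), r = 0, ..., ||B|| - 1
d_rank = [math.exp(-0.01 * r) for r in range(n_items)]

# Target TIF*: mean IIF over the whole bank with d_i = 1,
# evaluated on an assumed grid of theta values from -3 to 3
thetas = [-3.0 + 0.1 * k for k in range(61)]
target = [sum(iif(t, b, a) for (b, a) in bank) / n_items for t in thetas]
```

By the last rank the upper asymptote has dropped to \(\exp(-0.69) \approx 0.50\), so an item presented last can be answered correctly at most about half of the time under this design.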

Considering \(\text{TIF}^*\), \(B\), and \(d_r\), Léon and Frank are applied to find the best \(Q \subset B\) in 100 replications

Plot twist! There is a minimum number of items: 10%, 25%, 50% of \(||B||\)

Results

Final remarks


Tip

Accounting for response fatigue during item selection helps find the item subset that minimizes the distance from the target


Warning

The order of the items selected by Léon cannot be randomized


ottavia.epifania@unitn.it