Will the suffering ever end? An item response theory algorithm for shortening tests

Ottavia M. Epifania\(^{1,2}\), Livio Finos\(^{2,3}\), Luigi Lombardi\(^{1}\)

\(^1\) University of Trento, Rovereto (IT), \(^2\) Psicostat, Padova (IT), \(^3\) University of Padova, Padova (IT)

Short Test Forms

Why?

Many items \(\rightarrow\) good measurement precision, great reliability and so on

Not always!

People might get tired & frustrated

\[Q \subset B\]

Item Response Theory models for the win

Because they focus on item information and on each item's ability to measure different levels of the latent trait, IRT models provide an ideal framework for developing short test forms (STFs) without torturing people

Automated test assembly and maxmin algorithms

AIM

How can we choose the “best” items from \(B\) such that \(Q\) can satisfy the measurement precision we want?

The 4-Parameter Logistic Model (4-PL)

4-PL - Item Response Function

\[P(x_{pi}= 1| \theta_p, b_i, a_i, c_i, d_i) = c_i + (d_i -c_i) \dfrac{\exp[a_i(\theta_p - b_i)]}{1 + \exp[a_i(\theta_p - b_i)]}\]
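The item response function above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' code; the function name `irf_4pl` and the default parameter values are assumptions made for the example.

```python
import math

def irf_4pl(theta, b, a=1.0, c=0.0, d=1.0):
    """Probability of a correct response under the 4-PL model.

    theta: latent trait level, b: difficulty, a: discrimination,
    c: lower asymptote, d: upper asymptote.
    """
    # exp(x) / (1 + exp(x)) is rewritten as 1 / (1 + exp(-x))
    logistic = 1.0 / (1.0 + math.exp(-a * (theta - b)))
    return c + (d - c) * logistic

# When theta equals b the logistic term is 0.5, so P = c + (d - c) / 2
p = irf_4pl(theta=0.0, b=0.0, a=1.5, c=0.2, d=0.9)
print(round(p, 4))
```

Setting `c=0` and `d=1` reduces the function to the 2-PL model, which is a quick sanity check on the parameterization.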

4-PL - Information Functions

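\[ \text{IIF}_{i}(\theta) = \dfrac{a_i^2[P(\theta)-c_i]^2[d_i - P(\theta)]^2}{(d_{i}-c_i)^2 P(\theta)Q(\theta)}\]

where \(P(\theta)\) is the item response function of item \(i\) and \(Q(\theta) = 1 - P(\theta)\) (not to be confused with the item subset \(Q\)).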

\[\text{TIF} = \sum_{i = 1}^{||B||} \text{IIF}_i\]

where \(B\) is the set of items in a test and \(||X||\) denotes the cardinality of a set \(X\).
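As a hedged illustration of the two information functions, the sketch below evaluates the IIF of single items and sums them into a TIF at one value of \(\theta\). The function names and the dictionary layout of an item are assumptions made for the example.

```python
import math

def irf_4pl(theta, b, a, c=0.0, d=1.0):
    """4-PL probability of a correct response."""
    return c + (d - c) / (1.0 + math.exp(-a * (theta - b)))

def iif_4pl(theta, b, a, c=0.0, d=1.0):
    """Item information function of a 4-PL item at a given theta."""
    p = irf_4pl(theta, b, a, c, d)
    numerator = a**2 * (p - c)**2 * (d - p)**2
    denominator = (d - c)**2 * p * (1.0 - p)
    return numerator / denominator

def tif(theta, items):
    """Test information function: sum of the IIFs of all items."""
    return sum(iif_4pl(theta, **item) for item in items)

# Two illustrative items (parameter values are made up)
items = [
    {"b": -1.0, "a": 1.2, "c": 0.0, "d": 1.0},
    {"b": 0.0, "a": 2.0, "c": 0.0, "d": 1.0},
]
print(tif(0.0, items))
```

With \(c_i = 0\) and \(d_i = 1\) the IIF reduces to \(a_i^2 P(\theta)[1 - P(\theta)]\), the familiar 2-PL information.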

Not a property of the item!

\(d\) depends on the rank \(r\) at which the item is presented during the administration, \(d_r\).

Competing algorithms

The logic
The algorithms

[Figure: animation frames labelled \(\text{TIF}^*\), \(k = 0\), and \(k = 1\)]

Frank

At \(k = 0\): \(\text{TIF}^0(\theta) = 0 \, \forall \theta\), \(Q^0 = \emptyset\).

For \(k \geq 0\):

1. \(A^k = B \setminus Q^k\)
2. \(\forall i \in A^k\), \(\text{pTIF}_{i}^k = \frac{\text{TIF}^k + \text{IIF}_{i}}{||Q^k||+1}\), with \(d_i = 1\), \(\forall i\)
3. \(i^* = \arg \min_{i \in A^k} (|\text{TIF}^* - \text{pTIF}_i^k|)\)
4. Termination criterion: \(|\text{TIF}^* - \text{pTIF}_{i^*}^k| \geq |\text{TIF}_B - \text{TIF}^{k}|\):
   - FALSE: \(Q^{k+1} = Q^{k} \cup \{i^*\}\), \(\text{TIF}^{k+1} = \text{pTIF}_{i^*}^k\), repeat steps 1-4
   - TRUE: stop, \(Q_{\text{Frank}} = Q^k\)
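The steps listed for Frank can be sketched as a greedy forward selection. This is one possible reading, not the authors' implementation, and it rests on several assumptions: curves are compared on a discrete \(\theta\) grid, the distance \(|\cdot|\) is taken as the mean absolute difference over that grid, the sketch keeps a running sum of the selected IIFs so that pTIF is their average over \(||Q^k|| + 1\) items, and the reference curve in the stopping rule is taken to be the target \(\text{TIF}^*\) (the slide writes \(\text{TIF}_B\)). All function names are hypothetical.

```python
import math

def iif(theta, b, a, c=0.0, d=1.0):
    """4-PL item information at theta."""
    p = c + (d - c) / (1.0 + math.exp(-a * (theta - b)))
    return a**2 * (p - c)**2 * (d - p)**2 / ((d - c)**2 * p * (1.0 - p))

def distance(f, g):
    """Mean absolute difference between two curves (assumed metric)."""
    return sum(abs(x - y) for x, y in zip(f, g)) / len(f)

def frank(bank, target, grid):
    """bank: list of (b, a) pairs; target: TIF* evaluated on grid.

    Returns the indices of the selected items in selection order.
    """
    selected = []
    info_sum = [0.0] * len(grid)  # summed IIFs of the selected items
    tif_k = [0.0] * len(grid)     # TIF^0 = 0 for every theta
    while len(selected) < len(bank):
        best = None
        for i, (b, a) in enumerate(bank):
            if i in selected:
                continue  # only items in A^k = B \ Q^k are candidates
            # Frank ignores response fatigue: d_i = 1 for every item
            cand_sum = [s + iif(t, b, a, d=1.0) for s, t in zip(info_sum, grid)]
            ptif = [s / (len(selected) + 1) for s in cand_sum]
            dist = distance(target, ptif)
            if best is None or dist < best[0]:
                best = (dist, i, cand_sum, ptif)
        dist, i_star, cand_sum, ptif = best
        # Assumed stopping rule: stop when the best candidate does not
        # bring the curve closer to the target than the current TIF^k
        if dist >= distance(target, tif_k):
            break
        selected.append(i_star)
        info_sum, tif_k = cand_sum, ptif
    return selected

# Illustrative bank and grid (values are made up)
bank = [(-1.0, 1.2), (0.0, 1.5), (1.0, 1.0), (0.5, 1.8)]
grid = [-3 + 0.5 * j for j in range(13)]
target = [sum(iif(t, b, a) for b, a in bank) / len(bank) for t in grid]
print(frank(bank, target, grid))
```

The target in the usage example is the mean IIF of the whole bank with \(d_i = 1\), which mirrors how \(\text{TIF}^*\) is defined in the simulation design.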

Léon

At \(k = 0\): \(\text{TIF}^0(\theta) = 0 \, \forall \theta\), \(Q^0 = \emptyset\).

For \(k \geq 0\):

1. \(A^k = B \setminus Q^k\)
2. \(\forall i \in A^k\), \(\text{pTIF}_{i}^k = \frac{\text{TIF}^k + \text{IIF}_{i}}{||Q^k||+1}\), with \(r = \{0, 1, \ldots, ||Q^k||-1\}\)
3. \(i^* = \arg \min_{i \in A^k} (|\text{TIF}^* - \text{pTIF}_i^k|)\)
4. Termination criterion: \(|\text{TIF}^* - \text{pTIF}_{i^*}^k| \geq |\text{TIF}_B - \text{TIF}^{k}|\):
   - FALSE: \(Q^{k+1} = Q^{k} \cup \{i^*\}\), \(\text{TIF}^{k+1} = \text{pTIF}_{i^*}^k\), repeat steps 1-4
   - TRUE: stop, \(Q_{\text{Léon}} = Q^k\)
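What sets Léon apart is step 2: each candidate's IIF is computed with the upper asymptote \(d_r\) of the rank it would occupy. The sketch below shows a single selection step under stated assumptions: the candidate takes the 0-based rank \(r = ||Q^k||\), \(d_r = \exp(-0.01 r)\) is borrowed from the simulation design, the distance is the mean absolute difference on a \(\theta\) grid, and `info_sum` holds the summed, fatigue-adjusted IIFs of the items already selected. The function names are hypothetical and this is not the authors' code.

```python
import math

def iif(theta, b, a, c=0.0, d=1.0):
    """4-PL item information at theta."""
    p = c + (d - c) / (1.0 + math.exp(-a * (theta - b)))
    return a**2 * (p - c)**2 * (d - p)**2 / ((d - c)**2 * p * (1.0 - p))

def d_rank(r, decay=0.01):
    """Upper asymptote at presentation rank r (assumed exponential decay)."""
    return math.exp(-decay * r)

def distance(f, g):
    """Mean absolute difference between two curves (assumed metric)."""
    return sum(abs(x - y) for x, y in zip(f, g)) / len(f)

def leon_step(bank, selected, info_sum, target, grid):
    """Return (i_star, summed information, pTIF) for the best candidate."""
    r = len(selected)  # rank the new item would occupy (assumption)
    d = d_rank(r)      # fatigue lowers the upper asymptote at later ranks
    best = None
    for i, (b, a) in enumerate(bank):
        if i in selected:
            continue
        cand_sum = [s + iif(t, b, a, d=d) for s, t in zip(info_sum, grid)]
        ptif = [s / (r + 1) for s in cand_sum]
        dist = distance(target, ptif)
        if best is None or dist < best[0]:
            best = (dist, i, cand_sum, ptif)
    return best[1], best[2], best[3]

# Illustrative bank and grid (values are made up)
bank = [(-1.0, 1.2), (0.0, 1.5), (1.0, 1.0)]
grid = [-3 + 0.5 * j for j in range(13)]
target = [sum(iif(t, b, a) for b, a in bank) / len(bank) for t in grid]
i_star, info_sum, ptif = leon_step(bank, [], [0.0] * len(grid), target, grid)
print(i_star)

# The same item is less informative when presented late
print(iif(0.0, 0.0, 1.0, d=d_rank(0)), iif(0.0, 0.0, 1.0, d=d_rank(30)))
```

Because the information credited to an item depends on the rank at which it entered the selection, the resulting order is part of the solution.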

Simulation Study

Design

1000 respondents with \(\theta \sim \mathcal{U}(-3,3)\)

Item bank \(B\) of 70 items:

- \(b \sim \mathcal{U}(-3, 3)\)
- \(a \sim \mathcal{U}(.90, 2.0)\)
- \(c_i = 0\), \(\forall i \in B\)
- \(d_r = \exp(-0.01 r)\), with \(r = \{0, \ldots, ||B|| -1\}\)
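One replication of this design could be generated as follows. This is a sketch using only the Python standard library; the seed and the variable names are arbitrary assumptions, not part of the original study.

```python
import math
import random

rng = random.Random(2024)  # arbitrary seed, for reproducibility only
n_items = 70
n_respondents = 1000

# Respondents: theta ~ U(-3, 3)
thetas = [rng.uniform(-3, 3) for _ in range(n_respondents)]

# Item bank B: b ~ U(-3, 3), a ~ U(0.90, 2.0), c = 0 for every item
bank = [
    {"b": rng.uniform(-3, 3), "a": rng.uniform(0.90, 2.0), "c": 0.0}
    for _ in range(n_items)
]

# Upper asymptote by presentation rank: d_r = exp(-0.01 r), r = 0, ..., 69
d_by_rank = [math.exp(-0.01 * r) for r in range(n_items)]

print(d_by_rank[0], round(d_by_rank[-1], 3))
```

The first item presented is unaffected by fatigue (\(d_0 = 1\)), while the upper asymptote of the last of 70 items drops to \(\exp(-0.69)\), roughly one half.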

\(\text{TIF}^* = \sum_{i = 1}^{||B||} \frac{\text{IIF}_i}{||B||}\), with \(d_i = 1\), \(\forall i \in B\) 🥇

Given \(\text{TIF}^*\), \(B\), and \(d_r\), Léon and Frank are each applied to find the best \(Q \subset B\) across 100 replications

Plot twist! There is a minimum number of items: 10%, 25%, 50% of \(||B||\)

Results

Final remarks

Tip

Accounting for response fatigue during the item selection itself helps to find the item subset that minimizes the distance from the target

Warning

The order of the items selected by Léon cannot be randomized