class: center, middle, inverse, title-slide .title[ # Item Response Theory for beginners ] .subtitle[ ## Differential Item Functioning in R ] .author[ ### Dr. Ottavia M. Epifania ] .institute[ ### Bressanone ] .date[ ### Corso IRT @ Università Libera di Bolzano, 17-18 Gennaio 2023 ] --- <style type="text/css"> pre { max-height: 700px; overflow-y: auto; } pre[class] { max-height: 500px; } .scroll-100 { max-height: 500px; overflow-y: auto; } .inverse { background-color: #272822; color: #d6d6d6; text-shadow: 0 0 20px #333; } .scrollable { height: 500px; overflow-y: auto; } .scrollable-auto { height: 80%; overflow-y: auto; } .remark-slide-number { display: none; } </style> --- class: section, center, middle # Importare i dati --- ## Caricare i dati in R .scrollable[ Scaricare il file `difClass.csv` dalla cartella [Dati](https://drive.google.com/drive/folders/1PXDG7HhjRDMdFEQjk5WQQWdiorDpSgtw?usp=sharing) Salvarlo all'interno della sottocartella `Dati` del progetto `R` che avete creato per il corso Usate questo codice: ```r data = read.csv("Dati/difClass.csv", header = T, sep = ",") ``` ] --- class: section, center, middle # Look at the data! --- ## Risposte agli item .scrollable[ ```r small_gender = data[, c("id", "gender")] small_responses = data[, -2] responses = reshape(small_responses, direction = "long", varying = c(2:ncol(small_responses)), idvar = "id", v.names = "response", timevar = "item", times = names(small_responses)[-1]) long_data = merge(small_gender, responses, by = "id") p = data.frame(table(long_data$item, long_data$resp)) p$prop = p$Freq/nrow(data) p$Var1 = factor(p$Var1, levels = unique(long_data$item)) colnames(p)[2] = "correct" ggplot(p, aes(x = Var1, y=prop, fill = correct)) + geom_bar(stat = "identity") + theme(axis.text = element_text(angle = 90)) + geom_hline(aes(yintercept=.50)) ``` <img src="DIF_files/figure-html/unnamed-chunk-3-1.png" width="90%" height="60%" style="display: block; margin: auto;" /> ] --- --- ## Closer .scrollable[ ```r p = data.frame(table(long_data$item, long_data$resp, long_data$gender)) p$prop = p$Freq/nrow(data) p$Var1 = factor(p$Var1, levels = unique(long_data$item)) colnames(p)[3] = "gender" ggplot(p[p$Var2 %in% 1,], aes(x = Var1, y=prop, fill = gender)) + geom_bar(stat = "identity", position = "dodge") + theme(axis.text = element_text(angle = 90)) + geom_hline(aes(yintercept=.50)) ``` <img src="DIF_files/figure-html/unnamed-chunk-4-1.png" width="90%" style="display: block; margin: auto;" /> ] --- class: section, center, middle # Stimare e scegliere il modello --- ## Model comparison .scrollable[ ```r m1pl = tam.mml(data[, grep("item", colnames(data))], verbose = F) m2pl = tam.mml.2pl(data[, grep("item", colnames(data))], irtmodel = "2PL", verbose = F) m3pl = tam.mml.3pl(data[, grep("item", colnames(data))], est.guess = grep("item", colnames(data)), verbose = F) IRT.compareModels(m1pl, m2pl, m3pl) ``` ```{.scroll-100} ## $IC ## Model loglike Deviance Npars Nobs AIC BIC AIC3 AICc ## 1 m1pl -5599.390 11198.78 11 1000 11220.78 11274.77 11231.78 11221.05 ## 2 m2pl -5597.621 11195.24 20 1000 11235.24 11333.40 11255.24 11236.10 ## 3 m3pl -5597.716 11195.43 31 1000 11257.43 11409.57 11288.43 11259.48 ## CAIC GHP ## 1 11285.77 0.5610390 ## 2 11353.40 0.5617621 ## 3 11440.57 0.5628716 ## ## $LRtest ## Model1 Model2 Chi2 df p ## 1 m1pl m2pl 3.5389531 9 0.9390614 ## 2 m1pl m3pl 3.3484797 20 0.9999895 ## 3 m2pl m3pl -0.1904735 11 1.0000000 ## ## attr(,"class") ## [1] "IRT.compareModels" ``` ] --- ## 1PL .scrollable{ ```r summary(m1pl) ``` ```{.scroll-100} ## ------------------------------------------------------------ ## TAM 4.0-16 (2022-05-13 13:23:23) ## R version 4.2.2 (2022-10-31 ucrt) x86_64, mingw32 | nodename=LAPTOP-OTTAVIA | login=huawei ## ## Date of Analysis: 2023-01-17 08:10:23 ## Time difference of 0.05735612 secs ## Computation time: 0.05735612 ## ## Multidimensional Item Response Model in TAM ## ## IRT Model: 1PL ## Call: ## tam.mml(resp = data[, grep("item", colnames(data))], verbose = F) ## ## ------------------------------------------------------------ ## Number of iterations = 18 ## Numeric integration with 21 integration points ## ## Deviance = 11198.78 ## Log likelihood = -5599.39 ## Number of persons = 1000 ## Number of persons used = 1000 ## Number of items = 10 ## Number of estimated parameters = 11 ## Item threshold parameters = 10 ## Item slope parameters = 0 ## Regression parameters = 0 ## Variance/covariance parameters = 1 ## ## AIC = 11221 | penalty=22 | AIC=-2*LL + 2*p ## AIC3 = 11232 | penalty=33 | AIC3=-2*LL + 3*p ## BIC = 11275 | penalty=75.99 | BIC=-2*LL + log(n)*p ## aBIC = 11240 | penalty=41 | aBIC=-2*LL + log((n-2)/24)*p (adjusted BIC) ## CAIC = 11286 | penalty=86.99 | CAIC=-2*LL + [log(n)+1]*p (consistent AIC) ## AICc = 11221 | penalty=22.27 | AICc=-2*LL + 2*p + 2*p*(p+1)/(n-p-1) (bias corrected AIC) ## GHP = 0.56104 | GHP=( -LL + p ) / (#Persons * #Items) (Gilula-Haberman log penalty) ## ## ------------------------------------------------------------ ## EAP Reliability ## [1] 0.624 ## ------------------------------------------------------------ ## Covariances and Variances ## [,1] ## [1,] 0.988 ## ------------------------------------------------------------ ## Correlations and Standard Deviations (in the diagonal) ## [,1] ## [1,] 0.994 ## ------------------------------------------------------------ ## Regression Coefficients ## [,1] ## [1,] 0 ## ------------------------------------------------------------ ## Item Parameters -A*Xsi ## item N M xsi.item AXsi_.Cat1 B.Cat1.Dim1 ## 1 item1 1000 0.490 0.050 0.050 1 ## 2 item2 1000 0.720 -1.131 -1.131 1 ## 3 item3 1000 0.310 0.963 0.963 1 ## 4 item4 1000 0.329 0.859 0.859 1 ## 5 item5 1000 0.823 -1.821 -1.821 1 ## 6 item6 1000 0.291 1.070 1.070 1 ## 7 item7 1000 0.222 1.495 1.495 1 ## 8 item8 1000 0.512 -0.056 -0.056 1 ## 9 item9 1000 0.695 -0.988 -0.988 1 ## 10 item10 1000 0.156 1.992 1.992 1 ## ## Item Parameters in IRT parameterization ## item alpha beta ## 1 item1 1 0.050 ## 2 item2 1 -1.131 ## 3 item3 1 0.963 ## 4 item4 1 0.859 ## 5 item5 1 -1.821 ## 6 item6 1 1.070 ## 7 item7 1 1.495 ## 8 item8 1 -0.056 ## 9 item9 1 -0.988 ## 10 item10 1 1.992 ``` } --- ## Fit del modello ```r f.m1pl = tam.modelfit(m1pl, progress = F) f.m1pl$statlist ``` ```{.scroll-100} ## X100.MADCOV SRMR SRMSR MADaQ3 pmaxX2 ## 1 0.4318818 0.02178404 0.02800907 0.02874571 1 ``` ```r f.m1pl$modelfit.test ``` ```{.scroll-100} ## maxX2 Npairs p.holm ## 1 5.173086 45 1 ``` --- ## 1PL - Item fit .scrollable[ ```r item.fit.1pl = IRT.itemfit(m1pl) item.fit.1pl$chisquare_stat ``` ```{.scroll-100} ## item Group1 ## 1 item1 1.2342630 ## 2 item2 0.5938130 ## 3 item3 0.6668817 ## 4 item4 0.3117211 ## 5 item5 1.1353671 ## 6 item6 0.3983689 ## 7 item7 2.2867280 ## 8 item8 0.4899905 ## 9 item9 1.2476006 ## 10 item10 0.3184401 ``` ```r item.fit.1pl$RMSD ``` ```{.scroll-100} ## item Group1 ## 1 item1 0.013892105 ## 2 item2 0.007935321 ## 3 item3 0.010894367 ## 4 item4 0.005948698 ## 5 item5 0.010882772 ## 6 item6 0.008170491 ## 7 item7 0.019833211 ## 8 item8 0.009002479 ## 9 item9 0.013681206 ## 10 item10 0.006888076 ``` ```r item.fit.1pl$RMSD_summary ``` ```{.scroll-100} ## Parm M SD Min Max ## 1 Group1 0.01071287 0.004171214 0.005948698 0.01983321 ``` ] --- class: section, center, middle # DIF time! --- ## Pacchetto `difR` Per investigare la DIF, utilizziamo il pacchetto `difR` `difR` permette di investigare la DIF anche con approcci non IRT che però non sono il focus di questo corso Il vantaggio di usare un approccio IRT è che la DIF viene investigata considerando il tratto latente `\(\rightarrow\)` non dipende dal campione - Likelihood Ratio Test - Lord - Raju --- ## Likelihood Ratio Test .scrollable[ Non è propriamente il LRT ```r est_theta = IRT.factor.scores(m1pl)$EAP lrt_dif = difGenLogistic(data[, !colnames(data) %in% c("id", "gender")], group = as.factor(data$gender), focal.names = "f", type = "udif", alpha = .001, p.adjust.method = "BH", match = est_theta, criterion = "LRT") lrt_dif ``` ```{.scroll-100} ## ## Detection of uniform Differential Item Functioning ## using Logistic regression method, without item purification ## and with LRT DIF statistic ## ## Matching variable: specified matching variable ## ## No set of anchor items was provided ## ## Multiple comparisons made with Benjamini-Hochberg adjustement of p-values ## ## Logistic regression DIF statistic: ## ## Stat. P-value Adj. P ## item1 9.5769 0.0020 0.0033 ** ## item2 7.8345 0.0051 0.0064 ** ## item3 2.8907 0.0891 0.0891 . ## item4 6.6418 0.0100 0.0111 * ## item5 28.1792 0.0000 0.0000 *** ## item6 36.3876 0.0000 0.0000 *** ## item7 18.8507 0.0000 0.0000 *** ## item8 9.8524 0.0017 0.0033 ** ## item9 27.3310 0.0000 0.0000 *** ## item10 8.9442 0.0028 0.0040 ** ## ## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 ## ## Detection threshold: 10.8276 (significance level: 0.001) ## ## Items detected as uniform DIF items: ## ## item5 ## item6 ## item7 ## item9 ## ## ## Effect size (Nagelkerke's R^2): ## ## Effect size code: ## 'A': negligible effect ## 'B': moderate effect ## 'C': large effect ## ## R^2 ZT JG ## item1 0.0092 A A ## item2 0.0086 A A ## item3 0.0030 A A ## item4 0.0069 A A ## item5 0.0368 A B ## item6 0.0394 A B ## item7 0.0231 A A ## item8 0.0098 A A ## item9 0.0293 A A ## item10 0.0130 A A ## ## Effect size codes: ## Zumbo & Thomas (ZT): 0 'A' 0.13 'B' 0.26 'C' 1 ## Jodoin & Gierl (JG): 0 'A' 0.035 'B' 0.07 'C' 1 ## ## Output was not captured! ``` ] --- ## DIF Raju .scrollable[ ```r rajDif = difRaju(d2[, !colnames(d2) %in% c("id", "ability")], group = "gender", focal.name = "f", model = "1PL", p.adjust.method = "BH") rajDif ``` ```{.scroll-100} ## ## Detection of Differential Item Functioning using Raju's method ## with 1PL model and without item purification ## ## Type of Raju's Z statistic: based on unsigned area ## ## Engine 'ltm' for item parameter estimation ## ## Common discrimination parameter: fixed to 1 ## ## No set of anchor items was provided ## ## Multiple comparisons made with Benjamini-Hochberg adjustement of p-values ## ## Raju's statistic: ## ## Stat. P-value Adj. P ## item1 -2.7358 0.0062 0.0104 * ## item2 -2.5503 0.0108 0.0154 * ## item3 -1.5133 0.1302 0.1302 ## item4 -2.2916 0.0219 0.0244 * ## item5 4.4742 0.0000 0.0000 *** ## item6 5.0150 0.0000 0.0000 *** ## item7 3.6098 0.0003 0.0008 *** ## item8 2.4605 0.0139 0.0173 * ## item9 -4.6929 0.0000 0.0000 *** ## item10 -2.9145 0.0036 0.0071 ** ## ## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 ## ## Detection thresholds: -1.96 and 1.96 (significance level: 0.05) ## ## Items detected as DIF items: ## ## item1 ## item2 ## item4 ## item5 ## item6 ## item7 ## item8 ## item9 ## item10 ## ## Effect size (ETS Delta scale): ## ## Effect size code: ## 'A': negligible effect ## 'B': moderate effect ## 'C': large effect ## ## mF-mR deltaRaju ## item1 -0.4190 0.9846 A ## item2 -0.4244 0.9973 A ## item3 -0.2458 0.5776 A ## item4 -0.3677 0.8641 A ## item5 0.9066 -2.1305 C ## item6 0.8568 -2.0135 C ## item7 0.6592 -1.5491 C ## item8 0.3799 -0.8928 A ## item9 -0.7692 1.8076 C ## item10 -0.5764 1.3545 B ## ## Effect size codes: 0 'A' 1.0 'B' 1.5 'C' ## (for absolute values of 'deltaRaju') ## ## Output was not captured! ``` ] --- ## DIF Lord .scrollable[ ```r lordDif = difLord(data[, !colnames(data) %in% "id"], group = "gender", focal.name = "f", model = "1PL", p.adjust.method = "BH") lordDif ``` ```{.scroll-100} ## ## Detection of Differential Item Functioning using Lord's method ## with 1PL model and without item purification ## ## Engine 'ltm' for item parameter estimation ## ## Common discrimination parameter: fixed to 1 ## ## No set of anchor items was provided ## ## Multiple comparisons made with Benjamini-Hochberg adjustement of p-values ## ## Lord's chi-square statistic: ## ## Stat. P-value Adj. P ## item1 7.4848 0.0062 0.0104 * ## item2 6.5039 0.0108 0.0154 * ## item3 2.2901 0.1302 0.1302 ## item4 5.2515 0.0219 0.0244 * ## item5 20.0188 0.0000 0.0000 *** ## item6 25.1505 0.0000 0.0000 *** ## item7 13.0309 0.0003 0.0008 *** ## item8 6.0539 0.0139 0.0173 * ## item9 22.0232 0.0000 0.0000 *** ## item10 8.4941 0.0036 0.0071 ** ## ## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 ## ## Items detected as DIF items: ## ## item1 ## item2 ## item4 ## item5 ## item6 ## item7 ## item8 ## item9 ## item10 ## ## Effect size (ETS Delta scale): ## ## Effect size code: ## 'A': negligible effect ## 'B': moderate effect ## 'C': large effect ## ## mF-mR deltaLord ## item1 -0.4190 0.9846 A ## item2 -0.4244 0.9973 A ## item3 -0.2458 0.5776 A ## item4 -0.3677 0.8641 A ## item5 0.9066 -2.1305 C ## item6 0.8568 -2.0135 C ## item7 0.6592 -1.5491 C ## item8 0.3799 -0.8928 A ## item9 -0.7692 1.8076 C ## item10 -0.5764 1.3545 B ## ## Effect size codes: 0 'A' 1.0 'B' 1.5 'C' ## (for absolute values of 'deltaLord') ## ## Output was not captured! ``` ] --- ## DIF Lord I `mF-mR` è la differenza tra i parametri del gruppo focale e del gruppo di riferimento `deltaLord` è una misura dell'effetto ottenuta moltiplicando la differenza tra i parametri di difficoltà nei due gruppi per `\(-2.35\)` (Penfield & Camilli, 2007) Il `deltaLord` può essere interpretato secondo i valori riportati nell'output (Holland & Thayer, 1985) I valori delle stime dei parametri nei due gruppi posso essere estratti con: ```r lordDif$itemParInit ``` --- ## Attenzione ai paramteri degli item .scrollable[ Le prime `\(i = 1, \ldots, I\)` righe sono i parametri degli item **NEL GRUPPO DI RIFERIMENTO** ```r item_par = lordDif$itemParInit item_par[1:10, ] ``` ```{.scroll-100} ## b se(b) ## Item1 0.1003314 0.1081049 ## Item2 -1.0813481 0.1167024 ## Item3 0.9287588 0.1141034 ## Item4 0.8852199 0.1135280 ## Item5 -2.5345451 0.1625745 ## Item6 0.5418750 0.1100347 ## Item7 1.0626916 0.1160698 ## Item8 -0.4051660 0.1092741 ## Item9 -0.7726385 0.1124324 ## Item10 2.1299052 0.1442241 ``` Le ultime righe sono i parametri **NEL GRUPPO FOCALE** ```r item_par[11:nrow(item_par), ] ``` ```{.scroll-100} ## b se(b) ## Item1 1.621335e-06 0.1084637 ## Item2 -1.187073e+00 0.1186087 ## Item3 1.001565e+00 0.1156194 ## Item4 8.361141e-01 0.1134093 ## Item5 -1.309367e+00 0.1209242 ## Item6 1.717348e+00 0.1307057 ## Item7 2.040499e+00 0.1409710 ## Item8 2.933326e-01 0.1090660 ## Item9 -1.223203e+00 0.1192650 ## Item10 1.872148e+00 0.1353199 ``` Questo si applica anche al metodo di Raju ] --- ## Attenzione ai parametri degli item .scrollable[ ```r itemFR = data.frame(cbind(item_par[11:nrow(item_par), ], item_par[1:10, ])) colnames(itemFR)[c(1,3)] = paste0(rep("b",2), c("F", "R")) itemFR$dif =itemFR$bF - itemFR$bR itemFR ``` ```{.scroll-100} ## bF se.b. bR se.b..1 dif ## Item1 1.621335e-06 0.1084637 0.1003314 0.1081049 -0.10032975 ## Item2 -1.187073e+00 0.1186087 -1.0813481 0.1167024 -0.10572484 ## Item3 1.001565e+00 0.1156194 0.9287588 0.1141034 0.07280599 ## Item4 8.361141e-01 0.1134093 0.8852199 0.1135280 -0.04910580 ## Item5 -1.309367e+00 0.1209242 -2.5345451 0.1625745 1.22517862 ## Item6 1.717348e+00 0.1307057 0.5418750 0.1100347 1.17547293 ## Item7 2.040499e+00 0.1409710 1.0626916 0.1160698 0.97780725 ## Item8 2.933326e-01 0.1090660 -0.4051660 0.1092741 0.69849860 ## Item9 -1.223203e+00 0.1192650 -0.7726385 0.1124324 -0.45056403 ## Item10 1.872148e+00 0.1353199 2.1299052 0.1442241 -0.25775737 ``` I valori sono diversi da quelli di `dif` sono dversi da quelli riportati nell'ouput in `mF-mR`!! Questo perché i parametri del gruppo focale vengono riscalati con *equal means anchoring* di Cook e Eignor (1991) ] --- ## Cook e Eignor (1991) Secondo il metodo dell'*equal means ancrhoring*, i parametri del gruppo di riferimento e del gruppo focale devono essere riscalati (viene fatto in automatico dal software) Il rescaling viene fatto sottraendo una costante al parametro di difficoltà del gruppo di riferimeno: `$$c = \bar{b}_R - \bar{b}_F$$` Questa costante va tolta a tutti i parametri di difficoltà del **gruppo di riferimento** Dopo aver applicato il rescaling, si può calcolare la DIF --- ## Rescaling .scrollable[ ```r itemFR$constant = mean(itemFR$bR) - mean(itemFR$bF) itemFR$new.bR = itemFR$bR - itemFR$constant itemFR$DIF.correct = itemFR$bF- itemFR$new.bR itemFR ``` ```{.scroll-100} ## bF se.b. bR se.b..1 dif constant ## Item1 1.621335e-06 0.1084637 0.1003314 0.1081049 -0.10032975 -0.3186282 ## Item2 -1.187073e+00 0.1186087 -1.0813481 0.1167024 -0.10572484 -0.3186282 ## Item3 1.001565e+00 0.1156194 0.9287588 0.1141034 0.07280599 -0.3186282 ## Item4 8.361141e-01 0.1134093 0.8852199 0.1135280 -0.04910580 -0.3186282 ## Item5 -1.309367e+00 0.1209242 -2.5345451 0.1625745 1.22517862 -0.3186282 ## Item6 1.717348e+00 0.1307057 0.5418750 0.1100347 1.17547293 -0.3186282 ## Item7 2.040499e+00 0.1409710 1.0626916 0.1160698 0.97780725 -0.3186282 ## Item8 2.933326e-01 0.1090660 -0.4051660 0.1092741 0.69849860 -0.3186282 ## Item9 -1.223203e+00 0.1192650 -0.7726385 0.1124324 -0.45056403 -0.3186282 ## Item10 1.872148e+00 0.1353199 2.1299052 0.1442241 -0.25775737 -0.3186282 ## new.bR DIF.correct ## Item1 0.41895953 -0.4189579 ## Item2 -0.76271998 -0.4243530 ## Item3 1.24738700 -0.2458222 ## Item4 1.20384806 -0.3677340 ## Item5 -2.21591698 0.9065505 ## Item6 0.86050314 0.8568448 ## Item7 1.38131979 0.6591791 ## Item8 -0.08653783 0.3798704 ## Item9 -0.45401035 -0.7691922 ## Item10 2.44853338 -0.5763855 ``` ] --- --- class: section, center, middle # Esercitazione --- ## Esercitazione DIF Scaricare il dataset `difES.csv` dalla cartella [Dati](https://drive.google.com/drive/folders/1PXDG7HhjRDMdFEQjk5WQQWdiorDpSgtw?usp=sharing) Look at the data! Fittare i modelli IRT Scegliere il modello più appropriato Valutarne la fit (sia del modello sia degli item) Valutare la DIF con i tre metodi ---