Recent research has extended the study of differential item functioning (DIF) from response scores to response times. However, work on response-time DIF has focused predominantly on uniform DIF, and non-uniform DIF has received little attention. This study uses the log-normal response time model to model non-uniform DIF in response times directly, and then applies linear regression to identify items exhibiting DIF in response times. A simulation study evaluated how examinee sample size, the focal-to-reference group ratio, the number of DIF items in the test, and the DIF effect size affect the method’s performance. The results indicate that the proposed approach retains high detection power even with small samples and small DIF effects, while keeping Type I error rates within acceptable limits. An empirical study using the PISA 2018 dataset examined response time DIF between Chinese and American students across five dimensions of science items, confirming the method’s practical utility and its potential for real-world testing applications.
Liu et al. studied this question.
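To make the idea of regression-based detection concrete, the following is a minimal sketch and not the authors' procedure. It assumes log response times follow a log-normal model of the form log T = beta - tau + error, and it assumes one plausible regression form (log time on speed, group, and their interaction), which the abstract does not specify. The function name `fit_rt_dif_regression`, the simulated parameter values, and the use of the true speed in place of an estimated one are all hypothetical choices made for illustration.

```python
import numpy as np

def fit_rt_dif_regression(log_rt, speed, group):
    """OLS of log response time on speed, group, and speed x group.

    Hypothetical helper: the group coefficient reflects a uniform shift,
    the interaction coefficient reflects a speed-dependent (non-uniform) shift.
    Returns the coefficients and their t statistics.
    """
    X = np.column_stack([np.ones_like(speed), speed, group, speed * group])
    coef, _, _, _ = np.linalg.lstsq(X, log_rt, rcond=None)
    resid = log_rt - X @ coef
    dof = X.shape[0] - X.shape[1]
    sigma2 = resid @ resid / dof                # residual variance estimate
    cov = sigma2 * np.linalg.inv(X.T @ X)       # OLS coefficient covariance
    t_stats = coef / np.sqrt(np.diag(cov))
    return coef, t_stats

# Simulated data (all values are assumptions chosen for illustration).
rng = np.random.default_rng(0)
n = 2000
speed = rng.normal(0.0, 1.0, n)                 # examinee speed (tau), treated as known
group = rng.integers(0, 2, n).astype(float)     # 0 = reference, 1 = focal
beta, sigma = 1.0, 0.5                          # item time intensity, residual SD

# Item without DIF: log T = beta - tau + error.
log_rt_clean = beta - speed + rng.normal(0.0, sigma, n)
# Item with non-uniform DIF: the effect of speed differs by group.
log_rt_dif = beta - (1.0 + 0.3 * group) * speed + rng.normal(0.0, sigma, n)

coef_clean, t_clean = fit_rt_dif_regression(log_rt_clean, speed, group)
coef_dif, t_dif = fit_rt_dif_regression(log_rt_dif, speed, group)

print("interaction estimate, clean item:", coef_clean[3])
print("interaction estimate, DIF item:  ", coef_dif[3])
```

In this sketch, an item would be flagged when the t statistic of the interaction term is large in absolute value. In practice, speed would have to be estimated from the response time model rather than taken as known, and the flagging threshold would need to account for testing many items.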