Modeling Missing Data in Large-Scale Educational Assessments

Item nonresponses are prevalent in standardized testing. They occur either when students fail to reach the end of a test because of a time limit, or when students strategically choose to omit some items. Item nonresponses are often non-random, so the missing data mechanism needs to be properly modeled. In this presentation, I will introduce an innovative item response time model as a cohesive missing data model that accounts for the two most common types of item nonresponse: not-reached items and omitted items. It is a hierarchical latent model that treats item omission as a censored event. Simulation studies show that the proposed approach improves the estimation precision of item parameters compared with competing methods. Moreover, for persons with missing data, the latent trait estimates are less biased and more precise (i.e., have lower standard errors). The 2015 Programme for International Student Assessment (PISA) computer-based mathematics data are analyzed to illustrate the application of the proposed method. If time allows, I will also discuss the modeling of missing data caused by adaptive design, using National Assessment of Educational Progress (NAEP) data as an example.