A primary goal of cognitive diagnosis models (CDMs), a traditional part of psychometrics, is the estimation of latent traits for a group of observations. For example, we might be interested in estimating a skill set profile for a group of students (which skills do you have? which ones are you missing?). CDMs can also be used as diagnostic assessments for patient care (given these patient-reported outcomes, can we choose the correct underlying problem?). Recently, the commonly used CDMs have been found to be inadequate for estimating latent traits in high-dimensional data sets. Given the rapid growth of online intelligent tutoring systems and the large data sets they produce, the field has been searching for more flexible tools, and recent work has employed clustering methodology as an alternative. After introducing cognitive diagnosis models, we briefly summarize the current use of clustering and its advantages and disadvantages. We then introduce mixture model component trees, a new technique for visualizing high-dimensional group structure (if present). Their development is motivated by the common assumption in model-based (parametric) clustering that the population density is a mixture of (usually) Gaussian component densities and that each mixture component is a cluster estimating an underlying (sub-population) group. Component trees can be used to identify combinations of components that could potentially be merged to better estimate the underlying groups. We include results from the ASSISTments system, an online tutoring system designed to assess students and assist teachers in preparing for state achievement tests.
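The Gaussian mixture assumption mentioned above can be written compactly. The symbols used here (K components, mixing weights \pi_k, means \mu_k, covariances \Sigma_k, Gaussian density \phi) are standard textbook notation assumed for illustration, not notation taken from the talk:

```latex
f(x) = \sum_{k=1}^{K} \pi_k \, \phi(x \mid \mu_k, \Sigma_k),
\qquad \pi_k \ge 0, \qquad \sum_{k=1}^{K} \pi_k = 1
```

Under this reading each component k is treated as one cluster. Merging two components j and k would then correspond to treating \pi_j \phi(x \mid \mu_j, \Sigma_j) + \pi_k \phi(x \mid \mu_k, \Sigma_k) as the (unnormalized) density of a single group, which need not be Gaussian.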
Joint work with Nema Dean, Department of Statistics, University of Glasgow, and Beth Ayers, Department of Education, Berkeley