Skip to main content

Marginal Modeling of Multilevel Binary Data with Time-Varying Covariates

Modeling multilevel binary data continues to pose significant challenges such as non-nested sources of clustering and endogenous covariates. Hierarchical models may not model contrasts of interest and may be difficult to fit or make unreasonable assumptions. Generalized estimating equations (GEE) require a large number of clusters, assume data are missing completely at random, and may be inefficient. We propose and compare two approaches for regression analysis of multilevel binary data when clusters are non necessarily nested: a GEE method that relies on a working independence assumption coupled with a three-step method for obtaining empirical standard errors and a likelihood-based method implemented using Bayesian computational techniques. Implications of covariate endogeneity are addressed. The methods are illustrated using data from the Breast Cancer Surveillance Consortium to estimate mammography accuracy from a repeatedly screened population.