Bayesian Inference for Two-Phase Studies with Categorical Covariates
Jon Wakefield
November 2012 CSSS Working Paper #123
Abstract
In this paper, we consider two-phase sampling in the situation in which all covariates are categorical. Two-phase designs are appealing from an efficiency perspective since, if carefully implemented, they allow sampling to be concentrated in informative cells. A number of likelihood-based methods have been developed for the analysis of two-phase data, but we describe a Bayesian approach which has previously been unavailable. The methods are first compared with existing approaches via a simulation study, and are then applied to data collected on Wilms tumour. The benefits of a Bayesian approach include relaxation of the reliance on asymptotic inference, particularly in sparse data situations, and the potential to model data with complex dependencies, for example, via the introduction of random effects. The sparse data situation is illustrated via a simulated example.
Keywords: Contingency tables, Efficiency, Markov chain Monte Carlo, Outcome-dependent sampling