Unified Bayesian Methods for Class Discovery and Variable Selection

We have proposed Bayesian methods that provide a unified approach to identifying cluster structure among experimental units and locating variables that discriminate between groups. We use model-based clustering to uncover the cluster structure, and we build a stochastic search variable selection method into the model to identify discriminating variables. We treat the number of clusters as unknown and adopt two different approaches. The first formulates the clustering problem in terms of finite mixture models with an unknown number of components and uses a reversible jump MCMC technique. The second uses infinite mixture models via Dirichlet process mixture priors. We illustrate the methods with an application to DNA microarray data, where there is interest in using gene expression profiles to capture disease heterogeneities.
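To give a feel for how a Dirichlet process prior leaves the number of clusters unknown, the following is a minimal sketch of the Chinese restaurant process, the partition distribution a Dirichlet process induces on the experimental units. It is an illustration only, not the sampler used in this work: the function name `crp_partition` and the concentration parameter `alpha` are assumptions introduced here, and the sketch draws from the prior alone, with no likelihood, no variable selection, and no MCMC updates.

```python
import random


def crp_partition(n, alpha, seed=None):
    """Draw cluster labels for n units from a Chinese restaurant process.

    Hypothetical helper for illustration; alpha > 0 is the concentration
    parameter of the assumed Dirichlet process prior.
    """
    rng = random.Random(seed)
    labels = []
    counts = []  # counts[k] = number of units currently assigned to cluster k
    for _ in range(n):
        # An existing cluster k is chosen with weight counts[k];
        # a brand-new cluster is opened with weight alpha.
        weights = counts + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(counts):
            counts.append(1)  # open a new cluster
        else:
            counts[k] += 1  # join an existing cluster
        labels.append(k)
    return labels


labels = crp_partition(n=50, alpha=1.0, seed=0)
print(len(set(labels)), "clusters among", len(labels), "units")
```

The number of clusters is not fixed in advance; it emerges from the draw, and larger values of `alpha` tend to produce more clusters. In a full model, each assignment would also be weighted by how well the unit's data fit each cluster.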