Statistical Modeling of Dependent Network Data
PI: Peter D. Hoff
Sponsor: Statistical Modeling of Dependent Network Data
Project Period: -
Network data summarize relational information among interacting units and are common in many areas of research. Applications include international conflict, international trade, telephone calling patterns, chain-of-command networks in businesses and other organizations, the behavior of epidemics, and the interconnectedness of the World Wide Web. Such data differ from standard data in that they consist of observations on pairs of experimental units, and the observations among pairs are typically not independent but dependent in complicated ways. Past efforts at modeling dependencies in networks have focused on exponentially parameterized random-graph models (often referred to as the p* class of models), which have been difficult to estimate and often give a poor fit to actual network data. Additionally, such models have focused on the case of binary responses and have difficulty accommodating other common types of network data, such as continuous, count, time-series, and multivariate data. In contrast, the proposed project will develop a flexible modeling strategy for dependent network data using a novel random effects approach, which can easily be incorporated within well-known statistical methods such as linear regression, generalized linear models, semiparametric regression, and others. Preliminary results suggest that such an approach has several advantages over current practice. The proposed approach allows for prediction and hypothesis testing; lends itself to a model-based method of network visualization; is highly extensible and interpretable in terms of well-known statistical procedures; and has a feasible means of exact parameter estimation.
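To make the dependence among pairwise observations concrete, the following is a minimal simulation sketch, not the project's model. It assumes, purely for illustration, a linear regression on a dyadic covariate with additive sender and receiver random effects; the abstract does not specify the form of the proposed random effects, and all function and parameter names here are hypothetical. The sketch fits ordinary least squares while ignoring the effects and then measures how much of the residual variance lies between senders, which would be close to 1/(n-1) if the pairwise observations were independent.

```python
import numpy as np

def simulate_dyadic(n=100, beta=1.0, sd_sender=1.0, sd_receiver=1.0,
                    sd_noise=1.0, seed=0):
    """Simulate y_ij = beta * x_ij + a_i + b_j + e_ij for ordered pairs i != j.

    The additive sender (a_i) and receiver (b_j) effects are an assumed,
    illustrative form of random effects, not taken from the abstract.
    """
    rng = np.random.default_rng(seed)
    x = rng.normal(size=(n, n))               # dyadic covariate
    a = rng.normal(scale=sd_sender, size=n)   # sender random effects
    b = rng.normal(scale=sd_receiver, size=n) # receiver random effects
    e = rng.normal(scale=sd_noise, size=(n, n))
    y = beta * x + a[:, None] + b[None, :] + e
    mask = ~np.eye(n, dtype=bool)             # exclude self-pairs (i == i)
    return x, y, mask

def ols_residuals(x, y, mask):
    """Fit intercept + slope by least squares, ignoring the random effects."""
    xv, yv = x[mask], y[mask]
    design = np.column_stack([np.ones_like(xv), xv])
    coef, *_ = np.linalg.lstsq(design, yv, rcond=None)
    resid = np.full(y.shape, np.nan)          # diagonal stays NaN
    resid[mask] = yv - design @ coef
    return coef, resid

def sender_variance_share(resid):
    """Variance of per-sender mean residuals divided by total residual variance."""
    row_means = np.nanmean(resid, axis=1)
    return np.var(row_means) / np.nanvar(resid)

# Dependent case: observations sharing a sender or receiver are correlated.
x, y, mask = simulate_dyadic()
coef_dep, resid_dep = ols_residuals(x, y, mask)
share_dep = sender_variance_share(resid_dep)

# Independent case: same design with the random effects switched off.
x0, y0, mask0 = simulate_dyadic(sd_sender=0.0, sd_receiver=0.0)
coef_ind, resid_ind = ols_residuals(x0, y0, mask0)
share_ind = sender_variance_share(resid_ind)

print(f"slope estimate (dependent data): {coef_dep[1]:.3f}")
print(f"sender share of residual variance, dependent:   {share_dep:.3f}")
print(f"sender share of residual variance, independent: {share_ind:.3f}")
```

In this sketch the slope is still estimated reasonably well, but the residuals cluster by sender, so standard errors and tests that assume independent pairs would be unreliable. That is the kind of dependence a random effects formulation is intended to capture.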