Data streams of increasing complexity and scale are being collected in fields ranging from neuroscience, genomics, and environmental monitoring to e-commerce. Modeling the intricate relationships among these many series can improve predictive performance and reveal domain-interpretable structure. For scalability, it is crucial to discover and exploit sparse dependencies between the data streams. Such representational structures have been studied extensively for independent data sources, but have received limited attention in the context of time series. In this talk, we present Bayesian models for capturing such sparse dependencies via clustering and graphical models of time series. We explore these methods in a variety of applications, including house price modeling and inferring networks in the brain.
We then turn to observed interaction data and briefly touch on how to devise statistical network models that capture important network features such as sparsity of edge connectivity. Within our Bayesian framework, a key insight is to move to a continuous-space representation of the graph, rather than the typical discrete adjacency matrix structure. We demonstrate our methods on a series of real-world networks with up to hundreds of thousands of nodes and millions of edges.