Knowledge generation through crowdsourcing is becoming increasingly feasible and useful in many domains, yet it requires new methods given the observational, unstructured, and noisy nature of data sourced directly from individuals. In this talk I will discuss statistical and machine learning methods we are developing to integrate crowdsourced data into spatio-temporal public health models. This work includes combining citizen-sourced and clinical data, accounting for biases, drawing inference from observational data, and generating relevant features. Examples will use empirical data from local and worldwide contexts.