Internet data have generated new opportunities for the study of human migration and geographic mobility. However, these data come with important limitations due to several types of bias. This talk will discuss recent work that colleagues and I have done to estimate patterns of international migration using data from major social media platforms. There is no definitive answer to the question of whether statistical inference can be made from non-representative social media data. Here we show that, when certain assumptions are met, we can extract signal from large but noisy and biased data. We propose two main approaches to reduce bias. When "ground truth" data are available, we suggest a method that calibrates the online data against reliable official statistics. When no ground truth data are available, we propose a difference-in-differences approach to estimate relative trends.
Estimating patterns of international migration using (non-representative) social media data
Room 409