Using Twitter for Demographic and Social Science Research: Tools for Data Collection
April 2013 CSSS Working Paper #127
Despite recent interest in using Twitter to examine human behavior and attitudes, little work has been done to develop systematic ways of collecting Twitter data for social science research. Further, gleaning demographic information about Twitter users, a key component of much social science research, remains a challenge. This paper develops a scalable, sustainable toolkit for social science researchers interested in using Twitter data to examine behaviors and attitudes, as well as the demographic characteristics of the populations expressing or engaging in them. We begin by describing how to collect Twitter data on a particular population: in this case, individuals who do not plan to vote in the 2012 U.S. presidential election. We then describe and evaluate a method for processing data to retrieve demographic information reported by users that is not encoded as text (e.g., details of images), and we assess the reliability of these techniques. We end by assessing the challenges of this data collection strategy and discussing how large-scale social media data may benefit demographic researchers.
Keywords: Twitter, social media, big data, demography, data collection, crowdsourcing, Amazon Mechanical Turk