CAPPP Fellowship (2018-2019)

October 16, 2018

Sources for Data

Big Dataset of Datasets: For all projects, your first stop should be the database of datasets at the Inter-University Consortium for Political and Social Research (ICPSR).

Public Advocacy Organizations and Think Tanks: Start here, but in general, dig into the organizations working on your specific topic to see who is collecting niche data relevant to you.

U.S. State, County, and Local Level Data: National Association of Counties, National Conference of State Legislatures, and Data.gov.

U.S. Census Data: The Census offers some data aggregated at sub-national levels and some individual-level data. Some people have luck with DataFerrett. For quick Census facts, go here. For more options on how to get more complex datasets from the Census, go here. The Census also conducts the American Community Survey, which has a wide range of additional population and housing data.

U.S. Electoral Politics: Your first stop should be the venerable American National Election Studies, which covers participation, voting behavior, public opinion, efficacy, media exposure, values, and more. On politics and the media, Google has a database of all political ads it has run, and ProPublica has a new collection of political advertisements from Facebook. There is also the Wesleyan broadcast ad archive and Stanford’s Political Communication Lab data sets.

Global City and Country Data: At both the country and city level, the U.N. has a broad range of data. At the country level, some common and useful data sources include the World Bank, the World Inequality Database (Gini coefficients and other measures of global economic inequality), and the Human Development Index from the U.N.

Other Potentially Useful International Datasets: the University of Maryland’s Global Terrorism Database, Princeton’s Empirical Studies of Conflict dataset, and the Comparative Study of Electoral Systems dataset.

Other Potentially Useful U.S. Datasets: the Crowd Counting Consortium, which documents crowds and protests; Princeton’s Fragile Families dataset; the Add Health dataset on health and other individual outcomes; and the FBI Uniform Crime Reporting data. On immigration, try this UC San Diego data guide, this list of government immigration data sources, and this list of additional immigration data. For newspaper content, the New York Times has an Article Search API that you can use to download newspaper articles. For Twitter data, this article has a good overview of possible ways to access those datasets. For television news, use the Vanderbilt Television News Archive.
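
For the New York Times Article Search API mentioned above, a request might look something like the following minimal Python sketch. It assumes you have registered for your own API key on the Times developer site, and that the version 2 endpoint and the parameter names shown (q, begin_date, end_date, page, api-key) still match the current documentation. The function names build_search_url and fetch_articles are made up for illustration, and the layout of the JSON response is likewise an assumption to check against the official docs.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint of the NYT Article Search API (version 2).
BASE_URL = "https://api.nytimes.com/svc/search/v2/articlesearch.json"


def build_search_url(query, api_key, begin_date=None, end_date=None, page=0):
    """Return a request URL for one page of search results.

    Dates are strings in YYYYMMDD form; page numbering starts at 0.
    """
    params = {"q": query, "page": page, "api-key": api_key}
    if begin_date:
        params["begin_date"] = begin_date
    if end_date:
        params["end_date"] = end_date
    return BASE_URL + "?" + urllib.parse.urlencode(params)


def fetch_articles(query, api_key, **kwargs):
    """Download one page of results and return the list of article records.

    Assumes the response JSON nests the records under response -> docs.
    """
    url = build_search_url(query, api_key, **kwargs)
    with urllib.request.urlopen(url) as response:
        payload = json.load(response)
    return payload["response"]["docs"]
```

You would call fetch_articles("immigration policy", "YOUR_API_KEY", begin_date="20180101") and then loop over the page argument to collect more results. The API is rate limited, so pause between requests when downloading many pages.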