Linking Survey and Big Data: New Data Opportunities for Research on Social Inequality and Mobility

Description of research project
Linking existing social survey data to administrative (big) data sources is a powerful way to expand the data available for sociological inquiry. This project pursues several such linkages. We will add historical Census data, as well as rich housing data from a real estate vendor, to ongoing, large-scale survey studies of American families. The matched data will support new research on long-term trends in economic wellbeing and the transmission of social inequality across generations in the United States.

Description of work that will be assigned to research assistants
Most of the work will focus on cleaning and linking data. You will get to know two very different types of data: rich survey data on American families and big administrative data. To link the two, we need to locate the families that participated in a survey within the big administrative data source (such as the Census or the real estate vendor data). To do so, we will jointly develop matching algorithms (in Stata) that help locate records, and then hand-check the quality of each match. This is an iterative process in which you will not only learn how to work with big data and judge their quality, as someone working at Google would, but also get a glimpse of the sociological opportunities that these data offer. Later stages of the project may also involve literature searches on research themes arising from this project. You will work in a small group of highly motivated undergraduate students.

Supervising Faculty Member
Fabian T. Pfeffer, Assistant Professor, Department of Sociology

Contact information
Fabian Pfeffer
4213 LSA

Average hours of work per week

Range of credit hours students can earn

Number of positions available