Martha Bailey

Pilot Study Creating the Early Twentieth Century Integrated Longitudinal Dataset (ETCILD)


We seek to construct the first early 20th-century longitudinal dataset. Our long-run research objective is to collect, link, and release the first dataset that combines Vital Statistics and census data. The pilot will demonstrate linkage rates between the Vital Statistics data and the complete (100 percent) censuses from 1900 to 1940 for an individual state.

Project Description

Some of the most important questions in social science and public policy relate to how individuals’ lives and experiences have changed across time. However, most late 19th and 20th century data are cross-sectional—large sets of individuals at one point in time. Existing cross-sectional data limit the study of economic and geographic mobility, family formation and dissolution, and the long-term impacts of adverse events, public policies, or family background.

This project will transform vital records data (birth, marriage, and death certificates) into the first longitudinal and intergenerational micro-database to span much of the late 19th and 20th centuries. The methodological innovation is the use of birth certificate data as a basis for historical record linking on an unprecedented scale.

This database will expand the research frontier in at least five dimensions. It will (1) cover the lifetimes of individuals born from 1881 to 1930, (2) include representative samples of both men and women (most historical samples cannot track women after their names change at marriage), (3) link networks of families across multiple (up to four) generations, and (4) integrate economic and demographic information with measures of health, family background, and early life context. It will also (5) offer unprecedented sample sizes, especially for understudied populations of racial/ethnic minorities and immigrants.

The ultimate aim of this project is to create a novel, user-friendly database to facilitate high-impact research on important and unresolved questions in policy and in social, behavioral, and economic research. The project also has broader impacts on other ongoing linkage efforts. It will extend and improve the Census Bureau’s new Core Longitudinal Infrastructure Population Project (CLIPP), which aims to combine census data including administrative records and population records. Its crosswalk between women’s birth and married names, reconstruction of families, and intergenerational records will enhance the Early Indicators Project, an NIH-funded initiative to link Union Army veterans to their children and grandchildren, and the IPUMS historical linked census project (Ruggles et al. 2010). Finally, it will advance recent initiatives to link Medicare records with census and administrative data.