Combining Survey and Population Data for Estimation and Simulation of Demographic Processes

The advantages and disadvantages of sample and population data collections point strongly to their being complementary. Population data are free of sampling error and may be much less subject to non-response bias. Sample surveys collect a large amount of information about each sampled, responding individual. The present study combines population-level information with sample data within a likelihood framework to exploit both advantages. The application is to marital fertility and divorce. Maximum likelihood estimation is implemented as a constrained optimization problem in which the population data provide the "constraints": year-specific divorce rates and age-, year-, and race-specific marital fertility rates. The survey data come from the Panel Study of Income Dynamics (PSID) and contribute more detailed information on the length of the marriage and on the number and timing of children born within and before the marriage. Separate equations are estimated for divorce and marital fertility. Implementing the constrained estimation yields large reductions in both variance and bias. These equations are then used in a simulation of the formation and dissolution of two-parent families. The variance of statistics resulting from the simulation (e.g., the proportion of marriages that end in divorce before the youngest child attains age 18) is estimated by a Monte Carlo method.
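The idea of maximum likelihood with a population-level "constraint" can be illustrated with a minimal sketch. It is not the study's estimator: it assumes a one-covariate logit model for divorce, synthetic data in place of the PSID, a single hypothetical population divorce rate (`target`), and a constraint imposed by requiring the sample-average predicted probability to equal that rate. All function names and numbers are illustrative assumptions. The constraint is eliminated by solving for the intercept given the slope, and the remaining one-dimensional profile likelihood is minimized by golden-section search, which assumes the profile is unimodal on the search interval.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def neg_log_lik(a, b, xs, ys):
    """Negative Bernoulli log-likelihood for p = sigmoid(a + b * x)."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = min(max(sigmoid(a + b * x), 1e-12), 1.0 - 1e-12)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total


def intercept_for_rate(b, xs, target, lo=-30.0, hi=30.0):
    """Bisection for the intercept that makes the mean predicted
    probability equal the (hypothetical) population rate `target`."""
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        mean_p = sum(sigmoid(mid + b * x) for x in xs) / len(xs)
        if mean_p < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def constrained_mle(xs, ys, target, b_lo=-5.0, b_hi=5.0, iters=40):
    """Golden-section search over the slope; the intercept is pinned
    by the population-rate constraint at every candidate slope."""
    phi = (math.sqrt(5.0) - 1.0) / 2.0

    def profile(b):
        return neg_log_lik(intercept_for_rate(b, xs, target), b, xs, ys)

    c = b_hi - phi * (b_hi - b_lo)
    d = b_lo + phi * (b_hi - b_lo)
    fc, fd = profile(c), profile(d)
    for _ in range(iters):
        if fc < fd:
            b_hi, d, fd = d, c, fc
            c = b_hi - phi * (b_hi - b_lo)
            fc = profile(c)
        else:
            b_lo, c, fc = c, d, fd
            d = b_lo + phi * (b_hi - b_lo)
            fd = profile(d)
    b_hat = 0.5 * (b_lo + b_hi)
    return intercept_for_rate(b_hat, xs, target), b_hat


# Synthetic "survey": x is a made-up covariate (e.g., a scaled marriage
# duration), y indicates divorce. None of these values come from the study.
rng = random.Random(0)
xs = [rng.uniform(0.0, 2.0) for _ in range(200)]
ys = [1 if rng.random() < sigmoid(-2.0 + 1.0 * x) else 0 for x in xs]
target = 0.25  # hypothetical population divorce rate used as the constraint

a_hat, b_hat = constrained_mle(xs, ys, target)
print("intercept:", a_hat, "slope:", b_hat)
```

In this sketch the survey supplies the individual-level detail that identifies the slope, while the population rate fixes the overall level. A general-purpose constrained optimizer could replace the bisection and golden-section steps when there are many parameters and many constraints, as in the age-, year-, and race-specific setting described above.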