Although the percentage of youngsters graduating from high school has increased since the 1970s, a significant proportion of America's high school youth continue to drop out of school. Numerous studies have identified important individual- and school-level factors associated with dropping out, but the models used to identify youth at risk of school failure have largely proved inadequate. This study applies "boosting" (a new class of learning algorithms more precisely known as "gradient machines") to improve the accuracy of classification models predicting the likelihood of high school failure. While relatively new to the field of statistics, boosting has become a popular tool in the machine learning community and shows promise for applied problems. The study applies boosting algorithms to data from the National Education Longitudinal Study of 1988 (NELS), a nationally representative sample of adolescents within schools, currently in its fourth wave. NELS provides data on students' attitudes and behaviors as well as on family background and other demographics. By applying boosting to the problem of school dropout, this study seeks to help teachers and administrators identify the students most susceptible to school failure and direct dropout-prevention resources to those in greatest need.
Joint with Greg Ridgeway, RAND Corporation