For decades, policing in America has been characterized by massive racial disparities and high-profile incidents of excessive force against minorities. These factors have fueled allegations of racial bias and demands for change. Yet researchers have made surprisingly little progress in quantifying the sources and severity of racial bias, let alone in identifying concrete avenues for reform. I show how this lack of progress stems from a combination of inconsistent record keeping, incomplete datasets, and incompatible analytic approaches resting on implausible or unstated assumptions. As a consequence, research on policing has produced contradictory or outright misleading results, slowing knowledge accumulation. Drawing on formal statistical frameworks for causal inference, I demonstrate the importance of measuring and accounting for the underlying causal process (the long chain of events from officer deployment to contact, detainment, and violence) in order to avoid underestimating bias and to draw reliable conclusions. To address these problems in a generalizable way, I present new work on automated sharp causal bounding that allows researchers to draw rigorous conclusions that are robust to essentially any obstacle that can arise in observational or experimental analyses with discrete data. The technique is illustrated with numerous examples, including unobserved confounding, selection, measurement error, and missing data.