The Problem of False Positives in Automated Census Linking: Evidence from Nineteenth-Century New York's Irish Immigrants
568/2021 Tyler Anbinder, Dylan Connor, Cormac Ó Gráda and Simone Wegge
Automated census linkage algorithms have become popular for generating longitudinal data on social mobility, especially for immigrants and their children. But what if these algorithms are particularly bad at tracking immigrants? Using nineteenth-century Irish immigrants as a test case, we examine the most popular of these algorithms: the one created by Abramitzky, Boustan, Eriksson (ABE), and their collaborators. Our findings raise serious questions about the quality of automated census links. False positives account for about one-third to one-half of all links, depending on the ABE variant used. These bad links lead to sizeable estimation errors when measuring Irish immigrant social mobility.