Migrant self-selection: Anthropometric evidence from the mass migration of Italians to the United States, 1907–1925
Introduction
[A]lthough drawn from classes low in the economic scale, the new immigrants as a rule are the strongest, the most enterprising, and the best of their class …
(The Dillingham Commission, US Congress, 1911, vol. 1, p. 24)
The recent debate on immigration policy in the United States has included policy proposals that would reduce the size of migratory flows and “direct immigrant selection towards merit and skill” (White House, 2018). A recurring claim justifying such measures is that current immigration is negatively selected: “When Mexico sends its people, they're not sending their best” (Trump, 2015). Perceived negative selection of migrants has been the basis for calls for restrictive immigration policy in the United States at least since the Age of Mass Migration. In particular, the “new immigrants” to the United States, who arrived in growing numbers from southern and eastern Europe beginning in the 1880s, were viewed by contemporary advocates of immigration restriction as representing the poor, incapable, uneducated, and unskilled elements of their home countries; that is, it was argued that they were negatively selected from within their populations of origin. Modern and historical debates over immigration policy thus focus on a fundamental question of the economics of migration: who migrates?
In this paper, we answer this question in the context of the Italian immigration to the United States during the Age of Mass Migration, an episode that is particularly well suited to studying the composition of migration. Between 1892 and 1925, over 17 million Europeans immigrated to the United States. The largest single national group among them comprised roughly 4 million Italians (Ferenczi and Wilcox, 1929, Table III), who represent what is perhaps the largest free and almost entirely unrestricted movement of population in history. As a result of the preservation, transcription, and recent release of the manifests of passengers arriving at Ellis Island, this migration is extremely well documented and provides one of the largest, most complete, and most informative migration data sets ever generated. In addition to the availability in this source of information on migrants' heights and a highly accurate registration of Italian migrants' last place of residence, there exist detailed full-population data on the distribution of male heights in Italy, disaggregated by birth cohort and province. This enables a direct comparison of Italian migrants to their populations of origin.
In particular, we proxy migrants' “quality” by their height, and quantify the selection of Italian immigration to the United States by comparing migrants' heights to the height distributions of their populations of origin. The premise of our empirical strategy is that, the taller were migrants relative to their populations of origin, the more positive, on average, was their selection into migration on the basis of characteristics, such as occupational skill, education, income, wealth, health, and cognitive ability, that are important to policy makers and economists. This approach is grounded in a large body of research that has established that the average stature of a large group is indicative of such characteristics (Case and Paxson, 2008; Deaton, 2007; Floud et al., 2011; Steckel, 1995).
We construct a data set consisting of the stature, place of origin, and additional personal information of Italian passengers indexed in the complete Ellis Island arrival records database. First, we geo-located the last place of residence of 3.2 million of the 4.8 million Italian passengers appearing in the Ellis Island data to determine each migrant's province of origin within Italy. Next, we randomly sampled about 88,000 Italian passengers arriving at Ellis Island between 1907 (when migrants' stature was first recorded) and 1925, and transcribed their stature and other personal information that had not yet been digitized. We then linked male migrants to the distributions of stature of their province-cohorts, based on military records that provide nearly universal coverage during the period (A'Hearn et al., 2009; A'Hearn and Vecchi, 2011). Because all Italian males were required by the military to present themselves for physical examination, including measurement of height, this data source avoids the common problem that military data may not be representative of the population of interest (Bodenhorn et al., 2017; Zimran, 2018).
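The linkage step described above can be sketched roughly as follows. This is a minimal illustration only: the province names, heights, field names, and the helper `link_to_province_cohort` are hypothetical and are not taken from the paper's data or code. Each migrant is matched to a reference table keyed by province and birth year, and records that cannot be matched (for example, because geo-location failed) are set aside rather than dropped silently.

```python
from statistics import mean

# Hypothetical reference table: (province, birth_year) -> mean height in cm.
# All values are invented for illustration, not drawn from the paper.
PROVINCE_COHORT_MEAN_CM = {
    ("Palermo", 1890): 161.0,
    ("Torino", 1890): 165.0,
}

# Hypothetical migrant records as they might look after transcription.
migrants = [
    {"province": "Palermo", "birth_year": 1890, "height_cm": 163.0},
    {"province": "Palermo", "birth_year": 1890, "height_cm": 162.0},
    {"province": "Torino", "birth_year": 1890, "height_cm": 165.5},
    {"province": "Unknown", "birth_year": 1890, "height_cm": 164.0},  # not geo-located
]


def link_to_province_cohort(records, reference):
    """Attach each migrant's height deviation from the province-cohort mean.

    Records whose (province, birth_year) key is absent from the reference
    table are returned separately so that linkage failures stay visible.
    """
    linked, unlinked = [], []
    for rec in records:
        key = (rec["province"], rec["birth_year"])
        if key in reference:
            deviation = rec["height_cm"] - reference[key]
            linked.append({**rec, "deviation_cm": deviation})
        else:
            unlinked.append(rec)
    return linked, unlinked


linked, unlinked = link_to_province_cohort(migrants, PROVINCE_COHORT_MEAN_CM)

# Average height advantage of linked migrants over their own province-cohort.
mean_local_deviation = mean(r["deviation_cm"] for r in linked)
```

Keeping the unlinked records in a separate list makes it straightforward to check whether linkage failures are systematic, which is one of the threats to validity the paper discusses.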
At the core of our findings are three results. Result 1 is that Italians arriving at Ellis Island were shorter, on average, than all Italians of the same birth cohort. This result is the product of over-representation in the migratory flow to the United States of migrants from southern Italy, where average heights were relatively low. Result 2 is that, when compared only with their province-cohorts, Italian passengers were, on average, taller. That is, narrowing the reference group from national to local reverses the sign of selection.
Finally, according to result 3, local selection from Italy varied systematically across regions and provinces. Passengers from southern Italy were considerably more positively selected than northerners relative to their province-cohorts of origin. Similarly, throughout the country and within both north and south, shorter province-cohorts tended to be the sources of more positively selected migrants. As a result, groups of migrants that appeared to be negatively selected in a national comparison were in fact positively selected relative to their local peers.
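The sign reversal between national and local selection is a composition effect, and a small numerical sketch may make it concrete. The two-region figures below are invented purely for illustration (they are not estimates from the paper), and the variable names are assumptions. The example splits the national comparison into a local-selection term and a term that reflects only where migrants come from.

```python
# Hypothetical group-level figures (heights in cm), invented for illustration.
# share_pop: region's share of the home population
# share_mig: region's share of the migrant flow
regions = {
    "south": {"share_pop": 0.4, "share_mig": 0.8, "pop_mean": 161.0, "mig_mean": 162.0},
    "north": {"share_pop": 0.6, "share_mig": 0.2, "pop_mean": 165.0, "mig_mean": 165.0},
}

# Mean height of the whole home population and of all migrants.
national_pop_mean = sum(r["share_pop"] * r["pop_mean"] for r in regions.values())
migrant_mean = sum(r["share_mig"] * r["mig_mean"] for r in regions.values())

# National selection: migrants compared with the country as a whole.
national_selection = migrant_mean - national_pop_mean

# Local selection: migrants compared with their own region, weighted by the
# regional composition of the migrant flow.
local_selection = sum(
    r["share_mig"] * (r["mig_mean"] - r["pop_mean"]) for r in regions.values()
)

# Composition term: effect of migrants being drawn disproportionately from
# shorter (or taller) regions, holding within-region selection at zero.
composition = sum(
    (r["share_mig"] - r["share_pop"]) * r["pop_mean"] for r in regions.values()
)

print(f"national selection: {national_selection:+.2f} cm")
print(f"local selection:    {local_selection:+.2f} cm")
print(f"composition term:   {composition:+.2f} cm")
```

In this toy setting migrants are shorter than the national average yet taller than their regional peers, because the over-representation of the shorter region outweighs the positive selection within it. National selection equals the sum of the local and composition terms by construction.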
The magnitude of the selection that we document and of its variation across province-cohorts is large. For example, the height advantage of migrants from south Italy over their province-cohort averages was 40 percent of the height premium that we measure for literacy. Selection among the bottom quartile of provinces (as measured by average height) was stronger than in the top quartile by 140 percent of the height premium for literacy, which is equivalent to about 90 percent of the modern height premium for professional and managerial workers in the UK over manual workers, and about 54 percent of the white collar height premium over blue collar workers in the United States (Case and Paxson, 2008).
Result 3 supports theories that highlight the importance of liquidity constraints in generating positively selected migration by placing a greater barrier to the migration of the relatively disadvantaged (e.g., Angelucci, 2015; Belot and Hatton, 2012; Chiquiar and Hanson, 2005; McKenzie and Rapoport, 2010; Orrenius and Zavodny, 2005). According to these theories, the share of individuals who are financially constrained and thus unable to afford migration is greater in poorer provinces. As a result, those who are observed migrating from these provinces are a more positively selected group, disproportionately drawn from the upper ranks of the local income distribution. Further evidence is consistent with an important modification to these theories, holding that networks can mitigate the influence of liquidity constraints; that is, individuals with close links to previous migrants can overcome their liquidity constraint by relying on support from their friends and relatives who have already migrated, thus countering the under-representation of the lower ranks. Using an individual measure of connectedness—whether a migrant was joining a first-degree relative already living in the United States (as opposed to joining a friend, a more distant relative, etc., or no one at all)—we show that stronger personal links to previous migrants were associated with more negatively selected migrants. This result adds to evidence from studies showing that a greater stock of past migration leads to more negative migrant selection (Beine et al., 2011; Fernández-Huertas Moraga, 2013; McKenzie and Rapoport, 2010), while using a personal rather than a local measure of connectedness.
Extending the analysis, we provide what is, to our knowledge, the first systematic analysis of the changes in migrant selection after the imposition of the literacy requirement by the Immigration Act of 1917. We find that this requirement was associated with a general increase in positive selection throughout Italy. The largest increases in positive selection occurred in the least literate provinces (typically in the south), as would be expected if the literacy requirement was indeed binding and if height and literacy were related. In principle, results 2 and 3 could have been driven by this shift alone without applying to the period of unrestricted migration prior to 1917. In part this was true: during the period 1907–1916 alone, the average local selection was effectively zero, meaning that result 2 did not hold for Italy as a whole. However, among the pre-1917 migrants from the south, local selection was already positive. Moreover, the systematic variation in selection across provinces (result 3) also existed prior to 1917. These patterns were simply strengthened after imposition of the new restriction in 1917.
Finally, we discuss a number of threats to the validity of our results and evaluate their robustness. These threats include systematic failures in geo-location of the last place of residence and non-classical measurement error caused by random errors in geo-location. Based in part on an alternative geo-location procedure that uses passengers’ surnames, we show that these factors are unlikely to drive our results.
Beyond improving our understanding of the Italian migration during the Age of Mass Migration, the main contribution of this paper is to the recent and emerging literature that quantifies migrant selection and identifies its determinants. This literature has largely focused on modern migration from Mexico to the United States (Chiquiar and Hanson, 2005; Feliciano, 2005; Fernández-Huertas Moraga, 2011, 2013; Ibarraran and Lubotsky, 2007; Kaestner and Malamud, 2014; McKenzie and Rapoport, 2010; Mishra, 2007; Orrenius and Zavodny, 2005), while some other studies deal with variation in selection across source countries (Docquier and Marfouk, 2006; Feliciano, 2005; Grogger and Hanson, 2011). Recent improvements in the availability of historical data and in record linkage methods have enabled the circumvention of fundamental data limitations in such studies of contemporary migration (see critique by Fernández-Huertas Moraga, 2011) by using data from the Age of Mass Migration (e.g., Abramitzky and Boustan, 2017; Abramitzky et al., 2012, 2013, 2014; Connor, 2016; Kosack and Ward, 2014). Drawing on the advantages of the historical data on the Italian immigration, our contribution to the literature on migrant selection is three-fold.
First, we present a study of migrant selection based on data of unusual clarity and completeness. The data and the use of stature as our measure of migrant quality satisfy three criteria that we identify as essential to clean measurement of migrant selection: the two sources of height data are representative of the migrating population and the population at risk for migration, measuring quality for each group with minimal scope for selection biases; adult stature is unaffected by migration or by decisions made in expectation of migration; and the period that we cover was almost entirely free of restrictions on immigration from Europe, making it possible to learn the supply of migrants with little contamination by policy that differentially favors migrants of different quality. The use of stature as our measure of migrant quality has the additional advantage of revealing variation in a setting in which conventional measures of migrant quality, such as occupation and literacy, are extremely coarse or imprecise.
Second, our findings on the imposition of the literacy requirement and on the role of network connections offer a significant contribution to the understanding of the effectiveness of migrant screening policies, on which little is known (Borjas, 2014, p. 215). Screening migrants by one coarse measure of quality (literacy) was indeed associated with strong improvements in the selection of immigrants in terms of another proxy measure of quality (height). However, using a uniform measure of screening did not necessarily lead to a uniform improvement in selection, at least in terms of height. The screening was much more binding and effective in the south, which suggests that such one-size-fits-all criteria yield very different results in different contexts and that attention to local circumstances is crucial for determining the likely impacts of migration policy. Similarly, the finding that strong personal relations are negatively correlated with selection is informative in the context of the debate on restrictions on family-based migration.
Our third contribution is to distinguish between selection from a country as a whole and selection from within local, sub-national environments. In most studies, migrant selection is measured relative to the national distribution of some quality measure in the country of origin. Selection within local environments and the variation of such local selection across regions are rarely observed or noticed by policy makers and economists, in many cases due to data limitations. As a result of the fine disaggregation of the data on the Italian population and the ability to link migrants to their provinces of origin, we show that the two levels of selection can be, as in the case of Italy, qualitatively different, and that there was considerable and systematic variation in the degree of local selection across Italy. Moreover, the strongest positive selection relative to local peers came from the poorest regions of the country, among the group of migrants that a national comparison would have portrayed as being negatively selected.
These qualitative differences between local and national selection suggest that comparisons of migrants to their national-level populations of origin alone may disregard potentially important information given by the degree of local selection. From the perspective of the provinces of origin, it is clear that positively and negatively selected emigration will have different impacts; but local selection can also be informative in immigration screening, even conditional on observing national selection. Intuitively, if south Italian migrants were compared only to their nationwide reference group, they would have been judged to be of low quality; but the fact that they stood out relative to their populations of origin could indicate that they were better endowed with characteristics that might determine better outcomes in the receiving economy. We develop a simple theoretical framework to formalize this intuition and derive a reasonable condition under which local selection is positively correlated with expected outcomes, conditional on the national selection.
This lesson is particularly important when considering immigration from large and diverse countries of origin, such as Mexico, China, and India (currently the top three sources of immigrants to the United States): failure to take local selection into consideration has the potential to distort immigrant screening policies. As attention to migrant selection grows and calls intensify for screening policies, such as point systems, that favor migrants of higher absolute observed quality measures, it must be kept in mind that absolute measures may not tell the whole story of migrant selection and that such policies may prove to be a crude and unnecessarily strong filter. The greatest gains might come from those among whom it is least expected.
Stature and migrant selection
While individual height is overwhelmingly determined by idiosyncratic genetic factors, it is well established that, when comparing large populations, these factors average out and that average adult stature reflects standards of living, health, and physical well being during childhood and adolescence (Eveleth and Tanner, 1976; Floud et al., 2011; Frisancho, 1993; Silventoinen, 2003; Steckel, 1995). Importantly for the present context, stature is informative regarding individual characteristics
Data sources
Our information on the stature and other personal characteristics of migrants is taken from the Ellis Island arrival records database. This source contains the records of nearly all passengers who passed through the Port of New York from 1897 to 1924 (and January 1925), comprising the overwhelming majority of Italian passengers entering the
Results
Let $F_t$ denote the stature distribution at the national level for birth cohort $t$ and let its mean and variance be $\mu_t$ and $\sigma_t^2$.
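As a hedged illustration of how these objects could be turned into selection measures, consider the following sketch. Only $F_t$ and $\mu_t$ come from the text; $h_i$, $p(i)$, $t(i)$, $\mu_{pt}$, and the two $S$ statistics are notation assumed here and are not necessarily the paper's own. Let $h_i$ be the height of migrant $i$, born in cohort $t(i)$ in province $p(i)$, and let $\mu_{pt}$ be the mean height of province-cohort $(p, t)$. Average national and local selection among $N$ migrants could then be written as

$$
S^{\mathrm{nat}} = \frac{1}{N} \sum_{i=1}^{N} \left( h_i - \mu_{t(i)} \right),
\qquad
S^{\mathrm{loc}} = \frac{1}{N} \sum_{i=1}^{N} \left( h_i - \mu_{p(i)\,t(i)} \right).
$$

Under this notation the gap between the two measures depends only on where migrants come from:

$$
S^{\mathrm{nat}} - S^{\mathrm{loc}} = \frac{1}{N} \sum_{i=1}^{N} \left( \mu_{p(i)\,t(i)} - \mu_{t(i)} \right),
$$

which is negative when migrants are drawn disproportionately from province-cohorts that are shorter than the national average. In that case $S^{\mathrm{nat}}$ can be negative even though $S^{\mathrm{loc}}$ is positive.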
The 1917 literacy requirement
After a quarter century of attempts to pass such legislation, the 1917 Immigration Act imposed, for the first time, significant restrictions on European immigration to the United States (Daniels, 2004; Goldin, 1994; Hing, 2004; Zolberg, 2006). The law banned the entry of passengers over the age of 16 who were unable to prove basic literacy in any language. Shortly thereafter, the 1921 Emergency Quota Act severely limited Italian immigration by setting the national yearly quota for Italy at less
Systematic upward bias
There are two possible sources of systematic upward bias in the heights reported on the passenger manifests: measurement with shoes, and upward-biased self-reporting. Unfortunately, we were unable to find documentation of how precisely the height data in the passenger manifests were gathered.
Conclusions
The finely disaggregated data that we use in this paper enable the investigation of selection at the local level. They reveal that the seemingly disadvantaged southern Italian immigrants were, indeed, “the best of their class,” thus changing the interpretation of the Italian migration for the receiving and the sending economies. They suggest that southern Italy experienced a human capital drain, one that may have contributed to the contemporaneously widening north-south divide within Italy.
Notes
This is a revised version of chapter 4 of Zimran's dissertation. A previous version of this paper was titled “Self-Selection of Immigrants on the Basis of Living Standards: Evidence from the Stature of Italian Immigrants at Ellis Island, 1907–1925.”
Acknowledgements
We are indebted to Joel Mokyr, Joseph Ferrie, Igal Hendel, and Matthew Notowidigdo for encouragement and guidance, and to Andrew Foster (the editor) and anonymous referees for detailed comments. For providing data, we are grateful to Peg Zitko and the Statue of Liberty-Ellis Island Foundation, Brian A'Hearn, Franco Peracchi, Giovanni Vecchi, and Jordi Martí-Henneberg. We also thank Ran Abramitzky, William Collins, Timothy Hatton, Richard Hornbeck, Taylor Jaworski, Andrea Matranga, Marian Smith,
References
- Have the poor always been less likely to migrate? Evidence from inheritance practices during the age of mass migration. J. Dev. Econ. (2013)
- The making of modern America: migratory flows in the age of mass migration. J. Dev. Econ. (2013)
- Diasporas. J. Dev. Econ. (2011)
- Can selective immigration policies reduce migrants' quality? J. Dev. Econ. (2016)
- Making sense of the labor market height premium: evidence from the British household panel survey. Econ. Lett. (2009)
- Height and BMI of Italian immigrants to the USA, 1908-1970. Econ. Hum. Biol. (2005)
- Understanding different migrant selection patterns in rural and urban Mexico. J. Dev. Econ. (2013)
- Income maximization and the selection and sorting of international migrants. J. Dev. Econ. (2011)
- Was Dick Whittington taller than those he left behind? Anthropometric measures, migration and the quality of life in early nineteenth century London. Explor. Econ. Hist. (2009)
- Immigrant quotas and immigrant selection. Explor. Econ. Hist. (2016)
- Network effects and the dynamics of migration and inequality: theory and evidence from Mexico. J. Dev. Econ.
- Emigration and wages in source countries: evidence from Mexico. J. Dev. Econ.
- Self-selection among undocumented immigrants from Mexico. J. Dev. Econ.
- Selection in migration and return migration: evidence from micro data. Econ. Lett.
- Brain drain in the age of mass migration: does relative inequality explain migrant selectivity? Explor. Econ. Hist.
- Immigration in American economic history. J. Econ. Lit.
- Europe's tired, poor, huddled masses: self-selection and economic outcomes in the age of mass migration. Am. Econ. Rev.
- A nation of immigrants: assimilation and economic outcomes in the age of mass migration. J. Polit. Econ.
- Height and the normal distribution: evidence from Italian military data. Demography
- Statura
- Who Leaves? Deciphering immigrant self-selection from a developing country. Econ. Dev. Cult. Change
- Migration and financial constraints: evidence from Mexico. Rev. Econ. Stat.
- Immigrant selection and short-term labor market outcomes by visa category. J. Popul. Econ.
- Castle Garden Database
- Encountering Ellis Island: How European Immigrants Entered America
- The ecology of height: the effect of microbial transmission on human height. Perspect. Biol. Med.
- Immigration policy, self-selection, and the quality of immigrants. Rev. Int. Econ.
- Immigrant selection in the OECD. Scand. J. Econ.
- Skilled and unskilled wage differentials and economic integration, 1870–1930. Eur. Rev. Econ. Hist.
- Escaping Europe: health and human capital of holocaust refugees. Eur. Rev. Econ. Hist.
- Sample-selection biases and the industrialization puzzle. J. Econ. Hist.
- Self-selection and the earnings of immigrants. Am. Econ. Rev.
- Immigration Economics
- Self-selection of emigrants: theory and evidence on stochastic dominance in observable and unobservable characteristics. Econ. J.
- Immigration Laws and Regulations of July 1, 1907
- Stature and status: height, ability, and labor market outcomes. J. Polit. Econ.
- International migration, self-selection and the distribution of wages: evidence from Mexico and the United States. J. Polit. Econ.
- Italian Genealogical Records: How to Use Italian Civil, Ecclesiastical, & Other Records in Family History Research
- Annual Report of the Commissioner-general of Immigration for the Fiscal Year Ended June 30, 1903
- The Cream of the Crop? Inequality and Migrant Selectivity in Ireland during the Age of Mass Migration
- Using anthropometric indicators for Mexicans in the United States and Mexico to understand the selection of migrants and the ‘hispanic paradox’. Soc. Biol.
- Guarding the Golden Door: American Immigration Policy and Immigrants since 1882
- Height, health, and development. Proc. Natl. Acad. Sci. Unit. States Am.