Does higher density cause lower birth rates?
Assessing one recent claim that it does
Works in Progress has a new issue! Check out our articles on:
How Western societies conquered drunk driving, first through deterrence, and then by changing social norms;
New York’s long road to implementing London-style congestion pricing;
How Britain experienced housing ‘shrinkflation’ only otherwise seen in the Soviet Union.
Notes on Progress are pieces that are a bit too short to run on Works in Progress. In this piece, Sam Watling punctures some of the faulty evidence around density and the baby bust.
There is a long-running debate over whether dense urban areas, and dense areas in general, lead to lower birth rates. A recent paper by Amanda Rotella, Michael Varnum, Oliver Sng, and Igor Grossmann has attempted to investigate this question by testing whether countries with higher population density have lower fertility.
The method they use is, at its core, fairly straightforward. The authors use an extensive dataset of estimated fertility rates in 174 countries from 1950 to 2019, which comes from the UN World Population Prospects. Then they use a regression to estimate the effect of differences in population density on fertility. They find that there is a negative relationship between fertility and density – in fact, they find that with their preferred set of controls, they can explain nearly a third of the overall difference in fertility rates between countries from 1950 to 2017 with population density. They also find the same result examining fertility and density trends within countries, where above average changes in density are associated with greater decreases in fertility.
It is, however, important to note what the source of variation in population density is. They are not testing whether denser sub-units (for example, cities) of a given country have a lower fertility than less dense sub-units (rural counties). Instead they look at, first, whether countries with a higher overall population density have lower fertility than less dense countries and, second, whether a given country’s fertility decreases as its population density increases.
Is this convincing? Not really. What you want is the isolated effect of a change in population density on fertility. But there are factors that change both population density and fertility. When you regress fertility on density, you are also measuring the impact of these other factors.
For example, the conventional pattern of demographic transition is that initially both fertility and death rates are high, effectively canceling each other out and making population growth relatively low. As a country gets richer, its death rate falls, and the population increases. At some point after this, the fertility rate also falls, for reasons we are trying to understand.
But what does this mean for this study? Countries at the end of the demographic transition, because their populations have increased, have a higher population density and, for one reason or another, a lower fertility rate. Even if population density had no effect on fertility, we would still expect to see a negative correlation between fertility and population density, simply because the population has increased. In other words, an estimate with raw data would be biased.[1]
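To make the bias concrete, here is a minimal simulation sketch in Python. Everything in it is an assumption made for illustration: the `stage` variable standing in for progress through the demographic transition, the coefficients, and the noise levels are invented, not taken from the paper or from the UN data. Density has no effect on fertility in this toy world, yet the raw correlation comes out strongly negative, and it largely disappears once the transition stage is held fixed.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def residuals(ys, xs):
    """What is left of ys after removing a straight-line fit on xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return [(y - my) - slope * (x - mx) for x, y in zip(xs, ys)]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
stage, density, fertility = [], [], []
for _ in range(2000):  # 2,000 hypothetical country-years
    s = rng.random()  # 0 = before the transition, 1 = after it
    stage.append(s)
    # Population, and so density, rises as death rates fall.
    density.append(50 + 150 * s + rng.gauss(0, 20))
    # Fertility falls with the transition stage, never with density.
    fertility.append(6 - 4 * s + rng.gauss(0, 0.5))

raw_r = pearson(density, fertility)
adjusted_r = pearson(residuals(density, stage), residuals(fertility, stage))
print(f"raw correlation: {raw_r:.2f}")
print(f"correlation holding stage fixed: {adjusted_r:.2f}")
```

In real data nobody observes a clean `stage` variable, which is why the choice and justification of controls matters so much.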
Of course, the authors are aware of this and try to use controls. In particular they try to control for factors such as development, culture and environment.
The problem here is that they don’t justify their choice of controls. Instead they run a ‘kitchen sink regression’: essentially, analyzing the relationship between two variables after adjusting for a large number of control variables, without a specific justification for each one.
This is not good practice because it can lead to ‘overadjusting’ for variables in the same way that one can ‘underadjust’ for other variables. For instance, higher population densities can boost economic development, which then influences fertility rates. By controlling for GDP per capita, we risk adjusting away the pathways through which population density could affect fertility. There could also be other plausible explanations for how the other variables are related to each other. Without explaining what our assumptions are, or justifying why we’ve controlled for some variables, the final result can become uninterpretable.
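A small sketch shows what overadjusting looks like in practice. It assumes, purely for illustration, a world in which density affects fertility only by raising GDP; the coefficients (0.8 and -1.0) and noise levels are invented. Controlling for GDP is done by residualizing both variables on it, which gives the same coefficient as a regression with GDP included as a control (the Frisch-Waugh-Lovell result).

```python
import random


def slope(xs, ys):
    """OLS slope from regressing ys on xs (with an intercept)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    return (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
            / sum((x - mx) ** 2 for x in xs))


def residuals(ys, xs):
    """What is left of ys after removing a straight-line fit on xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = slope(xs, ys)
    return [(y - my) - b * (x - mx) for x, y in zip(xs, ys)]


rng = random.Random(1)  # fixed seed so the sketch is reproducible
density, gdp, fertility = [], [], []
for _ in range(5000):  # hypothetical observations, standardized units
    d = rng.gauss(0, 1)
    g = 0.8 * d + rng.gauss(0, 0.6)   # density raises GDP (assumed)
    f = -1.0 * g + rng.gauss(0, 0.5)  # GDP lowers fertility (assumed)
    density.append(d)
    gdp.append(g)
    fertility.append(f)

# Without the GDP control we recover the total effect, about -0.8.
total_effect = slope(density, fertility)
# With the GDP control the only pathway is adjusted away, leaving about 0.
adjusted_effect = slope(residuals(density, gdp), residuals(fertility, gdp))
print(f"effect of density, no controls: {total_effect:.2f}")
print(f"effect of density, controlling for GDP: {adjusted_effect:.2f}")
```

In this toy world density really does lower fertility, but the regression with the GDP control would report that it does nothing. Whether GDP should be controlled for depends entirely on the causal story, which is why the story has to be stated.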
Another problem is overfitting: including many control variables increases the chance of capturing random noise, because variables can correlate by chance rather than through true causal relationships. The regression model learns these spurious correlations, so it fits the noisy dataset it was estimated on well but would perform badly on data outside it.
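The following sketch illustrates overfitting with deliberately extreme, made-up numbers: 40 observations, 30 control variables, and an outcome that is pure noise unrelated to any of them. The sample sizes and the plain normal-equations solver are choices made for this illustration, not anything the paper does.

```python
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Partial pivoting for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_ols(rows, ys):
    """Least-squares coefficients from the normal equations."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
           for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(k)]
    return solve(xtx, xty)


def r_squared(rows, ys, beta):
    """Share of variance in ys explained by the fitted coefficients."""
    mean_y = sum(ys) / len(ys)
    sse = sum((y - sum(b * x for b, x in zip(beta, r))) ** 2
              for r, y in zip(rows, ys))
    sst = sum((y - mean_y) ** 2 for y in ys)
    return 1 - sse / sst


def make_data(rng, n, k):
    """n rows of an intercept plus k noise controls, and a noise outcome."""
    rows = [[1.0] + [rng.gauss(0, 1) for _ in range(k)] for _ in range(n)]
    ys = [rng.gauss(0, 1) for _ in range(n)]
    return rows, ys


rng = random.Random(0)  # fixed seed so the sketch is reproducible
train_rows, train_y = make_data(rng, 40, 30)
test_rows, test_y = make_data(rng, 1000, 30)

beta = fit_ols(train_rows, train_y)
r2_in = r_squared(train_rows, train_y, beta)
r2_out = r_squared(test_rows, test_y, beta)
print(f"R-squared on the data used for fitting: {r2_in:.2f}")
print(f"R-squared on fresh data: {r2_out:.2f}")
```

The controls appear to explain a large share of the variation in the data used for fitting, even though there is nothing to explain, and the same coefficients do worse than simply guessing the mean on fresh data.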
However, let’s be extremely generous and take their results at face value. Recall that their measure of population density is just the number of people in the country divided by its overall size, rather than the density at which people live. The problem is that this is not population density as most people would understand or experience it.
By this definition Egypt has a population density of 113 people per square kilometer, much less than half the UK’s. But most of Egypt is a completely empty desert. Nearly all of the population lives in places with over 1,000 people per square kilometer and the capital, Cairo, is notoriously dense (the picture above probably shows an area with over 100,000 people per square kilometer). By contrast the Netherlands has an official population density nearly four times higher, but almost the entire country is built up, meaning that most Dutch live at much lower density than most Egyptians. But the authors’ work implies the exact opposite.
There is a measure of population density that attempts to remove empty space from the denominator, thus tracking how dense an area actually feels for residents: it’s called population-weighted density. Population-weighted density weights each subunit’s contribution to the country’s overall density by that subunit’s population. This gets much closer to what the authors are looking for, and to our intuitive idea of what density means in these contexts.
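As a worked illustration, the two measures can be computed side by side. The two countries below are hypothetical and their subunit populations and areas are invented round numbers, loosely in the spirit of the Egypt and Netherlands comparison rather than real statistics.

```python
def raw_density(units):
    """Total population divided by total area (the study's measure)."""
    total_pop = sum(pop for pop, _ in units)
    total_area = sum(area for _, area in units)
    return total_pop / total_area


def population_weighted_density(units):
    """Average of subunit densities, weighted by each subunit's population."""
    total_pop = sum(pop for pop, _ in units)
    return sum((pop / total_pop) * (pop / area) for pop, area in units)


# Each subunit is (population, area in square kilometers). Invented numbers.
mostly_empty = [(950_000, 500), (50_000, 9_500)]            # crowded strip plus desert
evenly_settled = [(2_000_000, 5_000), (2_000_000, 5_000)]   # built up throughout

for name, units in [("mostly empty", mostly_empty),
                    ("evenly settled", evenly_settled)]:
    print(name,
          round(raw_density(units)),
          round(population_weighted_density(units)))
```

On the raw measure the evenly settled country is four times denser (400 against 100 people per square kilometer). On the population-weighted measure the ranking flips: the typical resident of the mostly empty country lives at about 1,805 people per square kilometer, against 400 in the evenly settled one.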
Given that the borders of very few countries in the world, especially those with large populations, have changed at all during the study period, changes in raw population density are really just a measure of population. You could move everyone in a given country into a Corbusian megacity of towers, or into isolated hamlets and farmhouses, and that country’s resulting density would be identical according to the study’s measure.
But there is an even bigger problem. In any regression analysis you are simply measuring the extent to which one metric (here, population density) changes along with another metric (here, fertility). However, there is nothing in this analysis, beyond the assumptions of the authors, to show that this variation is due to changes in population density changing the fertility rate. The causal relationship could be the other way around, where changes in the fertility rate change the population density. This is a problem known as reverse causality.
In this example, the main driver of population changes in the study’s sample, i.e. of population density changes, is fertility, since the areas of almost all the countries stayed the same. This means the highest fertility countries saw the highest population density growth. But they also had the furthest to fall! If you were already at a fertility rate of 2 per woman then converging to the developed world norm of 1.5 in the 2020s was a much smaller fall than if you had a fertility rate of 3 or 4 or more per woman. This reverse causality does not just bias the result. It biases it in a way that cannot be controlled for or measured, as the two variables that provide the bias are already in the equation.
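This mechanism is easy to reproduce in a toy simulation. The sketch below assumes fixed land areas, starting fertility rates spread between 2 and 7, convergence to roughly 1.5, and population growth that depends only on fertility; all of these numbers, including the 0.3 growth coefficient and the 2.1 replacement rate, are illustrative assumptions. Density never enters the fertility equation.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


rng = random.Random(2)  # fixed seed so the sketch is reproducible
density_growth, fertility_change = [], []
for _ in range(2000):  # hypothetical countries
    start = rng.uniform(2.0, 7.0)    # fertility at the start of the period
    end = 1.5 + rng.gauss(0, 0.2)    # everyone converges to about 1.5
    average = (start + end) / 2
    # Land area is fixed, so density growth is population growth, which
    # here is driven by how far fertility sits above replacement.
    density_growth.append(0.3 * (average - 2.1) + rng.gauss(0, 0.1))
    # The fall in fertility depends only on where the country started.
    fertility_change.append(end - start)

r = pearson(density_growth, fertility_change)
print(f"correlation of density growth with fertility change: {r:.2f}")
```

The countries whose density grew fastest show the largest falls in fertility, even though the only causal arrow in the simulation runs from fertility to density.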
A better approach would be to lay out clearly their model of what is going on and how each variable in their regression is (hypothetically) related to the others, decide on that basis what should and shouldn’t be controlled for, and use that to make a statistical estimate.[2] This would be a far better way to estimate the direct effect of population density on fertility, and avoid the use of irrelevant, and potentially misleading, controls.
Another approach would be to look at the impact of a policy change, or some exogenous change related to population density specifically, on fertility rates in one country relative to similar countries. Alternatively they might find some change that affected population density directly and didn’t affect fertility rates in any other way aside from density.
It’s still possible that density reduces birth rates, but this paper doesn’t just fail to provide good evidence, it doesn’t provide any evidence at all.
[1] They include adjustments for time trends within countries, but that only corrects for this bias if the decrease in fertility is linear (constant over time). This is contrary to the predictions of the conventional demographic transition models and the experience of most countries, where fertility falls rapidly from a stable high-fertility equilibrium and stabilises at a low-fertility equilibrium in a relatively short space of time.
[2] Note that this is separate from their use of ‘life history theory’, which states that density ‘increases competition for resources’, to predict a negative correlation between density and fertility, as it does not specify the appropriate controls or the shape of the time trends.