On Fridays, we examine a research paper that uses (or fails to use) a clever method to perform causal inference, i.e. to tease out cause and effect.
In case you haven’t been keeping up, I’ll start by noting that Flint, Michigan is still having problems with its water supply. (Technically, residents are being told that the lead prevalence is down to acceptable levels but also that they should keep using bottled water anyway for the next 3 years until the pipes are replaced, which, uh…ok.) We know on a general level that lead consumption is bad, but it’s worth thinking about what specific problems can arise, so here’s a fun synopsis on lead poisoning. Oh, and lead is also being blamed for Legionnaires’ Disease in the area, so there’s that. We’re also learning that the water problems appear to have had significant effects on fertility and fetal death rates:
Flint changed its public water source in April 2014, increasing lead exposure. The effects of lead in water on fertility and birth outcomes are not well established. Exploiting variation in the timing of births we find fertility rates decreased by 12%, fetal death rates increased by 58% (a selection effect from a culling of the least healthy fetuses), and overall health at birth decreased (from scarring), compared to other cities in Michigan. Given recent efforts to establish a registry of residents exposed, these results suggests women who miscarried, had a stillbirth or had a newborn with health complications should register.
Whoa, if true. On principle, I’m talking about this study because I feel like this matter needs attention in order for government officials to be willing to act responsibly. As a data analysis matter, I’m talking about this paper because it illustrates an important component of causal analysis. So let’s take a look at one of the pieces of data that the authors used to reach their conclusion:
Ok, that doesn’t look great. BUT…it’s actually not enough information to conclude that the water itself is causing a problem: maybe there was something else going on more generally that resulted in decreasing fertility at the time. (In other words, maybe we’re falling for the post hoc ergo propter hoc fallacy. It’s not just the title of the second episode of The West Wing!) To do a proper analysis, we need a counterfactual: theoretically, a world where everything is the same except there is no water crisis. In practice, a counterfactual approximation is usually constructed by looking at comparable areas that didn’t undergo the “treatment,” which in this case means areas that didn’t have a water crisis.
If you looked closely at the above graph, you may have noticed that I cheated in order to fit my narrative- the graph actually looked like this:
Oh. Yeah, that doesn’t look great either for the people of Flint, but it at least looks better from a data analysis perspective. Economists refer to this sort of analysis as “differences-in-differences”- i.e. a comparison of before-after comparisons. In picture form, something like this:
| |Before|After|Difference|
|---|---|---|---|
|Treated (i.e. Flint)|A|B|B-A|
|Untreated (i.e. comparison group)|C|D|D-C|
We can then analyze whether the treatment had an effect by investigating whether the change in the treatment group over and above the change in the comparison group, (B-A) - (D-C), is different from zero. (If this difference is positive, the treatment raised the outcome, and if it is negative, the treatment lowered it; whether that is good or bad depends on whether larger outcomes are better.) In order to be more rigorous in the analysis, the next logical step would be to test whether this difference in differences is statistically significantly different from zero. To do this, economists run regressions with various interaction terms that get at this “difference in differences.”
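To make the arithmetic concrete, here is a minimal sketch in Python. The numbers are made up for illustration and are NOT from the paper, and the names (`rows`, `cell_mean`, `ols`) are my own. It computes (B-A) - (D-C) from the four cell averages, then fits a bare-bones regression of the outcome on a treated dummy, an after dummy, and their interaction to show that the interaction coefficient is the same number. It does not compute standard errors, so it says nothing about statistical significance, and it is not the authors' actual specification.

```python
from statistics import mean

# Hypothetical fertility rates (births per 1,000 women), NOT from the paper.
# Each record is (treated, post, outcome).
rows = [
    (1, 0, 62.0), (1, 0, 64.0),  # Flint, before
    (1, 1, 54.0), (1, 1, 56.0),  # Flint, after
    (0, 0, 60.0), (0, 0, 62.0),  # comparison cities, before
    (0, 1, 59.0), (0, 1, 61.0),  # comparison cities, after
]


def cell_mean(treated, post):
    """Average outcome for one cell of the 2x2 table."""
    return mean(y for t, p, y in rows if t == treated and p == post)


A = cell_mean(1, 0)  # treated, before
B = cell_mean(1, 1)  # treated, after
C = cell_mean(0, 0)  # untreated, before
D = cell_mean(0, 1)  # untreated, after

did = (B - A) - (D - C)


def ols(X, y):
    """Least-squares coefficients via the normal equations (X'X)b = X'y,
    solved with Gauss-Jordan elimination. Fine for a toy example only."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y]
    M = [
        [sum(r[i] * r[j] for r in X) for j in range(k)]
        + [sum(r[i] * yi for r, yi in zip(X, y))]
        for i in range(k)
    ]
    for c in range(k):
        # Partial pivoting: bring the largest remaining entry to the diagonal
        pivot_row = max(range(c, k), key=lambda r: abs(M[r][c]))
        M[c], M[pivot_row] = M[pivot_row], M[c]
        pivot = M[c][c]
        M[c] = [v / pivot for v in M[c]]
        for r in range(k):
            if r != c:
                factor = M[r][c]
                M[r] = [vr - factor * vc for vr, vc in zip(M[r], M[c])]
    return [M[i][k] for i in range(k)]


# Columns: intercept, treated, post, treated x post (the interaction term)
X = [[1.0, t, p, t * p] for t, p, _ in rows]
y = [outcome for _, _, outcome in rows]
beta = ols(X, y)

print("difference in differences:", did)
print("interaction coefficient:  ", beta[3])
```

With only these three dummies in the regression, the interaction coefficient always matches the table calculation exactly. The reason to bother with the regression is that it lets you add control variables and get standard errors, which the simple table can't do.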
Like I’ve said before, causal analysis generally aims to be as close to the middle-school science project as possible: control group, experimental group, the only difference between the groups being the treatment, and so on. In this case, the causal interpretation of the data presented in the paper rests crucially on whether Flint really is like the comparison group in all ways other than the water supply (or at least those ways that can’t be controlled for). In addition, it’s crucial that the treatment that is being analyzed (lead in the water, in this case) is random, meaning that it doesn’t pop up in response to something about the treatment group that researchers can’t observe or quantify.
From what I’ve read about the history of the Flint water crisis, I feel pretty comfortable ruling out that latter concern. In other words, I really don’t think Flint did anything to invite the water crisis that would also affect fertility. (I guess I also don’t think Flint did anything to invite the water crisis more generally, so there’s that.) As to the former concern, the comparison group consists of the 15 largest non-Flint cities in Michigan, so they could be different from Flint in various demographic ways that are not controlled for in the picture above. That said, those differences can be controlled for in the regression analysis, which still does find a significant difference in fertility. On the other hand, it could be the case that some of the women who wanted to have children moved out of Flint to do so for precautionary reasons. We wouldn’t be able to see this easily in the data, and it would explain why fertility rates could have dropped even if the water wasn’t actually making women infertile. While this explanation is initially plausible, we would also have to believe that the women who stayed in Flint were predisposed to have much sicker babies, since there is also an observed difference in fetal death rates and health at birth between Flint and the comparison cities.
The general idea is as follows: use techniques at your disposal to perform causal analysis as best as possible from a mathematical perspective, try to come up with alternative explanations for your findings, and then try to use the data to rule them out. These last two parts can get interesting, largely because different people think of different alternative explanations. For example, the authors of the paper conjecture that people in Flint might just have decided to have less sex rather than leave the area, and they actually use data from the American Time Use Survey to argue that this is not the case, which I personally find hilarious. Overall, I’m not sure whether to feel happy that we’re doing rigorous analysis or depressed that we’re finding that the water supply has significant negative effects.