Causal Friday: Some Real Effects of the Flint Water Crisis…

On Fridays, we examine a research paper that uses (or fails to use) a clever method to perform causal inference, i.e. to tease out cause and effect.

In case you haven’t been keeping up, I’ll start by noting that Flint, Michigan is still having problems with its water supply. (Technically, residents are being told that the lead prevalence is down to acceptable levels but also that they should keep using bottled water anyway for the next 3 years until the pipes are replaced, which, uh…ok.) We know on a general level that lead consumption is bad, but it’s worth thinking about what specific problems can arise, so here’s a fun synopsis on lead poisoning. Oh, and the water problems are also being blamed for Legionnaires’ disease in the area, so there’s that. We’re also learning that the water problems appear to have had significant effects on fertility and infant mortality:

Flint changed its public water source in April 2014, increasing lead exposure. The effects of lead in water on fertility and birth outcomes are not well established. Exploiting variation in the timing of births we find fertility rates decreased by 12%, fetal death rates increased by 58% (a selection effect from a culling of the least healthy fetuses), and overall health at birth decreased (from scarring), compared to other cities in Michigan. Given recent efforts to establish a registry of residents exposed, these results suggests women who miscarried, had a stillbirth or had a newborn with health complications should register.

Woah if true. On principle, I’m talking about this study because I feel like this matter needs attention in order for government officials to be willing to act responsibly. As a data analysis matter, I’m talking about this paper because it illustrates an important component of causal analysis. So let’s take a look at one of the pieces of data that the authors used to reach their conclusion:

Ok, that doesn’t look great. BUT…it’s actually not enough information to conclude that the water itself is causing a problem- maybe there was something else going on more generally that resulted in decreasing fertility at the time. (In other words, maybe we’re falling for the post hoc ergo propter hoc fallacy– it’s not just the title of the second episode of The West Wing!) To do a proper analysis, we need a counterfactual- theoretically, a world where everything is the same except there is no water crisis. In practice, a counterfactual approximation is usually constructed by looking at comparable areas that didn’t undergo the “treatment”- in this case, didn’t have a water crisis.

If you looked closely at the above graph, you may have noticed that I cheated in order to fit my narrative- the graph actually looked like this:

Oh. Yeah, that doesn’t look great either for the people of Flint, but it at least looks better from a data analysis perspective. Economists refer to this sort of analysis as “differences-in-differences”- i.e. a comparison of before-after comparisons. In picture form, something like this:

                                     Before   After   Difference
Treated (i.e. Flint)                 A        B       B-A
Untreated (i.e. comparison group)    C        D       D-C

We can then analyze whether the treatment had an effect by investigating whether the incremental difference of the treatment group (B-A) – (D-C) is different from zero. (If the difference is positive, the treatment had a positive effect and vice versa, assuming that larger outcomes are better.) In order to be more rigorous in the analysis, the next logical step would be to test whether this difference in differences is statistically significantly different from zero. To do this, economists run regressions with various interaction terms that get at this “difference in differences.”
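To make the table concrete, here’s a minimal Python sketch. The four cell values are made-up numbers (not from the paper), and the variable names are just for illustration. It computes the difference in differences directly, then shows that the coefficient on the interaction term in a regression of the form outcome = b0 + b1*treated + b2*after + b3*treated*after is the same number.

```python
# Hypothetical cell means for the 2x2 table (illustrative, not from the paper).
A, B = 62.0, 55.0  # treated group (Flint): before, after
C, D = 60.0, 59.0  # comparison group: before, after

# Difference in differences straight from the table.
did = (B - A) - (D - C)

# With only these four cells, the regression
#   outcome = b0 + b1*treated + b2*after + b3*treated*after
# fits every cell exactly, so its coefficients can be read off the table.
b0 = C                    # comparison group, before
b1 = A - C                # baseline gap between the groups
b2 = D - C                # time trend in the comparison group
b3 = (B - A) - (D - C)    # interaction term = difference in differences


def predict(treated, after):
    """Fitted value of the regression for one cell of the table."""
    return b0 + b1 * treated + b2 * after + b3 * treated * after


# Check that the regression reproduces all four cells.
cells = {(1, 0): A, (1, 1): B, (0, 0): C, (0, 1): D}
fits = all(abs(predict(t, a) - value) < 1e-9 for (t, a), value in cells.items())

print(did, b3, fits)  # -6.0 -6.0 True
```

In this made-up example the comparison group fell by 1 and the treated group fell by 7, so the estimated treatment effect is -6. With real data you would run the regression on individual observations, which is what gives you a standard error to test against.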

Like I’ve said before, causal analysis generally aims to be as close to the middle-school science project as possible- control group, experimental group, the only difference between the groups being the treatment, and so on. In this case, the causal interpretation of the data presented in the paper rests crucially on whether Flint really is like the comparison group in all ways other than the water supply (or at least those ways that can’t be controlled for). In addition, it’s crucial that the treatment that is being analyzed (lead in the water, in this case) is random, meaning that it doesn’t pop up in response to something about the treatment group that researchers can’t observe/quantify.

From what I’ve read about the history of the Flint water crisis, I feel pretty comfortable ruling out that latter concern- in other words, I really don’t think Flint did anything to invite the water crisis that would also affect fertility. (I guess I also don’t think Flint did anything to invite the water crisis more generally, so there’s that.) As to the former concern, the comparison group is comprised of the 15 largest non-Flint cities in Michigan, so they could be different from Flint in various demographic ways that are not controlled for in the picture above. That said, those differences can be controlled for in the regression analysis, which still does find a significant difference in fertility. On the other hand, it could be the case that some of the women who wanted to have children moved out of Flint to do so for precautionary reasons- we wouldn’t be able to see this easily in the data, and it gives an explanation for why fertility rates could have dropped even if the water wasn’t actually making women infertile. While this explanation is initially plausible, we would also have to believe that the women who stayed in Flint were predisposed to have much sicker babies, since there is also an observed difference in infant mortality between Flint and the comparison cities.

The general idea is as follows: use techniques at your disposal to perform causal analysis as best as possible from a mathematical perspective, try to come up with alternative explanations for your findings, and then try to use the data to rule them out. These last two parts can get interesting, largely because different people think of different alternative explanations. For example, the authors of the paper conjecture that people in Flint might just have decided to have less sex rather than leave the area, and they actually use data from the American Time Use Survey to argue that this is not the case, which I personally find hilarious. Overall, I’m not sure whether to feel happy that we’re doing rigorous analysis or depressed that we’re finding that the water supply has significant negative effects.

Causal Friday: Fun with Gender Discrimination, Now with More Bad Econometrics…

On Fridays, we examine a research paper that uses (or fails to use) a clever method to perform causal inference, i.e. to tease out cause and effect.

Disclaimer: I’m kind of stretching the definition of both “causal analysis” and “research paper” here, but I guess you could interpret the analysis as relating to the causal impact of being female.

In case you haven’t heard, Google is the target of a class-action lawsuit based on gender discrimination. (Shocking, I know, given what we know about Silicon Valley more generally. =P) Part of the impetus for the lawsuit is an employee-led effort to collect compensation data that shows that men are paid more than women at the company:

At Google, Employee-Led Effort Finds Men Are Paid More Than Women

A spreadsheet created by employees to share salary information shows pay for women is falling short of what men make at various levels.


From a data perspective, proving discrimination can be somewhat difficult- for example, we hear the often-quoted “women make 77 cents for every dollar a man makes” statistic, but this in itself doesn’t really tell us anything about discrimination. It could instead be the case that women sort into lower-paying occupations and jobs of their own volition, choose to work fewer hours, and so on. (On the other hand, we can’t rule out the discrimination hypothesis either.)

Ideally, what one would do to look for discrimination would be to compare otherwise equivalent men and women and see whether compensation differences still exist within the matched groups. Mathematically, this is essentially what economists do when they run a regression with “control variables”- variables that suck up the differences that are accounted for by stuff other than gender.

Google employees seem to be up on their applied math, since they put together an analysis so that they could make the following statement:

Based upon its own analysis from January, Google said female employees make 99.7 cents for every dollar a man makes, accounting for factors like location, tenure, job role, level and performance.

On the surface, this seems to suggest that significant gender discrimination just doesn’t show up in the data. BUT…and this is important…this example highlights the difference between doing math and doing data analysis (or, more charitably, data science)- while this conclusion may be mathematically correct, it’s basically a “garbage in, garbage out” use of econometric tools. Simply put, if you’re trying to isolate gender discrimination, you can’t just blindly control for things that themselves are likely the result of gender discrimination! It’d be like looking at the impact of diet on health and using weight as a control variable- sure, you’d get an “all else being equal” sort of result, but it wouldn’t make sense since weight is likely a step in the chain between diet and health outcomes.

In this way, Google tipped its hand quite a bit regarding the particular nature of gender discrimination at the company- if men and women are paid the same once job title and performance reviews are taken into account, then gender discrimination (if it exists) is taking place either by herding women into jobs with different roles/levels or showing anti-female (or pro-male) bias in performance reviews. (Also, if the “levels” have set pay bands, which the article kind of suggests, doesn’t controlling for level largely amount to assuming your conclusion?)
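The bad-control problem is easy to see in a toy simulation. The sketch below is entirely hypothetical (invented numbers, not Google’s data or method): I assume discrimination works only by placing women one level lower, and that pay depends only on level. “Controlling for level” is done in the simplest possible way, by comparing men and women within the same level.

```python
import random
from statistics import fmean

random.seed(0)

# Hypothetical workforce: (is_female, level, pay). All numbers are invented.
workers = []
for _ in range(20000):
    female = random.random() < 0.5
    # Assumption: equally qualified women are placed one level lower.
    level = random.choice([3, 4, 5]) - (1 if female else 0)
    # Assumption: pay depends only on level, never directly on gender.
    pay = 50 + 10 * level + random.gauss(0, 5)
    workers.append((female, level, pay))


def gap(rows):
    """Average pay of women minus average pay of men."""
    women = [pay for female, _, pay in rows if female]
    men = [pay for female, _, pay in rows if not female]
    return fmean(women) - fmean(men)


# Raw gap: no controls at all.
raw_gap = gap(workers)

# "Controlled" gap: compare only within levels that contain both genders.
levels_f = {level for female, level, _ in workers if female}
levels_m = {level for female, level, _ in workers if not female}
shared_levels = sorted(levels_f & levels_m)
controlled_gap = fmean(
    gap([w for w in workers if w[1] == lvl]) for lvl in shared_levels
)

print(round(raw_gap, 2), round(controlled_gap, 2))
```

The raw gap comes out at roughly -10 (one level’s worth of pay), while the within-level gap is close to zero, even though the whole gap in this simulated world is the result of discrimination by construction. Controlling for the channel through which discrimination operates makes it vanish from the result.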

Turns out my suspicions are pretty on point, given the specific claim of the lawsuit:

Google ‘segregates’ women into lower-paying jobs, stifling careers, lawsuit says

Exclusive: Women say Google denied them promotions, telling the Guardian they were forced into less prestigious jobs despite qualifications


It’s amazing what you can learn from data IF you look at it properly. In a semi-previous life, I worked as an economic consultant, which basically means that I helped prepare expert testimony to be used in lawsuits involving economic matters. What I wouldn’t give to be the expert witness who gets to offer up a rebuttal to Google’s crap econometrics here.

Update: This is amazing:

In case you’re curious, the excerpt is from this book, which I highly recommend.

Causal Friday: The Most Depressing Instrument Ever, Fox News Edition…

On Fridays, we examine a research paper that uses (or fails to use) a clever method to perform causal inference, i.e. to tease out cause and effect.

Economists Gregory J. Martin and Ali Yurukoglu have a new paper published in the American Economic Review (also available in working paper form here) that shows that the existence of Fox News has a (statistically) significant impact on Republican vote share. Here’s the abstract:

We measure the persuasive effects of slanted news and tastes for like-minded news, exploiting cable channel positions as exogenous shifters of cable news viewership. Channel positions do not correlate with demographics that predict viewership and voting, nor with local satellite viewership. We estimate that Fox News increases Republican vote shares by 0.3 points among viewers induced into watching 2.5 additional minutes per week by variation in position. We then estimate a model of voters who select into watching slanted news, and whose ideologies evolve as a result. We use the model to assess the growth over time of Fox News influence, to quantitatively assess media-driven polarization, and to simulate alternative ideological slanting of news channels.

Ok sure, that’s a lot to unpack, but let’s work through it. I think we can all agree that people who watch Fox News are more likely to vote Republican than others, but on that basis we can’t tell whether Fox News actually causes them to vote Republican, Republican ideology attracts them to Fox News, or something else both causes them to watch Fox News and vote Republican. In an ideal world (at least from a research standpoint), we could run an experiment to examine cause and effect where we take a group of people and randomly choose half of them to sit in front of Fox News for a while (and disallow the other group from watching) while keeping everything else about their lives the same as before. (This might actually be hard if the Fox News group doesn’t watch a lot of TV and goes outside instead, etc.) To my knowledge, no one has tried to do this yet, perhaps because watching Fox News is too hazardous to get IRB clearance. (That said, I will admit I was too lazy to read the lit review of the paper.)

So do researchers just give up? Well, sociologists might. 🙂 (I kid because I love.) But economists get creative, and one thing they do is try to find an instrumental variable– simply put, a source of randomization. In this case, the researchers asserted that people are more likely to watch a given channel when it has a lower channel number (perhaps the result of the typical channel-surfing process), and they noticed that what channel Fox News is on differs by geography in a fairly random way. (In other words, it’s not correlated with how likely people are to watch Fox News, vote Republican, etc.) These two observations together mean that we basically do have a world where some people are randomly subjected to more Fox News than others, and, as it turns out, there is a (negative) relationship between Fox News channel number and Republican vote share.

Obviously, there is no direct link between Fox News channel number and voting patterns, and instead the hypothesis is that channel number impacts viewing time, which in turn affects the votes. Kind of fancy econometrics stuff enables the researchers to isolate the part of watching Fox News that is essentially random and then determine the impact of that random part on voting. They estimate that this impact is 0.3 percentage points in vote share as a result of a random extra 2.5 minutes per week of Fox News watching (for example, 55.3% to 55.6% voting Republican). A few things to note:

  • This doesn’t seem like a huge effect, but it’s statistically significantly different from zero, and there are people who are randomly subjected to more than an extra 2.5 minutes per week of Fox News, in which case the effect would be larger. (2.5 minutes is the increase in viewing time associated with a one standard deviation reduction in channel number.)
  • Similar analysis was done for, say, MSNBC, but an analogous effect was not observed.
  • The paper itself tries harder than I do here to rule out alternative explanations and such.
  • If cable/broadcast companies know that the channel numbers work in this way, they could use them as a manipulative tool, since that’s how causality works. (Good thing this paper happened first, since non-randomization would kind of screw things up.)
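For the curious, the instrumental variable logic described above can be sketched with simulated data. This is a toy version under my own assumptions, not the paper’s model: there is a single unobserved “taste” that drives both viewing and voting, channel position is random, and the true effect is set to 0.12 points per minute (which works out to 0.3 points per 2.5 minutes). The estimate is just the effect of position on votes divided by the effect of position on viewing.

```python
import random

random.seed(1)
n = 50000
true_effect = 0.12  # assumed: vote-share points per extra minute of viewing

position, minutes, vote = [], [], []
for _ in range(n):
    taste = random.gauss(0, 1)  # unobserved: drives both viewing and voting
    pos = random.gauss(0, 1)    # standardized channel position, random by assumption
    # Lower channel position means more viewing; taste also raises viewing.
    mins = 10 - 2.5 * pos + 3 * taste + random.gauss(0, 1)
    # Votes depend on viewing and, separately, on the unobserved taste.
    v = 50 + true_effect * mins + 2 * taste + random.gauss(0, 1)
    position.append(pos)
    minutes.append(mins)
    vote.append(v)


def slope(y, x):
    """Slope from a simple regression of y on x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


# Naive regression of votes on viewing: contaminated by taste.
naive = slope(vote, minutes)

# Instrumental variable estimate: reduced form divided by first stage.
first_stage = slope(minutes, position)   # position -> viewing
reduced_form = slope(vote, position)     # position -> votes
iv_estimate = reduced_form / first_stage

print(round(naive, 3), round(iv_estimate, 3))
```

In this simulation the naive estimate is several times larger than the true effect, because people with a taste for the channel both watch more and vote differently anyway. The instrumental variable estimate lands close to 0.12, since it only uses the variation in viewing that comes from channel position.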

I’m a little conflicted here- on one hand, given that Fox News is heavy on the misinformation, it’s pretty depressing to learn that it actually shapes ideology and actions. On the other hand, math is SO COOL.

(Sidenote: If you think this sort of thing is neat, you can see a whole talk about it here.)