Experimental Pitfalls Are Generally Dumber Than We Imagine, Now With More Dogs

Ok, story time, so go grab your blankie and a beverage…

A number of years ago, I taught an econ tutorial entitled “Sex, Drugs, and Rock & Roll”…it was a neat class- as the name would suggest, we looked at papers on birth control, drug legalization, the music industry, and so on. For reasons I’m still not sure I understand, it was a somewhat peculiar group of students- basically the football team, the kid who, as far as I could tell, tutored the football team, and a female transfer from Tulane (displaced by Hurricane Katrina). (I actually confirmed with her beforehand that she was ok with the gender ratio, given the subject matter…turns out she wasn’t easily fazed and later explained to everyone how to make meth.)

Anyway, I digress…the first paper I assigned was authored by a combination of friends and classmates- it wasn’t directly relevant to the subject matter, but it had “Sex, Drugs, and Rock & Roll” in the title, so I figured it would be a good place to start getting students used to reading academic research. The discussion…well, didn’t go as I expected.

Basically, the football team came in and explained that it was funny to read the paper because they had participated in the experiment. So, PSA: if you use the CLER Lab at Harvard, your subject pool is the Harvard football team, I guess. (I don’t mean to imply this is positive or negative, just that it might affect generalizability.) Given this information, I found it far more interesting to have them talk about their experience than to jump right into the results of the paper. The main thing I wanted to know was essentially “did you take the task seriously and think things through?”

The responses were “meh” at best, from what I remember. I pushed back, asking something along the lines of “but you got paid more if you performed better at the task- did that not serve as an incentive?” The response was, in essence, a collective shrug.

Again, I’m not criticizing the students- I completely believe that effort isn’t costless even when a person doesn’t have anything better to do, so it’s entirely possible that the kids were in fact maximizing their utility. But it gave me something interesting to think about regarding experimental design and how to interpret experimental results. In general, economists are wary of asking hypothetical questions about preferences and behaviors, since there’s good evidence that what people say (or think) they would do doesn’t always coincide with the choices they actually make. To mitigate this problem, experimental economists try to give subjects tasks where payoffs depend on behavior (usually in addition to a payment for showing up), since this supposedly gives subjects the proper incentive to treat the task as though it’s real. Economists have even been known to run their experiments in developing countries, since a given research budget can create stakes there that are meaningful rather than trivial.

The discussion with my students makes me think that we need to be more critical about whether our experiments provide sufficient incentives to elicit true behaviors and preferences. In addition, economists could probably do more to assess, both during and after an experiment, how seriously the subjects took their assigned task. In any case, the bottom line is that the success of an experiment depends crucially on subjects taking the task seriously and actually following the protocol, and neither of these can be taken for granted.

Or, put more succinctly, here’s a helpful warning:

[embedded video: a dog’s object-permanence test going hilariously wrong]

I’ve been watching this all day and I can’t stop laughing. (Probably related: Gizmo looks a little worried at the moment.) For context, there’s this thing going around Twitter where people try to see whether their dogs understand object permanence by holding up a sheet and then running away as it falls. I mean, what could possibly go wrong with this clearly rock-solid experimental design?

Update: There’s a point brought up in one of the comments below that I think warrants discussion…I certainly didn’t intend for the takeaway from this post to be “experiments are bad, burn them,” but I see how it could be taken that way. If anything, I’m suggesting the opposite, so allow me to explain. If subjects aren’t fully engaged in an experiment, this probably biases the experiment towards “no effect” rather than creating effects where none exist. In the dog scenario, for example, maybe the chihuahua does in fact understand object permanence, but we can’t document the phenomenon because the dog’s behavior turned the experiment into garbage. Therefore, I’m not advocating for questioning results so much as questioning “null results.”

We talk about a “replication crisis” in economics, where a surprisingly large percentage of documented effects can’t be re-documented in subsequent experiments. These numbers are generally used to call the initial findings into question, and I guess I’m suggesting that we should go easier on that suspicion until we can confirm that the lack of replicability isn’t driven by the sorts of issues I describe above. I guess I’m also trying to help researchers make their budgets go further, since publication bias means that null results often just get shoved in the file drawer, never to see the light of day.
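For the quantitatively inclined, the “biases toward no effect” claim is easy to see in a quick simulation. The sketch below is purely my own illustration- not drawn from any actual study, and all the parameter values are made up: some share of subjects ignore the task and just generate noise, and the measured treatment effect shrinks accordingly.

```python
# Purely illustrative simulation (a sketch, not from any actual study):
# if some share of subjects ignore the task, the measured treatment
# effect gets dragged toward zero. All parameter values are made up.
import numpy as np

rng = np.random.default_rng(seed=42)

def run_experiment(n_per_arm=100, true_effect=0.5, share_inattentive=0.0):
    """Simulate one two-arm experiment and return the estimated effect.

    Attentive subjects in the treatment arm shift their outcome by
    true_effect; inattentive subjects produce pure noise in either arm.
    """
    # Control arm: noise around 0 for everyone (inattention changes nothing).
    control = rng.normal(0.0, 1.0, n_per_arm)
    # Treatment arm: only attentive subjects actually respond to treatment.
    attentive = rng.random(n_per_arm) > share_inattentive
    treatment = rng.normal(0.0, 1.0, n_per_arm) + true_effect * attentive
    return treatment.mean() - control.mean()

for p in (0.0, 0.3, 0.6, 0.9):
    estimates = [run_experiment(share_inattentive=p) for _ in range(2000)]
    print(f"share inattentive = {p:.1f} -> "
          f"mean estimated effect = {np.mean(estimates):.3f}")
```

With a true effect of 0.5, the expected estimate is roughly 0.5 × (share attentive), so as inattention rises, a perfectly real effect becomes harder and harder to distinguish from zero: exactly the kind of null result that ends up in the file drawer or fails to replicate.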