Experimental Pitfalls Are Generally Dumber Than We Imagine, Now With More Dogs

Ok, story time, so go grab your blankie and a beverage…

A number of years ago, I taught an econ tutorial entitled “Sex, Drugs, and Rock & Roll”…it was a neat class- as the name would suggest, we looked at papers on birth control, drug legalization, the music industry, and so on. For reasons I’m still not sure I understand, it was a somewhat peculiar group of students- basically the football team, the kid who, as far as I could tell, tutored the football team, and a female transfer from Tulane (because of Hurricane Katrina). (I actually confirmed with her beforehand that she was ok with the gender ratio, given the subject matter…turns out she wasn’t easily fazed and later explained to everyone how to make meth.)

Anyway, I digress…the first paper I assigned was authored by a combination of friends and classmates- it wasn’t so much directly relevant to the subject matter, but it had “Sex, Drugs, and Rock & Roll” in the title, so I figured it would be a good place to start getting students used to reading academic research. The discussion…well, didn’t go as I expected.

Basically, the football team came in and explained that it was funny to read the paper because they had participated in the experiment. So, PSA: if you use the CLER Lab at Harvard, your subject pool is the Harvard football team, I guess. (I don’t mean to imply this is positive or negative, just that it might affect generalizability.) Given this information, I found it far more interesting to have them talk about their experience than to jump right into the results of the paper. The main thing I wanted to know was essentially “did you take the task seriously and think things through?”

The responses were “meh” at best, from what I remember. I pushed back, asking something along the lines of “but you got paid more if you performed better at the task, did that not serve as an incentive?” The response was basically that the extra money wasn’t worth the extra effort.

Again, I’m not criticizing the students- I completely believe that effort isn’t costless even when a person doesn’t have anything better to do, so it’s entirely possible that the kids were in fact maximizing their utility. But it gave me something interesting to think about regarding experimental design and how to interpret experimental results. In general, economists are wary of asking hypothetical questions about preferences and behaviors, since there’s good evidence that what people say they would do or what they think they would do doesn’t always coincide with the choices that they actually make. To mitigate this problem, experimental economists try to give subjects tasks where payoffs depend on behavior (usually in addition to a payment for showing up), since this supposedly gives the subjects proper incentives to treat the task as though it’s real. Economists have even been known to run their experiments in developing countries, since there the same research budget can create stakes that are meaningful rather than trivial.
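To make the incentive logic concrete, here’s a minimal sketch of the kind of payoff scheme described above- a fixed show-up fee plus a piece rate per correct answer. Every number here is hypothetical, not from any actual study, but it shows why the marginal dollars from trying hard can easily be smaller than the cost of the effort:

```python
# Hypothetical payoff scheme: show-up fee plus piece rate per correct answer.
# All numbers are made up for illustration; they are not from any actual study.

SHOW_UP_FEE = 10.00  # paid regardless of performance
PIECE_RATE = 0.25    # dollars per correct answer

def payout(correct_answers: int) -> float:
    """Total payment for a session: fixed fee plus performance bonus."""
    return SHOW_UP_FEE + PIECE_RATE * correct_answers

# Suppose genuine effort yields ~40 correct answers and coasting yields ~25.
engaged, coasting = payout(40), payout(25)
print(f"engaged: ${engaged:.2f}, coasting: ${coasting:.2f}, "
      f"marginal gain from effort: ${engaged - coasting:.2f}")
# If an hour of real concentration is worth more than ~$3.75 to the subject,
# coasting is the utility-maximizing choice, which is exactly the possibility
# raised in the discussion above.
```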

The discussion with my students makes me think that we need to be more critical regarding whether our experiments give sufficient incentives to elicit true behaviors and preferences. In addition, economists could probably do more to assess, both during and after their experiments, how seriously the subjects took their assigned task. In any case, the bottom line is that the success of an experiment depends crucially on subjects taking the task seriously and behaving as the design assumes, and these things can’t be taken for granted.

Or, put more succinctly, here’s a helpful warning. There’s this thing going around Twitter where people try to see whether their dogs understand object permanence by holding up a sheet and then running away as it falls. I’ve been watching one of these videos all day and I can’t stop laughing (probably related: Gizmo looks a little worried at the moment). I mean, what could possibly go wrong with this clearly rock-solid experimental design?

Update: There’s a point brought up in one of the comments below that I think warrants discussion…I certainly didn’t intend for the takeaway from this post to be “experiments are bad, burn them,” but I see how it could be taken that way. If anything, I’m suggesting the opposite, so allow me to explain. If subjects aren’t fully engaged in an experiment, this probably biases the experiment toward “no effect” rather than creating effects where none exist. In the dog scenario, for example, maybe the chihuahua does in fact understand object permanence, but we can’t document the phenomenon because the dog’s behavior turned the experiment into garbage. Therefore, I’m not advocating for questioning results so much as questioning “non-results.” We talk about a “replication crisis” in economics, where a surprisingly large percentage of documented effects can’t be re-documented in subsequent experiments. These numbers are generally used to call the initial findings into question, and I guess I’m suggesting that we should go easier on that suspicion until we can confirm that the lack of replicability isn’t driven by the sorts of issues I describe above. I guess I’m also trying to help researchers make their budgets go further, since publication bias means that non-results often just get shoved in the file drawer, never to see the light of day.
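Since the update above leans on a statistical claim- that disengaged subjects push results toward the null rather than manufacturing false positives- here’s a quick simulation sketch of that logic. Everything in it (sample sizes, effect size, engagement shares) is an assumption chosen for illustration, not data from any actual study:

```python
# Sketch: how unengaged subjects attenuate a real treatment effect.
# All parameters are assumptions for illustration, not estimates from any study.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
N_PER_ARM = 50     # subjects per treatment arm
TRUE_EFFECT = 0.5  # real treatment effect, in standard-deviation units

def run_experiment(share_unengaged: float) -> float:
    """Simulate one session and return the two-sample t-test p-value."""
    engaged = rng.random(N_PER_ARM) > share_unengaged
    control = rng.normal(0.0, 1.0, N_PER_ARM)
    # Engaged treated subjects exhibit the true effect; unengaged ones are pure noise.
    treated = rng.normal(np.where(engaged, TRUE_EFFECT, 0.0), 1.0)
    return stats.ttest_ind(treated, control).pvalue

for share in (0.0, 0.5, 0.9):
    power = np.mean([run_experiment(share) < 0.05 for _ in range(1000)])
    print(f"unengaged share {share:.0%}: null rejected in {power:.0%} of runs")
# As the unengaged share rises, a genuinely real effect increasingly shows up
# as a "non-result" (a Type II error), not as a false positive.
```

In other words, the cheaper it is for subjects to coast, the more a lab full of coasting subjects looks like a world where the effect doesn’t exist.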

6 Replies to “Experimental Pitfalls Are Generally Dumber Than We Imagine, Now With More Dogs”

  1. One coincidental encounter seems like insufficient evidence to question an entire field of study. And it certainly does nothing to counter the ample evidence produced by countless studies meant to assess the external validity of laboratory experiments. Probably the most poorly reasoned and least compelling thing you’ve ever posted here.

    1. I think the takeaway is more of the “question” variety than the “reject” variety…and actually, I don’t even think I meant it as questioning results so much as non-results. If experimental subjects don’t take a task seriously (and I have other examples, which frustrates me on principle), this should in most cases bias the results toward the “no effect” side, in which case the results get stashed in the file drawer or are harder to publish or whatever, in addition to potentially being a Type II error. If there are steps that experimenters can take to maximize the likelihood that subjects are putting forth full effort (and don’t discern what the task is testing, etc., for that matter), they can hopefully make their research budgets go further if nothing else.

  2. I will say that I attended an undergrad program that’s very big on experimental econ and I always took the experiments quite seriously. I frequently made $20-30/h doing so, and in one instance (by sheer luck) I made $80 in one sitting. I would think athletes to be less likely to take experiments seriously than the average student, but they were rarely — if ever — present in my sessions. I would guess they didn’t necessarily comprise the majority of participants in the study they participated in at Harvard, but if they did it would obviously affect the data.

    That said, any experimental economist will tell you that there are obvious issues with restricting your population to university students [regardless of whether they’re athletes] as they’re over-studied and unrepresentative of the general population regardless of whether or not they take the exercises seriously.

    1. And yet we experiment on our students all the time! 🙂 I don’t think the lack of generalizability comes so much from the athlete thing (though there could be some endogeneity in who is drawn to particular sports) as from the fact that they are all guys.

    2. Wait, I have a related follow-up question- did you ever figure out what the experiments you participated in were trying to test?

      1. There are a number of biases in addition to gender, but as I’m sure you know, they experiment on students to reduce research expenses. While I’m sure they’d love to test more representative samples, they have to balance endogeneity against budgetary constraints. You won’t find many experienced decision makers willing to stop by a college campus for the prospect of $20-30.

        I never got access to any published data, but I was always curious. As an econ student, every experiment involved an attempt to infer the treatment, but they generally did a good job of masking them.
