I got an email from Steve Landsburg with the subject line "krugman, me and you." I can't decide whether that counts as the sort of threesome I've always dreamt about...
I get daily emails from The Chronicle of Higher Education newsletter. Today's headline: "Academe Today: Professor Says His University Cares Little About Teaching." I had to stop for a second and confirm that I wasn't in fact reading The Onion.
CHAIRMAN BERNANKE. …Let me turn now to the economic situation. Boy, I think it has been a while since we were three and a half hours into the meeting before we got to the staff forecast.
MR. STOCKTON. The GDP is a little smaller than it was at the start of the meeting.
Or perhaps you prefer this one…
MS. YELLEN. …The residential housing sector has now shrunk so much that the only real assurance that it will ever stabilize seems to be the fact that construction spending cannot go negative. This is just about the only zero lower bound that is working on our side. [Laughter]
I don’t care what anyone says, Yellen is totally going on the list for next year’s humor session. =P
Let’s begin by working through a situation that has been quite popular recently…suppose that you show the following photo to 10 of your friends:
Now, let’s say that 8 of your friends saw the dress as blue and black and 2 saw it as white and gold. You’d probably feel pretty comfortable asserting that, if you were to poll more people, more would see the dress as blue and black than white and gold. What about if your sample had 6 reporting blue and black and 4 reporting white and gold? You’d probably think something along the lines of “yeah, I know the result isn’t split 50/50, but it’s not that weird to get a little away from 50/50 in a sample even when the population is divided 50/50.” Neither of these statements is particularly unreasonable…but where do you draw the line? What if your sample reports 7 for blue and black and 3 for white and gold?
This is what tests of statistical significance are supposed to help out with.* Wouldn’t it be nice to know how likely it would be that your sample would give a 7-3 vote if the population really were split 50/50? This is what a statistical “p-value” tells us. If that value is sufficiently small, we say to ourselves “self, you know what? It’s pretty darn unlikely that I would see what I’m seeing from my sample if the population were really split 50/50 on this issue- maybe it’s time to entertain the notion that more people think the dress is blue and black than think it is white and gold.” (In reality, I think the white/gold camp wins out, but this is my story, so just go with it.) This is what statistical hypothesis testing does.
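For the curious, here is a minimal sketch of that calculation for the dress poll. It assumes an exact two-sided binomial test against a 50/50 population, which is one reasonable choice among several, and the function name `two_sided_p_value` is made up for illustration:

```python
from math import comb


def two_sided_p_value(n: int, k: int) -> float:
    """Exact two-sided binomial p-value under a 50/50 null hypothesis.

    With p = 0.5 the distribution is symmetric, so "at least as extreme"
    means at least as far from n/2 as the observed count k.
    """
    distance = abs(k - n / 2)
    # Count the equally likely outcomes that land at least that far from n/2.
    extreme = sum(comb(n, i) for i in range(n + 1) if abs(i - n / 2) >= distance)
    return extreme / 2 ** n


# The three polls of 10 friends discussed above (blue/black votes first).
for blue_black in (6, 7, 8):
    p = two_sided_p_value(10, blue_black)
    print(f"{blue_black}-{10 - blue_black} split: p-value = {p}")
```

Under these assumptions the 7-3 split comes out at roughly 0.34, the 6-4 split at roughly 0.75, and the 8-2 split at roughly 0.11. So with only 10 friends, none of the three samples would clear the usual 0.05 bar.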
Sounds pretty compelling, right? If so, then I hope for your sake that you aren’t a social psych researcher, since the Journal of Basic and Applied Social Psychology decided to ban statistical significance testing in all of the articles that it publishes. (For you Bayesians out there, they aren’t too happy with you either but are willing to consider your analyses on a case by case basis.) Okay, I get that the generally accepted practice of considering a finding with a p-value of 0.05 or less as significant and everything else garbage isn’t without its problems, most obviously that researchers have incentives to finagle their analyses to sneak in under this threshold, but what on earth are researchers supposed to do instead? (i.e. what is the counterfactual to statistical hypothesis testing? So meta.)
I have some suggestions:
Just look at your data- if your graph traces out the shape of an animal, count it as meaningful. Like this:
Just wave your hands and talk forcefully until people take your result seriously. (This seems to have worked for macroeconomists for a while now.)
Ask your pets- right paw = significant, left paw = not significant. (If you have a bird, you could use the result to line the cage and…well, you figure it out, since I can’t decide if bird crap indicates significance or the lack thereof.)
The downside I suppose is that none of these approaches really have the gravitas normally associated with scientific rigor, so I’m at a bit of a loss. Seriously though, I don’t understand what researchers are supposed to do instead- the article mentions something about descriptive statistics, but the point of the statistical analysis that I referred to above is to give some context as to whether differences in descriptive statistics are large enough to be worth paying attention to.
As I said, statistical analysis is not without its flaws, but there are a number of far less controversial and likely more productive steps that the journal could have taken:
Pre-registration of experimental trials- if there is a record of what was tried experimentally, then it’s more clear how many things were tried in order to get a result that looks “good.” (The American Economic Association has started doing this, but it’s not mandatory yet.)
Publication of p-values and confidence intervals- rather than just declare something as “significant” because it meets some arbitrary p-value threshold (results are often simply given a number of asterisks to indicate significance), explicitly show how likely it is that random chance alone would produce a result like yours, and give a range or error bars for point estimates.
Publication of negative results- if journals published papers where the “null hypothesis” (i.e. the uninteresting hypothesis that the researchers are looking to refute) can’t be rejected, then researchers would have less of an incentive to fiddle with their analyses to make it look like they meet the threshold of statistical significance. This, coupled with pre-registration of experiments, would cut down on what is known as “publication bias,” or the tendency for readers to see only the studies that showed the result that researchers were looking for (while the other studies get put in the circular file or whatever).
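To put rough numbers on two of these suggestions, here is a small sketch. Both function names are hypothetical. The first function assumes the attempts are independent tests of a true null hypothesis, and the second uses the Wilson score interval, which is just one of several common ways to put a range around a proportion:

```python
from math import sqrt


def chance_of_spurious_hit(attempts: int, alpha: float = 0.05) -> float:
    """Probability that at least one of `attempts` independent tests of a
    true null hypothesis sneaks in under the threshold `alpha`."""
    return 1 - (1 - alpha) ** attempts


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Approximate 95% Wilson score interval for a proportion (z = 1.96)."""
    p_hat = successes / n
    denom = 1 + z ** 2 / n
    center = (p_hat + z ** 2 / (2 * n)) / denom
    half_width = z * sqrt(p_hat * (1 - p_hat) / n + z ** 2 / (4 * n ** 2)) / denom
    return center - half_width, center + half_width


# Why a record of what was tried matters: 20 tries at the 0.05 threshold.
print(f"chance of at least one 'significant' fluke: {chance_of_spurious_hit(20):.2f}")

# A range instead of a bare asterisk: 7 of 10 friends say blue and black.
low, high = wilson_interval(7, 10)
print(f"share seeing blue and black: somewhere between {low:.2f} and {high:.2f}")
```

Under these assumptions, trying 20 things gives about a 64 percent chance that at least one of them looks “good” by accident. The interval for the 7-3 dress poll comfortably includes 50/50, which is more informative than either printing or withholding an asterisk.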
I guess this makes me thankful that I’m an economist, since if I ever write a paper that reads “well, my cat and I think this result looks pretty good, how about you?” it will be because I wanted to and not because I had to.
* Yes, I know that this doesn’t have to do with causality specifically, but this same method is used for analyses that attempt to tease out cause and effect.
In a recent article, labor economist David Autor was quoted as saying that “If we automate all the jobs, we’ll be rich—which means we’ll have a distribution problem, not an income problem.” David, have you been reading drafts of my dystopian future economic science-fiction novel again?
The synopsis of my (mostly, but not entirely) hypothetical Hunger Games fan fiction goes something like the following: Imagine a world where technology has progressed to the point where one person can create all of the output that we have today (or maybe even then some) by pushing a really fancy version of the Staples easy button- you know, like this:
Would society in the aggregate be better off economically as a result? The answer is most surely yes- if we believe that most people would rather sit on a beach and have their work done for them than actually do the work, then the button basically has to be a boon to society overall. In this situation, people would have two options: one, they could be content with the current standard of living and continue sitting on the beach, or they could use their newly-acquired free time to produce new (hopefully) cool stuff for society. (You know you were just looking for the right moment to get into the artisan cupcake pick business.) Let’s, for the sake of argument, assume that people take option number one- this is what Autor meant when he said that technology would make us richer, since we would now have all of the stuff and all of the free time rather than just all of the stuff.
You’ve probably caught on by now, however, that the GDP button presents some interesting challenges for society. We are currently pretty much accustomed to distributing money in exchange for labor and capital- that’s how markets for the factors of production work. Under this regime, the guy/girl who owns the GDP button (Katniss Everdeen in my fan fiction, obviously, to make up for being from a poor family) would get all of the factor payments. (Hence the distributional problem that Autor referred to.) But this is where it gets sticky- payments from whom? If Katniss keeps pressing the button and people keep buying her output, eventually everyone but Katniss is going to run out of money and not be able to buy the output anymore. Granted, this might not bother Katniss, since she has the button that makes stuff and therefore doesn’t really need your money. Everyone else, on the other hand, runs into a bit of a problem.
This problem is not technically insurmountable, so it’s not a given that everyone but Katniss is going to die of starvation. (I suppose this is where the narrative diverges from the parent books a bit.) That said, the problem isn’t the easiest to solve either. One option would be for Katniss to give away the stuff that she doesn’t use. This would prevent at least some of the starvation problem, but it would introduce new logistical problems- giving stuff away doesn’t exactly get said stuff to those who value it the most. For example, if I saw that, say, a motorcycle was being given away, I might go get it because hey, free motorcycle, but I don’t actually like motorcycles that much. Now, you might think that this would take a motorcycle away from a motorcycle enthusiast, but think this through a little more. Since I’m not the only person wandering around perusing free stuff, society would likely end up in a situation where the more popular free items have longer lines. (Just ask the people who tried to get free Krispy Kreme the other day.) This would solve the resource allocation problem to some degree, since I would be more likely than the motorcycle enthusiast to balk at the line for motorcycles, and it approximates a world where time is the currency used for resource allocation. On the up side, this seems pretty fair, since everyone has the same endowment of time. On the down side, we’ve now gone from a wondrous society on the beach to a society where we’re all waiting in line for our “free” stuff.
What if, instead, Katniss kept giving out the money that people pay her for stuff each period so that people can keep paying for stuff? Prices would adjust so that resources are allocated efficiently, so this approach seems promising (especially if Katniss realizes that this approach doesn’t make her worse off)- just don’t think too much about how the money gets allocated. Does each adult get the same amount? Do parents with more children get more? Do people with health issues get more than healthy people? These sorts of questions really highlight the fact that issues of fairness don’t have a “right” answer.
The world described above is obviously a very extreme version of what we actually see in society today and what Autor is referring to, but it presents a nice allegory for some of the issues that society is currently facing or worried about facing in the future. Specifically, how can companies continue to thrive by selling stuff to middle-class and working-class households if these households don’t continue to get the money to spend? Conversely, how can middle-class and working-class households thrive if they aren’t endowed with or have the means to develop the factors of production that are valued in the economy? If capital (the analogue to the magic button) becomes the most crucial factor in production, how does society ensure that citizens have an endowment of capital equivalent to their current endowment of labor? (You have to admit that nature does a nice job of leveling the playing field by endowing most people with the ability to work.) At the risk of channeling Piketty, I will fully acknowledge that it seems like a shift towards capital being the main factor of production would have to, as per the allegory, be accompanied by some serious thought as to the heritability of wealth and even possibly a wealth endowment. (For comparison, consider that the median household is endowed with somewhere in the neighborhood of $1 million of labor “wealth,” assuming a 5% return and $50,000 median household income.)
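The $1 million figure falls out of a back-of-the-envelope calculation. This sketch assumes the income is treated as a perpetuity (a payment that continues forever), which is a simplification, and the function name is made up for illustration:

```python
def perpetuity_value(annual_income: float, annual_return: float) -> float:
    """Value of an asset that pays `annual_income` forever when assets
    earn `annual_return` per year: value = income / return."""
    if annual_return <= 0:
        raise ValueError("annual_return must be positive")
    return annual_income / annual_return


# Median household: $50,000 of income at an assumed 5% return.
labor_wealth = perpetuity_value(50_000, 0.05)
print(f"${labor_wealth:,.0f}")  # prints $1,000,000
```

Put differently, a household would need about $1 million in capital earning 5% to replace a $50,000 paycheck.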
Fun fact: Fifty Shades of Grey started out as Twilight fan fiction, so feel free to suggest some BDSM aspects of the narrative so that I can make a boatload of money, thanks.
Half Of Hollywood Test Group Screened Placebo Film
LOS ANGELES—Saying the methodology helps them ensure unbiased results in their marketing research, studio executives at Paramount Pictures confirmed that during a Hollywood test screening this week they showed half of all theatergoers a placebo film. “Instead of watching our authentic big-budget studio film, this randomly selected control group saw a movie that lacked any recognizable star, overt ‘high-concept’ premise, rapidly unfolding narrative, or extensive computer-generated effects, so that we could compare their reactions with those of the real movie’s viewers,” said Paramount production head Marc Evans, acknowledging that many members of the control group exhibited the same level of emotional gratification and entertainment as those who viewed the actual upcoming action-adventure blockbuster. “Such a double-blind screening method allows us to determine whether the thrills, laughs, and heartbreak experienced by audience members actually stem from Arnold Schwarzenegger’s performance in the Terminator sequel we have coming out this July, or whether they are simply the result of a placebo effect.” Despite poor findings that showed no significant improvement upon the placebo film, executives said they had already spent $170 million developing the franchise feature and would just give it a wide international release anyway.
Hmmm…would it be completely unreasonable to create treatment and control films in order to determine how much of a wage premium big stars should really get? Come on, an NSF grant would totally cover the cost of that…in related news, just a reminder that sophisticated data analysis is helpful because experiments aren’t always feasible or practical!
Sex toy injuries surged after ‘Fifty Shades of Grey’ was published
The number of Americans requiring emergency room care for injuries involving sex toys has approximately doubled since 2007, according to data from the Consumer Product Safety Commission. Much of that increase happened in 2012 and 2013, following the release of the wildly popular erotic novels in the Fifty Shades of Grey series. And the overwhelming majority of these injuries — 83 percent — require “foreign body removals.”
So do we need to be thinking about the negative public-health externalities of this particular phenomenon? I get the feeling that the headline writer chose her words carefully, but the article itself (and others like it) seem to reallllly want me to think, despite an offhand sentence to the contrary, that the increase in injuries is actually because of the book (or at least that the increase in injuries is a result of the increase in sex toy use that resulted from the publication of the book). But we nerds know better…
This suggestion is a particular form of the correlation versus causation issue known as the “post hoc ergo propter hoc” fallacy (not to be confused with the second episode of The West Wing)- namely, that just because one thing happens after another doesn’t imply that the second happened because of the first. In this instance, we see a claim that sex toy injuries increased after the book was published, but we can’t conclude that the increase happened because of the book itself.
The fallacy is highlighted further when a more complete dataset is observed:
Now it is more clear that the increase was largely following what appears to be a longer term trend- in fact, the data suggest that perhaps the relationship goes in the other direction and the increased proliferation of sex toys and sexual experimentation led to the book being published and getting so popular.
It’s worth keeping in mind, however, when causality matters and when it doesn’t- for example, my warning to be careful is relevant regardless of what is causing the increase in injuries. Also, having this data makes this scenario a tad less surprising:
Given the 83 percent figure mentioned in the article, I’m actually kind of surprised that this didn’t show up on the board.
I will be in you tomorrow to talk about the economics of The Simpsons. All I ask is that you have better weather than exists here in Boston, which is not a tall order. (Technically, I also request good coffee, so feel free to leave suggestions in the comments.) The details are as follows:
Presented by Florida State College Jacksonville
Thursday, February , 2015
3939 Roosevelt Blvd.
Free (or, as Dan Ariely would say, FREE!)
For tickets and more info, email Susan Reilly – sreilly at fscj dot edu