UPDATE: I had planned to make this a two-part series, with the mathematical details in the second part. Instead I'm going to turn the whole thing into a paper. I'll post more when the paper is written. Original post follows.
Canadians love them some Roll Up the Rim, the annual Tim Hortons coffee contest. They also love to tweet about it. And if ever there was a great experiment to see how people perceive -- and report -- random events, Roll Up the Rim and Twitter are it: known odds, random selection, and mass participation.
Last year around this time, I wrote a quick program that accessed Twitter's API, and extracted all available public tweets with the hashtag "#rolluptherim" (thanks twitter4j). I gathered 876 tweets covering Friday, March 18, to Wednesday, March 23, 2011. They were then promptly stored and forgotten about until a couple of weeks ago, when Roll Up the Rim began again in earnest.
So what do people tweet about, when they tweet #rolluptherim? For one thing, they tweet about their success rates: of the 876, I extracted 387 tweets containing something like "1/8", "3 for 10", and so on.
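The extraction step can be sketched in a few lines of Python. This is not the original program (that one used twitter4j and isn't shown here); the pattern, the function name `parse_report`, and the sanity check are all assumptions made for illustration, and a real run would also have to weed out false positives such as dates.

```python
import re
from typing import Optional, Tuple

# Hypothetical pattern: matches "1/8", "3 for 10", "2 of 7", "4 out of 12".
REPORT = re.compile(r"\b(\d+)\s*(?:/|for|of|out of)\s*(\d+)\b", re.IGNORECASE)

def parse_report(text: str) -> Optional[Tuple[int, int]]:
    """Return (wins, plays) for the first plausible report in a tweet, else None."""
    for match in REPORT.finditer(text):
        wins, plays = int(match.group(1)), int(match.group(2))
        # Sanity check: you cannot win more cups than you played.
        if plays > 0 and wins <= plays:
            return wins, plays
    return None

if __name__ == "__main__":
    for tweet in ["Finally! 3 for 10 #rolluptherim", "1/8 so far...", "no luck today"]:
        print(tweet, "->", parse_report(tweet))
```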
So here's my question: how do people report their own success?
Statistically speaking, what should we expect to happen? This is basically a game of chance. Suppose that, in the middle of such a game, we surveyed the players and asked each of them to report their cumulative distribution function (CDF) value. That's a very fancy way of saying: the fraction of players with no more wins than you, on average.
For instance, the winning odds for Roll Up the Rim are 1/6. So if you and all your friends play 100 times, and you win 16 times, you would report about "0.49" -- as about 49% of your friends would have 16 or fewer wins, on average.
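For anyone who wants to check a CDF value like this, here is a minimal sketch. It assumes every cup is an independent 1-in-6 chance, so the number of wins follows a binomial distribution; the function name `binom_cdf` is mine, and exact fractions are used only to sidestep floating-point worries.

```python
from fractions import Fraction
from math import comb

def binom_cdf(k, n, p=Fraction(1, 6)):
    """P(X <= k) for X ~ Binomial(n, p): the chance of k or fewer wins in n plays."""
    q = 1 - p
    return sum(comb(n, i) * p**i * q**(n - i) for i in range(k + 1))

if __name__ == "__main__":
    # CDF values for a few win counts out of 100 plays.
    for wins in range(14, 19):
        print(f"{wins}/100 -> CDF = {float(binom_cdf(wins, 100)):.4f}")
```

The value sits near one half when the win count is close to 100/6, and climbs quickly as the win count moves above that.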
Does this reflect how people tweet about #rolluptherim? Probably not: if you have a great (or terrible) run, you might be more likely to tweet about it, compared to average luck. But let's see. I took the 387 tweets with success rates and threw out all reports with fewer than 10 trials; that left 294 tweets. Then, I calculated the CDF values for each tweet, and plotted the following histogram.
But the huge spike at the right side is worth looking at: these tweets are reporting much higher success rates than expected. For instance, to land in the bar at far right, you have to be winning more often than 95% of the population: on 10 trials, that's 4 wins; on 40, it's 11 wins. A run of 11/40 doesn't seem so outlandish -- it's only a bit more than one in four! Yet the more you play, the closer to one in six you have to be; it's the law of large numbers. If you had the money and the bladder to play 1000 times, you would be in the 95% bin at 186 wins -- not much greater than 167, which is one in six of 1000.
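The thresholds quoted above can be found by walking up the binomial CDF until it passes 95%. A rough sketch, again assuming independent 1-in-6 cups (the name `min_wins_for_top_bin` is made up for this example):

```python
from fractions import Fraction
from math import comb

def min_wins_for_top_bin(n, p=Fraction(1, 6), cutoff=Fraction(95, 100)):
    """Smallest win count k whose CDF value P(X <= k) exceeds `cutoff`."""
    q = 1 - p
    cdf = Fraction(0)
    for k in range(n + 1):
        cdf += comb(n, k) * p**k * q**(n - k)  # add P(X = k)
        if cdf > cutoff:
            return k
    return n  # only reached if cutoff >= 1

if __name__ == "__main__":
    for plays in (10, 40, 1000):
        k = min_wins_for_top_bin(plays)
        print(f"{plays} plays: {k} wins ({k / plays:.1%} success rate)")
```

For 10, 40 and 1000 plays this gives 4, 11 and 186 wins, so the success rate needed drops from 40% to under 19% as the number of plays grows.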
So what's going on here? There are a couple pieces of evidence that point to an answer.
First, out of the original 387 tweets, 41 were from people who played Roll Up the Rim at least 40 times. Of these, 18 reported a CDF value of over 95% -- that is, nearly half of them -- and of those, 16 reported a value over 99%. Maybe only those who are doing some serious winning have the patience to report their results on Twitter. Or maybe the longer you play, the more you exaggerate your success rate.
Which brings me to the second bit of evidence: one tweeter reported 19 wins in 24 plays. This outcome is more or less impossible -- the odds of getting at least 19 wins on 24 plays are roughly one in 34 billion (that's billion with a b). Put another way, if the current Canadian population played 24 cups every year in Roll Up the Rim, an event like this would happen about once every 1000 years. So 19/24 has got to be an exaggeration. (The only other logical explanation is a serious problem in evenly distributing the prize cups.)
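Here is a back-of-the-envelope check of those odds, as a sketch. It assumes independent 1-in-6 cups and a Canadian population of roughly 34 million; that population figure is my assumption for the example, as is the helper name `binom_upper_tail`.

```python
from fractions import Fraction
from math import comb

def binom_upper_tail(k, n, p=Fraction(1, 6)):
    """P(X >= k) for X ~ Binomial(n, p): the chance of at least k wins in n plays."""
    q = 1 - p
    return sum(comb(n, i) * p**i * q**(n - i) for i in range(k, n + 1))

POPULATION = 34_000_000  # assumption: rough population of Canada, one 24-cup run each per year

tail = binom_upper_tail(19, 24)
one_in = float(1 / tail)       # "one in this many" odds
years = one_in / POPULATION    # expected years between such runs

if __name__ == "__main__":
    print(f"P(at least 19 wins in 24) = {float(tail):.3e}")
    print(f"about one in {one_in / 1e9:.0f} billion")
    print(f"about once every {years:.0f} years")
```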
If this is right, it isn't really surprising. We exaggerate our stories all the time, maybe even subconsciously, to make them more interesting. And Twitter isn't a court of law -- there's no duty to be completely accurate. But it's perhaps unwise to think you're getting the whole truth from tweets.