Sunday, April 21, 2013

Why did Reddit get the wrong guy? (Or: the Wisdom of Crowds vs. the Madness of Mobs)



Short answer: It didn't. Or more accurately, we'll never know if it did, because we don't really have a way of knowing what "Reddit" thinks, only what some people on Reddit seem to think.

Long answer: OK, let's back up. When the Boston Marathon bombing manhunt began, there was a Reddit forum (subreddit) devoted to finding the bombers. A lot of people had high hopes for this effort. But the main "suspect" to emerge out of Reddit was a guy named Sunil Tripathi, who had no relation whatsoever to the bombings. Meanwhile, in about the same amount of time, police found the real guys, Tamerlan and Dzhokhar Tsarnaev. If you're interested in the details of Reddit's epic fail, see here. (And more here.)

Which brings us to the question, which someone asked me on Twitter: Why, exactly, did Reddit whiff so badly?

In recent decades, we've heard a lot about the "wisdom of crowds". James Surowiecki, who wrote an excellent book on the topic, mentions things like the stock market's identification of the reason for the Challenger disaster, or the ability of a group of non-experts to collectively outguess an expert on questions like "How many jelly beans are in this jar?". More recently, we've learned that prediction markets are more accurate than polls at predicting election outcomes, and in fact that they beat sophisticated "expert" forecasts in many situations. Companies have experimented with internal prediction markets to tap the collective wisdom of their employees. In general, we have come to believe more and more in the ability of large groups of non-experts relative to the ability of small groups of experts.

Should that belief be challenged by the Sunil Tripathi fiasco?

Not necessarily. The key is that the "wisdom of crowds" may work very well in some cases, while in other cases it may give way to the "madness of mobs". We don't know exactly which case is which, but we do have a general idea what sets them apart. Surowiecki summarizes it well in his book, in fact.

Basically, when we have a method for aggregating the information of diverse independent individuals, crowds will perform very well. When the individuals in a crowd coordinate, however, diversity and independence break down, and crowds can pounce on the wrong answer.
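
To make that contrast concrete, here is a toy simulation in Python. It is only a sketch under made-up assumptions: the function names, the `herd_weight` knob, and the noise level are mine, not Surowiecki's or anyone else's. Each person gets a noisy private guess about the number of jelly beans in the jar, and `herd_weight` controls how far they shade that guess toward the average of what earlier people said.

```python
import random
import statistics


def crowd_estimate(true_value, n_people, herd_weight, seed=0):
    """Return the crowd's average guess.

    herd_weight = 0.0: everyone reports their own private guess.
    herd_weight near 1.0: each person mostly repeats the running
    average of the guesses announced before theirs.
    (All parameters are hypothetical, chosen for illustration.)
    """
    rng = random.Random(seed)
    total = 0.0
    count = 0
    for _ in range(n_people):
        # Private information: the truth plus a lot of individual noise.
        private = rng.gauss(true_value, 0.5 * true_value)
        if count == 0:
            guess = private
        else:
            public = total / count  # what the crowd has said so far
            guess = (1 - herd_weight) * private + herd_weight * public
        total += guess
        count += 1
    return total / count


def mean_abs_error(herd_weight, true_value=1000, n_people=200, n_trials=100):
    """Average absolute error of the crowd estimate over many simulated crowds."""
    errors = [
        abs(crowd_estimate(true_value, n_people, herd_weight, seed=s) - true_value)
        for s in range(n_trials)
    ]
    return statistics.fmean(errors)


if __name__ == "__main__":
    print("independent crowd:", round(mean_abs_error(0.0), 1))
    print("herding crowd:    ", round(mean_abs_error(0.9), 1))
```

With independent guessers the individual errors largely cancel in the average. With heavy herding, the final average stays anchored to whatever the first few people happened to say, so the crowd is only about as good as its earliest members.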

We see this in finance experiments. A number of experiments, including classic work by Charles Plott, have established the ability of financial markets to aggregate the private information of diverse participants to arrive at the "right" price. However, other experiments, e.g. by Colin Camerer, have shown that when people pay attention to the actions of others instead of to their own private information, then information can become "trapped", and markets can arrive at the wrong price. There are a number of different theoretical reasons why herd behavior might take over from efficient information aggregation; some of these are "rational" explanations and others are "irrational", but they all rely on individuals having some reason to ignore their private information and focus on what other people do.
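
Here is a stylized sketch of how information can get "trapped". It is a textbook-style sequential-choice toy, not a reproduction of any of the experiments mentioned above, and every name and parameter in it is an assumption of mine. Each agent gets a private signal that points to the right answer 60% of the time, and agents choose one after another. Once earlier choices lean two or more toward one option, later agents follow the crowd and their own signals never reach anyone else.

```python
import random


def run_sequence(n_agents, signal_accuracy, rng):
    """Agents pick the right option (1) or the wrong one (0) in turn.

    Returns (private signals, public choices).
    """
    lead = 0  # informative choices for 1 minus informative choices for 0
    signals, choices = [], []
    for _ in range(n_agents):
        signal = 1 if rng.random() < signal_accuracy else 0
        if lead >= 2:
            choice = 1  # herd on the right answer; own signal ignored
        elif lead <= -2:
            choice = 0  # herd on the wrong answer; own signal ignored
        else:
            choice = signal  # no herd yet: act on private information
            lead += 1 if signal == 1 else -1
        signals.append(signal)
        choices.append(choice)
    return signals, choices


def compare(n_runs=2000, n_agents=51, signal_accuracy=0.6, seed=1):
    """Share of runs that end wrong under herding vs. under pooled signals."""
    rng = random.Random(seed)
    herd_wrong = 0
    pooled_wrong = 0
    for _ in range(n_runs):
        signals, choices = run_sequence(n_agents, signal_accuracy, rng)
        if choices[-1] == 0:  # the last agent went with the wrong option
            herd_wrong += 1
        if sum(signals) * 2 < n_agents:  # a majority vote of signals is wrong
            pooled_wrong += 1
    return herd_wrong / n_runs, pooled_wrong / n_runs


if __name__ == "__main__":
    herd_rate, pooled_rate = compare()
    print("wrong when herding:        ", herd_rate)
    print("wrong when signals pooled: ", pooled_rate)
```

In this toy setup the herd should land on the wrong option noticeably more often than a simple majority vote over everyone's private signals would, even though the two use exactly the same underlying information.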

You can definitely see the herding dynamic at work in the case of the Sunil Tripathi fiasco. A few guys started saying "It was Sunil Tripathi!" And a lot of other people on the subreddit started focusing on that name, and looking for information about Tripathi. The Tripathi idea was a wrong idea that was initially concentrated among a small group of individuals, who pushed that idea loudly and confidently. Meanwhile, a large number of people on the subreddit may have had small, weak pieces of information pointing to the Tsarnaev brothers. But since Reddit had no way of collecting and aggregating these dispersed small pieces of information, that information might have become "trapped", just like in a Colin Camerer experiment.

So let me return to the "short answer" at the beginning of the post. It's not really right to say that "Reddit" picked Sunil Tripathi. Some people on Reddit picked Tripathi, and their voices emerged loud and clear from the chaos, not because most people agreed with them, but because they were the loudest and most strident minority voice. So anyone paying attention to Reddit picked out a few shrill cries of "Tripathi!" rising above the cacophony, and concluded that this was Reddit's consensus verdict. Meanwhile, the attention of other Redditors was turned toward Tripathi, and they spent their time and effort evaluating the Tripathi hypothesis instead of generating alternative hypotheses.

In other words, because it had no way of aggregating information, Reddit became less like a prediction market and more like a lynch mob.

Would Reddit have done better if people could have voted on who they thought did it? I doubt it, because the set of hypotheses was not properly mapped. In an election prediction market, you know the set of candidates. In a jellybean jar contest, you know the set of numbers of jellybeans that might be in the jar (i.e. the non-negative integers). But a "whodunit" poll can't list every human being as a potential culprit; it has to limit the choices to a few popular hypotheses. In Reddit's case, a poll would have included 1. Tripathi, and 2. Someone Else. Not very helpful. A prediction market would have suffered from the same problem.

So is there any hope for crowdsourcing terrorism investigations? I think that there already is such a method: Police tip hotlines. Tips tend to be independent, since people usually don't know who else is calling in a tip. And in a high-profile case like a terrorist attack, people who call in tips tend to be fairly diverse, since so many different kinds of people are paying attention. Finally, police can tabulate the number of similar tips, which is a method of aggregation. So tip hotlines satisfy the loose, general criteria for the "wisdom of crowds" to overcome the "madness of mobs". I think it's no coincidence that in the Boston bombing case, a victim's tip ended up being hugely helpful to the police.
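
The tabulation step is simple enough to sketch in a few lines of Python. This assumes, unrealistically, that each tip has already been boiled down to a short label for the lead it points to; real tips are free text, and matching them up is the hard part. The function name, the threshold, and the sample labels are all hypothetical.

```python
from collections import Counter


def tabulate_tips(tips, min_reports=2):
    """Count how many independent tips point to each lead.

    Returns (lead, count) pairs, most-reported first, keeping only
    leads reported at least `min_reports` times.
    """
    counts = Counter(tip.strip().lower() for tip in tips)
    return [(lead, n) for lead, n in counts.most_common() if n >= min_reports]


if __name__ == "__main__":
    # Made-up placeholder labels, not real tips.
    tips = ["Lead A", "lead a ", "Lead B", "lead C", "LEAD A"]
    print(tabulate_tips(tips))
```

The point of the threshold is that one caller might be mistaken, but several callers who don't know about each other and report the same thing are harder to dismiss.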

Anyway, it's worth pointing out that these criteria for "crowd wisdom" aren't clear-cut. How do you know how independent and diverse a crowd's members are? What is the optimal method of aggregating their beliefs? This is a large, important, open area of research. So have at it, smart people. Just don't pay too much attention to what others in the field are doing...

31 comments:

  1. "including classic work by Charles Plott, have established the ability of financial markets to aggregate the private information of diverse participants to arrive at the "right" price."

    Excel sheet please.

    Replies
    1. Heh. Those experiments are very basic and have been replicated by many a grad student. In fact I'll be doing that soon, in the course of a different experiment.

  2. I think you are overstating the case for the "wisdom of the crowds" somewhat. It's a very old hypothesis (I believe it comes from a mathematician in the 1700s) and has never really been decisively proven by evidence. Recent prediction markets have basically been shown to be about as good as professional forecasts, which isn't surprising since those are probably the forecasts people use to make the bets. But on the other hand they have vulnerabilities that professional forecasts don't--for example, during the last election there was "underdog bias" overestimating the chances of whoever was in second place, as well as pretty clear instances of political tampering, where apparently wealthy individuals invested massive sums trying to influence the market prediction.

    Also, perhaps the problem with Reddit was that it didn't have a way to weight people's opinions--ideally, people with better evidence would be given more weight than people with less evidence. Prediction markets do this naturally since people who are more confident will bet more money.

    Replies
    1. I suggest you take this up with Justin Wolfers!

    2. Anonymous, 7:02 PM

      "I think you are overstating the case for the "wisdom of the crowds" somewhat. It's a very old hypothesis (I believe it comes from a mathematician in the 1700s) and has never really been decisively proven by evidence."

      It comes from Francis Galton around the late 1800s when he was trying to show that people in large groups were idiots (like a primitive statistical Arrow's theorem; Galton may have been inspired by the first "Democracy is inconsistent" paradoxes that various French rationalists had bandied about a couple decades earlier). Anyway, he noticed that a bunch of people's guesses on the weight of a pig or some such thing were, when *averaged*, much closer to the weight of the pig than most individuals' guesses. The "scientifically proved" bit is a little vague, but there are literally thousands of "information/preference" aggregation papers out there.

      "Recent prediction markets have basically been shown to be about as good as professional forecasts, which isn't surprising since those are probably the forecasts people use to make the bets."

      Hanson has written a few papers on why, 2012 election and Silver vs Intrade notwithstanding, this is not generally the case. And finally,

      "Also, perhaps the problem with Reddit was that it didn't have a way to weight people's opinions--ideally, people with better evidence would be given more weight than people with less evidence. Prediction markets do this naturally since people who are more confident will bet more money."

      Do you reddit? Reddit has a system of up and down votes on posts that weights the best replies up to the top. It didn't work!

    3. Re your last point, I don't think crowdsourcing the weights is an ideal weighting system, and it certainly doesn't give more weight to those with better evidence than to your typical zerohedge devotee.

  3. Reddit fails the 'wisdom of the crowds' criteria, since the observations are not independent. This should give the police hotline an advantage. If the crowd starts doing the aggregating then groupthink and all the other biases of professional forecasters will creep in. It seems that for this crowdsourcing to be useful it has to be separate from other information and there has to be some cost to providing information. Maybe it is cost enough to cut off the fame and glory of everyone paying attention to your prediction... feedback loops here are important.

    Replies
    1. This is exactly right. People were not so much aggregating information, as they were independently discovering and becoming convinced by a very limited amount of information that Sunil Tripathi was the bomber. There were *some* people who added more information, but mostly people were just reading it and commenting without doing any real aggregation. Commenting on Reddit is much more like commenting on a blog than like contributing to Wikipedia. Actually, this error was much more like people being duped by the New York Post into thinking it was somebody else (which happened!) than assembling bad information en masse.

      See my comment below about what exact information people were looking at on Reddit that convinced them it was Sunil. It is not a huge volume of information! Certainly not capital-C Crowd-worthy.

  4. Does a tip hotline really count as "wisdom of the crowds"? Tips are really aggregated; an individual tip is either useful or not. They aren't averaged out in any way.

    Replies
    1. D'oh. I meant "Tips aren't really aggregated".

    2. I'd count a tip hotline as "wisdom of the crowds". After all it collects information from everyone willing to share.

      The interesting question in any given case is whether there is anyone with relevant information.

    3. But collecting information isn't enough to make it "wisdom of the crowds".

      In the stock market example, everyone's decision to buy or sell counts toward the final price. In the jelly bean example, you average everyone's guess.

      In the tip hotline, you dismiss obviously stupid tips, follow up on others that don't pan out, and hope that one (or at most a few) tips will give you the information you need. That's like pointing to the earnings of Buffett or the winner of the jellybean contest and calling that "the wisdom of crowds".

  5. Reddit is a market design.

    For killing spam and making the latest funny cat pictures rise to the top, it works pretty well.

    I don't think this is necessarily a problem that can't be crowd-sourced, but you need a different market design to reliably scan and vet a massive number of photos for solid evidence. As you point out, part of the 'wisdom of crowds' is to have a large number of independent evaluations/observations. Also, something a little more Wikipedia-ish to find potential avenues and gradually tease out as much as can be known about them, without descending into witch-hunts. And somehow have to get people to not see what they want to see, or promote the reliable people and bury the trolls and witch-hunters.

  6. I think the key determinant for crowd predictions is the error amount.

    For a "guess the jellybeans" type of question, the error distribution is simple -- people will not guess wildly wrong amounts. For a "find the bomber" question, the error level is tougher. Ideally, the crowd would have found every person that had a backpack and then ranked them by probability of guilt.

    For crowd guesses of other events, the error function can get very complex.

  7. I read some of the reddit threads. The strangest thing to me was that every new photo not from the FBI was immediately declared by several people to be a fake. None of them actually were fakes, of course.

  8. Anonymous, 7:13 AM

    Did they have any pictures that had the right guys in them? Lacking that, it would be hard to get them... They just worked with what they had.

  9. As Asimov "explained" in his Foundation series, psychohistory works only if people are unaware of its existence. The same goes for market wisdom: it stops working once people think it works.

  10. Problem solved once the geo position and timestamps of photos become certified and easily accessible.

    I'm sure there are other unintended consequences, but that's technology.

  11. This is under-stating Reddit's case. The evidence they had was actually not that bad. For example:

    1) Sunil looks like Dzhokhar in the photo released by the FBI. Not all photos, certainly not the more recent ones, but in THAT one they look similar.

    2) Shortly after Sunil disappeared from Brown University there were several reports of explosions in Hanover, MA (not far from Brown). They were IEDs (seriously), and several of them were left behind.

    3) Sunil's family's Facebook page about his disappearance disappeared from Facebook, shortly after the Tsarnaev brothers' high-speed chase began.

    4) Several sources claimed they heard Sunil on the Boston PD police scanner. Many people on social media were listening to the scanner at once, during the car chase, shootout, and lockdown. I was one of those people, though I did not hear his name. I did however follow the many people on Twitter who were transcribing the scanner into Twitter (this makes for fascinating reading btw, see for example @BlogsOfWar's tweets from that evening.) The claim, which went unrefuted by anyone listening to the scanner, was that Sunil Tripathi was mentioned on the scanner along with another (unknown, possibly non-existent) person, Mike Mulugetta, whose name was (so they said) spelled out on the scanner letter by letter.

    5) The FBI has been investigating Sunil's disappearance since mid-March.

    ...now, is any of that proof? Of course not. But it was all sufficiently suspicious that it didn't seem crazy to think it was Sunil. I think #1 by itself was enough to convince a lot of people it was him, and #2 and #5 sealed the deal. #3 and #4 were viewed as confirmation of what many had already suspected.

    So maybe the term "Madness of Crowds" is a little unkind to Reddit users. It wasn't "Mad," it was just "Wrong." I think plenty of people come to wrong conclusions on the basis of much flimsier evidence (see for example some people's reaction to Reinhart-Rogoff!)

    Replies
    1. Did you read the Atlantic piece (http://www.theatlantic.com/technology/archive/2013/04/it-wasnt-sunil-tripathi-the-anatomy-of-a-misinformation-disaster/275155/) ?

      It's absolutely hilarious.

      1. Mike Mulugetta was never mentioned.
      2. A "Mulugetta" (first name unknown) was mentioned, but not as a suspect.
      3. Sunil Tripathi was never mentioned.

      One guy decided to "scoop" the world by falsely claiming that the suspects were IDed. It was a fabrication.

    2. Right. At the time people didn't know that. And again, by that point it was considered by most people already discussing whether it was Sunil to be confirmation of something they already thought they had good evidence of. Just the photos and the explosions around Hanover were enough for many users on Reddit.

  12. PS here is an article from the Boston Globe, dated March 27, about those IEDs in Hanover: http://www.boston.com/metrodesk/2013/03/27/hanover-authorities-seeking-identify-person-who-detonated-explosives-injuries-property-damage-reported/ZGSwvBKPCRMkRkrQpBVFcI/story.html

  13. Blue Aurora, 6:43 AM

    Yes, there was a lot of speculation on who were the culprits behind the 2013 Boston Marathon bombings...but to comment on part of the title of your post - "the Wisdom of Crowds vs. the Madness of Mobs"...couldn't it be argued that the two are really two sides of the same coin? Of course, I'm no expert on crowd psychology, and I presume that there are other researchers working on this matter, but what are the factors which cause the coin to land on one side or the other? (Badly phrased I know, but I hope I got the message across.)

    On another note, Noah Smith...do you have an SSRN account, or any interesting papers coming down the pipeline?

  14. Very nice post! It is almost impossible imho to glean independent (and perhaps self-cancelling) noise in security market predictions, since pretty much "everybody" consumes the same sources of information - Bloomberg, Reuters, FT, WSJ and CNBC (which spews its own awe-inspiring quantities of rubbish).

    In this context, I think econ/finance researchers should make a clear distinction between predictive models and analysis. The former can be a great deal more difficult and likely futile, as opposed to the latter. Analytical models need not have much mathematics at all (read Keynes' letters, e.g. the ones to FDR, elegantly and persuasively stating the issues facing the U.S. economy). Great analysis of events can be predictive and prescriptive for future policy, as they would have almost surely been, if Washington and Wall Street had listened to Krugman, Stiglitz, Larry Summers, Christy Romer, Brad DeLong, Martin Wolf and Warren Buffett, to name a few notables. But no. We listened to Geithner and Obama in the U.S. (who I believe is a closet liquidationist) and Trichet, Merkel and Weidmann in Europe.

  15. Anonymous, 9:01 AM

    What is a 'fair trial' in the USA?
    Perhaps more to the point - do some people not deserve a 'fair trial' because everyone (save one distraught father, perhaps, and maybe the odd conspiracy theorist) already 'knows' they are guilty?
    I write this not to suggest that an 'innocent' person is in custody and another dead. I write it in the spirit of philosophical enquiry regarding the US justice system.
    Here in the UK you could not get away with saying, as you have, "Meanwhile, in about the same amount of time, police found the real guys, Tamerlan and Dzhokhar Tsarnaev" without being seen to have prejudiced the chances of any sort of fair trial, and possibly being prosecuted for having done so.
    So - let me ask again: do some people deserve a fair trial in the USA, while others do not, because everyone knows they are guilty?
    Where is your protection against real conspiracies if the whole country including its intellectuals is allowed to become a lynch mob?
    Just asking.

  16. Anonymous,

    Thank you for your concern regarding our protection against those real conspiracies. I know I felt a chill run up my spine when I heard these two brothers referred to as "the real guys". No doubt all prospective jurors read an economics blog that deals with Goodhart's law and the limitations of the DSGE model, and now the chances of a fair trial in the case are basically nil. (Allegedly, I should add.) Yes, prosecuting someone for writing such a sentence is not in any way a totally overblown response irreconcilable with any reasonable system of justice. And more importantly, we now have to worry about all those false flag operations the US government will perform since even our intellectuals are just lapdogs willing to pronounce guilt based on no evidence whatsoever. Just like they did with OJ. (He was found guilty of those murders, right?)

    Also, thanks for throwing in the "Just asking" part. And to think for a second there I thought you were being all morally sanctimonious!

  17. Anonymous, 12:09 PM

    @Steve
    Of course I was being morally sanctimonious. I'm a Brit.
    But that is quite beside the point. You have wriggled round the issue with dry, scathing and amusing sarcasm but not actually addressed it.
    What is the meaning of 'fair trial' in the USA if people can name the guilty or 'guilty' parties in the meedja beforehand? And talking again of missing the point, believe me, Noahpinion is not all that specialist. He's widely quoted and tweeted. Why, even the great Krugman (bbhn)...

  18. If media condemnation guaranteed a guilty verdict, then OJ and Casey Anthony would've been found guilty. I really don't think the use of the word "alleged" in the media is the only thing that makes a jury listen to evidence presented at a trial. They're going to do that no matter what bloggers and reporters think of the case.

    Replies
    1. Anonymous, 10:16 PM

      Oh Steve...Talking of sanctimonious, that, now made for the second time, is a poor lawyer's argument. A point-scoring argument. A mere sales pitch. More to the point, it's a non-sequitur. Just doesn't address the issue. Of course if you are rich enough and/or well-connected enough you have at least a fighting chance of 'getting off' when there is real and good evidence against you, not to speak of public opinion, in many a country's jury-based justice system. It proves nothing about the chances of a fair trial outside of such examples.
      I was making a very minor philosophical point, touching on epistemology. And now we've turned it into a fight. I'm off, matey. You win?

  19. Agreed, but it's not really fair to use this as a case of the wisdom of experts vs. the wisdom of crowds, when the information available to each was so radically different here, which I doubt was the case for the examples you cited.

  20. Anonymous, 11:39 AM

    Very good point. Does anyone want a pretty good solution to this problem?

    I have one. It is coming out very soon. Within about a month. The first announcement will be on DeLong's blog. If people don't like it fine. But I do hope they give it a try. There's a lot of wisdom (I believe) that went into designing it. Nothing hidden, nothing fancy, no AI bot to mess with our minds. Just a much better format.