## Thursday, November 01, 2012

### Why predict percentages?

"I'll bet you \$20 that there's a 70% chance that Obama will win the election."

That's a bet nobody will ever collect on, because it's impossible to verify a percentage chance of a one-off event. So why do forecasters like Nate Silver - and any bookie or oddsmaker - say that there's a 70% chance of Obama winning? Why don't they just make an up-or-down prediction?

The answer is: Those percentages give you information to the extent that you believe in the forecaster. If you believe that Nate Silver's model is the best available forecast of the election results, then you believe that the odds he gives are the "fair" odds. Knowing the fair odds will help you hedge properly against the chance that Obama or Romney will be elected president, say for example if you have a business whose livelihood depends on policy. It might also help you make a buck on InTrade, especially if you believe that things like market manipulation can make those prediction markets temporarily inefficient.
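To make the hedging idea concrete, here's a toy calculation (the dollar figures and the 70% probability are all invented for illustration): a business that loses money if Obama wins can buy binary contracts at the fair price and lock in the same outcome either way.

```python
# Toy hedge at fair odds; every number here is hypothetical.
loss_if_obama_wins = 10_000   # dollars the business loses on an Obama win
fair_prob = 0.70              # forecaster's "fair" probability of that outcome

# A binary contract pays $1 if Obama wins; at fair odds it costs $0.70.
contracts_needed = loss_if_obama_wins         # each contract pays back $1
hedge_cost = contracts_needed * fair_prob     # about $7,000

# Net position under either outcome is the same: just the hedge cost.
net_if_obama = -loss_if_obama_wins + contracts_needed - hedge_cost
net_if_romney = 0 - hedge_cost
print(net_if_obama, net_if_romney)
```

Either way the business is out roughly $7,000, which is the point of a hedge: knowing the fair odds tells you what certainty costs.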

But will those odds tell you whether to believe in a forecaster? Surprisingly, the answer is "maybe". The main way to ascertain how good a forecaster is is to observe a repeated sample - just watch the forecaster make 100 forecasts (of who will win, not of what the odds are!), and observe how often (s)he is wrong. But that isn't the only way. If you look at the forecaster's odds and find that they move in predictable ways, then you know that the forecaster could have done a better job. After all, why predict a 70% chance today when you have good information telling you that it will change to a 78% chance tomorrow? Just predict a 78% chance today!
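That "predictable movement" test can be sketched numerically. A sound probability forecast should be roughly a martingale: today's number is the best guess of tomorrow's, so day-to-day changes should not be predictable from today's level. A minimal check on a made-up forecast path:

```python
# A sound probability forecast behaves like a martingale: tomorrow's change
# should be unpredictable from today's level. The forecast path is made up.
p = [0.60, 0.62, 0.65, 0.64, 0.68, 0.70, 0.73, 0.72, 0.76, 0.78]
changes = [b - a for a, b in zip(p, p[1:])]   # day-over-day moves
levels = p[:-1]                                # today's forecast

# OLS slope of change on level; a slope clearly different from zero means
# the forecaster is leaving usable information on the table.
n = len(changes)
mean_x = sum(levels) / n
mean_y = sum(changes) / n
slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(levels, changes)) / \
        sum((x - mean_x) ** 2 for x in levels)
print(round(slope, 3))
```

In practice you would want many forecast paths and a proper standard error, but the idea is the same.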

So the odds can be useful in evaluating forecasters, as well as in making use of forecasts.

1. Look at something like KenPom, the greatest thing in the world. Pomeroy's model, like Silver's, gives probabilistic predictions of college basketball games. Now, there are a lot more basketball games in any year than there are elections, but KenPom checks his calibration by seeing whether his 50-55% favorites win 50-55% of their games, whether his 70% favorites win 70% of the time, and so on. If we're patient, the way to test Silver is to see if he's hitting his percentages.
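That calibration check can be sketched directly: bucket the pre-game win probabilities for the favorite and compare each bucket's range to the realized win rate (the forecasts and outcomes below are invented):

```python
# Calibration check in the KenPom style: bucket the pre-game forecasts and
# compare each bucket's range to the favorite's realized win rate.
# Forecasts and outcomes are invented for illustration.
forecasts = [0.52, 0.54, 0.71, 0.69, 0.70, 0.53, 0.72, 0.51, 0.55, 0.68]
outcomes  = [1,    0,    1,    1,    0,    1,    1,    0,    1,    1]  # 1 = favorite won

buckets = {"50-55%": [], "65-75%": []}
for p, won in zip(forecasts, outcomes):
    if 0.50 <= p <= 0.55:
        buckets["50-55%"].append(won)
    elif 0.65 <= p <= 0.75:
        buckets["65-75%"].append(won)

for label, wins in buckets.items():
    # Well-calibrated if each realized rate sits near its bucket's range.
    print(label, sum(wins) / len(wins))
```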

2. Starting to look more and more like I'm gonna owe you and Chris beers on Nov. 7th. But I prefer to think of our bet as properly hedging against even a small chance of crushing depression on that day.

3. Silver is making predictions for individual states so you can check the quality of his prediction by comparing outcomes with his predictions for the individual states. The electoral college prediction is just an (appropriate) aggregate of the predictions for the individual states.
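That aggregation step can be sketched with a Monte Carlo simulation. For simplicity this toy version treats states as independent, which Silver's actual model does not; the state probabilities and electoral-vote counts below are invented:

```python
import random

random.seed(0)

# Hypothetical win probabilities and electoral votes for a few swing states;
# the "safe" states are collapsed into fixed blocs. All numbers are invented,
# and states are treated as independent for simplicity.
safe_obama_ev, safe_romney_ev = 237, 191
swing = {  # state: (P(Obama wins), electoral votes)
    "OH": (0.75, 18), "FL": (0.50, 29), "VA": (0.60, 13),
    "CO": (0.65, 9),  "NC": (0.35, 15), "IA": (0.70, 6),
    "NH": (0.70, 4),  "WI": (0.80, 10), "NV": (0.80, 6),
}

trials = 100_000
obama_wins = 0
for _ in range(trials):
    ev = safe_obama_ev + sum(v for p, v in swing.values() if random.random() < p)
    if ev >= 270:
        obama_wins += 1
print(obama_wins / trials)  # simulated probability of reaching 270
```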

4. I think bookies' odds are actually a fairly direct outcome of their hedging activities. (Here's a write-up, though I seem to recall having seen a rather better one elsewhere: http://e-sportbets.org/how-does-your-bookmaker-calculate-its-odds/ ) That has a lot in common with dynamic hedging strategies used in securities trading.

So what Silver is trying to do is not really the same as what a bookie does. He's more concerned with balancing actual voter choices (as told to pollsters) as accurately as possible rather than balancing bettors' *predictions* of voters' choices.

5. Noah, I was very sad to see that your previous post on entrepreneurship did not have any Easter Eggs in its photo accompanying the post.

Good to see you are back on your game here!

6. Mr. Smith:

I remember reading somewhere that Silver got 49 out of 50 states right in 2008. My admittedly arithmetically-challenged brain tells me that if that were indicative of the long-term accuracy of his model, the model must have made an unbelievable number of in-the-bag calls and made them stick. Here's an illustrative example that I made up:

His model calls 45 states 100%-0% and the remaining 5 states 80%-20% for one candidate or other, which brings the overall hypothetical accuracy of the calls to 98%, or what the model's actual record turned out to be.
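The arithmetic of that hypothetical checks out:

```python
# Expected number of correct calls in the hypothetical record above.
sure_states = 45 * 1.00    # 45 states called with certainty
close_states = 5 * 0.80    # 5 states called as 80/20 favorites
expected_correct = sure_states + close_states   # 49 of 50
accuracy = expected_correct / 50
print(expected_correct, accuracy)
```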

Is the model so powerful, are states that easy to call (yes I realize that many states are deeply blue or red), are the results anomalous, or is my reasoning off by a mile (or, as we call it here, 1.6 km)?

1. I actually don't understand what you're asking...

7. @Jun Okumura
Results in each state are strongly correlated, so if there are a number of states where one candidate is an 80% favorite, then there is a pretty good chance that they will all go for the favored candidate. This also implies that most elections will be predicted very well, but when the model misses, it will miss by a lot. Obviously luck is needed for states that are true toss-ups on election day too. A better metric for judging performance than the number of states called correctly would be the product of the probabilities assigned to the outcome of each state.
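The "product of the probabilities" metric is usually computed as a sum of logs (the log score) so it doesn't underflow; a sketch comparing two hypothetical models:

```python
import math

# Each entry: the probability a model assigned to the outcome that actually
# happened, one per state. Both models' numbers are invented.
model_a = [0.95, 0.90, 0.80, 0.85, 0.70]   # confident and right
model_b = [0.60, 0.55, 0.65, 0.50, 0.55]   # hedges everything

def log_score(probs):
    # Sum of log-probabilities: the log of the product of the probabilities
    # assigned to the realized outcome of each state.
    return sum(math.log(p) for p in probs)

print(log_score(model_a) > log_score(model_b))  # A's product is larger
```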

1. "Results in each state are strongly correlated"

I thought about that, but then I got to thinking, 80-20 for candidate A over candidate B can't be strongly correlated with 80-20 for candidate B over candidate A, and that was more or less where I gave up.

2. That is definitely true, but if you assume uniform swing - which is a pretty decent approximation - then the model will predict all or almost all of the states correctly if the polling is good. In any case, the model doesn't have to successfully predict a bunch of independent events; it just needs to predict two sets of events [each candidate winning all their 80/20 states].

3. Anonymous 12:06 AM

Nate Silver's model actually takes into account correlations between states. I'm not sure what method he uses to do this, but I'm guessing he relies on some mix of historical correlations among poll results, demographic data, etc.
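Whatever Silver's exact method, the simplest way to induce that kind of correlation is a shared national polling error. Here is a toy version (the states, margins, and error sizes are all invented, and this is not Silver's actual model):

```python
import random

random.seed(1)

# Toy model of correlated state outcomes: every state's margin moves with
# the same national shock, so states tend to miss in the same direction.
# Margins are hypothetical Obama leads in percentage points.
state_means = {"OH": 2.0, "VA": 1.0, "FL": 0.0, "NC": -1.5}

def simulate():
    national_error = random.gauss(0, 2.0)      # shared across all states
    return {state: mean + national_error + random.gauss(0, 1.5) > 0
            for state, mean in state_means.items()}

trials = 50_000
oh = va = both = 0
for _ in range(trials):
    r = simulate()
    oh += r["OH"]
    va += r["VA"]
    both += r["OH"] and r["VA"]

# Because of the shared error, OH and VA fall together far more often than
# independence (the product of the marginal rates) would predict.
print(both / trials, (oh / trials) * (va / trials))
```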

4. I now have a better idea of what's going on. I hadn't realized that Silver kept updating his odds up to the last minute. If he did that and his performance hadn't improved, then that would have meant there was something very wrong with his model.

But this raises for me another question. If I were using his model for practical purposes, I'd need to know its performance over time. For example, suppose it's Nov. 5 and I need to figure out where to concentrate my resources to maximize their impact on the electoral outcome. One of the variables that I'd consider would be the odds in the various locations that I'd be considering, no? I'm sure Nov. 6 involves different considerations since my resources would become less mobile as it nears Nov. 7, but again, odds would figure into my decisions. And so on.

The media focus on Silver's perfect or near-perfect score because they have a different concern: maintaining the interest of viewers well after the polls have closed, when there is nothing, really, that voters and the campaigns can do but wait for the tally, and all talk about surges, firewalls and the like is illusory at best.

8. Noah: there must be a better way, that works even if the forecaster only makes one probabilistic prediction per event. My econometrics is cr*p, but wouldn't something like this work?

Suppose two weather forecasters give a probability of rain tomorrow as P(t). Let R(t) be a dummy for rain tomorrow, where R(t)=0 if it rains and R(t)=0 if it doesn't rain. And we have a long sample of R(t) and P(t) for both forecasters.

Estimate a regression R(t) = a + b.P(t) for each forecaster

The forecaster who gets a bigger b and smaller a is a better forecaster??? Or something like that???

Econometricians must have solved this problem (I hope). That's why God invented econometricians, to solve problems like this, that people like me puzzle over but can't figure out. If econometricians can't figure this one out for me, why do they exist?

1. I mean: R(t)=1 if it rains and R(t)=0 if it doesn't rain

2. You can only ask that question if you have a good answer to the question of why you exist.

3. That'll turn out to be the same thing, I'm pretty sure.
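For what it's worth, the regression proposed above (using the corrected dummy, R(t)=1 for rain) is easy to run; a perfectly calibrated forecaster should come out near a = 0 and b = 1. The forecasts and outcomes below are invented:

```python
# Regress rain outcomes on forecast probabilities: R(t) = a + b*P(t).
# A well-calibrated forecaster should come out near a = 0, b = 1.
# Forecasts and outcomes are invented for illustration.
P = [0.1, 0.2, 0.3, 0.5, 0.6, 0.7, 0.8, 0.9, 0.4, 0.2]
R = [0,   0,   1,   0,   1,   1,   1,   1,   0,   0]   # 1 = it rained

n = len(P)
mean_p = sum(P) / n
mean_r = sum(R) / n
b = sum((p - mean_p) * (r - mean_r) for p, r in zip(P, R)) / \
    sum((p - mean_p) ** 2 for p in P)
a = mean_r - b * mean_p
print(round(a, 2), round(b, 2))
```

With only ten made-up observations the estimates are noisy, but with a long sample of R(t) and P(t) the (a, b) pair is exactly the kind of one-number-per-forecast evaluation being asked for.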

Ben Johannson 4:34 PM

Do Silver or other statisticians in the public-opinion forecasting realm ever use hindcasting as a method of tuning/testing the accuracy of their models? I certainly never hear about it if they do.

9. s jay 7:49 AM

Data from Google's internal prediction market and its relevance to probabilities...

10. s jay 8:02 AM

As long as you have an Intrade-type market for predictions, you can measure the accuracy of any forecaster by using their predictions to make money (long Silver, short Intrade, or vice versa). If Silver is systematically better than Intrade, then over time either Intrade will converge on Silver or it will be possible to make an arbitrarily large amount of money without taking commensurate risk - something markets tend to prevent (magically!?), because otherwise the market would be destroyed (irreparably destabilized).
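The "long Silver, short Intrade" trade only pays if the model probability and the market price diverge; the expected edge per $1 contract, with hypothetical numbers:

```python
# Hypothetical divergence between a model probability and a market price.
model_prob = 0.80    # Silver-style model: P(Obama wins)
market_price = 0.65  # Intrade-style contract price (pays $1 on a win)

# Expected profit per $1 contract if the model probability is correct:
# win (1 - price) with probability p, lose the price with probability 1 - p.
edge = model_prob * (1 - market_price) - (1 - model_prob) * market_price
print(round(edge, 2))  # 0.15 per contract, before fees and risk
```

The edge collapses to `model_prob - market_price`, which is why systematic outperformance forces the market price toward the better forecast.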

11. Alan Goldhammer 8:05 AM

Silver has been pretty transparent about his methodology, and his new book is a pretty good read even for someone who already has a solid grasp of probability and finite mathematics. He also has a good data set of past predictions that goes beyond just the 2008 presidential race, including the Senate races and the 2010 midterm elections, when he called the Republican sweep of the House. What he does is slightly different from what a bookmaker does. The bookmaker sets preliminary odds of an occurrence and then adjusts those odds depending on the wagers coming in, so that he/she can ensure that a profit will be made. Silver is looking at data streams and is agnostic about the outcome, since there is no financial impact on his decision (unless Joe Scarborough takes him up on the charity bet).

As we know, his probability percentage is somewhat complicated, as it necessarily relies on the combinations/permutations of ways to get to 270 electoral votes. Once all of the certain state outcomes are set aside, one is still left with 7-10 states whose outcomes will determine the result. Today Silver notes that Obama has just over an 80% chance of winning. One can judge him on this prediction as well as on the individual predictions for each state. He could get the overall election call correct and still miss the results in some states.

None of this is rocket science and a lot of us have been doing this same thing in looking at baseball statistics and outcomes where there is a huge data set. It's safe to say that new metric models for baseball performance have been more successful than economic models in recent years.

12. The idea here isn't quite right, since Noah is operating from a win/lose perspective, while the percentages in Noah's model don't just tell you who will win, but how they will win. As Silver has pointed out, if Obama won California by 20 percent less than Silver has him down for, and Romney won Colorado by 5 percent more than Silver has him down for, he'd be less wrong about Colorado than about California.

1. That's right. I didn't mention predictions for vote shares, only for win/lose events.

13. I want a Minimum Squared Error function for Nate Silver.

14. Anonymous 10:45 PM

I'm gonna get picky on you here, Noah.

You mean probabilities, not percentages. E.g. "70% _chance_."

Percentages are just a different notation for ratios. Ratios can measure lots of different things besides probabilities.

15. theta 5:05 PM

Probabilities imply fair odds for betting on outcomes. Traders express their views in a similar but better way, in the form of a two-way market - in this case, for example, 69/71, meaning I am willing to "buy" the outcome at odds of 69% and "sell" it at 71%.
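Translating that two-way quote into probabilities and decimal odds (a standard conversion, using the commenter's 69/71 market):

```python
# A 69/71 two-way market on a binary outcome: the trader buys the $1
# contract at 69 cents and sells it at 71 cents.
bid, ask = 0.69, 0.71

mid = (bid + ask) / 2          # implied "fair" probability: 0.70
spread = ask - bid             # the trader's cushion for being wrong: 0.02

# Equivalent decimal odds (total payout per unit staked) when you buy
# the contract from the trader at the ask:
decimal_odds_buy = 1 / ask
print(mid, round(spread, 2), round(decimal_odds_buy, 3))
```

The midpoint recovers the single-number probability a forecaster would quote, while the spread encodes how much uncertainty the trader has about that number.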