Noahpinion: Why predict percentages?

Thursday, November 01, 2012

Why predict percentages?

"I'll bet you $20 that there's a 70% chance that Obama will win the election."

That's a bet nobody will ever collect on, because it's impossible to verify a percentage chance of something. So why do forecasters like Nate Silver - and any bookie or oddsmaker - say that there's a 70% chance of Obama winning? Why don't they just make an up-or-down prediction?

The answer is: Those percentages give you information to the extent that you believe in the forecaster. If you believe that Nate Silver's model is the best available forecast of the election results, then you believe that the odds he gives are the "fair" odds. Knowing the fair odds will help you hedge properly against the chance that Obama or Romney will be elected president, for example if you run a business whose fortunes depend on policy. It might also help you make a buck on InTrade, especially if you believe that things like market manipulation can make those prediction markets temporarily inefficient.

But will those odds tell you whether to believe in a forecaster? Surprisingly, the answer is "maybe". The main way to judge a forecaster's quality is to observe a repeated sample: just watch the forecaster make 100 forecasts (of who will win, not of what the odds are!), and observe how often (s)he is wrong. But it isn't the only way. If you look at the forecaster's odds and find that they move in predictable ways, then you know that the forecaster could have done a better job. After all, why predict a 70% chance today when you have good information telling you that it will change to a 78% chance tomorrow? Just predict a 78% chance today!
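
One rough way to look for that kind of predictability is to ask whether today's change in the odds tends to follow yesterday's change. The sketch below is only an illustration: the function name `lagged_change_correlation` and the daily forecast series are made up, and a single correlation is a crude check, not a full evaluation of a forecaster.

```python
def lagged_change_correlation(probs):
    """Correlation between each change in the forecast and the change before it."""
    changes = [b - a for a, b in zip(probs, probs[1:])]
    x, y = changes[:-1], changes[1:]
    n = len(x)
    if n < 2:
        raise ValueError("need at least four forecasts")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    var_x = sum((xi - mean_x) ** 2 for xi in x)
    var_y = sum((yi - mean_y) ** 2 for yi in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("changes are constant; correlation is undefined")
    return cov / (var_x * var_y) ** 0.5


# Made-up daily forecasts that drift upward at an accelerating pace.
drifting = [0.60, 0.62, 0.65, 0.69, 0.74, 0.80]
r = lagged_change_correlation(drifting)
print(f"correlation of successive changes: {r:.3f}")
```

A value near zero is what you would hope to see. A value near +1, as with this invented series, means each day's move was largely predictable from the move before it. A perfectly steady drift has constant changes, so this particular check cannot score it.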

So the odds can be useful in evaluating forecasters, as well as in making use of forecasts.

25 comments:

Unknown 1:03 PM
Look at something like the greatest thing in the world, KenPom. Pomeroy's model, like Silver's, gives probabilistic predictions of college basketball games. Now, there are a lot more basketball games in any year than there are elections, but KenPom checks by seeing if his 50-55% favorites are winning 50-55% of their games, if the 70% favorites win 70% of the time, and so on. If we're patient, the way to test Silver is to see if he's hitting his percentages.
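
The check described here is usually called a calibration check. A minimal sketch, assuming you have a list of stated win probabilities and a matching list of 0/1 outcomes; the function name `calibration_table`, the 10-point bins, and the data are all made up for illustration and are not taken from KenPom or Silver.

```python
from collections import defaultdict


def calibration_table(forecasts, outcomes, bin_pct=10):
    """Bucket forecasts by percentage and compare each bucket's average forecast with its actual win rate."""
    n_bins = 100 // bin_pct
    bins = defaultdict(list)
    for p, won in zip(forecasts, outcomes):
        pct = round(p * 100)  # integer percent avoids float-division surprises
        idx = min(pct // bin_pct, n_bins - 1)  # a 100% forecast goes in the top bucket
        bins[idx * bin_pct].append((p, won))
    table = {}
    for lower in sorted(bins):
        rows = bins[lower]
        mean_forecast = sum(p for p, _ in rows) / len(rows)
        win_rate = sum(won for _, won in rows) / len(rows)
        table[lower] = (mean_forecast, win_rate, len(rows))
    return table


# Invented data: ten 70% favorites (7 won) and ten 90% favorites (9 won).
forecasts = [0.7] * 10 + [0.9] * 10
outcomes = [1] * 7 + [0] * 3 + [1] * 9 + [0] * 1
table = calibration_table(forecasts, outcomes)
for lower, (mean_forecast, win_rate, n) in table.items():
    print(f"bucket starting at {lower}%: forecast {mean_forecast:.2f}, won {win_rate:.2f}, n={n}")
```

A well-calibrated forecaster shows a win rate close to the average forecast in every bucket, given enough games per bucket.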

Jolly Green 2:08 PM
Starting to look more and more like I'm gonna owe you and Chris beers on Nov. 7th. But I prefer to think of our bet as properly hedging against even a small chance of crushing depression on that day.

Absalon 3:31 PM
Silver is making predictions for individual states so you can check the quality of his prediction by comparing outcomes with his predictions for the individual states. The electoral college prediction is just an (appropriate) aggregate of the predictions for the individual states.

Seth 4:49 PM
I think bookies' odds are actually a fairly direct outcome of their hedging activities. (Here's a write-up, though I seem to recall having seen a rather better one elsewhere: http://e-sportbets.org/how-does-your-bookmaker-calculate-its-odds/) That has a lot in common with dynamic hedging strategies used in securities trading.

So what Silver is trying to do is not really the same as what a bookie does. He's more concerned with balancing actual voter choices (as told to pollsters) as accurately as possible rather than balancing bettors' *predictions* of voters' choices.

Simon 10:29 PM
Noah, I was very sad to see that your previous post on entrepreneurship did not have any Easter Eggs in its photo accompanying the post.

Good to see you are back on your game here!

Jun Okumura 2:55 AM
Mr. Smith:

I remember reading somewhere that Silver got 49 out of 50 states right in 2008. My admittedly arithmetically-challenged brain tells me that if that were indicative of the long-term accuracy of his model, the model must have made an unbelievable number of in-the-bag calls and made them stick. Here's an illustrative example that I made up:

His model calls 45 states 100%-0% and the remaining 5 states 80%-20% for one candidate or the other, which brings the overall hypothetical accuracy of the calls to 98%, the same as the model's actual record turned out to be.
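
The arithmetic behind that 98% can be spelled out in a few lines. This just restates the made-up example above (45 calls at 100%, 5 calls at 80%); the variable names are illustrative.

```python
# The hypothetical from the comment: 45 states called at 100%, 5 states at 80%.
call_probabilities = [1.0] * 45 + [0.8] * 5

# If each call is right with its stated probability, the expected number of
# correct calls is simply the sum of those probabilities.
expected_correct = sum(call_probabilities)
expected_accuracy = expected_correct / len(call_probabilities)

print(f"expected correct calls: {expected_correct:.1f} of {len(call_probabilities)}")
print(f"expected accuracy: {expected_accuracy:.0%}")
```

So 45 + 5 × 0.8 = 49 expected correct calls out of 50, which is the 98% in the example.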

Is the model so powerful, are states that easy to call (yes I realize that many states are deeply blue or red), are the results anomalous, or is my reasoning off by a mile (or, as we call it here, 1.6 km)?

maxhtimmons 3:56 AM
@Jun Okumura
Results in each state are strongly correlated, so if there are a number of states where one candidate is an 80% favorite, then there is a pretty good chance that they will all go for the favored candidate. This also implies that most elections will be predicted very well, but when the model misses it will miss by a lot. Obviously luck is needed for states that are true toss-ups on election day too. A better metric for judging performance than the number of states called correctly would be the product of the probabilities assigned to the outcome of each state.
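
The product-of-probabilities metric is the likelihood of the observed results under the forecast. A minimal sketch follows, summing logs because a product of many probabilities gets very small; the function name `log_likelihood`, the two forecasters, and their numbers are invented for illustration.

```python
import math


def log_likelihood(forecasts, outcomes):
    """Sum of the logs of the probabilities assigned to what actually happened."""
    total = 0.0
    for p, happened in zip(forecasts, outcomes):
        assigned = p if happened else 1.0 - p
        if assigned == 0.0:
            return float("-inf")  # a "0% chance" event happened
        total += math.log(assigned)
    return total


# Three made-up states; 1 means the candidate won the state, 0 means he lost it.
outcomes = [1, 1, 0]
forecaster_a = [0.9, 0.8, 0.3]  # assigned 0.9, 0.8 and 0.7 to what happened
forecaster_b = [0.6, 0.6, 0.5]  # assigned 0.6, 0.6 and 0.5 to what happened

ll_a = log_likelihood(forecaster_a, outcomes)
ll_b = log_likelihood(forecaster_b, outcomes)
print(f"A: product = {math.exp(ll_a):.3f}, B: product = {math.exp(ll_b):.3f}")
```

The higher score wins, and a forecaster who calls something 100% and gets it wrong scores minus infinity. Multiplying state probabilities this way treats each state as a separate bet, so it does not by itself account for the correlation between states described above.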

Nick Rowe 5:21 AM
Noah: there must be a better way, that works even if the forecaster only makes one probabilistic prediction per event. My econometrics is cr*p, but wouldn't something like this work?

Suppose two weather forecasters give a probability of rain tomorrow as P(t). Let R(t) be a dummy for rain tomorrow, where R(t)=1 if it rains and R(t)=0 if it doesn't rain. And we have a long sample of R(t) and P(t) for both forecasters.

Estimate a regression R(t) = a + b.P(t) for each forecaster

The forecaster who gets a bigger b and smaller a is a better forecaster??? Or something like that???

Econometricians must have solved this problem (I hope). That's why God invented econometricians, to solve problems like this, that people like me puzzle over but can't figure out. If econometricians can't figure this one out for me, why do they exist?
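
A minimal sketch of that regression, fitting R(t) = a + b·P(t) by ordinary least squares and taking R(t) to be 1 on rainy days and 0 otherwise. The function name `fit_line` and the forecast data are made up. One hedged observation: if a forecaster's stated probabilities match the observed frequencies, the fit should come out with a near 0 and b near 1, so that pair is a natural benchmark rather than b as large as possible.

```python
def fit_line(p, r):
    """Ordinary least squares fit of r = a + b * p; returns (a, b)."""
    n = len(p)
    mean_p = sum(p) / n
    mean_r = sum(r) / n
    sxx = sum((x - mean_p) ** 2 for x in p)
    if sxx == 0:
        raise ValueError("forecasts never vary; slope is undefined")
    sxy = sum((x - mean_p) * (y - mean_r) for x, y in zip(p, r))
    b = sxy / sxx
    a = mean_r - b * mean_p
    return a, b


# Invented, perfectly calibrated forecaster: says 20% on five days (one was
# rainy) and 80% on five days (four were rainy).
p = [0.2] * 5 + [0.8] * 5
r = [1, 0, 0, 0, 0] + [1, 1, 1, 1, 0]
a, b = fit_line(p, r)
print(f"a = {a:.3f}, b = {b:.3f}")
```

This only measures whether the stated probabilities line up with the outcomes on average. A forecaster who always says the long-run frequency of rain gives the regression nothing to work with, which is why the sketch refuses to fit when the forecasts never vary.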

s jay 7:49 AM
Data from Google's internal prediction market and its relevance to probabilities...
http://www.cooperationcommons.com/cooperationcommons/blog/howard-rheingold/87-googles-internal-prediction-markets

s jay 8:02 AM
As long as you have an Intrade-type market for predictions, you can measure the accuracy of any forecaster by using their predictions to make money (long Silver, short Intrade, or vice-versa). If Silver is systematically better than Intrade, over time, Intrade will converge on Silver or it will be possible to make an arbitrarily large amount of money without taking commensurate risk, something markets tend to prevent (magically!?) because otherwise the market will be destroyed (irreparably destabilized).

Alan Goldhammer 8:05 AM
Silver has been pretty transparent about his methodology, and his new book is a pretty good read even if one has a good grasp of probability and finite mathematics. He also has a good data set of past predictions that go beyond just the 2008 presidential race, including the Senate races and the 2010 midterm elections, when he called the Republican sweep of the House. What he does is slightly different from what a bookmaker does. The bookmaker sets preliminary odds of an occurrence happening and then adjusts those odds depending on the wagers coming in so that he/she can ensure that a profit will be made. Silver is looking at data streams and is agnostic about the outcome since there is no financial impact on his decision (unless Joe Scarborough takes him up on the charity bet).

As we know, his probability percentage is somewhat complicated as it necessarily relies on the combinations/permutations to get to 270 electoral votes. Once all of the certain state outcomes are discarded, one is still left with 7-10 states whose outcomes will determine the result. Today Silver notes that Obama has just over an 80% chance of winning. One can judge him on this prediction as well as the individual predictions of each state. He could get the overall election call correct and still miss the results in some states.
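
To make the combinations step concrete, here is a toy sketch that turns per-state win probabilities into an overall chance of reaching 270. It is not Silver's method: it assumes the states are independent (an earlier comment points out they are strongly correlated), and the function name `win_probability`, the vote counts, and the probabilities are all invented.

```python
def win_probability(states, votes_needed):
    """states: list of (electoral_votes, win_probability) pairs, assumed independent.

    Builds the exact distribution of total electoral votes one state at a time.
    """
    total = sum(ev for ev, _ in states)
    dist = [0.0] * (total + 1)  # dist[v] = probability of holding exactly v votes
    dist[0] = 1.0
    for ev, p in states:
        new = [0.0] * (total + 1)
        for votes, prob in enumerate(dist):
            if prob == 0.0:
                continue
            new[votes] += prob * (1.0 - p)  # candidate loses this state
            new[votes + ev] += prob * p     # candidate wins this state
        dist = new
    return sum(dist[votes_needed:])


# Invented map: 250 votes treated as certain, plus three contested states.
toy_states = [(250, 1.0), (20, 0.8), (15, 0.6), (5, 0.5)]
prob = win_probability(toy_states, votes_needed=270)
print(f"chance of reaching 270: {prob:.2f}")
```

In this toy map the candidate needs 20 more votes, which happens if he wins the 20-vote state (0.8) or loses it but wins both of the others (0.2 × 0.6 × 0.5 = 0.06), for 0.86 in total.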

None of this is rocket science and a lot of us have been doing this same thing in looking at baseball statistics and outcomes where there is a huge data set. It's safe to say that new metric models for baseball performance have been more successful than economic models in recent years.

Roger Gathmann 8:22 AM
The idea here isn't quite right, since Noah is operating from a win/lose perspective, while the percentages in Silver's model don't just tell you who will win, but how they will win. As Silver has pointed out, if Obama won California by 20 percent less than Silver has him down for, and Romney won Colorado by 5 percent more than Silver has him down for, he'd be less wrong about Colorado than about California.

Davyde 12:10 PM
I want a Minimum Squared Error function for Nate Silver.
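
One reading of this request is the mean squared error of the stated probabilities against the 0/1 outcomes, which is usually called the Brier score. A minimal sketch with made-up numbers; the function name `brier_score` is illustrative.

```python
def brier_score(forecasts, outcomes):
    """Mean squared difference between stated probabilities and 0/1 outcomes (lower is better)."""
    if len(forecasts) != len(outcomes) or not forecasts:
        raise ValueError("need equal-length, non-empty lists")
    return sum((p - o) ** 2 for p, o in zip(forecasts, outcomes)) / len(forecasts)


# Invented forecasts for three states and what happened in each.
forecasts = [0.8, 0.7, 0.4]
outcomes = [1, 1, 0]
score = brier_score(forecasts, outcomes)
print(f"Brier score: {score:.4f}")
```

A score of 0 means every call was made with certainty and was right. Always saying 50% scores 0.25 no matter what happens.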

Anonymous 10:45 PM
I'm gonna get picky on you here, Noah.

You mean probabilities, not percentages. E.g. "70% _chance_."

Percentages are just a different notation for ratios. Ratios can measure lots of different things besides probabilities.

theta 5:05 PM
Probabilities imply fair odds for betting on outcomes. Traders express their views in a similar but better way, in the form of a two-way market. In this case, for example, 69/71, meaning I am willing to "buy" the outcome at odds of 69% and "sell" at 71%.