Noahpinion: Bayesian vs. Frequentist: Is there any "there" there?

Sunday, January 27, 2013

Hey! Don't gamble on your statistics! SNAKE EYES...

The Bayesian/Frequentist thing has been in the news/blogs recently. Nate Silver's book (which I have not yet read btw) comes out strongly in favor of the Bayesian approach, which has seen some pushback from skeptics at the New Yorker. Meanwhile, Larry Wasserman says Nate Silver is really a frequentist (though Andrew Gelman disagrees), XKCD makes fun of Frequentists quite unfairly, and Brad DeLong suggests a third way that I kind of like. Also, Larry Wasserman gripes about people confusing the two techniques, and Andrew Gelman cautions that Bayesian inference is more a matter of taste than a true revolution. If you're a stats or probability nerd, dive in and have fun.

I'm by no means an expert in this field, so my take is going to be less than professional. But my impression is that although the Bayesian/Frequentist debate is interesting and intellectually fun, there's really not much "there" there...a sea change in statistical methods is not going to produce big leaps in the performance of statistical models or the reliability of statisticians' conclusions about the world.

Why do I think this? Basically, because Bayesian inference has been around for a while - several decades, in fact - and people still do Frequentist inference. If Bayesian inference was clearly and obviously better, Frequentist inference would be a thing of the past. The fact that both still coexist strongly hints that either the difference is a matter of taste, or else the two methods are of different utility in different situations.

So, my prior is that despite being so-hip-right-now, Bayesian is not the Statistical Jesus.

I actually have some other reasons for thinking this. It seems to me that the big difference between Bayesian and Frequentist generally comes when the data is kind of crappy. When you have tons and tons of (very informative) data, your Bayesian priors are going to get swamped by the evidence, and your Frequentist hypothesis tests are going to find everything worth finding (Note: this is actually not always true; see Cosma Shalizi for an extreme example where Bayesian methods fail to draw a simple conclusion from infinite data). The big difference, it seems to me, comes in when you have a bit of data, but not much.
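
To make the "swamped by the evidence" point concrete, here is a minimal sketch, assuming a simple success/failure model with a conjugate Beta prior. The two priors and both data sets are made up for illustration; nothing here comes from the posts linked above.

```python
def posterior_mean(prior_a, prior_b, successes, trials):
    """Posterior mean of a success rate under a Beta(prior_a, prior_b) prior."""
    return (prior_a + successes) / (prior_a + prior_b + trials)

# Two hypothetical analysts with opposite priors about the same success rate.
priors = {"skeptic": (2.0, 8.0), "optimist": (8.0, 2.0)}
# Hypothetical data sets as (successes, trials); both show a 60% observed rate.
datasets = {"small": (6, 10), "large": (6000, 10000)}

gaps = {}
for label, (successes, trials) in datasets.items():
    means = {
        name: posterior_mean(a, b, successes, trials)
        for name, (a, b) in priors.items()
    }
    # How far apart the two analysts still are after seeing the data.
    gaps[label] = means["optimist"] - means["skeptic"]
    print(label, {name: round(m, 4) for name, m in means.items()})
```

With ten observations the two posterior means sit 0.3 apart; with ten thousand the gap is under 0.001. That is the well-behaved case, and it says nothing about pathological priors like the one in Shalizi's example.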

When you have a bit of data, but not much, Frequentist - at least, the classical type of hypothesis testing - basically just throws up its hands and says "We don't know." It provides no guidance one way or another as to how to proceed. Bayesian, on the other hand, says "Go with your priors." That gives Bayesian an opportunity to be better than Frequentist - it's often better to temper your judgment with a little bit of data than to throw away the little bit of data. Advantage: Bayesian.

BUT, this is dangerous. Sometimes your priors are totally nuts (again, see Shalizi's example for an extreme case of this). In this case, you're in trouble. And here's where I feel like Frequentist might sometimes have an advantage. In Bayesian, you (formally) condition your priors only on the data. In Frequentist, in practice, it seems to me that when the data is not very informative, people also condition their priors on the fact that the data isn't very informative. In other words, if I have a strong prior, and crappy data, in Bayesian I know exactly what to do; I stick with my priors. In Frequentist, nobody tells me what to do, but what I'll probably do is weaken my prior based on the fact that I couldn't find strong support for it. In other words, Bayesians seem in danger of choosing too narrow a definition of what constitutes "data".

(I'm sure I've said this clumsily, and a statistician listening to me say this in person would probably smack me in the head. Sorry.)

But anyway, it seems to me that the interesting differences between Bayesian and Frequentist depend mainly on the behavior of the scientist in situations where the data is not so awesome. For Bayesian, it's all about what priors you choose. Choose bad priors, and you get bad results...GIGO, basically. For Frequentist, it's about what hypotheses you choose to test, how heavily you penalize Type 1 errors relative to Type 2 errors, and, most crucially, what you do when you don't get clear results. There can be "good Bayesians" and "bad Bayesians", "good Frequentists" and "bad Frequentists". And what's good and bad for each technique can be highly situational.

So I'm guessing that the Bayesian/Frequentist thing is mainly a philosophy-of-science question instead of a practical question with a clear answer.

But again, I'm not a statistician, and this is just a guess. I'll try to get a real statistician to write a guest post that explores these issues in a more rigorous, well-informed way.

Update: Every actual statistician or econometrician I've talked to about this has said essentially "This debate is old and boring, both approaches have their uses, we've moved on." So this kind of reinforces my prior that there's no "there" there...

Update 2: Andrew Gelman comments. This part especially caught my eye:

One thing I’d like economists to get out of this discussion is: statistical ideas matter. To use Smith’s terminology, there is a there there. P-values are not the foundation of all statistics (indeed analysis of p-values can lead people seriously astray). A statistically significant pattern doesn’t always map to the real world in the way that people claim.

Indeed, I’m down on the model of social science in which you try to “prove something” via statistical significance. I prefer the paradigm of exploration and understanding. (See here for an elaboration of this point in the context of a recent controversial example published in an econ journal.)

Update 3: Interestingly, an anonymous commenter writes:

Whenever I've done Bayesian estimation of macro models (using Dynare/IRIS or whatever), the estimates hug the priors pretty tight and so it's really not that different from calibration.

Update 4: A commenter points me to this interesting paper by Robert Kass. Abstract:

Statistics has moved beyond the frequentist-Bayesian controversies of the past. Where does this leave our ability to interpret results? I suggest that a philosophy compatible with statistical practice, labeled here statistical pragmatism, serves as a foundation for inference. Statistical pragmatism is inclusive and emphasizes the assumptions that connect statistical models with observed data. I argue that introductory courses often mischaracterize the process of statistical inference and I propose an alternative "big picture" depiction.

55 comments:

Kindred Winecoff 6:08 PM
I'm not a statistician either, but as I understand it likelihood models are Bayesian models with (invariantly) flat priors, mathematically. Frequentist models are special cases of generalized linear likelihood models. So frequentists are likelihoodists who are Bayesians who a) don't know it; b) implicitly assume that they know nothing about the world when they specify their models (which is self-contradictory).

Bayesians frequently test their models to see how sensitive they are to the prior, so your concern above is usually not a problem in practice. Bayesians frequently use uninformative priors to "let the data speak", at least as a baseline. The point is that Bayesian models are more flexible, involve more realistic assumptions about the data generating process, yield much more intuitive statistical results (i.e. actual probabilities), and provide much more information about the relationship between variables (i.e. full probability distributions rather than point estimates).

That said, the differences between the two from the perspective of inference are usually minor, as you note.

Aziz 6:21 PM
I think it is pretty indisputable that the Bayesian interpretation of probability is the correct one. Probability measures a degree of belief, not a proportion of outcomes.

P*≠P

The observed probability distribution (P) does not equal the real probability distribution (P*). In the nonlinear and wild world we live in, only continued measurement into the future can give us P* for any future period (generally, we underestimate the tails). In an ultra-controlled non-fat-tailed environment P can look a lot like P* (making the frequentist approach look correct) but even then P* may diverge from P given a large enough data set (one massive rare event can significantly shift tails).

Why does the frequentist approach survive? Because it is useful in controlled environments where black swans are negligibly improbable. But it should come, I think, with the above caveat.

Anonymous 6:31 PM
The basic philosophy of Bayesianism seems to appeal to me more, just because you have to put your prior assumptions out there. Most of the critique that I've seen comes from having intentionally stupid priors, but questioning assumptions should be a big part of modeling.

Anonymous 7:46 PM
I don't know how Noah Smith can think he is an economist... it's so embarrassing, please bury yourself or put glue in your mouth

Matthew Martin 7:49 PM
I think you've missed an important point about Bayesian statistics--essentially, choosing a prior lets the statistician formally incorporate information we already know into the analysis. This prior knowledge could come from other research papers on the topic, prior stages in the same experiment (very useful in clinical trials) or maybe just intuitive logic. Frequentists do exactly the same thing, but the difference is that they aren't supposed to--it technically invalidates their results. Consider for example a placebo-controlled clinical trial. The treatment and placebo groups are never truly random--ethics dictate that we balance the two groups to look as similar as possible, because this will increase the statistical power and help us identify potential risks of treatment much sooner, potentially saving lives. At the end of the trial we get a frequentist p-value of, say, 0.05, but in reality this is wrong--we are pretty sure that because of balancing the two groups the real p-value is less than 0.05, but a frequentist has no way of knowing what it actually is. My understanding is that in Bayesian statistics this is no longer a concern--we know that our results are accurate given all the available information, including both the prior and the data.

Also, I mentioned the need for a clinical trial to do ongoing statistics to identify risks to the treatment group as soon as possible. Strictly speaking, this ongoing analysis also invalidates the frequentist results--continuation of the trial should be completely independent of the results of the trial, and not shut down when we have evidence of harms to the patients. Bayesian statistics, by contrast, allows us to do ongoing analysis without in any way invalidating the results. And since Bayesian inferences incorporate both the prior information and the data, it can statistically identify risks to patients in the trial much sooner than can frequentist methods. In that respect, it could actually be considered unethical to rely on frequentist methods for human-subjects research that involves more than minimal risk.
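
The frequentist half of that claim is easy to check by simulation. Below is a rough sketch, assuming a two-sided z-test with known unit variance, five equally spaced interim looks, and a true effect of exactly zero; the function names, look schedule, and trial count are all made up for illustration. It compares testing once at the end with stopping at the first "significant" look. It does not model the Bayesian side of the argument at all.

```python
import math
import random

def z_test_rejects(total, n, z_crit=1.96):
    """Two-sided z-test of 'mean = 0' for n draws with known unit variance."""
    return abs(total) / math.sqrt(n) > z_crit

def false_positive_rates(n_trials=2000, looks=(20, 40, 60, 80, 100), seed=0):
    """Simulate trials where the null is true; return (fixed, peeking) rejection rates."""
    rng = random.Random(seed)
    fixed_hits = 0
    peeking_hits = 0
    for _ in range(n_trials):
        total, n = 0.0, 0
        rejected_at_some_look = False
        for look in looks:
            # Collect data up to the next scheduled look.
            while n < look:
                total += rng.gauss(0.0, 1.0)
                n += 1
            if z_test_rejects(total, n):
                rejected_at_some_look = True
        # "Fixed" design: only the test at the final sample size counts.
        fixed_hits += int(z_test_rejects(total, n))
        # "Peeking" design: a rejection at any look counts.
        peeking_hits += int(rejected_at_some_look)
    return fixed_hits / n_trials, peeking_hits / n_trials

fixed_rate, peeking_rate = false_positive_rates()
print(f"test once at n=100: {fixed_rate:.3f}; stop at first significant look: {peeking_rate:.3f}")
```

The single test at the end should land near the nominal 5%, while the peeking design rejects a true null noticeably more often, which is why frequentist trial designs that allow interim looks adjust their thresholds.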

More generally, I think the point needs to be made that frequentist probability theory is really just a subset of Bayesian theory but with lots of implicit assumptions about the prior that aren't necessarily justifiable.

That said, you are right--Bayesian statistics won't be able to tell us a whole lot that we didn't already know. For the most part it tells us the same things but with a purer internal logic.

Lord 8:07 PM
If you weaken your priors due to lack of evidence, I posit you are a Bayesian. The Frequentist takes his hypothesis and data as fixed. If he chooses to alter them, he is doing another experiment. In this, a Bayesian is just a Frequentist doing multiple experiments in succession, often on the same data, whereas the Frequentist would be concerned the change in hypothesis might invalidate the data collection.

MC 8:08 PM
Minor point: I've heard that one reason Bayesian statistics hasn't been used a lot more in economics is simply because, until the last 20(?) years or so, it was very hard to implement computationally. The falling cost of computing power has really opened things up now, because (I think) you can implement a lot of analytically intractable stuff numerically or by simulation. Now there's a bit of slow adjustment going on, as people trained in frequentist methodology update their skill sets. Anyway, that's what they're telling me in some of my econometrics classes. So, maybe Bayesian statistics hasn't really had several decades to try and prove its superiority.

That said, it has seen a lot of use in the past decade, and I agree with you that it doesn't seem to be a game-changer.

Carlos 8:18 PM
Noah,

There are at least two kinds of debate that look the same but are not. The philosophical Bayesian x Frequentist and the practical silly "Null Hypothesis Significance Testing (NHST)" x "Please, think about what the hell you're doing".

Nate Silver points mostly to the second debate. When he says frequentism, he is really saying silly NHST. Of course, some people get mad with that, because they claim the name "frequentist" to themselves and do not like when bad practice is associated with that name.

Now, why is it important to state the difference between these two debates?

Take your statement for example: "When you have a bit of data, but not much, Frequentist - at least, the classical type of hypothesis testing - basically just throws up its hands and says "We don't know.""

That is not true in practice.

When you have a bit of data, you usually do not reject the Null Hypothesis. And what do people do? They don't say "we don't know", they say that there is evidence in favor of the null, without ever checking the sensitivity (power or severity) of the test (needed in a coherent frequentist approach), nor, in Bayesian terms, the prior probability of the hypothesis...

So both would make a statement about reality with crappy data.
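
To see why power matters here, a minimal sketch, assuming a two-sided z-test at the 5% level with known unit variance and a hypothetical true effect of 0.2 standard deviations (the effect size and sample sizes are made up for illustration):

```python
import math

def normal_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def z_test_power(effect, n, z_crit=1.96):
    """Power of a two-sided z-test (known unit variance) when the true mean is `effect`."""
    shift = effect * math.sqrt(n)
    return normal_cdf(-z_crit + shift) + normal_cdf(-z_crit - shift)

for n in (20, 100, 400):
    print(n, round(z_test_power(0.2, n), 3))
```

Under these assumptions a sample of 20 detects the effect less than one time in five, so a non-rejection at that sample size says very little in favor of the null.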

Carlos 8:45 PM
Posting the comment that I have posted on Brad DeLong's:

When you see people doing significance testing in applied work, how often do you see them stating the sensitivity (power or severity) of the test against meaningful alternatives? (I'll answer that... from 80 to 90% do not even mention it, see: http://repositorio.bce.unb.br/handle/10482/11230 (Portuguese) or McCloskey "The Standard Error of Regressions")

This is not because there are not papers teaching how to do it (at least approximately): e.g. see Andrews (http://ideas.repec.org/a/ecm/emetrp/v57y1989i5p1059-90.html)

And there are plenty of papers with meaningless debates going on with underpowered tests, for example, the growth debate around institutions x geography x culture...

Anonymous 9:21 PM
Noah,

I would appreciate your moving from the abstract to the specific.

Slackwire has up a very interesting graph on the decline in bank lending for tangible capital/investment.

http://slackwire.blogspot.com/2013/01/what-is-business-borrowing-for.html

You have written about the economy breaking in the early 1970s, which the graph confirms to my eye.

Is this graph Bayesian/Frequentist or DeLong's third way or something else entirely?

Eric L 10:28 PM
If the probability of a hypothesis being true given the data is what you want to know, then presenting anything else (e.g. the probability that your data would be different if an alternative hypothesis were true) isn't being frequentist so much as it is being a bad statistician. However, in a lot of cases neither of these questions is precisely what matters and you are really using statistics to get at something a little fuzzier -- have I collected enough data, am I taking reasonable steps to prevent myself from seeing patterns in noise, etc. Often when testing scientific hypotheses the precise probability is uninteresting but a significance test is important, and here either approach can help. In fact you would not want to publish a result that passed a Bayesian test but failed a frequentist test, not because the conclusion is particularly likely to be wrong but because saying "my experiment doesn't add much certainty but given what we already know the conclusion is quite certain" is not an interesting result.

I don't agree with this, though:

It seems to me that the big difference between Bayesian and Frequentist generally comes when the data is kind of crappy.

The difference also comes when the data on priors is good, and especially when the prior is lopsided. The latter may sound like a corner case, but it is the normal case in medicine and there are plenty of other cases in the real world. I (well, not me exactly) had a health scare recently, and it would have seemed much worse had we been presented with inappropriate frequentist statistics (the false positive rate of this test is low, so null hypothesis rejected with high confidence!) rather than the prior and posterior probabilities we were presented with (In the absence of this test you would be very unlikely to have this; given the result there is a 1 in 30 chance you have this.) Of course, DeLong's "value of being right" considerations applied here and we opted to do further testing to make sure we were fine, but it goes to show the difference can be big even with good data.
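
The arithmetic behind a lopsided prior is just Bayes' rule. Here is a small sketch with hypothetical numbers, chosen only so the answer lands near the "1 in 30" above; they are not the figures from the actual test.

```python
def posterior_given_positive(prior, sensitivity, false_positive_rate):
    """P(condition | positive test) by Bayes' rule."""
    true_pos = sensitivity * prior
    false_pos = false_positive_rate * (1.0 - prior)
    return true_pos / (true_pos + false_pos)

# Hypothetical inputs: a rare condition and a fairly accurate test.
prior = 1 / 1000
posterior = posterior_given_positive(prior, sensitivity=0.99, false_positive_rate=0.03)
print(f"prior: 1 in {1 / prior:.0f}, posterior: 1 in {1 / posterior:.0f}")
```

With these made-up inputs a positive result moves the probability from 1 in 1000 to roughly 1 in 31. The low false positive rate alone would suggest something far scarier.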

Clark Goble 10:34 PM
That seems right to me. Although I'm not doing those sorts of analysis much anymore so I'm not sure my view's worth much. Seems to me in practice most scientists practice a staggering array of inconsistency when it comes to either epistemology or metaphysics. (And of course one needn't engage in the philosophical debate here.) So just as, say, a typical physicist engages in an incoherent mix of instrumentalist, empirical and realist approaches to physics, I suspect the average scientist (I'll even throw economists in there) engages in a mix of Bayesian and Frequentist approaches. As you say, often those more sympathetic to Frequentist approaches simply weaken their priors. At least that's what I see in practice although not always what is argued for.

Richard H. Serlin 10:43 PM
My old adviser, Chris Lamereoux, was a big Bayesian, with some well known Bayesian papers. I talked with him about this years ago, about the obviousness of including important prior information, and he said the smart sensible thing: good frequentist statisticians and econometricians of course consider a priori information, but do so in an informal and less rigid way.

Ram Vangala 1:06 AM
Let Pr(H) be our degree of belief in hypothesis H; Pr(E) our degree of belief in evidence E. Suppose we perform experiments that repeatedly demonstrate E. We may model this as Pr(E) -> 1.

Recall that Pr(H) = Pr(H|E)*Pr(E) + Pr(H|~E)*Pr(~E) [the law of total probability, restated in terms of conditional probabilities]. As Pr(E) -> 1, Pr(~E) -> 0, so the second term Pr(H|~E)*Pr(~E) -> 0. Thus, Pr(H) -> Pr(H|E). In short, as we become very nearly certain of E, our degree of belief in H ought to condition on E.
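
The limit can be made explicit with one line of algebra. A small sketch, assuming only that Pr(~E) > 0 so that Pr(H|~E) is defined:

```latex
\begin{align*}
\Pr(H) - \Pr(H \mid E)
  &= \Pr(H \mid E)\,\bigl(\Pr(E) - 1\bigr) + \Pr(H \mid \neg E)\,\Pr(\neg E) \\
  &= \Pr(\neg E)\,\bigl(\Pr(H \mid \neg E) - \Pr(H \mid E)\bigr),
\end{align*}
so
\[
\bigl|\Pr(H) - \Pr(H \mid E)\bigr| \le \Pr(\neg E) \to 0
\quad \text{as } \Pr(E) \to 1 .
\]
```

The bound holds because the difference of two probabilities is at most 1 in absolute value, so the gap between Pr(H) and Pr(H|E) can never exceed Pr(~E).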

Frequentists don't really have grounds for disagreeing with the above. Most frequentist procedures can be defended on Bayesian grounds, provided the appropriate loss function and prior, so you're correct that this is not a major practical issue for statisticians. The problem occurs when you're trying to teach a computer to learn from its observations. The only way to do frequentist inference sensibly is to implicitly be a reasonable Bayesian. Without making this explicit, though, a computer is not going to do frequentist inferences sensibly without a human going through its SAS output or the like.

Anonymous 2:17 AM
That's nice and everything, but what's unclear to me is how our prior knowledge, usually vague and diffuse (otherwise there would be no point in further analysis), is supposed to translate into precise distributions with precise parameters.

BINGO. The practical differences are more political than anything else, possibly because Bayesians suffer from what I'll call 'Frequentist envy'. We've all seen this one a million times: some study that claims statistical significance with p set at - surprise! - 0.05. Cue much gnashing of teeth from the stat folks about how this runs against the grain of good statistical practices. Then the Bayesians jump in and start sneering about this thing called 'priors'. 'Priors'? Do you mean the application of domain-specific knowledge that could just as easily have been done the old way, and should have been? That's some mighty thin gruel there, yet that's what the debate really seems to come down to so often in practice.

Frequentists are well aware of Bayes' theorem, btw, and use it quite routinely. Something that Bayesians like to pretend doesn't happen all that often. And most people I know use Bayesian methods when the situation calls for it and Frequentist methods likewise. They're just tools in the toolbox after all.

Michael Roberts 2:45 AM
Ack, I tire of this old debate too. And I agree that there is far more heat than light to be found in it.

But you do kind of walk all over the underlying philosophies, as well as some practical issues.

A few points:

1.) Frequentists and Bayesians may have nearly identical uncertainty measures in most cases but they interpret them differently. This would seem to have little practical difference in how the measures are applied.

2.) philosophical frequentists hate it when anyone uses their uncertainty measures as if they are Bayesian measures (which is typical); the converse is not true.

3.) Bayesian tools, like Markov Chain Monte Carlo, can be useful even if you're a frequentist at heart.

Anonymous 3:18 AM
When I don't know my audience I go for the low-brow approach (not wrong, not confusing, and hopefully, not insulting), which is the case here.

So looking back over the comments, I think Carlos nails it at comments 8:18 and 8:45.

Yes, there are significant differences between the two approaches, and yes, there are times when one would clearly prefer one over the other. But 95% of the time what you see is meaningless bickering ;-) The Nate Silver thing falls into that 95%. Imho.

Alan Martin 3:21 AM
For me the reason for the Bayesian surge has been computers. In my job as a Bioinformatician, I built a Bayesian generative model, the results of which are sold for 70 dollars a pop wholesale. In 2007 when I finished the work it took 15 minutes to run. This wouldn't have been possible with Bayes's own calculating methods (paper and pen).

The main reason for the extensive computer involvement is the calculation of likelihoods based upon multiple non-independent data sources that can't be dealt with linearly. So hidden Markov models mixed with generative elements produce a far superior result.

None of this would be possible with just frequentist stats.

So I say that the reason for the rise of the Bayesians isn't some shift in opinion about how science or statistics should be done but that it's the only proper way to use the incidental data we have and it's only recently that we can actually succeed.

Colin Lewis 5:19 AM
I think a key missing element was discovered by Mandelbrot with his research on fractals, which I will get to. But first what if the events cannot be precisely measured? In this case a frequentist interpretation of “proof” is in principle impossible, and we then become Bayesian using subjective data and whatever additional data which we 'deem' relevant to elements of the analysis to form a “prior.” In a review of Mandelbrot’s The Misbehavior of Markets, Taleb offers an interesting formula that he says: “seems to have fooled behavioral finance researchers.” He writes: “A simple implication of the confusion about risk measurement applies to the research-papers-and-tenure-generating equity premium puzzle. It seems to have fooled economists on both sides of the fence (both neoclassical and behavioral finance researchers). They wonder why stocks, adjusted for risks, yield so much more than bonds and come up with theories to “explain” such anomalies. Yet the risk-adjustment can be faulty: take away the Gaussian assumption and the puzzle disappears. Ironically, this simple idea makes a greater contribution to your financial welfare than volumes of self-canceling trading advice.” The pdf of the review is here (http://www.fooledbyrandomness.com/mandelbrotandhudson.pdf)

So my question is: should we move beyond Bayesian and Frequentist when looking at probabilities and look at fractals? Otherwise we omit what Mandelbrot called “roughness.” In other words, research focuses too much on smoothness, bell curves and the golden mean; if we look at roughness in far more detail, will we be able to provide greater insight into the matter at hand?

Unknown 5:58 AM
I really think people should take a look at this text by Chris Sims, found at http://sims.princeton.edu/yftp/EmetSoc607/AppliedBayes.pdf

Also, the open letter by Kruschke is worth your while:
http://www.indiana.edu/~kruschke/AnOpenLetter.htm

Min 6:14 AM
"Basically, because Bayesian inference has been around for a while - several decades, in fact"

How about centuries? ;)

The frequentist view was a reaction against the Bayesian view, which came to be perceived as subjective. What we are seeing now is a Bayesian revival. Since this is an economics blog, let me highly recommend Keynes's book, "A Treatise on Probability". Keynes was not a mainstream Bayesian, but he grappled with the problems of Bayesianism. Because the frequentist view was so dominant for much of the 20th century, there is a disconnect between modern Bayesianism and earlier writers, such as Keynes. From what I have seen in recent discussions, it seems that modern Bayesians have gone back to simple prior distributions, something that both Keynes and I. J. Good rejected, in different ways. Perhaps we will see some Hegelian synthesis. (Moi, I think that we will come to realize that neither Bayesian nor Fisherian statistics can deliver what they promise.)

Cameron Hoppe 8:41 AM
Min is right. Bayesian probability has been around formally for at least 350 years, and the philosophical idea since before the days of Aristotle.

I can tell you from experience that Bayesian probability is way more important in the areas of quality control and environmental impact measurements. Noah touches on the reason. You can't assume your process is unchanging or that new pollutants haven't entered the environment. You have to assume they can and eventually will. Essentially, every sample out of spec has to be treated as evidence of a changed process.

Lulz4l1f3 8:49 AM
I liked Nate Silver's book. An unexpectedly good read. I received the book as a gift and didn't buy it myself, and I expected it to be a dry read, but it wasn't at all.

I think the book has the potential for a broad appeal, and when you consider the fact that books that revolve around topics like statistical analysis and Bayesian inference are usually pompous, inaccessible, and dull beyond belief, I think Nate should be commended for writing something that makes these concepts accessible to a wide audience.

Anonymous 9:54 AM
Noah, statistics is not science, they are lies, the worst kinds of lies (Mark Twain: lies, damn lies, and statistics).

Someone above, praising the Bayesian view, forgot to read the 2007 prospectus, which actually reads:

"2 Recent Successes

Macro policy modeling

• The ECB and the New York Fed have research groups working on models using a Bayesian approach and meant to integrate into the regular policy cycle. Other central banks and academic researchers are also working on models using these methods."

http://sims.princeton.edu/yftp/EmetSoc607/AppliedBayes.pdf, page 6

Thank you very much but that track record says it all about statistics. No thanks.

From the entire Western experience, statistics show only one thing about statistics---that statisticians lie, always selecting what they count, blah, blah, blah and how they analyze such to come up with the conclusion they were determined to reach

David Beckworth 10:47 AM
Noah, probably repeating some other comments, but here goes.

First, Bayesian methods have been around a long time but only recently have emerged because computing costs have come down. It would be interesting to see whether there are now proportionally more papers using Bayesian methods since computing cost went down and if that trend is increasing or stabilizing.

Second, what Matthew Martin said above. Frequentists effectively do the same thing as Bayesians, but pretend otherwise. They build empirical models and throw variables in and out based on some implicit prior which never gets reported in their write-ups.

Third, my understanding is that Bayesian models generally provide better forecasts than frequentist models. (Not that either are spectacular).

Anonymous 12:58 PM
Scanned quickly to see if someone commented on your picture at the beginning. Ha Ha, In The Beginning! My reptile brain rang up Einstein immediately and his comment that God does not roll dice.
I don't know jack about statistics. Is God a Bayesian or a frequentist?

Anonymous 3:02 PM
Whenever I've done Bayesian estimation of macro models (using Dynare/IRIS or whatever), the estimates hug the priors pretty tight and so it's really not that different from calibration.

rosserjb@jmu.edu 3:31 PM
Minor technical point on discussion of Shalizi. Infinity can actually make things worse for Bayesians, particularly infinite dimensional space. So, it is an old and well known result by Diaconis and Freedman that if the support is discontinuous and one is using an infinite-sided die, one may not get convergence. The true answer may be 0.5, but one might converge on an oscillation between 0.25 and 0.75, for example, using Bayesian methods.

However, it is true that this depends on the matter of having a prior that is "too far off." If one starts out within the continuous portion of the support that contains 0.5, one will converge.

Anonymous 6:19 PM
In practice, everyone's a statistical pragmatist nowadays anyway. See this paper by Robert Kass: http://arxiv.org/abs/1106.2895

Unknown 8:11 PM
Thanks for the post. As a statistician, I think it's nice to see these issues being discussed. However, I think a lot of what has been written both in the post and in the comments is based on a few misconceptions. I think Andrew Gelman's comment did a nice job (as usual) of addressing some of them. To me, his most important point, and the one that I would have raised had he not done so, is this:

"...non-Bayesian work in areas such as wavelets, lasso, etc., are full of regularization ideas that are central to Bayes as well. Or, consider work in multiple comparisons, a problem that Bayesians attack using hierarchical models. And non-Bayesians use the false discovery rate, which has many similarities to the Bayesian approach (as has been noted by Efron and others)."

The idea of "shrinkage" or "borrowing strength" is so pervasive in statistics (at least among those who know what they are doing) that it frequently blurs practical distinctions between Bayesian and non-Bayesian analyses. A key compromise is empirical Bayes procedures, which are a favorite strategy of some of our most famous luminaries. Commenter Min mentioned a "Hegelian synthesis." Empirical Bayes is one such synthesis. Reference priors are another.

Which brings me to another important point. In the post and in the comments, it is assumed that priors are necessarily somehow invented by the analyst and implied that rigor in this regard is impossible. This is completely wrong. There is a long literature on "reference" priors, which are meant to be default choices when the analyst is unwilling to use subjective priors. An overlapping idea is "non-informative" priors, which are non-informative in a precise and mathematically well-defined sense (actually several different senses, depending on the details).

Also, I want to note that Bayes procedures can be proven superior to standard frequentist procedures in some settings, even when evaluated using frequentist criteria. This is related to shrinkage, empirical Bayes, and all the rest. Wikipedia "admissibility" or "James-Stein" to get a sense for why.
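
For a feel of the James-Stein result, here is a rough simulation sketch, assuming ten unknown means each observed once with unit-variance normal noise. The vector `theta` is hypothetical, and this is the plain James-Stein estimator shrinking toward zero (not the positive-part version), compared with the raw observations on total squared error.

```python
import random

def james_stein(x):
    """Plain James-Stein shrinkage toward zero (unit-variance noise, needs len(x) >= 3)."""
    p = len(x)
    norm_sq = sum(v * v for v in x)
    shrink = 1.0 - (p - 2) / norm_sq
    return [shrink * v for v in x]

def average_losses(theta, n_sims=5000, seed=1):
    """Average total squared error of the raw observations and of James-Stein."""
    rng = random.Random(seed)
    raw_loss = 0.0
    js_loss = 0.0
    for _ in range(n_sims):
        # One noisy observation per unknown mean.
        x = [t + rng.gauss(0.0, 1.0) for t in theta]
        js = james_stein(x)
        raw_loss += sum((xi - t) ** 2 for xi, t in zip(x, theta))
        js_loss += sum((ji - t) ** 2 for ji, t in zip(js, theta))
    return raw_loss / n_sims, js_loss / n_sims

# Hypothetical true means, made up for illustration.
theta = [0.5, -1.0, 0.3, 1.2, -0.7, 0.0, 0.8, -0.4, 1.5, -1.1]
raw_risk, js_risk = average_losses(theta)
print(f"raw observations: {raw_risk:.2f}, James-Stein: {js_risk:.2f}")
```

The raw estimate has risk equal to the dimension (10 here), and the shrunken estimate comes in lower, even though the comparison is a purely frequentist one (average loss over repeated samples at a fixed `theta`).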

Finally, the statement, "If Bayesian inference was clearly and obviously better, Frequentist inference would be a thing of the past," misses a lot of historical context. Nobody knew how to fit non-trivial Bayesian models until 1990 brought us the Gibbs sampler. This is not a matter of computing power, as some have suggested -- the issue was more fundamental.

The great Brad Efron wrote a piece called "Why isn't everyone a Bayesian" back in 1986. Despite not being a Bayesian, he doesn't come up with a particularly compelling answer to his own question (http://www.stat.duke.edu/courses/Spring08/sta122/Handouts/EfronWhyEveryone.pdf). One last bit of recommended reading is a piece by Bayarri and Berger (http://www.isds.duke.edu/~berger/papers/interplay.pdf), who take another stab at this question.

David 7:06 AM
One area where the "crappy data" issue becomes extremely important is in pharmaceutical clinical trials. People tend to think that there are two possible outcomes of trials: a) the medication was shown to work or b) the medication was shown to be ineffective. In fact, there is a third possible outcome: c) The trial neither proves nor disproves the hypothesis that the drug works. In practice, outcome (c) is very common. For some indications, it is the most common outcome.

This leads to charges that pharma companies intentionally hide trials with negative results. They don't publish all their trials! But it turns out to be really hard to get a journal to accept a paper that basically says, "we ran this trial but didn't learn anything."

I forget the exact numbers, but for the trials used to get approval of Prozac, it was something like 2-3 trials with positive outcomes, and 8-10 "failed trials," i.e., trials which couldn't draw a conclusion one way or the other. This is common in psychiatric medicine. It's hard to consistently identify who has a condition, it's hard to determine if the condition improved during the trial, and many patients get better on their own, with no treatment at all (at least temporarily).