## Tuesday, June 04, 2013

### What is "derp"? The answer is technical.

There has been much discussion lately concerning the word "derp" and its appropriate usage. For example, Josh Barro used the word to describe conservative bigmouth Erick Erickson, and Paul Krugman used it as well. This prompted a primer on the history of the term, followed elsewhere by the usual hand-wringing by self-appointed cultural policemen annoyed by the word.

Now, I myself have used the word "derp" quite a lot. Possibly more than any other pundit I know, with the exception of Dave Weigel. But in any case, not only do I consider myself an expert in the use of "derp", I also have a very precise idea of what "derp" means, and how it should be used. I think "derp" is incredibly useful as a term for an important concept for which the English language has no other word.

It has to do with Bayesian probability.

Bayesian probability basically says that "probability" is, to some degree, subjective. It's your best guess for how likely something is. But to be Bayesian, your "best guess" must take the observable evidence into account. Updating your beliefs by looking at the outside world is called "Bayesian inference". Your initial guess about the probability is called your "prior belief", or just your "prior" for short. Your final guess, after you look at the evidence, is called your "posterior." The observable evidence is what changes your prior into your posterior.

How much does the evidence change your belief? That depends on three things. It depends on A) how different the evidence is from your prior, B) how strong the evidence is, and C) how strong your prior is.

What does it mean for a prior to be "strong"? It means you really, really believe something to be true. If you start off with a very strong prior, even solid evidence to the contrary won't change your mind. In other words, your posterior will come directly from your prior. (And where do priors come from? On this, Bayesian theory is silent. Let's assume they come directly from your...um...posterior.)
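To make the role of prior strength concrete, here is a minimal sketch of Bayes' rule for a single yes/no claim. The function name `posterior` and every number in it are made up for illustration; none of it is measured from real data. It feeds the same piece of contrary evidence to a mild believer and to a very strong believer.

```python
def posterior(prior, p_evidence_if_true, p_evidence_if_false):
    """Bayes' rule for one yes/no claim.

    prior: belief that the claim is true before seeing the evidence.
    p_evidence_if_true / p_evidence_if_false: how likely the observed
    evidence is under each possibility (illustrative assumptions).
    """
    numerator = prior * p_evidence_if_true
    denominator = numerator + (1.0 - prior) * p_evidence_if_false
    return numerator / denominator


# Hypothetical evidence that is four times more likely if the claim is false.
P_E_IF_TRUE = 0.2
P_E_IF_FALSE = 0.8

weak = posterior(0.6, P_E_IF_TRUE, P_E_IF_FALSE)      # mild prior
strong = posterior(0.999, P_E_IF_TRUE, P_E_IF_FALSE)  # very strong prior

print(f"prior 0.600 -> posterior {weak:.3f}")
print(f"prior 0.999 -> posterior {strong:.3f}")
```

Under these assumed numbers, the mild believer drops from 0.6 to roughly 0.27, while the strong believer barely moves from 0.999. Neither one broke any Bayesian rule; they just started in different places.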

There are many people who have very strong priors about things. For example, there are people who believe, very strongly, that solar power will never be cost-efficient. If you confront them with evidence of solar's rapid price declines, they will continue to insist that, despite this evidence, solar will simply never be cost-competitive with fossil fuels. That they continue to insist this does not necessarily make them irrational in the Bayesian sense; they simply have very strong priors. Someday they may be convinced - for example, if and when unsubsidized solar power starts being adopted on a mass scale. It'll just take a LOT to convince them. (A more entertaining example can be seen in this classic comedy video.)

But here's the thing: When those people keep broadcasting their priors to the world again and again after every new piece of evidence comes out, it gets very annoying. After every article comes out about a new solar technology breakthrough, or a new cost drop, they'll just repeat "Solar will never be cost-competitive." That is unhelpful and uninformative, since they're just restating their priors over and over. Thus, it is annoying. Guys, we know what you think already.

English has no word for "the constant, repetitive reiteration of strong priors". Yet it is a well-known phenomenon in the world of punditry, debate, and public affairs. On Twitter, we call it "derp".

So "derp" is a unique and useful English word. Let's keep using it.

(Also, the verb associated with "derp" is "herp". It describes the action of coughing a large sticky mass of derp onto the internet in front of you. For example, to use it in a sentence: "That twerp just herped a flerp of derp!" A "flerp" is a unit I made up. It is the amount of derp that can be herped by one twerp. See?)

1. Anonymous 12:02 AM

How many utils per flerp does a twerp get from herping derp? The evidence would seem to indicate that the marginal utility of herping derp can be increasing in flerps per herp.

1. Anonymous 9:25 AM

Using the theory of diminishing marginal returns, I believe "the utility of herping derp can be increasing in flerps per herp". But the marginal utility would have to be decreasing.

2. Anonymous 9:49 AM

Maybe herping derp is an example that violates the law of diminishing marginal returns (DMR). Your prior is that DMR is always true; however, the evidence is that twerps receive greater marginal returns per flerp the more derp they herp. Since the evidence is infinite and self-evident, no prior belief, no matter how strongly held, can withstand it.

2. Anonymous 12:22 AM

And here I thought "derp" was making fun of someone for saying and/or doing something dumb or silly. As in: "derp. Derp."

1. Anonymous 2:34 PM

Isn't that exactly what he said?

3. Anonymous 12:25 AM

The essence of the Liberal outlook lies not in what opinions are held, but in how they are held: instead of being held dogmatically, they are held tentatively, and with a consciousness that new evidence may at any moment lead to their abandonment. Bertrand Russell

1. Anonymous 2:27 PM

Ha!

2. Anonymous 5:08 PM

You say that like it is a bad thing...

3. But Russell didn't hold _that_ opinion tentatively!

4. Kate,

Of course he did. He was once asked if he would be willing to die for his principles, and he replied "Of course not. They might be wrong."

-dlj.

4. I have a strong prior that everything weird on the internet involves cats.

I'm still not totally convinced that "derp" isn't cat related.

5. Anonymous 1:41 AM

http://farm4.staticflickr.com/3046/5827274160_ac8e90c279_o.jpg

6. Love the wording of this description of Bayesian probability. It takes what seems like a complicated concept, and explains it clearly and simply.

Some cliches/quotes can be described based on this:

"If you don't stand for something, you'll fall for anything" - If you have a weak prior, weak evidence will change your belief far more than it would if you had a strong prior.

"Extraordinary allegations require extraordinary evidence."
Lance Armstrong. The cheater, using this concept to his advantage.

"A wise man proportions his belief to the evidence."
David Hume. Indeed.

7. Anonymous 2:43 AM

This is really hilarious stuff man!
Besides being annoying, "derpism" is dangerous too!
Anyone who is not willing to reconsider his or her priors (no matter how strong they are) in the face of new evidence is simply a fanatic who, if given the chance, could do more harm than good.

8. To be perfectly honest, I had no idea what "derp" even meant until you came along with this post, Noah Smith.

My first thought when I first saw the word was that it might have had a similar meaning to "Doh!" (a la Homer Simpson).

Evidently, I was wrong.

9. I'm pretty sure flerp is an imperial unit, what's the metric/SI equivalent?

1. Plerp?

2. Anonymous 6:28 PM

the Jonah Goldberg, which is measured in kilo-eV*IQ/cruller

10. Now, I myself have used the word "derp" quite a lot.
Ahem. :-)

Great explanation, though.

11. Sadly, although I have a reasonable understanding of Bayesian probability, like Blue Aurora above I had no idea what a "derp" was before reading this.

Great article, thanks. One (very minor) quibble: you say "That is unhelpful and uninformative, since they're just restating their priors over and over." Technically, they are stating their posterior - it's just very similar to their prior. But, hopefully, each new piece of evidence makes another little dent in it.

Of course, some people's prior assumptions may be so strong that the evidence can't change them (see Cromwell's rule)...
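Cromwell's rule can be illustrated with a short sketch: a prior of exactly 1 never moves, whereas a prior of 0.999 gets dented a little more by each new piece of evidence. The helper names `update` and `after_n_pieces`, the likelihoods, and the assumption that the pieces of evidence are independent are all illustrative choices, not anything taken from real data.

```python
def update(prior, p_e_if_true, p_e_if_false):
    """One step of Bayes' rule for a yes/no claim."""
    numerator = prior * p_e_if_true
    return numerator / (numerator + (1.0 - prior) * p_e_if_false)


def after_n_pieces(prior, n, p_e_if_true=0.2, p_e_if_false=0.8):
    """Apply the same hypothetical contrary evidence n times in a row.

    Treats each piece of evidence as independent, which is a simplification.
    """
    belief = prior
    for _ in range(n):
        belief = update(belief, p_e_if_true, p_e_if_false)
    return belief


dented = after_n_pieces(0.999, 10)  # very strong prior, but not certainty
stuck = after_n_pieces(1.0, 10)     # prior of exactly 1

print(f"prior 0.999 after 10 pieces of evidence: {dented:.4f}")
print(f"prior 1.000 after 10 pieces of evidence: {stuck:.4f}")
```

With these made-up likelihoods, the 0.999 prior ends up below 1% after ten contrary observations, while the prior of 1 is still exactly 1.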

12. Anonymous 7:12 AM

What about a Siamese Twerp? Does that count as one or two units of flerp?

13. ArgosyJones 8:46 AM

Derp is a slur referring to people suffering from Dow Jones syndrome. I'll thank you not to toss it about casually. Many people on the internet are afflicted.

14. Sir, you win the Internets.

15. ArgosyJones may be joking, but there's a pretty strong history of using the phrase "herp-a-derp" to refer to people with Down's syndrome and other developmental delays: http://www.urbandictionary.com/define.php?term=herp%20derps . If you wouldn't be comfortable using some of the other insults in this vein in polite company, you might want to rethink your usage of derp.

16. Brendan 10:01 AM

I think your definition is a bit off, since it's perfectly possible to have a high prior *and* be willing to change that prior quickly in the light of new evidence. For example, I assign a high prior to propositions like "there isn't a tiger in my bathroom right now," but am willing to change it in a hurry if I see/hear otherwise.

I don't think probability theory has an accepted term for what you are talking about, though I know Brian Skyrms calls it "resiliency." A probability assignment is resilient to the degree, roughly, that you are unwilling to change it in the light of new evidence (but will instead change your probability assignments on other statements). It's an instance of what philosophers of science call the "Quine-Duhem problem"--you can always hold on to any theory (however crazy) so long as you are willing to make changes to other beliefs.

1. Sounds related to "falsifiability".

2. Brendan,

In your example, if you have a high prior for not-tiger (say 0.99), you must assign a correspondingly low prior to a tiger being in your bathroom, as p(tiger) = 1-p(~tiger) = 0.01. Now say you see the tiger, call that evidence E. How likely is it that you see the tiger if the tiger is there? Say 0.9. How likely is it that you see the tiger if the tiger isn't there? Say 0.01 (the 1% accounting for sudden psychosis, drugs, etc.). Now, by Bayes' Rule, p(tiger|E) = (p(E|tiger)*p(tiger))/p(E), where p(E) = p(E|tiger)*p(tiger) + p(E|~tiger)*p(~tiger) = 0.9*0.01 + 0.01*0.99 = 0.0189. So p(tiger|E) = 0.009/0.0189 ~= 0.48: one sighting takes you from 1% to nearly even odds. Seem reasonable?
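Plugging the stated inputs (a 0.99 prior for not-tiger, and likelihoods of 0.9 and 0.01) into Bayes' rule, with p(E) expanded by the law of total probability, can be sketched as follows. The variable names are just labels chosen for this sketch.

```python
# Inputs as stated in the example.
p_no_tiger = 0.99            # prior that there is no tiger
p_tiger = 1.0 - p_no_tiger   # prior that there is a tiger
p_see_if_tiger = 0.9         # p(E | tiger)
p_see_if_no_tiger = 0.01     # p(E | ~tiger): psychosis, drugs, etc.

# Law of total probability: p(E) sums over both possibilities.
p_e = p_see_if_tiger * p_tiger + p_see_if_no_tiger * p_no_tiger

# Bayes' rule.
p_tiger_given_e = p_see_if_tiger * p_tiger / p_e

print(f"p(E) = {p_e:.4f}")
print(f"p(tiger | E) = {p_tiger_given_e:.3f}")
```

The key step is that p(E) is not a free number: it has to be built from the same prior and likelihoods that appear in the numerator.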

Wallfly, it is very much related to falsifiability.

P(tiger|E) = p(tiger)*p(

3. Brendan 11:36 AM

Sharif,

Yep, that's a good way of modeling a simple case. One way of thinking about what is happening with derping is that (1) they assign super-high probabilities to certain propositions AND (2) they "insulate" these priors from revision by failing to assign high (or low) likelihoods to any particular predictions. So, for example, logical truths (such as "it is either raining right now or not raining right now") will tend to be well insulated (since they don't allow for many precise predictions), whereas beliefs about tigers won't be (since they do allow for predictions). Beliefs about economic theory *ought* to be more similar to the tiger case, but derpers fail to see this.

Of course, another possibility is that the derpers simply have incoherent probability assignments, and thus shouldn't be modeled using Bayesian epistemology at all. But that's no fun.

17. So a derp is akin to a fanatic - someone who redoubles or triples their efforts while forgetting what their cause or purpose was to begin with

18. Is derp then always a hypothesis, a la this Bayesian definition? I.e., it seems there should be a distinction between values (which are typically strongly held) and beliefs about the way the world is (the "model of" side of Geertz's definition of ideology - in contrast to the "model for").
What I am getting at is, some values (or lack thereof) may be repellent but I take it they are not "derp".

19. Dogmatic is an alternative. However Derp is far more satisfying.

20. Anonymous 11:07 AM

Derping is what parents use as a technique of training their infants to use language and just about everything else.

Read some Kuhn and Wittgenstein -- everyone depends upon derping to crawl out of their ignorance; without it, humans would remain cognitive snails.

21. According to the Merriam-Webster Thesaurus (http://www.merriam-webster.com/thesaurus/stubborn)

Stubborn (adjective): sticking to an opinion, purpose, or course of action in spite of reason, arguments, or persuasion

Synonyms: adamant, adamantine, bullheaded, dogged, hard, hardened, hardheaded, hard-nosed, headstrong, immovable, implacable, inconvincible, inflexible, intransigent, mulish, obdurate, opinionated, ossified, pat, pertinacious, perverse, pigheaded, self-opinionated, self-willed, stiff-necked, stubborn, unbending, uncompromising, unrelenting, unyielding, willful (or wilful)

22. Isn't twerp the noun of the past tense of to tweet?

23. ProfDC 1:30 PM

I came here from Josh Barro's blog, and love Noah's definition. Following the idea of WARP, how about DERP as a backronym for "Determined Exponents Repeating Priors"?

24. For example, there are people who believe, very strongly, that solar power will be cost-efficient. If you confront them with evidence that, according to Todd Woody of the New York Times, "Worldwide, testing labs, developers, financiers and insurers are reporting similar problems and say the \$77 billion solar industry is facing a quality crisis just as solar panels are on the verge of widespread adoption," they will continue to insist that, despite this evidence, solar will soon be cost-competitive with fossil fuels.

In addition, your link is far less positive about solar than you imply, pointing out that fossil fuels are still far cheaper, and that there's no way to store the electricity that solar cells generate.

1. you tell em hoss

25. Anonymous 5:56 PM

How many flerps would a herp derp herp if a herp derp could herp flerps?

26. A geek is someone who creates a mathematical definition of derp (don't take offense; I can geek out with the best).

A single quantum of derp is one of those comments at the bottom of a blog.

27. Anonymous 9:00 PM

I thought derp came from the movie Team America: World Police. What's the etymology of the word?

28. Noah,

I agree with what you're saying, but you should have chosen a better example than solar power costs -- or at least linked to a better source than that The Week article. I got through reading it, and all I could think of was Disco Stu's linear projection of disco record sales from 1974.

1. You have just demonstrated the phenomenon of anti-solar derp. Thanks for derping by! :-)

2. Brian —

1/ Disco Stu's projections are exponential, not linear.

2/ Because a 40-year trend of continued falling costs in solar power is the same as a 2 or 3 year trend of rising disco record sales, right?

The latter is a short-term trend based on fickle consumer tastes. The former is a long-term trend based on improved technology and there are massive, massive incentives for continued investment and innovation — falling energy costs, and clean non-polluting energy!

Obviously there are some potential hurdles and stumbling blocks on the road toward solar energy that is cheaper than fossil fuels. But comparing it to disco record sales is a pretty big herp.

29. I like it when you talk derpy to me.

30. The word "derp" comes from South Park (and BASEketball before that). I think it's a little ridiculous that you "consider [yourself] an expert in the use of 'derp,'" given that your version of derp is quite different from the word Stone and Parker coined. Whereas "derp!" was originally an exclamatory phrase, you use it as a noun. Both uses call to mind similar associations, but they are entirely different parts of speech, and therefore they are used in very different contexts.

What I mean to say is, you're welcome to use derp in this new way--language is meant to change over time, and that's perfectly great--but before you call yourself an expert in the use of a word, you might as well look into the other ways that people use that same word.

31. Sorry this is off topic but previously you have disputed that a problem with macroeconomics today is that accounting logic isn't followed. JKH has a post all about this. http://monetaryrealism.com/the-accounting-quest-of-steve-keen/
There hasn't been any input from those with your viewpoint so it is kind of one sided at the moment. It would be great to get your side of the argument.

32. Noah, the following:

"(And where do priors come from? On this, Bayesian theory is silent. Let's assume they come directly from your...um...posterior.)"

was precious. Still laughing with the word-play!

1. From your prior posterior :)

BTW, CA, ask any Bayesian about the silence on priors, and they'll laugh - at you.

33. Anonymous 1:45 PM

Noah:

Just need to say thanks for this post. The stuff on "derp" is interesting to be sure, but I had been searching for a simple, non-jargon-intensive explanation of just WTF Bayesian analysis is, exactly -- and had given up. Now I've got one.

34. "English has no word for "the constant, repetitive reiteration of strong priors".

1. No, propaganda is intentional disinformation.

2. That's not quite propaganda, which is presentation of only one side of argument. "Black propaganda" is maybe what you're describing, according to Wikipedia (yeah I know, Wikipedia). I like your derp definition and Bayesian explanation.

3. From wikipedia: "Propaganda is a form of communication that is aimed towards influencing the attitude of the community toward some cause or position by presenting only one side of an argument.".

One of the major techniques of propaganda is repetition.

I could understand if you wanted to say that derp is a particular type of propaganda.

4. Nope. Presenting only one side of the argument is a way of intentionally distorting the available evidence. Not the same thing as restating priors. Derp is not the same as propaganda.

5. Have you just restated a prior or presented only one side of the argument? :-)

Just as you presented some requirements for predictions, allow me to suggest a requirement for a claim that propaganda is not the same as derp: at the least you should make a Venn diagram of characteristics of propaganda, of derp, of both and neither. With such a model, we could have a better discussion.

6. Anonymous 10:51 PM

Derp = this discussion.

35. Anonymous 6:34 PM

Love the post & thanks for the info. However, your definition of the Flerp - "the amount of derp that can be herped by one twerp" - could be improved by adding a temporal dimension, and the Friedman Unit would be the most appropriate denominator.

But what would the numerator be? Perhaps "MeMes" - (pronounced "Me! Me!") to undermine the impressive aspect of "meme" as a measure of meaning, reminding the audience that the derp the twerp herped contains no real meaning beyond a cry for attention.

But the average twerp could probably produce an awful lot of MeMes per Friedman unit, so the Flerp is kinda like the Tesla - a unit which suffers from being rather too large for daily use (unless you study solar flares or run an MRI clinic). But this makes it (the Flerp) even more useful rhetorically, when one can accuse a twerp of herping a whole flerp of derp in 120 characters or le

-elkern

36. "And where do priors come from? On this, Bayesian theory is silent."

I'd have to think Bayesian statisticians and econometricians would think that the prior belief should come from an intelligent logical evaluation of the prior data, information, evidence. It shouldn't be dogmatic.

My adviser was a successful Bayesian, Chris Lamereoux. Once in a conversation he summed it up well: any good statistician will consider the prior, and other evidence outside of the current study, in an intelligent way in coming up with final beliefs, but Bayesians do so in a very formal way. But he agreed with me that that formality can be a straitjacket that can lead to a less accurate conclusion than a more flexible analysis and inclusion of your other evidence.

1. And I should note that Bayesians also do sensitivity analysis with priors. In other words, they try a series of ever stronger priors to see if the conclusion still holds up qualitatively to them.
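A prior sensitivity analysis of this kind can be sketched with a Beta prior on a rate. Everything specific here is an assumption made for illustration: the function `posterior_mean`, the hypothetical study (70 successes in 100 trials), the skeptical prior centred on 30%, and the conclusion being tested ("the rate is above 0.5").

```python
def posterior_mean(successes, trials, prior_mean, prior_strength):
    """Posterior mean of a rate under a Beta prior.

    The prior is Beta(a, b) with a = prior_mean * prior_strength and
    b = (1 - prior_mean) * prior_strength, so prior_strength behaves
    like a count of imaginary earlier observations.
    """
    a = prior_mean * prior_strength
    b = (1.0 - prior_mean) * prior_strength
    return (a + successes) / (a + b + trials)


# Hypothetical study data.
SUCCESSES, TRIALS = 70, 100
# Skeptical prior centred on a 30% rate, made progressively stronger.
PRIOR_MEAN = 0.3
STRENGTHS = [2, 10, 50, 1000]

results = {s: posterior_mean(SUCCESSES, TRIALS, PRIOR_MEAN, s) for s in STRENGTHS}

for strength, mean in results.items():
    verdict = "holds" if mean > 0.5 else "fails"
    print(f"prior strength {strength:>4}: posterior mean {mean:.3f}, 'rate > 0.5' {verdict}")
```

In this made-up case the conclusion survives priors worth 2, 10, or 50 imaginary observations and only fails once the skeptical prior outweighs the data by a factor of ten. That is the qualitative robustness check the comment describes.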

2. There was a comment (by Nate Silver? Krugman? A poli sci blogger?) that one advantage of formal models is that they force you to state your assumptions explicitly (being aware, of course, about implicit zeros from unconsidered factors).

As has been repeatedly pointed out, many frequentist procedures are equivalent to certain Bayesian procedures, given certain priors. This means that non-Bayesians are doing Bayesian-equivalent work, but without acknowledging or examining priors.
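One textbook instance of this equivalence, sketched under simple assumptions: for a success rate estimated from yes/no trials, the frequentist maximum-likelihood estimate coincides with the Bayesian posterior mode when the prior is flat, i.e. Beta(1, 1). The function names and the sample data (7 successes in 10 trials) are hypothetical.

```python
def mle(successes, trials):
    """Maximum-likelihood estimate of a success rate."""
    return successes / trials


def posterior_mode(successes, trials, a=1.0, b=1.0):
    """Mode of the Beta posterior for a rate, given a Beta(a, b) prior.

    The formula is valid when both posterior parameters exceed 1.
    The defaults a = b = 1 give the flat (uniform) prior.
    """
    post_a = a + successes
    post_b = b + (trials - successes)
    return (post_a - 1.0) / (post_a + post_b - 2.0)


freq = mle(7, 10)
flat_bayes = posterior_mode(7, 10)                # flat prior
skeptic_bayes = posterior_mode(7, 10, a=30, b=70)  # strong skeptical prior

print(f"MLE: {freq:.3f}")
print(f"posterior mode, flat prior: {flat_bayes:.3f}")
print(f"posterior mode, Beta(30, 70) prior: {skeptic_bayes:.3f}")
```

The frequentist answer matches the flat-prior Bayesian answer exactly, which is one way of seeing that the flat prior was there all along, just unstated.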

37. Strong priors do at times make sense. Say you go to a magic show, and the magician is very impressive and you don't have an idea how they did most of the tricks. Do you assume that everything you know about physics is wrong, or do you consider the possibility that Penn and Teller are just very good at their job?

Note, I saw a couple of episodes of a TV show where magicians tried to fool Penn and Teller, as in showing them a trick and the two tried to figure out how it was done, and a few managed to fool Penn and Teller themselves.

38. Anonymous 11:35 AM

"That twerp just herped a heap of derp!" sounds better to me, and you don't need to make up a word.

39. thanks for this post. I am in Italy and out of touch with how the kids these days talk (kids as in Krugman and the other Kewl Kids).

I was honestly wondering what "derp" means. I was accused of derp on twitter. Having read this post, I can now herp that the feeling is mutual.

40. Bayesian probability is a form of pseudo-science, as anyone with a formal education in mathematics should know. I say this because, even though I find your post hilarious, I also find you constantly invoke Bayes.

Probability is just a subset of measure theory. In particular, let F be a sigma-algebra on a non-empty set S, and Pr a real-valued function on F; then Pr is a probability measure on F if and only if

i) Pr is a non-negative real-valued function on F.
ii) Pr is completely additive in F (for any countably infinite collection of pairwise disjoint sets in F, the probability of their union equals the sum of their probabilities).
iii) Pr is normed (Pr(S)=1).

Obviously we can form state spaces and so on using the correct definition of the mathematical meaning of probability, and it's completely impossible to force a Bayesian read of this (there aren't enough variables to represent the nonsense, even if you assume that the domain is a collection of mental facts).

41. I love the video you linked to, but now I feel stupid for having an iPhone.

42. Alex Bollinger 7:29 AM

I agree. I've been seeing this attitude for years and I generally just call it "dishonesty." That's imprecise because lots of these people are also/instead stupid or ensconced in a world where outside criticism never makes it in (watching Fox all day, for example).

43. Anonymous 12:16 PM

If you're exposed to too many Derps, will you get Derpes?

44. Anonymous 9:14 AM

Actually in World of Tanks I drive a KV-2 with a 152 mm Derp Cannon. And I love Derp. I once killed 9 tanks with the Derp

from the Wot wiki- "Derp Gun - A gun that causes a lot of damage with one shot, usually having a very long reload time and low penetration. Usually associated with short, High-caliber guns that load HE. Arty's guns are not considered derps (e.g. the 'derp gun' on the USSR tank KV-2)"

45. Anonymous 9:42 AM

English language has no other word... how about mumpsimus?! doesn't have a ring to it, I guess.

46. Anonymous 3:35 AM

http://knowyourmeme.com/memes/derp

47. "(And where do priors come from? On this, Bayesian theory is silent. Let's assume they come directly from your...um...posterior.)"

Is it really silent? My understanding is that, as a field of study, it's clamorously noisy on that subject. The first (rather obvious) suggestion would be that we adopt as our priors all the previously established body of knowledge discovered by science. That way, if we see a rock falling upward, we can with some confidence rule out a lot of silly ideas to do with telekinesis, and instead home in on things like (a) a tornado in the vicinity, (b) it's not really a rock, (c) you imagined it, etc. If you want to find the objective truth, you need priors to guard you against taking a single freak result too seriously - they give appropriate weight to the previous mountain of evidence.

It was neat wordplay, but there's a possible implication here that you suspect all the previously established body of knowledge discovered by science was just pulled out of someone's ass, or else that you think I'd be no worse off if I pulled some alternative beliefs out of my own. And I'm fairly sure you don't believe that!

The problem with derp is not that these commentators have strong priors, but that they are simply not doing anything like "updating" of anything recognizably based in the hard-won established scientific knowledge.

They started with what they *wished* to be true, and now, whatever happens, they continue to cling to their comforting, fluffy myth-blanket, regardless. Just like religious folks. Not remotely related to Bayesian inference.

1. Anonymous 1:54 PM

Unfortunately, the word "Bayesian" has been used as a talisman by certain internet pundits to explain why hypotheses they don't like can't be true no matter what data support them. If the guy doing the assignment of prior probabilities insists, for example, that a non-Western medical treatment is as unlikely to be effective as a rock is to fall upward, then a dozen positive clinical trials can still leave him virtually certain that all of the actual *science* related to the hypothesis is wrong and should be rejected out of hand. A statistician might call him a Dunning-Kruger case, but he's an extreme example of the overall truth that most of the hypotheses we ponder involve subjects far less certain than gravity, and even if there has been some formal scientific study of the subject, one's evaluation of its meaning and strength will always be influenced by one's personal beliefs. Any estimation of the strength of new evidence for a hypothesis will be equally influenced by ideology - so if I like a study's results, I will say that p<.01 is very strong evidence, while if you don't like them, you will find some reason to call the study "flawed", "weak", or even "worthless". So I wonder if "Bayesian" talk without attached and fully justified numbers is not simply a form of pseudoquantification designed to intimidate one's ideological opponents.

There is also, by the way, the question of what kinds of evidence qualify as evidence at all. You suggest that priors might be derived from all previous "knowledge discovered by science." This opens up a can of worms regarding, at least, the treatment of knowledge not discovered by formal science. In my opinion, the proposition that rocks do not fly was adequately supported far earlier and more strongly by a couple hundred thousand years of human experience, involving literally trillions of man-hours of observation of rocks in their natural habitats, than by the relatively limited and theoretical scientific studies of the matter. If we presume that it is correct to reject the evidence of one's eyes in the case of a particular flying-rock sighting today, it would also have been correct to do so in ancient Rome on the basis of general understanding regarding the behavior of rocks. You certainly did not suggest that you think persons in ancient Rome or in nonliterate traditional cultures today could not claim to know that rocks don't fly - but there are others out there on the Net who imply just that, with some serious and in my opinion unsupportable epistemological implications.

48. The resistance or transition time to overcome strong priors may perhaps be reduced by smart implementations of at least two mechanisms: 1. an integrated 'incentivization' mechanism such as +/- reps, shares, and re-tweets, with that properly fed back into the user's view. 2. A 'credibility' badge overlay (also in the form of an incentivization element) that rewards users who have exhibited the capacity to overcome strong priors. We've seen evidence that both can be effective if properly utilized. In addition, we are developing technologies to implement these techniques which will ultimately be used to 'measure' the notions of credibility and truth of people in online discussions.

49. This is a great explanation for how internet pundits think about Bayesian inference. This is also a terrible explanation of Bayesian inference.

What you're describing is closer to Bayesian updating, and even then, you're not describing it very well. (For one, because Bayesian updating rules are pretty tricky, not widely accepted, and often hard to compute, unstable or undefined.)

50. A response: The involuntary "derp" http://benrizzo.wordpress.com/2013/08/15/the-involuntary-derp/

51. Anonymous 9:18 AM

Another factor regarding the strength of priors:

"It is difficult to get a man to understand something, when his salary depends upon his not understanding it!"
— Upton Sinclair

52. Derp is a mentally challenged dirtbag. Nuff Said.

53. This is the most useful article I've read all year.

54. Anonymous 12:48 PM

Every comment: aherp aderp

55. Useful jargon. Analogous terms would be "true believer" and "stupid" (there are many types of stupid, however). Derp seems another descriptor. The term "normalcy bias" works, maybe, to understand the cause of derp, but derp is more fun.

These are Cipolla's five fundamental laws of stupidity:

Always and inevitably each of us underestimates the number of stupid individuals in circulation.
The probability that a given person is stupid is independent of any other characteristic possessed by that person.
A person is stupid if they cause damage to another person or group of people without experiencing personal gain, or even worse causing damage to themselves in the process.
Non-stupid people always underestimate the harmful potential of stupid people; they constantly forget that at any time anywhere, and in any circumstance, dealing with or associating themselves with stupid individuals invariably constitutes a costly error.
A stupid person is the most dangerous type of person there is.

56. Thanks my friend told me I was derp and I was like what does that mean?

57. derp is something like myself a moron.