More philosophy-of-science blogging!
If you haven't yet read "Tantalus on the Road to Asymptotia", Ed Leamer's recent essay, and if you're at all interested in statistics, empirical economics, or science in general, then you should go read it. The essay is primarily a reply to an extremely important 2010 discussion paper by Joshua Angrist and Jörn-Steffen Pischke, called "The Credibility Revolution in Empirical Economics". That paper, in turn, is mainly a response to a 1983 Leamer essay called "Let's Take the Con Out of Econometrics". Such are the time scales over which deep academic debates are conducted. Actually, you should read all three.
The more recent two essays are discussing the idea of "natural experiments", and to what degree these make empirical economic studies (econometrics) more reliable. This is a very deep question about science. Normally, statistics can only see correlation, not causation; for example, you see that every time roosters crow, the sun comes up shortly afterward, but this doesn't tell you which caused which. A "natural experiment" would be if, for example, some disease killed all the roosters in town. Seeing that even without roosters, the sun still came up, you could conclude that rooster crowing (or, at least, rooster crowing in this specific town) was not necessary to summon the sun.
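The rooster logic can be made concrete with a toy simulation (entirely made up, of course - the variable names and the `simulate_day` function are just for illustration). Dawn causes both the crowing and the sunrise, so the two are perfectly correlated, yet the "natural experiment" of removing the roosters reveals that the crowing was never necessary:

```python
# Toy model of the rooster "natural experiment": dawn causes both the
# rooster's crow and the sunrise, so crowing and sunrise are perfectly
# correlated even though neither causes the other.

def simulate_day(roosters_alive):
    dawn = True                      # dawn happens every day
    crow = dawn and roosters_alive   # roosters crow at dawn, if any are alive
    sunrise = dawn                   # the sun rises because of dawn, not the crow
    return crow, sunrise

# Before the disease: crowing and sunrise always co-occur,
# so naive statistics would report a perfect correlation.
before = [simulate_day(roosters_alive=True) for _ in range(100)]
assert all(crow == sunrise for crow, sunrise in before)

# The "natural experiment": the disease removes the roosters,
# and the sun comes up anyway -- crowing wasn't necessary after all.
after = [simulate_day(roosters_alive=False) for _ in range(100)]
assert all(sunrise and not crow for crow, sunrise in after)
```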
This natural experiment is very similar to a lab experiment. In fact, how is it different? Well, you might say, a lab experiment is controlled, and a natural experiment is not; in a lab, you can make sure outside stuff isn't disturbing your setup, while in a natural experiment you can't. This, in fact, is Ed Leamer's critique of natural experiments.
But I'm not sure that's right. In a lab experiment, too, we only ever convince ourselves that we've excluded all the outside causes. Sometimes, stuff that we didn't think about is messing with our experiment - cosmic rays, the composition of the air, etc. Sure, lab experiments tend to exclude a lot more causes than natural experiments, but this need not be the case. For example, in finance experiments, even if you use an incredibly simple asset-market setup, subjects' behavior may be distorted by their pre-existing beliefs about the real-world stock market.
As I see it, the biggest advantage of lab experiments is that you can do them many times; you just can't do that with natural experiments. First, replication allows you to control for a lot more things, since any confounding influence would have to be constant over space and time. Second, you can generate as much data as you want, making small-sample problems (another problem noted by Leamer) irrelevant. And third, it allows you to vary the setup intentionally, exploring the scope of an effect or a theory and gaining a more complete picture of how the thing works. In other words, even in a perfectly designed natural experiment, we don't get to choose the questions we ask the world. In the lab, we do.
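The "as much data as you want" point is just the law of large numbers at work. Here's a toy illustration (the numbers are invented, not from any real experiment): the standard error of an estimated effect shrinks like 1/√n, so a lab that can rerun its experiment at will can drive small-sample problems toward zero.

```python
import math
import random
import statistics

# Toy illustration: a noisy measurement of a true effect of 1.0.
# Each replication adds Gaussian noise with standard deviation 2.
random.seed(0)
TRUE_EFFECT = 1.0

def estimate(n_runs):
    """Average the measured effect over n_runs noisy replications."""
    return statistics.mean(TRUE_EFFECT + random.gauss(0, 2) for _ in range(n_runs))

# More replications -> the estimate tightens around the true effect,
# with theoretical standard error 2 / sqrt(n).
for n in (10, 100, 10000):
    print(f"n = {n:>6}: estimate = {estimate(n):.3f}  "
          f"(theoretical SE = {2 / math.sqrt(n):.3f})")
```

A natural experiment hands you one draw at whatever n nature chose; the lab lets you pick n.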
Ed Leamer focuses on the effect of confounding influences in natural experiments. He suggests doing sensitivity analysis to find what assumptions and specifications you need for a result to hold. I think that's a good idea. Basically, it's a way of groping around for something that looks like a set of scope conditions - testing the hypothesis to failure, to get a slightly better idea of when your theory works and when it doesn't. (Of course, if the natural experiment were a lab experiment, you could do this infinitely better, but if wishes were horses...)
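In practice, sensitivity analysis often means re-running the same regression under different specifications and seeing whether the estimate survives. A minimal sketch, using simulated data (the variables and coefficients here are invented, not from any study Leamer discusses): the effect of x on y is fragile if it swings wildly as controls are added or dropped.

```python
import numpy as np

# Simulated data: z confounds the relationship between x and y.
# The true effect of x on y is 2.0.
rng = np.random.default_rng(0)
n = 500
z = rng.normal(size=n)                       # a potential confounder
x = 0.8 * z + rng.normal(size=n)             # "treatment", partly driven by z
y = 2.0 * x + 1.5 * z + rng.normal(size=n)   # outcome

def ols_coef_on_x(controls):
    """OLS of y on x, a constant, and the given controls; return x's coefficient."""
    X = np.column_stack([x, np.ones(n)] + controls)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta[0]

specs = {
    "no controls": [],
    "control for z": [z],
    "control for z and z^2": [z, z**2],
}
for name, controls in specs.items():
    print(f"{name:>22}: beta_x = {ols_coef_on_x(controls):.2f}")
# The "no controls" estimate is biased upward by the omitted confounder z;
# if the estimate jumps around like this across reasonable specifications,
# the result is fragile in Leamer's sense.
```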
As I see it, there are basically four "levels" of science. Each level gives you more confidence in your understanding of the world (i.e., in your theories and models). The levels are:
Level 1: History
This is basically just establishing precedents. It helps you define the set of things that can happen. Imagine a world without writing, and you'll see how important history is.
Level 2: Non-causal statistics
This is basically hunting for correlations. It can help you generate some guesses and ideas about what might cause what. It can also throw cold water on existing theories, since if A causes B, then we should probably see some kind of correlation between A and B, however noisy or indirect.
Level 3: Natural experiments
This is when you have some sort of randomized variation, but no ability to control the environment. An ideal natural experiment lets you establish that a causal effect occurred, but it's very hard to tell whether the setting was ideal or confounded, and you get only a limited amount of data.
Level 4: Lab experiments
By allowing replication and control of the environment, lab experiments usually produce more convincing conclusions about causal effects, generate as much data as you want, and let you explore the scope of the effects you find (i.e., when they do and don't happen).
If we could always understand the world through lab experiments, we would. When we can't put things in a lab - like the macroeconomy, or the Milky Way galaxy - then we should look for natural experiments. But if we can't find sources of random variation, then we should at least look for correlations. And if we don't have reliable quantitative data, the best we can do is just write down what we see.
Update: Noah receives a partial smackdown from...Noah's dad! The father is not satisfied with my one-dimensional classification of research methods, and wants to bring external validity into the picture:
Briefly, research methods vary on two important dimensions, one we can call internal validity (how sure are we that we know what caused our results?), and the other ecological validity (do our observations relate to the real world?). Only the experimental method can logically show cause-and-effect, so it is highest in internal validity, but the artificial situations created by controlling so many factors make it low in ecological validity (also, experiments can be flawed in many ways, such as poor methods, restriction in the range of observations, confounding factors we didn't think about, etc., which is why replication and attempts to falsify claims are intrinsically important to experimental science). Naturalistic observation is highest in ecological validity, lowest in internal validity. Other methods, such as correlation, ex post facto "experiments" (aka, "natural" experiments), and case studies are in-between on both dimensions.
Even experiments can vary on these two dimensions, some tightly controlled and measured, some using more naturalistic real-world manipulations and more complex settings in which many factors can interact. The ideal situation is one in which experiments at both ends of this continuum show the same thing, thereby bolstering internal and ecological validity. I refer to this approach as "alignment" in a research area, which helps tie real-world phenomena to causes. This ties, for example, my highly controlled lab research on creative cognitive processes (like fixation or incubation) to more naturalistic research with design students, and to research with real designers with real jobs.

Well, there you have it. Note that I was originally trained as a physicist, and in physics, the Principle of Superposition assures you that any conclusion with internal validity will have external validity as well (i.e., the real-world motion of objects is just assumed to be caused by a straightforward combination of things that you can observe in labs). This is less so in other sciences, and much less so in social sciences like psychology and econ.