Doctors believe in breaking fevers, though there is no evidence that it helps. Flu shots also don’t seem to work. I’ve also mentioned how ulcers came to be declared a disease caused by “stress”, when in fact they were clearly due to bacterial infection. Meanwhile, several large-scale tests of medicine use, from the RAND insurance study to the 2003 Medicare drug expansion, find minimal evidence that more medicine leads to better health.
I think our body of medical knowledge does illustrate how hard it can be to generate reliable knowledge, even in cases when we can easily run numerous experiments on a randomized basis.
The softer sciences envy the hard sciences. Their researchers envy how reliable the results of a physics or chemistry experiment are. In the hard sciences, it's possible to run experiments in which all of the relevant variables are controlled. Further, the models are simple enough that there aren't a host of alternative models that can explain any given experiment. For example, if your theory is that the acceleration due to gravity is the same for objects of any mass, and your experiment is consistent with that theory, it's hard to come up with any simpler theory that would explain the same thing. "It doesn't matter" is already as simple as it gets.
I spent a lot of time with the Learning Sciences group at Georgia Tech. While they put an admirably high effort into careful experimental validation of their tools, methods, and theories, they were quite frank that the experimental data were hard to draw inferences from. They could describe a situation, but they couldn't reliably tell you the why of a situation.
The problem is that even with randomized trials, there are so many variables that it's hard to draw any strong conclusions. There is always a plausible explanation based on one of the uncontrolled variables. For learning sciences, a particularly troublesome variable is the presence of an education researcher in the process. Students seem to always do better when there's an experimenter present. Take away the experimenter, and the whole social dynamic changes, and that has a bigger effect than the particular tool. Seymour Papert's Mindstorms is a notorious example. Papert paints a beautiful picture of students learning deep things in his Logo-based classrooms, a picture that has inspired large numbers of educators. I highly recommend it to any would-be teacher. However, nobody can replicate exactly what he describes. It seems you need Papert, not just his tools, and Papert is darned hard to emulate.
All too often we focus on a small effect that is dwarfed by the other variables. The teacher, the software engineer, and the musician are more important than the tools. In how many other areas of knowledge have we fallen into this trap? We ask a question that seems obviously the one to ask: Logo, or Basic? Emacs, or vi? Yet that question is framed so badly that we are doomed to failure no matter how good our experiments are. We end up comparing clarinets to marimbas, and from that starting point we'll never understand harmony and rhythm.