A frequent defense of this startling error rate is that the scientific process is supposed to wend its way through many wrong ideas before finally approaching truth. But that’s a complete mischaracterization of what’s going on here. Scientists might indeed be expected to come up with many mistaken explanations when investigating a disease or anything else. But these “mistakes” are supposed to come in the form of incorrect theories: that a certain drug is safe and effective for most people, say, or that a certain type of diet is better than another for weight loss. The point of scientific studies is to determine whether a theory is right or wrong. A study that accurately finds a theory to be incorrect has arrived at a correct finding. A study that mistakenly concludes an incorrect theory is correct, or vice versa, has arrived at a wrong finding. If scientists can’t reliably test the correctness of their theories, then science is in trouble; bad testing isn’t supposed to be part of the scientific process. Yet medical journals, as we’ve seen, are full of such unreliable findings.

Another frequent claim, especially within science journalism, is that the wrongness problems go away when reporters stick with randomized controlled trials (RCTs). These are the so-called gold standard of medical studies, and typically involve randomly assigning subjects to a treatment group or a non-treatment group, so that the two groups can be compared. But it isn’t true that journalistic problems stem from basing articles on studies that aren’t RCTs. Ioannidis and others have found that RCTs, too (even large ones), are plagued with inaccurate findings, if to a lesser extent. Remember that virtually every drug that gets pulled off the market when dangerous side effects emerge was proven “safe” in a large RCT. Even those studies of the effectiveness of third-party prayer were fairly large RCTs. Meanwhile, some of the best studies have not been RCTs, including those that convincingly demonstrated the danger of cigarettes and the effectiveness of seat belts.

Why do studies end up with wrong findings? In fact, there are so many distorting forces baked into the process of testing the accuracy of a medical theory that it’s harder to explain how researchers manage to produce valid findings, aside from sheer luck. To cite just a few of these problems:

Mismeasurement. To test the safety and efficacy of a drug, for example, what researchers really want to know is how thousands of people will fare long-term when taking the drug. But it would be unethical (and illegal) to give unproven drugs to thousands of people, and no one wants to wait 20 years for results. So scientists must rely on animal studies, which tend to translate poorly to humans, and on various shortcuts and indirect measurements in human studies that they hope give them a good indication of what a new drug is doing. The difficulty of setting up good human studies, and of making relevant, accurate measurements on people, plagues virtually all medical research.

Confounders. Study subjects may lose weight on a certain diet, but was it because of the diet, or because of the support they got from doctors and others running the study? Or because they knew their habits and weight were being recorded? Or because they knew they could quit the diet when the study was over? So many factors affect every aspect of human health that it’s nearly impossible to tease them apart and see clearly the effect of changing any one of them.

David H. Freedman is a contributing editor at The Atlantic, and a consulting editor at Johns Hopkins Medicine International and at the McGill University Desautels Faculty of Management.