F Preregistration and the Hypothesis: A Fix for the Reproducibility Crisis?
There is one final issue regarding reproducibility and the hypothesis that we have to go over because it concerns a method that is being widely touted as a solution to the crisis; it is the plan to “preregister” studies before conducting them.
Here's the rationale for it and what it would involve.
Box 7.2 When Reproducibility Took a Backseat to Truth: The LTP Wars
From the late 1980s and through the 1990s, rival laboratories studying how we learn and remember facts published a succession of progressively more sophisticated investigations that became known as the “LTP Wars.”62 At stake was an understanding of the molecular basis of memory and a phenomenon called long-term potentiation (LTP), which is still the leading contender to be the fundamental neurophysiological mechanism of memory storage.63 Throughout the decade-long contest, the laboratories rarely replicated each other's observations; instead, they executed rigorous conceptual tests of the two diametrically opposed hypotheses that were at the center of the dispute.
For more than 100 years, neuroscientists had suspected that synapses, the submicroscopic64 junction points where nerve cells in the brain signal to each other, are the sites of memory formation. Suppose that you learn that Lusaka is the capital of Zambia. If we were to zoom in on one of the prime memory formation regions of your brain, the hippocampus, we'd find that the synapses between the signaling cells and their receptive partners had been made “stronger” by LTP while you were learning. The strengthening means that the signaling cells can more easily activate their receptive cell partners in the neural circuit that stored “Lusaka” and retrieve the name from memory when you need it in the future. (Do not worry if you feel that you don't really understand memory much better from this description—no one else does either.) While there is still a lot to learn about memory, elucidating the precise molecular mechanisms of LTP would represent a huge step forward; it could, for instance, contribute to creating a drug to treat memory disorders, such as Alzheimer's disease.65
Broadly speaking, two kinds of changes could make synapses stronger: either the presynaptic (sending) cell could send out more chemical neurotransmitter or the postsynaptic (receiving) cell could gain more receptors to the transmitter that was sent out.
Think of a synapse as a miniature garden; the transmitter receptors are flowers. The garden becomes more beautiful if you make sure it gets plenty of sunlight (“presynaptic”) or, alternatively, if you enrich the soil (“postsynaptic”). The LTP Wars were fought to test the two hypotheses of where synaptic strengthening occurred: on the pre- or postsynaptic side of the synapse.
The first solid experiments detected more of the chemical neurotransmitter being sent out from the signaling cells after LTP was established.66 The results were predicted by the Pre-side hypothesis, but the experiment was not always reproducible. Rather than try to resolve the irreproducibility issue, the Post-side researchers took an entirely different tack. The presynaptic hypothesis also predicted that all postsynaptic receptor responses would be strengthened equally, just as all of the flowers grow better if the garden gets more sunlight. The Post-siders found that not all types of receptor responses were strengthened equally by LTP: one kind was greatly enhanced while another was completely unaffected. The presence of more neurotransmitter alone therefore could not explain the synaptic strengthening. Something on the postsynaptic side, in the “soil” of the synaptic garden, made some of the receptor responses stronger, they felt. The first skirmishes in the LTP Wars ended indecisively.
The next round began, again, not with attempts to replicate or reconcile the earlier results, but with the opening of new fronts by testing new predictions. Pre-siders used high-resolution electrical signaling data and, applying a well-established quantitative theory of neurotransmitter release, concluded that the LTP data were consistent with the presynaptic hypothesis. However, critically, their analysis depended on subtle assumptions that they could not verify. The Post-siders developed a similarly hard-core quantitative conceptual test of the Pre-siders' hypothesis and found that the presynaptic effect was a minor component; the main action was on the postsynaptic side.
And so it went for several more years, with the two sides subjecting the pre- and postsynaptic hypotheses to increasingly severe experimental tests. Because, as Karl Popper emphasized, we must decide when to accept that a test falsifies a hypothesis, and because falsification itself can never be conclusive, a struggle for Truth such as this one, between teams of brilliant and fiercely competitive investigators, can be protracted. The experiments were ingenious and intricate; attempts to annihilate the opposing hypothesis at first appeared decisive, only to have it reappear, like the evil robot in the Terminator movies, apparently unconquerable. Reproducibility, or the lack of it, no doubt created headaches for the participants throughout the LTP Wars, stimulated controversy, and wasted effort. Yet the headaches also drove the development of increasingly convincing and telling experiments. In the end, the Pre-siders could not counter the Post-siders' sharp attacks, and the Pre-siders essentially conceded. The Post-side hypothesis itself has withstood many challenges and has not been falsified. Thus far.
The story illustrates how a major basic science controversy is settled, not by reproducibility studies, but by intellectual struggle in which specific competing hypotheses are subjected to increasingly rigorous tests until one is eliminated.
If, before you did an investigation, you announced exactly what you planned to do; which data you would collect and how you would collect, analyze, and interpret them; and you were assured that your report would be published regardless of what you found, then, it is thought, you could not be biased in favor of any particular outcome. Your report would be less likely to be tainted by perverse incentives and more likely to be reproducible. The veracity of scientific reports would go up. This is the underlying premise of the program of preregistration. According to one scheme, “Before researchers even begin the experiments, they submit a manuscript presenting a clear hypothesis that they plan to test and their proposed experimental methods and analyses.” Presumably you could have done a pilot experiment or two, and the concept is that, before doing the major experiments, you submit an outline of proposed work to journals, where, after appropriate review and revision, it is accepted.
The journal agrees to publish the eventual data, provided that the authors do the work exactly as proposed. When applicable, the idea of doing experiments and letting the chips fall where they may has much in its favor. The caveat here is that preregistration may not always be applicable.
7.F.1 When Preregistration Can Help
We can start with the term “hypothesis” in the description of preregistration. Does it refer to a genuine, explicit scientific hypothesis, or is it merely a prediction (see Chapter 2)? Preregistration seems well suited to advanced (meaning based on copious prior information) focused studies in which only one or two variables are manipulated, the parameters of the design and analysis can be laid out in detail beforehand, and the experiments are executed almost robotically. This is the essence of a “decision-making” study (see Chapter 4). The archetypal clinical trial of drug efficacy is an ideal case. The study asks, “Does the drug work or doesn't it?” Preregistration is ideally suited for clinical trials, where the important variables—safety, dosage range, statistical procedures, patient pool—have been worked out. We want nothing but reliable data—no new, unexpected twists or discoveries at this stage, please! A preregistered protocol in such a case is just what the doctor ordered.
Pure Discovery Science projects are also compatible with preregistration. In the optimal condition, we simply want to characterize and classify parameters that we encounter in a novel data space. Naturally, although there are no explicit scientific hypotheses, Discovery Science involves plenty of implicit ones that could engage an investigator's conscious or unconscious bias, and preregistration could prevent bias from affecting the results.
7.F.2 When Preregistration Won't Work
Certain types of science are inherently incompatible with preregistration. When basic research scientists conduct an experiment, they engage in an active dialog with nature: you prod the preparation, see how it responds, formulate a hypothesis, test a prediction, do another test, reject the hypothesis, formulate a new one, etc.
Yet for preregistration to work, the study must be described in full detail in advance; no changes allowed. Few basic scientists, I suspect, could live with such restrictions.
What would you do if, while conducting a preregistered study, it became clear that your preregistered hypothesis was wrong and that a new one would account for the data much better? Do you stick with the proposed experiments to the bitter end? The answer should be “yes,” because that's what you promised to do, and, furthermore, that's what your statistical analysis requires you to do. Still, it might be a big sacrifice to put aside a more exciting idea to carry on the laudable, selfless path of filling in the blanks that you promised to fill in. A glance at many published papers shows that the plot line of a paper often goes through one or more initially promising and ultimately rejected hypotheses. Such studies are precluded by the preregistration protocol, which prohibits modifications to your experimental design. Why? Because modifications could defeat the whole purpose of preregistration, which is to prevent you from meddling with the process while it is under way. They would also undercut the promise that you made to the journal to do what you said you would do. Essentially, the journals would be giving you free rein to do anything you like while guaranteeing you a publication! Great for you, not so great for them.
One solution to such problems is to seal the data until it is all in, thus preventing any midcourse corrections. Again, while this could work for certain types of study (e.g., clinical trials), it will be impractical for many Small Science projects where investigators are in constant contact with their results.
Another problem that preregistration poses is how to prevent intellectual “misappropriation” of scientific insights, up to and including intellectual theft. Necessarily, the results of preregistered studies will be unknown before the experiment is conducted: that is, after all, the point.
The proposers have to lay all of their cards on the table: their brilliant, original hypothesis along with their experimental design and predicted outcomes. Reviewers, who are also often competitors, must have access to all relevant information in order to review the proposed work. And, unlike a conventional grant application or completed research article, the proposer may have no intellectual protection in the form of established priority (e.g., a meeting abstract or other public communication of the main finding that marks it as hers). Appeals to integrity (stealing ideas is wrong) will often have little effect because blatant intellectual theft is not the main concern. There is a gossamer, nearly invisible line between stealing someone else's idea and having a latent idea of your own take on urgent significance when you become aware of your competitor's interest in it. Consider: you've been kicking around a pet hypothesis for years, but for some reason haven't gotten around to exploring it in the lab. Suddenly you learn that your competitor is thinking along the same lines! You would be well within your rights to start work on it right away; it was, after all, your idea first! With fearful scenarios like this in mind, many scientists will understandably resist broadcasting their ideas through preregistration.
In other words, preregistration can be problematical, its disadvantages in direct proportion to the novelty and significance of the planned experiment, the ability of others to rip it off, and its value to the preregistrant, who might be a young scientist trying to launch her independent career.
There is also a practical question of how ready journals will be to base costly publication decisions on the promised execution of experimental designs. High-impact journals are constantly on the lookout for brand new, game-changing (the current buzz word is “transformative”) results. Can they be sufficiently confident that a promising experimental design will deliver the goods without seeing the results in advance? Will they take the chance that the data that they’ve agreed to publish will be boring, uninterpretable, or “negative”?
Will the scientific community seek out journals that specialize in preregistered studies? As it stands, there are lower impact journals that are considered “archival,” in which reliable, though not always exciting, work appears. The worth of the journal impact factor and similar scientific ratings systems has not yet been resolved, and the availability of scientific rewards (jobs, grants, etc.) to those publishing rigorous, but humdrum, preregistered studies in less than high-impact journals is also uncertain.
Finally, we may wonder how a preregistration system could be “gamed” and what could be done about it. It is common lore that researchers sometimes “propose” projects in grant applications that they have already largely completed. You can safely propose to do a seemingly risky experiment when you already know how it will turn out, where the likely pitfalls are, and how to avoid them. Why not collect the experimental results and then, with the data in hand, “preregister” a plan, complete with eye-catching manipulations, controls, and deep “foresight” into possible problems? If you were clever enough, you’d gain much credibility for a novel design, which will eventually deliver important results and, incidentally, grease the skids for your future projects. Hopes that a preregistration system cannot be rigged seem faint.
Other, equally commendable experiments in scientific publishing that might have an impact on reproducibility or improve scientific communication have not yet taken hold. At a number of journal websites it is now possible to publish comments in an online format linked to the article, but this worthwhile-sounding initiative hasn’t stimulated much commentary, it seems.
Preregistration will no doubt improve the reproducibility of certain kinds of scientific project—Discovery Science and clinical trials being perhaps the prime candidates—but truly hypothesis-based research is unlikely to benefit from the concept, at least without seismic changes in the culture of the basic science that is most closely associated with hypotheses.
7.