D Popper and Probability

Recall that, because scientific facts can't be proved to be true, philosophers who felt that it was important to justify scientific truths had to give up on the idea of conclusive verifiability and be satisfied with “verifiability in principle,” or probable verifiability (Chapter 2), replacing the ideal of Truth with a watered-down version.

Many people found this reasonable: if we can't really be sure of anything, they thought, then settling for probable truth would be an acceptable compromise. Popper found the thought of compromise completely unacceptable; it violated the principle that science aimed to discover Truth, and he took his principles seriously. Whatever “probable truth” might be, it definitely was not Truth, yet he could not arbitrarily dismiss it. Indeed, probability itself posed a serious challenge to his program of falsification.

5.D.1 Probable Truth and Popper

Popper began by analyzing the notion of the probable truth of a hypothesis from a frequentist perspective in the longest chapter in The Logic of Scientific Discovery.⁹ For example, he asked how you could define “probable” truth. Could the probable truth of a hypothesis—its probability—be the number of correct predictions a hypothesis makes divided by the total number of predictions it makes? At first glance, this might make sense: a “probably true” hypothesis should certainly make many correct predictions. The problem is that the frequentist probability, a ratio, will be strongly determined by the denominator, which in this case is the total number of predictions that a hypothesis makes. Every hypothesis potentially makes an infinite number of predictions,¹⁰ and we can only ever test a finite number of them. If you divide any finite number by an infinite number the result is zero, and, hence, the probable truth of a hypothesis assessed in this way is zero.

Not a very helpful insight.
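
The arithmetic behind this dead end can be written out in one line. As a sketch in notation that is mine rather than Popper's, let k be the number of predictions that have been tested and found correct, and let N be the total number of predictions the hypothesis makes; k is always finite, whereas N grows without bound:

```latex
P_{\text{freq}}(H) = \frac{k}{N},
\qquad
\lim_{N \to \infty} \frac{k}{N} = 0
\quad \text{for any fixed, finite } k .
```

However many predictions are confirmed, the denominator drives the ratio to zero.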

Taking another approach, Popper asks, “How, using the frequentist definition, would we interpret a hypothesis that has a probability of 0.5?” Would it mean that half of its predictions are true and half are false? This, too, is nonsensical. Or consider: if we don't know the Truth in advance, how would we know whether we're halfway to it? After exploring a number of logical possibilities, Popper concludes that “probable truth” is meaningless within a frequentist framework.

If probable truth meant anything, we'd have to understand it subjectively, but subjectivity has no place in Popper's worldview. For one thing, subjectivity would imply that truth would “depend on the skill and training of an experimenter,” rather than on “objectively reproducible and testable results.” The thought demolishes any hope of rigorously defining the probable truth of a hypothesis and, for Popper, also firmly closes the case against inductivism, which held that repeated observations made a conclusion more likely to be true. But Popper's failure to define probable truth was the beginning, not the end, of his real struggles with probability.

5.D.2 The Problem of Probability and Hypothesis Testing

The bedrock notion of Conjectures and Refutations is that, to be scientifically meaningful, a statement must be falsifiable in principle. Yet, even in principle, a probable statement cannot be absolutely falsified. Consider: when the local high school student asks you to buy a raffle ticket to support the school's chess team, you mentally write it off as a donation. Knowing that 1,000 tickets will be sold and that the odds are 999 to 1 against winning, you assume you have “no chance.” Nevertheless, “against all odds” you win! Amazing! Still, winning was always a possibility; in fact, any physical event whose probability is not literally zero may happen. This means that, no matter how improbable your experimental results are, they could not, strictly speaking, falsify your hypothesis.
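
The raffle arithmetic is simple enough to check directly. The following is a minimal sketch, assuming you hold a single ticket and every ticket is equally likely to be drawn; the function name is invented for illustration.

```python
from fractions import Fraction


def chance_of_winning(tickets_sold: int, tickets_held: int = 1) -> Fraction:
    """Exact probability of winning a fair raffle (assumed: one winning ticket drawn)."""
    return Fraction(tickets_held, tickets_sold)


p_win = chance_of_winning(1000)      # 1 ticket out of 1,000
odds_against = (1 - p_win) / p_win   # 999 to 1 against winning

# The probability is small but not zero, so a win cannot, strictly
# speaking, falsify the hypothesis that the raffle was fair.
win_is_possible = p_win > 0
print(p_win, odds_against, win_is_possible)
```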

The hope that falsification would provide a sound foundation for scientific reasoning was faltering. Probability, the stick that Popper had used to fend off his nemesis, came back to beat him.

5.D.3 Popper, Probability, and Falsification

One of the seminal events in the evolution of Popper's thinking was the test of Einstein's groundbreaking General Theory of Relativity that was carried out by the astronomer Arthur Eddington (see Chapter 2.I.2). The theory, you recall, predicts that light travels through space along curved paths. Eddington's observation that starlight did, in fact, curve as it traveled through space, an observation that falsified Newton's theory, greatly impressed Popper. Physics became his ideal of how science should be carried out. There was only one problem. Physics, especially the branch of physics called quantum mechanics, is probabilistic to its core (at least in its traditional interpretation; see Chapter 10.C for a nonprobabilistic interpretation). An electron is never in any one place; it only has a probability of being there. How could Popper reconcile the great strides that physicists were making with the fact that statements having a non-zero probability of being true can't be falsified?

Popper's solution was pragmatic. What physicists actually did was to consider some explanations as so improbable that they could safely, if tentatively, be ruled out. Physicists have, Popper said, adopted a convention, a methodological rule, that went something like this: we should not explain things as being the way they are because of a series of unlikely, unaccountable accidents. Instead, we should agree to disregard explanations that are sufficiently improbable that we can't (for the time being) accept that they are true.

Suppose that you're driving down a highway one night, and you notice that all of the hundreds of cars coming toward you have only their left headlights lit. While it is physically possible that all of the right-side headlights randomly happened to burn out, such a thing would be so extremely improbable that you should assume that there was a simpler causal explanation.

Maybe the line of left-lighted cars was a staged event of some kind, or maybe teenaged boys had smashed all of the right-hand lights of cars parked in a lot because that's the sort of thing that teenaged boys would do. If you concluded that all of the right headlights just happened to have burnt out, you would be violating the methodological rule not to accept extraordinarily improbable explanations for phenomena. Physicists established rules for how unlikely a result could be before they invented a theory to explain it.
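
To get a feel for how improbable the accidental explanation is, here is a rough sketch. The numbers are assumptions chosen for illustration, not figures from Popper or from the text: 300 oncoming cars, an independent 1-in-100 chance that any given right headlight has burned out, and a hypothetical cutoff of one in a billion for the methodological rule. The calculation works with base-10 logarithms because the probability itself is too small to store as an ordinary floating-point number.

```python
import math


def log10_prob_all_burned_out(n_cars: int, p_burnout: float) -> float:
    """log10 of the probability that every car independently has a
    burned-out right headlight (independence is an assumption)."""
    return n_cars * math.log10(p_burnout)


N_CARS = 300            # "hundreds" of cars; illustrative value
P_BURNOUT = 0.01        # assumed chance per car; illustrative value
RULE_CUTOFF = 1e-9      # hypothetical limit set by the methodological rule

log10_p = log10_prob_all_burned_out(N_CARS, P_BURNOUT)

# Under the rule, an explanation less probable than the cutoff is set
# aside (tentatively) in favor of a causal explanation.
reject_accident = log10_p < math.log10(RULE_CUTOFF)
print(f"probability of pure accident is about 10^{log10_p:.0f}")
print("look for a causal explanation:", reject_accident)
```

Changing the assumed numbers changes the exponent but not the conclusion: for any plausible burnout rate, the accidental explanation falls far below any cutoff a scientist would reasonably adopt.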

Popper fastens on the notion of a methodological rule and proposes that scientists in any field could agree to adopt their own convention for making decisions where probability came into play. The crux of the idea is that, if their experimental results were so unlikely that they fell outside the probability limit of what an existing theory could explain, then the results would falsify the theory. These ideas are close in spirit to the statistical testing that scientists today do. For example, the “methodological rule” for considering a result in biology to be significant is commonly accepted as p < 0.05. Yet Popper was working in the 1930s, before the establishment of modern statistical methods, and he doesn't refer to proper statistical testing.¹¹

You may find Popper's proposal unremarkable, even obvious, but it has subtle and far-reaching implications that must be understood by anyone wishing to know how science works and how statistics influences our knowledge, so I'll take a minute to emphasize it here and come back to it later as well.

What we're saying is that Popper's methodological rule, as it has been extended and refined by statistical analysis, is a tool for making decisions; as a tool it is vitally important, though it is still only a tool. A statistical test can reveal that a particular result is so extremely unlikely in light of a given hypothesis that, according to our judgment and the conventions adopted in our branch of science, we can regard it as effectively impossible.

Thus, if such a result occurs, we can consider it as evidence that falsifies the hypothesis. Statisticians and philosophers still actively debate the meaning and validity of the rules, as well as the distinct but related matter of what conclusions and opinions are “warranted” by statistical test results.¹² These elevated conversations notwithstanding, scientists today consider a statistical significance level (nowadays, a p-value) to be a rational device for evaluating the predictions made by a hypothesis.
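
As an illustration of the rule acting as a decision tool, the sketch below tests a toy hypothesis that is not from the text: that a coin is fair. It computes an exact one-sided binomial p-value using only Python's standard library and compares it with the conventional 0.05 level. The data (16 heads in 20 tosses) and the function name are made up for the example.

```python
from math import comb


def one_sided_p_value(successes: int, trials: int, p_null: float = 0.5) -> float:
    """P(X >= successes) when X ~ Binomial(trials, p_null)."""
    return sum(
        comb(trials, k) * p_null**k * (1 - p_null) ** (trials - k)
        for k in range(successes, trials + 1)
    )


ALPHA = 0.05                       # conventional significance level
p_value = one_sided_p_value(16, 20)

# The methodological rule: treat a result this improbable under the
# hypothesis as (tentative) evidence against the hypothesis.
falsified_by_convention = p_value < ALPHA
print(f"p = {p_value:.4f}; reject fair-coin hypothesis: {falsified_by_convention}")
```

The p-value here is about 0.006, below 0.05, so by convention the result counts against the hypothesis. The result is still not impossible, which is why the rejection remains a decision made under a rule and not a proof.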

We keep in mind, though, that getting a result with a low p-value, say p < 0.001, means only that such a result would rarely occur by chance alone; it does not mean that the result is special or, more importantly, that it is true. A significance level is a handy indicator, not an end in itself; it is like the oil level indicated by the dipstick in your car's engine, not a merit badge or an achievement. It gives you information that, provided you're willing and able to interpret it, may be useful to your scientific decision-making process. We'll return to the concept of p-value later in this chapter and again in Chapter 7 on the Reproducibility Crisis.

In summary, despite the fact that non-zero probability statements in themselves are literally unfalsifiable, when combined with a methodological rule for interpreting them, they are compatible with falsification tests of predictions. (Remember that statistical tests are testing scientific predictions directly, and scientific hypotheses only indirectly.) Sound statistical reasoning is entirely consistent with the rest of Popper's thinking: uncertainty is the ultimate reality, and, as long as the ultimate goal is the search for Truth and the results are permanently subject to criticism and revision, we can learn from our mistakes. Statistical testing fits right in.

The question then becomes, “What kind of statistics should we use?” The standard answer for biologists, psychologists, and others has been the frequentist-based NHST program, which we'll turn to next. While I'll start with a few basic concepts, even readers with solid practical experience in statistical methods may find something new here. We scientists need to know a little about the complex history of what we've come to think of as “statistics” in order to appreciate how we've been misled and, in many cases, gone wrong. First, what is NHST all about?


Source: Alger, Bradley E. Defense of the Scientific Hypothesis: From Reproducibility Crisis to Big Data. Oxford University Press, 2020. 449 p.
