WILL A MORE GENERAL APPEAL TO PROBABILITY SHOW THE EPISTEMIC VALUE OF SIMPLICITY?

Let's start with a simple formulation of Bayes' theorem:

$$p(h/e) = \frac{p(h) \times p(e/h)}{p(e)}$$

On the left, p(h/e), the “posterior” probability, represents the probability of the hypothesis h given the evidence e.

On the right, p(h), the “prior probability,” represents the probability of h independently of the information e. The expression p(e/h), the “likelihood,” represents the probability of information e, on the assumption that h is true. And p(e) represents the prior probability of the information e. There are two ways that the simplicity of h could affect the posterior probability of h: by affecting the prior probability of h or by affecting the likelihood (or, of course, both). In this section I will discuss simplicity and prior probability. In section 9, I will turn to simplicity and likelihood.

Let us suppose that we have two conflicting hypotheses, h1 and h2. And let us also suppose that both hypotheses entail the evidence e, so that the “likelihood” of each hypothesis is the same, viz., 1. In this case, the posterior probability of each will be equal to the prior probability of the hypothesis in question divided by the prior probability of e. Now suppose that h1 has a higher prior probability than h2. Then, since the prior probability of e is the same in both cases, and since the likelihood of each hypothesis is 1, it follows from Bayes' theorem that p(h1/e) > p(h2/e).
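The step from priors to posteriors can be written out explicitly. What follows is a short worked derivation in the same notation; it assumes p(e) > 0 and p(h2) > 0, which the text leaves implicit.

```latex
\begin{align*}
p(h_1/e) &= \frac{p(h_1)\, p(e/h_1)}{p(e)} = \frac{p(h_1)}{p(e)}
  && \text{since } p(e/h_1) = 1, \\
p(h_2/e) &= \frac{p(h_2)\, p(e/h_2)}{p(e)} = \frac{p(h_2)}{p(e)}
  && \text{since } p(e/h_2) = 1, \\
\frac{p(h_1/e)}{p(h_2/e)} &= \frac{p(h_1)}{p(h_2)} > 1
  && \text{whenever } p(h_1) > p(h_2).
\end{align*}
```

When both likelihoods equal 1, the evidence leaves the ratio of the two probabilities unchanged, so any ordering of the posteriors is inherited entirely from the priors.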

Now let's introduce simplicity. Suppose that if one hypothesis is simpler than another, then it has a higher prior probability. That is, suppose that if h1 is simpler than h2, then p(h1) > p(h2). Then on simplicity grounds alone, it follows that p(h1/e) > p(h2/e). Now, if we construe probability in an epistemic way (as measuring some notion of reasonable belief), then using probability, we show that simplicity is indeed an epistemic virtue.

We show that if two conflicting hypotheses each entail e, then, given e, it is more reasonable to believe the simpler hypothesis than the more complex one. Admittedly, this by itself is not sufficient to make e evidence that h—in the sense of A-evidence or explanatory B-evidence. Nor is it strong enough to make it reasonable to believe the simpler hypothesis. But simplicity will provide a basis for comparing the believability of hypotheses. It will be of some epistemic value.

The crucial question, then, is why we should assign h1 a higher prior than h2 on grounds of simplicity. The answer varies depending on your interpretation of probability. If you are a subjective Bayesian, you can assign prior probabilities any way you like, for any reason whatever, so long as your entire system of probabilities, prior and posterior, is “coherent,” that is, so long as all your probability assignments do not violate the formal rules of mathematical probability. “Coherence” in your set of probabilities (your “degrees of belief”) is both necessary and sufficient for your set of beliefs to be reasonable, according to the subjective Bayesian. So, if you want to assign higher prior probabilities to simpler hypotheses, you are free to do so, so long as you satisfy coherence. You may have reasons for assigning higher priors to simple hypotheses, including the belief that nature is simple, but they are your reasons, and it does not follow that others should do the same. Except for coherence, it is entirely subjective.

This view should be distinguished from the pragmatic claim that I will discuss later: the claim that simplicity is a pragmatic virtue, one that can make a theory easier to use. Like subjective probabilities, this can vary from one person to another. But the pragmatic claim is not an epistemic one. It does not say that if one theory is simpler to use than a competitor, then the probability of that theory, subjective or otherwise, is higher than that of the competitor. I won't pursue a subjective probability account of simplicity, because it essentially abandons the attempt to justify simplicity as an epistemic virtue, at least as the universal, objective epistemic virtue that most simplicity enthusiasts, including Newton and Einstein, want. In effect, it says that it can be a virtue or not, depending on your subjective beliefs.

What I will ask is how an objectivist about probability can justify the claim that simpler theories have a higher prior probability. This will vary, depending on what sort of probability objectivist you are. If you are a frequentist or a propensity theorist, then the question of simplicity doesn't arise. You don't assign probabilities to hypotheses, but to types or classes of events in a sequence or to propensities in the world. And these probabilities are not in any very direct way epistemic, unless you add some postulate relating frequency or propensity probabilities to epistemic ones.^[79] The question whether string theory is more probable, and therefore more believable, than the standard model because it is simpler cannot be answered by the frequentist, whose definition of probability is not applicable to theories. So let's concentrate on objective probability theories that are explicitly epistemic. These are of two sorts: empirical (e.g., my own objective epistemic theory^[80]) and a priori (e.g., Carnap's).

With the former, prior probabilities are, in general, empirical.^[81] Referring to Bayes' theorem, p(h) is the probability of h independently of e. It is not the probability of h independently of everything. So, e.g., if h is that Sally has disease D, and e is that Sally has symptoms S, then one empirical way to interpret p(h) is as a number determined at least in part by the empirically established “base rate” of disease D in the general population. Now, suppose we make the very broad claim that, for any hypothesis h that assigns a property to an individual, the simplicity of h gives h a higher prior probability than does the “base rate” of that property in the general population by itself. How is that to be justified empirically? The only way I can see is by empirically establishing that: (i) nature is simple, or (ii) simple theories have turned out to be more empirically successful than complex ones.

But claims (i) and (ii) are just the ones I have been questioning and found wanting so far. It will not do to invoke objective empirical priors based in part on claims about the simplicity of nature, or the success of simple theories, unless you can empirically justify those claims.

So let's try a view of prior probability according to which it will be an a priori fact, not an empirical one, that simpler theories have higher prior probabilities than more complex ones. The best exponent of such a viewpoint is Carnap.^[82] For him, probabilities are to be determined by reference to a “linguistic framework” that contains a fundamental vocabulary, rules of inference, and probability axioms that include, but go well beyond, the usual axioms of probability. There are many different possible linguistic frameworks, with different prior probabilities assigned. In some frameworks, simple hypotheses will have much higher prior probabilities than complex ones, in others, less so (and there are even frameworks in which simplicity plays no role whatever). Once a linguistic framework has been chosen, then, for Carnap, it is an a priori fact what prior probability an hypothesis will have.

For example, suppose we have a linguistic framework containing just one property term P, and two names a and b. There are four possible basic states of a world describable in such a framework: Both a and b have P; a has P but b does not; a does not have P but b does; and neither a nor b has P. These, Carnap calls “state-descriptions” (they describe possible states of the world that our linguistic framework can represent). Now, we are to assign prior probabilities to these state-descriptions, subject to the standard rules of probability. How shall we do so in such a way that these rules are satisfied? There are an infinite number of ways. One that Carnap particularly likes (he calls it c*) is to assign equal priors to the state-descriptions in which both a and b have P and in which both a and b lack P.

Without going into the mathematical details, the assignment of priors will be such that both these state-descriptions will have a prior probability of 1/3, whereas the two state-descriptions in which P is present in one item and lacking in the other will each have a prior of 1/6. Now, in one respect, the two favored state-descriptions are simpler than the others. In both, everything is uniform: everything either has P or lacks it; whereas in the others, one thing has P while the other lacks it. The former represent simpler possible worlds than the latter. They are more uniform.
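The figures 1/3 and 1/6 can be reproduced with a small computation. The sketch below assumes the usual textbook presentation of Carnap's c*-style assignment for a one-predicate language: state-descriptions are grouped by how many individuals have P (Carnap's "structure-descriptions"), each group receives equal weight, and that weight is split equally among the group's members. The function name `c_star_priors` and the tuple encoding of state-descriptions are illustrative choices, not anything from the source.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product


def c_star_priors(individuals=("a", "b")):
    """Return a prior for each state-description of a one-predicate language.

    A state-description is encoded as a tuple of booleans, one per
    individual, where True means "has P". (Hypothetical encoding.)
    """
    states = list(product([True, False], repeat=len(individuals)))

    # Group state-descriptions that differ only in WHICH individuals
    # have P, i.e. by the number of individuals that have P.
    groups = defaultdict(list)
    for state in states:
        groups[sum(state)].append(state)

    # Each group gets equal weight; members of a group share it equally.
    group_weight = Fraction(1, len(groups))
    priors = {}
    for members in groups.values():
        for state in members:
            priors[state] = group_weight / len(members)
    return priors
```

Calling `c_star_priors()` for the two names a and b yields 1/3 for `(True, True)` and `(False, False)`, and 1/6 for each of the two mixed state-descriptions, matching the values in the text. The uniform worlds come out ahead only because of how the grouping rule was stipulated, which is the point the following paragraphs press.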

What is the justification for assigning priors in such a way that simpler hypotheses have higher priors than complex ones? There are two ways to deal with this question. One is to treat it as what Carnap calls an “internal” question, which is to be answered by reference to the rules governing the framework. The other is to treat it as an “external” question, one to be answered by indicating why those rules rather than others were chosen. Treated as an internal question, Carnap's answer is that this is just a stipulation, or convention, needed to formulate the most fundamental rules of the linguistic framework. Within the framework there is no argument to be given for formulating the fundamental priors one way rather than another. As an internal question in the example above we can ask what the prior probability is of the state-description in which both a and b have P. The answer to this is determined a priori by appeal to the rules of that framework. Similarly, we can determine a priori by computation and appeal to the rules whether in general simple hypotheses have higher priors than complex ones. If they do, and if we interpret probabilities epistemically as providing a basis for (degrees of) rational belief, which Carnap allows, then we get the a priori result that simplicity is an epistemic virtue. But if we ask why the rules are set up this way rather than some other (why, e.g., the simpler hypothesis “everything has P” has a higher prior than the more complex hypothesis “one thing has P and one thing lacks it”), that is an external question to be answered pragmatically in terms of ease of use and familiarity or aesthetically in terms of the elegance of the system. External questions cannot be answered empirically or by a priori calculation.

For Carnap, then, you can build a probability system so that simple hypotheses will have significantly higher priors than complex ones. You can also build a probability system so that simple hypotheses will have only modestly higher priors than complex ones, or even ones in which the priors are the same. But the only justification you can have for using one system rather than another is pragmatic or aesthetic, not epistemic. In short, using probability you can represent the claim that simplicity is an epistemic virtue by assigning higher priors to simpler hypotheses. But this does not suffice to justify such an assignment, either empirically or a priori.
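To make the range of available systems concrete, here is a minimal sketch for the two-name, one-property example. It blends the uniformity-favoring assignment from the text (1/3, 1/6, 1/6, 1/3) with a flat assignment of 1/4 to every state-description. The blending weight `w` and the function `mixed_priors` are hypothetical devices for illustration; they are not Carnap's own parametrization of frameworks.

```python
from fractions import Fraction


def mixed_priors(w):
    """Blend a uniformity-favoring assignment with a flat one.

    w = 1 reproduces the 1/3, 1/6, 1/6, 1/3 assignment from the text;
    w = 0 gives every state-description the same prior, 1/4.
    (The parameter w is an illustrative assumption.)
    """
    w = Fraction(w)
    if not 0 <= w <= 1:
        raise ValueError("w must lie between 0 and 1")

    favoring = {
        "a and b have P": Fraction(1, 3),
        "only a has P": Fraction(1, 6),
        "only b has P": Fraction(1, 6),
        "neither has P": Fraction(1, 3),
    }
    flat = {state: Fraction(1, 4) for state in favoring}

    # A weighted average of two probability assignments is itself a
    # probability assignment, so every choice of w is coherent.
    return {s: w * favoring[s] + (1 - w) * flat[s] for s in favoring}
```

With `w = Fraction(1, 2)`, the uniform state-descriptions get 7/24 each and the mixed ones 5/24 each, a modest advantage for simplicity. Every value of `w` satisfies the probability rules equally well, so nothing in the calculation itself favors one setting over another.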


Source: Achinstein, P. Speculation: Within and about Science. Oxford: Oxford University Press, 2019. 297 p.
