THE PROBABILITY DEFINITION
According to one probability definition, e is potential evidence that h if and only if the probability of h given e is greater than the prior probability of h:
(1a) e is potential evidence that h if and only if p(h/e) > p(h).
Or, if b is background information,
(1b) e is potential evidence that h if and only if p(h/e&b) > p(h/b).
A definition of this sort is offered by many writers.[7] However, despite its widespread acceptance it cannot possibly be correct if “evidence” and “probability” are being used as they are in ordinary language or science. For one thing, neither (1a) nor (1b) requires that e be true; and this, as noted earlier, seems to be necessary for evidence. That Alan has yellow skin is not evidence that he has an i.c.b., if he does not have yellow skin. However, even with the addition of a truth-requirement the resulting definition is unsatisfactory. I shall concentrate on (1b), since this is the most prevalent form of the definition, and note two types of counterexamples. The first shows that an increase in probability is not sufficient for evidence, the second that it is not necessary.
The lottery case. Let b be the background information that on Monday 1,000 lottery tickets were sold and that John bought 100 and Bill bought 1. Let e be the information that on Tuesday all the lottery tickets except those of John and Bill have been destroyed but that one ticket will still be drawn at random. Let h be the hypothesis that Bill will win. The probability that Bill will win has been increased approximately tenfold over its prior probability. But surely e is not evidence that Bill will win. If anything it is evidence that John will win.
Reverting to the principle of the previous section which relates potential (as well as veridical) evidence to a good reason for belief, assume for the sake of argument that in the light of b, e is potential evidence that (h) Bill will win.
Then according to the principle of reasonable belief, given b, e is a good reason for believing h. But surely it is not. In the light of the background information that on Monday John bought 100 and Bill bought 1 of the 1,000 lottery tickets sold, the fact that on Tuesday all of the tickets except those of John and Bill have been destroyed but one ticket will still be drawn at random is not a good reason at all for believing that Bill will win. Someone who believes that Bill will win for such a reason is believing something irrationally.
Events often occur which increase the probability or risk of certain consequences. But the fact that such events occur is not necessarily evidence that these consequences will ensue; it may be no good reason at all for expecting such consequences. When I walk across the street I increase the probability that I will be hit by a 2001 Cadillac, but the fact that I am walking across the street is not evidence that I will be hit by a 2001 Cadillac. When Michael Phelps goes swimming he increases the probability that he will drown, but the fact that he is swimming is not evidence that he will drown.
What these examples show is that for e to be evidence that h it is not sufficient that e increase h's (prior) probability. The next example shows that it is not even necessary.
The paradox of ideal evidence.[8] Let b be the background information that in the first 5,000 spins of this roulette wheel the ball landed on numbers other than 3 approximately 35/36ths of the time. Let e be the information that in the second 5,000 spins the ball landed on numbers other than 3 approximately 35/36ths of the time. Let h be the hypothesis that on the 10,001st spin the ball will land on a number other than 3. The following claim seems reasonable:
p(h/e&b) = p(h/b) = 35/36.
That is, the probability that the ball will land on a number other than 3 on the 10,001st spin is unchanged by e, which means, according to (1b), that e is not evidence that h.
But it seems unreasonable to claim that the fact that the ball landed on numbers other than 3 approximately 35/36ths of the time during the second 5,000 spins is not evidence that it will land on a number other than 3 on the 10,001st spin, even though there is another fact which is also evidence for this. More generally, e can be evidence that h even if there is other equally good evidence that h. To be sure, if we have already obtained the first batch of evidence there may be no need to obtain the second. But this does not mean that the second batch is not evidence that h.
In the light of these examples perhaps it will be agreed that “e is evidence that h” cannot be defined simply as “e increases h's probability.” But it may be contended that a related concept can be so defined, namely, “e increases the evidence that h.” Thus,
e increases the evidence that h if and only if p(h/e&b) > p(h/b).
However, increasing the evidence that h is not the same as increasing the probability of h. To increase the evidence that h is to start with information which is evidence that h and add to it something which is also evidence that h or at least is so when conjoined with previous information. But to do this it is neither sufficient nor necessary to increase h's probability. The lottery example shows that it is not sufficient, while the paradox of ideal evidence shows that it is not necessary. In the lottery example there is no increase in evidence that Bill will win, since in the first place there is no evidence that he will win, and the combined new and old information is not evidence that he will win, even though the probability that he will win has increased. In the paradox of ideal evidence there is an increase in evidence that the ball will land on a number other than 3 on the 10,001st spin, but there is no increase in the probability of this hypothesis.
At this point a second definition of evidence in terms of probability might be offered.
(2) e is potential evidence that h if and only if p(h/e) > k (where k is some number, say 1/2).
Some writers, indeed, claim that the concept of evidence (or, confirmation) is ambiguous and that it can mean either (1) or (2).[9] One of these meanings is simply that given e, h has a certain (high) probability.
This proposal has the advantage of being able to handle both the lottery case and the paradox of ideal evidence. In the lottery case, although h's probability is increased by e, the probability of h given e and b is not high. (It is 1/101.) Therefore, by (2), e&b is not evidence that Bill will win. On the other hand it is evidence that John will win, since p(John will win/e&b) = 100/101. And this is as it should be.
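The arithmetic behind the lottery case can be checked with a short sketch. It assumes that every ticket in the draw is equally likely to be picked (the example says only that the draw is random) and it takes k = 1/2, the illustrative value mentioned in (2). The function names are hypothetical labels for definitions (1b) and (2), not anything proposed by the writers cited.

```python
from fractions import Fraction

K = Fraction(1, 2)  # illustrative threshold from (2); an assumption, not a fixed value


def raises_probability(posterior, prior):
    """Definition (1b): e counts as evidence iff p(h/e&b) > p(h/b)."""
    return posterior > prior


def exceeds_threshold(posterior, k=K):
    """Definition (2): e counts as evidence iff the probability of h exceeds k."""
    return posterior > k


# Background b: 1,000 tickets sold; John holds 100, Bill holds 1.
sold, john, bill = 1000, 100, 1
# Information e: only John's and Bill's tickets survive to the draw.
remaining = john + bill

# Assumes every ticket in the draw is equally likely to be picked.
prior_bill, post_bill = Fraction(bill, sold), Fraction(bill, remaining)
prior_john, post_john = Fraction(john, sold), Fraction(john, remaining)

for name, prior, post in [("Bill", prior_bill, post_bill),
                          ("John", prior_john, post_john)]:
    print(name, prior, post,
          "(1b):", raises_probability(post, prior),
          "(2):", exceeds_threshold(post))
```

On these assumptions (1b) counts e as evidence for both hypotheses, since both probabilities rise, whereas (2) counts it as evidence only that John will win.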
The paradox of ideal evidence is also avoided by (2) since the fact that (e) the ball landed on numbers other than 3 approximately 35/36ths of the time during the second 5,000 spins makes the probability very high that it will land on a number other than 3 on the 10,001st spin. In this case p(h/e) > k, and therefore, by (2), e is evidence that h, even though p(h/e&b) = p(h/b).
However, (2) is beset by a major problem of its own.
The Wheaties case (or the problem of irrelevant information).[10] Let e be the information that this man eats the breakfast cereal Wheaties. Let h be the hypothesis that this man will not become pregnant. The probability of h given e is extremely high (since the probability of h is extremely high and is not diminished by the assumption of e). But e is not evidence that h. To claim that the fact that this man eats Wheaties is evidence that he will not become pregnant is to make a bad joke at best.
Such examples can easily be multiplied. The fact that Jones is drinking whisky (praying to God, taking vitamin C, etc.) to get rid of his cold is not evidence that he will recover within a week, despite the fact that people who have done these things do generally recover within a week (i.e., despite the fact that the probability of recovering in this time, given these remedies, is very high).
It may well be for this reason that some writers prefer definition (1) over (2). On (1) e in the present examples would not be evidence that h because p(h/e) = p(h), i.e., because e is probabilistically irrelevant for h. I would agree that the reason that e is not evidence that h is that e is irrelevant for h, but this is not mere probabilistic irrelevance (as will be argued later).
A defender of (2) might reply that in the Wheaties example we can say that the probability of h given e is high only because we are assuming as background information the fact that no man has ever become pregnant; and he may insist that this background information be incorporated into the probability statement itself by writing “p(h/e&b) > k”. In section 5 contrasting views about the role of background information with respect to probability (and evidence) statements will be noted, only one of which insists that such information always be incorporated into the probability statement itself. However, even if the latter viewpoint is espoused, the Wheaties example presents a problem for (2) if we agree that information that is irrelevant for h can be added to information that is evidence that h without the result being evidence that h.
Suppose that b is evidence that h and that p(h/b) > k. There will be some irrelevant e such that p(h/e&b) = p(h/b), yet e&b is not evidence that h. Thus let h be the hypothesis that this man will not become pregnant. Let b be the information that no man has ever become pregnant, and let e be the information that this man eats Wheaties. We may conclude that p(h/e&b) > k, which, as demanded above, incorporates b into the probability statement. But although b is evidence that h we would be most reluctant to say that e&b is too.[11]
Before leaving probability definitions one further comment is in order. In a paper entitled “Bayes' Theorem and the History of Science” (Minnesota Studies in the Philosophy of Science, vol. 5, ed. R. Stuewer) Salmon has suggested that the notion of evidence that confirms a hypothesis can be understood in terms of Bayes' theorem of probabilities, a simple form of which is p(h/e) = p(h) × p(e/h)/p(e). According to this theorem, to determine the probability of h on e we must determine three quantities: the initial probability of h (p(h)), the “likelihood” of h on e (p(e/h)), and the initial probability of e (p(e)). Salmon criticizes a view of evidence which says that if a hypothesis entails an observational conclusion e which is true, then e is evidence that h. This view, he points out, considers only one of the probabilities above, namely, p(e/h) = 1. To determine whether e is evidence that h one must also consider the initial probabilities of h and e.
This may be a valid criticism but it does not avoid the previous problems. Suppose that Bayes' theorem is used to determine the “posterior” probability of h, that is, p(h/e), by reference to the initial probabilities of h and of e and the likelihood of h on e. We must still determine what, if anything, this has to do with whether e is evidence that h. If definition (1) or (2) is used to determine this we confront all of the previous difficulties even though we have used Bayes' theorem in calculating p(h/e). Thus let e be the information that this man eats Wheaties, and let h be the hypothesis that this man will not become pregnant. Assume the following probabilities, which do not seem unreasonable: p(h) = 1, p(e/h) = 1/10, p(e) = 1/10. Then by Bayes' theorem, p(h/e) = 1. Using definition (2) we must conclude that e is evidence that h.
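The Bayesian calculation for the Wheaties case can be reproduced in a few lines. This is only a sketch: the three input probabilities are the ones assumed in the text, k = 1/2 is the illustrative threshold from (2), and the function name is a hypothetical label for the simple form of Bayes' theorem quoted above.

```python
from fractions import Fraction


def bayes_posterior(p_h, p_e_given_h, p_e):
    """Simple form of Bayes' theorem: p(h/e) = p(h) * p(e/h) / p(e)."""
    if p_e == 0:
        raise ValueError("p(e) must be positive for p(h/e) to be defined")
    return p_h * p_e_given_h / p_e


# Probabilities assumed in the text for the Wheaties case.
p_h = Fraction(1)              # p(h): this man will not become pregnant
p_e_given_h = Fraction(1, 10)  # p(e/h)
p_e = Fraction(1, 10)          # p(e): this man eats Wheaties

K = Fraction(1, 2)  # illustrative threshold for definition (2); an assumption

posterior = bayes_posterior(p_h, p_e_given_h, p_e)

# Definition (1): evidence iff the posterior exceeds the prior.
by_definition_1 = posterior > p_h
# Definition (2): evidence iff the posterior exceeds k.
by_definition_2 = posterior > K

print(posterior, by_definition_1, by_definition_2)
```

With these inputs the posterior equals the prior, so (1) does not count e as evidence that h, while (2) does. Computing p(h/e) by Bayes' theorem leaves the Wheaties problem for (2) exactly where it was.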
3.