"Processing Evidence With Bayes' Rule"
Limitation: This works under
the assumption that statements are either TRUE or FALSE.
What can we do if we have uncertainty
associated with each statement?
Hypothesis H: John has the
flu.
Evidence E: John has a runny nose.
Other available general information...
The probability that somebody has
the flu (during this period) is P(H). This is called the a priori
probability. Let's suppose P(H) = 0.1.
If someone has the flu, then there's
high probability that they have a runny nose. That is,
P(E | H) = 0.7.
However, they can have a runny
nose without having the flu.
P(E | ~H) = 0.2.
Given that John has a runny nose, it is more likely that he has the flu than we thought before we knew about the runny nose. We can compute this probability by using Bayes' rule.
P(H | E) = P(E | H) P(H) / P(E)
where P(E) = P(E | H) P(H) + P(E | ~H) (1 - P(H))
So P(H | E) = (0.7) (0.1) / ((0.7)
(0.1) + (0.2) (0.3))
= 0.07 / (0.07 + 0.06) =
0.07 / 0.13
or a little more than one half.
1. P(A and B) = P(A) * P(B given A)
2. Conversely, P(B given A) = P(A and B) / P(A)
By symmetry,
3. P(A and B) = P(B) * P(A given B)
Putting 2 and 3 together,
P(B given A) = P(A given B) * P(B) / P(A)
Replace A by E for evidence
Replace B by H for hypothesis
Use "|" for "given"
P(H|E) = P(E|H) P(H) / P(E) ; Bayes' Rule.
Note that P(E) can be expressed
as P(E|H)P(H) + P(E|~H) P(~H)
Here P(~H) = 1 - P(H), so that
P(E) = P(E|H)P(H) + P(E|~H) (1 - P(H))
So we can write Bayes' Rule as
P(E|H) P(H)
P(H|E) = -------------------------
P(E|H)P(H) + P(E|~H) (1 - P(H))
Let A be an event with probability
P(A).
Then by the Odds of A, we mean
the ratio of P(A) to P(~A).
For example, the odds of a fair
coin coming up heads is P(heads) : P(not heads)
= 0.5 / 0.5 = 1.
By the way, this is equivalent
to the popular way of expressing odds: "Fifty : fifty odds"
which means 50 / 50 = 1.
Let's write O(A) for the odds on
A.
The definition is then, O(A)
= P(A) / (1 - P(A))
Conditional Odds:
O(H|E) = P(H|E) / (1 - P(H|E))
Now let's write down Bayes' Rule twice: Once in the normal way to get P(H|E), and once for the complementary value of P(~H|E).
P(E|H) P(H)
P(H|E) = ----------
P(E)
P(E|~H) P(~H)
P(~H|E) = -----------
P(E)
Now let's divide the top equation by the bottom one. We get...
P(H|E)
P(E|H) P(H)
-----
= ------- ----
P(~H|E)
P(E|~H) P(~H)
Let's call the middle term Lambda. Then we have, by the definition of odds,
O(H|E) = Lambda O(H)
This says that to take into account some evidence E, we update the odds on the hypothesis H by multipling by some number Lambda.
By learning that E is true, we update the odds on H being true.
If, on the other hand, we were to learn that E is false, then that also should impact the odds we place on H. A complementary formula turns out to be
O(H|~E) = Lambda' O(H)
where Lambda' is some other number.
The two numbers Lambda and Lambda'
characterize the relationship between the evidence E and the Hypothesis
H for the purpose of updating odds.
).
In order to manage the updating of probability or odds values through such a network, we need a means to generalize Bayes' rule so that it can handle not only situations when E is certainly true or E is certainly false but also situations where E is established with a certain probability or a certain odds.
Important research in probabilistic
inference networks was performed in the development of two classic AI programs:
MYCIN and PROSPECTOR. For MYCIN, E. Shortliffe developed a general
framework for probabilistic inference networks in which certainty values
are updated according to rules. For the PROSPECTOR system, Nilsson
et al developed a set of probability updating techniques that respected
the attributes of necessity, sufficiency, and independence of evidence
with respect to a hypothesis.
Last modified: November 2, 1998
Steve Tanimoto