For my grandkids:

In his book Thinking, Fast and Slow, Daniel Kahneman gives an example of elementary Bayesian inference, posing this question:
"A cab was involved in a hit-and-run accident at night. Two cab companies, the Green and the Blue, operate in the city. You are given the following data: 85% of the cabs in the city are Green and 15% are Blue. A witness identified the cab as Blue. The court tested the reliability of the witness under the circumstances that existed on the night of the accident and concluded that the witness correctly identified each one of the two colors 80% of the time and failed 20% of the time. What is the probability that the cab involved in the accident was Blue rather than Green?"
Kahneman goes on to observe that "The two sources of information can be combined by Bayes's rule. The correct answer is 41%. However, you can probably guess what people do when faced with this problem: they ignore the base rate and go with the witness. The most common answer is 80%."

So why is the correct answer 41%?

Example of elementary Bayesian inference from Thinking Fast and Slow

The image illustrates Bayes' Theorem using a tree diagram. At the top, there are two branches. The right branch, "Cab is blue," has a probability of .15, 15%. The left branch, "Cab is green," has a probability of .85, 85%. These are the "prior probabilities," and represent the chance of a cab being blue or green if it were randomly selected from all cabs in the city. On the right branch, "Cab is blue," there is another split into two branches. The right branch here shows the conditional probability of the witness correctly identifying the cab as blue, if it was blue. That number is .8, 80%. Multiplying .15 by .8 gives us .12, 12%, which is the joint probability that the accident involved a blue cab and the witness correctly identified it as blue, a "true positive." The left branch at this level shows the probability of the witness identifying the cab as green, given it was really blue. This probability is just one minus .8, that is, .2. Multiplying .15 by .2 gives us .03, 3%, which is the joint probability that the accident involved a blue cab and the witness mistakenly identified it as green. This is a "false negative."

Following the left branch from the top of the diagram, "Cab is green," there is a similar split. Here the left branch shows the conditional probability of the witness correctly identifying the cab as green, if it was green. As on the other side of the tree, that number is .8, 80%. Multiplying .85 by .8 gives us .68, 68%, which is the joint probability that the accident involved a green cab and the witness correctly identified it as green, a "true negative." The right branch at this level shows the probability of the witness mistakenly identifying the cab as blue, given it was really green. Again, this probability is just one minus .8, that is, .2. Multiplying .85 by .2 gives us .17, 17%, which is the joint probability that the accident involved a green cab and the witness mistakenly identified it as blue. This is a "false positive."
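The four products at the bottom of the tree can be checked with a few lines of Python. This is only a sketch of the arithmetic described above; the variable names and the dictionary labels are my own, not part of the diagram. Strictly speaking, each product is a joint probability (a prior times a conditional), and the four together must add up to 1.

```python
# Priors and witness accuracy from Kahneman's example.
p_blue = 0.15            # P(cab is blue)
p_green = 1 - p_blue     # P(cab is green) = 0.85
accuracy = 0.80          # witness names the right color 80% of the time

# Each leaf of the tree is prior * conditional probability.
leaves = {
    "true positive (blue, called blue)": p_blue * accuracy,
    "false negative (blue, called green)": p_blue * (1 - accuracy),
    "true negative (green, called green)": p_green * accuracy,
    "false positive (green, called blue)": p_green * (1 - accuracy),
}

for name, p in leaves.items():
    print(f"{name}: {p:.2f}")

# The four leaves cover every possible outcome, so they sum to 1.
total = sum(leaves.values())
print(f"total: {total:.2f}")
```

The printed values are .12, .03, .68 and .17, matching the tree.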

The true positive and false positive probabilities can be combined, using Bayes' Theorem, to give us the answer given by Kahneman. To find the probability that the accident cab was really blue, given that the witness identified it as blue, we divide the true positive probability by the sum of the true positive and false positive probabilities. This is .12 divided by the sum of .12 and .17, that is, .12 divided by .29, which computes as .4138, or about 41%, as Kahneman said.
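The same division can be written as a small function. This is a minimal sketch, not the code behind the calculator on this page: the function name posterior is hypothetical, and the parameter names are borrowed from the query-string arguments shown with the calculator further down, which I am assuming stand for P(A), P(B|A) and P(not B|not A).

```python
def posterior(p_a, p_b_a, p_not_b_not_a):
    """Return P(A|B) from the prior P(A), P(B|A) and P(not B|not A)."""
    true_pos = p_a * p_b_a                        # A is true and B is observed
    false_pos = (1 - p_a) * (1 - p_not_b_not_a)   # A is false but B is observed
    # Undefined (division by zero) if B can never be observed at all.
    return true_pos / (true_pos + false_pos)


# Kahneman's cab example: A = cab is blue, B = witness says blue.
print(round(posterior(0.15, 0.8, 0.8), 4))  # 0.4138
```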

The result is surprising to some, but the reason is easy to understand: even though the witness identifies cab colors with 80% accuracy, there are so many more green cabs than blue that a green cab mistakenly identified as blue (17%) is more likely than a blue cab correctly identified as blue (12%). This also works the other way. If the percentages of green and blue cabs were reversed, so that 85% of the cabs were blue and 15% green, the probability that the accident cab was blue, given that the witness identified it as blue, would rise to 96%. If half the cabs in the city were green and half blue, then the base rate would be uninformative in this example, and the probability that the cab was blue would simply equal the witness's accuracy, 80%. You can try these and other scenarios using the calculator below.
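The three base-rate scenarios in the paragraph above can be reproduced as follows. Again this is just an illustrative sketch with a made-up function name, posterior_blue, holding the witness's accuracy fixed at 80% and varying only the share of blue cabs.

```python
def posterior_blue(p_blue, accuracy=0.8):
    """P(cab is blue | witness says blue) for a given share of blue cabs."""
    true_pos = p_blue * accuracy                # blue cab, called blue
    false_pos = (1 - p_blue) * (1 - accuracy)   # green cab, called blue
    return true_pos / (true_pos + false_pos)


# Original base rate, an even split, and the reversed base rate.
for p_blue in (0.15, 0.50, 0.85):
    result = posterior_blue(p_blue)
    print(f"P(blue) = {p_blue:.2f} -> P(blue | witness says blue) = {result:.0%}")
```

This prints 41%, 80% and 96% for the three cases.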


Calculator




Click here to run the calculator for Kahneman's example, or enter the values yourself above. (Calculator accepts arguments by query string, e.g. http://anesi.com/bayes.htm?p_a=.15&p_b_a=.8&p_not_b_not_a=.8)

Bayes' Theorem is easy to understand when shown graphically, as above. A more usual example, and one more relevant to most people, would be a medical test for differential diagnosis. Take the example above, but instead of "cab is blue" read A = patient with twitching nostrils has Wilbur's Nostritis (persistent twitching of the nostrils, I just made that up), and instead of "witness says blue" read B = patient tests positive for Wilbur's Nostritis. Leave all the numbers the same. Say 15% of patients presenting with twitching nostrils have Wilbur's Nostritis (one of many causes of nostril twitch), and for these, Wilbur's Test will detect it with 80% sensitivity. But 85% of patients with twitching nostrils do not have Wilbur's Nostritis, and of these, 20% will get a positive test result, because the test is only 80% specific. So you would expect that only 41% of patients presenting with twitching nostrils who get a positive result on Wilbur's Test will actually have Wilbur's Nostritis.
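It can help to count patients instead of multiplying probabilities. The sketch below assumes a hypothetical group of 1,000 patients with twitching nostrils (the group size is my own choice for illustration) and applies the same 15%, 80% and 80% figures.

```python
cohort = 1000        # hypothetical number of patients with twitching nostrils
prevalence = 0.15    # share who really have Wilbur's Nostritis
sensitivity = 0.80   # P(positive test | has the condition)
specificity = 0.80   # P(negative test | does not have the condition)

sick = cohort * prevalence                      # 150 patients
healthy = cohort - sick                         # 850 patients
true_positives = sick * sensitivity             # sick and testing positive
false_positives = healthy * (1 - specificity)   # healthy but testing positive

# Of everyone who tests positive, what share is really sick?
ppv = true_positives / (true_positives + false_positives)

print(f"{true_positives:.0f} true positives, {false_positives:.0f} false positives")
print(f"P(Wilbur's Nostritis | positive test) = {ppv:.0%}")
```

Out of 290 positive results, only 120 belong to patients who really have the condition, which is the 41% again.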

This is not an unusual scenario for medical tests, and explains why tests are often repeated, or additional and different tests done. For example, if the patient has tested positive on Wilbur's Test, and you have another, different test (Orville's Test) that has the same specificity and sensitivity, you can start with .41 as the prior probability P(A), and if the patient also tests positive on Orville's Test, the probability of the patient having Wilbur's Nostritis rises to about 74%. (A bunch of assumptions are involved here that we will not tarry over.)
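Chaining the two tests amounts to feeding the first result back in as the new prior. The sketch below uses a hypothetical helper named update, and it leans on one of the assumptions alluded to above: that the two tests err independently of each other, given whether the patient really has the condition.

```python
def update(prior, sensitivity=0.8, specificity=0.8):
    """Return P(condition | positive test), starting from the given prior."""
    true_pos = prior * sensitivity
    false_pos = (1 - prior) * (1 - specificity)
    return true_pos / (true_pos + false_pos)


after_wilbur = update(0.15)            # positive on Wilbur's Test
after_orville = update(after_wilbur)   # then positive on Orville's Test too

print(f"after Wilbur's Test:  {after_wilbur:.0%}")
print(f"after Orville's Test: {after_orville:.0%}")
```

This gives 41% after the first positive result and 74% after the second.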

It should be noted that sensitivity and specificity often have different values. If you take the original example and change the specificity to 97%, but leave the sensitivity at 80%, then P(A|B) doubles from about 41% to about 82%. If you increase specificity to 100% (green cab always identified as such), false positives are zero, and P(A|B) = 100%. You still have the false negatives, of course, but you can at least be sure that if the cab was identified as blue, it really was blue.
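The effect of specificity can be seen by holding the prior at 15% and the sensitivity at 80% while varying only the specificity. As before, this is a rough sketch and the function name posterior is my own.

```python
def posterior(prior, sensitivity, specificity):
    """Return P(A|B) for a given prior, sensitivity and specificity."""
    true_pos = prior * sensitivity
    false_pos = (1 - prior) * (1 - specificity)
    return true_pos / (true_pos + false_pos)


# Same prior and sensitivity throughout; only the specificity changes.
for specificity in (0.80, 0.97, 1.00):
    p = posterior(0.15, 0.80, specificity)
    print(f"specificity {specificity:.0%} -> P(A|B) = {p:.0%}")
```

The three lines printed show 41%, 82% and 100%, matching the figures in the paragraph above.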