In his book *The Likelihood Principle*, Jim Berger gives the following example illustrating the difference between frequentist and Bayesian approaches to inference.

Experiment 1:

A fine musician, specializing in classical works, tells us that he is able to distinguish if Haydn or Mozart composed some classical song. Small excerpts of the compositions of both authors are selected at random and the experiment consists of playing them for identification by the musician. The musician makes 10 correct guesses in exactly 10 trials.

Experiment 2:

A drunken man says he can correctly guess in a coin toss what face of the coin will fall down. Again, after 10 trials the man correctly guesses the outcomes of the 10 throws.

A frequentist statistician would have as much confidence in the musician’s ability to identify composers as in the drunk’s ability to predict coin tosses. In both cases the data are 10 successes out of 10 trials. But a Bayesian statistician would combine the data with a prior distribution. Presumably most people would be inclined *a priori* to have more confidence in the musician’s claim than the drunk’s claim. After applying Bayes’ theorem to analyze the data, the credibility of both claims will have increased, though the musician will continue to have more credibility than the drunk. On the other hand, if you start out believing that it is completely impossible for drunks to predict coin flips, then your posterior probability for the drunk’s claim will continue to be zero, no matter how much evidence you collect.
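To make the comparison concrete, here is a minimal Python sketch of the Bayesian update. It rests on assumptions that are mine, not Berger's: there are only two hypotheses, a person whose claim is true is always right, and a person whose claim is false guesses correctly half the time. The prior values (0.5 for the musician, 0.001 for the drunk, 0 for the closed mind) and the function name `posterior` are purely illustrative.

```python
def posterior(prior: float, successes: int, p_guess: float = 0.5) -> float:
    """Posterior probability that the claim is true after `successes`
    correct answers in as many trials.

    Assumed model (illustrative only): a true claim means the person is
    always right; a false claim means each answer is right with
    probability `p_guess`.
    """
    like_true = 1.0                     # P(data | claim is true)
    like_guess = p_guess ** successes   # P(data | person is guessing)
    evidence = prior * like_true + (1.0 - prior) * like_guess
    return prior * like_true / evidence


# Hypothetical priors chosen only to illustrate the argument.
for label, prior in [("musician", 0.5), ("drunk", 0.001), ("closed mind", 0.0)]:
    print(f"{label:12s} prior={prior:<6} posterior={posterior(prior, 10):.4f}")
```

Under these assumptions the same 10-for-10 record lifts the musician to about 0.999 and the drunk to about 0.506, while a prior of exactly zero stays at zero.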

Dennis Lindley coined the term “Cromwell’s rule” for the advice that nothing should have zero prior probability unless it is logically impossible. The name comes from a statement by Oliver Cromwell addressed to the Church of Scotland:

I beseech you, in the bowels of Christ, think it possible that you may be mistaken.

In probabilistic terms, “think it possible that you may be mistaken” corresponds to “don’t give anything zero prior probability.” If an event has zero prior probability, it will have zero posterior probability, no matter how much evidence is collected. If an event has tiny but non-zero prior probability, enough evidence can eventually increase the posterior probability to a large value.
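Both statements follow directly from Bayes’ theorem. As a sketch, write π for the prior probability of a claim H, and suppose the data D are n correct answers in n trials. The likelihoods below are a simplifying assumption for illustration (a true claimant is always right, anyone else is guessing with probability 1/2), not something taken from Berger or Lindley.

```latex
P(H \mid D)
  = \frac{P(D \mid H)\,\pi}{P(D \mid H)\,\pi + P(D \mid \neg H)\,(1 - \pi)}
  = \frac{\pi}{\pi + (1 - \pi)\,2^{-n}}

\pi = 0 \;\Longrightarrow\; P(H \mid D) = 0 \quad \text{for every } n,
\qquad
0 < \pi \le 1 \;\Longrightarrow\; P(H \mid D) \to 1 \quad \text{as } n \to \infty .
```

The factor π in the numerator is what makes a zero prior irreversible: no likelihood, however favorable, can multiply zero into something positive.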

The difference between a small positive prior probability and a zero prior probability is the difference between a skeptical mind and a closed mind.