Bayesian reasoning

Bayesian reasoning is an application of probability theory to inductive reasoning (and abductive reasoning). It relies on an interpretation of probabilities as expressions of an agent’s uncertainty about the world, rather than as concerning some notion of objective chance in the world. The perspective here is that, when done correctly, inductive reasoning is simply a generalisation of deductive reasoning, where knowledge of the truth or falsity of a proposition corresponds to adopting the extreme probabilities $1$ and $0$.

It can be shown by so-called “Dutch Book” arguments that a rational agent must set their degrees of belief in the outcomes of events in such a way that they satisfy the axioms of probability theory. The idea here is that to believe a proposition to degree $p$ is equivalent to being prepared to accept a wager at the corresponding odds. For instance, if I believe there is a 0.75 chance of rain today, then I should be prepared to accept a wager in which I receive $S$ units of currency if it does not rain, and pay out $S/3$ units if it does rain. By my own lights this wager is fair, since my expected gain is $0.25 \cdot S - 0.75 \cdot S/3 = 0$. Note that the bettor may choose $S$ to be negative, which reverses the direction of the bet.

It can then be shown that such betting odds must satisfy the probability axioms. Otherwise it will be possible for someone to place multiple bets which together cause the bookmaker to suffer a certain loss whatever the outcome. For example, in the case above, my degree of belief that it will *not* rain today should be 0.25. Were I to offer, say, 0.5, someone could stake $(-3)$ units on the first bet and $(-2)$ units on the second bet, and would gain $1$ unit whether or not it rains. Of course, real bookmakers quote odds whose implied probabilities sum to more than 1, but they suffer no guaranteed loss since clients are only allowed positive stakes.
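The arithmetic of this example can be checked with a short sketch. The names `bookmaker_net` and `total_net` are hypothetical, and the second bet is assumed to take the same form as the first, applied to the event "no rain" at the incoherent degree of belief 0.5.

```python
from fractions import Fraction

def bookmaker_net(p_event, stake, event_occurs):
    """Net gain to the agent who holds degree of belief p_event in an event.

    Following the wager described above: the agent receives `stake` if the
    event does not occur, and pays stake * (1 - p) / p if it does.
    A negative stake reverses the direction of the bet.
    """
    p = Fraction(p_event)
    s = Fraction(stake)
    if event_occurs:
        return -s * (1 - p) / p
    return s

def total_net(rain):
    # Bet 1: belief 3/4 in "rain"; the bettor chooses stake -3.
    bet1 = bookmaker_net(Fraction(3, 4), -3, event_occurs=rain)
    # Bet 2: incoherent belief 1/2 in "no rain"; the bettor chooses stake -2.
    bet2 = bookmaker_net(Fraction(1, 2), -2, event_occurs=not rain)
    return bet1 + bet2

for rain in (True, False):
    # The agent's combined result is -1 in both cases: a sure loss.
    print("rain" if rain else "no rain", total_net(rain))
```

Exact fractions are used so that the sure loss of one unit is not obscured by floating-point rounding.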

Some consider an appeal to the undesirability of certain financial loss to be unbefitting as a justification of what is supposed to be an extension of ordinary deductive logic (Jaynes 2003). Axiomatisations in terms of the properties one should expect of degrees of plausibility have been given, and it can be shown from such axioms that these degrees satisfy the axioms of probability. Richard Cox is responsible for one such axiomatisation (see Wikipedia: Cox’s theorem).

Using Bayes' Rule, degrees of belief can be updated on receipt of new evidence:

$P(h|e) = P(e|h) \cdot \frac{P(h)}{P(e)},$

where $h$ is a hypothesis and $e$ is evidence.

The idea here is that when $e$ is observed, your degree of belief in $h$ should be changed from $P(h)$ to $P(h|e)$. This is known as **conditionalizing**. If $P(h|e) \gt P(h)$, we say that $e$ has provided **confirmation** for $h$.
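As a rough illustration of conditionalizing, the sketch below applies Bayes' Rule to made-up numbers. It assumes that $P(e)$ is obtained from the law of total probability over $h$ and its negation; the function name `conditionalize` and the values 0.2, 0.9 and 0.3 are hypothetical.

```python
def conditionalize(prior_h, lik_e_given_h, lik_e_given_not_h):
    """Return P(h|e) by Bayes' Rule.

    P(e) is computed as P(e|h) P(h) + P(e|not h) P(not h).
    """
    p_e = lik_e_given_h * prior_h + lik_e_given_not_h * (1 - prior_h)
    return lik_e_given_h * prior_h / p_e

prior = 0.2                                  # hypothetical P(h)
posterior = conditionalize(prior, 0.9, 0.3)  # hypothetical P(e|h), P(e|not h)

print(posterior)                             # roughly 0.43 (exactly 3/7)
print("e confirms h:", posterior > prior)
```

Here the evidence is three times as likely under $h$ as under its negation, so the degree of belief in $h$ rises and $e$ counts as confirmation.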

Typically, situations will involve a range of possible hypotheses, $h_1, h_2, \ldots$, and applying Bayes’ Rule will allow us to compare how these fare as new observations are made. For example, comparing the fate of two hypotheses,

$\frac{P(h_1|e)}{P(h_2|e)} = \frac{P(e|h_1)}{P(e|h_2)}\cdot \frac{P(h_1)}{P(h_2)}.$
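The odds form lends itself to repeated updating. The sketch below is an invented example: $h_1$ says a coin is fair, $h_2$ says it lands heads with probability $4/5$, and tosses are assumed independent given either hypothesis. The name `update_odds` and the toss sequence are hypothetical.

```python
from fractions import Fraction

def update_odds(prior_odds, lik_h1, lik_h2):
    """Posterior odds of h1 against h2: likelihood ratio times prior odds."""
    return Fraction(lik_h1) / Fraction(lik_h2) * prior_odds

# Probability of heads under each hypothesis (illustrative values).
p_heads = {"h1": Fraction(1, 2), "h2": Fraction(4, 5)}

odds = Fraction(1, 1)  # start with even prior odds
for toss in "HHTH":
    # Likelihood of this toss under each hypothesis.
    lik = {h: (p if toss == "H" else 1 - p) for h, p in p_heads.items()}
    odds = update_odds(odds, lik["h1"], lik["h2"])
    print(toss, odds)
```

Each head shifts the odds towards $h_2$ by a factor of $5/8$, while the single tail shifts them back towards $h_1$ by a factor of $5/2$. Note that $P(e)$ never needs to be computed, since it cancels in the ratio.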

How to assign prior probabilities to hypotheses when you don’t think you have an exhaustive set of rivals is not obvious. When astronomers in the nineteenth century tried to account for the anomalous advance of Mercury’s perihelion, they tried out all manner of explanations: maybe there was a planet inside Mercury’s orbit, maybe there was a cloud of dust surrounding the sun, maybe the power in the inverse square law ought to be $(2 - \epsilon)$, … Assigning priors and changing these as evidence comes in is one thing, but it would have been wise to have reserved some of the prior for ‘none of the above’.

Interestingly, one of the first people to give a qualitative sketch of how such an approach would work was George Polya in ‘Mathematics and Plausible Reasoning’ (Polya 1954), where examples from mathematics are widely used. The idea of a Bayesian account of plausible reasoning in mathematics surprises many, since it is commonly assumed that mathematicians rely solely on deduction. (See also Chap. 4 of Corfield 2003.)

For some Bayesians, known as objective Bayesians, degrees of belief must satisfy further restrictions. One extreme form of this view holds that given a particular state of knowledge, there is a single best set of degrees of belief that should be adopted for any proposition.

Some such restrictions are generally accepted. If, for example, all I know of an event is that it has $n$ possible outcomes, the objective Bayesian will apply the principle of indifference to set their degrees of belief to $1/n$ for each outcome. On the other hand, if there is background knowledge concerning differences between the outcomes, indifference need not hold. This principle of indifference can be generalized to priors determined by other kinds of invariance, such as the Jeffreys prior (see Wikipedia: Jeffreys prior).

Other objective Bayesian principles include maximum entropy (see Jaynes 2003). For instance, Jaynes argues that if all that is known of a die is that the mean value of throws is equal to, say, 4, then a prior distribution over $\{1, 2, 3, 4, 5, 6\}$ should be chosen which maximizes entropy, subject to the constraint that the mean is 4.
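A numerical sketch of this die example follows. It relies on the standard Lagrange multiplier result, not stated above, that the entropy-maximizing distribution under a mean constraint has the form $p_k \propto e^{\lambda k}$. The multiplier $\lambda$ is found here by bisection, which is just one convenient choice; the function name `maxent_die` and the search interval are assumptions.

```python
import math

def maxent_die(target_mean, faces=(1, 2, 3, 4, 5, 6), tol=1e-12):
    """Maximum entropy distribution over `faces` with the given mean.

    Assumes the solution has the form p_k proportional to exp(lam * k)
    and that target_mean lies strictly between min(faces) and max(faces).
    """
    def dist(lam):
        weights = [math.exp(lam * k) for k in faces]
        total = sum(weights)
        return [w / total for w in weights]

    def mean(lam):
        return sum(k * p for k, p in zip(faces, dist(lam)))

    # The mean increases with lam, so bisection on lam converges.
    lo, hi = -10.0, 10.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if mean(mid) < target_mean:
            lo = mid
        else:
            hi = mid
    return dist((lo + hi) / 2)

probs = maxent_die(4.0)
for face, p in zip(range(1, 7), probs):
    print(face, round(p, 4))
```

With a target mean of 3.5 the multiplier is zero and the uniform distribution of the principle of indifference is recovered; with a mean of 4 the probabilities increase geometrically towards the higher faces.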

- David Corfield, *Towards a Philosophy of Real Mathematics*, Cambridge University Press, 2003.
- Edwin Jaynes, *Probability Theory: The Logic of Science*, Cambridge University Press, 2003.
- George Polya, *Mathematics and Plausible Reasoning: Vol. II: Patterns of Plausible Inference*, Princeton University Press, 1954.

Revised on September 26, 2014 08:34:05 by David Corfield