In elementary probability theory, Bayes' formula refers to a version of the formula
$$P(A|B) = \frac{P(B|A)\,P(A)}{P(B)}$$
and relates the conditional probability of $A$ given $B$ to the one of $B$ given $A$.
From the category-theoretic point of view, this formula expresses a duality, sometimes called Bayesian inversion, which often gives rise to a dagger structure.
In logical reasoning, implications in general cannot be reversed: $A \Rightarrow B$, alone, does not imply $B \Rightarrow A$.
In probability theory, instead, conditional statements exhibit a duality which is absent in pure logical reasoning.
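Consider for example a city in which all taxis are yellow. (The implication is $\mathrm{taxi} \Rightarrow \mathrm{yellow}$.) If we see a yellow car, of course we can’t be sure it’s a taxi. However, it’s more likely that it’s a taxi compared to a randomly colored car. This increase in likelihood is larger if the fraction of yellow cars is small. Indeed, since $P(\mathrm{yellow}|\mathrm{taxi}) = 1$, Bayes' rule says that
$$P(\mathrm{taxi}|\mathrm{yellow}) = \frac{P(\mathrm{yellow}|\mathrm{taxi})\,P(\mathrm{taxi})}{P(\mathrm{yellow})} = \frac{P(\mathrm{taxi})}{P(\mathrm{yellow})}.$$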
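More generally, in a city where most taxis are yellow, there is a high conditional probability $P(\mathrm{yellow}|\mathrm{taxi})$ that a given taxi is yellow.
The higher this probability is, the higher is the probability $P(\mathrm{taxi}|\mathrm{yellow})$ that a given yellow car is a taxi, again according to Bayes' rule:
$$P(\mathrm{taxi}|\mathrm{yellow}) = \frac{P(\mathrm{yellow}|\mathrm{taxi})\,P(\mathrm{taxi})}{P(\mathrm{yellow})}.$$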
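As a small numerical illustration of the taxi example, the following sketch evaluates Bayes' rule with exact fractions. All the numbers (the share of taxis, and the share of yellow cars among taxis and among other cars) are hypothetical and chosen only for illustration.

```python
from fractions import Fraction


def posterior(likelihood, prior, evidence):
    """Bayes' rule: P(A|B) = P(B|A) * P(A) / P(B)."""
    return likelihood * prior / evidence


# Hypothetical numbers, not taken from any data.
p_taxi = Fraction(1, 100)                # P(taxi)
p_yellow_given_taxi = Fraction(9, 10)    # P(yellow | taxi): most taxis are yellow
p_yellow_given_other = Fraction(1, 20)   # P(yellow | not taxi)

# The law of total probability gives P(yellow).
p_yellow = (p_yellow_given_taxi * p_taxi
            + p_yellow_given_other * (1 - p_taxi))

p_taxi_given_yellow = posterior(p_yellow_given_taxi, p_taxi, p_yellow)

print(p_yellow)             # 117/2000
print(p_taxi_given_yellow)  # 2/13
```

With these numbers, seeing a yellow car raises the probability of "taxi" from 1/100 to 2/13, even though a yellow car is still most likely not a taxi.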
In categorical probability, this phenomenon can be modeled by saying that to each “conditional” morphism of the form $f\colon X \to Y$ there corresponds a canonical morphism $f^\dagger\colon Y \to X$, called the Bayesian inverse. The name, which is kept for historical reasons, is somewhat improper, since we don’t actually have an inverse morphism in the sense of category theory, but simply a reversal of the arrow.
In some cases, this symmetry gives rise to a dagger category.
In traditional probability theory, Bayesian inversions are a special case of conditional probability. Some care must be taken to avoid dividing by zero or running into paradoxes via limiting procedures.
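In the discrete case, a probability distribution on a set $X$ is simply a function $p\colon X \to [0,1]$ such that
$$\sum_{x \in X} p(x) = 1.$$
A stochastic map $f\colon X \to Y$ is a function $X \times Y \to [0,1]$, written $(x,y) \mapsto f(y|x)$, such that for all $x \in X$,
$$\sum_{y \in Y} f(y|x) = 1.$$
If we now equip $X$ and $Y$ with discrete probability distributions $p$ and $q$, obtaining discrete probability spaces $(X,p)$ and $(Y,q)$, we say that $f$ is a measure-preserving stochastic map $(X,p) \to (Y,q)$ if for every $y \in Y$,
$$\sum_{x \in X} f(y|x)\,p(x) = q(y).$$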
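The three conditions above (normalization of a distribution, normalization of each row of a stochastic map, and measure preservation) can be checked mechanically. The following is a minimal sketch using nested dictionaries, with `f[x][y]` standing for the probability of `y` given `x`; the sets, numbers, and function names are illustrative assumptions.

```python
from fractions import Fraction as F


def is_distribution(p):
    """Nonnegative values that sum to one."""
    return all(v >= 0 for v in p.values()) and sum(p.values()) == 1


def is_stochastic_map(f):
    """Every row f[x] is a probability distribution on Y."""
    return all(is_distribution(row) for row in f.values())


def pushforward(f, p):
    """The distribution y -> sum_x f(y|x) p(x)."""
    ys = next(iter(f.values())).keys()
    return {y: sum(f[x][y] * p[x] for x in p) for y in ys}


# Hypothetical example with X = {a, b} and Y = {u, v, w}.
p = {"a": F(1, 4), "b": F(3, 4)}
f = {
    "a": {"u": F(1, 2), "v": F(1, 2), "w": F(0)},
    "b": {"u": F(0), "v": F(1, 3), "w": F(2, 3)},
}

q = pushforward(f, p)
print(q)  # u: 1/8, v: 3/8, w: 1/2
```

By construction, `f` is measure-preserving from `(X, p)` to `(Y, q)` exactly when `q` equals this pushforward.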
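A Bayesian inverse of $f$ is then defined to be a measure-preserving stochastic map $f^\dagger\colon (Y,q) \to (X,p)$ such that for every $x \in X$ and $y \in Y$, the following Bayes formula holds:
$$f^\dagger(x|y)\,q(y) = f(y|x)\,p(x).$$
In the discrete case, Bayesian inverses always exist, and can be obtained by taking
$$f^\dagger(x|y) := \frac{f(y|x)\,p(x)}{q(y)}$$
for those $y$ with $q(y) \ne 0$, and arbitrary values, subject to normalization, on the $y$ with $q(y) = 0$. (To ensure the normalization condition, on such $y$ one can for example take $f^\dagger(x|y) = \delta_{x_0,x}$, where $x_0$ is a fixed element of $X$. Note that $X$ is nonempty since it admits a probability measure.)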
Bayesian inverses are not unique, but they are uniquely defined on the support of $q$. That is, they are unique up to almost sure equality.
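The discrete construction can be sketched in a few lines: divide $f(y|x)\,p(x)$ by $q(y)$ wherever $q(y) \ne 0$, and put a point mass at a fixed element on the remaining points. The example below is hypothetical; it deliberately includes a point `w` of zero mass, so that two different choices of the fixed element give two Bayesian inverses that agree on the support of `q` only.

```python
from fractions import Fraction as F


def pushforward(f, p):
    """The distribution y -> sum_x f(y|x) p(x), with f[x][y] = f(y|x)."""
    ys = next(iter(f.values())).keys()
    return {y: sum(f[x][y] * p[x] for x in p) for y in ys}


def bayesian_inverse(f, p, x0):
    """Return (q, inv) with inv[y][x] the inverse kernel evaluated at (x | y)."""
    q = pushforward(f, p)
    inv = {}
    for y, mass in q.items():
        if mass != 0:
            inv[y] = {x: f[x][y] * p[x] / mass for x in p}
        else:
            # Arbitrary normalized choice on q-null points: point mass at x0.
            inv[y] = {x: F(1 if x == x0 else 0) for x in p}
    return q, inv


# Hypothetical data; the point "w" is never reached, so q(w) = 0.
p = {"a": F(1, 4), "b": F(3, 4)}
f = {
    "a": {"u": F(1, 2), "v": F(1, 2), "w": F(0)},
    "b": {"u": F(0), "v": F(1), "w": F(0)},
}

q, inv_a = bayesian_inverse(f, p, x0="a")
_, inv_b = bayesian_inverse(f, p, x0="b")

print(inv_a["v"])  # a: 1/7, b: 6/7
```

Here `inv_a` and `inv_b` both satisfy the Bayes formula and differ only at `w`, which illustrates uniqueness up to almost sure equality.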
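The situation is more delicate outside the discrete case. Given probability spaces $(X,\mathcal{A},p)$ and $(Y,\mathcal{B},q)$, and a measure-preserving Markov kernel $f\colon X \to Y$, a Bayesian inverse of $f$ is a Markov kernel $f^\dagger\colon Y \to X$ such that for all $A \in \mathcal{A}$ and $B \in \mathcal{B}$, the following Bayes-type formula holds:
$$\int_A f(B|x)\,p(dx) = \int_B f^\dagger(A|y)\,q(dy).$$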
As one can see from Markov kernel - Almost sure equality, this formula specifies a kernel only up to almost sure equality, just as in the discrete case.
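For a concrete non-discrete instance, one can take a Gaussian prior with Gaussian noise, where the Bayesian inverse is known in closed form. This example is not part of the text above; it is the standard conjugate-Gaussian computation, and the variances and test points are arbitrary assumptions. The sketch checks the Bayes formula at the level of densities, which implies the integral formula whenever all the measures involved have densities.

```python
import math


def normal_pdf(z, mean, var):
    """Density of the normal distribution N(mean, var) at z."""
    return math.exp(-(z - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)


# Assumed variances: prior p = N(0, s2), kernel f(.|x) = N(x, t2).
s2, t2 = 2.0, 0.5


def p(x):
    return normal_pdf(x, 0.0, s2)


def f(y, x):
    return normal_pdf(y, x, t2)


def q(y):
    # Pushforward of p along f: N(0, s2 + t2).
    return normal_pdf(y, 0.0, s2 + t2)


def f_dagger(x, y):
    # Candidate Bayesian inverse: N(y * s2 / (s2 + t2), s2 * t2 / (s2 + t2)).
    return normal_pdf(x, y * s2 / (s2 + t2), s2 * t2 / (s2 + t2))


# Compare both sides of the density identity at a few arbitrary points.
points = [(-1.3, 0.4), (0.0, 0.0), (2.1, -0.7), (0.5, 3.0)]
max_gap = max(abs(p(x) * f(y, x) - q(y) * f_dagger(x, y)) for x, y in points)
print(max_gap < 1e-12)  # True
```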
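Existence, on the other hand, is trickier. In general, a kernel $f^\dagger$ as above may fail to exist. The problem is that in order for $f^\dagger$ to be a well-defined Markov kernel $Y \to X$, we need the following two conditions: first, for every measurable $A \subseteq X$, the assignment $y \mapsto f^\dagger(A|y)$ must be measurable; second, for every $y \in Y$, the assignment $A \mapsto f^\dagger(A|y)$ must be a probability measure.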
The first condition can be handled using conditional expectation, but that does not guarantee the second. It can be shown, however, that if $X$ is standard Borel and either
or
then a Bayesian inverse always exists.
(See also Markov kernel - Bayesian inversion and Markov kernel - Conditionals.)
In categorical probability, Bayesian inverses are axiomatized in a way which reflects the measure-theoretic version of the concepts. One then can choose to work in categories where such axioms are satisfied.
In Markov categories, Bayesian inverses are defined in a way that parallels the construction for Markov kernels.
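The abstraction of a probability space is given by an object $X$ in a Markov category, together with a state $p\colon I \to X$. As usual, the abstraction of a kernel is a morphism $f\colon X \to Y$.
A Bayesian inverse of $f$ with respect to $p$ is a morphism $f^\dagger\colon Y \to X$ such that the following equation holds, where $q = f \circ p$:
$$(\mathrm{id}_X \otimes f) \circ \mathrm{copy}_X \circ p = (f^\dagger \otimes \mathrm{id}_Y) \circ \mathrm{copy}_Y \circ q.$$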
This recovers the classical probability definitions when instantiated in Stoch and its subcategories.
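To see how this works in the simplest case, the following sketch unpacks the defining equation in the category of finite sets and stochastic matrices. The symbolic form of the equation, $(\mathrm{id}_X \otimes f) \circ \mathrm{copy}_X \circ p = (f^\dagger \otimes \mathrm{id}_Y) \circ \mathrm{copy}_Y \circ q$, and the names $\mathrm{copy}$ and $\delta$ are notation assumed here for illustration.

```latex
% Assumed notation: copy_X is the copy map of X, \delta is the Kronecker delta.
% Copying a state p gives the distribution p(x) \delta_{x,x'} on X x X.
\begin{align*}
\bigl((\mathrm{id}_X \otimes f) \circ \mathrm{copy}_X \circ p\bigr)(x, y)
  &= \sum_{x' \in X} f(y|x')\, \delta_{x,x'}\, p(x)
   = f(y|x)\, p(x), \\
\bigl((f^\dagger \otimes \mathrm{id}_Y) \circ \mathrm{copy}_Y \circ q\bigr)(x, y)
  &= \sum_{y' \in Y} f^\dagger(x|y')\, \delta_{y,y'}\, q(y)
   = f^\dagger(x|y)\, q(y).
\end{align*}
% Equating the two sides gives the discrete Bayes formula
%   f^\dagger(x|y) q(y) = f(y|x) p(x)   for all x in X, y in Y.
```

Both sides are joint distributions on $X \times Y$, so the abstract equation says that the joint distribution built from $p$ and $f$ agrees with the one built from $q$ and $f^\dagger$.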
Just as in traditional probability, Bayesian inverses are unique only up to almost sure equality. Also, just as in traditional probability, they may fail to exist. Being an instance of conditionals, however, they always exist when conditionals exist, such as in the category BorelStoch.
(See also Markov category - conditionals.)
In the category of couplings, the idea of Bayesian inversion is made explicit from the start by means of the dagger structure. Given probability spaces $(X,p)$ and $(Y,q)$, a coupling between them can be seen equivalently as going from $X$ to $Y$ or from $Y$ to $X$. This duality, when the couplings are expressed via Markov kernels, reflects exactly Bayesian inversion. Therefore, at the level of joint distributions, one can consider the duality given by Bayesian inversions to be already part of the symmetry of the category.
Categorical abstractions of the category of couplings via dagger categories therefore have the concept of Bayesian inversion already built in.
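In the finite case this duality is easy to make concrete: a coupling is a joint distribution, its dagger is the transposed joint distribution, and conditioning each of them on its first coordinate yields a kernel and its Bayesian inverse. The following sketch uses hypothetical data and function names, and assumes that both marginals have full support so that conditioning needs no arbitrary choices.

```python
from fractions import Fraction as F


def marginals(r):
    """Marginals of a joint distribution r[(x, y)] on X x Y."""
    p, q = {}, {}
    for (x, y), mass in r.items():
        p[x] = p.get(x, 0) + mass
        q[y] = q.get(y, 0) + mass
    return p, q


def kernel_from_coupling(r):
    """Condition on the first coordinate (first marginal assumed to have full support)."""
    p, _ = marginals(r)
    k = {}
    for (x, y), mass in r.items():
        k.setdefault(x, {})[y] = mass / p[x]
    return k


def transpose(r):
    """The dagger of a coupling: the same joint distribution, read from Y to X."""
    return {(y, x): mass for (x, y), mass in r.items()}


# Hypothetical coupling on {a, b} x {u, v}.
r = {("a", "u"): F(1, 8), ("a", "v"): F(1, 8),
     ("b", "u"): F(0), ("b", "v"): F(3, 4)}

p, q = marginals(r)
f = kernel_from_coupling(r)                    # kernel X -> Y
f_dagger = kernel_from_coupling(transpose(r))  # kernel Y -> X

print(f_dagger["v"])  # a: 1/7, b: 6/7
```

Transposing twice returns the original coupling, which is the involutivity expected of a dagger.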
(For now, see the references.)
Tobias Fritz, A synthetic approach to Markov kernels, conditional independence and theorems on sufficient statistics, Advances in Mathematics 370, 2020. (arXiv:1908.07021)
Kenta Cho and Bart Jacobs, Disintegration and Bayesian Inversion via String Diagrams, Mathematical Structures in Computer Science 29, 2019. (arXiv:1709.00322)
Dario Stein and Sam Staton, Probabilistic Programming with Exact Conditions, JACM, 2023. (arXiv)
Noé Ensarguet and Paolo Perrone, Categorical probability spaces, ergodic decompositions, and transitions to equilibrium, 2023. (arXiv:2310.04267)
For the quantum case:
Bob Coecke and Robert W. Spekkens, Picturing classical and quantum Bayesian inference, Synthese, 186(3), 2012. (arXiv)
Arthur J. Parzygnat, Inverses, disintegrations, and Bayesian inversion in quantum Markov categories, 2020. (arXiv)
Arthur J. Parzygnat and Benjamin P. Russo, A noncommutative Bayes theorem, Linear Algebra and its Applications 644, 2022. (arXiv)
Arthur J. Parzygnat, Conditional distributions for quantum systems, EPTCS 343, 2021. (arXiv)
Arthur J. Parzygnat and Francesco Buscemi, Axioms for retrodiction: achieving time-reversal symmetry with a prior, Quantum 7(1013), 2023. (arXiv)
James Fullwood and Arthur J. Parzygnat, From time-reversal symmetry to quantum Bayes rules, PRX Quantum 4, 020334, 2023. (arXiv:2212.08088, doi:10.1103/PRXQuantum.4.020334)
James Fullwood and Arthur J. Parzygnat, Operator representation of spatiotemporal quantum correlations. (arXiv:2405.17555)
Arthur J. Parzygnat and James Fullwood, Time-symmetric correlations for open quantum systems. (arXiv:2407.11123)
Arthur J. Parzygnat and Benjamin P. Russo, Non-commutative disintegrations: existence and uniqueness in finite dimensions, Journal of Noncommutative Geometry 17(3), 2023. (arXiv)
Review:
Last revised on November 14, 2024 at 12:08:59.