The Giry monad (Giry 80, following Lawvere 62) is the monad on a category of suitable spaces which sends each suitable space $X$ to the space of suitable probability measures on $X$.
It is one of the main examples of a probability monad, and hence one of the main structures used in categorical probability.
The Giry monad is defined on the category of measurable spaces, $Meas$, assigning to each measurable space $X$ the space $G(X)$ of all probability measures on $X$, endowed with the $\sigma$-algebra generated by the set of all the evaluation maps
$$ev_U \colon G(X) \to [0,1]$$
sending a probability measure $P$ to $P(U)$, where $U$ ranges over all the measurable sets of $X$. The unit of the monad sends a point $x \in X$ to the Dirac measure at $x$, $\delta_x$, while the monad multiplication is defined by the natural transformation
$$\mu_X \colon G(G(X)) \to G(X)$$
given by
$$\mu_X(Q)(U) := \int_{q \in G(X)} ev_U(q) \, dQ(q).$$
This makes the endofunctor $G$ into a monad, and as such this is the Giry monad on measurable spaces, as originally defined by Lawvere 1962.
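For finitely supported measures the integral in the multiplication reduces to a weighted sum, which makes the monad structure easy to experiment with. The following is a minimal sketch under that assumption only; the names `unit`, `measure_of` and `multiply` are illustrative and do not come from the sources cited here.

```python
from fractions import Fraction

# A finitely supported probability measure is modelled as a dict
# mapping outcomes to exact rational weights that sum to 1.

def unit(x):
    """Dirac measure at x (the unit of the monad)."""
    return {x: Fraction(1)}

def measure_of(p, U):
    """Evaluation map ev_U: the probability that p assigns to the set U."""
    return sum((w for x, w in p.items() if x in U), Fraction(0))

def multiply(Q):
    """Monad multiplication: average the inner measures against Q.

    Q is a list of (inner_measure, weight) pairs, because dicts are
    not hashable and so cannot themselves be used as dict keys.
    """
    out = {}
    for q, weight in Q:
        for x, w in q.items():
            out[x] = out.get(x, Fraction(0)) + weight * w
    return out

fair = {"H": Fraction(1, 2), "T": Fraction(1, 2)}
biased = {"H": Fraction(9, 10), "T": Fraction(1, 10)}

# A measure on measures: pick the fair coin with probability 1/3.
Q = [(fair, Fraction(1, 3)), (biased, Fraction(2, 3))]
flat = multiply(Q)
```

Here `flat` assigns to each set the `Q`-average of the evaluation maps, which is the finite form of the integral formula for the multiplication. Exact rationals are used so that the unit laws can be checked by equality rather than up to rounding.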
An alternative choice, convenient for analysis purposes, and introduced by Giry, is obtained by restricting the category of measurable spaces to the (full) subcategory of those measurable spaces generated by Polish spaces, $Pol$. Polish spaces are separable metric spaces for which a complete metric exists. The morphisms of this category are continuous functions.
Write
$$P \colon Pol \to Pol$$
for the endofunctor which sends a space, $X$, to the space of probability measures on the Borel subsets of $X$. $P(X)$ is equipped with the weakest topology which makes the integration map $\tau \mapsto \int_X f \, d\tau$ continuous for any $f$, a bounded, continuous, real function on $X$.
There is a natural transformation
$$\mu_X \colon P(P(X)) \to P(X)$$
given by
$$\mu_X(M)(A) := \int_{P(X)} \tau(A) \, M(d\tau).$$
This makes the endofunctor $P$ into a monad, and this is the Giry monad on Polish spaces.
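The topology on the space of probability measures is tested by integrating bounded continuous functions, not by evaluating on fixed sets. A small numerical sketch of the difference, assuming finitely supported measures stored as dicts and using cosine as one sample bounded continuous function (the helper names are illustrative):

```python
import math

def integrate(p, f):
    """Integral of f against a finitely supported measure p (dict point -> weight)."""
    return sum(w * f(x) for x, w in p.items())

def dirac(x):
    """Dirac measure at the point x."""
    return {x: 1.0}

def mass(p, U):
    """Measure that p assigns to the set U."""
    return sum(w for x, w in p.items() if x in U)

f = math.cos  # bounded and continuous on the real line
ns = (10, 1000, 100000)

# Distance between the integrals against Dirac(1/n) and Dirac(0).
gaps = [abs(integrate(dirac(1.0 / n), f) - integrate(dirac(0.0), f)) for n in ns]

# Mass that Dirac(1/n) assigns to the singleton {0}.
point_masses = [mass(dirac(1.0 / n), {0.0}) for n in ns]
```

The integrals against the Dirac measures at $1/n$ approach the integral against the Dirac measure at $0$, while the mass assigned to the singleton $\{0\}$ stays at zero and jumps to one only in the limit. This is only a check on one test function, not a proof of convergence.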
The Kleisli morphisms of the Giry monad on Meas (and related subcategories) are Markov kernels. Therefore its Kleisli category is the category Stoch. It is one of the most important examples of a Markov category.
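On finite sets a Markov kernel is a function from points to probability vectors, and Kleisli composition becomes the familiar sum over intermediate states. The sketch below assumes finitely supported kernels represented as Python functions returning dicts; `compose`, `dirac` and the two-state `step` kernel are hypothetical names chosen for illustration.

```python
from fractions import Fraction

def dirac(x):
    """Identity kernel: the Dirac measure at x."""
    return {x: Fraction(1)}

def compose(g, f):
    """Kleisli composite of kernels: first f, then g.

    compose(g, f)(x)(z) = sum over y of f(x)(y) * g(y)(z)
    """
    def h(x):
        out = {}
        for y, wy in f(x).items():
            for z, wz in g(y).items():
                out[z] = out.get(z, Fraction(0)) + wy * wz
        return out
    return h

def step(state):
    """A toy two-state weather kernel (illustrative numbers)."""
    if state == "sun":
        return {"sun": Fraction(3, 4), "rain": Fraction(1, 4)}
    return {"sun": Fraction(1, 2), "rain": Fraction(1, 2)}

two_steps = compose(step, step)
```

Composing `step` with itself gives the two-step transition probabilities, and `dirac` acts as the identity on both sides, as expected of the unit in a Kleisli category.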
We can’t say anything about the $G$-algebras on the category of measurable spaces due to lack of structure and set-theoretical issues. However, the monad restricts to the full subcategory of standard Borel spaces, where we can construct a factorization of the monad which allows us to understand how algebras arise via expectation maps.
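If $X$ is any standard space then the space of probability measures $G(X)$ is a superconvex space with the structure defined pointwise, i.e., if $\{P_i\}_{i \in \mathbb{N}}$ is a countable collection of probability measures on $X$ then, for every sequence $\{p_i\}_{i \in \mathbb{N}}$ with each $p_i \in [0,1]$ and $\sum_{i} p_i = 1$, the countable affine sum $\sum_{i} p_i P_i$ is also a probability measure, defined at the measurable set $U$ in $X$ by
$$\Big(\sum_{i \in \mathbb{N}} p_i P_i\Big)(U) := \sum_{i \in \mathbb{N}} p_i \, P_i(U).$$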
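Given any $G$-algebra $h \colon G(A) \to A$, the space $A$ has the structure of a superconvex space which makes the measurable function $h$ a countably affine (measurable) map.
Moreover, for any measurable space $B$ and $G$-algebra $k \colon G(B) \to B$ on it, if $f \colon A \to B$ is a map of $G$-algebras then the measurable function $f$ is also countably affine.
Given a $G$-algebra $h \colon G(A) \to A$, define the superconvex space structure on $A$ by
$$\sum_{i} p_i a_i := h\Big(\sum_{i} p_i \delta_{a_i}\Big).$$
Because a $G$-algebra must satisfy $h \circ \mu_A = h \circ G(h)$ we have
$$\begin{aligned} h\Big(\sum_{i} p_i P_i\Big) &= h\Big(\mu_A\Big(\sum_{i} p_i \delta_{P_i}\Big)\Big) \\ &= h\Big(G(h)\Big(\sum_{i} p_i \delta_{P_i}\Big)\Big) \\ &= h\Big(\sum_{i} p_i \delta_{h(P_i)}\Big) \\ &= \sum_{i} p_i \, h(P_i) \end{aligned}$$
where the last line makes use of the definition of the convex structure on $A$. Thus every $G$-algebra is countably affine.
To prove the map of $G$-algebras $f \colon A \to B$ is a countably affine map we compute
$$\begin{aligned} f\Big(\sum_{i} p_i a_i\Big) &= f\Big(h\Big(\sum_{i} p_i \delta_{a_i}\Big)\Big) \\ &= k\Big(G(f)\Big(\sum_{i} p_i \delta_{a_i}\Big)\Big) \\ &= k\Big(\sum_{i} p_i \delta_{f(a_i)}\Big) \\ &= \sum_{i} p_i \, f(a_i), \end{aligned}$$
where the second equality is the defining property $f \circ h = k \circ G(f)$ of a map of $G$-algebras and the last uses the convex structure on $B$ induced by $k$.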
Let be the category of standard spaces with a superconvex space structure, whose morphisms are the countably affine measurable functions. Let which is the one-point compactification of the interval . is a second-countable compact Hausdorff space so it is a Polish space, and hence with the Borel $\sigma$-algebra is a standard space. Let denote the full subcategory of consisting of those spaces which are coseparable by .
Given any standard space and any let denote the functional sending . Let . Taking we obtain the space of endomaps on . Recall that a -generalized point of is a functional satisfying, for all and all , the equation
The reader can verify that any such -generalized element is -linear, weakly averaging, and additive.
We say an object in satisfies the fullness property if and only if for every the property
holds.
Every -generalized element of which satisfies the fullness property is a point, i.e., for a unique element . ( is the evaluation map at the point .)
Since the fullness property is satisfied there exists at least one element such that . Since lies in it is coseparated by , and hence there is at most one element satisfying, for all countably affine measurable maps , the equation .
Define to be the full subcategory of consisting of those objects which satisfy the fullness property.
Note that is an object in . (This is exercise 8.23 in the text Sets for Mathematics by Lawvere and Rosebrugh.)
We now proceed to show that for every standard space that is an object in .
Let be a standard space. Then for we have
We have
so taking the intersection over all elements in the generating field (with a basis) yields the result using the well known result if is a standard space and satisfy for all then .
The proofs of the next three lemmas, which are all straightforward, can be found in (Sturtz 25).
Let be a standard space. Then every countably affine measurable function is a countable affine sum of the form where and .
Let be a standard space. Then
Let be a standard space. Then is an object in .
This lemma shows there exists a functor which is the Giry monad functor with codomain , and coupled with the partial forgetful functor which forgets the superconvex space structure, we obtain a factorization of the monad. (This will be an adjoint factorization once we prove some more facts.)
Let denote the full subcategory of consisting of the single object . The functor defined (on objects) by
is a full and faithful functor.
In the category every countably affine measurable function is determined by its value on points . Hence to prove the fully faithful property it suffices to prove those properties on points.
Since lies in , which has as a coseparator, the faithful property holds. The fullness property follows from Lemma .
Let denote the Dirac measure at .
If is an object in then there exists a unique countably affine measurable function such that for all .
Let denote the inclusion functor. Let denote the slice category of arrows , and let denote the projection functor. For the theorem is equivalent to saying with the projection map at component being .
Consider the cone over with vertex and natural transformation components .
Since there exists a unique -morphism such that for all countably affine maps . It follows that on that, for all in that . Since is a coseparator in it follows .
A more appropriate notation for the unique morphism is which, in the special case of lying in an -vector space coincides with the usual interpretation. For an arbitrary space the function is the unique morphism such that, for every , is the unique point such that for all countably affine measurable functions .
The function is a -algebra.
The property follows from the preceding corollary. To prove the property compose both sides of that equation with a countably affine measurable map . If we spell both sides of that equation out, using the property , the equation holds. The result of the lemma follows from the property that coseparates, i.e., the set of morphisms is jointly monic on .
Note that what we have shown is that is a subcategory of . Proving the reverse inclusion (or finding a counterexample) is an interesting open problem.
Concerning the algebras the above method of using generalized points can be used to obtain similar results. (Doberkat 03) gives a different representation for the algebras of , although, like the Eilenberg-Moore characterization, the representation is descriptive but not constructive. His representation for the algebras is based upon the idea that we want continuous maps such that the ‘fibres’ are convex and closed, and such that , the Delta distribution on , is in the fibre over . A further condition requires a compact subset of to be sent to a compact subset of .
As an example of -algebras, represented via convex spaces, Doberkat gives the example of closed and bounded convex subsets of some Euclidean space, and shows that the construction of a barycenter yields an algebra. We summarize that construction here.
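Fix $F$ as a bounded, closed, and convex subset of the Euclidean space $\mathbb{R}^n$. A vector $x^\ast \in \mathbb{R}^n$ is called a barycenter of the probability measure $\tau \in P(F)$ iff, for all linear functionals $\ell$ on $\mathbb{R}^n$, the property $\ell(x^\ast) = \int_F \ell(x) \, \tau(dx)$ holds. Since every linear functional on $\mathbb{R}^n$ is given by $x \mapsto \langle a, x \rangle$ for a unique $a \in \mathbb{R}^n$, the defining property of a barycenter (for convex subsets of $\mathbb{R}^n$) given above is equivalent to saying that, for all $a \in \mathbb{R}^n$, the property $\langle a, x^\ast \rangle = \int_F \langle a, x \rangle \, \tau(dx)$ holds.
The barycenter of $\tau \in P(F)$ exists, it is uniquely determined, and it is an element of $F$.
Let $\beta(\tau)$ be the barycenter of $\tau \in P(F)$. Then $(F, \beta)$ is an algebra for the $P$-monad.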
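The two algebra laws for the barycenter map can be checked by hand on finitely supported measures, where the barycenter is just a weighted mean. The following sketch assumes points of the unit square stored as tuples and measures stored as dicts with exact rational weights; `barycenter`, `flatten` and `push_barycenter` are illustrative names, not Doberkat's notation.

```python
from fractions import Fraction

def barycenter(p):
    """Weighted mean of a finitely supported measure on R^n (points are tuples)."""
    dim = len(next(iter(p)))
    return tuple(sum(w * x[i] for x, w in p.items()) for i in range(dim))

def flatten(Q):
    """Monad multiplication for a list of (inner_measure, weight) pairs."""
    out = {}
    for q, weight in Q:
        for x, w in q.items():
            out[x] = out.get(x, Fraction(0)) + weight * w
    return out

def push_barycenter(Q):
    """Pushforward along the barycenter map: replace each inner measure by its mean."""
    out = {}
    for q, weight in Q:
        b = barycenter(q)
        out[b] = out.get(b, Fraction(0)) + weight
    return out

# Two measures supported on corners of the unit square.
p1 = {(0, 0): Fraction(1, 2), (1, 0): Fraction(1, 2)}
p2 = {(0, 1): Fraction(1, 4), (1, 1): Fraction(3, 4)}
Q = [(p1, Fraction(1, 3)), (p2, Fraction(2, 3))]

via_multiplication = barycenter(flatten(Q))
via_pushforward = barycenter(push_barycenter(Q))
```

Both routes around the algebra square give the same point, and the barycenter of a Dirac measure is the point itself. The check covers only this one finite example; the general statement is the proposition above.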
The example given by the unit square fits into this theory nicely. The barycenter map is given by the marginalization map . However when the space cannot be characterized by a finite number of parameters, the above theory using barycenters directly cannot be applied, even though the marginalization map is still a -algebra. (It is still a barycenter map but not within the framework of .) These barycenter maps are the components of a natural transformation characterizing the counit of an adjunction of the -monad.
More information concerning the use of barycenter maps in finding algebras can be found at Radon monad.
Finally, we note that Doberkat points out that for a discrete Polish space that is disconnected, and hence there can be no continuous map . Hence , irrespective of any convex structure we endow it with, cannot be an algebra.
See also monads of probability, measures, and valuations.
Vladimir Voevodsky has also worked on a category theoretic treatment of probability theory, and gave a few talks on this at IHES, Miami, in Moscow etc. Voevodsky had in mind applications in mathematical biology, for example, population genetics:
…a categorical study of probability theory where “categorical” is understood in the sense of category theory. Originally, I developed this approach to probability to get a better understanding of the constructions which I had to deal with in population genetics. Later it evolved into something which seems to be also interesting from a purely mathematical point of view. On the elementary level it gives a category which is useful for the work with probabilistic constructions involving complicated combinations of stochastic processes of different types. On a more advanced level, applying in this context the old idea of a functor as a generalized object one gets a better view of the relationship between probability and the theory of (pre-)ordered topological vector spaces.
A talk in Moscow (20 Nov 2008, in Russian) can be viewed here, wmv 223.6 Mb. Abstract:
In early 60-ies Bill Lawvere defined a category whose objects are measurable spaces and morphisms are Markov kernels. I will try to show how this category allows one to think about many of the notions of probability theory in categorical terms and to connect probabilistic objects to objects of other types through various functors.
Voevodsky’s unfinished notes on categorical probability theory have been released posthumously.
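Prakash Panangaden in Probabilistic Relations defines the category $SRel$ (stochastic relations) to have as objects sets equipped with a $\sigma$-field. Morphisms are conditional probability densities or stochastic kernels. So, a morphism from $(X, \Sigma_X)$ to $(Y, \Sigma_Y)$ is a function $h \colon X \times \Sigma_Y \to [0,1]$ such that (1) for every $B \in \Sigma_Y$, the function $x \mapsto h(x, B)$ is measurable, and (2) for every $x \in X$, the function $B \mapsto h(x, B)$ is a subprobability measure on $\Sigma_Y$.
If $k$ is a morphism from $Y$ to $Z$, then $k \cdot h$ from $X$ to $Z$ is defined as $(k \cdot h)(x, C) = \int_Y k(y, C) \, h(x, dy)$.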
Panangaden’s definition differs from Giry’s in the second clause where subprobability measures are allowed, rather than ordinary probability measures.
Panangaden emphasises that the mechanism is similar to the way that the category of relations can be constructed from the power set functor. Just as the category of relations is the Kleisli category of the powerset functor over the category of sets Set, $SRel$ is the Kleisli category of the functor over the category of measurable spaces and measurable functions which sends a measurable space, $X$, to the measurable space of subprobability measures on $X$. This functor gives rise to a monad.
What is gained by the move from probability measures to subprobability measures? One motivation seems to be to model probabilistic processes from $X$ to a coproduct $X + Y$. This you can iterate to form a process which looks to see where in $Y$ you eventually end up. This relates to $SRel$ being traced.
There is a monad on $Meas$, $1 + - \colon Meas \to Meas$. A probability measure on $1 + X$ is a subprobability measure on $X$. Panangaden’s monad is a composite of Giry’s and $1 + -$.
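The correspondence between subprobability measures on a space and probability measures on that space with one extra point can be made concrete in the finite case: the missing mass is moved onto the added point. This is a sketch under the assumption of finite supports; `BOTTOM`, `to_probability`, `to_subprobability` and `compose_sub` are hypothetical names, and `None` is assumed not to occur as an ordinary outcome.

```python
from fractions import Fraction

BOTTOM = None  # hypothetical label for the single added point

def to_probability(sub):
    """Send a subprobability measure to a probability measure with one extra point."""
    total = sum(sub.values(), Fraction(0))
    if total > 1:
        raise ValueError("total mass exceeds 1")
    out = dict(sub)
    out[BOTTOM] = Fraction(1) - total  # the missing mass goes to the extra point
    return out

def to_subprobability(p):
    """Inverse direction: forget the mass sitting on the extra point."""
    return {x: w for x, w in p.items() if x is not BOTTOM}

def compose_sub(g, f):
    """Composite of subprobability kernels; lost mass simply stays lost."""
    def h(x):
        out = {}
        for y, wy in f(x).items():
            for z, wz in g(y).items():
                out[z] = out.get(z, Fraction(0)) + wy * wz
        return out
    return h

def f(x):
    """Reaches "a" half of the time, otherwise fails to produce an outcome."""
    return {"a": Fraction(1, 2)}

def g(y):
    """Reaches "done" half of the time."""
    return {"done": Fraction(1, 2)}

composite = compose_sub(g, f)("start")
lifted = to_probability(composite)
```

The composite kernel has total mass one quarter, and after adding the extra point the remaining three quarters are recorded there explicitly, which is one way to read non-termination as an ordinary outcome.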
measure, probability measure, pushforward measure, convex mixture
Radon monad, distribution monad, extended probabilistic powerdomain
The adjunction underlying the Giry monad was originally developed by Lawvere in 1962, prior to the full recognition of the relationship between monads and adjunctions. Although P. Huber had already shown in 1961 that every adjoint pair gives rise to a monad, it wasn’t until 1965 that the constructions of Eilenberg-Moore, and Kleisli, made the essential equivalence of both concepts manifest.
Lawvere’s construction was written up as an appendix to a proposal to the Arms Control and Disarmament Agency, set up by President Kennedy as part of the State Department to handle planning and execution of certain treaties with the Soviet Union. This appendix was intended to provide a reasonable framework for arms control verification protocols (Lawvere 20).
At that time, Lawvere was working for a “think tank” in California, and the purpose of the proposal was to provide a means for verifying compliance with limitations on nuclear weapons. In the 1980s, Michèle Giry was collaborating with another French mathematician who was also working with the French intelligence agency, and she was able to obtain a copy of the appendix. Giry then developed and extended some of the ideas in the appendix (Giry 80).
Gian-Carlo Rota had also (somehow) obtained a copy of the appendix, which ended up in the library at The American Institute of Mathematics, and only became publicly available in 2012.
From Lawvere 20:
I’d like to say that the idea of the category of probabilistic mappings, the document corresponding to that was not part of a seminar, as some of the circulations say, essentially it was the document submitted to the arms control and disarmament agency after suitable checking that the Pentagon didn’t disagree with it. Because of the fact that for arms control agencies as a side responsibility the forming of arms control agreements and part of these agreements must involve agreed upon protocols of verification. So the idea of that paper did not provide such protocols, but it purported to provide reasonable framework within which such protocol can be formulated.
The idea originates with
W. Lawvere, The category of probabilistic mappings, ms. 12 pages, 1962 (Lawvere Probability 1962)
(notice that the statement of origin on p.1 is wrong)
and was picked up and published in:
Michèle Giry, A categorical approach to probability theory, Categorical aspects of topology and analysis (Ottawa, Ont., 1980), pp. 68–85, Lecture Notes in Math. 915 Springer 1982 (doi:10.1007/BFb0092872)
(there are allegedly a few minor analytically incorrect points and gaps in proofs, observed by later authors).
Historical comments on the appearance of Lawvere 62 are made in
According to E. Burroni (2009), the Giry monad appears also in
The article
shows, in effect, that the Giry monad restricted to countable measurable spaces (with the discrete $\sigma$-algebra) yields the restricted Giry functor which has the codensity monad . This suggests that the natural numbers are “sufficient” in some sense. Indeed, the full subcategory of Polish spaces consisting of the single object of all natural numbers with the powerset $\sigma$-algebra is codense in - every continuous function is completely determined by its values on the countable dense subset of .
The article
views probability measures via double dualization, restricted to weakly averaging affine maps which preserve limits. A more satisfactory description of probability measures arises from recognizing the need for viewing them as weakly-averaging countably affine maps, obtained by double dualizing into , which then yields the characterization of -algebras summarized above, which is from the article
Some corrections to an earlier version of the Categorical Probability Theory article were pointed out in
Apart from these papers, there are similar developments in
Franck van Breugel, The metric monad for probabilistic nondeterminism, features both the Lawvere/Giry monad and Panangaden’s monad.
Ernst-Erich Doberkat, Characterizing the Eilenberg-Moore algebras for a monad of stochastic relations (pdf)
Ernst-Erich Doberkat, Kleisli morphisms and randomized congruences, Journal of Pure and Applied Algebra Volume 211, Issue 3, December 2007, Pages 638-664 https://doi.org/10.1016/j.jpaa.2007.03.003
N. N. Cencov, Statistical decision rules and optimal inference, Translations of Math. Monographs 53, Amer. Math. Society 1982
(blog comment) Cencov’s “category of statistical decisions” coincides with Giry’s (Lawvere’s) category. I (somebody) have the sense that Cencov discovered this category independently of Lawvere although years later.
category cafe related to Giry monad: category theoretic probability, coalgebraic modal logic
Samson Abramsky et al. Nuclear and trace ideals in tensored ∗-Categories, arxiv:math/9805102, on the representation of probability theory through monads, which looks to work Giry’s monad into a context even more closely resembling the category of relations.
There is also relation with work of Jacobs et al.
Robert Furber, Bart Jacobs, Towards a categorical account of conditional probability, arxiv:1306.0831
Bart Jacobs, Probabilities, distribution monads and convex categories, Theoretical Computer Science 412(28) (2011) pp. 3323–3336. https://doi.org/10.1016/j.tcs.2011.04.005, (preprint)
J. Culbertson and K. Sturtz use the Giry monad in their categorical approach to Bayesian reasoning and inference (both articles contain further references to the categorical approach to probability theory):
Jared Culbertson and Kirk Sturtz, A categorical foundation for Bayesian probability, Applied Cat. Struc. 2013 (preprint as arXiv:1205.1488)
Jared Culbertson and Kirk Sturtz, Bayesian machine learning via category theory, 2013 (arxiv:1312.1445)
Elisabeth Burroni, Lois distributives. Applications aux automates stochastiques, TAC 22, 2009 pp.199-221 (journal page)
where she derives stochastic automata as algebras for a suitable distributive law on the monoid and Giry monads.
B. Fong has a section on the Giry monad in his paper on Bayesian networks:
See also:
Discussion of the Giry monad extended to simplicial sets and used to characterize quantum contextuality via simplicial homotopy theory:
Cihan Okay, Sam Roberts, Stephen D. Bartlett, Robert Raussendorf, Topological proofs of contextuality in quantum mechanics, Quantum Information and Computation 17 (2017) 1135-1166 [arXiv:1701.01888, doi:10.26421/QIC17.13-14-5]
Cihan Okay, Aziz Kharoof, Selman Ipek, Simplicial quantum contextuality, Quantum 7 (2023) 1009 [arXiv:2204.06648, doi:10.22331/q-2023-05-22-1009]
Aziz Kharoof, Cihan Okay, Homotopical characterization of strongly contextual simplicial distributions on cone spaces [arXiv:2311.14111]
exposition:
Last revised on June 15, 2025 at 08:51:03.