nLab Giry monad

Idea

The Giry monad (Giry 80, following Lawvere 62) is the monad on a category of suitable spaces which sends each suitable space $X$ to the space of suitable probability measures on $X$.

It is one of the main examples of a probability monad, and hence one of the main structures used in categorical probability.

Definition

The Giry monad is defined on the category of measurable spaces, assigning to each measurable space $X$ the space $G(X)$ of all probability measures on $X$, endowed with the $\sigma$-algebra generated by the set of all the evaluation maps

$$ ev_U \colon G(X) \to [0,1] $$

sending a probability measure $P$ to $P(U)$, where $U$ ranges over all the measurable sets of $X$. The unit of the monad sends a point $x \in X$ to the Dirac measure at $x$, $\delta_x$, while the monad multiplication is defined by the natural transformation

$$ \mu_{X} \;\colon\; G\big(G(X)\big) \longrightarrow G(X) $$

given by

$$ \mu_X (Q)(U) \;\coloneqq\; \textstyle{\int}_{q \in G(X)} ev_U(q) \,dQ \,. $$

This makes the endofunctor $G$ into a monad, and as such this is the Giry monad on measurable spaces, as originally defined by Lawvere 1962.
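The unit and multiplication can be made concrete in the finitely supported case, where the integral defining the multiplication reduces to a finite sum. The following Python sketch is only an illustration under that assumption; the representation of a measure as a frozenset of (point, mass) pairs and the helper names `measure`, `dirac`, `ev`, `push_forward` and `mu` are hypothetical choices, not notation from the references.

```python
from fractions import Fraction

def measure(pairs):
    """Build a finitely supported measure from a dict or from (point, mass) pairs.

    Masses on repeated points are added and zero masses are dropped. The result
    is a frozenset, so a measure can itself be a point of another measure.
    """
    items = pairs.items() if isinstance(pairs, dict) else pairs
    acc = {}
    for x, w in items:
        acc[x] = acc.get(x, Fraction(0)) + Fraction(w)
    return frozenset((x, w) for x, w in acc.items() if w != 0)

def dirac(x):
    """Unit of the monad: the Dirac measure at x."""
    return measure({x: 1})

def ev(U, p):
    """Evaluation map ev_U: the mass that p assigns to the set U."""
    return sum((w for x, w in p if x in U), Fraction(0))

def push_forward(f, p):
    """Functor action G(f): push the measure p forward along f."""
    return measure((f(x), w) for x, w in p)

def mu(Q):
    """Multiplication: mu(Q)(U) is the Q-weighted sum of ev_U(q) over q."""
    return measure((x, W * w) for q, W in Q for x, w in q)

p = measure({"a": Fraction(1, 2), "b": Fraction(1, 2)})
q = dirac("a")
Q = measure({p: Fraction(1, 2), q: Fraction(1, 2)})  # a measure on measures
flat = mu(Q)
```

Here `flat` assigns to each set the `Q`-average of the masses given by `p` and `q`, which is the finite form of the integral above. Exact rational arithmetic is used so that the monad laws can be checked by equality rather than up to rounding.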

An alternative choice, convenient for analysis purposes and introduced by Giry, is obtained by working instead with the category $Pol$ of Polish spaces, i.e. separable spaces for which a complete metric exists, each regarded as a measurable space via its Borel $\sigma$-algebra. The morphisms of this category are continuous functions.

Write

$$ P \colon Pol \to Pol $$

for the endofunctor which sends a space $X$ to the space of probability measures on the Borel subsets of $X$. The space $P(X)$ is equipped with the weakest topology which makes the integration map $\tau \mapsto \int_{X} f \, d\tau$ continuous for every bounded, continuous, real-valued function $f$ on $X$.

There is a natural transformation

$$ \mu_{X} \colon P(P(X)) \to P(X) $$

given by

$$ \mu_X (M)(A) := \int_{P(X)} \tau(A) \, M(d\tau). $$

This makes the endofunctor $P$ into a monad, and this is the Giry monad on Polish spaces.

Properties

Kleisli category

The Kleisli morphisms of the Giry monad on Meas (and related subcategories) are Markov kernels. Therefore its Kleisli category is the category Stoch. It is one of the most important examples of a Markov category.
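For finite sets, a Markov kernel is a row-stochastic matrix and Kleisli composition is the Chapman-Kolmogorov sum. The following Python sketch illustrates this special case only; the dict-of-dicts encoding, the function names `compose` and `identity`, and the toy `weather` kernel are assumptions made for the example.

```python
from fractions import Fraction as F

def compose(k, h):
    """Kleisli composite of finite kernels: (k . h)(x)(z) = sum_y h(x)(y) * k(y)(z)."""
    out = {}
    for x, row in h.items():
        acc = {}
        for y, w in row.items():
            for z, v in k[y].items():
                acc[z] = acc.get(z, F(0)) + w * v
        out[x] = acc
    return out

def identity(points):
    """Kleisli identity: each point is sent to its Dirac measure."""
    return {x: {x: F(1)} for x in points}

# A toy kernel on the two-point space {"sun", "rain"}.
weather = {
    "sun": {"sun": F(3, 4), "rain": F(1, 4)},
    "rain": {"sun": F(1, 2), "rain": F(1, 2)},
}
two_days = compose(weather, weather)
```

Composing `weather` with itself gives the two-step transition probabilities, and `identity` plays the role of the unit of the monad in the Kleisli category.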

Algebras over the Giry Monad on Standard spaces

We can’t say much about the $G$-algebras on the category of measurable spaces due to set-theoretical issues, e.g., the hypothesis that measurable cardinals exist. However, the $G$ monad restricts to the full subcategory of standard Borel spaces, where we can construct a factorization of the $G$ monad which allows us to understand how $G$-algebras arise via expectation maps.

Let us first note a relationship between the Giry monad on standard Borel spaces and the Radon monad. Every probability measure on a standard Borel space is a Radon measure. So, on compact Polish spaces, the Giry monad coincides with the measurable sets of the Radon monad. Do note that not every measurable map is continuous with regard to the topology of the Radon monad. This applies to algebras also: for example, the Giry algebra on $[0, \infty]$ is not continuous.

The Giry monad on Polish spaces uses the weak topology, which is weaker than the weak* topology of the Radon monad.

We now proceed to determine the category of algebras, $\mathbf{Alg}_G$, of the $G$-monad.

If $X$ is any standard space then the space of probability measures $G{X}$ is a superconvex space with the structure defined pointwise: if $\{P_i\}_{i=1}^{\infty}$ is a countable collection of probability measures on $X$ then, for every sequence $\{p_i\}_{i=1}^{\infty}$ with each $p_i \in [0,1]$ such that $\sum_{i=1}^{\infty} p_i = 1$, the countable affine sum $\sum_{i=1}^{\infty} p_i P_i$ is also a probability measure, defined at the measurable set $U$ in $X$ by

$$ \big(\sum_{i=1}^{\infty} p_i P_i\big)(U) = \sum_{i=1}^{\infty} p_i P_i(U). $$

By restriction to finite affine sums $G{X}$ can also be viewed as a convex space.
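The finite case of this pointwise structure can be sketched in Python as follows. This is an illustration for finitely many finitely supported measures only; the dict encoding and the names `mix` and `ev` are hypothetical.

```python
from fractions import Fraction as F

def mix(weights, measures):
    """Finite affine sum sum_i p_i P_i of measures given as {point: mass} dicts."""
    assert sum(weights) == 1 and all(0 <= w <= 1 for w in weights)
    out = {}
    for w, P in zip(weights, measures):
        for x, m in P.items():
            out[x] = out.get(x, F(0)) + w * m
    return out

def ev(U, P):
    """Evaluation at the set U: the mass P assigns to U."""
    return sum((m for x, m in P.items() if x in U), F(0))

P1 = {"a": F(1)}
P2 = {"a": F(1, 2), "b": F(1, 2)}
weights = [F(1, 3), F(2, 3)]
mixture = mix(weights, [P1, P2])
```

Evaluating `mixture` at any set gives the same value as mixing the evaluations of `P1` and `P2`, which is exactly the defining equation above restricted to two summands.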

Lemma

Given any $G$-algebra $G{X} \xrightarrow{h} X$, the base space $X$ has the structure of a convex space which makes the measurable function $h$ an affine (measurable) map. Moreover, morphisms of $G$-algebras are also affine maps.

Proof

Given $h$, define the convex space structure on $X$ by

$$ \sum_{i=1}^{n} p_i x_i := h\big(\sum_{i=1}^{n} p_i \delta_{x_i}\big). $$

Because a $G$-algebra $h$ must satisfy $h \circ \mu_X = h \circ G{h}$ we have, for any finite sequence $Q_i \in G X$,

$$ \begin{array}{lcl} (h \circ \mu_X)( \sum_{i=1}^{n} p_i \delta_{Q_i}) &=& (h \circ G{h})( \sum_{i=1}^{n} p_i \delta_{Q_i}) \\ h( \sum_{i=1}^{n} p_i Q_i) &=& h(\sum_{i=1}^{n} p_i \, \delta_{h(Q_i)}) \\ &=& \sum_{i=1}^{n} p_i h(Q_i) \end{array} $$

where the last line makes use of the definition of the convex structure on $X$. Thus, every $G$-algebra is affine.

To prove that any map of $G$-algebras $f \colon (X, h) \to (Y, k)$ is an affine map, we compute

$$ \begin{array}{lcl} f(\sum_{i=1}^{n} p_i x_i) &=& f(h(\sum_{i=1}^{n} p_i \delta_{x_i})) \\ &=& k(G(f)(\sum_{i=1}^{n} p_i \delta_{x_i})) \\ &=& k(\sum_{i=1}^{n} p_i \delta_{f(x_i)}) \\ &=& \sum_{i=1}^{n} p_i f(x_i). \end{array} $$
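The computation in this proof can be checked on a small example. The Python sketch below takes the expectation (mean) map on finitely supported measures over the rationals as the algebra $h$ and verifies the algebra law and affineness numerically. It is a finite illustration only; the frozenset encoding and the names `measure`, `dirac`, `push_forward`, `mu`, `h` and `convex` are assumptions made for the example.

```python
from fractions import Fraction as F

def measure(pairs):
    """Finitely supported measure as a hashable frozenset of (point, mass) pairs."""
    acc = {}
    for x, w in pairs:
        acc[x] = acc.get(x, F(0)) + F(w)
    return frozenset((x, w) for x, w in acc.items() if w != 0)

def dirac(x):
    return measure([(x, 1)])

def push_forward(f, p):
    """Functor action G(f)."""
    return measure((f(x), w) for x, w in p)

def mu(Q):
    """Monad multiplication: flatten a measure on measures."""
    return measure((x, W * w) for q, W in Q for x, w in q)

def h(p):
    """Candidate algebra: the mean of a finitely supported measure on the rationals."""
    return sum((x * w for x, w in p), F(0))

def convex(weights, points):
    """Convex structure induced by h: sum_i p_i x_i := h(sum_i p_i delta_{x_i})."""
    return h(measure(zip(points, weights)))

Q1 = measure([(0, F(1, 2)), (4, F(1, 2))])
Q2 = dirac(10)
Q = measure([(Q1, F(1, 4)), (Q2, F(3, 4))])

lhs = h(mu(Q))               # (h . mu_X)(Q)
rhs = h(push_forward(h, Q))  # (h . G(h))(Q)
```

Both sides agree, and the common value equals `convex([1/4, 3/4], [h(Q1), h(Q2)])`, which is the statement that `h` is affine for the convex structure it induces.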

Let $\mathbf{Std} \cap \mathbf{Cvx}$ denote the category whose objects are standard spaces with a convex space structure, and whose morphisms are affine measurable functions. Let $\mathbb{R}_{\infty}$ be the one-point compactification of the real line. $\mathbb{R}_{\infty}$ is a second-countable compact Hausdorff space, so it is a Polish space, and hence $\mathbb{R}_{\infty}$ with the Borel $\sigma$-algebra is a standard space. Because $\mathbb{R}_{\infty}$ is a coseparator in $\mathbf{Cvx}$ it follows that $\mathbb{R}_{\infty}$ is a coseparator in $\mathbf{Std} \cap \mathbf{Cvx}$.

Given any standard space $A$ and any $P \in G(A)$ let $\mathbf{Std}(A, \mathbb{R}_{\infty}) \xrightarrow{\tilde{P}} \mathbb{R}_{\infty}$ denote the functional sending $f \mapsto \int_A f \, dP$. Let $\mathbb{R}_{\infty}^A = hom_{\mathbf{Std} \cap \mathbf{Cvx}}(A, \mathbb{R}_{\infty})$. Taking $A=\mathbb{R}_{\infty}$ we obtain the space $\mathbb{R}_{\infty}^{\mathbb{R}_{\infty}}$ of endomaps on $\mathbb{R}_{\infty}$. Recall that an $\mathbb{R}_{\infty}$-generalized point of $A$ is a functional $\tilde{P}$ satisfying, for all $\phi \in \mathbb{R}_{\infty}^{\mathbb{R}_{\infty}}$ and all $m \in \mathbb{R}_{\infty}^A$, the equation

$$ \phi \big( \tilde{P}(m) \big) = \tilde{P}(\phi \circ m). $$

The reader can verify that any such $\mathbb{R}_{\infty}$-generalized element $\tilde{P}$ is $\mathbb{R}_{\infty}$-linear, weakly averaging, and additive. Since $\tilde{P}$ is defined in terms of the Lebesgue integral we have, more generally,

Lemma

Every $\mathbb{R}_{\infty}$-generalized element $\mathbb{R}_{\infty}^A \xrightarrow{\tilde{P}} \mathbb{R}_{\infty}$ of $A$ is countably additive.

Proof

Given a countable sequence $\{A \xrightarrow{f_i} \mathbb{R}_{\infty}\}_i$ of measurable functions with $f_i \ge 0$, let $g_n=\sum_{i=1}^n f_i$. The sequence $g_n$ is then nondecreasing, so by the monotone convergence theorem $\tilde{P}(\sum_{i} f_i) = \tilde{P}(\lim_n g_n) = \lim_n \{\tilde{P}(g_n)\} = \lim_n \sum_{i=1}^n \tilde{P}(f_i) = \sum_i \tilde{P}(f_i)$.

We say an object $A$ in $\mathbf{Std} \cap \mathbf{Cvx}$ satisfies the fullness property if and only if for every $P \in G(A)$ the property

$$ \bigcap_{m \in \mathbb{R}_{\infty}^A} m^{-1}\big(\tilde{P}(m)\big) = \{a \in A \, | \, \tilde{P}(m)=m(a) \quad \forall m \in \mathbb{R}_{\infty}^A\} \ne \emptyset $$

holds.

Lemma

Every $\mathbb{R}_{\infty}$-generalized element $\tilde{P}$ of $A$ which satisfies the fullness property is a point, i.e., $\tilde{P} = ev_a$ for a unique element $a \in A$. ($ev_a$ is the evaluation map at the point $a$.)

Proof

Since the fullness property is satisfied there exists at least one element $a \in A$ such that $\tilde{P}(m)=m(a)$ for all $m \in \mathbb{R}_{\infty}^A$. Since $A$ lies in $\mathbf{Std} \cap \mathbf{Cvx}$, which is coseparated by $\mathbb{R}_{\infty}$, there is at most one element $a \in A$ satisfying, for all affine measurable maps $A \xrightarrow{m} \mathbb{R}_{\infty}$, the equation $\tilde{P}(m)=m(a)$.

Define $\mathbf{Std}_{Cvx}$ to be the full subcategory of $\mathbf{Std} \cap \mathbf{Cvx}$ consisting of those objects which satisfy the fullness property.

Note that $\mathbb{R}_{\infty}$ is an object in $\mathbf{Std}_{Cvx}$. (This is exercise 8.23 in the text Sets for Mathematics by Lawvere and Rosebrugh.) Moreover, the expectation mapping $G(\mathbb{R}_{\infty}) \xrightarrow{\mathbb{E}_{\bullet}(id_{\mathbb{R}_{\infty}})} \mathbb{R}_{\infty}$ is easily verified to be a $G$-algebra.

We now proceed to show that for every standard space $X$ the space $G(X)$ is an object in $\mathbf{Std}_{Cvx}$.

Lemma

Let $(X, \sigma(\mathbb{F}))$ be a standard space. Then for $Q \in G^2(X)$ we have

$$ \bigcap_{U \in \mathbb{F}} ev_U^{-1}\big( \tilde{Q}(ev_U) \big) = \mu_X(Q) $$

Proof

We have

$$ ev_U^{-1}\big( \tilde{Q}(ev_U) \big) = \{R \in G(X) \, | \, R(U)=\mu_X(Q)[U] \} $$

so taking the intersection over all elements in the generating field (with a basis) yields the result, using the well-known result that if $X$ is a standard space and $P,R \in G(X)$ satisfy $P(U)=R(U)$ for all $U \in \mathbb{F}$ then $P=R$.

Lemma

Let $(X, \sigma(\mathbb{F}))$ be a standard space. Then every affine measurable function $G(X) \xrightarrow{m} \mathbb{R}_{\infty}$ is a countable affine sum of the form $m = \sum_i \lambda_i ev_{U_i}$ where $\lambda_i \in V$ and $U_i \in \mathbb{F}$.

Proof

The superconvex space structure of $G(X)$ is defined by the countable family of equations $ev_U( \sum_{i=1}^{\infty} p_i P_i) = \sum_{i=1}^{\infty}p_i P_i(U)$ for all $U \in \mathbb{F}$. Any countable linear sum of the elements $\{ev_U\}_{U \in \mathbb{F}}$ therefore defines a countable affine map $G(X) \xrightarrow{\sum_j \lambda_j ev_{U_j}} \mathbb{R}_{\infty}$. (Any constant transformation by $c$ can be represented by a term $c \, ev_X$ since $c \, ev_X(P) = c$. Hence we can refer to countable linear sums instead of countable affine sums.)

Conversely, for a set function $G(X) \xrightarrow{m} \mathbb{R}_{\infty}$ to preserve the superconvex space structure of $G(X)$, which is defined in terms of the elements $ev_U$, it must be a countable linear transformation of those elements. Indeed, for any measurable function $X \xrightarrow{f} \mathbb{R}_{\infty}$ the function $G(X) \xrightarrow{\int_X f \, d{\bullet}} \mathbb{R}_{\infty}$ is, by the definition of the Lebesgue integral, given by $\int_X f \, dP = \big(\sum_{i \in \mathbb{N}} \lambda_i ev_{U_i}\big)P$ where $\psi_n = \sum_{i=1}^{N_n} \lambda_{n,i} \chi_{U_{n,i}}$ is a simple function satisfying $\psi_n \le f$ and for which $\{\psi_n\}_{n=1}^{\infty}$ converges pointwise to $f$.

Lemma

Let $(X, \sigma(\mathbb{F}))$ be a standard space. Then

$$ \bigcap_{G(X) \xrightarrow{m} \mathbb{R}_{\infty}} m^{-1}\big(\tilde{Q}(m)\big) = \mu_X(Q) = \bigcap_{U \in \mathbb{F}} ev_U^{-1}\big( \tilde{Q}(ev_U) \big). $$

Proof

Since every $\mathbb{R}_{\infty}$-generalized point $\tilde{Q}$ of $G(X)$ is $\mathbb{R}_{\infty}$-linear and countably additive it follows that $\tilde{Q}(\sum_i \lambda_i ev_{U_i}) = \sum_i \lambda_i \tilde{Q}(ev_{U_i})$. Thus $(\sum_i \lambda_i ev_{U_i})^{-1}\big(\tilde{Q}(\sum_i \lambda_i ev_{U_i})\big) = \{ P \in G(X) \, | \, \sum_i \lambda_i P(U_i) = \sum_i \lambda_i \mu_X(Q)[U_i]\}$. Now take the intersection over all such countably affine measurable functions $m$, which includes the basic functions $m=ev_U$: $\bigcap_m (\sum_i \lambda_i ev_{U_i})^{-1}\big(\tilde{Q}(\sum_i \lambda_i ev_{U_i})\big) = \{ P \in G(X) \, | \, \sum_i \lambda_i P(U_i) = \sum_i \lambda_i \mu_X(Q)[U_i] \;\; \forall m\}$. The only $P \in G(X)$ satisfying the right-hand side condition for all countably affine maps $m$ is clearly $P= \mu_X(Q)$.

Lemma

Let $(X, \sigma(\mathbb{F}))$ be a standard space. Then $G(X)$ is an object in $\mathbf{Std}_{Cvx}$.

Proof

By the preceding lemma the space $G(X)$ satisfies the fullness property. By the well-known result that $P,R \in G(X)$ with $P(U)=R(U)$ for all $U \in \mathbb{F}$ implies $P=R$, the evaluation maps $G(X) \xrightarrow{ev_U} V$ coseparate the points of $G(X)$. Hence the result follows by definition of $\mathbf{Std}_{Cvx}$.

This lemma shows there exists a functor $\mathbf{Std} \xrightarrow{\hat{G}} \mathbf{Std}_{Cvx}$ which is the Giry monad functor with codomain $\mathbf{Std}_{Cvx}$, and coupled with the partial forgetful functor $\mathbf{Std}_{Cvx} \xrightarrow{\mathcal{U}_{Cvx}} \mathbf{Std}$ which forgets the convex space structure, we obtain a factorization of the $G$ monad. (This will be an adjoint factorization once we prove some more facts.)

Theorem

Let $\mathcal{R}$ denote the full subcategory of $\mathbf{Std}_{Cvx}$ consisting of the single object $\mathbb{R}_{\infty}$. The functor defined (on objects) by

$$ \begin{array}{ccc} \mathbf{Std}_{Cvx}^{op} & \xrightarrow{\mathcal{Y}} & \mathbf{Set}^{\mathcal{R}} \\ A & \mapsto & \mathbf{Std}_{Cvx}(A, \bullet) \end{array} $$

is a full and faithful functor.

Proof

In the category $\mathbf{Std}_{Cvx}$ every affine measurable function is determined by its value on points $1 \xrightarrow{a} A$. Hence to prove the fully faithful property it suffices to prove those properties on points.

Since $A$ lies in $\mathbf{Std}_{Cvx}$, which has $\mathbb{R}_{\infty}$ as a coseparator, the faithful property holds. The fullness property follows from Lemma .

Let $\delta_a$ denote the Dirac measure at $a$.

Corollary

If $A$ is an object in $\mathbf{Std}_{Cvx}$ then there exists a unique affine measurable function $G(A) \xrightarrow{\epsilon_A} A$ such that $\epsilon_A(\delta_a)=a$ for all $a \in A$.

Proof

Let $\mathcal{R} \xrightarrow{\iota} \mathbf{Std}_{Cvx}$ denote the inclusion functor. Let $A\downarrow \iota$ denote the comma category of arrows $A \xrightarrow{m} \mathbb{R}_{\infty}$, and let $A \downarrow \iota \xrightarrow{\pi} \mathcal{R}$ denote the projection functor. For $\mathcal{D}_A = A \downarrow \iota \xrightarrow{\pi} \mathcal{R} \xrightarrow{\iota} \mathbf{Std}_{Cvx}$ the theorem is equivalent to saying $A = \lim \mathcal{D}_A$ with the projection map at component $m$ being $m$.

Consider the cone over $\mathcal{D}_A$ with vertex $G(A)$ and natural transformation components $\mathbb{E}_{\bullet}(m) = \mathbb{E}_{\bullet}(id_{\mathbb{R}_{\infty}}) \circ G(m)$.

Since $A=\lim \mathcal{D}_A$ there exists a unique $\mathbf{Std}_{Cvx}$-morphism $G(A) \xrightarrow{\epsilon_A} A$ such that $m \circ \epsilon_A = \mathbb{E}_{\bullet}(m)$ for all affine maps $A \xrightarrow{m} \mathbb{R}_{\infty}$. It follows that, on $\delta_a \in G(A)$ and for all $A \xrightarrow{m} \mathbb{R}_{\infty}$ in $\mathbf{Std}_{Cvx}$, we have $m(\epsilon_A(\delta_a)) = m(a)$. Since $\mathbb{R}_{\infty}$ is a coseparator in $\mathbf{Std}_{Cvx}$ it follows that $\epsilon_A(\delta_a)=a$.

A more appropriate notation for the unique morphism $\epsilon_A$ is $\mathbb{E}_{\bullet}(id_A)$ which, in the special case of $A$ lying in an $\mathbb{R}$-vector space, coincides with the usual interpretation. For an arbitrary space $A$ the function $G(A) \xrightarrow{\mathbb{E}_{\bullet}(id_A)} A$ is the unique morphism such that, for every $P \in G(A)$, $\mathbb{E}_P(id_A) \in A$ is the unique point such that $m(\mathbb{E}_{P}(id_A)) = \int_A m(a) \, dP$ for all countably affine measurable functions $A \xrightarrow{m} \mathbb{R}_{\infty}$.

Lemma

The function $\epsilon_A$ is a $G$-algebra.

Proof

The property $\epsilon_A(\delta_a)=a$ follows from the preceding corollary. To prove the property $\epsilon_A \circ \mu_A = \epsilon_A \circ G(\epsilon_A)$, compose both sides of that equation with a countably affine measurable map $A \xrightarrow{m} \mathbb{R}_{\infty}$. If we spell both sides out, using the property $\int_A m \, d(\mu_A(Q)) = \int_{P \in G(A)} \mathbb{E}_{P}(m) \, dQ$, the equation holds. The result of the lemma follows from the property that $\mathbb{R}_{\infty}$ coseparates, i.e., the set of morphisms $A \xrightarrow{m} \mathbb{R}_{\infty}$ is jointly monic on $A$.

A more detailed proof can be found in (Sturtz 25).

Lemma

Let $A \in_{ob} \mathbf{Std}_{Cvx}$. Every affine measurable function $A \xrightarrow{m} \mathbb{R}_{\infty}$ yields a morphism of $G$-algebras.

Proof

By Corollary 3.9 the affine measurable function $\epsilon_A =\mathbb{E}_{\bullet}(id_A)$ is the unique morphism in $\mathbf{Std}_{Cvx}$ such that for every $A \xrightarrow{m} \mathbb{R}_{\infty}$ the $\mathbf{Std}_{Cvx}$-diagram commutes. But both $\mathbb{E}_{\bullet}(id_A)$ and $\mathbb{E}_{\bullet}(id_{\mathbb{R}_{\infty}})$ are $G$-algebras. Hence $m$ is a morphism of those algebras.

Lemma

The construction $\mathbb{E}_{\bullet}(id_A)$ is natural in the argument $A$.

Proof

The proof is straightforward using the previous lemma.

Using the naturality of $\mathbb{E}$ we obtain an adjoint pair consisting of $\mathbf{Std} \xrightarrow{\hat{G}} \mathbf{Std}_{Cvx}$, which is the Giry monad (functor) viewed as a functor into $\mathbf{Std}_{Cvx}$, and the partial forgetful functor $\mathbf{Std}_{Cvx} \xrightarrow{\mathcal{U}_{Cvx}} \mathbf{Std}$ which forgets the convex space structure, with the natural transformation $\mathbb{E}$ as the counit of the adjunction. The composite functor is $\mathcal{U}_{Cvx} \circ \hat{G} = G$.

Since the Giry monad factors through $\mathbf{Std}_{Cvx}$ it follows that $\mathbf{Std}_{Cvx}$ is a subcategory of $\mathbf{Alg}_{G}$.

Theorem

$\mathbf{Std}_{Cvx} = \mathbf{Alg}_{G}$.

Proof

Suppose that $(X,h)$ is a $G$-algebra so that $X$ is an object in $\mathbf{Alg}_G$. By Lemma 3.1 $X$ has a convex space structure so $X$ is an object in $\mathbf{Std} \cap \mathbf{Cvx}$. To show that $X$ is an object in $\mathbf{Std}_{Cvx}$ we only need to verify that $X$ satisfies the fullness condition.

Take any affine measurable function $X \xrightarrow{m} \mathbb{R}_{\infty}$. We claim that $(X,h) \xrightarrow{m} (\mathbb{R}_{\infty}, \mathbb{E}_{\bullet}(id_{\mathbb{R}_{\infty}}))$ is a morphism of $G$-algebras. In other words, the right-hand side of the $\mathbf{Std}$-diagram commutes.

To prove this note that the space $G(X)$ is, by Lemma 3.7, an object in $\mathbf{Std}_{Cvx}$, and that $\mathbb{R}_{\infty}$ is also an object in $\mathbf{Std}_{Cvx}$. The composite map $m \circ h$ is an affine measurable map and hence an arrow in $\mathbf{Std}_{Cvx}$. By Corollary 3.11 it follows that the outer square commutes. Thus we have

$$ \mathbb{E}_{\bullet}(id_{\mathbb{R}_{\infty}}) \circ G(m) \circ G(h) = m \circ (h \circ \mu_X) = m \circ h \circ G(h). $$

Now note that $G(h)$ is an epimorphism (onto) because $h$ is an epimorphism. Consequently, cancelling the term $G(h)$ on the right in the preceding equation shows that the right-hand side of the square in the diagram commutes.

Since $m$ is a morphism of $G$-algebras it follows that for all $P \in G(X)$ we have $m( h(P)) = \mathbb{E}_P(m)$, which in turn implies that

$$ h(P) \in m^{-1}\big(\mathbb{E}_P(m)\big) $$

or equivalently, $h(P) \in m^{-1}( \tilde{P}(m) )$. This holds for all affine measurable maps $X \xrightarrow{m} \mathbb{R}_{\infty}$, and hence the fullness property is satisfied, i.e.,

$$ \bigcap_{m \in \mathbb{R}_{\infty}^X} m^{-1}(\tilde{P}(m)) \ne \emptyset. $$

Consequently $X$ lies in the category $\mathbf{Std}_{Cvx}$.

The fact that every morphism in $\mathbf{Alg}_G$ is a morphism in $\mathbf{Std}_{Cvx}$ follows from Lemma 3.1. Hence we have shown that $\mathbf{Alg}_G$ is a subcategory of $\mathbf{Std}_{Cvx}$. Combining this fact with the result that $\mathbf{Std}_{Cvx}$ is a subcategory of $\mathbf{Alg}_G$ yields the result.

Concerning $P$-algebras, the above method of using generalized points can be used to obtain similar results. (Doberkat 03) gives a different representation for the algebras of $P$, although, like the Eilenberg-Moore characterization, the representation is descriptive but not constructive. (This is evident in the fact that he does not theoretically relate the algebras to the expectation mapping which characterizes the $P$-algebras.) His representation for the algebras is based upon the idea that we want continuous maps $h \colon P(X) \rightarrow X$ such that the ‘fibres’ are convex and closed, and such that $\delta_x$, the delta distribution on $x$, is in the fibre over $x$. There is a further condition, which requires a compact subset of $P(X)$ to be sent to a compact subset of $X$.

Doberkat points out that a discrete Polish space $X$ is disconnected, and hence there can be no continuous algebra map $P{X} \rightarrow X$ (when $X$ has more than one point: $P{X}$ is connected, while the unit law forces an algebra map to be onto). Hence $X$, irrespective of any convex structure we endow it with, cannot be an algebra. (In contrast, discrete standard measurable spaces $X=\mathbf{n}$, where $\mathbf{n}$ is a countable set, do have $G$-algebras $G{\mathbf{n}} \rightarrow \mathbf{n}$ defined by $\sum_{i=1}^{n} p_i \delta_i \mapsto \min \{ i \, \textrm{ such that } p_i > 0 \}$, where $\mathbf{n}$ has the (discrete) convex space structure $\frac{1}{2} i + \frac{1}{2} j = \min(i,j)$.)
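The "minimum of the support" algebra on a discrete space can be tried out on finitely supported measures. The Python sketch below checks the unit law and the multiplication law on one example; it is an illustration rather than a proof, and the frozenset encoding and helper names are assumptions made for the example.

```python
from fractions import Fraction as F

def measure(pairs):
    """Finitely supported measure as a hashable frozenset of (point, mass) pairs."""
    acc = {}
    for x, w in pairs:
        acc[x] = acc.get(x, F(0)) + F(w)
    return frozenset((x, w) for x, w in acc.items() if w != 0)

def dirac(x):
    return measure([(x, 1)])

def push_forward(f, p):
    """Functor action G(f)."""
    return measure((f(x), w) for x, w in p)

def mu(Q):
    """Monad multiplication: flatten a measure on measures."""
    return measure((x, W * w) for q, W in Q for x, w in q)

def h(p):
    """Candidate algebra: the least point carrying positive mass."""
    return min(x for x, w in p)

Q1 = measure([(3, F(1, 2)), (5, F(1, 2))])
Q2 = measure([(2, F(1, 3)), (7, F(2, 3))])
Q = measure([(Q1, F(1, 4)), (Q2, F(3, 4))])

lhs = h(mu(Q))               # least point in the union of the supports
rhs = h(push_forward(h, Q))  # least of the per-measure minima
```

The two sides agree because the minimum over a union of supports is the minimum of the individual minima, and `h(Q1)` recovers the convex structure $\frac{1}{2} \cdot 3 + \frac{1}{2} \cdot 5 = \min(3,5)$.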

See also monads of probability, measures, and valuations.

Voevodsky’s work

Vladimir Voevodsky has also worked on a category-theoretic treatment of probability theory, and gave a few talks on this at IHES, in Miami, in Moscow, etc. Voevodsky had in mind applications in mathematical biology, for example population genetics:

See Miami Talk abstract

…a categorical study of probability theory where “categorical” is understood in the sense of category theory. Originally, I developed this approach to probability to get a better understanding of the constructions which I had to deal with in population genetics. Later it evolved into something which seems to be also interesting from a purely mathematical point of view. On the elementary level it gives a category which is useful for the work with probabilistic constructions involving complicated combinations of stochastic processes of different types. On a more advanced level, applying in this context the old idea of a functor as a generalized object one gets a better view of the relationship between probability and the theory of (pre-)ordered topological vector spaces.

A talk in Moscow (20 Nov 2008, in Russian) can be viewed here, wmv 223.6 Mb. Abstract:

In early 60-ies Bill Lawvere defined a category whose objects are measurable spaces and morphisms are Markov kernels. I will try to show how this category allows one to think about many of the notions of probability theory in categorical terms and to connect probabilistic objects to objects of other types through various functors.

Voevodsky’s unfinished notes on categorical probability theory have been released posthumously.

Panangaden’s monad

Prakash Panangaden in Probabilistic Relations defines the category $SRel$ (stochastic relations) to have as objects sets equipped with a $\sigma$-field. Morphisms are conditional probability densities or stochastic kernels. So, a morphism from $(X, \Sigma_X)$ to $(Y, \Sigma_Y)$ is a function $h \colon X \times \Sigma_Y \to [0, 1]$ such that

  1. $\forall B \in \Sigma_Y . \lambda x \in X . h(x, B)$ is a bounded measurable function,
  2. $\forall x \in X . \lambda B \in \Sigma_Y . h(x, B)$ is a subprobability measure on $\Sigma_Y$.

If $k$ is a morphism from $Y$ to $Z$, then $k \cdot h$ from $X$ to $Z$ is defined as $(k \cdot h)(x, C) = \int_Y k(y, C) \, h(x, d y)$.
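On finite sets the composition formula becomes a finite sum, and rows are allowed to have total mass below one. The Python sketch below illustrates this; the dict-of-dicts encoding, the names `compose` and `total_mass`, and the toy kernels `h` and `k` are assumptions made for the example.

```python
from fractions import Fraction as F

def compose(k, h):
    """(k . h)(x, {z}) = sum_y k(y, {z}) * h(x, {y}) for finite subprobability kernels."""
    out = {}
    for x, row in h.items():
        acc = {}
        for y, w in row.items():
            for z, v in k[y].items():
                acc[z] = acc.get(z, F(0)) + v * w
        out[x] = acc
    return out

def total_mass(row):
    """Total mass of one row; at most 1 for a subprobability measure."""
    return sum(row.values(), F(0))

# h loses mass 1/4 at x; k loses mass 1/2 at y2.
h = {"x": {"y1": F(1, 2), "y2": F(1, 4)}}
k = {"y1": {"z": F(1)}, "y2": {"z": F(1, 2)}}
kh = compose(k, h)
```

The composite has total mass `5/8` at `x`: mass lost at either stage stays lost, which is one way to read the missing mass as the probability of never producing an output.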

Panangaden’s definition differs from Giry’s in the second clause where subprobability measures are allowed, rather than ordinary probability measures.

Panangaden emphasises that the mechanism is similar to the way that the category of relations can be constructed from the power set functor. Just as the category of relations is the Kleisli category of the powerset functor over the category of sets Set, $SRel$ is the Kleisli category of the functor over the category of measurable spaces and measurable functions which sends a measurable space $X$ to the measurable space of subprobability measures on $X$. This functor gives rise to a monad.

What is gained by the move from probability measures to subprobability measures? One motivation seems to be to model probabilistic processes from $X$ to a coproduct $X + Y$. This you can iterate to form a process which looks to see where in $Y$ you eventually end up. This relates to $SRel$ being traced.

There is a monad on $MeasureSpaces$, $1 + - \colon Meas \to Meas$. A probability measure on $1 + X$ is a subprobability measure on $X$. Panangaden’s monad is a composite of Giry’s and $1 + -$.
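The correspondence between subprobability measures on $X$ and probability measures on $1 + X$ can be sketched for finite $X$ as follows. The encoding of the extra point of $1 + X$ as Python's `None` (named `BOTTOM` here), and the function names, are hypothetical choices for the example; the sketch assumes `None` is not itself a point of $X$.

```python
from fractions import Fraction as F

BOTTOM = None  # assumed encoding of the single point of 1 in 1 + X

def to_probability(sub):
    """Send a subprobability measure on X to a probability measure on 1 + X."""
    missing = 1 - sum(sub.values(), F(0))
    assert 0 <= missing <= 1, "input must have total mass at most 1"
    prob = dict(sub)
    prob[BOTTOM] = missing  # the lost mass is placed on the extra point
    return prob

def to_subprobability(prob):
    """Inverse direction: forget the mass sitting on the extra point."""
    return {x: w for x, w in prob.items() if x is not BOTTOM}

sub = {"y1": F(1, 2), "y2": F(1, 4)}
prob = to_probability(sub)
```

The two functions are mutually inverse on measures of total mass at most one, which is the finite form of the statement that a probability measure on $1 + X$ is a subprobability measure on $X$.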

History

The adjunction underlying the Giry monad was originally developed by Lawvere in 1962, prior to the full recognition of the relationship between monads and adjunctions. Although P. Huber had already shown in 1961 that every adjoint pair gives rise to a monad, it wasn’t until 1965 that the constructions of Eilenberg-Moore, and Kleisli, made the essential equivalence of both concepts manifest.

Lawvere’s construction was written up as an appendix to a proposal to the Arms Control and Disarmament Agency, set up by President Kennedy as part of the State Department to handle planning and execution of certain treaties with the Soviet Union. This appendix was intended to provide a reasonable framework for arms control verification protocols (Lawvere 20).

At that time, Lawvere was working for a “think tank” in California, and the purpose of the proposal was to provide a means for verifying compliance with limitations on nuclear weapons. In the 1980s, Michèle Giry was collaborating with another French mathematician who was also working with the French intelligence agency, and she was able to obtain a copy of the appendix. Giry then developed and extended some of the ideas in the appendix (Giry 80).

Gian-Carlo Rota had also (somehow) obtained a copy of the appendix, which ended up in the library at The American Institute of Mathematics, and only became publicly available in 2012.

From Lawvere 20:

I’d like to say that the idea of the category of probabilistic mappings, the document corresponding to that was not part of a seminar, as some of the circulations say, essentially it was the document submitted to the arms control and disarmament agency after suitable checking that the Pentagon didn’t disagree with it. Because of the fact that for arms control agencies as a side responsibility the forming of arms control agreements and part of these agreements must involve agreed upon protocols of verification. So the idea of that paper did not provide such protocols, but it purported to provide reasonable framework within which such protocol can be formulated.

References

The idea originates with

and was picked up and published in:

  • Michèle Giry, A categorical approach to probability theory, Categorical aspects of topology and analysis (Ottawa, Ont., 1980), pp. 68–85, Lecture Notes in Math. 915 Springer 1982 (doi:10.1007/BFb0092872)

    (there are allegedly a few minor analytically incorrect points and gaps in proofs, observed by later authors).

Historical comments on the appearance of Lawvere 62 are made in

According to E. Burroni (2009), the Giry monad appears also in

  • O. de la Tullaye, L’intégration considérée comme l’algèbre d’un triple. Rapport de Stage de D.E.A. manuscrit 1971.

The article

shows, in effect, that the Giry monad restricted to countable measurable spaces (with the discrete $\sigma$-algebra) yields the restricted Giry functor $G| \colon \mathbf{Meas}_c \rightarrow \mathbf{Meas}$ which has the codensity monad $G$. This suggests that the natural numbers $\mathbb{N}$ are “sufficient” in some sense. Indeed, the full subcategory of Polish spaces consisting of the single object $N$ of all natural numbers with the powerset $\sigma$-algebra is codense in $Pol$: every continuous function $P{X} \rightarrow X$ is completely determined by its values on the countable dense subset of $P{X}$.

The article

views probability measures via double dualization, restricted to weakly averaging affine maps. A more satisfactory description of probability measures arises from recognizing the need for viewing them as weakly-averaging countably affine maps, obtained by double dualizing into $\mathbb{R}_{\infty}$, which then yields the characterization of $G$-algebras summarized above, which is from the article

  • Kirk Sturtz, Deriving the Giry algebras on standard Borel spaces using $\mathbb{R}_{\infty}$-generalized points, arXiv:2409.14861

Some corrections from an earlier version of the Categorical Probability Theory article, were pointed out in

Apart from these papers, there are similar developments in

  • Franck van Breugel, The metric monad for probabilistic nondeterminism, features both the Lawvere/Giry monad and Panangaden’s monad.

  • Ernst-Erich Doberkat, Characterizing the Eilenberg-Moore algebras for a monad of stochastic relations (pdf)

  • Ernst-Erich Doberkat, Kleisli morphisms and randomized congruences, Journal of Pure and Applied Algebra Volume 211, Issue 3, December 2007, Pages 638-664 https://doi.org/10.1016/j.jpaa.2007.03.003

  • N. N. Cencov, Statistical decisions rules and optimal Inference, Translations of Math. Monographs 53, Amer. Math. Society 1982

(blog comment) Cencov’s “category of statistical decisions” coincides with Giry’s (Lawvere’s) category. I ($\leftarrow$ somebody) have the sense that Cencov discovered this category independently of Lawvere although years later.

  • category cafe related to Giry monad: category theoretic probability, coalgebraic modal logic

  • Samson Abramsky et al. Nuclear and trace ideals in tensored ∗-Categories, arxiv:math/9805102, on the representation of probability theory through monads, which looks to work Giry’s monad into a context even more closely resembling the category of relations.

There is also relation with work of Jacobs et al.

J. Culbertson and K. Sturtz use the Giry monad in their categorical approach to Bayesian reasoning and inference (both articles contain further references to the categorical approach to probability theory):

  • Jared Culbertson and Kirk Sturtz, A categorical foundation for Bayesian probability, Applied Cat. Struc. 2013 (preprint as arXiv:1205.1488)

  • Jared Culbertson and Kirk Sturtz, Bayesian machine learning via category theory, 2013 (arxiv:1312.1445)

  • Elisabeth Burroni, Lois distributives. Applications aux automates stochastiques, TAC 22, 2009 pp.199-221 (journal page)

where she derives stochastic automata as algebras for a suitable distributive law on the monoid and Giry monads.

B. Fong has a section on the Giry monad in his paper on Bayesian networks:

  • Fong: Causal Theories - A Categorical Perspective on Bayesian Networks, (2013) arXiv:1301.6201

See also:

Discussion of the Giry monad extended to simplicial sets and used to characterize quantum contextuality via simplicial homotopy theory:

exposition:

category: probability

Last revised on March 9, 2026 at 00:59:15.