nLab transport plan

Contents

Idea
Definition
Main constructions

Identity coupling
Independent coupling
Composition of couplings
Couplings induced by functions
Couplings induced by kernels
Bayesian inversion

Related concepts
References

Idea

One can view a probability measure $p$ on a space $(X,\mathcal{A})$ as a “pile of mass” (for example, of sand) on the space $X$. In this picture, given two probability spaces $(X,\mathcal{A},p)$ and $(Y,\mathcal{B},q)$, there may be many ways of moving the mass from $X$ to $Y$ so that the sand from the pile $p$ is rearranged to form the pile $q$: one has to specify to which point, or points, the mass from each point goes. Such a “way of moving the mass” is called a transport plan, and it is usually encoded by a joint distribution or by a Markov kernel (see below).

It is useful to keep track of the way in which the mass of $p$ is rearranged to form $q$. These different ways can be seen as different morphisms between the objects $(X,\mathcal{A},p)$ and $(Y,\mathcal{B},q)$ in a category of couplings.

Definition

Let $(X,\mathcal{A},p)$ and $(Y,\mathcal{B},q)$ be probability spaces. A coupling or transport plan between $(X,\mathcal{A},p)$ and $(Y,\mathcal{B},q)$ is a probability space $(X\times Y, \mathcal{A}\otimes\mathcal{B},r)$ where

$\mathcal{A}\otimes\mathcal{B}$ is the tensor product sigma-algebra on the product space $X\times Y$ (generated by the sets $A\times B$ with $A\in\mathcal{A}$ and $B\in\mathcal{B}$ );
the measure $r$ has $p$ and $q$ as marginals, in the sense that for all $A\in\mathcal{A}$ and $B\in\mathcal{B}$ ,

$r(A\times Y) \,=\, p(A) \;\; \text{and} \;\; r(X\times B) \,=\, q(B) \,.$
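For finite sets with the discrete sigma-algebra, a coupling is just a matrix with nonnegative entries and prescribed row and column sums. The following is a minimal Python sketch of the marginal condition in that special case. The names `marginals` and `is_coupling`, and the representation of measures as lists of exact fractions, are illustrative assumptions and not part of the definition above.

```python
from fractions import Fraction as F

def marginals(r):
    """Row and column sums of a joint distribution given as a nested list."""
    p = [sum(row) for row in r]          # r(A x Y) for singletons A = {x}
    q = [sum(col) for col in zip(*r)]    # r(X x B) for singletons B = {y}
    return p, q

def is_coupling(r, p, q):
    """Check that r is nonnegative and has marginals p and q."""
    nonneg = all(v >= 0 for row in r for v in row)
    return nonneg and marginals(r) == (list(p), list(q))

# Hypothetical example data on X = {0, 1} and Y = {0, 1}.
p = [F(1, 2), F(1, 2)]
q = [F(1, 4), F(3, 4)]
r = [[F(1, 4), F(1, 4)],
     [F(0),    F(1, 2)]]

print(is_coupling(r, p, q))
```

On a finite set it suffices to check singletons, since every measurable rectangle is a finite disjoint union of points.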

Main constructions

Identity coupling

Given a probability space $(X,\mathcal{A},p)$ , the identity coupling or diagonal coupling is given by the following measure on $\mathcal{A}\otimes\mathcal{A}$ :

$\Delta_p (A\times A') \,=\, p(A\cap A')$

for all $A,A'\in\mathcal{A}$ .

Intuitively, this is a copy of $p$ on $X$ concentrated on the diagonal subset $\{(x,x):x\in X\}\subseteq X\times X$ . (Whenever $(X,\mathcal{A})$ is standard Borel, the diagonal subset is measurable, and so this intuition can be made precise.)

This coupling gives the identity in the category of couplings. In terms of transport plans, this corresponds to not moving any mass (almost surely).
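In the finite discrete case the identity coupling is the diagonal matrix with the weights of $p$ on the diagonal. The sketch below, under that assumption, checks the defining formula on one pair of subsets; `identity_coupling` and `measure_of_rectangle` are hypothetical helper names.

```python
from fractions import Fraction as F

def identity_coupling(p):
    """Diagonal matrix carrying the weights of p: all mass stays where it is."""
    n = len(p)
    return [[p[i] if i == j else F(0) for j in range(n)] for i in range(n)]

def measure_of_rectangle(r, A, B):
    """Value of the joint measure r on the rectangle A x B (A, B sets of indices)."""
    return sum(r[i][j] for i in A for j in B)

# Hypothetical example on X = {0, 1, 2}.
p = [F(1, 6), F(1, 3), F(1, 2)]
delta_p = identity_coupling(p)

A, A2 = {0, 1}, {1, 2}
lhs = measure_of_rectangle(delta_p, A, A2)   # Delta_p(A x A')
rhs = sum(p[i] for i in A & A2)              # p(A ∩ A')
print(lhs == rhs)
```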

Independent coupling

Given probability spaces $(X,\mathcal{A},p)$ and $(Y,\mathcal{B},q)$ the independent coupling or product coupling or constant coupling is given by the product measure $p\otimes q$ , i.e.

$(p\otimes q)(A\times B) \,=\, p(A)\,q(B)$

for all $A\in\mathcal{A}$ and $B\in\mathcal{B}$ .

In terms of transport plans, the mass from almost every point of $X$ is spread over $Y$ proportionally to $q$, independently of the point of origin.
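For finite discrete spaces the product measure is the outer product of the two weight vectors. The following sketch (with the hypothetical name `independent_coupling`) also checks the interpretation above: conditioned on any point of origin with positive mass, the destination is distributed according to $q$.

```python
from fractions import Fraction as F

def independent_coupling(p, q):
    """Outer product: (p ⊗ q)({x} x {y}) = p(x) q(y)."""
    return [[px * qy for qy in q] for px in p]

# Hypothetical example data.
p = [F(1, 3), F(2, 3)]
q = [F(1, 4), F(3, 4)]
r = independent_coupling(p, q)

# Normalizing each row by the mass at its point of origin recovers q.
conditionals = [[v / px for v in row] for px, row in zip(p, r)]
print(all(row == q for row in conditionals))
```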

Composition of couplings

Let $(X,\mathcal{A},p)$ , $(Y,\mathcal{B},q)$ , $(Z,\mathcal{C},r)$ be standard Borel probability spaces, and consider transport plans $s$ from $p$ to $q$ and $t$ from $q$ to $r$ . The composite transport plan $t\circ s$ from $p$ to $r$ is defined as follows:

$(t\circ s)(A\times C) \,=\, \int_Y s'(A|y)\,t'(C|y)\,q(dy)$

for all $A\in\mathcal{A}$ and $C\in\mathcal{C}$, where $s'$ and $t'$ are the regular conditional distributions associated to $s$ and $t$ given $Y$. The interpretation is that the mass is moved according to the plan $s$ and then according to the plan $t$, and in case the transport is stochastic, the two transitions are taken independently.

This construction gives composition in the category of couplings. When the transport plans are induced by functions or kernels (see below), the composition of transport plans is given by the composition of functions or kernels.
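In the finite discrete case the conditional distributions given $y$ are $s(x,y)/q(y)$ and $t(y,z)/q(y)$, so the integral formula reduces to the sum $\sum_y s(x,y)\,t(y,z)/q(y)$ over the points $y$ of positive mass. The sketch below implements this under that assumption; the function name `compose` and the example data are hypothetical.

```python
from fractions import Fraction as F

def compose(s, t, q):
    """Composite plan: sum over y of s(x,y) t(y,z) / q(y), skipping q-null points."""
    nx, ny, nz = len(s), len(q), len(t[0])
    return [[sum(s[x][y] * t[y][z] / q[y] for y in range(ny) if q[y] > 0)
             for z in range(nz)]
            for x in range(nx)]

# Hypothetical plans: s from p to q, t from q to r.
p = [F(1, 2), F(1, 2)]
q = [F(1, 4), F(3, 4)]
r = [F(1, 2), F(1, 2)]
s = [[F(1, 4), F(1, 4)],
     [F(0),    F(1, 2)]]
t = [[F(1, 4), F(0)],
     [F(1, 4), F(1, 2)]]

ts = compose(s, t, q)

# Composing with the diagonal coupling of q leaves s unchanged.
diag_q = [[q[i] if i == j else F(0) for j in range(len(q))] for i in range(len(q))]
print(compose(s, diag_q, q) == s)
```

The composite again has marginals $p$ and $r$, which can be checked by summing the rows and columns of `ts`.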

In Kozen-Silva-Voogd’23, this construction was extended beyond the standard Borel case. (See there for the details.)

Couplings induced by functions

Let $f:(X,\mathcal{A},p)\to(Y,\mathcal{B},q)$ be a measure-preserving function. One can define the “deterministic” transport plan $r_f$ as follows,

$r_f(A\times B) \,=\, p\big(A\cap f^{-1}(B)\big)$

for all $A\in\mathcal{A}$ and $B\in\mathcal{B}$ . Intuitively, this maps all the mass at $x$ to the point $f(x)$ , for every $x\in X$ .
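As a finite illustration, a function between finite sets can be stored as a list of target indices, and the deterministic plan puts the whole mass $p(x)$ at the pair $(x, f(x))$. In the sketch below, `pushforward` and `function_coupling` are hypothetical names, and $q$ is taken to be the pushforward of $p$ so that $f$ is measure-preserving by construction.

```python
from fractions import Fraction as F

def pushforward(p, f, ny):
    """Image measure of p under f, where f[x] is the index of f(x) in Y."""
    q = [F(0)] * ny
    for x, px in enumerate(p):
        q[f[x]] += px
    return q

def function_coupling(p, f, ny):
    """r_f({x} x {y}) = p(x) if f(x) = y, and 0 otherwise."""
    return [[px if f[x] == y else F(0) for y in range(ny)]
            for x, px in enumerate(p)]

# Hypothetical example: X = {0, 1, 2}, Y = {0, 1}, f sends 0 to 0 and 1, 2 to 1.
p = [F(1, 6), F(1, 3), F(1, 2)]
f = [0, 1, 1]
q = pushforward(p, f, 2)
r_f = function_coupling(p, f, 2)

print([sum(col) for col in zip(*r_f)] == q)
```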

Note that in general there may exist no measure-preserving function between two probability spaces: for example, on the real line, if $p$ is a Dirac delta and $q$ is not. Measure-preserving Markov kernels, by contrast, always exist; see below.

Couplings induced by kernels

Let $k:(X,\mathcal{A},p)\to(Y,\mathcal{B},q)$ be a measure-preserving Markov kernel. One can define a transport plan $r_k$ as follows,

$r_k(A\times B) \,=\, \int_A k(B|x)\,p(dx)$

for all $A\in\mathcal{A}$ and $B\in\mathcal{B}$ . Intuitively, this maps all the mass at $x$ to a measure on $Y$ proportional to the measure $B\mapsto k(B|x)$ .

Note that the formula above depends on the measure $B\mapsto k(B|x)$ only for $p$-almost all $x$, and so it is insensitive to changes in $k$ on a set of $p$-measure zero. In a certain sense, this is transporting the mass of $p$ rather than the single points $x$.
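In the finite discrete case a Markov kernel is a row-stochastic matrix and the integral becomes $r_k(x,y) = p(x)\,k(y|x)$. The sketch below (with the hypothetical name `kernel_coupling`) also illustrates the insensitivity to null sets: two kernels that differ only at a point of $p$-measure zero induce the same plan.

```python
from fractions import Fraction as F

def kernel_coupling(p, k):
    """r_k({x} x {y}) = p(x) k(y|x), with k[x][y] standing for k({y}|x)."""
    return [[px * kxy for kxy in row] for px, row in zip(p, k)]

# Hypothetical example: the point x = 2 carries no mass.
p = [F(1, 2), F(1, 2), F(0)]
k1 = [[F(1, 2), F(1, 2)],
      [F(0),    F(1)],
      [F(1),    F(0)]]
k2 = [[F(1, 2), F(1, 2)],
      [F(0),    F(1)],
      [F(1, 3), F(2, 3)]]   # differs from k1 only on the p-null point

r1 = kernel_coupling(p, k1)
r2 = kernel_coupling(p, k2)
print(r1 == r2)
```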

In many cases, for instance if $(X,\mathcal{A})$ and $(Y,\mathcal{B})$ are standard Borel, every transport plan is of the form $r_k$ for some $k$. See also the discussion at "category of couplings".
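For finite discrete spaces such a kernel can be obtained by normalizing the rows of the coupling matrix. The following sketch assumes that setting; the name `disintegrate` is hypothetical, and the uniform row used at points of zero mass is an arbitrary choice, since the kernel is only determined up to $p$-null sets.

```python
from fractions import Fraction as F

def disintegrate(r):
    """Return the first marginal p and a kernel k with r(x,y) = p(x) k(y|x)."""
    p = [sum(row) for row in r]
    ny = len(r[0])
    k = []
    for px, row in zip(p, r):
        if px > 0:
            k.append([v / px for v in row])   # conditional distribution given x
        else:
            k.append([F(1, ny)] * ny)         # arbitrary on a p-null point
    return p, k

# Hypothetical coupling with a massless point x = 2.
r = [[F(1, 4), F(1, 4)],
     [F(0),    F(1, 2)],
     [F(0),    F(0)]]

p, k = disintegrate(r)
rebuilt = [[px * kxy for kxy in row] for px, row in zip(p, k)]
print(rebuilt == r)
```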

Bayesian inversion

Couplings are in some sense undirected, meaning that every transport plan from $X$ to $Y$ can also be seen as (and canonically induces) a transport plan from $Y$ to $X$ .

This makes the category of couplings canonically a dagger category.

For transport plans specified by kernels, this symmetry corresponds exactly to Bayesian inversion of kernels.
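In the finite discrete case the reversed plan is the transposed matrix, and the inverted kernel is given by Bayes' formula $k^\dagger(x|y) = p(x)\,k(y|x)/q(y)$ wherever $q(y) \gt 0$. The sketch below checks that the plan induced by $k^\dagger$ from $q$ is the transpose of the plan induced by $k$ from $p$. The name `bayesian_inverse` and the uniform fallback at $q$-null points are assumptions made for illustration.

```python
from fractions import Fraction as F

def bayesian_inverse(p, k):
    """Return q (the image of p under k) and a kernel k_dag from Y to X."""
    nx, ny = len(p), len(k[0])
    q = [sum(p[x] * k[x][y] for x in range(nx)) for y in range(ny)]
    k_dag = [[p[x] * k[x][y] / q[y] if q[y] > 0 else F(1, nx)
              for x in range(nx)]
             for y in range(ny)]
    return q, k_dag

# Hypothetical example data.
p = [F(1, 2), F(1, 2)]
k = [[F(1, 2), F(1, 2)],
     [F(0),    F(1)]]

q, k_dag = bayesian_inverse(p, k)

r_k = [[p[x] * k[x][y] for y in range(2)] for x in range(2)]        # plan from p to q
r_dag = [[q[y] * k_dag[y][x] for x in range(2)] for y in range(2)]  # plan from q to p
transposed = [list(col) for col in zip(*r_k)]
print(r_dag == transposed)
```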

Related concepts

References

Cédric Villani, Optimal transport: old and new, Springer, 2008.
Fredrik Dahlqvist, Vincent Danos, Ilias Garnier, and Alexandra Silva, Borel kernels and their approximation, categorically, MFPS 2018. (arXiv)
Dexter Kozen, Alexandra Silva, Erik Voogd, Joint Distributions in Probabilistic Semantics, MFPS 2023. (arXiv)
Paolo Perrone, Lifting couplings in Wasserstein spaces, 2021. (arXiv:2110.06591)

category: probability

Last revised on February 7, 2025 at 09:47:24.