nLab random variable




A random variable, or stochastic variable, is a quantity that is subject to ‘random’ variation.


The formalization of this idea in modern probability theory (Kolmogorov 33, III) is to take a random variable to be a measurable function $f$ on a probability space $(X,\mu)$ (e.g. Grigoryan 08, 3.2; Dembo 12, 1.2.1).

One thinks of $X$ as the space of all possible configurations (all the “possible worlds” with respect to the idealized situation under consideration), thinks of the measure $\mu(U)$ of any measurable subset $U \subset X$ as the probability that one of the configurations $x \in U \subset X$ is randomly realized, and thinks of $f(x)$ as the value of the given random variable in the situation of that configuration.

Accordingly, for instance, the expectation value of the random variable $f$ is the integral

$$\langle f \rangle \coloneqq \int_X f \cdot \mu$$

of $f$ against the probability measure, i.e. the average value of the random variable over all possible configurations, weighted by their probability.
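As a minimal sketch of the definitions above, assume a finite probability space, so that the integral reduces to a weighted sum. The names `X`, `mu`, `measure`, `expectation`, `f` and `is_even` are illustrative assumptions and do not come from the cited references; a fair six-sided die plays the role of the space of configurations.

```python
from fractions import Fraction

# Hypothetical finite probability space: X = faces of a fair die,
# mu assigns probability 1/6 to each configuration x in X.
X = [1, 2, 3, 4, 5, 6]
mu = {x: Fraction(1, 6) for x in X}


def measure(U):
    """Probability mu(U) that some configuration in the subset U is realized."""
    return sum(mu[x] for x in U)


def expectation(f):
    """Expectation <f>: the integral of f against mu, here a finite weighted sum."""
    return sum(f(x) * mu[x] for x in X)


def f(x):
    """A random variable: the value shown by the die in configuration x."""
    return x


def is_even(x):
    """Indicator random variable of the event 'the face is even'."""
    return 1 if x % 2 == 0 else 0


print(measure(X))            # total probability of the whole space
print(measure({2, 4, 6}))    # probability of an even face
print(expectation(f))        # average face value
print(expectation(is_even))  # agrees with measure({2, 4, 6})
```

Run as written, this prints `1`, `1/2`, `7/2` and `1/2`. The last two lines illustrate that the probability of an event is the expectation of its indicator random variable.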


Relation to type theoretic constructions

The concept of a random variable bears at least some similarity to the use of the function monad (“reader monad”) in the context of monads in computer science.

In Verdier 14 it says:

The intuition behind the Reader monad, for a mathematician, is perhaps stochastic variables. A stochastic variable is a function from a probability space to some other space. So we see a stochastic variable as a monadic value.

and in Toronto-McCarthy 10b, slide 35:

you could interpret this by regarding random variables as reader monad computations.

See also (Toronto-McCarthy 10b, slide 24). Both (Toronto-McCarthy 10a, 2.2) and (Toronto 14) call the function monad the random variable idiom.
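The analogy can be made concrete in a small sketch, under the assumption that a random variable is modelled as a plain Python function on a finite sample space. The names `OMEGA`, `MU`, `unit`, `bind`, `fmap` and `expectation` are hypothetical and are not taken from the cited slides or papers. Note that the measure plays no role in the monad operations themselves; it only enters when an expectation is computed.

```python
from fractions import Fraction

# Hypothetical sample space: two independent fair coin flips (1 = heads).
OMEGA = [(a, b) for a in (0, 1) for b in (0, 1)]
MU = {w: Fraction(1, 4) for w in OMEGA}


def unit(value):
    """Reader-monad unit: the constant random variable with the given value."""
    return lambda w: value


def bind(rv, k):
    """Reader-monad bind: evaluate rv at w, then run k(result) at the same w."""
    return lambda w: k(rv(w))(w)


def fmap(g, rv):
    """Post-compose a random variable with an ordinary function g."""
    return bind(rv, lambda a: unit(g(a)))


def expectation(rv):
    """Expectation of rv as a weighted sum over the finite sample space."""
    return sum(rv(w) * MU[w] for w in OMEGA)


def first(w):
    """Random variable: outcome of the first flip."""
    return w[0]


def second(w):
    """Random variable: outcome of the second flip."""
    return w[1]


# Number of heads, built monadically: both flips read the same configuration w.
heads = bind(first, lambda a: bind(second, lambda b: unit(a + b)))

print(expectation(heads))
print(expectation(fmap(lambda n: n * n, heads)))
```

Here `bind` threads the same configuration `w` through every sub-computation, which is what makes the reader monad a plausible model of several random variables defined on one shared probability space.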

Random variables and Dedekind reals

Given a measure space $(X,\Sigma,\mu)$, a random variable is also often defined as an equivalence class of measurable real-valued functions on $X$ where two such functions are identified when they differ only on a subset of measure zero.
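The identification of functions that differ only on a set of measure zero can be illustrated with a toy sketch, assuming a finite measure space with one point of measure zero. All names (`X`, `mu`, `equal_almost_everywhere`, `expectation`, `f`, `g`, `h`) are hypothetical.

```python
from fractions import Fraction

# Hypothetical finite measure space: the point "c" has measure zero.
X = ["a", "b", "c"]
mu = {"a": Fraction(1, 2), "b": Fraction(1, 2), "c": Fraction(0)}


def equal_almost_everywhere(f, g):
    """True when f and g differ only on a subset of measure zero."""
    differing = [x for x in X if f(x) != g(x)]
    return sum(mu[x] for x in differing) == 0


def expectation(f):
    """Integral of f against mu, here a finite weighted sum."""
    return sum(f(x) * mu[x] for x in X)


# Three real-valued functions on X, given by lookup tables.
f = {"a": 1, "b": 3, "c": 0}.get
g = {"a": 1, "b": 3, "c": 100}.get  # differs from f only at the null point "c"
h = {"a": 2, "b": 3, "c": 0}.get    # differs from f at "a", which has positive measure

print(equal_almost_everywhere(f, g))
print(equal_almost_everywhere(f, h))
print(expectation(f) == expectation(g))
```

In this sketch `f` and `g` represent the same random variable in the equivalence-class sense and have the same expectation, while `f` and `h` do not represent the same random variable.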

In this context, it has been observed by P. Deligne (see footnote 1) that the poset of measurable subsets $\Sigma$ can be equipped with a suitable Grothendieck topology.

In the resulting Grothendieck topos $Meas(X,\Sigma,\mu)$ the object of Dedekind real numbers $R_D$ corresponds to the sheaf of random variables on $X$ in this sense.

A Dedekind real in a topos $Sh(X)$ of sheaves on a topological space $X$ is just a continuous real-valued function on $X$. This suggests that the sheaf-theoretic perspective on $(X,\Sigma,\mu)$ sweeps the measure-theoretic details under the rug and brings out the conceptual essence of a random variable as simply a real-valued ‘function’ or ‘variable real number’ on $X$. It points in the same direction as the connection to the function monad mentioned in the previous section.

The details of this example, due to D. Scott, are described in (Johnstone 1977, p.213).


The modern formal concept originates around

  • Andrey Kolmogorov, Grundbegriffe der Wahrscheinlichkeitsrechnung, Ergebnisse der Mathematik und Ihrer Grenzgebiete, Springer Berlin Heidelberg 1933.

Surveys and lecture notes include

  • Alexander Grigoryan, Measure theory and probability, 2008.

  • Amir Dembo, Probability theory, 2012.

For more information on the above topos-theoretic example consult

  • Peter Johnstone, Topos Theory, Academic Press New York 1977. (Dover reprint 2014)

Discussion from a point of view of type theory/computer science includes

  1. SGA4.I, p.412.

Last revised on April 12, 2021 at 13:50:55.