In probability theory a conditional expectation value or conditional expectation, for short, is like an expectation value of some random variable/observable, but conditioned on the assumption that a certain event has occurred.
More technically: If $(\Omega, \mathcal{A}, P)$ is a probability space, the conditional expectation of a (measurable) random variable $X$ with respect to some sub-$\sigma$-algebra $\mathcal{F} \subseteq \mathcal{A}$ is some $\mathcal{F}$-measurable random variable $E(X \vert \mathcal{F})$ which is a "coarsened" version of $X$. We can think of $E(X \vert \mathcal{F})$ as a random variable with the same domain, but which is measured with a $\sigma$-algebra containing only restricted information on the original events, since to some events in $\mathcal{A}$ probability $0$ or $1$ has been assigned in a consistent way.
Let $(\Omega, \mathcal{A}, P)$ be a probability space, let $X \colon (\Omega, \mathcal{A}) \to (\Omega', \mathcal{A}')$ be a measurable function into a measure space $(\Omega', \mathcal{A}')$ equipped with the pushforward measure $P_X := X_\ast P$ induced by $X$, and let $Y \colon (\Omega, \mathcal{A}) \to (\mathbb{R}, \mathcal{B}(\mathbb{R}))$ be a real-valued random variable.
Then there exists an essentially unique (two such functions are defined to be equivalent if the set on which they differ has measure $0$) integrable function $E(Y \vert X) \colon \Omega' \to \mathbb{R}$ such that the following diagram commutes:
$$
\array{
(\Omega, \mathcal{A}, P) & \overset{X}{\longrightarrow} & (\Omega', \mathcal{A}', P_X)
\\
{}_{\mathllap{Y}}\searrow && \swarrow{}_{\mathrlap{E(Y \vert X)}}
\\
& \mathbb{R}
}
$$
where $P_X = X_\ast P$ is the pushforward measure from above. Here "commutes" shall mean that
(1) $E(Y \vert X)$ is $\mathcal{A}'$-measurable.
(2) for every $A' \in \mathcal{A}'$ the integrals of $E(Y \vert X)$ over $A'$ and of $Y$ over $X^{-1}(A')$ are equal.
In this case $E(Y \vert X)$ is called a version of the conditional expectation of $Y$ provided $X$.
In more detail, (2) is equivalent to the statement that for all $A' \in \mathcal{A}'$ we have
$$
\int_{X^{-1}(A')} Y \, d P \;=\; \int_{A'} E(Y \vert X) \, d P_X
$$
and to
$$
\int_{X^{-1}(A')} Y \, d P \;=\; \int_{X^{-1}(A')} E(Y \vert X) \circ X \, d P \,.
$$
(The equivalence of the last two formulas holds since we always have $\int_{A'} f \, d P_X = \int_{X^{-1}(A')} (f \circ X) \, d P$ by the substitution rule.)
Note that it does not follow from the preceding definition that the conditional expectation exists; existence is a consequence of the Radon-Nikodym theorem, as will be shown in the following section. (Note that the argument of the theorem applies to the definition of the conditional expectation by random variables if we consider the pushforward measure as given by a sub-$\sigma$-algebra of the original one. In this sense $E(Y \vert X)$ is a "coarsened version" of $Y$, factored by the information (i.e. the $\sigma$-algebra) given by $X$.)
Note that by construction of the pushforward measure it suffices to define the conditional expectation only for the case of a sub-$\sigma$-algebra $\mathcal{F} \subseteq \mathcal{A}$.
(Note that we lose information with this notation: $E(Y \vert X)$, which is defined on $\Omega'$, is different from $E(Y \vert \sigma(X)) = E(Y \vert X) \circ X$, which is defined on $\Omega$.)
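For illustration, here is a minimal worked example (with $\Omega$, $X$ and $Y$ chosen purely for this purpose): let $\Omega = \{1, \dots, 6\}$ carry the uniform measure $P$ (a fair die), let $Y(\omega) = \omega$ be the number shown, and let $X \colon \Omega \to \Omega' = \{\text{even}, \text{odd}\}$ record the parity, so that $P_X(\{\text{even}\}) = P_X(\{\text{odd}\}) = \tfrac{1}{2}$. Then
$$
E(Y \vert X)(\text{even}) \;=\; \frac{2 + 4 + 6}{3} \;=\; 4 \,, \qquad E(Y \vert X)(\text{odd}) \;=\; \frac{1 + 3 + 5}{3} \;=\; 3 \,,
$$
and condition (2) can be checked directly, e.g. for $A' = \{\text{even}\}$ with $X^{-1}(A') = \{2, 4, 6\}$:
$$
\int_{X^{-1}(A')} Y \, d P \;=\; \frac{2 + 4 + 6}{6} \;=\; 2 \;=\; 4 \cdot \tfrac{1}{2} \;=\; \int_{A'} E(Y \vert X) \, d P_X \,.
$$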
The diagram
$$
\array{
(\Omega, \mathcal{A}, P) & \overset{\mathrm{id}}{\longrightarrow} & (\Omega, \mathcal{F}, P|_{\mathcal{F}})
\\
{}_{\mathllap{Y}}\searrow && \swarrow{}_{\mathrlap{E(Y \vert \mathcal{F})}}
\\
& \mathbb{R}
}
$$
is commutative (in our sense) iff

(a) $E(Y \vert \mathcal{F})$ is $\mathcal{F}$-measurable,

(b) $\int_A E(Y \vert \mathcal{F}) \, d P = \int_A Y \, d P$ for all $A \in \mathcal{F}$.
We hence can write the conditional expectation as the equivalence class of all functions satisfying (a) and (b), where two such functions are identified if they agree $P$-almost surely.
An element of this class is also called a version.
The conditional expectation $E(Y \vert \mathcal{F})$ exists and is unique almost surely.
Existence: By
$$
Q(A) \;:=\; \int_A Y \, d P \,, \qquad A \in \mathcal{F} \,,
$$
a measure $Q$ is defined on $\mathcal{F}$ (if $Y \ge 0$; if not, consider the positive part $Y^+$ and the negative part $Y^-$ of $Y$ separately and use linearity of the integral). Let $P|_{\mathcal{F}}$ be the restriction of $P$ to $\mathcal{F}$. Then
$$
Q \ll P|_{\mathcal{F}} \,,
$$
meaning: $P|_{\mathcal{F}}(A) = 0$ implies $Q(A) = 0$ for all $A \in \mathcal{F}$. This is the condition of the theorem of Radon-Nikodym (the other condition of the theorem, that $P|_{\mathcal{F}}$ is $\sigma$-finite, is satisfied since $P$ is a probability measure). The theorem implies that $Q$ has a density w.r.t. $P|_{\mathcal{F}}$, and this density is $E(Y \vert \mathcal{F})$.
Uniqueness: If $Z_1$ and $Z_2$ are candidates, by linearity the integral of their difference over every $A \in \mathcal{F}$ is zero; since $Z_1 - Z_2$ is $\mathcal{F}$-measurable, this forces $Z_1 = Z_2$ $P$-almost surely.
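In the special case of a sub-$\sigma$-algebra generated by a countable partition this density can be written out explicitly (an illustrative special case of the construction above): if $\mathcal{F} = \sigma(A_1, A_2, \dots)$ with pairwise disjoint $A_i$ and $P(A_i) > 0$, then
$$
E(Y \vert \mathcal{F}) \;=\; \sum_i \left( \frac{1}{P(A_i)} \int_{A_i} Y \, d P \right) \chi_{A_i} \,,
$$
since this function is $\mathcal{F}$-measurable and has the same integral as $Y$ over every $A_i$, hence over every $A \in \mathcal{F}$.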
From elementary probability theory we know that $P(A \vert B) = \frac{P(A \cap B)}{P(B)}$ (for $P(B) > 0$).
For $A \in \mathcal{A}$ we call $P(A \vert \mathcal{F}) := E(\chi_A \vert \mathcal{F})$ the conditional probability of $A$ provided $\mathcal{F}$.
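Continuing the die illustration from above: for $A = \{6\}$ and $\mathcal{F} = \sigma(\{2, 4, 6\})$ (the parity $\sigma$-algebra) one gets
$$
P(A \vert \mathcal{F}) \;=\; E(\chi_{\{6\}} \vert \mathcal{F}) \;=\; \tfrac{1}{3} \, \chi_{\{2, 4, 6\}} + 0 \cdot \chi_{\{1, 3, 5\}} \,,
$$
i.e. the conditional probability of rolling a $6$ is $1/3$ on the event "even" and $0$ on the event "odd", in accordance with the elementary formula.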
In probability theory and statistics, a stochastic kernel is the transition function of a stochastic process. In a discrete time process with continuous probability distributions, it is the same thing as the kernel of the integral operator that advances the probability density function.
An integral transform is an assignment of the form
$$
(T f)(y) \;=\; \int k(x, y) \, f(x) \, d x \,,
$$
where the function $k$ of two variables is called the integral kernel of the transform $T$.
Let $(\Omega, \mathcal{A}, P)$ be a measure space, let $(\Omega', \mathcal{A}')$ be a measurable space.
A map $K \colon \Omega \times \mathcal{A}' \to [0, 1]$ satisfying

(1) $K(-, A') \colon \Omega \to [0, 1]$ is measurable for every $A' \in \mathcal{A}'$,

(2) $K(\omega, -)$ is a probability measure on $(\Omega', \mathcal{A}')$ for every $\omega \in \Omega$,

is called a stochastic kernel or transition kernel (or Markov kernel - which we avoid since it is confusing) from $(\Omega, \mathcal{A})$ to $(\Omega', \mathcal{A}')$.
Then $K$ induces a function between the classes of measures on $(\Omega, \mathcal{A})$ and on $(\Omega', \mathcal{A}')$:
$$
(\nu K)(A') \;:=\; \int_\Omega K(\omega, A') \, \nu(d \omega) \,.
$$
If $\nu$ is a probability measure, then so is $\nu K$. The symbol $K(\omega, A')$ is sometimes written as $K(A' \vert \omega)$, in optical proximity to a conditional probability.
The stochastic kernel is hence in particular an integral kernel.
In a discrete stochastic process (see below) the transition function is a stochastic kernel (more precisely, it is the function $\nu \mapsto \nu K$ induced by a kernel $K$).
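As a minimal illustration (with all numbers chosen only for this example): for $\Omega = \Omega' = \{1, 2\}$ with the power-set $\sigma$-algebras, a stochastic kernel is the same thing as a row-stochastic matrix, e.g.
$$
K(1, -) = (0.9, \; 0.1) \,, \qquad K(2, -) = (0.4, \; 0.6) \,,
$$
and the induced map on measures sends a probability vector $\nu = (\nu_1, \nu_2)$ to $(\nu K)(\{j\}) = \sum_i \nu_i \, K(i, \{j\})$, e.g. $(0.5, 0.5) \mapsto (0.65, 0.35)$.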
Let $(\Omega, \mathcal{A}, P)$ be a probability space, let $(\Omega', \mathcal{A}')$ be a measure space, let $K$ be a stochastic kernel from $(\Omega, \mathcal{A})$ to $(\Omega', \mathcal{A}')$.
Then by
$$
(P \otimes K)(A \times A') \;:=\; \int_A K(\omega, A') \, P(d \omega) \,, \qquad A \in \mathcal{A} \,,\; A' \in \mathcal{A}' \,,
$$
a probability measure $P \otimes K$ on the product space $(\Omega \times \Omega', \mathcal{A} \otimes \mathcal{A}')$ is defined, which is called the coupling of $P$ and $K$. $P \otimes K$ is unique with this property.
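In the finite setting of the previous illustration, with $P$ given by the vector $(0.5, 0.5)$ on $\Omega = \{1, 2\}$ and $K$ the matrix above, the coupling is determined by
$$
(P \otimes K)(\{(i, j)\}) \;=\; P(\{i\}) \, K(i, \{j\}) \,,
$$
so it assigns the weights $0.45$, $0.05$, $0.2$, $0.3$ to $(1,1)$, $(1,2)$, $(2,1)$, $(2,2)$ respectively; its first marginal is $P$ and its second marginal is $P K = (0.65, 0.35)$.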
Let (with the above settings) $X \colon (\Omega, \mathcal{A}) \to (\Omega', \mathcal{A}')$ be measurable, let $Y \colon \Omega \to \mathbb{R}^n$ be an $n$-dimensional random vector.

Then there exists a stochastic kernel $K$ from $(\Omega', \mathcal{A}')$ to $(\mathbb{R}^n, \mathcal{B}(\mathbb{R}^n))$ such that
$$
P^{(X, Y)} \;=\; P_X \otimes K \,,
$$
and $K$ is (a version of) the conditional distribution of $Y$ provided $X$, i.e.
$$
K(x, B) \;=\; P(Y \in B \,\vert\, X = x) \qquad \text{for } P_X\text{-almost all } x \,.
$$
This theorem says that $K$ (more precisely the family of probability measures $K(x, -)$, $x \in \Omega'$) fits in the diagram
$$
\array{
(\Omega, \mathcal{A}, P) & \overset{X}{\longrightarrow} & (\Omega', \mathcal{A}', P_X)
\\
{}_{\mathllap{Y}}\searrow && \swarrow{}_{\mathrlap{K}}
\\
& (\mathbb{R}^n, \mathcal{B}(\mathbb{R}^n))
}
$$
and $P^Y = P_X K$.
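A standard illustration (with the random variables chosen just for this example): let $X$ be the result of one fair die and $Y = X + W$ its sum with a second, independent fair die $W$. A version of the conditional distribution of $Y$ provided $X$ is the kernel
$$
K(x, -) \;=\; \text{uniform distribution on } \{x + 1, \dots, x + 6\} \,,
$$
and indeed $P^{(X, Y)}(\{(x, y)\}) = \tfrac{1}{6} K(x, \{y\}) = (P_X \otimes K)(\{(x, y)\})$ and $P^Y(\{y\}) = \sum_{x = 1}^{6} \tfrac{1}{6} K(x, \{y\}) = (P_X K)(\{y\})$.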
In the discrete case, i.e. if $\Omega$ and $\Omega'$ are finite or enumerable sets, it is possible to reconstruct $K$ by just considering one-element sets in $\mathcal{A}'$ and the related probabilities
$$
K_{i j} \;:=\; K(i, \{j\}) \,,
$$
called transition probabilities. These assemble to a (perhaps countably infinite) matrix $(K_{i j})$, called the transition matrix of $K$ resp. of the process. Note that $K_{i j}$ is the probability of the transition of the state $i$ (aka. elementary event or one-element event) to the event $\{j\}$ (which in this case happens to have only one element, too). We have $\sum_j K_{i j} = 1$ for all $i$.
If $f$ is a counting density on $\Omega$, then
$$
(f K)(j) \;:=\; \sum_{i \in \Omega} f(i) \, K(i, \{j\})
$$
is a counting density on $\Omega'$.
The conditional expectation plays a defining role in the theory of martingales which are stochastic processes such that the conditional expectation of the next value (provided the previous values) equals the present realized value.
The terminology of stochastic processes is a special interpretation of some aspects of infinitary combinatorics in terms of probability theory.
Let $(T, \le)$ be a total order (i.e. $\le$ is transitive, antisymmetric, and total).
A stochastic process is a diagram $(X_t)_{t \in T}$ where each $X_t \colon (\Omega, \mathcal{A}, P) \to (\Omega_t, \mathcal{A}_t)$ is a random variable. Often one considers the case where all $(\Omega_t, \mathcal{A}_t)$ are equal; in this case the common codomain $(\Omega', \mathcal{A}')$ is called the state space of the process $(X_t)_{t \in T}$.
If all $(\Omega_t, \mathcal{A}_t)$ are equal and the class of $\sigma$-algebras $(\mathcal{F}_t)_{t \in T}$, $\mathcal{F}_t \subseteq \mathcal{A}$, is filtered, i.e.
$$
\mathcal{F}_s \subseteq \mathcal{F}_t \quad \text{for } s \le t \,,
$$
and all $X_t$ are $\mathcal{F}_t$-measurable, the process is called an adapted process.
For example the natural filtration, where $\mathcal{F}_t := \sigma(X_s \,\vert\, s \le t)$, gives an adapted process.
In terms of a diagram we have, for $s \le t$,
$$
P^{X_t} \;=\; P^{X_s} K_{s, t} \,,
$$
where $K_{s, t}(x, A)$ is the transition probability for the passage from the state $x$ at time $s$ to the event $A$ at time $t$.
An adapted stochastic process $(X_n)_{n \in \mathbb{N}}$ with the natural filtration $(\mathcal{F}_n)_{n \in \mathbb{N}}$ in discrete time is called a martingale if all $E|X_n| < \infty$ and
$$
E(X_{n + 1} \vert \mathcal{F}_n) \;=\; X_n \qquad \text{for all } n \,.
$$
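A standard example (with the process chosen just for illustration): let $(W_n)_{n \in \mathbb{N}}$ be independent with $P(W_n = +1) = P(W_n = -1) = \tfrac{1}{2}$ and let $S_n := W_1 + \dots + W_n$ be the simple symmetric random walk with its natural filtration $\mathcal{F}_n = \sigma(S_1, \dots, S_n)$. Then
$$
E(S_{n + 1} \vert \mathcal{F}_n) \;=\; S_n + E(W_{n + 1} \vert \mathcal{F}_n) \;=\; S_n + E(W_{n + 1}) \;=\; S_n \,,
$$
since $W_{n + 1}$ is independent of $\mathcal{F}_n$; together with $E|S_n| \le n < \infty$ this exhibits the random walk as a martingale.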
(…)
An adapted stochastic process satisfying
$$
P(X_t \in A \,\vert\, \mathcal{F}_s) \;=\; P(X_t \in A \,\vert\, X_s) \qquad \text{for all } s \le t \text{ and all measurable } A
$$
is called a Markov process.
For a Markov process the Chapman-Kolmogorov equation encodes the statement that the transition probabilities of the process form a semigroup.
If, in the notation from above, $(K_t)_{t \in T}$ is a family of stochastic kernels such that all $K_t(x, -)$ are probability measures, then $(K_t)_{t \in T}$ is called a transition semigroup if
$$
K_{s + t} \;=\; K_s K_t \qquad \text{for all } s, t \in T \,,
$$
where
$$
(K_s K_t)(x, A) \;:=\; \int K_t(y, A) \, K_s(x, d y) \,.
$$
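In the finite-state, time-homogeneous, discrete-time case (an illustrative special case) this is just multiplication of transition matrices: if $M$ denotes the one-step transition matrix, so that $K_n$ has matrix $M^n$, the Chapman-Kolmogorov equation reads
$$
M^{s + t}(i, j) \;=\; \sum_k M^{s}(i, k) \, M^{t}(k, j) \,, \qquad \text{i.e.} \quad M^{s + t} = M^{s} M^{t} \,;
$$
e.g. for the $2$-state matrix used in the illustration above the two-step transition matrix is
$$
M^2 \;=\; \begin{pmatrix} 0.85 & 0.15 \\ 0.60 & 0.40 \end{pmatrix} \,.
$$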
In the dual algebraic formulation of probability theory known as noncommutative probability theory or quantum probability theory, where the concept of expectation value is primitive (while that of the corresponding probability space (if it exists) is a derived concept), the concept of conditional expectation appears as follows (e.g. Redei-Summers 06, section 7.3):
Let $(\mathcal{A}, \langle - \rangle)$ be a quantum probability space, hence $\mathcal{A}$ a complex star algebra of quantum observables, and $\langle - \rangle \colon \mathcal{A} \to \mathbb{C}$ a state on this star-algebra.

This means that for $a \in \mathcal{A}$ any observable, its expectation value in the given state is $\langle a \rangle \in \mathbb{C}$.
More generally, if $p \in \mathcal{A}$ is a real idempotent/projector,
$$
p^\ast = p \,, \qquad p \, p = p \,,
$$
thought of as an event, then for any observable $a \in \mathcal{A}$ the conditional expectation value of $a$, conditioned on the observation of $p$, is
$$
\frac{\langle p \, a \, p \rangle}{\langle p \rangle} \,.
$$
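A minimal sketch of how this reduces to the classical formula (all choices made just for illustration): take $\mathcal{A} = M_2(\mathbb{C})$ with the vector state $\langle a \rangle = \psi^\dagger a \psi$ for $\psi = \tfrac{1}{\sqrt{2}}(1, 1)^T$, the projector $p = \mathrm{diag}(1, 0)$ and a diagonal observable $a = \mathrm{diag}(a_0, a_1)$. Then $p \, a \, p = \mathrm{diag}(a_0, 0)$, hence
$$
\frac{\langle p \, a \, p \rangle}{\langle p \rangle} \;=\; \frac{\tfrac{1}{2} a_0}{\tfrac{1}{2}} \;=\; a_0 \,,
$$
which is the classical conditional expectation of $a$ given the event "outcome $0$".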
See also
Discussion from the point of view of quantum probability is in

Miklós Rédei, Stephen J. Summers, Quantum Probability Theory (2006)