In probability theory, the empirical distribution is the probability distribution formed by taking empirical frequencies of a phenomenon, and dividing by the total number of cases.
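For example, if we flip a coin 5 times, the empirical distribution is the probability distribution on the space $\{\text{heads}, \text{tails}\}$ given by

$$ P(\text{heads}) = \frac{\#\text{heads}}{5}, \qquad P(\text{tails}) = \frac{\#\text{tails}}{5} . $$

For instance, if we have obtained “heads” 3 times and “tails” 2 times, we have

$$ P(\text{heads}) = \frac{3}{5}, \qquad P(\text{tails}) = \frac{2}{5} . $$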
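The coin example can be reproduced in a few lines. The following is a minimal Python sketch; the function name `empirical_distribution` and the use of exact fractions are illustrative choices, not part of the definition.

```python
from collections import Counter
from fractions import Fraction


def empirical_distribution(outcomes):
    """Return the relative frequency of each outcome in a finite sample."""
    outcomes = list(outcomes)
    if not outcomes:
        raise ValueError("at least one observation is needed")
    counts = Counter(outcomes)  # empirical frequencies
    n = len(outcomes)           # total number of cases
    return {outcome: Fraction(count, n) for outcome, count in counts.items()}


# Five coin flips with 3 heads and 2 tails, as in the example above.
flips = ["heads", "tails", "heads", "heads", "tails"]
p = empirical_distribution(flips)
# p["heads"] == Fraction(3, 5) and p["tails"] == Fraction(2, 5)
```

Only the counts matter: any reordering of the same five flips gives the same distribution.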
The name empirical distribution denotes both the distribution obtained by sampling a finite amount of data and the limit (when it exists) resulting from an infinite sequence of observations, usually generated by a stochastic process.
In statistics it is used as an estimator of the distribution of a random variable whenever it is possible to take i.i.d. samples.
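Let $(X, \mathcal{A})$ be a measurable space. For each $x \in X$, denote by $\delta_x$ the Dirac delta distribution given by

$$ \delta_x(A) = \begin{cases} 1 & x \in A \\ 0 & x \notin A \end{cases} $$

for all measurable $A \subseteq X$.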
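Let now $N$ be a finite set with $n \ge 1$ elements. We can view the product space $X^N$ as the space of finite sequences of elements of $X$. The empirical distribution of a finite sequence $(x_1, \dots, x_n)$ is the probability measure on $X$ given by

$$ \frac{1}{n} \sum_{i=1}^n \delta_{x_i} , $$

meaning that it assigns to each measurable $A \subseteq X$ the value

$$ \frac{\#\{ i : x_i \in A \}}{n} . $$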
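To make the finite definition concrete, here is a minimal Python sketch of the empirical distribution as an average of Dirac measures. It rests on simplifying assumptions: events are represented as ordinary Python sets, measurability is not modelled, and the names `dirac` and `empirical_measure` are hypothetical helpers introduced only for illustration.

```python
from fractions import Fraction


def dirac(x):
    """Dirac measure at x: an event A gets mass 1 if x lies in A, else 0."""
    return lambda A: 1 if x in A else 0


def empirical_measure(xs):
    """Average of the Dirac measures at the points of a finite sequence."""
    xs = list(xs)
    if not xs:
        raise ValueError("the sequence must be nonempty")
    n = len(xs)
    deltas = [dirac(x) for x in xs]
    # mu(A) = (1/n) * sum_i delta_{x_i}(A) = #{i : x_i in A} / n
    return lambda A: Fraction(sum(delta(A) for delta in deltas), n)


mu = empirical_measure([1, 4, 4, 6, 2])
even = {2, 4, 6}
# mu(even) == Fraction(4, 5), since four of the five points are even
```

Repeated points are counted with multiplicity, so the value 4, which occurs twice, contributes mass 2/5.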
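Similarly, we can view the countable product $X^{\mathbb{N}}$ as the space of infinite sequences of elements of $X$. The empirical distribution of a sequence $(x_1, x_2, \dots)$ is the probability measure on $X$ given by the limit, if it exists,

$$ \lim_{n \to \infty} \frac{1}{n} \sum_{i=1}^n \delta_{x_i} . $$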
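The limiting behaviour can be illustrated numerically, with the caveat that a finite computation can only suggest a limit, not establish it. The sketch below assumes a deterministic periodic sequence (heads, heads, tails, repeated), chosen so that the limiting frequency of heads is known to be 2/3; the helper names are hypothetical.

```python
from fractions import Fraction
from itertools import cycle, islice


def running_frequency(sequence, event, n):
    """Mass that the empirical distribution of the first n terms gives to `event`."""
    hits = sum(1 for x in islice(sequence, n) if x in event)
    return Fraction(hits, n)


def periodic_flips():
    """A deterministic infinite sequence: heads, heads, tails, repeated forever."""
    return cycle(["heads", "heads", "tails"])


# The frequency of heads approaches 2/3 as more terms are included.
for n in (10, 100, 1000, 10000):
    print(n, float(running_frequency(periodic_flips(), {"heads"}, n)))
```

For this sequence the values are 0.7, 0.67, 0.667 and 0.6667. A sequence whose running frequencies keep oscillating would have no empirical distribution in this sense.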
If the $x_i$ are random variables, and so they form a stochastic process (for example, if they are coin flips), the empirical distribution, if it exists, is a random variable as well.
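The randomness of the empirical distribution is easiest to see in the finite case. The following sketch assumes a fair coin simulated with Python's pseudo-random generator; the function name and the seeds are arbitrary illustrative choices.

```python
import random
from collections import Counter
from fractions import Fraction


def sample_empirical_distribution(n, seed):
    """Flip a simulated fair coin n times and return the empirical distribution."""
    rng = random.Random(seed)  # seeded so that each run is reproducible
    flips = [rng.choice(["heads", "tails"]) for _ in range(n)]
    counts = Counter(flips)    # a side that never occurs has count 0
    return {side: Fraction(counts[side], n) for side in ("heads", "tails")}


# Each call is one realization of the random empirical distribution.
first = sample_empirical_distribution(5, seed=1)
second = sample_empirical_distribution(5, seed=2)
```

Different seeds generally produce different distributions, while repeating a seed reproduces the same one: the empirical distribution is a function of the underlying random outcomes.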
(…)
Last revised on July 15, 2024 at 16:47:57.