nLab relative entropy

Contents

Context

Measure and probability theory

Idea
Definition

For states on finite probability spaces
For states on classical probability spaces
For states on quantum probability spaces (von Neumann algebras)

Relation to machine learning
Related concepts
References

Idea

The notion of relative entropy of states is a generalization of the notion of entropy to a situation where the entropy of one state is measured “relative to” another state.

It is also called

Kullback-Leibler divergence,
information divergence, or
information gain.

Definition

For states on finite probability spaces

For two finite probability distributions $(p_k)$ and $(q_k)$, $k = 1, \ldots, n$, their relative entropy is

$$S(p/q) := \sum_{k = 1}^n p_k (\log p_k - \log q_k) \,.$$
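A minimal sketch of the finite sum in Python may make the formula concrete. The function name `relative_entropy` is hypothetical, the natural logarithm is assumed, and the conventions $0 \log 0 = 0$ and $S(p/q) = \infty$ when some $q_k = 0$ while $p_k \gt 0$ are the usual ones but are assumptions here, since the text above does not state them.

```python
import math


def relative_entropy(p, q):
    """Return S(p/q) = sum_k p_k (log p_k - log q_k), using the natural log.

    Assumed conventions (not stated in the text above):
    terms with p_k = 0 contribute 0, and the result is +infinity
    if q_k = 0 for some k with p_k > 0.
    """
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    total = 0.0
    for pk, qk in zip(p, q):
        if pk == 0.0:
            continue  # convention: 0 * log 0 = 0
        if qk == 0.0:
            return math.inf  # p puts mass where q has none
        total += pk * (math.log(pk) - math.log(qk))
    return total


p = [0.5, 0.5]
q = [0.25, 0.75]
print(relative_entropy(p, q))  # equals 0.5 * log(4/3)
print(relative_entropy(q, p))  # a different value: S is not symmetric
print(relative_entropy(p, p))  # zero when both distributions agree
```

The second call illustrates that relative entropy is not symmetric in its two arguments, which is why it is called a divergence rather than a distance.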

More generally, for two density matrices $\rho$ and $\phi$, their relative entropy is

$$S(\rho/\phi) := tr \rho (\log \rho - \log \phi) \,.$$
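As a consistency check, the density-matrix formula can be compared with the finite sum above in the special case where $\rho$ and $\phi$ commute. The orthonormal eigenbasis $(e_k)$ and the eigenvalues $p_k$, $q_k$ below are notation introduced only for this sketch, and $q_k \gt 0$ is assumed so that $\log \phi$ is defined:

```latex
\rho = \sum_{k=1}^n p_k \, |e_k\rangle\langle e_k| \,, \qquad
\phi = \sum_{k=1}^n q_k \, |e_k\rangle\langle e_k|
\quad\Longrightarrow\quad
S(\rho/\phi)
  = \operatorname{tr} \rho (\log \rho - \log \phi)
  = \sum_{k=1}^n p_k (\log p_k - \log q_k)
  = S(p/q) \,.
```

Both logarithms are diagonal in the common eigenbasis, so the trace reduces to a sum over eigenvalues. For non-commuting $\rho$ and $\phi$ no such reduction to a pair of probability distributions is available.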

For states on classical probability spaces

Definition

For $X$ a measurable space and $P$ and $Q$ two probability measures on $X$ such that $Q$ is absolutely continuous with respect to $P$, their relative entropy is the integral

$$S(Q|P) = \int_X \log \frac{d Q}{d P} \, d Q = \int_X \frac{d Q}{d P} \log \frac{d Q}{d P} \, d P \,,$$

where $d Q / d P$ is the Radon-Nikodym derivative of $Q$ with respect to $P$.
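As an illustration, take $X = \mathbb{R}$ and let $P$ and $Q$ be Gaussian measures, so that $d Q / d P$ is the ratio of their densities. The sketch below uses the usual Kullback-Leibler convention of integrating $\log (d Q / d P)$ against $Q$ and compares a midpoint-rule approximation with the well-known closed form for Gaussians. The choice of Gaussians, the function names, the integration window and the step count are all assumptions made for the example.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of the normal distribution N(mean, std^2) at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def kl_gaussian_closed_form(mean_q, std_q, mean_p, std_p):
    """Closed form of S(Q|P) for Q = N(mean_q, std_q^2), P = N(mean_p, std_p^2)."""
    return (math.log(std_p / std_q)
            + (std_q ** 2 + (mean_q - mean_p) ** 2) / (2.0 * std_p ** 2)
            - 0.5)


def kl_gaussian_numeric(mean_q, std_q, mean_p, std_p,
                        lo=-30.0, hi=30.0, steps=20_000):
    """Midpoint-rule approximation of the integral of log(dQ/dP) against Q."""
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * h
        # log(dQ/dP)(x), written via log-densities to avoid dividing by tiny numbers
        log_ratio = (math.log(std_p / std_q)
                     - 0.5 * ((x - mean_q) / std_q) ** 2
                     + 0.5 * ((x - mean_p) / std_p) ** 2)
        total += gaussian_pdf(x, mean_q, std_q) * log_ratio * h
    return total


exact = kl_gaussian_closed_form(1.0, 1.0, 0.0, 2.0)
approx = kl_gaussian_numeric(1.0, 1.0, 0.0, 2.0)
print(exact, approx)
```

The two printed values should agree to many digits, since the integrand is smooth and decays rapidly outside the integration window.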

For states on quantum probability spaces (von Neumann algebras)

Let $A$ be a von Neumann algebra and let $\phi, \psi : A \to \mathbb{C}$ be two states on it (faithful, positive linear functionals).

Definition

The relative entropy $S(\phi/\psi)$ of $\psi$ relative to $\phi$ is

$$S(\phi/\psi) := - (\Psi, (\log \Delta_{\Phi,\Psi}) \Psi) \,,$$

where $\Delta_{\Phi,\Psi}$ is the relative modular operator of any cyclic and separating vector representatives $\Phi$ and $\Psi$ of $\phi$ and $\psi$.

This definition is due to Araki (1976).

Proposition

This definition is independent of the choice of these representatives.
In the case that $A$ is finite dimensional and $\rho_\phi$ and $\rho_\psi$ are the density matrices of $\phi$ and $\psi$, respectively, this reduces to the definition for density matrices given above.

Relation to machine learning

Learning in Boltzmann machines has been characterized as a minimization of relative entropy (Ackley, Hinton and Sejnowski 1985).
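The idea of learning as minimization of relative entropy can be illustrated with a toy model. The sketch below is not a Boltzmann machine and is not taken from the cited paper: it assumes a model distribution given by a softmax over free parameters (logits) and runs plain gradient descent on $S(p/q_\theta)$ for a fixed target distribution $p$. The function names, the learning rate and the step count are hypothetical choices.

```python
import math


def softmax(logits):
    """Map real-valued logits to a strictly positive probability distribution."""
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(t - m) for t in logits]
    z = sum(exps)
    return [e / z for e in exps]


def relative_entropy(p, q):
    """S(p/q) = sum_k p_k (log p_k - log q_k), skipping terms with p_k = 0."""
    return sum(pk * (math.log(pk) - math.log(qk))
               for pk, qk in zip(p, q) if pk > 0.0)


def fit(target, steps=2000, learning_rate=0.5):
    """Minimize S(target / softmax(logits)) by gradient descent on the logits."""
    logits = [0.0] * len(target)
    history = []
    for _ in range(steps):
        model = softmax(logits)
        history.append(relative_entropy(target, model))
        # The gradient of S(target / softmax(logits)) in the logits is model - target.
        logits = [t - learning_rate * (m - p)
                  for t, m, p in zip(logits, model, target)]
    return softmax(logits), history


target = [0.7, 0.2, 0.1]
model, history = fit(target)
print(model)
print(history[0], history[-1])
```

The relative entropy recorded in `history` starts at its value for the uniform distribution and decreases towards zero as the model distribution approaches the target.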

Related concepts

References

Relative entropy of states on von Neumann algebras was introduced in:

Huzihiro Araki, Relative Entropy of States of von Neumann Algebras, Publications of the Research Institute for Mathematical Sciences, 11 3 (1976), 809-833 (pdf, doi:10.2977/prims/1195191148)

A characterization of relative entropy on finite-dimensional C*-algebras is given in

D. Petz, Characterization of the relative entropy of states of matrix algebras, Acta Mathematica Hungarica, 59 3-4 (1992) (doi:10.1007/bf00050907)

A survey of entropy in operator algebras is in

Erling Størmer, Entropy in operator algebras (pdf)

A characterization of machine learning as a process minimizing relative entropy is proposed in

David H. Ackley, Geoffrey E. Hinton, Terrence J. Sejnowski, A learning algorithm for Boltzmann machines, Cognitive Science, 9 (1985), 147–169. (web)

Last revised on December 30, 2023 at 21:36:41. See the history of this page for a list of all contributions to it.
