nLab relative entropy




The notion of relative entropy of states is a generalization of the notion of entropy to a situation where the entropy of one state is measured “relative to” another state.

Relative entropy is also called

  • Kullback-Leibler divergence

  • information divergence

  • information gain.


For states on finite probability spaces

For two finite probability distributions $(p_i)$ and $(q_i)$, their relative entropy is

$$ S(p/q) := \sum_{k = 1}^n p_k (\log p_k - \log q_k) \,. $$
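The sum above can be evaluated directly. The following is a minimal sketch, not a reference implementation: the function name `relative_entropy` is hypothetical, logarithms are natural (so the result is in nats), and the conventions $0 \log 0 = 0$ and $S(p/q) = \infty$ when some $q_k = 0$ with $p_k \gt 0$ are assumptions added here.

```python
import math

def relative_entropy(p, q):
    """S(p/q) = sum_k p_k (log p_k - log q_k), in nats.

    Assumed conventions: terms with p_k == 0 contribute 0, and the
    result is +infinity if q_k == 0 while p_k > 0.
    """
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    total = 0.0
    for pk, qk in zip(p, q):
        if pk == 0.0:
            continue  # 0 log 0 = 0
        if qk == 0.0:
            return math.inf  # p is not absolutely continuous w.r.t. q
        total += pk * (math.log(pk) - math.log(qk))
    return total

uniform = [0.5, 0.5]
biased = [0.9, 0.1]
s_forward = relative_entropy(biased, uniform)
s_backward = relative_entropy(uniform, biased)
```

The two values `s_forward` and `s_backward` differ, which illustrates that relative entropy is not symmetric in its two arguments.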

Alternatively, for $\rho, \phi$ two density matrices, their relative entropy is

$$ S(\rho/\phi) := tr \rho (\log \rho - \log \phi) \,. $$
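For a concrete feel, the trace formula can be evaluated by diagonalizing both matrices. The sketch below is restricted, as a simplifying assumption, to real symmetric $2 \times 2$ density matrices so that it needs only the standard library; the helper names `sym2_eig` and `quantum_relative_entropy` are hypothetical, and the second argument is required to be strictly positive so that its logarithm exists.

```python
import math

def sym2_eig(m):
    """Eigen-decomposition of a real symmetric 2x2 matrix [[a, b], [b, d]].

    Returns (eigenvalues, eigenvectors); eigenvectors[i] is a unit
    eigenvector for eigenvalues[i].
    """
    (a, b), (_, d) = m
    theta = 0.5 * math.atan2(2.0 * b, a - d)  # Jacobi rotation angle
    c, s = math.cos(theta), math.sin(theta)
    mean = 0.5 * (a + d)
    radius = math.hypot(0.5 * (a - d), b)
    return (mean + radius, mean - radius), ((c, s), (-s, c))

def quantum_relative_entropy(rho, phi):
    """S(rho/phi) = tr rho (log rho - log phi), in nats, for 2x2 real symmetric inputs."""
    rho_vals, _ = sym2_eig(rho)
    phi_vals, phi_vecs = sym2_eig(phi)
    if min(phi_vals) <= 0.0:
        raise ValueError("phi must be strictly positive in this sketch")
    # tr(rho log rho), using 0 log 0 = 0 for (numerically) vanishing eigenvalues
    term_rho = sum(v * math.log(v) for v in rho_vals if v > 1e-15)
    # tr(rho log phi) = sum_i log(mu_i) * <v_i, rho v_i> over eigenpairs of phi
    term_phi = 0.0
    for mu, (x, y) in zip(phi_vals, phi_vecs):
        expectation = rho[0][0] * x * x + 2.0 * rho[0][1] * x * y + rho[1][1] * y * y
        term_phi += math.log(mu) * expectation
    return term_rho - term_phi

plus_state = [[0.5, 0.5], [0.5, 0.5]]   # pure state, does not commute with `mixed`
mixed = [[0.75, 0.0], [0.0, 0.25]]
s_noncommuting = quantum_relative_entropy(plus_state, mixed)
```

When both matrices are diagonal, the result agrees with the formula for finite probability distributions applied to their diagonal entries.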

For states on classical probability spaces


For $X$ a measurable space and $P$ and $Q$ two probability measures on $X$, such that $Q$ is absolutely continuous with respect to $P$, their relative entropy is the integral

$$ S(Q|P) = \int_X \log \frac{d Q}{d P} \, d Q = \int_X \frac{d Q}{d P} \log \frac{d Q}{d P} \, d P \,, $$

where $d Q / d P$ is the Radon-Nikodym derivative of $Q$ with respect to $P$.
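As a consistency check, here is a sketch of how this specializes to a finite set, with the integral taken against $Q$ (equivalently $\int_X \frac{d Q}{d P} \log \frac{d Q}{d P} \, d P$). The symbols $p_k$, $q_k$ are introduced here for illustration: take $X = \{1, \dots, n\}$ with $P(\{k\}) = p_k \gt 0$ and $Q(\{k\}) = q_k$. Then $\frac{d Q}{d P}(k) = q_k / p_k$ and

$$ S(Q|P) = \sum_{k = 1}^n q_k \log \frac{q_k}{p_k} = \sum_{k = 1}^n q_k (\log q_k - \log p_k) \,, $$

which is the formula for finite probability distributions with $(q_k)$ in the first slot, $S(q/p)$.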

For states on quantum probability spaces (von Neumann algebras)

Let $A$ be a von Neumann algebra and let $\phi, \psi : A \to \mathbb{C}$ be two states on it (faithful, positive linear functionals).


The relative entropy $S(\phi/\psi)$ of $\psi$ relative to $\phi$ is

$$ S(\phi/\psi) := - (\Psi, (\log \Delta_{\Phi,\Psi}) \Psi) \,, $$

where $\Delta_{\Phi,\Psi}$ is the relative modular operator of any cyclic and separating vector representatives $\Phi$ and $\Psi$ of $\phi$ and $\psi$.

This is due to (Araki).

  • This definition is independent of the choice of these representatives.

  • In the case that $A$ is finite dimensional and $\rho_\phi$ and $\rho_\psi$ are density matrices of $\phi$ and $\psi$, respectively, this reduces to the above definition.
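A sketch of how the finite-dimensional reduction can be checked, under assumptions that are not spelled out above: take $A = M_n(\mathbb{C})$ acting by left multiplication on itself with the Hilbert-Schmidt inner product $(X, Y) = tr(X^* Y)$, the vector representatives $\Phi = \rho_\phi^{1/2}$ and $\Psi = \rho_\psi^{1/2}$, and the convention that the relative modular operator arises from the antilinear map $x \Psi \mapsto x^* \Phi$. With these choices $\Delta_{\Phi,\Psi} X = \rho_\phi X \rho_\psi^{-1}$, hence

$$ (\log \Delta_{\Phi,\Psi}) X = (\log \rho_\phi) X - X \log \rho_\psi $$

and, using cyclicity of the trace,

$$ S(\phi/\psi) = - tr \left( \rho_\psi^{1/2} \left( (\log \rho_\phi) \rho_\psi^{1/2} - \rho_\psi^{1/2} \log \rho_\psi \right) \right) = tr \rho_\psi (\log \rho_\psi - \log \rho_\phi) \,. $$

Under this convention $\rho_\psi$ occupies the first slot of the density-matrix formula; other conventions for the relative modular operator may exchange the roles of $\phi$ and $\psi$.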

Relation to machine learning

The machine learning process has been characterized as a minimization of relative entropy (Ackley, Hinton and Sejnowski 1985).
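To illustrate the general idea of learning as relative entropy minimization, here is a toy sketch. It is not the Boltzmann machine algorithm of the cited paper: it fits a softmax distribution $q_\theta$ to a fixed target distribution $p$ by plain gradient descent on $S(p/q_\theta)$. The function names, the target values, the step count and the learning rate are all assumptions chosen for illustration.

```python
import math

def softmax(logits):
    """Turn real-valued logits into a probability distribution."""
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    return [e / z for e in exps]

def kl(p, q):
    """S(p/q) = sum_k p_k log(p_k / q_k), skipping terms with p_k == 0."""
    return sum(pk * math.log(pk / qk) for pk, qk in zip(p, q) if pk > 0.0)

def fit_by_kl_descent(target, steps=2000, lr=0.5):
    """Minimize S(target/softmax(logits)) over the logits by gradient descent."""
    logits = [0.0] * len(target)  # start from the uniform distribution
    for _ in range(steps):
        q = softmax(logits)
        # For a softmax model, d S(target/q) / d logit_k = q_k - target_k.
        logits = [x - lr * (qk - pk) for x, qk, pk in zip(logits, q, target)]
    return softmax(logits)

target = [0.7, 0.2, 0.1]
initial_gap = kl(target, [1.0 / 3.0] * 3)
model = fit_by_kl_descent(target)
final_gap = kl(target, model)
```

The relative entropy starts at `initial_gap` for the uniform model and decreases towards zero as the model distribution approaches the target.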


Relative entropy of states on von Neumann algebras was introduced in:

A characterization of relative entropy on finite-dimensional C-star algebras is given in

  • D. Petz, Characterization of the relative entropy of states of matrix algebras, Acta Mathematica Hungarica, 59 (1992), 3-4. (doi:10.1007/bf00050907)

A survey of entropy in operator algebras is in

  • Erling Størmer, Entropy in operator algebras (pdf)

A characterization of machine learning as a process minimizing relative entropy is proposed in

  • David H. Ackley, Geoffrey E. Hinton, Terrence J. Sejnowski, A learning algorithm for Boltzmann machines, Cognitive Science, 9 (1985), 147–169. (web)

Last revised on August 16, 2021 at 07:21:36.