nLab Bell's inequality






What came to be called Bell’s inequality (Bell 1964) is an inequality satisfied by the three pairwise correlation functions between three random variables defined on one and the same classical probability space. As such, it is an elementary statement of classical probability theory, which has been argued (Pitowsky 1989a) to have been known already to Boole (1854).

The point of the argument by Bell 1964 was to highlight that when taking these three random variables to be the results of quantum measurements of the spin of an electron along three pairwise non-orthogonal axes (as in the Stern-Gerlach experiment) then quantum theory predicts that this inequality is violated – implying that there is no single classical probability space (called a hidden variable in the context of interpretations of quantum mechanics) on which these three quantum measurement-results are jointly random variables.
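As a hedged numerical illustration of this violation, the following sketch plugs quantum correlations into the inequality $\big\vert \langle S_1 S_2\rangle - \langle S_3 S_2\rangle \big\vert \leq 1 - \langle S_1 S_3\rangle$. It assumes the standard singlet-state prediction, with the sign convention chosen so that outcomes along coplanar axes at angles $\theta_i$, $\theta_j$ have correlation $\cos(\theta_i - \theta_j)$. This formula, the function name and the particular angles are assumptions made for the example, not taken from the text above.

```python
import math

def quantum_correlation(theta_i: float, theta_j: float) -> float:
    """Assumed quantum prediction for <S_i S_j> with coplanar axes:
    the cosine of the angle between the two measurement axes."""
    return math.cos(theta_i - theta_j)

# Hypothetical choice of axes (radians): axis 3 lies halfway between 1 and 2.
angles = {1: 0.0, 2: 2 * math.pi / 3, 3: math.pi / 3}

c12 = quantum_correlation(angles[1], angles[2])
c32 = quantum_correlation(angles[3], angles[2])
c13 = quantum_correlation(angles[1], angles[3])

lhs = abs(c12 - c32)   # left-hand side of the inequality
rhs = 1 - c13          # right-hand side of the inequality
violated = lhs > rhs

print(f"lhs = {lhs:.3f}, rhs = {rhs:.3f}, violated = {violated}")
```

With these angles the left-hand side is approximately 1 while the right-hand side is approximately 0.5, so the inequality fails for the assumed quantum correlations.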

A number of experiments (“Bell tests”) have sought to test Bell’s inequality in quantum physics, and all claim to have verified that it is indeed violated in nature (see Aspect 2015), as predicted by quantum theory.

Bell’s inequality has received, and continues to receive, an enormous amount of attention: first in discussions of interpretations of quantum mechanics, and more recently and more concretely in the context of quantum information theory.


A transparent and compact way to derive the actual inequality of Bell 1964 (adjusting the original argument only slightly for mathematical elegance) is reviewed in Khrennikov 2008, §10.1, which we broadly follow:



Given

  1. a probability space $(\Lambda, d\rho)$ with

  2. three random variables taking values in $\{\pm 1\}$ (regarded inside the real numbers):

    $$ S_i \;\colon\; \Lambda \longrightarrow \{\pm 1\} \hookrightarrow \mathbb{R} \,, \;\;\;\; i \,\in\, \{1,2,3\} \tag{1} $$

then the correlation functions

$$ \langle S_i \, S_j \rangle \;\coloneqq\; \int_{\Lambda} \; S_i(\lambda) \, S_j(\lambda) \; d\rho(\lambda) \tag{2} $$

satisfy this inequality:

$$ \big\vert \langle S_1 S_2\rangle - \langle S_3 S_2\rangle \big\vert \;\leq\; 1 - \langle S_1 S_3\rangle \,. \tag{3} $$

(where $\left\vert - \right\vert$ denotes the absolute value)
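The statement can be checked numerically on small finite probability spaces. The sketch below is only an illustration under stated assumptions: $\Lambda$ is taken to be a finite set with randomly chosen weights, and all function names are hypothetical. It draws random $\{\pm 1\}$-valued variables and verifies that the right-hand side minus the left-hand side never becomes negative, up to floating-point tolerance.

```python
import random

def correlation(weights, s_i, s_j):
    """Finite-space analogue of <S_i S_j>: weighted sum of S_i * S_j."""
    return sum(w * a * b for w, a, b in zip(weights, s_i, s_j))

def bell_gap(weights, s1, s2, s3):
    """Right-hand side minus left-hand side of the inequality.
    The inequality holds exactly when this is >= 0."""
    lhs = abs(correlation(weights, s1, s2) - correlation(weights, s3, s2))
    rhs = 1 - correlation(weights, s1, s3)
    return rhs - lhs

def random_instance(rng, n_points=8):
    """A random probability measure on n_points and three +-1 variables."""
    raw = [rng.random() for _ in range(n_points)]
    total = sum(raw)
    weights = [r / total for r in raw]  # normalize so the weights sum to 1
    spins = [[rng.choice((-1, 1)) for _ in range(n_points)] for _ in range(3)]
    return weights, spins

rng = random.Random(0)  # fixed seed for reproducibility
min_gap = min(
    bell_gap(w, *s) for w, s in (random_instance(rng) for _ in range(1000))
)
print(min_gap >= -1e-12)
```

A random search of this kind cannot prove the inequality, but it is a quick sanity check that no classical counterexample turns up.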


Recall that the expectation value of a random variable $P \,\colon\, \Lambda \longrightarrow \mathbb{R}$ is given by its Lebesgue integral against the probability measure:

$$ \langle P \rangle \;\coloneqq\; \int_\Lambda P(\lambda) \, d\rho(\lambda) \,, $$

and that $d\rho$ being a probability measure implies the normalization

$$ \langle 1 \rangle \;\equiv\; \int_\Lambda 1 \, d\rho(\lambda) \;=\; 1 \,. \tag{4} $$

Moreover, the assumption (1) that the random variables $S_i$ take values in $\{\pm 1\}$ immediately implies for all $i \,\in\, \{1,2,3\}$ that

$$ \big( S_i \cdot S_i \big) \,=\, 1 \,, \;\;\;\; \text{i.e.} \;\;\; \underset{\lambda \,\in\, \Lambda}{\forall} \; S_i(\lambda) \, S_i(\lambda) \,=\, (\pm 1)^2 \,=\, 1 \,. \tag{5} $$

Together this implies – by repeatedly using the Cauchy-Schwarz inequality – the bounds:

$$ \big\vert \langle S_i \rangle \big\vert \;\leq\; 1 \,, \;\;\;\;\;\;\; \big\vert \langle S_i S_j\rangle \big\vert \;\leq\; 1 $$

and thus, in particular:

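$$ \big\vert \langle P \, S_i \, S_j \rangle \big\vert \;\leq\; \big\vert \langle P \rangle \big\vert \,, \tag{6} $$

for any non-negative random variable $P \,\colon\, \Lambda \to \mathbb{R}_{\geq 0}$. (Non-negativity is needed here: a sign-changing $P$ can have $\langle P \rangle = 0$ while $\langle P \, S_i \, S_j \rangle \neq 0$. Below, the bound is applied only to $P = 1 - S_1 S_3 \geq 0$.)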

Using these (evident) ingredients, we directly compute as follows:

$$
\begin{array}{ll}
\big\vert \langle S_1 S_2\rangle - \langle S_3 S_2\rangle \big\vert & \\
\;=\; \Big\vert \int_{\Lambda} S_1(\lambda) \, S_2(\lambda) \, d\rho(\lambda) - \int_{\Lambda} S_3(\lambda) \, S_2(\lambda) \, d\rho(\lambda) \Big\vert & \text{by}\;\text{(2)} \\
\;=\; \Big\vert \int_{\Lambda} \big( S_1(\lambda) - S_3(\lambda) \big) \, S_2(\lambda) \, d\rho(\lambda) \Big\vert & \text{by linearity of the integral} \\
\;=\; \Big\vert \int_{\Lambda} \big( 1 - S_1(\lambda) \, S_3(\lambda) \big) S_1(\lambda) \, S_2(\lambda) \, d\rho(\lambda) \Big\vert & \text{by}\;\text{(5)} \\
\;\leq\; \Big\vert \int_{\Lambda} \big( 1 - S_1(\lambda) \, S_3(\lambda) \big) \, d\rho(\lambda) \Big\vert & \text{by}\;\text{(6)} \\
\;=\; 1 - \langle S_1 S_3\rangle & \text{by}\;\text{(4)}\;\text{and}\;\text{(2)}
\end{array}
$$

This is the inequality (3).
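The key steps of this computation are pointwise statements about values in $\{\pm 1\}$, so they can be checked exhaustively over the eight possible sign assignments. The following minimal sketch (variable names are illustrative, not from the source) verifies the algebraic rewriting used in the step "by (5)" and the pointwise form of the inequality itself:

```python
from itertools import product

signs = (-1, 1)

# Step "by (5)": (s1 - s3) * s2 == (1 - s1*s3) * s1 * s2, since s1*s1 == 1.
identity_holds = all(
    (s1 - s3) * s2 == (1 - s1 * s3) * s1 * s2
    for s1, s2, s3 in product(signs, repeat=3)
)

# Pointwise inequality: |s1*s2 - s3*s2| <= 1 - s1*s3 for every assignment.
violations = [
    (s1, s2, s3)
    for s1, s2, s3 in product(signs, repeat=3)
    if abs(s1 * s2 - s3 * s2) > 1 - s1 * s3
]

print(identity_holds, violations)
```

No sign assignment violates the pointwise inequality, and averaging it against any probability measure on $\Lambda$ then yields a bound of the form (3).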



The original article:


  • Greg Kuperberg, section 1.6.2 of A concise introduction to quantum probability, quantum mechanics, and quantum computation, 2005 (pdf)


See also:

Probabilistic opposition

The identification of Bell’s inequalities with much older inequalities in classical probability theory, due to George Boole’s The Laws of Thought, was pointed out by the following authors (among others; collectively called the “probabilistic opposition” in Khrennikov 2007, p. 3):

reviewed in:

Last revised on September 8, 2022 at 11:10:12.