nLab Bell's theorem


This entry needs to be merged with Bell's inequalities.






Bell’s theorem is the collective name for a family of results, all showing the impossibility of a local realistic (hidden-variable) interpretation of quantum mechanics. As such, it is a kind of no-go theorem.

Original derivation

The following tries to be a fairly verbatim recap of the argument in Bell 1964. For a streamlined statement and proof see here at Bell's inequality.

Let us denote the result $A$ of a measurement that is determined by a unit vector $\vec{a}$ and some parameter $\lambda$ by $A(\vec{a},\lambda)=\pm 1$, where we suppose that the outcome of the measurement is either $+1$ or $-1$. Likewise, we denote the result $B$ of a second measurement by $B(\vec{b},\lambda)$. We further make the vital assumption that the result $B$ does not depend on $\vec{a}$ and, likewise, that $A$ does not depend on $\vec{b}$.

Before proceeding, we should note that $\lambda$ here plays the role of a “hidden” parameter or variable. We say it is “hidden” because its precise nature is not known. However, it is still a very real parameter with a probability distribution $\rho(\lambda)$. The expectation value of the product of the two measurements is

$$ P(\vec{a},\vec{b})=\int d\lambda\,\rho(\lambda)A(\vec{a},\lambda)B(\vec{b},\lambda). \tag{1} $$

Because $\rho$ is a normalized probability distribution,

$$ \int d\lambda\,\rho(\lambda) = 1, $$

and because $A(\vec{a},\lambda)=\pm 1$ and $B(\vec{b},\lambda)=\pm 1$, $P$ cannot be less than $-1$. It can equal $-1$ at $\vec{a}=\vec{b}$ only if $A(\vec{a},\lambda) = -B(\vec{a},\lambda)$ except on a set of points $\lambda$ of zero probability. Assuming this perfect anticorrelation (which is what quantum mechanics predicts for the singlet state considered below), we can write (1) as

$$ P(\vec{a},\vec{b})=-\int d\lambda\,\rho(\lambda)A(\vec{a},\lambda)A(\vec{b},\lambda). \tag{2} $$

If we introduce a third unit vector $\vec{c}$, we can find the difference between the correlations of $\vec{a}$ with the two other unit vectors,

$$ P(\vec{a},\vec{b})-P(\vec{a},\vec{c})=-\int d\lambda\,\rho(\lambda)\left[A(\vec{a},\lambda)A(\vec{b},\lambda)-A(\vec{a},\lambda)A(\vec{c},\lambda)\right]. \tag{3} $$

Since $A(\vec{b},\lambda)^{2}=1$, we may rearrange (3) as

$$ P(\vec{a},\vec{b})-P(\vec{a},\vec{c})=\int d\lambda\,\rho(\lambda)A(\vec{a},\lambda)A(\vec{b},\lambda)\left[A(\vec{b},\lambda)A(\vec{c},\lambda)-1\right]. \tag{4} $$

Since $|A(\vec{a},\lambda)A(\vec{b},\lambda)| = 1$ and $1-A(\vec{b},\lambda)A(\vec{c},\lambda) \ge 0$, taking absolute values gives

$$ |P(\vec{a},\vec{b})-P(\vec{a},\vec{c})| \le \int d\lambda\,\rho(\lambda)\left[1-A(\vec{b},\lambda)A(\vec{c},\lambda)\right]. \tag{5} $$

By normalization the first term on the right is $1$, and by (2) the second term is simply $P(\vec{b},\vec{c})$. Thus

$$ 1 + P(\vec{b},\vec{c}) \ge |P(\vec{a},\vec{b})-P(\vec{a},\vec{c})| \tag{6} $$

which is the original form of Bell’s inequality. Note that this may be written in terms of correlation coefficients,

$$ 1 + C(b,c) \ge |C(a,b)-C(a,c)| \tag{7} $$

where $a$, $b$, and $c$ are now settings on the measurement apparatus.
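To make the inequality concrete, the following is a minimal numerical sketch of one particular local hidden-variable model. The model is an assumption chosen purely for illustration and is not taken from Bell 1964: $\lambda$ is an angle distributed uniformly on the circle, $A(\vec{a},\lambda)$ is the sign of $\cos(\theta_a - \lambda)$, and $B = -A$. The function names `A`, `B`, `P` and `bell_slack` are hypothetical.

```python
import math

def A(angle, lam):
    """Outcome for particle 1: +1 or -1, the sign of cos(angle - lam)."""
    return 1 if math.cos(angle - lam) >= 0 else -1

def B(angle, lam):
    """Outcome for particle 2: opposite to A at the same setting."""
    return -A(angle, lam)

def P(a, b, n=20000):
    """Approximate the expectation value of A*B for rho uniform on [0, 2*pi).

    The integral over lambda is replaced by an average over n midpoints.
    """
    total = 0
    for k in range(n):
        lam = 2 * math.pi * (k + 0.5) / n
        total += A(a, lam) * B(b, lam)
    return total / n

def bell_slack(a, b, c):
    """1 + P(b,c) - |P(a,b) - P(a,c)|; nonnegative when the inequality holds."""
    return 1 + P(b, c) - abs(P(a, b) - P(a, c))

# Settings are given as angles (in radians) of coplanar unit vectors.
for triple in [(0.0, math.pi / 3, 2 * math.pi / 3),
               (0.0, math.pi / 4, math.pi / 2),
               (0.3, 1.1, 2.5)]:
    print(triple, bell_slack(*triple) >= -1e-9)
```

Because the averaged sum is itself an expectation over a (discrete) distribution of $\lambda$ with $B=-A$, the inequality holds for it up to floating-point rounding, whatever settings are chosen.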

Quantum mechanical violations

The original derivation of Bell’s inequalities involved the use of a Stern-Gerlach device that measures spin along an axis. Suppose $\sigma_{1}$ and $\sigma_{2}$ are spins. The result $A$ of measuring $\sigma_{1}\cdot\vec{a}$ is then interpreted as being entirely determined by $\vec{a}$ and $\lambda$. Likewise for $B$ and $\sigma_{2}\cdot\vec{b}$. It is also important to remember that the result $B$ does not depend on $\vec{a}$ and, likewise, $A$ does not depend on $\vec{b}$.

For a singlet state (that is, a state with total spin zero), the quantum mechanical expectation value of measurements along two different axes (see the Wigner derivation below for a more intuitive explanation of the physical nature of this) is

$$ \langle(\sigma_{1}\cdot\vec{a})\,(\sigma_{2}\cdot\vec{b})\rangle = - \vec{a}\cdot\vec{b}. \tag{8} $$

If a local hidden-variable description were correct, this would have to equal $P(\vec{a},\vec{b})$ as defined in (1) for every choice of axes. It cannot: for suitable $\vec{a}$, $\vec{b}$, and $\vec{c}$, the value $-\vec{a}\cdot\vec{b}$ violates the original form of Bell’s inequality derived above. It is important to remember that we are using classical reasoning throughout our derivations of the various forms of Bell’s inequalities.
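As a quick check, the sketch below substitutes the singlet-state correlation $-\vec{a}\cdot\vec{b}$ into the original inequality $1 + P(\vec{b},\vec{c}) \ge |P(\vec{a},\vec{b})-P(\vec{a},\vec{c})|$. The choice of three coplanar axes at $0$, $60$ and $120$ degrees is an assumption made for illustration, and the function names are hypothetical.

```python
import math

def quantum_correlation(theta_a, theta_b):
    """Singlet-state value -a.b for coplanar unit vectors at the given angles."""
    return -math.cos(theta_a - theta_b)

def bell_slack_for(P, a, b, c):
    """1 + P(b,c) - |P(a,b) - P(a,c)| for a correlation function P."""
    return 1 + P(b, c) - abs(P(a, b) - P(a, c))

a, b, c = 0.0, math.pi / 3, 2 * math.pi / 3
slack = bell_slack_for(quantum_correlation, a, b, c)
print(slack)  # about -0.5; a negative value means the inequality fails
```

Here the left side is $1 - \cos 60^\circ = 0.5$ while the right side is $|-\cos 60^\circ + \cos 120^\circ| = 1$, so the inequality fails for these settings.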

The setup envisioned here consists of pairs of spin-1/2 particles produced in singlet states that then each pass through separate Stern-Gerlach (SG) devices. Since they are in singlet states, if we measured the first particle of a pair to be aligned with a given axis, say $\vec{a}$, then the second should be measured to be anti-aligned with that same axis, giving a total spin of zero.

In practice we are dealing with beams of particles and thus we can never be absolutely certain that correlated pairs are measured simultaneously, and so we ultimately are making statistical predictions. Nevertheless, in a given sample consisting of a large enough number of randomly distributed spin-1/2 particles, we can be certain that, for example, a definite number are aligned with an axis $\vec{a}$ while a definite number are aligned with an axis $\vec{b}$.

Now take an individual particle and suppose that, for this particle, if we measured $\sigma\cdot\vec{a}$ we would obtain $+1$ with certainty (meaning it is aligned with $\vec{a}$), but if we instead chose to measure $\sigma\cdot\vec{b}$ we would obtain $-1$ with certainty (meaning it is anti-aligned with $\vec{b}$). Notationally we refer to such a particle as belonging to type $(\vec{a}+,\vec{b}-)$. Clearly for a given pair of particles in a singlet state, if particle 1 is of type $(\vec{a}+,\vec{b}-)$, then particle 2 must be of type $(\vec{a}-,\vec{b}+)$.


For beams of correlated particles measured along only two axes, we should expect to get a roughly evenly balanced distribution of types as follows:

$$
\array{
\text{Particle 1} & & \text{Particle 2} \\
(\vec{a}+,\vec{b}-) & \leftrightarrow & (\vec{a}-,\vec{b}+) \\
(\vec{a}+,\vec{b}+) & \leftrightarrow & (\vec{a}-,\vec{b}-) \\
(\vec{a}-,\vec{b}-) & \leftrightarrow & (\vec{a}+,\vec{b}+) \\
(\vec{a}-,\vec{b}+) & \leftrightarrow & (\vec{a}+,\vec{b}-)
}
$$

There is a very important assumption implied here. Suppose a particular pair belongs to the first grouping. Then if observer A decides to measure the spin of particle 1 along $\vec{a}$, he or she necessarily obtains a plus sign (corresponding to the particle being aligned with $\vec{a}$), regardless of any measurement observer B may make on particle 2. This is the principle of locality: A’s result is predetermined independently of B’s choice of what to measure.

Wigner’s derivation

Now suppose we introduce a third axis, $\vec{c}$, so that we can have, for example, particles of type $(\vec{a}+,\vec{b}+,\vec{c}-)$, corresponding to being aligned if measured along $\vec{a}$ and $\vec{b}$ and anti-aligned along $\vec{c}$. Further, let us “count” the pairs that fall into the various groupings and label the populations as follows:

$$
\array{
\text{Population} & \text{Particle 1} & \text{Particle 2} \\
N_{1} & (\vec{a}+,\vec{b}+,\vec{c}+) & (\vec{a}-,\vec{b}-,\vec{c}-) \\
N_{2} & (\vec{a}+,\vec{b}+,\vec{c}-) & (\vec{a}-,\vec{b}-,\vec{c}+) \\
N_{3} & (\vec{a}+,\vec{b}-,\vec{c}+) & (\vec{a}-,\vec{b}+,\vec{c}-) \\
N_{4} & (\vec{a}+,\vec{b}-,\vec{c}-) & (\vec{a}-,\vec{b}+,\vec{c}+) \\
N_{5} & (\vec{a}-,\vec{b}+,\vec{c}+) & (\vec{a}+,\vec{b}-,\vec{c}-) \\
N_{6} & (\vec{a}-,\vec{b}+,\vec{c}-) & (\vec{a}+,\vec{b}-,\vec{c}+) \\
N_{7} & (\vec{a}-,\vec{b}-,\vec{c}+) & (\vec{a}+,\vec{b}+,\vec{c}-) \\
N_{8} & (\vec{a}-,\vec{b}-,\vec{c}-) & (\vec{a}+,\vec{b}+,\vec{c}+)
}
$$

Let’s suppose that observer A finds particle 1 to be aligned with $\vec{a}$, i.e. $\vec{a}+$, and that observer B finds particle 2 to be aligned with $\vec{b}$, i.e. $\vec{b}+$. From the above table it is clear that the pair belongs to either population 3 or population 4. Note that because each $N_{i}$ is nonnegative, relations such as the following must hold:

$$ N_{3} + N_{4} \le (N_{3} + N_{7}) + (N_{4} + N_{2}). \tag{9} $$

Now let $P(\vec{a}+;\vec{b}+)$ be the probability that, in a random selection, A finds particle 1 to be $\vec{a}+$ and B finds particle 2 to be $\vec{b}+$. In terms of populations, we have

$$ P(\vec{a}+;\vec{b}+) = \frac{N_{3} + N_{4}}{\sum_{i=1}^{8}N_{i}}. \tag{10} $$

Similarly we have

$$ P(\vec{a}+;\vec{c}+) = \frac{N_{2} + N_{4}}{\sum_{i=1}^{8}N_{i}} \tag{11} $$

and

$$ P(\vec{c}+;\vec{b}+) = \frac{N_{3} + N_{7}}{\sum_{i=1}^{8}N_{i}}. \tag{12} $$

The positivity condition (9) then becomes

$$ P(\vec{a}+;\vec{b}+) \le P(\vec{a}+;\vec{c}+) + P(\vec{c}+;\vec{b}+). \tag{13} $$

This is Wigner’s form of Bell’s inequality.
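The counting argument can be spelled out in a short sketch. It lists the eight particle-1 types in the order of the population table, computes the three joint probabilities from a vector of populations, and reports the slack in Wigner’s inequality. The random populations and the names `TYPES`, `joint_probability` and `wigner_slack` are assumptions introduced only for illustration.

```python
import itertools
import random

# Particle-1 types for N_1..N_8 in table order: components along (a, b, c),
# +1 for aligned and -1 for anti-aligned.
TYPES = list(itertools.product([+1, -1], repeat=3))
AXIS = {"a": 0, "b": 1, "c": 2}

def joint_probability(populations, axis1, axis2):
    """Probability that particle 1 is + along axis1 and particle 2 is + along axis2.

    Particle 2 is + along axis2 exactly when particle 1 is - along axis2.
    """
    i, j = AXIS[axis1], AXIS[axis2]
    favourable = sum(n for n, t in zip(populations, TYPES)
                     if t[i] == +1 and t[j] == -1)
    return favourable / sum(populations)

def wigner_slack(populations):
    """P(a+;c+) + P(c+;b+) - P(a+;b+), which equals (N_2 + N_7) / sum(N_i)."""
    return (joint_probability(populations, "a", "c")
            + joint_probability(populations, "c", "b")
            - joint_probability(populations, "a", "b"))

rng = random.Random(0)
for _ in range(5):
    populations = [rng.randint(1, 1000) for _ in range(8)]
    print(populations, wigner_slack(populations) >= 0)
```

Whatever nonnegative populations are chosen, the slack is $(N_{2}+N_{7})/\sum_{i} N_{i}$, so no assignment of predetermined types can violate the inequality.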

Violations and geometry

As we mentioned before, we have used purely classical reasoning to derive the two forms of Bell’s inequality that we have encountered thus far. Recall that the context in which the above were derived was the Stern-Gerlach experiment, in which we measure spin along axes set by the magnetic field. There are angles between these various axes, and the quantum mechanically derived probabilities corresponding to (10), (11), and (12) are

$$ P(\vec{a}+;\vec{b}+) = \frac{1}{2}\sin^{2}\left(\frac{\theta_{ab}}{2}\right), $$

$$ P(\vec{a}+;\vec{c}+) = \frac{1}{2}\sin^{2}\left(\frac{\theta_{ac}}{2}\right), $$

$$ P(\vec{c}+;\vec{b}+) = \frac{1}{2}\sin^{2}\left(\frac{\theta_{cb}}{2}\right), $$

respectively. Bell’s inequality, (13), then becomes

$$ \frac{1}{2}\sin^{2}\left(\frac{\theta_{ab}}{2}\right) \le \frac{1}{2}\sin^{2}\left(\frac{\theta_{ac}}{2}\right) + \frac{1}{2}\sin^{2}\left(\frac{\theta_{cb}}{2}\right). \tag{14} $$
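As a brief reminder of where these probabilities come from, here is a sketch using standard spin-1/2 quantum mechanics (not spelled out in the text above). Observer A finds $\vec{a}+$ with probability $\frac{1}{2}$. Particle 2 is then spin-down along $\vec{a}$, that is, spin-up along $-\vec{a}$, which makes an angle $\pi-\theta_{ab}$ with $\vec{b}$. The probability of finding spin-up along an axis at angle $\phi$ from the spin direction is $\cos^{2}(\phi/2)$, so

$$ P(\vec{a}+;\vec{b}+) = \frac{1}{2}\cos^{2}\left(\frac{\pi-\theta_{ab}}{2}\right) = \frac{1}{2}\sin^{2}\left(\frac{\theta_{ab}}{2}\right), $$

and likewise for the other two pairs of axes.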

From a geometric point of view, this inequality cannot always hold. For example, suppose for simplicity that $\vec{a}$, $\vec{b}$, and $\vec{c}$ lie in a plane and that $\vec{c}$ bisects the angle between $\vec{a}$ and $\vec{b}$, i.e.

$$ \theta_{ab} = 2\theta \quad \text{and} \quad \theta_{ac}=\theta_{cb}=\theta. $$

Then (14) is violated for $0 \lt \theta \lt \frac{\pi}{2}$. For example, if $\theta = \frac{\pi}{4}$, (14) would become $0.250 \le 0.146$ (equivalently, $0.500 \le 0.292$ after multiplying through by 2), which is absurd!
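The violation can be checked numerically with a few lines. This is only a sketch for the bisecting configuration just described; the function names `qm_joint` and `wigner_gap` and the sample angles are hypothetical choices.

```python
import math

def qm_joint(theta):
    """Quantum probability (1/2) sin^2(theta/2) for axes separated by theta."""
    return 0.5 * math.sin(theta / 2) ** 2

def wigner_gap(theta):
    """Right side minus left side of the inequality when c bisects a and b.

    The left side uses the angle 2*theta, the right side uses theta twice.
    A negative value means the inequality is violated.
    """
    return 2 * qm_joint(theta) - qm_joint(2 * theta)

for degrees in (30, 45, 60, 120):
    theta = math.radians(degrees)
    left = qm_joint(2 * theta)
    right = 2 * qm_joint(theta)
    print(degrees, round(left, 3), round(right, 3), "violated:", wigner_gap(theta) < 0)
```

For $\theta = \pi/4$ this reproduces the comparison of $0.250$ with $0.146$, while for $\theta$ beyond $\pi/2$ (such as $120^\circ$) the inequality is satisfied again.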

Other theorems about the foundations and interpretation of quantum mechanics include:


The original paper outlining Bell's theorem is

A detailed discussion is found here:

  • Stanford Encyclopedia of Philosophy, Bell’s theorem (url)

Last revised on September 8, 2022 at 11:44:02.