nLab Hölder's inequality




Hölder’s inequality is a basic inequality in analysis, which can be interpreted as saying that the $\mathbf{C}$-graded *-algebra of L^p-spaces

$$L^{1/p}(X,\mu)$$

(the unfortunate reciprocal in the grading is explained in the article Lebesgue space) is a $\mathbf{C}$-graded normed *-algebra. That is, the canonical bilinear multiplication map

$$L^{1/p}(X,\mu)\otimes L^{1/q}(X,\mu)\to L^{1/(p+q)}(X,\mu)$$

is a contractive map, i.e., if $f\in L^{1/p}(X,\mu)$ and $g\in L^{1/q}(X,\mu)$, then

$$f g \in L^{1/(p+q)}(X,\mu)$$

and

$$\|f g \|\le \|f\|\cdot\|g\|,$$

where the norms are the p-norms for the corresponding values of $p$.

As a consequence, it implies that the canonical pairing

$$L^{1/p}(X,\mu)\otimes L^{1/(1-p)}(X,\mu)\to L^1(X,\mu)\to \mathbf{C}$$

exhibits $L^{1/(1-p)}(X,\mu)$ as the Banach space dual of the Banach space $L^{1/p}(X,\mu)$.


Let $(X, \mu)$ be a measure space, and for $p \in \mathbf{C}$ a complex number with a nonnegative real part let $L^{1/p}$ denote $L^{1/p}(X, \mu)$, the Banach space of complex-valued functions on $X$ with finite 1/p-norm modulo equality almost everywhere.

Suppose $p,q,r\in \mathbf{C}$ have nonnegative real parts and $p+q=r$. Then Hölder’s inequality states that for any $f \in L^{1/p}$, $g \in L^{1/q}$ we have

$$f g \in L^{1/r}(X,\mu)$$

and

$$\|f g\|_{1/r}\le\|f\|_{1/p}\cdot \|g\|_{1/q}.$$


In particular, if $r=1$ we have

$$\int_X \left| f g \right| \leq {\|f\|_{1/p}} {\|g\|_{1/q}}$$

(in particular, $f g$ is an $L^1$ function).
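
The case $r = 1$ can be checked numerically on a finite measure space. The following is a minimal sketch, not part of the original argument: it assumes a finite set with positive point weights standing in for $\mu$, and it uses the conventional exponents $P = 1/p$ and $Q = 1/q$ with $\frac1{P} + \frac1{Q} = 1$. The helper names `p_norm` and `holder_gap` are hypothetical.

```python
import random

def p_norm(values, weights, p):
    """Weighted p-norm (sum_i w_i |v_i|^p)^(1/p) for a finite exponent p >= 1."""
    return sum(w * abs(v) ** p for v, w in zip(values, weights)) ** (1.0 / p)

def holder_gap(f, g, weights, p):
    """Return ||f||_p * ||g||_q - integral of |f g|, where q = p / (p - 1)."""
    q = p / (p - 1.0)
    lhs = sum(w * abs(a * b) for a, b, w in zip(f, g, weights))
    rhs = p_norm(f, weights, p) * p_norm(g, weights, q)
    return rhs - lhs

random.seed(0)
n = 50
weights = [random.uniform(0.1, 2.0) for _ in range(n)]  # assumed measure of each point
f = [random.gauss(0, 1) for _ in range(n)]
g = [random.gauss(0, 1) for _ in range(n)]

# Hölder's inequality predicts a nonnegative gap for every conjugate pair.
gaps = [holder_gap(f, g, weights, p) for p in (1.5, 2.0, 3.0, 7.0)]
print(gaps)
```

For $P = Q = 2$ the check reduces to the Cauchy-Schwarz inequality.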

The “nPOV” meaning is this: in this situation there is a canonical pairing $\langle -, - \rangle$ between $L^{1/p}$ and $L^{1/q}$,

$$L^{1/p} \times L^{1/q} \to \mathbb{C}: (f, g) \mapsto \langle f, g \rangle \coloneqq \int_X f \cdot g,$$

which gives a bounded linear map

$$L^{1/p} \otimes L^{1/q} \to \mathbb{C}$$

between Banach spaces. The point of Hölder’s inequality is that this pairing is a short map, i.e., a map of norm bounded above by $1$. In other words, this is a morphism in the symmetric monoidal closed category Ban consisting of Banach spaces and short linear maps between them. Accordingly, the map

$$L^{1/p} \otimes L^{1/q} \to \mathbb{C}$$

induces (by currying) a map from $L^{1/p}$ to the Banach dual of $L^{1/q} = L^{1/(1-p)}$:

$$L^{1/p} \to (L^{1/q})^\ast \coloneqq [L^{1/q}, \mathbb{C}]$$

(again a short map of course), and reciprocally a map $L^{1/q} \to (L^{1/p})^\ast$.

It is a short step to prove that in fact the norm of the pairing $L^{1/p} \otimes L^{1/q} \to \mathbb{C}$ is exactly $1$, and even better that the maps $L^{1/p} \to (L^{1/q})^\ast$ and $L^{1/q} \to (L^{1/p})^\ast$ are in fact isometric embeddings.

If the real parts of $p$ and $q$ are nonzero, then with a little more work (with the help of the Radon-Nikodym theorem), one sees these maps are surjective and thus isomorphisms in $Ban$.


It is true also that $(L^1)^\ast \cong L^\infty$, but it is not true that $(L^\infty)^\ast$ is isomorphic to $L^1$. Or, it is at least not true in ZFC, although it may be true in dream mathematics.

The noncommutative case

Remarkably, L^p-spaces can be defined for arbitrary von Neumann algebras (Haagerup, 1979) and the Hölder inequality continues to hold in this generality (Kosaki, 1984).

The $L^{1/p}$-spaces are now bimodules over the underlying von Neumann algebra.

If the real part of $p$ is zero and the von Neumann algebra is commutative, then the space $L^{1/p}$ is (noncanonically) isomorphic to $L^\infty$.

In the noncommutative case this is no longer true, which is the starting point of the Tomita–Takesaki theory.

Proof of Hölder’s inequality in the commutative case

The proof is remarkably simple. Here $p$ and $q$ denote the conventional exponents, i.e. the reciprocals of the grading parameters used above. First, if $p, q \gt 0$ and $\frac1{p} + \frac1{q} = 1$, then we have Young's inequality, viz. for $a, b \gt 0$

$$a b \;\leq\; \frac{a^p}{p} + \frac{b^q}{q} \,,$$

with equality precisely when $a^p = b^q$. This is quickly derived from the (strict) convexity of the exponential function, that $0 \leq t \leq 1$ implies

$$e^{t x + (1-t)y} \leq t e^x + (1-t)e^y \,,$$

where, for $0 \lt t \lt 1$, equality holds iff $e^x = e^y$. All one has to do is put $t = \frac{1}{p}$ and arrange $x, y$ so that $e^x = a^p$ and $e^y = b^q$.
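
Spelled out, the substitution reads as follows. This is only a worked version of the step just described; the explicit choices $x = p \log a$ and $y = q \log b$ are the ones forced by $e^x = a^p$ and $e^y = b^q$.

```latex
% Assumption: a, b > 0 and 1/p + 1/q = 1, so that t = 1/p gives 1 - t = 1/q.
\begin{aligned}
x &= p \log a, \qquad y = q \log b, \qquad t = \tfrac{1}{p}, \\
a b &= e^{\log a + \log b} = e^{t x + (1 - t) y} \\
    &\leq t e^{x} + (1 - t) e^{y} = \frac{a^p}{p} + \frac{b^q}{q}.
\end{aligned}
```

For instance, $p = q = 2$ recovers the familiar $a b \leq \frac{a^2 + b^2}{2}$, with equality exactly when $a = b$.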

Then, to prove ${|\langle f, g \rangle|} \leq {\|f\|_p} {\|g\|_q}$, we may assume $f, g$ nonzero (so their norms are positive) and normalize them to unit vectors $u = f/{\|f\|_p}$, $v = g/{\|g\|_q}$, so that now the object is to prove

$$\int_X {|u|} \cdot {|v|} \leq 1.$$

But since we are dealing with unit vectors, we have $\int_X {|u|^p} = 1$ and $\int_X {|v|^q} = 1$, and now what we want follows straightaway from Young’s inequality applied to integrands:

$$\int_X {|u|} \cdot {|v|} \leq \int_X \frac{|u|^p}{p} + \frac{|v|^q}{q} = \frac1{p} + \frac1{q} = 1$$

and so the proof of Hölder’s inequality is complete.

To prove that the norm of the pairing $\langle -, - \rangle$ is exactly $1$ (is not less than $1$), it’s enough to take any $u \in L^p$ of norm $1$, so $f = {|u|}$ is a nonnegative function of norm $1$, and then put $g = f^{p-1} = f^{p/q}$. We then have $f^p = g^q$ (almost) everywhere, so that $g$ has norm $1$ in $L^q$ and $f g = \frac{f^p}{p} + \frac{g^q}{q}$, and now

$$\int_X f g = \int_X \frac{f^p}{p} + \frac{g^q}{q} = \frac1{p} + \frac1{q} = 1.$$

Actually these calculations do a little better: they show that upon currying, the map

$$\begin{array}{ccc} L^p &\longrightarrow& [L^q, \mathbb{C}] \\ f &\mapsto& \lambda g. \langle f, g \rangle \end{array}$$

preserves the norm, so that $L^p$ isometrically embeds into $(L^q)^\ast$.
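
The extremal choice of $g$ can be illustrated on a finite weighted set. This is a sketch under assumptions not in the text: real-valued $f$, $1 \lt p \lt \infty$, positive point weights in place of $\mu$, and a sign factor added so that $f$ need not be nonnegative. The function names are hypothetical.

```python
def p_norm(values, weights, p):
    """Weighted p-norm (sum_i w_i |v_i|^p)^(1/p)."""
    return sum(w * abs(v) ** p for v, w in zip(values, weights)) ** (1.0 / p)

def extremal_partner(f, weights, p):
    """Return g = sign(f) |f|^(p-1) / ||f||_p^(p-1), for nonzero f and 1 < p < inf."""
    scale = p_norm(f, weights, p) ** (p - 1.0)
    return [(1.0 if v >= 0 else -1.0) * abs(v) ** (p - 1.0) / scale for v in f]

weights = [0.5, 1.0, 2.0, 0.25]   # assumed measure of each point
f = [3.0, -1.0, 0.5, 2.0]
p = 3.0
q = p / (p - 1.0)

g = extremal_partner(f, weights, p)
pairing = sum(w * a * b for a, b, w in zip(f, g, weights))

# g is a unit vector in L^q, and pairing f against it recovers ||f||_p.
print(p_norm(g, weights, q), pairing, p_norm(f, weights, p))
```

So the functional $g \mapsto \langle f, g \rangle$ attains the value ${\|f\|_p}$ on the unit ball of $L^q$, which is the isometry statement in this finite setting.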


Relation to Minkowski’s inequality

Recall that Minkowski's inequality is just the triangle inequality in the context of L-p space. There is a well-known trick, covered in just about every functional analysis text, that allows one to deduce Minkowski’s inequality as a corollary of Hölder’s inequality. You can look it up for instance in the English Wikipedia.

What seemingly most such presentations lack is motivation for the trick, so let us try to say something about this.

First, Minkowski's inequality can be restated as asserting the convexity of the unit ball $B = \{f \in L^p: {\|f\|_p} \leq 1\}$ of $L^p$. If we place ourselves for a moment in the context of $L^p$ real-valued functions, then it suffices to show that $B$ is the intersection of a collection of affine half-spaces, say $H_\lambda = \{f \in L^p: \lambda(f) \leq 1\}$ where $\lambda: L^p \to \mathbb{R}$ is a (continuous) linear functional. But with hindsight into the meaning of Hölder’s inequality, seen as paving the way to characterizing linear functionals on $L^p$ as those of the form $\lambda(g) = (f \mapsto \langle f, g \rangle)$ for some $g \in L^q$, it’s only natural to see whether we can find a sufficiently large collection $B'$ of such $g$ such that $B = \bigcap_{g \in B'} H_{\lambda(g)}$, and in fact the intuition is that the unit ball $B'$ in $L^q$ ought to work.

Thus the idea is clear, and it’s just a matter of technique from here. We let the relation ${|\langle f, g \rangle|} \leq 1$ on $L^p \times L^q$ set up a Galois connection between subsets of $L^p$ and subsets of $L^q$. The connection takes the unit ball $B'$ in $L^q$ to

$$(B')^\perp \coloneqq \big\{ f \in L^p \colon (\forall_{g \in B'}) \; {|\langle f, g \rangle|} \leq 1 \big\} \,,$$

which is clearly convex, being an intersection of convex sets $\{f: {|\langle f, g \rangle|} \leq 1\}$, one for each $g \in B'$. Hölder’s inequality itself just asserts the containment $B \subseteq (B')^\perp$. If we show the other inclusion $(B')^\perp \subseteq B$, then $B = (B')^\perp$ is convex. So we want to show that if ${|\langle f, g \rangle|} \leq 1$ whenever ${\|g\|_q} \leq 1$, then ${\|f\|_p} \leq 1$. But we already did that calculation when we proved $L^p \hookrightarrow (L^q)^\ast$ is an isometry. Explicitly: take $h = {|f|^p}/f$ (with $h = 0$ where $f = 0$). Then ${|h|} = {|f|^{p-1}} = {|f|^{p/q}}$, so ${|h|^q} = {|f|^p}$ whence ${\|h\|_q^q} = {\|f\|_p^p}$. Put $g = \frac{h}{{\|h\|_q}}$; since ${\|g\|_q} \leq 1$, it follows by the hypothesis on $f$ that $1 \geq {|\langle f, g \rangle|}$. But this gives

$$1 \geq \frac1{{\|h\|_q}} \int_X f h = \frac1{{\|f\|_p^{p/q}}} \int_X {|f|^p} = {\|f\|_p^{p - p/q}} = {\|f\|_p}$$

as was to be shown.

The standard derivation of Minkowski’s inequality from Hölder’s inequality is nothing more than a very tidied-up rendering of this argument, but without the additional conceptual explanation given here.
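
For comparison, the standard textbook derivation can be written out as follows. It is a sketch of the usual argument rather than a quotation from any particular source; the function $|f+g|^{p-1}$ plays the role of the test function $h$ above.

```latex
% Assumptions: 1 < p < \infty, 1/p + 1/q = 1 (so (p-1) q = p),
% and 0 < \|f+g\|_p < \infty (finiteness holds because L^p is a vector space).
\begin{aligned}
\|f+g\|_p^p &= \int_X |f+g| \, |f+g|^{p-1}
  \leq \int_X |f| \, |f+g|^{p-1} + \int_X |g| \, |f+g|^{p-1} \\
 &\leq \left( \|f\|_p + \|g\|_p \right) \left\| \, |f+g|^{p-1} \right\|_q
  = \left( \|f\|_p + \|g\|_p \right) \|f+g\|_p^{p/q}, \\
\|f+g\|_p &= \|f+g\|_p^{p - p/q} \leq \|f\|_p + \|g\|_p .
\end{aligned}
```

The second line is Hölder’s inequality applied twice, and the last line divides by $\|f+g\|_p^{p/q}$ and uses $p - p/q = 1$.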

Relation to Log-convex functions

Let $D$ be a convex space, e.g. an affine space. We say that a function $f: D \to (0, \infty)$ is log-convex if its logarithm $\log(f)$ is a convex function.

Hölder’s inequality is closely related to the notion of log-convexity. On the one hand, we saw that the inequality follows from the convexity of the exponential function, which is the most basic log-convex function of all. On another hand, we have the following result which uses Hölder’s inequality.


The collection of log-convex positive functions on a convex domain $D$ is closed under pointwise multiplication, pointwise addition, and pointwise sups.


The statement for multiplication is clear since $\log(f \cdot g) = \log(f) + \log(g)$ and any sum of convex functions is convex.

Similarly, $\log: (0,\infty) \to \mathbb{R}$ is an isomorphism of partially ordered sets and so $\log (\sup\{f_i\}) = \sup\{\log(f_i)\}$. It thus suffices to show that if $f_i$ is a collection of convex functions on $D$, then so is $\sup\{f_i\}$. For $x, y \in D$ and $a, b \geq 0$ such that $a + b = 1$, we must show

$$\sup\{f_i\}(a x + b y) \leq a \sup\{f_i\}(x) + b\sup\{f_i\}(y);$$

letting $c$ denote the right side, this holds iff $f_i(a x + b y) \leq c$ for all $i$ (by definition of $\sup$). But

$$\begin{array}{llll} f_i(a x + b y) & \leq & a f_i(x) + b f_i(y) & \text{since } f_i \text{ is convex} \\ & \leq & a\sup\{f_i\}(x) + b\sup\{f_i\}(y). & \end{array}$$

Finally, for the sum $f + g$, in order to show $\log(f + g)$ is convex, it suffices to show that

$$(f + g)\left(\frac1{p}x + \frac1{q}y\right) \leq (f+g)(x)^{\frac1{p}} (f+g)(y)^{\frac1{q}} \qquad (1)$$

for $p, q \gt 1$ such that $\frac1{p} + \frac1{q} = 1$. But setting

$$s = f(x)^{\frac1{p}}, \qquad t = g(x)^{\frac1{p}}, \qquad u = f(y)^{\frac1{q}}, \qquad v = g(y)^{\frac1{q}},$$

the right side of (1) is $(s^p + t^p)^{\frac1{p}} \cdot (u^q + v^q)^{\frac1{q}}$. By Hölder’s inequality, this is greater than or equal to

$$\begin{array}{lll} s u + t v & = & f(x)^{\frac1{p}} f(y)^{\frac1{q}} + g(x)^{\frac1{p}} g(y)^{\frac1{q}} \\ & \geq & f\left(\frac1{p} x + \frac1{q} y\right) + g\left(\frac1{p} x + \frac1{q} y\right) \end{array}$$

where the last inequality is by log-convexity of $f$ and $g$.
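
The closure under sums and products can be spot-checked numerically. The sketch below is illustrative only: the two sample functions $e^{x^2}$ and $e^{-3x}$ on the interval $[-2, 2]$ are assumptions chosen for the example, and `log_convexity_gap` is a hypothetical helper.

```python
import math
import random

def f(x):
    return math.exp(x * x)      # log f(x) = x^2 is convex

def g(x):
    return math.exp(-3.0 * x)   # log g(x) = -3x is linear, hence convex

def total(x):
    return f(x) + g(x)          # pointwise sum

def product(x):
    return f(x) * g(x)          # pointwise product

def log_convexity_gap(func, x, y, t):
    """t*log func(x) + (1-t)*log func(y) - log func(t*x + (1-t)*y).

    The gap is nonnegative for all x, y, t exactly when func is log-convex.
    """
    mid = t * x + (1.0 - t) * y
    return (t * math.log(func(x)) + (1.0 - t) * math.log(func(y))
            - math.log(func(mid)))

random.seed(1)
samples = [(random.uniform(-2, 2), random.uniform(-2, 2), random.random())
           for _ in range(1000)]
sum_gaps = [log_convexity_gap(total, x, y, t) for x, y, t in samples]
product_gaps = [log_convexity_gap(product, x, y, t) for x, y, t in samples]
print(min(sum_gaps), min(product_gaps))
```

A random sample cannot prove convexity, of course; it only fails to find a counterexample, which is what the theorem predicts.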

This last theorem enters into the account of Artin (1931) of the basic theory of the Gamma function:


The Gamma function

$$\Gamma(x) = \int_0^\infty t^x e^{-t} \frac{d t}{t}$$

is log-convex over the domain $x \gt 0$.


The function $x \mapsto t^{x-1}$ is log-linear, hence log-convex. Hence the integral defining $\Gamma(x)$ over $x \gt 0$ is a sup over suitable Riemann sums that are positive-linear combinations of the form

$$\sum_{i=1}^n t_i^{x-1} e^{-t_i} \Delta t_i$$

and these, together with their sup, are log-convex by the Theorem.
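
As a numerical sanity check of this log-convexity, one can look at second differences of $\log \Gamma$ on a grid. This is a sketch, not part of Artin's argument; it relies on Python's standard `math.lgamma`, and the grid and step size are arbitrary choices.

```python
import math

h = 0.05                                  # step for the central second difference
xs = [0.1 * k for k in range(1, 200)]     # grid in (0, 20); x - h stays positive

# A convex function has nonnegative second differences
# log G(x - h) - 2 log G(x) + log G(x + h) at every grid point.
second_diffs = [math.lgamma(x - h) - 2.0 * math.lgamma(x) + math.lgamma(x + h)
                for x in xs]
print(min(second_diffs))
```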

The main fact underlying Artin’s 1931 discussion is the Bohr-Mollerup theorem (Artin (1931), Thm. 2.1):


The Gamma function is characterized as the unique function $\Gamma: \{x \in \mathbb{R}|\; -x \notin \mathbb{N}\} \to \mathbb{R}$ satisfying the following conditions:

  • $\Gamma(1) = 1$,

  • $\Gamma(x+1) = x\Gamma(x)$,

  • $\Gamma$ is log-convex over $(0, \infty)$.
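
The usual proof of this characterization squeezes $\Gamma(x)$, using the three conditions, to the limit $\lim_{n \to \infty} \frac{n! \, n^x}{x (x+1) \cdots (x+n)}$. The sketch below evaluates that expression for a large $n$ and compares it with `math.gamma`; the function name `gauss_product` and the choice of $n$ are assumptions made for illustration.

```python
import math

def gauss_product(x, n):
    """Evaluate n! * n^x / (x (x+1) ... (x+n)) in log space, for x > 0."""
    log_factorial = sum(math.log(k) for k in range(1, n + 1))
    log_denominator = sum(math.log(x + k) for k in range(n + 1))
    return math.exp(log_factorial + x * math.log(n) - log_denominator)

approx = gauss_product(0.5, 100000)
exact = math.gamma(0.5)   # equals sqrt(pi)
print(approx, exact)
```

Convergence is slow, with error of order $1/n$, so this is a consistency check rather than a practical way to compute $\Gamma$.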


  • Emil Artin, Einführung in die Theorie der Gammafunktion, Hamburger Mathematische Einzelschriften

    l. Heft, Verlag B. G. Teubner, Leipzig (1931)

    English translation by Michael Butler: The Gamma Function, Holt, Rinehart and Winston (1964) [pdf]

Last revised on December 27, 2022 at 21:48:59.