Lagrange multiplier


Idea following Loomis-Sternberg

$X$ – a finite dimensional real vector space

$U \subset X$ open

$F: U \to \mathbf{R}$ a differentiable function

$S \subset U$ a smooth submanifold, which can be represented as the zero set of a differentiable map $G: U \to Y$, where $Y$ is a real vector space, such that $d G_x$ is surjective for each $x \in S$.

We want to minimize $F(x)$ for $x \in S$. It will not work to set $d F_x = 0$ and solve for $x$, since a constrained extremum $x$ will in general not be a critical point of $F$. Lagrange multipliers are used to define another function $L$ such that solving $d L_x = 0$ gives the extrema of the constrained extremization problem.
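To see the failure concretely, consider (a hypothetical illustration, not from Loomis-Sternberg) $F(x, y) = x$ on the unit circle: $d F = (1, 0)$ vanishes nowhere on $\mathbf{R}^2$, yet the restriction of $F$ to the circle attains its minimum at $(-1, 0)$. A minimal numerical sketch with NumPy:

```python
import numpy as np

# F(x, y) = x restricted to the unit circle S = {x^2 + y^2 = 1}.
# dF = (1, 0) vanishes nowhere, yet F attains a minimum on S.
theta = np.linspace(0.0, 2.0 * np.pi, 100001)
points = np.stack([np.cos(theta), np.sin(theta)], axis=1)  # samples of S
F = points[:, 0]
minimizer = points[np.argmin(F)]   # numerically close to (-1, 0)
```

Unconstrained critical-point search finds nothing here; the Lagrange condition below recovers $(\pm 1, 0)$.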

Theorem (Loomis-Sternberg 3.12.2) Suppose $F$ has a maximum on $S$ at $x$. Then there is a functional $l \in Y^*$ such that $x$ is a critical point of the function $F - l \circ G$.

The proof uses the implicit function theorem and the usual extremization arguments.

To get to a more familiar form of Lagrange multipliers, one uses local coordinates $(x_1, \ldots, x_n)$ on $U$ and sets $Y = \mathbf{R}^m$, so that $G = (g^1, \ldots, g^m)$. Now $l: Y \to \mathbf{R}$ will be of the form $l(y_1, \ldots, y_m) = \sum_{i=1}^m \lambda_i y_i$, hence $F - l \circ G = F - \sum_{i=1}^m \lambda_i g^i$, and $d(F - l \circ G) = 0$ gives

$$\frac{\partial F}{\partial x_j} - \sum_{i=1}^m \lambda_i \frac{\partial g^i}{\partial x_j} = 0, \qquad j = 1, \ldots, n.$$

These are $n$ equations, which together with the $m$ equations $G = (g^1, \ldots, g^m) = 0$ defining $S$ give $m+n$ equations for the $m+n$ unknowns $x_1, \ldots, x_n, \lambda_1, \ldots, \lambda_m$. The last $m$ variables are the Lagrange multipliers.
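As a worked illustration of this system (the choices $F = x + y$ and $g = x^2 + y^2 - 1$ are hypothetical, not from the text above), the $m + n$ equations can be solved symbolically, here with SymPy for $n = 2$, $m = 1$:

```python
import sympy as sp

x, y, lam = sp.symbols('x y lambda', real=True)
F = x + y                    # function to extremize, n = 2 variables
g = x**2 + y**2 - 1          # one constraint, m = 1: S is the unit circle

# The n equations dF/dx_j - lambda * dg/dx_j = 0, together with g = 0:
eqs = [sp.diff(F - lam * g, v) for v in (x, y)] + [g]
solutions = sp.solve(eqs, [x, y, lam], dict=True)
```

One expects the two constrained extrema at $(\pm\sqrt{2}/2, \pm\sqrt{2}/2)$ with multipliers $\lambda = \pm\sqrt{2}/2$, where $F = \pm\sqrt{2}$.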


To spectral theory

The method of Lagrange multipliers affords an elementary proof of the spectral theorem for finite-dimensional real vector spaces, one which does not involve passage to the complex numbers and the fundamental theorem of algebra.


Theorem Let $A$ be a real symmetric $n \times n$ matrix. Then $A$ is diagonalizable over the real numbers.


Consider the problem of maximizing the function $f(x) = \langle x | A | x \rangle$, where $x \in \mathbb{R}^n$ is subject to the constraint $\langle x | x \rangle = 1$. (Such an extreme point exists, say by compactness.) By the symmetry of $A$, the gradient of $f$ is easily calculated to be $\nabla f(x) = 2 A x$, whereas the gradient of the constraint function $\langle x | x \rangle$ is $2 x$. At a point $x$ where the maximum is attained, we have $\nabla f(x) = 2 A x = \lambda (2 x)$ for some Lagrange multiplier $\lambda$. Thus $x$ is an eigenvector of $A$ with eigenvalue $\lambda$. The usual arguments show that $A$ restricts to a self-adjoint operator on the hyperplane orthogonal to $x$; by picking an orthonormal basis of this hyperplane, we may represent this restriction of $A$ by a real symmetric $(n-1) \times (n-1)$ matrix, and the argument repeats.
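The maximize-then-deflate argument can be sketched numerically. This is an illustrative sketch, not the proof itself: power iteration on a shifted matrix stands in for the compactness argument that produces each maximizer, and the function name and parameters are hypothetical.

```python
import numpy as np

def diagonalize_symmetric(A, iters=500, seed=0):
    """Diagonalize a real symmetric matrix by repeatedly finding a
    maximizer of <x|A|x> on the unit sphere of the current subspace
    (each maximizer is an eigenvector, its Rayleigh quotient the
    Lagrange multiplier / eigenvalue), then deflating to the
    orthogonal hyperplane."""
    n = A.shape[0]
    rng = np.random.default_rng(seed)
    # Shift so all eigenvalues of B are positive; the maximizer of
    # <x|B|x> on the sphere is then reachable by power iteration.
    B = A + (np.abs(A).sum() + 1.0) * np.eye(n)
    P = np.eye(n)                      # projector onto remaining subspace
    vals, vecs = [], []
    for _ in range(n):
        x = P @ rng.standard_normal(n) # start inside the subspace
        for _ in range(iters):
            x = P @ (B @ x)            # power step, projected back
            x /= np.linalg.norm(x)
        vals.append(x @ A @ x)         # Lagrange multiplier = eigenvalue
        vecs.append(x)
        P -= np.outer(x, x)            # deflate: pass to hyperplane x^perp
    return np.array(vals), np.array(vecs).T
```

Each pass mirrors one step of the proof: an extremum of the Rayleigh quotient on the sphere of the current subspace, followed by restriction to the orthogonal hyperplane.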


Last revised on October 12, 2017 at 18:15:55.