nLab
Gram-Schmidt process

Contents

Idea

The Gram–Schmidt process is an algorithm which takes as input an ordered basis? of an inner product space and produces as output an ordered orthonormal basis?.

In terms of matrices?, the Gram–Schmidt process is a procedure of factorization of a invertible matrix M in the general linear group GL n() (or GL n()) as a product M=UT where T is an upper triangular matrix and U is an orthonormal (or unitary) matrix; as such it is a special case of the more general Iwasawa decomposition? for a (connected) semisimple Lie group. Since the factorization depends smoothly on the parameters, the Gram–Schmidt procedure enables the reduction of the structure group of an inner product bundle (e.g., the tangent bundle of a Riemannian manifold or a Kähler manifold) from GL n to orthogonal group O n (or the unitary group U n).

Gram–Schmidt process on Hilbert spaces

In this section, “basis” is understood to signify an ordered independent set whose linear span is dense in a Hilbert space H seen as a metric space. We will describe the Gram–Schmidt process as applied to a countable basis v 1,v 2, of an infinite-dimensional separable Hilbert space, although it would apply just as well to a well-ordered basis of any Hilbert space, separable or not.

The orthonormal basis u 1,u 2, produced as output is defined recursively as follows. We denote the normalization v/v of a vector vH by N(v). Put

u n=N(v n k<nv n,u ku k)u_n = N(v_n - \sum_{k \lt n} \langle v_n, u_k\rangle u_k)

where the sum on the right is understood to be empty (equal to 0) in the case n=1. A simple inductive argument shows that the u i are unit vectors orthogonal to each other, and that the span of u 1,,u n is equal to the span of v 1,,v n. Therefore u 1,u 2, is an orthonormal basis of H.

Example of Legendre polynomials

A classic illustration of Gram–Schmidt is the production of the Legendre polynomials.

Let H be the Hilbert space H=L 2([1,1]), equipped with the standard inner product defined by

f,g= 1 1f(x)¯g(x)dx\langle f, g\rangle = \int_{-1}^1 \bar{f(x)} g(x) d x

By the Stone-Weierstrass theorem, the space of polynomials [x] is dense in H according to its standard inclusion, and so the polynomials 1,x,x 2, form an ordered basis of H.

Applying the Gram–Schmidt process, one readily computes the first few orthonormal functions:

u 1(x)=N(1)=1/2u_1(x) = N(1) = 1/2
u 2(x)=N(x0)=3/2xu_2(x) = N(x - 0) = \sqrt{3/2} x
u 3(x)=N(x 2x 2,1/21/20)=N(x 21/3)=35/2/2(x 21/3)u_3(x) = N(x^2 - \langle x^2, 1/2\rangle 1/2 - 0) = N(x^2 - 1/3) = 3\sqrt{5/2}/2(x^2 - 1/3)

The classical Legendre polynomials P n(x) are scalar multiplies of the functions u n, adjusted so that P n(1)=1; they satisfy the orthogonality relations

P n,P m=22n+1δ m,n\langle P_n, P_m\rangle = \frac{2}{2n + 1}\delta_{m, n}

where δ m,n is the Kronecker delta?.

Application to non-bases

If we apply the Gram–Schmidt process to a well-ordered independent set whose closed linear span S is not all of H, we still get an orthonormal basis of the subspace S. If we apply the Gram–Schmidt process to a dependent set, then we will eventually run into a vector v whose norm is zero, so we will not be able to take N(v). In that case, however, we can simply remove v from the set and continue; then we will still get an orthonormal basis of the closed linear span. (This conclusion is not generally valid in constructive mathematics, since it relies on excluded middle applied to the statement that v0. However, it does work to discrete fields, such as the algebraic closure of the rationals, as seen in elementary undergraduate linear algebra.)

Categorified Gram–Schmidt process

Many aspects of the Gram–Schmidt process can be categorified so as to apply to 2-Hilbert spaces. We will illustrate the basic idea with an example that was suggested to us by James Dolan.

Consider the category of complex representations of the symmetric group S n. (As a running example, we consider S 4; up to isomorphism, there are five irreducible representations

U (4),U (31),U (22),U (211),U (1111)U_{(4)}, \, U_{(3 1)}, \, U_{(2 2)}, \, U_{(2 1 1)}, \, U_{(1 1 1 1)}

classified by the five Young diagrams of size 4. To save space, we denote these as U 1, U 2, U 3, U 4, U 5.) The irreducible representations U i of S n form a 2-orthonormal basis in the sense that any two of them U i,U j satisfy the relation

hom(U i,U j)δ ijhom(U_i, U_j) \cong \delta_{i j} \cdot \mathbb{C}

(where n indicates a direct sum of n copies of ). In fact, the irreducible representations are uniquely determined up to isomorphism by these relations.

There is however another way of associating representations to partitions or Young diagrams. Namely, consider the subgroup of permutations which take each row of a Young diagram or Young tableau of size n to itself; this forms a parabolic subgroup? of S n, conjugate to one of type P (n 1n k)=S n 1××S n k where n i is the length of the i th row of the Young diagram. The group S n acts transitively on the orbit space of cosets

S n/P (n 1n k)S_n/P_{(n_1 \ldots n_k)}

and these actions give permutation representations of S n. Equivalently, these are representations V i which are induced from the trivial representation along inclusions of parabolic subgroups. We claim that these representations form a -basis of the representation ring, and we may calculate their characters using a categorified Gram–Schmidt process.

Given two such parabolic subgroups P, Q in G=S n, the 2-inner product

hom G([G/P],[G/Q])hom_G(\mathbb{C}[G/P], \mathbb{C}[G/Q])

may be identified with the free vector space on the set of double cosets P\G/Q. One may count the number of double cosets by hand in a simple case like G=S 4. That is, for the 5 representations V 1,,V 5 induced from the 5 parabolic subgroups P i corresponding to the 5 Young diagrams listed above, the dimensions of the 2-inner products hom(V i,V j) are the sizes of the corresponding double coset spaces P i\S 4/P j. These numbers form a matrix as follows (following the order of the 5 partitions listed above):

(1 1 1 1 1 1 2 2 3 4 1 2 3 4 6 1 3 4 7 12 1 4 6 12 24)\left( \array {1 & 1 & 1 & 1 & 1 \\ 1 & 2 & 2 & 3 & 4 \\ 1 & 2 & 3 & 4 & 6 \\ 1 & 3 & 4 & 7 & 12 \\ 1 & 4 & 6 & 12 & 24 }\right)

To reiterate: this matrix is the decategorification (a matrix of dimensions) of a matrix of 2-inner products where the (ij)-entry is of the form

hom G(V i,V j)V i * GV jhom_G(V_i, V_j) \cong V_i^* \otimes_G V_j

where the V i are induced from inclusions of parabolic subgroups. The V i are -linear combinations of irreducible representations U i which form a 2-orthonormal basis, and we may perform a series of elementary row operations which convert this matrix into an upper triangular matrix, and which will turn out to be the decategorified form of the 2-matrix with entries

hom G(U i,V j)U i * GV jhom_G(U_i, V_j) \cong U_i^* \otimes_G V_j

where U i is the irreducible corresponding to the ith Young diagram (as listed above). The upper triangular matrix is

(1 1 1 1 1 0 1 1 2 3 0 0 1 1 2 0 0 0 1 3 0 0 0 0 1)\left( \array {1 & 1 & 1 & 1 & 1 \\ 0 & 1 & 1 & 2 & 3 \\ 0 & 0 & 1 & 1 & 2 \\ 0 & 0 & 0 & 1 & 3 \\ 0 & 0 & 0 & 0 & 1} \right)

and we read off from the columns the following decompositions into irreducible components:

V 1U 1V_1 \cong U_1
V 2U 1+U 2V_2 \cong U_1 + U_2
V 3U 1+U 2+U 3V_3 \cong U_1 + U_2 + U_3
V 4U 1+2U 2+U 3+U 4V_4 \cong U_1 + 2 U_2 + U_3 + U_4
V 5U 1+3U 2+2U 3+3U 4+U 5V_5 \cong U_1 + 3 U_2 + 2 U_3 + 3 U_4 + U_5

The last representation V 5 is the regular representation of S 4 (because the parabolic subgroup is trivial). Since we know from general theory that the multiplicity of the irreducible U i in the regular representation is its dimension, we get as a by-product the dimensions of the U i from the expression for V 5:

dim(U 1)=1,dim(U 2)=3,dim(U 3)=2,dim(U 4)=3,dim(U 5)=1dim(U_1) = 1, \, dim(U_2) = 3, \, dim(U_3) = 2, \, dim(U_4) = 3, \, dim(U_5) = 1

(the first of the U i is the trivial representation, and the last U 5 is the alternating representation).

The row operations themselves can be assembled as the lower triangular matrix

(1 0 0 0 0 1 1 0 0 0 0 1 1 0 0 1 1 1 1 0 2 1 2 0 1)\left( \array{1 & 0 & 0 & 0 & 0 \\ -1 & 1 & 0 & 0 & 0 \\ 0 & -1 & 1 & 0 & 0 \\ 1 & -1 & -1 & 1 & 0 \\ 2 & -1 & -2 & 0 & 1 } \right)

and from the rows we read off the irreducible representations as “virtual” (i.e., -linear) combinations of the parabolically induced representations V i:

U 1V 1U_1 \cong V_1
U 2V 1+V 2U_2 \cong -V_1 + V_2
U 3V 2+V 3U_3 \cong -V_2 + V_3
U 4V 1V 2V 3+V 4U_4 \cong V_1 - V_2 - V_3 + V_4
U 52V 1V 22V 3+V 5U_5 \cong 2 V_1 - V_2 - 2 V_3 + V_5

which can be considered the result of the categorified Gram–Schmidt process.

It follows from these representations that the V i form a -linear basis of the representation ring Rep(S 4). Analogous statements hold for each symmetric group S n.