Basic structures
Generating functions
Proof techniques
Combinatorial identities
Polytopes
Catalan numbers are a specific sequence of nonnegative integers, here denoted , with a ubiquitous combinatorial interpretations. In terms of binomial coefficients they are given by
The Catalan numbers count a myriad of different families of objects, including:
nonisomorphic (rooted planar) binary trees with internal vertices;
ways of unambiguously defining the product of elements within an arbitrary (not necessarily associative) magma, or equivalently, ways of fully parenthesizing a string of letters;
elements of the free magma generated by a singleton (called words) that have instances of the letter ;
strings consisting of well-balanced pairs of parentheses (also known as “Dyck words”), or equivalently, surjective monotone functions such that for all (where we write for the finite ordinal );
monotonic lattice paths in an grid which do not cross below the diagonal, or equivalently, injective monotone functions such that for all (where we write for the non-empty finite ordinal );
monotone functions such that and for all , or in other words, 1-cells in (the sub-2-category of the simplex 2-category spanned by the first-element-preserving functions) which are pointed in the sense of being equipped with a 2-cell ;
“mountain ranges”, i.e., functions such that and for (so called because graphs of such depict mountain ranges);
“stay-ahead races”, i.e., functions such that for , and (see below for explanation of terminology);
and many, many more besides (see the Stanley references and OEIS A000108).
In this section we describe some “canonical” isomorphisms between various structures named above. Most of these can be seen as isomorphisms between various species of structures, which collectively might be called “the Catalan species”.
Each (finite) rooted planar binary tree with internal vertices (i.e., vertices that have a left child and a right child) either consists of only a root (the case ), or has a left subtree and a right subtree adjacent to the root. The associated magma word is defined recursively according to these cases:
in the case ;
[expressing the magma product by simple juxtaposition] in the case .
Put slightly differently, the collection of isomorphism classes of rooted planar binary trees carries a magma structure, defined by using the notation above, and we have the following result.
The mapping defines a bijection from the set of isomorphism clases of rooted planar binary trees with internal vertices to the set of magma words with instances of .
The inverse is given by the unique magma map that takes to the isomorphism class of trees consisting of just a root.
It is often suggestive to use exponential notation to denote the magma operation: . For each element of the free magma , there is thus an unary operation . If denotes the free monoid on the set , there is a unique monoid map
that carries to the unary operation . For , it is reasonable to denote the composite
by . Denote the monoid product in by simple juxtaposition. Under these conventions, we have for the exponential equation
If we regard the right-hand side of the exponential equation as a reduction of the left-hand side, then for any formal magma expression , either , or we may write and then reduce to .
Continuing recursively in this manner, every word can be eventually expressed as a “reduced form” (where no further reductions are possible), where we formally define a reduced form as follows:
The expression is a reduced form;
If are reduced forms, then so is .
The reader who is following where this is going will have observed that this recursive description of reduced forms matches precisely the recursive description of rooted planar trees, formalized by saying the set of isomorphism classes of rooted planar trees is the least fixed point, or initial algebra, of the endofunctor on .
To put it more visually: let us designate the term appearing as the base of a reduced form as the root of a corresponding tree. (If the reduced form is just , then that is designated the root of a tree consisting of only a root.) The reduced forms themselves have roots, and we draw an edge connecting those roots to the root appearing in the base. Continuing this pattern, we climb recursively up the stacked exponentials, eventually obtaining a structure of rooted planar tree from a reduced form, whose vertices are just the instances of appearing in the reduced form.
In fact, there is no harm in identifying rooted planar trees with reduced forms, and we will repeatedly use this identification below.
Letting denote the reduced form of a magma word , we thus have the following result.
The mapping defines a bijection from to , i.e., from magma words having instances of to isomorphism classes of rooted planar trees with edges.
The inverse takes a reduced form to the magma word defined recursively by
and .
To each rooted planar tree or reduced form with edges, we associate a mountain range as follows.
In the first place, there is a monoid structure on the set of mountain ranges, essentially given by juxtaposition. Formally, if and are mountain ranges, then their product
is defined by if , and if , noting that in either case by definition of mountain range ().
We define , from the free monoid on trees to mountain ranges, by the following recursive rules:
takes the value .
If have been defined for reduced forms , then
, as a function for some , in which case
takes for to .
Then define to be the mapping that takes a tree to .
The mapping defines a bijection from to , consisting of mountain ranges of the form .
The inverse mapping can be described as follows. For each mountain range , we may subdivide the interval into subintervals according to subdivision points where . If there is at least one internal subdivision point given by an such that , then the restriction of to the subintervals (before and after ) gives mountain ranges , which by strong induction on have assigned reduced trees , and then we put . If there are no internal subdivision points, then on the subinterval . Notice moreover that , using the mountain range conditions and . This means the function defines a mountain range over the subinterval ; by induction this has an assigned reduced tree , and then we define . This completes the description of the inverse mapping.
The terminology “stay-ahead” race is suggested by the following scenario: a poll race is held between a proponent P and an opponent O. A vote for P is indicated by the value , a vote for O by . Votes are counted one after another (tracked by a function ) and partial tallies are recorded. The stay-ahead conditions are that each partial tally has P ahead of O (i.e., for each ), and that P wins in the end by just one vote: .
To each stay-ahead race we associate a mountain range by defining .
The mapping defines a bijection from stay-ahead races to mountain ranges .
The inverse mapping takes a mountain range to the stay-ahead race defined by the function for , making use of a nonce convention .
A mountain range (or more precisely, the graph of a mountain range) can easily be converted, by a linear transformation, to an injective poset mapping satisfying the diagonal conditions , , and .
To wit: we remark that for a mountain range , each value has the same parity as , and for all . (Both are simple consequences of the mountain range conditions.) Define
,
.
Denote the poset map thus defined by . Let denote the set of monotonic lattice paths satisfying the diagonal conditions.
The mapping defines a bijection .
The inverse mapping takes an element to the mountain range defined by . The injectivity of forces each step of the path to go just one step (or block) north or just one block east, corresponding to the fact that the difference between and always lies in .
Each injective map determines and is determined by a map defined as follows: if , then . (The injectivity of implies that the lattice path crosses each line .)
The mapping defines a bijection from to the set consisting of maps such that and for all .
Indeed, the pointed inflationary conditions on are direct consequences of the diagonal conditions on . The inverse mapping takes such a function to the lattice path that proceeds north and east, interpolating through the points
The bijections of the last section indicate that all of the structures named above are counted by the same quantity . There are several ways of establishing the binomial expressions for mentioned above.
A tried-and-true approach (explained in Baez-Dolan) is by means of generating functions. To enumerate rooted planar binary trees with internal indices, use the fact that there is just one such tree where (i.e., ), and that in the case , each tree is determined by a pair consisting of a left tree and right tree adjacent to the root; if has internal vertices and has internal vertices, then . Hence for we have the recursive equation
Letting be the generating function (really a formal power series), the recursion leads to
which we solve as in the formal power series sense. Starting with the binomial theorem to expand as a power series, a series of (somewhat tedious and unilluminating) manipulations eventually leads to the calculation
A more structural derivation can be obtained by counting stay-ahead races, in the following manner. Let be the set of functions for which . Each such takes the value a total of times, and the value a total of times, so the cardinality of is .
We let the cyclic group act on , by precomposing functions with powers of the cyclic permutation
The number of stay-ahead races in is seen to be , as soon as we establish the following result.
Each orbit of the cyclic group action on contains exactly one stay-ahead race, and has exactly elements.
Given , form the unique extension that is periodic with period . Let be the unique function such that and
for all . For example, for all .
Observe that for all , so that the line with slope that passes through also passes through , but passes through no other for since is integral but is not, being strictly between and . This implies that the lines for ranging over a period block are all distinct, i.e., the cyclic group action on these according to the formula is faithful.
Pick so that the line is “lowest”; all the lines have the same slope , so lowest just means is minimal. Minimality implies that that for every other within a periodic block, the point lies above the line . Translating this picture horizontally by by considering , and then vertically by subtracting , i.e., putting , this means that all points for stay above the line . In other words, these values are all positive and : precisely the stay-ahead conditions for . Thus the orbit of any contains a stay-ahead race. Moreover, this is the only stay-ahead race in the orbit since the stay-ahead positivity condition is equivalent to the minimality condition on , and all the are distinct.
Finally, the stabilizer of must be trivial, else we would have for a non-multiple of : as we saw before, this can’t happen.
Yet another proof of the formula is based on Jean-Luc Rémy’s algorithm for generating binary trees uniformly at random (see Knuth TACP 4A, Section 7.2.1.6), which can be described as iterations of the following simple rule:
Pick any edge of the tree.
Create a new internal node by splitting the edge in two and budding off a freshly-labelled leaf, either to the left or to the right of the new internal node.
Starting from the trivial binary tree (containing no internal nodes and a single leaf labelled “0”), this algorithm will generate a binary tree with internal nodes, together with a labelling of its leaves by distinct labels in . Moreover, it does so exhaustively and without repetition, that is, every binary tree equipped with a permutation of its leaves will be generated exactly once. (An easy way to see this is by thinking of applying the algorithm “in reverse” to an arbitrary leaf-labelled binary tree, repeatedly pruning off the leaf with the largest label while recording whether it is a left leaf or a right leaf, until only the root edge remains.)
Since at the th iteration of the algorithm there are exactly possible choices of an edge and a direction, after iterations we obtain the identity:
(where is the double factorial), which implies that
As pointed out for example in Kapranov-Voevodsky, another description of a Catalan species is by way of subdivisions of a regular polygon with points (labeled by elements in in clockwise order) into triangles. Thus there are distinct subdivisions of a square, possible subdivisions of a pentagon, and in general possible subdivisions of an -gon.
(To be continued.)
General references:
Other references:
Donald E. Knuth, The Art of Computer Programming, Volume 4A: Combinatorial Algorithms, Part 1, Addison-Wesley, 2011. See Section 7.2.1.6 on “Generating all trees” (also available as a pre-fascicle).
For an interpretation in terms of species (or structure types) see
Last revised on September 12, 2018 at 10:46:45. See the history of this page for a list of all contributions to it.