nLab pregroup grammar

Contents

Contents

Overview

Pregroup grammar is a mathematical model of natural language grammar introduced by Lambek (1999); it is part of the categorial grammar tradition.

Pregroups

Definition

A preordered monoid is a tuple (P,,,1)(P, \leq, \cdot, 1) where (P,)(P, \leq) is a preorder and (P,,1)(P, \cdot, 1) is a monoid such that the multiplication is monotone, i.e.

  • acbdabcda \leq c \wedge b \leq d \implies a \cdot b \leq c \cdot d \quad (monotony)

or equivalently:

  • bdabcadcb \leq d \implies a \cdot b \cdot c \leq a \cdot d \cdot c \quad (substitution)

for all a,b,c,dPa, b, c, d \in P. A pomonoid is a preordered monoid that is furthermore a poset, i.e.

  • abaa=ba \leq b \leq a \implies a = b \quad (antisymmetry)

A pregroup is a pomonoid such that every object tPt \in P has left and right adjoints t l,t rPt^l, t^r \in P. Explicitly we have the following four axioms:

  • t lt1,tt r1t^l \cdot t \leq 1, \quad t \cdot t^r \leq 1 \quad (contraction)
  • 1tt l,1t rt1 \leq t \cdot t^l, \quad 1 \leq t^r \cdot t \quad (expansion)

In other words, a pregroup is a rigid monoidal category which is thin and skeletal. Note that a commutative pregroup is simply an ordered group. Here are a few variants of pregroups that appear in the literature, see Coecke 2013:

  • If we drop the anti-symmetry axiom for posets, we get a quasi-pregroup. These are useful as they avoid the equality aaa laaa=aa laa \leq a \cdot a^l \cdot a \leq a \implies a = a \cdot a^l \cdot a for all aPa \in P.
  • If we drop the two expansion axioms, we get a protogroup.
  • If we drop the axioms for the right adjoints, we get a left pregroup. Symmetrically, if we drop the left adjoints we get a right pregroup.

Monotone unbounded functions

Take the natural numbers \mathbb{N} and define the set PP of monotone functions f:f : \mathbb{N} \to \mathbb{N} that are furthermore unbounded, i.e. for all nn \in \mathbb{N} there is an mm \in \mathbb{N} with nf(m)n \leq f(m). PP is a pomonoid with composition as product, identity as unit and fgf \leq g iff f(n)g(n)f(n) \leq g(n) for all nn \in \mathbb{N}. To show that PP is a left pregroup, define:

f l(m)=min{n|mf(n)}f^l(m) = \text{min} \{n \in \mathbb{N} \vert m \leq f(n)\}

we get:

(f lf)(m)=f l(f(m))=min{n|f(m)f(n)}m=1(m)(f^l \cdot f)(m) = f^l(f(m)) = \text{min}\{n \in \mathbb{N} \vert f(m) \leq f(n)\} \leq m = 1(m)

and

1(m)=mf(min{n|mf(n)})=f(f l(m))=(ff l)(m)1(m) = m \leq f(\text{min} \{ n \in \mathbb{N} \vert m \leq f(n)\}) = f(f^l(m)) = (f \cdot f^l) (m)

for all mm \in \mathbb{N}, i.e. f lf1ff lf^l \cdot f \leq 1 \leq f \cdot f^l.

If we replace the natural numbers \mathbb{N} by the integers \mathbb{Z} then the set of unbounded monotone functions f:f : \mathbb{Z} \to \mathbb{Z} is both a left and a right pregroup where:

f r(m)=max{n|f(n)m}f^r(m) = \text{max} \{n \in \mathbb{N} \vert f(n) \leq m\}

For example, if f(n)=2nf(n) = 2 n then f l(n)=n+12n2=f r(n)f^l(n) = \lceil \frac{n + 1}{2} \rceil \neq \lceil \frac{n}{2} \rceil = f^r(n).

Proposition (Buszkowski 2001): Every pregroup is a pregroup of functions, i.e. a sub-pomonoid of XXX \to X for some poset XX.

Free pregroups and the switching lemma

In linguistic applications, one starts with a poset B={s,n,}B = \{ s, n, \dots \} of basic types (e.g. sentence and noun) and then takes the free pregroup P BP_B generated by BB. For a basic type bBb \in B, iterated adjoints ,b ll,b l,b,b r,b rr,P B\dots, b^{ll}, b^l, b, b^r, b^{rr}, \dots \in P_B are called simple types; an arbitrary type tP Bt \in P_B may then be given as a sequence of simple types. The following lemma makes the parsing problem for free pregroups decidable — i.e. given a sentence type sP Bs \in P_B and the types for a sequence of words t 1,,t nP Bt_1, \dots, t_n \in P_B, is t 1t nst_1 \dots t_n \leq s?

Switching lemma (Lambek 1999): For any pair of types t,tP Bt, t' \in P_B, if ttt \leq t' then there is a type tP Bt'' \in P_B such that ttt \leq t'' without expansions and ttt'' \leq t' without contractions, i.e. free protogroups recognise the same language as free pregroups.

From pregroups to compact 2-categories

In Preller, Lambek 2007, Preller and Lambek generalise the free pregroup construction above to free compact 2-categories in order to capture proof relevance. This allows to distinguish between the distinct parsings of ambiguous phrases such as “(men and women) who like maths” vs. “men and (women who like maths)”.

Note that the compact 2-categories of Lambek and Preller differ from compact closed 2-category in that they are not required to be symmetric. A non-symmetric compact closed 2-category with one object is simply a rigid monoidal category.

Given a poset of basic types BB, the objects of the free rigid monoidal category C BC_B are the same as that of the free (quasi-)pregroup, the arrows may be described as planar string diagrams. Given two types t,tP Bt, t' \in P_B, we have that ttt \leq t' if and only if there is an arrow r:ttr : t \to t' in C BC_B.

Pregroup grammars as free rigid monoidal categories

A pregroup grammar is a tuple G=(B,Σ,Δ,s)G = (B, \Sigma, \Delta, s) where BB and Σ\Sigma are finite sets called the basic types and the vocabulary respectively, ΔΣ×P B\Delta \subseteq \Sigma \times P_B is a relation called the dictionary and sP Bs \in P_B is a designated sentence type. We require that Δ\Delta is finite, i.e. the set of possible types Δ(w)P B\Delta(w) \subseteq P_B is finite for each word wΣw \in \Sigma. The language of GG is then given by:

L(G)={w 1w nΣ n|n,t inΔ(w i)ts}L(G) = \big\{ w_1 \dots w_n \in \Sigma^n \vert n \in \mathbb{N}, \quad \exists t \in \prod_{i \leq n} \Delta(w_i) \quad t \leq s \big\}

i.e. a sequence of words w 1w nΣ nw_1 \dots w_n \in \Sigma^n is said to be grammatical whenever there is a dictionary entry (w i,t i)Δ(w_i, t_i) \in \Delta for each word w iw_i such that t 1t nst_1 \dots t_n \leq s. One may simplify this by redefining L(G)={w 1w nΣ n|C G(w 1w n,s)}L(G) = \big\{ w_1 \dots w_n \in \Sigma^n \vert C_G(w_1 \dots w_n, s) \neq \emptyset \big\} where C GC_G is the free rigid monoidal category with:

  • generating objects the disjoint union B+ΣB + \Sigma,
  • generating arrows the dictionary entries (w,t)Δ(w, t) \in \Delta with dom(w,t)=wdom(w, t) = w and cod(w,t)=tcod(w, t) = t.

That is, a sequence of words is grammatical whenever there exists a string diagram going from the sequence of words to the sentence type. Note that traditionally the identity wires for words wΣw \in \Sigma are omitted, hence dictionary entries are depicted as triangles with no input. Contractions are depicted as cups, e.g. from (Alice,n),(loves,n rsn l),(Bob,n)Δ(Alice, n), (loves, n^r s n^l), (Bob, n) \in \Delta and nn rsn lnsn n^r s n^l n \leq s we get the following diagram:

Pregroup semantics as strong monoidal functors

One may give a semantics to a pregroup grammar G=(B,Σ,Δ,s)G = (B, \Sigma, \Delta, s) by defining a strong monoidal functor F:C GSF : C_G \to S, where C GC_G is the free rigid monoidal category described in section 2. SS is a suitable rigid monoidal category, e.g. FdVect\text{FdVect} or Rel\text{Rel}, depending on the application. Note the similarity with a Lawvere theory as a monoidal functor from a syntactic category to Set\text{Set}.

We require the image for all words wΣw \in \Sigma to be the monoidal unit F(w)=IF(w) = I, hence the image for each dictionary entry (w,t)Δ(w, t) \in \Delta is given by a state F(w,t):IF(t)F(w, t) : I \to F(t). The meaning F(r):IF(s)F(r) : I \to F(s) for a sentence w 1w nΣ nw_1 \dots w_n \in \Sigma^n with grammatical reduction r:w 1w nsr : w_1 \dots w_n \to s may then be computed from the individual meanings F(w i,t i):IF(t i)F(w_i, t_i) : I \to F(t_i) of the words, following Frege's principle of compositionality. This has been developed in Preller 2005 as well as in a series of papers by Bob Coecke and others, see categorical compositional distributional semantics.

Expressive power and extensions

Theorem (Buszkowski, Moroz 2008): For each pregroup grammar GG, there is a context-free grammar GG' such that L(G)=L(G)L(G) = L(G'). Furthermore, GG' may be computed in time polynomial in the size of GG.

The opposite direction also holds, hence pregroup grammar and context-free grammar are said to be weakly equivalent: the translation preserves only the generated languages, it does not preserve the structure of syntax trees.

Since it is generally accepted that natural languages go beyond context-free languages this result illustrates the lack of expressive power of pregroup grammars whence it becomes necessary to enhance the original grammar model:

One way (Kobele 2005) is to consider product pregroup grammars G 1×G 2G_1\times G_2: these assign to a string ww a tuple (t 1,t 2)(t_1,t_2) of syntactic types with grammaticality of (w,(t 1,t 2))(w,(t_1,t_2)) checked by parallel computation of the grammaticality of (w,t 1)(w,t_1) in G 1G_1 and of (w,t 2)(w,t_2) in G 2G_2. (w,(t 1,t 2))(w,(t_1,t_2)) is then grammatical iff (w,t 1)(w,t_1) and (w,t 2)(w,t_2) are i.e. L(G 1×G 2)=L(G 1)L(G 2)L(G_1\times G_2)= L(G_1)\cap L(G_2) (cf. Kobele&Kracht (2005), prop. 4). This permits to describe e.g. the context-sensitive rules of Swiss German (Lambek (2008)). The checking of syntactic types is still given by computation in a pregroup, namely the product pregroup P 1×P 2P_1\times P_2, but P 1×P 2P_1\times P_2 is not free!

Kobele and Kracht (2005) use this together with the above result by Buszkowski and the classical result in formal language theory that any type 0 language LL is the homomorphic image h(L 1L 2)h(L_1\cap L_2) of the intersection of two (deterministic) context-free languages L 1,L 2L_1,L_2 in order to show that pregroup grammars generate every type 0 language when type checking is done in arbitrary pregroups.

Since it is usually assumed that natural languages do not go beyond mildly context-sensitive languages one sees that (homomorphic images) of products of arbitrary pregroup grammars have excessive power and it becomes necessary to cut back this power. One way to do this is by restricting G 1G_1 to be of a type simpler than context-free e.g. a Dyck language? thought of now as a control language supervising the “context-free” computations in G 2G_2. This idea was explored in Kobele (2005).

Another approach to adapt the pregroup grammar formalism to mildly context-sensitive languages was proposed by Genkin et al. (2010) which augment pregroup grammars with buffers.

References

  • Joachim Lambek, Type Grammar Revisited, Logical Aspects of Computational Linguistics (1999)

  • Anne Preller?, Category Theoretical Semantics for Pregroup Grammars, Logical Aspects of Computational Linguistics, 5th International Conference, LACL 2005 (pdf)

  • Anne Preller?, Joachim Lambek, Free Compact 2-Categories, Mathematical Structures in Computer Science, Cambridge University Press (CUP), 2007 (pdf)

  • Joachim Lambek, From Word to Sentence: A Computational Algebraic Approach to Grammar, Polimetrica 2008 (pdf)

  • Wojciech Buszkowski?, Katarzyna Moroz?, Pregroup Grammars and Context-free Grammars, Computational Algebraic Approaches to Natural Language, Polimetrica (2008) (pdf)

  • Wojciech Buszkowski?. Lambek grammars based on pregroups. Logical Aspects of Computational Linguistics, LNAI 2099: 95–109, 2001.

  • Bob Coecke, An alternative Gospel of structure: order, composition, processes, Introductory chapter to C. Heunen, M. Sadrzadeh, and E. Grefenstette. Quantum Physics and Linguistics: A Compositional, Diagrammatic Discourse. Oxford University Press, 2013 (arxiv:1307.4038)

  • Gregory M. Kobele?, Pregroups, Products, and Generative Power , handout Chieti workshop on pregroups May 2005.

  • Gregory M. Kobele?, Marcus Kracht, On Pregroups, Freedom, and (Virtual) Conceptual Necessity , U. Penn Working Papers in Linguistics 10 no.1 (2005).

  • Daniel Genkin?, Nissim Francez?, and Michael Kaminski?. Mildly Context-Sensitive Languages via Buffer Augmented Pregroup Grammars, In Essays in Memory of Amir Pnueli, 2010.

Last revised on July 22, 2021 at 12:04:39. See the history of this page for a list of all contributions to it.