nLab regular language

Contents

Contents

Idea

Regular languages are a simple kind of formal languages, the least expressive in the Chomsky hierarchy.

Definition

Fix a finite set VV (serving as the alphabet) and denote its free monoid by V V^\star. A language LV L \subseteq V^\star is regular whenever there is a formal grammar G=(V,X,R,s)G = (V, X, R, s) that generates it (i.e. L(G)=LL(G) = L) such that all rules in RR are of the form xwxx \to w x' for a terminal wVw \in V and non-terminal symbols x,xXx, x' \in X.

Properties

Regular languages can be characterised by regular expressions. Regular expressions are inductively generated by the following rules:

  1. For each terminal wVw \in V, ww
  2. Basic expressions \emptyset, ϵ\epsilon
  3. For each pair of regular expressions R 1,R 2R_1,R_2, R 1|R 2R_1|R_2 and R 1R 2R_1 \cdot R_2
  4. For each regular expression RR the Kleene star R R^\star.

These can be given a semantics in the poset of languages P(V)P(V) by universal properties.

  1. L()=L(\emptyset) = \emptyset and L(R 1|R 2)=L(R 1)L(R 2)L(R_1 | R_2) = L(R_1) \cup L(R_2). These are nullary and binary joins.
  2. L(w)={w}L(w) = \{ w \}. These are representable viewing languages as de-categorified presheaves.
  3. L(ϵ)L(\epsilon) consists of only the empty string, L(R 1R 2)={αV |βL(R 1),γL(R 2)α=βγ}L(R_1\cdot R_2) = \{\alpha \in V^\star \vert \exists \beta \in L(R_1), \gamma \in L(R_2) \cdot \alpha = \beta \gamma \}. This is a de-categorified version of the Day convolution monoidal structure.
  4. L(R )={αV |n.β 0,,β n1L(R).α=β 0β n1}L(R^\star) = \{ \alpha \in V^\star \vert \exists n \in \mathbb{N}. \exists\beta_0,\ldots,\beta_{n-1} \in L(R). \alpha = \beta_0\cdots\beta_{n-1} \}. In category theoretic terms this is the least solution to the recurrence equation L(R )=L(ϵ|(RR ))L(R^\star) = L(\epsilon| (R\cdot R^\star)), and can be constructed by Kleene's fixed point theorem as an ω\omega-colimit.

Every regular language can is denoted by some regular expression in this way.

Equivalently, a language is regular if it can be recognised by a finite-state automaton.

Last revised on March 16, 2023 at 07:41:11. See the history of this page for a list of all contributions to it.