nLab regular language

Context

$(0,1)$-Category theory

Linguistics

Idea
Definition
Properties
Related entries
References

Idea

Regular languages are a simple kind of formal language: they form the least expressive class (type 3) in the Chomsky hierarchy.

Definition

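Fix a finite set $V$ (serving as the alphabet) and denote its free monoid by $V^\star$. A language $L \subseteq V^\star$ is regular whenever there is a formal grammar $G = (V, X, R, s)$, with terminal symbols $V$, non-terminal symbols $X$, rules $R$ and start symbol $s \in X$, that generates it (i.e. $L(G) = L$) such that all rules in $R$ are of the form $x \to w x'$ or $x \to \epsilon$ for a terminal $w \in V$ and non-terminal symbols $x, x' \in X$. Rules of the first form alone never eliminate the non-terminal, so at least one terminating rule $x \to \epsilon$ is needed for any string to be derived.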
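As an illustration, the following Python sketch encodes a grammar of this shape and decides membership by tracking which non-terminals can follow the prefix read so far. It assumes that, besides rules $x \to w x'$, the grammar has terminating rules $x \to \epsilon$ so that derivations can end. The non-terminal names `S` and `T`, the dictionary encoding and the function name `generates` are hypothetical choices made for this example. The grammar shown generates the strings over $\{a, b\}$ with an even number of $b$s.

```python
# Rules per non-terminal: (terminal, next_nonterminal) encodes x -> w x',
# and None encodes a terminating rule x -> epsilon (an assumption of this sketch).
RULES = {
    "S": [("a", "S"), ("b", "T"), None],
    "T": [("a", "T"), ("b", "S")],
}
START = "S"


def generates(word, rules=RULES, start=START):
    """Return True if the grammar derives `word` from the start symbol."""
    current = {start}  # non-terminals that can follow the prefix read so far
    for letter in word:
        current = {
            body[1]
            for x in current
            for body in rules.get(x, [])
            if body is not None and body[0] == letter
        }
    # A derivation can stop exactly when some reachable x has a rule x -> epsilon.
    return any(None in rules.get(x, []) for x in current)
```

For instance, `generates("abba")` holds while `generates("ab")` does not, since the latter leaves the derivation at `T`, which has no terminating rule.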

Properties

Regular languages can be characterised by regular expressions. Regular expressions are inductively generated by the following rules:

For each terminal $w \in V$, $w$ is a regular expression;
There are basic expressions $\emptyset$, $\epsilon$;
For each pair of regular expressions $R_1, R_2$, there are expressions $R_1|R_2$ and $R_1 \cdot R_2$;
For each regular expression $R$, the Kleene star $R^\star$ is a regular expression.
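The inductive rules above can be read as the constructors of a syntax tree. The following is a minimal Python sketch of that reading. The class names (`Sym`, `Empty`, `Eps`, `Alt`, `Cat`, `Star`) and the printer `show` are hypothetical names chosen for this example, not notation from the literature.

```python
from dataclasses import dataclass


class Regex:
    """Base class for regular expressions over an alphabet V."""


@dataclass(frozen=True)
class Sym(Regex):    # a terminal w in V
    letter: str


@dataclass(frozen=True)
class Empty(Regex):  # the basic expression ∅
    pass


@dataclass(frozen=True)
class Eps(Regex):    # the basic expression ε
    pass


@dataclass(frozen=True)
class Alt(Regex):    # R1 | R2
    left: Regex
    right: Regex


@dataclass(frozen=True)
class Cat(Regex):    # R1 · R2
    left: Regex
    right: Regex


@dataclass(frozen=True)
class Star(Regex):   # Kleene star R*
    body: Regex


def show(r):
    """Render an expression in the notation of the rules above."""
    if isinstance(r, Sym):
        return r.letter
    if isinstance(r, Empty):
        return "∅"
    if isinstance(r, Eps):
        return "ε"
    if isinstance(r, Alt):
        return f"({show(r.left)}|{show(r.right)})"
    if isinstance(r, Cat):
        return f"({show(r.left)}·{show(r.right)})"
    if isinstance(r, Star):
        return f"{show(r.body)}*"
    raise TypeError(f"not a regular expression: {r!r}")


example = Star(Alt(Sym("a"), Cat(Sym("b"), Sym("b"))))
print(show(example))  # (a|(b·b))*
```

Each constructor corresponds to exactly one rule, so a function defined by cases on these classes, as `show` is, amounts to a definition by induction on regular expressions.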

These can be given a semantics in the poset of languages $P(V^\star)$ by universal constructions in $\mathbf{2}$-enriched category theory, where $\mathbf{2} = (0 \leq 1)$ is regarded as cartesian closed, and $V^\star$ as a discrete poset is regarded as a $\mathbf{2}$-enriched category.

$L(\emptyset) = \emptyset$ and $L(R_1 | R_2) = L(R_1) \cup L(R_2)$. These are nullary and binary joins.
$L(w) = \{ w \}$. If we view languages $L \subseteq V^\star$ as $\mathbf{2}$-enriched presheaves $V^\star \to \mathbf{2}$, then these $L(w)$ are among the representable presheaves.
$L(\epsilon)$ consists of only the empty string, and $L(R_1 \cdot R_2) = \{\alpha \in V^\star \mid \exists \beta \in L(R_1), \gamma \in L(R_2) \; (\alpha = \beta \gamma) \}$. These are the unit and the product of a $\mathbf{2}$-enriched version of the Day convolution monoidal structure, with monoidal product
$\otimes: P(V^\star) \times P(V^\star) \to P(V^\star)$
induced from the concatenation product $V^\star \times V^\star \to V^\star$ that makes $V^\star$ a discrete monoidal poset.
$L(R^\star) = \{ \alpha \in V^\star \mid \exists n \in \mathbb{N}\; \exists\beta_0,\ldots,\beta_{n-1} \in L(R)\; (\alpha = \beta_0\cdots\beta_{n-1}) \}$ is the smallest submonoid of $V^\star$ that contains $L(R)$. In category-theoretic terms this is the least solution to the recurrence equation $L(R^\star) = L(\epsilon| (R\cdot R^\star))$, that is, the least fixed point of $F(X) = L(\epsilon) \cup (L(R) \otimes X)$. It can be constructed by Kleene's fixed point theorem as an $\omega$-colimit, namely the union of the chain $\emptyset \subseteq F(\emptyset) \subseteq F^2(\emptyset) \subseteq \cdots$.
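The semantic clauses above can be sketched as a Python evaluator. Since $L(R)$ is in general infinite, this sketch computes only the strings of length at most `n`, which makes the Kleene iteration for the star terminate. Expressions are encoded as nested tuples with tags such as `"sym"` and `"star"`. This encoding and the function names `language` and `concat` are assumptions made for this example.

```python
def concat(a, b, n):
    """Concatenate two languages, keeping only strings of length <= n."""
    return {u + v for u in a for v in b if len(u) + len(v) <= n}


def language(r, n):
    """Return the set of strings of length <= n in L(r)."""
    tag = r[0]
    if tag == "empty":   # nullary join
        return set()
    if tag == "eps":     # only the empty string
        return {""}
    if tag == "sym":     # the singleton {w}
        return {r[1]} if len(r[1]) <= n else set()
    if tag == "alt":     # binary join
        return language(r[1], n) | language(r[2], n)
    if tag == "cat":     # product induced by concatenation
        return concat(language(r[1], n), language(r[2], n), n)
    if tag == "star":    # least fixed point of X = L(eps) ∪ (L(R) ⊗ X)
        base = language(r[1], n)
        x = set()
        while True:
            nxt = {""} | concat(base, x, n)
            if nxt == x:  # the increasing chain has stabilised
                return x
            x = nxt
    raise ValueError(f"unknown expression: {r!r}")


# (a | b·b)* restricted to strings of length at most 3
example = ("star", ("alt", ("sym", "a"), ("cat", ("sym", "b"), ("sym", "b"))))
print(sorted(language(example, 3), key=lambda s: (len(s), s)))
# ['', 'a', 'aa', 'bb', 'aaa', 'abb', 'bba']
```

The loop in the `"star"` case starts from the empty language and applies the same monotone map repeatedly. The truncation to length `n` is what makes the chain finite. Without it the union over all stages would be needed.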

Every regular language is denoted by some regular expression in this way.

Equivalently, a language is regular if and only if it can be recognised by a finite-state automaton.
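For a concrete instance of recognition by an automaton, the following Python sketch runs a deterministic finite automaton that accepts the strings over $\{a, b\}$ with an even number of $b$s, a language also denoted by the regular expression $(a | b \cdot a^\star \cdot b)^\star$. The state names `even` and `odd` and the table encoding are hypothetical choices made for this example.

```python
# Transition table of a deterministic finite automaton over {a, b}.
TRANSITIONS = {
    ("even", "a"): "even",
    ("even", "b"): "odd",
    ("odd", "a"): "odd",
    ("odd", "b"): "even",
}
INITIAL = "even"
ACCEPTING = {"even"}


def accepts(word):
    """Run the automaton on `word` and report whether it ends in an accepting state."""
    state = INITIAL
    for letter in word:
        key = (state, letter)
        if key not in TRANSITIONS:  # letter outside the alphabet: reject
            return False
        state = TRANSITIONS[key]
    return state in ACCEPTING
```

The automaton reads each letter once and stores only its current state, so membership is decided in time linear in the length of the word with a fixed, finite amount of memory.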

Related entries

References

Wikipedia, Regular language
Wikipedia, Regular expression
John Horton Conway, Regular Algebra and Finite Machines. Mineola, N.Y., Dover; Newton Abbot, 2012. ISBN:978-0486485836
Damien Pous, Jana Wagemaker, Completeness Theorems for Kleene algebra with tests and top, Logical Methods in Computer Science, Volume 20, Issue 3 (September 30, 2024). (doi:10.46298/lmcs-20(3:27)2024, arXiv:2304.07190)

Last revised on July 1, 2025 at 15:04:44. See the history of this page for a list of all contributions to it.
