In computer science, a (co-)monad (or (co-)Kleisli triple, or (co-)extension system) is a kind of structure on data types which describes “notions of computation” [Moggi (1989), Moggi (1991)] that may “have external effects or be subject to external contexts/causes”, such as involving random access memory, input/output, exception handling, writing to or reading from global variables, etc., as familiar from imperative programming but cast as “pure functions” with deterministic and verifiable behaviour, in the style of functional programming.
In short, a (“extension system-style” [Manes (1976), Ex. 3.12] or “Kleisli triple-style” [Moggi (1991), Def. 1.2]) monad in a given programming language consists of assignments (but see below):
to any data type $D$ of a new data type $\mathcal{E}(D)$ of “$D$-data with $\mathcal{E}$-effects”,
to any pair of $\mathcal{E}$-effectful functions (programs) of the form $prog_{12} \,\colon\, D_1 \to \mathcal{E}(D_2)$ and $prog_{23} \,\colon\, D_2 \to \mathcal{E}(D_3)$ of an effectful composite function $\big(bind^{\mathcal{E}} prog_{23}\big) \,\circ\, prog_{12} \,\colon\, D_1 \to \mathcal{E}(D_3)$ (their binding or Kleisli composition),
to any data type $D$ of a function $ret^{\mathcal{E}}_D \;\colon\; D \to \mathcal{E}(D)$ assigning “trivial $\mathcal{E}$-effects”,
such that the binding is associative and also unital with respect to the return operation, hence such that data types with $\mathcal{E}$-effectful programs between them constitute a category (the Kleisli category of the given effect/monad $\mathcal{E}$).
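In Haskell-like syntax, this structure may be sketched as follows (the class name `KleisliTriple` and the operation names `ret`, `bind`, `kleisli` are our hypothetical choices, mirroring the Prelude's `Monad` class with its `return` and `(>>=)`):

```haskell
-- A minimal "Kleisli triple" interface (hypothetical names,
-- mirroring the Prelude's Monad class):
--   eff   plays the role of the type operation  D ↦ E(D)
--   ret   plays the role of  ret  : D → E(D)
--   bind  plays the role of  bind : (D1 → E(D2)) → (E(D1) → E(D2))
class KleisliTriple eff where
  ret  :: d -> eff d
  bind :: (d1 -> eff d2) -> (eff d1 -> eff d2)

-- Kleisli composition of effectful programs,
-- the composition in the Kleisli category:
kleisli :: KleisliTriple eff
        => (d2 -> eff d3) -> (d1 -> eff d2) -> (d1 -> eff d3)
kleisli prog23 prog12 = bind prog23 . prog12

-- Example instance, using Maybe as a simple effect (possible failure):
instance KleisliTriple Maybe where
  ret             = Just
  bind f (Just d) = f d
  bind _ Nothing  = Nothing
```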
We now explain in more detail what this means and what it is good for.
In programming it frequently happens that a program with “nominal” output data type $D$ de facto outputs data of some modified type $\mathcal{E}(D)$ which accounts for “external effects” caused by the program, where $\mathcal{E} \,\colon\, Type \to Type$ is some general operation sending data types $D$ to new data types $\mathcal{E}(D)$.
For example, if alongside the computation of its nominal output data $d \colon D$ a program also writes a log message $msg \,\colon\,$ String, then its actual output data is the pair $(d, msg)$ of product type $Write(D) \,\coloneqq\, D \times String$.
Or, dually, if the program may fail and instead “throw an exception message” $msg \,\colon\,$ String, then its actual output data is either $d \colon D$ or $msg \colon String$, hence is of coproduct type $Excep(D) \,\coloneqq\, D \sqcup String$.
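In Haskell these two example type constructors may be rendered as type synonyms (the names `Write` and `Excep` are ours, chosen to match the text; Haskell's standard library instead offers `Writer` and `Either`):

```haskell
-- Log-writing effect:  Write(D) = D × String
type Write d = (d, String)

-- Exception effect:    Excep(D) = D ⊔ String
-- (Left = nominal data, Right = exception message)
type Excep d = Either d String
```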
Given such an $\mathcal{E}$-effectful program $prog_{12} \;\colon\; D_1 \to \mathcal{E}(D_2)$ and a subsequent program $prog_{23} \,\colon\, D_2 \to \mathcal{E}(D_3)$, accepting nominal input data of type $D_2$ and itself possibly involving further effects of type $\mathcal{E}(-)$, the naïve composition of the two programs makes no sense (unless $\mathcal{E}(D) = D$ is actually the trivial sort of effect), but their evident intended composition is obtained by:
first adjusting $prog_{23}$ via a given prescription $bind^{\mathcal{E}} \;\colon\; \big(D_2 \to \mathcal{E}(D_3)\big) \longrightarrow \big(\mathcal{E}(D_2) \to \mathcal{E}(D_3)\big)$, such that $bind^{\mathcal{E}} prog_{23} \,\colon\, \mathcal{E}(D_2) \to \mathcal{E}(D_3)$:
does accept data of type $\mathcal{E}(D_2)$ and
“acts as $prog_{23}$ while carrying any previous $\mathcal{E}$-effects along” (this intuition becomes a formal fact below in (9));
then forming the naive composite $bind^{\mathcal{E}} prog_{23} \;\circ\; prog_{12} \;\colon\; D_1 \xrightarrow{\; prog_{12} \;} \mathcal{E}(D_2) \xrightarrow{\; bind^{\mathcal{E}} prog_{23} \;} \mathcal{E}(D_3) \,.$
(Beware that we are denoting by “$bind^{\mathcal{E}} prog (-)$” what in programming languages like Haskell is denoted by “`(-) >>= prog`” (the “bind” operator; cf. the related “fish” notation “`>=>`” for Kleisli composition, e.g. Milewski (2019)), and which some authors denote by “$prog^\ast\,$”, e.g. Moggi (1991); Uustalu (2021), Lecture 1, p. 12.)
Depending on the intended behaviour of these programs, it remains to specify how exactly $bind^{\mathcal{E}} prog_{23}$ “carries $\mathcal{E}(-)$-effects along”, hence what the “bind” operation (2) does concretely.
For instance, in the above example of a logging-effect, where $Write(D_2) \,\coloneqq\, D_2 \times String$, the evident way is to use the concatenation $String \times String \xrightarrow{\; concat \;} String$ and set:
$bind^{Write} prog_{23}\;\coloneqq\; D_2 \times String \xrightarrow{ prog_{23} \times Id_{String} } D_3 \times String \times String \xrightarrow{ Id_{D_3} \times concat } D_3 \times String \,.$
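A Haskell rendering of this binding operation for the logging effect may be sketched as follows (the name `bindWrite` is ours; we concatenate the previous log before the new one, matching Haskell's `Writer` convention):

```haskell
-- bind for the log-writing effect, following the formula above:
-- run prog23 on the nominal data and concatenate the two log strings
-- (previous log first, matching Haskell's Writer convention).
bindWrite :: (d2 -> (d3, String)) -> ((d2, String) -> (d3, String))
bindWrite prog23 (d2, msg) =
  let (d3, msg') = prog23 d2
  in  (d3, msg ++ msg')
```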
In the other example above, where the effect is the possible throwing of an exception message, the evident way to carry this kind of effect along is to use the codiagonal $\nabla \,\colon\, String \sqcup String \to String$, which amounts to forwarding any exception that has already been thrown:
$bind^{Excep} prog_{23} \;\coloneqq\; D_2 \sqcup String \xrightarrow{ prog_{23} \sqcup Id_{String} } D_3 \sqcup String \sqcup String \xrightarrow{ id_{D_3} \sqcup \nabla } D_3 \sqcup String \,.$
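Similarly, the exception-forwarding bind may be sketched in Haskell as (the name `bindExcep` is ours):

```haskell
-- bind for the exception effect, following the formula above:
-- an already-thrown exception is forwarded (the codiagonal),
-- otherwise prog23 runs on the nominal data.
bindExcep :: (d2 -> Either d3 String) -> (Either d2 String -> Either d3 String)
bindExcep prog23 (Left  d2)  = prog23 d2
bindExcep _      (Right msg) = Right msg
```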
Whatever design choice one makes for how to “carry along effects”, it must be consistent in that, when applied to a triple of $\mathcal{E}$-effectful programs which are nominally composable, the resulting effectful composition is unambiguously defined, i.e. associative, satisfying the following equation, here called the first “monad law”:

$bind^{\mathcal{E}} \big( bind^{\mathcal{E}} prog_{34} \,\circ\, prog_{23} \big) \;=\; bind^{\mathcal{E}} prog_{34} \,\circ\, bind^{\mathcal{E}} prog_{23} \,.$
Finally, for such a notion of effectful programs to be usefully connected to “pure” programs without effects, it ought to be the case that any program $prog_{01} \,\colon\, D_0 \xrightarrow{\;} D_1$ that happens to have no $\mathcal{E}$-effects may be regarded as an $\mathcal{E}$-effectful program in a trivial way. For that purpose there should be defined an operation $ret^{\mathcal{E}}_D \;\colon\; D \to \mathcal{E}(D)$ which does nothing but “return” data of type $D$, re-regarded as effectful $\mathcal{E}(D)$-data in a trivial way; so that we may form the trivially effectful program $ret^{\mathcal{E}}_{D_1} \circ\, prog_{01} \;\colon\; D_0 \xrightarrow{\;} \mathcal{E}(D_1)$.
For instance, in the above example of log-message effects this is the operation $D \to D \times String$ which pairs the output with the empty string: $ret^{Write} \;\colon\; d \mapsto (d, \varnothing)$.
In the other example above, of exception handling, the trivial effect $D \to D \sqcup String$ is just not to throw an exception, which is just $ret^{Excep} \;\colon\; d \mapsto d$ (the left coprojection into the coproduct).
The final consistency condition (i.e. the remaining “monad law”) then is that “carrying along trivial effects is indeed the trivial operation”, i.e. that

$bind^{\mathcal{E}}\, ret^{\mathcal{E}}_{D} \;=\; id_{\mathcal{E}(D)} \qquad \text{and} \qquad bind^{\mathcal{E}} prog \,\circ\, ret^{\mathcal{E}}_{D_1} \;=\; prog \quad \text{for } prog \,\colon\, D_1 \to \mathcal{E}(D_2) \,.$
Notice that the associativity condition (3) and the unitality condition (5) are jointly equivalent to saying that data types with hom-sets of $\mathcal{E}$-effectful programs between them, in the above sense, form a category. In category theory this is known as the Kleisli category $Kl(\mathcal{E})$ of a monad $\mathcal{E}$ on the category $Type$ of data types with programs between them:
Traditionally in category theory, the axioms on monads are presented in a somewhat different way, invoking a monad “product” natural transformation $\mathcal{E} \circ \mathcal{E} \xrightarrow{ \mu } \mathcal{E}$ instead of the “binding” operation. One readily checks that these two axiomatic presentations of monads are in fact equivalent – see (7) below –, but the above “Kleisli triple/extension system”-presentation is typically more relevant in the practice of functional programming.
In summary, a choice of assignments (but see below) to data types $D_i$ of
$\mathcal{E}(D) \;\colon\; Type$,
namely of types of $\mathcal{E}$-effectful data of nominal type $D$ (1);
$bind^{\mathcal{E}}_{D_1, D_2} \;\colon\; Hom\big(D_1,\, \mathcal{E}(D_2)\big) \longrightarrow Hom\big(\mathcal{E}(D_1),\,\mathcal{E}(D_2)\big)$,
namely of how to execute a prog while carrying along any previous effects (2);
$ret^{\mathcal{E}}_D \;\colon\; D \to \mathcal{E}(D)$,
namely of how to regard plain $D$-data as trivially effectful (4)
subject to:
the associativity condition (3) and the unitality condition (5),
is called a monad in computer science (also: “Kleisli triple” in functional programming) and serves to encode the notion that all programs may be subject to certain external effects of a kind $\mathcal{E}$ that is specified by the above choices of monad operations.
Here, for the time being (but see below), we write $Hom(D_1, D_2)$ for the set of programs/functions taking data of type $D_1$ to data of $D_2$ (the hom set in the plain category of types).
The first running example above is known as the writer monad, since it encodes the situation where programs may have the additional effect of writing a message string into a given buffer.
The other running example above is naturally called the exception monad.
Relation to other familiar axioms for monads. The above Kleisli/extension-system structure, constituting a Kleisli-style monad in computer science, may be, and often is, encoded in different equivalent ways:
Alternatively postulating operations
$D \mapsto \mathcal{E}(D)$ (as before)
$fmap^{\mathcal{E}}_{D_1, D_2} \;\colon\; Hom\big(D_1,\, D_2\big) \longrightarrow Hom\big(\mathcal{E}(D_1), \mathcal{E}(D_2)\big)$
$ret^{\mathcal{E}}_D \;\colon\; D \to \mathcal{E}(D)$
$join^{\mathcal{E}}_D \;\colon\; \mathcal{E}\big( \mathcal{E}(D) \big) \to \mathcal{E}(D)$
such that
$fmap^{\mathcal{E}}$ is functorial on data types,
$join^{\mathcal{E}}$ is associative and unital (with respect to $ret$) as a natural transformation,
yields the definition of monad more traditionally used in category theory (namely as a monoid object in endofunctors, here on the plain category of data types).
Direct inspection shows that one may bijectively transmute such $bind$- and $join$-operators into each other by expressing them as the following composites (in category-theoretic notation, where for instance “$ev$” denotes the evaluation map):
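In Haskell, this transmutation between the $bind$- and $join$-operators may be sketched as follows (function names are our hypothetical choices; the `RankNTypes` extension is needed to quantify over the types of the given operator):

```haskell
{-# LANGUAGE RankNTypes #-}

-- join from bind:  join = bind(id)
joinFromBind :: (forall a b. (a -> e b) -> e a -> e b)
             -> (e (e d) -> e d)
joinFromBind bnd = bnd id

-- bind from join (and fmap):  bind(f) = join . fmap(f)
bindFromJoin :: Functor e
             => (forall a. e (e a) -> e a)
             -> (d1 -> e d2) -> (e d1 -> e d2)
bindFromJoin jn f = jn . fmap f
```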
But in fact, in functional programming-languages one typically considers an enhanced version of the above situation:
In these higher-order languages one has, besides the (hom-)set of programs/functions $Hom\big(D_1, \, D_2\big)$ also the actual data type of functions, namely the function type $D_1 \to D_2$, which in terms of categorical semantics is the internal hom-object $Map(D_1, \, D_2)$ in the cartesian closed category of data types. Therefore, in such languages (like Haskell) the type of the binding operation for given data types $D_1$, $D_2$ is actually taken to be the function type/internal hom
(where we used the hom-isomorphism of the product$\dashv$internal hom-adjunction to re-identify the types on the right)
which (beware) is traditionally written without many of the parentheses, as follows:
In general (except in the base topos of Sets), such an iterated function type/internal hom is richer than (certainly different from) the corresponding plain hom-set, and accordingly a “Kleisli triple” defined as above but with the binding operation typed in this internal way is more information than a plain monad on the underlying category of types: namely what is called a strong monad (e.g. McDermott & Uustalu (2022)).
On such a strong monad, the bind operation is defined as the following composite:
This makes such monads enriched monads in the self-enrichment of the category of types given by the function type/internal hom.
In particular, monads as used in Haskell are really strong/enriched monads, in this way.
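Concretely, in Haskell the strength is definable from `fmap` alone, which is one way to see that every Haskell `Functor`, hence every Haskell `Monad`, is canonically strong with respect to the cartesian product:

```haskell
-- The strength of a Haskell functor/monad: pairing a pure value
-- with data "inside" the effect, definable via fmap.
strength :: Functor e => (a, e b) -> e (a, b)
strength (a, eb) = fmap (\b -> (a, b)) eb
```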
Yet more structure on effect-monads is available in dependent type theories with type universes, where one may demand that the monad operation $D \mapsto \mathcal{E}(D)$ is not just an endofunction on the set of types, but an endomorphism of the type universe. At least for idempotent monads this case is further discussed at reflective subuniverse and modal type theory and maybe elsewhere.
Further in the practice of programming: if programs may cause external effects, as above, then one will often want to have some of them handle these effects.
This is particularly evident in the above running examples of exceptions whose whole design purpose typically is to be handled when they arise.
Since a “handling” of $\mathcal{E}$-effects should mean that these be done away with and turned into pure data of some actual type $D$, an $\mathcal{E}$-effect handler on(to) a data type $D$ should primarily consist of a program of the form $\mathcal{E}(D) \longrightarrow D$
that handles effects produced by $\mathcal{E}$-effectful programs $D' \to \mathcal{E}(D)$, turning them into pure computations $D' \to D$.
But in addition, such a handler needs to handle effects that have been “carried along” (2) from previous computations, even along otherwise effect-free computations $prog_{0,1} \;\colon\; D_{0} \to D_{1}$; therefore all of these need to be assigned handlers:
For such a choice of effect handling, consistency demands that:
first handling all previous effects carried along (2) and then the newly produced effects by a given $prog \,\colon\, D \to \mathcal{E}(D')$ has the same result as letting $prog$ take care of carrying along previous effects by passing to $prog[-]$ and then handling the resulting accumulation at once:
handling the trivial effect ought to be no extra operation:
A data structure consisting of such assignments $hndl^{\mathcal{E}}$ subject to these “laws” is known as a Kleisli triple algebra (e.g. Uustalu (2021), Lec. 2) or algebra over an extension system (see here) for $\mathcal{E}$, and is bijectively the same as what in traditional category theory is known as an algebra over $\mathcal{E}$ or $\mathcal{E}$-algebra or, maybe best: an $\mathcal{E}$-module or $\mathcal{E}$-modal type.
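For the running exception example this may be sketched in Haskell (the name `handleExcep` is ours): since $Excep(D) = D \sqcup String$, an $Excep$-algebra structure on $D$ amounts to a recovery function on exception messages:

```haskell
-- An effect handler (algebra) for the exception effect:
-- pure data passes through unchanged, while an exception message
-- is turned into data of type d by a given recovery function.
handleExcep :: (String -> d) -> (Either d String -> d)
handleExcep _       (Left  d)   = d
handleExcep recover (Right msg) = recover msg
```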
Given a pair of such $\mathcal{E}$-modal types $\big(D_1, hndl^{\mathcal{E}}_{D_1} \big)$, $\big(D_2, hndl^{\mathcal{E}}_{D_2} \big)$, a function $\phi \,\colon\, D_1 \to D_2$ is a homomorphism between them if the result of handling $\mathcal{E}$-effects before or after applying $\phi$ is the same, hence if for all functions $f \,\colon\, D_0 \to D_1$ the following diagram commutes:
Free $\mathcal{E}$-effect handlers. Notice that above we motivated already the binding operation (2) as a kind of $\mathcal{E}$-effect handling: namely as the “carrying along” of previous $\mathcal{E}$-effects. This intuition now becomes a precise statement:
For any $D_2 \;\colon\; Type$ we may tautologically handle (8) $\mathcal{E}$-effects by absorbing them into the data type $\mathcal{E}(D_2)$, with the effect-handler simply being the effect-binding operation (2):
The $\mathcal{E}$-effect-handlers (8) arising this way are called free.
The full subcategory of free $\mathcal{E}$-effect handlers among all $\mathcal{E}$-modal types is known as the Kleisli category. By the Kleisli equivalence, this is equivalent to the (plain) category (6) of $\mathcal{E}$-effectful programs that we started with. Here this means that:
which makes a lot of sense.
(…) e.g. Uustalu & Vene (2008), POM13, GKOBU16, KRU20 (…):
By formal duality:
On the one hand:
A Kleisli map for a monad $\mathcal{E}$ is a program
causing an external $\mathcal{E}$-effect.
A Kleisli-module for a monad $\mathcal{E}$ is a prescription
for handling such $\mathcal{E}$-effects.
Dually:
A co-Kleisli map for a comonad $\mathcal{C}$ is a program
subject to an external $\mathcal{C}$-“context” (Uustalu & Vene (2008), POM13)
(a $\mathcal{C}$-cause).
A co-Kleisli co-module for a comonad $\mathcal{C}$ is a program
producing or providing such $\mathcal{C}$-contexts.
(cf. [Uustalu (2021), Lect. 3, slide 9])
(…)
Various monads are definable in terms of standard type-forming operations (such as product types, function types, etc.). These include the following.
The maybe monad encodes possible controlled failure of a program to execute.
An exception monad, more generally, encodes possible controlled failure together with the output of an error “message” in the general form of data of some type.
The reader monad and coreader comonad both encode reading out a global parameter.
The state monad encodes the possibility of consecutive reading and re-setting a global parameter – this provides a notion of random access memory.
The costate comonad encodes a way (a “lens”) to read and write fields in databases.
The writer monad encodes writing consecutively into a global resource, such as for a stream.
The continuation monad encodes continuation-passing style of program execution.
The selection monad encodes selecting a value of a type depending on the values of some function on it.
A functional program with input of type $X$, output of type $Y$ and mutable state $S$ is a function (morphism) of type $X \times S \longrightarrow Y \times S$. Under the (Cartesian product $\dashv$ internal hom)-adjunction this is equivalently given by its adjunct, which is a function of type $X \longrightarrow [S, S \times Y ]$. Here the operation $[S, S\times (-)]$ is the monad induced by the above adjunction and this latter function is naturally regarded as a morphism in the Kleisli category of this monad. This monad $[S, S\times (-)]$ is called the state monad for mutable states of type $S$.
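A minimal Haskell sketch of this state monad (names are our choices; the standard `State` monad of the `transformers` library packages the same structure):

```haskell
-- The state monad  [S, S × (-)]  as a type synonym:
type State s d = s -> (s, d)

-- Trivial effect: return the data, leave the state untouched.
retState :: d -> State s d
retState d s = (s, d)

-- bind: run the first stateful program, thread the resulting
-- state into the second one.
bindState :: (d1 -> State s d2) -> (State s d1 -> State s d2)
bindState prog st s =
  let (s', d1) = st s
  in  prog d1 s'
```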
The maybe monad is the operation $X \mapsto X \sqcup \ast$. The idea here is that a function $X \longrightarrow Y$ in its Kleisli category is, in the original category, a function of the form $X \longrightarrow Y \sqcup \ast$, which hence either indeed returns a value in $Y$ or else returns the unique element of the unit type/terminal object $\ast$. This is then naturally interpreted as “no value returned”, hence as indicating a “failure in computation”.
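In Haskell this is the `Maybe` type; its return and bind (sketched here with our own function names) are exactly what the Prelude's `Monad Maybe` instance implements:

```haskell
-- The maybe monad  X ↦ X ⊔ ∗  (Haskell's Maybe):
retMaybe :: d -> Maybe d
retMaybe = Just

-- bind: a returned value is passed on, failure propagates.
bindMaybe :: (d1 -> Maybe d2) -> (Maybe d1 -> Maybe d2)
bindMaybe prog (Just d) = prog d
bindMaybe _    Nothing  = Nothing
```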
(…) exception monad (…)
The continuation monad on a given type $S$ acts by $X \mapsto [[X,S],S]$.
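A Haskell sketch of this (names are ours; cf. the standard `Cont` type of the `transformers` library): a computation returning $X$ is a function consuming a continuation $X \to S$:

```haskell
-- The continuation monad  X ↦ [[X,S],S]:
type Cont s x = (x -> s) -> s

-- Trivial effect: immediately pass the value to the continuation.
retCont :: x -> Cont s x
retCont x k = k x

-- bind: run the first computation with a continuation that
-- feeds its result into the second computation.
bindCont :: (x1 -> Cont s x2) -> (Cont s x1 -> Cont s x2)
bindCont prog c k = c (\x1 -> prog x1 k)
```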
(…)
Other monads may be supplied “axiomatically” by the programming language, meaning that they are data structures typed as monads, but whose actual type formation, binding- and return-operations are special purpose operations provided by the programming environment.
This includes:
the completion monad, as in constructive analysis, used, for instance, for dealing with real numbers.
In this vein:
relation between type theory and category theory, categorical semantics, categorical logic
Examples of (co)monads in (homotopy) type theory involve in particular modal operators as they appear in
See also:
For an approach to composing monads, see
Another approach to modelling side effects in functional programming languages are
Free monads in computer science appear in the concepts of
Other generalizations are:
There is also: monad (in linguistics)
The extension system-style presentation of monads as used in computer science is briefly mentioned in:
and expanded on in
but not related there to computation.
The original observation of monads as “notions of computation” is:
Eugenio Moggi, Computational lambda-calculus and monads, in: Proceedings of the Fourth Annual Symposium on Logic in Computer Science (1989) 14-23 [doi:10.1109/LICS.1989.39155]
Eugenio Moggi, Notions of computation and monads, Information and Computation, 93 1 (1991) [doi:10.1016/0890-5401(91)90052-4, pdf]
On the impact of Moggi (1991):
In the generality of relative monads:
The dual notion of comonads in computer science as modelling contexts:
Tarmo Uustalu, Varmo Vene, Comonadic Notions of Computation, Electronic Notes in Theoretical Computer Science 203 5 (2008) 263-284 [doi:10.1016/j.entcs.2008.05.029]
Tomas Petricek, Dominic Orchard, Alan Mycroft, Coeffects: Unified Static Analysis of Context-Dependence, in: Automata, Languages, and Programming. ICALP 2013, Lecture Notes in Computer Science 7966 Springer (2013) [doi:10.1007/978-3-642-39212-2_35]
and emphasis on the combination of monads, comonads and graded modalities:
Marco Gaboardi, Shin-ya Katsumata, Dominic Orchard, Flavien Breuvart, Tarmo Uustalu, Combining effects and coeffects via grading, ICFP 2016: Proceedings of the 21st ACM SIGPLAN International Conference on Functional Programming (2016) 476–489 [doi:10.1145/2951913.2951939, talk abstract, video rec]
Shin-ya Katsumata, Exequiel Rivas, Tarmo Uustalu, Interaction laws of monads and comonads, LICS (2020) 604-618 [arXiv:1912.13477, doi:10.1145/3373718.3394808]
Textbook accounts of monads as used in actual programming languages such as Haskell:
Further discussion/exposition of the notion and application of (co)monads in computer science:
Philip Wadler, Comprehending Monads, in Conference on Lisp and functional programming, ACM Press, 1990 (pdf)
Stephen Brookes, Shai Geva, Computational Comonads and Intensional Semantics, CMU-CS-91-190 (1991) [pdf]
Philip Wadler, Monads for functional programming, in: M. Broy (ed.), Program Design Calculi, NATO ASI Series 118, Springer (1992) [doi:10.1007/978-3-662-02880-3_8, pdf]
Philip Mulry, Monads in semantics, ENTCS 14 (1998) 275-286.
John Hughes, Section 2 of: Generalising Monads to Arrows, Science of Computer Programming 37 1-3 (2000) 67-111 (pdf)
Robert Harper, Of course ML Has Monads! (2011) (web)
Nick Benton, Categorical Monads and Computer Programming, in: Impact150: stories of the impact of mathematics. London Mathematical Society (2015) [pdf, pdf, doi:10.1112/i150lms/t.0002]
Emily Riehl, A categorical view of computational effects, talk at C$\circ$mp$\circ$se$\colon\!\colon$Conference (2017) [pdf, pdf]
Rob Norris, Functional Programming with Effects, talk at Scala Days 2018 [video: YT]
Tarmo Uustalu, lecture notes for MGS 2021 (2021):
Monads and Interaction Lecture 1 [pdf, pdf]
Monads and Interaction Lecture 2 [pdf, pdf]
Christina Kohl, Christina Schwaiger, Monads in Computer Science (2021) [pdf, pdf]
See also:
Wikipedia, Monad (functional programming)
Wikipedia, Side effect (computer science)
The specification of monads in Haskell is at:
and an exposition of category theory and monads in terms of Haskell is in
On the issue of the CS-style monads really being strong monads:
A comparison of monads with applicative functors (also known as idioms) and with arrows (in computer science) is in
Discussion of comonads in this context includes
Discussion of aspects of quantum computing in terms of monads:
The quantum IO monad:
Exposition:
Implementation in Haskell:
Last revised on March 15, 2023 at 14:08:08. See the history of this page for a list of all contributions to it.