This page is used as a hub to gather references about category-theoretic work with explicit applications to biology.
Biology can be seen as an umbrella term for a spectrum of scientific domains that all study life at various scales. A range of category-theoretic formalisms have now been proposed to model these scales, from DNA mechanisms to complex systems through protein interactions. This page gathers works that express their results in the language of biology by using concepts from category theory. Further works with potential applications to topics related to biological questions are listed at the end of the page.
This section presents category theoretic models using categories as ways to describe biological mechanisms and relations.
Ontologies are used in genomics to classify categories of observable phenomena such as diseases or gene expressions. A category-theoretic model called “Ologs” was proposed to formalize such ontologies. Ologs are freely generated categories over graphs whose objects and arrows are defined by (English) sentences such that
each object is a noun phrase about a given topic;
each arrow is labeled by a grammatical predicate that starts with the verb "is" and lacks its final grammatical object, so that the source, the arrow and the target together form a semantically correct (English) sentence.
The idea is that each arrow of an olog describes a fact about a given topic.
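The reading of arrows as facts can be illustrated with a minimal sketch. The class `Arrow`, its method `sentence` and the toy objects below are hypothetical names chosen for illustration; they are not taken from the olog literature.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Arrow:
    source: str  # noun phrase labeling the source object
    label: str   # predicate without its final grammatical object
    target: str  # noun phrase labeling the target object

    def sentence(self) -> str:
        """Read the source, the arrow and the target as one sentence."""
        text = f"{self.source} {self.label} {self.target}"
        return text[0].upper() + text[1:] + "."


# Hypothetical toy olog about mutations.
objects = {"a point mutation", "a mutation", "a DNA sequence"}
arrows = [
    Arrow("a point mutation", "is", "a mutation"),
    Arrow("a mutation", "is a change in", "a DNA sequence"),
]

# Every arrow must connect two declared objects.
assert all(a.source in objects and a.target in objects for a in arrows)

facts = [a.sentence() for a in arrows]
for fact in facts:
    print(fact)
```

Each printed line is one fact of the olog; composite arrows of the free category would correspond to chains of such sentences.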
Ologs have been used to characterize hierarchies in biology.
Categories of segments aim to model the set of natural and experimental operations that can be done on DNA segments. The definition of their arrows is simple but flexible enough to express a wide range of biological mechanisms occurring in genetics and related fields.
For every non-negative integer $n$, we will denote the set $\{1,2,\dots,n\}$ as $[n]$. Note that $[0] = \emptyset$.
For every preorder $(\Omega,\preceq)$, we define a segment on $\Omega$ as a tuple $(n_0,n_1,t,c)$ where $n_0$ and $n_1$ are non-negative integers, $t:[n_1] \to [n_0]$ is an order-preserving surjection and $c:[n_0] \to \Omega$ is a function.
If we take the preorder $\mathbf{2} = \{ 0\leq 1\}$, then the following diagram represents a segment $(5,14,t,c)$ on $\mathbf{2}$; the brackets represent the fibers of the order-preserving surjection $t$, while the colors represent the values of the function $c$, which lands in the set $\mathbf{2}$.
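As a minimal sketch of the definition, a segment over $\mathbf{2}$ can be stored as two lists and validated on construction. The class name `Segment`, the method `fibers` and the particular values of $t$ and $c$ are assumptions made for illustration (with $n_1 = 14$ and $n_0 = 5$); they are not read off the diagram.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Segment:
    """A segment (n0, n1, t, c), stored with 0-based lists.

    t[j] is the image in [n0] of the element j + 1 of [n1], and
    c[i] is the element of the preorder assigned to i + 1 in [n0].
    """
    n0: int
    n1: int
    t: Sequence[int]
    c: Sequence[int]

    def __post_init__(self):
        if len(self.t) != self.n1 or len(self.c) != self.n0:
            raise ValueError("t must have length n1 and c must have length n0")
        if any(a > b for a, b in zip(self.t, self.t[1:])):
            raise ValueError("t is not order-preserving")
        if set(self.t) != set(range(1, self.n0 + 1)):
            raise ValueError("t is not a surjection onto [n0]")

    def fibers(self):
        """Return the fibers of t, one list of positions per element of [n0]."""
        return [[j + 1 for j, v in enumerate(self.t) if v == i]
                for i in range(1, self.n0 + 1)]


# Hypothetical segment over 2 = {0 <= 1} with n1 = 14 and n0 = 5.
seg = Segment(
    n0=5, n1=14,
    t=[1, 1, 1, 2, 2, 3, 3, 3, 3, 4, 4, 5, 5, 5],
    c=[1, 0, 1, 1, 0],
)
print(seg.fibers())
```

The five lists returned by `fibers` play the role of the brackets, and the entries of `c` play the role of the colors.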
For every preorder $(\Omega,\preceq)$, we define a morphism of segments from a segment $(n_0,n_1,t,c)$ on $\Omega$ to a segment $(n_0',n_1',t',c')$ on $\Omega$ as a pair $(f_1,f_0)$ where $f_1:[n_1] \to [n_1']$ is an order-preserving injection and $f_0:[n_0] \to [n_0']$ is an order-preserving function such that the relation $c'(f_0(i)) \preceq c(i)$ holds in $(\Omega,\preceq)$ for every $i \in [n_0]$.
We define the category of segments over a preorder $(\Omega,\preceq)$ as the category $\mathbf{Seg}(\Omega)$ whose objects are segments over $\Omega$ and whose arrows are the morphisms of segments between them.
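The conditions on a pair $(f_1,f_0)$ can be checked mechanically. The following is a minimal sketch that tests only the conditions stated in the definition above; the function names, the encoding of segments as tuples of lists and the example data are assumptions made for illustration.

```python
def is_order_preserving(f):
    """f is a list in which f[i] is the image of i + 1."""
    return all(a <= b for a, b in zip(f, f[1:]))


def is_segment_morphism(src, tgt, f1, f0, leq):
    """Check the stated conditions for a pair (f1, f0).

    src and tgt are tuples (n0, n1, t, c) with t and c given as lists;
    f1 and f0 are lists of images; leq(a, b) decides a <= b in the preorder.
    """
    n0, n1, _, c = src
    m0, m1, _, c2 = tgt
    well_typed = (
        len(f1) == n1 and len(f0) == n0
        and all(1 <= v <= m1 for v in f1)
        and all(1 <= v <= m0 for v in f0)
    )
    if not well_typed:
        return False
    injective = len(set(f1)) == len(f1)
    # c'(f0(i)) must lie below c(i) for every i in [n0].
    colors_ok = all(leq(c2[f0[i] - 1], c[i]) for i in range(n0))
    return (is_order_preserving(f1) and injective
            and is_order_preserving(f0) and colors_ok)


def leq(a, b):
    """The preorder 2 = {0 <= 1}."""
    return a <= b


# Hypothetical segments over 2.
src = (2, 3, [1, 1, 2], [1, 1])
tgt = (2, 4, [1, 1, 2, 2], [0, 1])
ok = is_segment_morphism(src, tgt, f1=[1, 2, 3], f0=[1, 2], leq=leq)
bad = is_segment_morphism(tgt, src, f1=[1, 2, 3, 3], f0=[1, 2], leq=leq)
print(ok, bad)
```

In the second call `f1` is not injective, so the pair is rejected.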
We can show that if the preorder $(\Omega,\preceq)$ defines a lattice, then the category $\mathbf{Seg}(\Omega)$ can be equipped with a site structure.
Ologs and biology:
Introductory material for categories of segments:
R Tuyeras, Category Theory for Genetics, arXiv:1708.05255
This section presents category theoretic models taking the form of diagrams. These models can be presented either as functors with properties or as commutative diagrams. Common examples are models for a limit sketch. The first attempt to formalize biological systems in terms of diagrams (with limits) was initiated by R. Rosen (see the references at the end of the page). However, Rosen's work stays quite abstract and does not address any specific biological phenomenon.
In the spirit of Spivak's approach in encoding databases as functors over small categories, functors have been used to organize (and hence model) biological data. One example is that of stock-flow diagrams, which are defined as follows.
Let us denote as $\mathsf{H}$ the free category generated over the following graph:
The previous diagram should be seen as a sketch specifying a structure in which there are links that go from a stock to a flow such that each flow goes from a stock to another stock.
We define a primitive stock-flow diagram as a functor $\mathsf{H} \to \mathbf{FinSet}$ where $\mathbf{FinSet}$ is the category of finite sets and functions.
A stock-flow diagram consists of a primitive stock-flow diagram $F:\mathsf{H} \to \mathbf{FinSet}$ and, for every element $x \in F(\mathrm{flow})$, a continuous function $\mathbb{R}^{U_x} \to \mathbb{R}$ where $U_x$ denotes the finite fiber $F(t)^{-1}(x)$.
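A stock-flow diagram can be sketched as plain dictionaries. The sketch below assumes that $\mathsf{H}$ has objects stock, flow and link with arrows $u,d:\mathrm{flow} \to \mathrm{stock}$, $s:\mathrm{link} \to \mathrm{stock}$ and $t:\mathrm{link} \to \mathrm{flow}$; the names `u`, `d` and `s` are assumptions, as is the reading of the flow functions as rates in a differential equation. The SIR data and the rate constants are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class StockFlowDiagram:
    u: Dict[str, str]  # flow -> stock it leaves (assumed arrow name)
    d: Dict[str, str]  # flow -> stock it enters (assumed arrow name)
    s: Dict[str, str]  # link -> stock it reads (assumed arrow name)
    t: Dict[str, str]  # link -> flow it feeds
    # One function R^{U_x} -> R per flow x, taking a dict indexed by U_x.
    rate: Dict[str, Callable[[Dict[str, float]], float]]

    def fiber(self, flow):
        """U_x: the links whose image under t is the given flow."""
        return sorted(l for l, x in self.t.items() if x == flow)

    def derivative(self, values):
        """Net rate of change of each stock: inflows minus outflows."""
        out = {stock: 0.0 for stock in values}
        for flow, fn in self.rate.items():
            inputs = {l: values[self.s[l]] for l in self.fiber(flow)}
            r = fn(inputs)
            out[self.u[flow]] -= r
            out[self.d[flow]] += r
        return out


# Hypothetical SIR model with made-up rate constants 0.5 and 0.25.
sir = StockFlowDiagram(
    u={"infection": "S", "recovery": "I"},
    d={"infection": "I", "recovery": "R"},
    s={"link_S_inf": "S", "link_I_inf": "I", "link_I_rec": "I"},
    t={"link_S_inf": "infection", "link_I_inf": "infection",
       "link_I_rec": "recovery"},
    rate={
        "infection": lambda v: 0.5 * v["link_S_inf"] * v["link_I_inf"],
        "recovery": lambda v: 0.25 * v["link_I_rec"],
    },
)
rates = sir.derivative({"S": 4.0, "I": 2.0, "R": 0.0})
print(rates)
```

The method `fiber` computes $U_x$ for a flow $x$, so each rate function only sees the stocks that are linked to its flow.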
There is a notion of open stock-flow diagram that can be composed by using the composition of cospans. Stock-flow diagrams have been used to model epidemics and more specifically COVID-19.
A pedigrad is a model for a limit sketch defined on a category of segments. The functors defining pedigrads can land in any type of category. These functors have been used to model genomic data and to design algorithms to study them.
Stock-flow diagrams and Petri nets:
JC Baez, X Li, S Libkind, N Osgood and E Patterson, Compositional modeling with stock and flow diagrams, arXiv:2205.08373
A Baas, J Fairbanks, M Halter, S Libkind and E Patterson, An algebraic framework for structured epidemic modeling, arXiv:2203.16345
JC Baez, BS Pollard, A Compositional Framework for Reaction Networks, arXiv:1704.02051
Pedigrads:
R Tuyeras, Category theory for genetics I: mutations and sequence alignments, Theory and Applications of Categories, Vol. 33, 2018, No. 40, pp 1269-1317, link
R Tuyeras, Category theory for genetics II: genotype, phenotype and haplotype, arXiv:1805.07004
R Tuyeras, A category theoretical argument for causal inference, arXiv:2004.09999
Adjunctions have been used to model pathogens and disease diagnoses, and their corresponding immune responses. For example, denote as $I$ the set of immune responses and as $P$ the set of pathogens and disease symptoms. In practice, we can map a subset of $P$ to a subset of $I$. This defines a binary relation as follows.
We can complete this binary relation into a functor of the following form.
We can then define a functor $F:\mathsf{Sub}(I) \to \mathsf{Sub}(P)$ with the following specification.
If the binary relation $Q$ is defined such that $F$ preserves meets (i.e. intersections), then we can use the adjoint functor theorem to define a left adjoint $L:\mathsf{Sub}(P) \to \mathsf{Sub}(I)$ for the functor $F$.
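For finite sets, the left adjoint can be computed directly as $L(A) = \bigcap \{B \subseteq I : A \subseteq F(B)\}$. The following is a minimal sketch under stated assumptions: the table `triggers` and the particular functor `F` are hypothetical stand-ins (not the specification used on this page, and not biological facts), chosen so that `F` preserves intersections.

```python
from itertools import chain, combinations


def subsets(xs):
    """All subsets of xs, as frozensets."""
    xs = list(xs)
    return [frozenset(c) for c in
            chain.from_iterable(combinations(xs, r) for r in range(len(xs) + 1))]


# Hypothetical toy data: which responses each element of P triggers.
P = frozenset({"virus", "bacterium", "fever"})
I = frozenset({"antibodies", "inflammation", "t_cells"})
triggers = {
    "virus": {"antibodies", "t_cells"},
    "bacterium": {"antibodies", "inflammation"},
    "fever": {"inflammation"},
}


def F(B):
    """Sub(I) -> Sub(P): elements of P whose triggered responses all lie in B."""
    return frozenset(p for p in P if triggers[p] <= B)


def L(A):
    """Left adjoint of F: the meet of all B such that A is contained in F(B)."""
    result = I
    for B in subsets(I):
        if A <= F(B):
            result = result & B
    return result


# Brute-force check of the adjunction: L(A) <= B  iff  A <= F(B).
adjunction_holds = all(
    (L(A) <= B) == (A <= F(B)) for A in subsets(P) for B in subsets(I)
)
print(adjunction_holds)
print(sorted(L(frozenset({"fever"}))))
```

With this choice of `F`, the left adjoint sends a set of pathogens and symptoms to the union of the responses they trigger.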
The previous adjunction gives us a context in which we can reason about immune responses and their triggering pathogens and diseases. Further, the adjunction formalism ensures a certain continuity in time and space regarding the linking of diseases/pathogens to their immune response (as suggested in the following correspondence).
We can embed the previous type of models in time and space by filtering the sets $P$ and $I$ into a sequence (or a category) of subsets containing chronological and spatial occurrences of diseases and their immune responses, respectively. In this case, the corresponding binary relations $Q \subseteq \mathsf{Sub}(P) \times \mathsf{Sub}(I)$ need to be defined for each time and space parameter in a functorial way.
Operads have been used to model phylogenetic trees, see:
A little-disc-operad-inspired formalism was developed to model biological systems and cellular behaviors. This formalism considers gradient descent techniques to make certain subsets $A \subseteq \mathbb{R}^{\times n}$ converge towards algebra-like structures.
More specifically, these gradient descent techniques use the underlying equations of the axioms for algebras as objective functions. The sets $A$ obtained through these optimizations can be interpreted as models for specialization of biological functions and entropic mechanisms in living organisms.
Phylogenies:
Specialization and gradient descent:
This section gathers works in domains that often use category theory as a language for logical discourse.
Algebraic topology techniques and intuitions developed along with persistent homology have been used to study neuronal morphologies.
Additionally, the dimension-reduction algorithm UMAP, commonly used in computational biology to visualize and cluster sequencing data, relies on properties of fuzzy simplicial sets. For more detail, see the following blog post:
The concept of singularities, in algebraic geometry, has been used to model anatomic morphologies and behaviors.
Additionally, various concepts of algebraic geometry such as manifolds and moduli spaces have been used by M Gromov to model molecular mechanisms in living cells.
Persistent homology:
L Kanari, P Dłotko, M Scolamiero, R Levi, J Shillcock, K Hess, H Markram (2016), Quantifying topological invariants of neuronal morphologies, arXiv:1603.08432
Y Lee, SD Barthel, P Dłotko, SM Moosavi, K Hess, B Smit (2017), Pore-geometry recognition: on the importance of quantifying similarity in nanoporous materials, arXiv:1701.06953
Fuzzy simplicial sets:
Algebraic geometry:
EC Zeeman, Catastrophe Theory, Scientific American, April 1976; pp. 65–70, 75–83, pdf
N Baas has proposed hyperstructures to describe hierarchical organizations in biology. While these structures were originally designed to organize extended cobordism structures, they are argued to also be appropriate for modeling multilevel systems in biology. Note that, in contrast to multilevel structures such as $n$-categories, hyperstructures offer more freedom in that biological processes are not necessarily oriented as globular arrows but, instead, appear to be organized as “aggregates” with bonds.
The idea behind applying hyperstructures to biology is that they allow us to consider some set $X_0$ of “agents” such that any subset $S \subseteq X_0$ can define an aggregate when it is “labeled” by an explanation (or description) $\omega$ for that aggregation.
The pair $(S,\omega)$ can then be represented by another label, say $\beta$, that can classify a collection of pairs sharing similarities.
The previous construction can then be repeated recursively on the set of labels $\beta$. This set, call it $X_1$, could potentially be the set of all the biological organs in an individual. Then, the next level $X_2$ (obtained from $X_1$ by following the previous procedure) can be the set of all organ systems, which can subsequently be organized as bodies on a fourth level $X_3$.
Note that hyperstructures also require compatibility properties between the labels. In particular, for each level $X_k$, the pairs $(S,\omega)$ should be organized into a Grothendieck construction $\int \Omega$ such that the mappings $(S,\omega) \mapsto \beta$ define a functor $\int \Omega \to \mathbf{Set}$.
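The recursive construction of levels can be illustrated with a minimal sketch. The function names and the toy data below (cells, organs, organ systems) are hypothetical, and the compatibility conditions between labels, including the Grothendieck construction, are not modeled.

```python
def check_level(agents, bonds):
    """Every aggregate S must be a subset of the current set of agents."""
    return all(S <= agents for (S, _omega) in bonds)


def next_level(bonds):
    """Collect the labels beta of a family of labeled aggregates.

    bonds maps each pair (S, omega), with S a frozenset of agents and
    omega a description, to its label beta.
    """
    return set(bonds.values())


# Hypothetical level X0 of agents (cells) and its labeled aggregates.
X0 = {"cardiomyocyte", "pacemaker cell", "endothelial cell", "smooth muscle cell"}
bonds0 = {
    (frozenset({"cardiomyocyte", "pacemaker cell"}), "contract together"): "heart",
    (frozenset({"endothelial cell", "smooth muscle cell"}), "form a vessel wall"): "artery",
}
assert check_level(X0, bonds0)
X1 = next_level(bonds0)

# The same procedure applied to X1 yields the next level X2.
bonds1 = {
    (frozenset({"heart", "artery"}), "circulate blood"): "circulatory system",
}
assert check_level(X1, bonds1)
X2 = next_level(bonds1)
print(sorted(X1), sorted(X2))
```

Each call to `next_level` turns the labels of one level into the agents of the next one, which mirrors the passage from $X_0$ to $X_1$ and from $X_1$ to $X_2$.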
Bigraphs are a type of hypergraph-based structure that can be linked to process calculi such as the $\lambda$-calculus, the $\pi$-calculus and its stochastic variant. Each of these calculi has been used to model biological systems.
Interestingly, it was shown that the $\pi$-calculus can be interpreted within a 2-category-theoretic setting.
A number of bigraph-based formalisms have been proposed to model complex systems, some with a category theoretic flavor. For example, stochastic bigraphs and their compositions have been used to model membrane budding in a biological system.
Hyperstructures:
NA Baas, On the Philosophy of Higher Structures, arXiv:1805.11943
NA Baas, On Higher Structures, arXiv:1509.00403
NA Baas, Extended Memory Evolutive Systems in a Hyperstructure Context. Axiomathes 19, 215–221 (2009), link
Bigraphs:
R Milner, Bigraphs and Their Algebra, Electronic Notes in Theoretical Computer Science, Volume 209, 24 April 2008, Pages 5-19, link
J Krivine, R Milner, A Troina, Stochastic Bigraphs, Electronic Notes in Theoretical Computer Science Volume 218, 22 October 2008, Pages 73-9, link
Relations between $\lambda$-calculus, $\pi$-calculus and biology:
S Federhen, Replication is Recursion; or, Lambda: the Biological Imperative, bioRxiv
A Regev, W Silverman, E Shapiro, Representation and simulation of biochemical processes using the pi-calculus process algebra, Pacific Symposium on Biocomputing 6:459-470 (2001), pdf
See the blog post by John Baez: Biology and the Pi-Calculus
M Stay, LG Meredith, Higher category models of the pi-calculus, arXiv:1504.04311
A general discussion on using category theory for biology can be found on the $n$-Category Café:
Phylogenomics:
Cell biology:
Systems biology:
R Rosen, The representation of biological systems from the standpoint of the theory of categories, Bulletin of Mathematical Biophysics, Vol 20, 1958, pdf
IC Baianu, JF Glazebrook and R Brown, A Category Theory And Higher Dimensional Algebra Approach To Complex Systems Biology, Meta-systems And Ontological Theory Of Levels: Emergence Of Life, Society, Human Consciousness And Artificial Intelligence, text
Neuroscience:
According to Mikhail Gromov, the mathematical structures nearest to what is happening in the mind are n-categories. See his talk: Ergologic and Interfaces Between Languages
AC Ehresmann and PL Simeonov, WLIMES, the wandering LIMES: Towards a theoretical framework for wandering logic intelligence memory evolutive systems. In P. L. Simeonov, L. S. Smith, and A. C. Ehresmann, editors, Integral Biomathics: Tracing the Road to Reality. Springer-Verlag, 2012.
AC Ehresmann and J-P Vanbremeersch, The memory evolutive systems as a model of Rosen’s organisms. Axiomathes, 16:165–214, 2006.
AC Ehresmann and J-P Vanbremeersch. Memory Evolutive Systems: Hierarchy, Emergence, Cognition, volume 4 of Studies in Multidisciplinarity. Elsevier, 2007.
AC Ehresmann, N Baas, and J-P Vanbremeersch. Hyperstructures and memory evolutive systems. Intern. J. Gen. Sys., 33(5):553–568, 2004.
D Pastor, E Beurier, AC Ehresmann, R Waldeck, Interfacing biology, category theory and mathematical statistics, pdf
Last revised on March 24, 2023 at 12:24:32.