nLab para construction

Idea

In deep learning and game theory, we usually think of neural networks and economic agents as processes taking an input $A$ and producing an output $B$. However, we additionally want to model the fact that these processes have extra, "hidden" inputs that are not available to the outside world. In neural networks these are called weights (or parameters), and in game theory they are called strategies.

In other words, we want to form a category in which a morphism $A \to B$ consists of (a) a parameter space $P$ and (b) a morphism $f : P \otimes A \to B$. From this description we see that the construction requires a choice of underlying monoidal category $\mathcal{C}$.
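As a concrete illustration, here is a minimal Python sketch of such a morphism, assuming the base category is sets and functions with the cartesian product as the monoidal product. The names `Para`, `param_space` and `affine` are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class Para:
    """A parameterised map A -> B: a parameter space P together with f : P x A -> B."""
    param_space: str                  # hypothetical human-readable label for P
    f: Callable[[Any, Any], Any]      # f(p, a) returns an element of B


# An affine "neuron": P = R x R (weight, bias), A = B = R.
affine = Para("R x R", lambda p, a: p[0] * a + p[1])

# The parameter p is supplied to f, but it is not part of the A -> B interface.
result = affine.f((2.0, 1.0), 3.0)
print(result)  # 7.0
```

Seen from outside, `affine` is a map from reals to reals; the pair (weight, bias) plays the role of the hidden input.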

Such a morphism might be visualised using the string diagram language of monoidal categories (below, left). However, this notation does not emphasise the special role played by $P$, which is part of the data of the morphism itself. Parameters and data in machine learning have different semantics; by separating them on two different axes, we obtain a graphical language which is more closely tied to these semantics (below, right).

This gives us an intuitive way to compose parameterised maps:

This construction is called $\mathbf{Para}(\mathcal{C})$. It was originally introduced in a specialised form in (Fong, Spivak and Tuyeras 2019), then successively refined in (Gavranovic 2019), (Capucci et al. 2020) and (Cruttwell et al. 2021).

Definition

Let $(\mathcal{C}, I, \otimes)$ be a symmetric monoidal category. Then $\mathbf{Para}(\mathcal{C})$ is a bicategory with the following data:

  • Its 0-cells are the objects of $\mathcal{C}$.
  • A 1-cell $A \to B$ is a choice of a parameter object $P$ of $\mathcal{C}$ and a morphism
    $f : P \otimes A \to B$

    in $\mathcal{C}$.

  • A 2-cell $(P, f) \Rightarrow (Q, g)$ between 1-cells $A \to B$ is a morphism $r : P \to Q$ in $\mathcal{C}$ such that $f = g \circ (r \otimes A)$.
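The 2-cell condition can be read as a reparameterisation. The following Python sketch assumes the base category is sets and functions with the cartesian product; the particular maps `f`, `g`, `r` and the helper `reparameterise` are hypothetical choices made for illustration. Checking the equation on a few sample points is only a sanity check, not a proof.

```python
import math


def g(q, a):
    """g : Q x A -> B, scaling by a positive real q."""
    return q * a


def r(p):
    """r : P -> Q, sending an unconstrained real to a positive real."""
    return math.exp(p)


def f(p, a):
    """f : P x A -> B, the same scaling expressed with parameter p."""
    return math.exp(p) * a


def reparameterise(r, g):
    """Return g . (r x id_A), i.e. g with its parameter precomposed by r."""
    return lambda p, a: g(r(p), a)


# r is a 2-cell (P, f) => (Q, g) when f and g . (r x id_A) agree everywhere.
samples = [(-1.0, 2.0), (0.0, 5.0), (0.5, -3.0)]
is_two_cell = all(f(p, a) == reparameterise(r, g)(p, a) for p, a in samples)
print(is_two_cell)  # True
```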

The sequential composition of two maps $f : P \otimes A \to B$ and $g : Q \otimes B \to C$ is depicted in the animation above. The composite is a $Q \otimes P$-parameterised map defined as

$$Q \otimes P \otimes A \xrightarrow{Q \otimes f} Q \otimes B \xrightarrow{g} C$$
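The composite can be sketched in Python, again assuming sets and functions with the cartesian product. The helper `compose` and the example maps are hypothetical; the parameter of the composite is modelled as a pair `(q, p)`.

```python
def compose(g, f):
    """Composite of f : P x A -> B and g : Q x B -> C, as a map (Q x P) x A -> C."""
    def composite(qp, a):
        q, p = qp                  # split the combined parameter
        return g(q, f(p, a))       # apply f first, then g
    return composite


def f(p, a):
    """Scale the input by the parameter p."""
    return p * a


def g(q, b):
    """Shift the input by the parameter q."""
    return b + q


h = compose(g, f)
print(h((10, 2), 3))  # g(10, f(2, 3)) = 2 * 3 + 10 = 16
```

Composing three maps in the two possible orders yields parameters nested as `(r, (q, p))` and `((r, q), p)`. The results agree only up to re-bracketing, which reflects the fact that composition of 1-cells is associative only up to isomorphism, as expected in a bicategory.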

The construction defined here works in the general setting of actegories.
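As a sketch of what this generalisation looks like, suppose a monoidal category $\mathcal{M}$ acts on $\mathcal{C}$ through a functor written $\bullet$. The symbols $\mathcal{M}$ and $\bullet$ are notation assumed here for illustration and do not appear elsewhere on this page.

```latex
% Assumed notation: (\mathcal{M}, J, \otimes) is a monoidal category acting on
% \mathcal{C} via a functor \bullet : \mathcal{M} \times \mathcal{C} \to \mathcal{C}.

% A 1-cell A \to B is a pair (P, f) with P an object of \mathcal{M} and
\[ f : P \bullet A \to B \]

% The composite of (P, f) with (Q, g), where g : Q \bullet B \to C, uses the
% structure isomorphism of the action to re-bracket the parameters:
\[ (Q \otimes P) \bullet A \xrightarrow{\cong} Q \bullet (P \bullet A)
   \xrightarrow{Q \bullet f} Q \bullet B \xrightarrow{g} C \]
```

Taking $\mathcal{M} = \mathcal{C}$ acting on itself by $\otimes$ recovers the definition given above.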

Properties

$\mathbf{Para}(\mathcal{C})$ is symmetric monoidal when $\mathcal{C}$ is

todo

As an oplax colimit

todo

Examples

When the base category is taken to be a category of optics, $\mathbf{Para}(\mathbf{Optic}(\mathcal{C}))$ recovers the category of neural networks defined in (Capucci et al. 2020).

References

Last revised on July 23, 2021 at 07:51:36.