Once beyond the realm of normed vector spaces, the various ways of defining differentiation diverge. This is particularly evident if one considers the slightly stronger notion of continuous differentiability wherein the assignment of the derivative must also be continuous.
One can make a reasonable start by saying that a continuously differentiable function $f \colon E \supseteq U \to F$ must at least be Gâteaux differentiable, and one can add the requirement that the directional derivative at each point be continuous and linear in the direction (this is known as Gâteaux–Lévy differentiability). Thus one obtains a map $D f \colon U \to \mathcal{L}(E,F)$. However, outside the realm of normed vector spaces there is no single canonical topology on $\mathcal{L}(E,F)$, and thus one can come up with a variety of meanings for the phrase “$D f$ is continuous”.
Let us remind ourselves of the situation in finite dimensions.
A function $f \colon \mathbb{R}^m \supseteq U \to \mathbb{R}^n$, where $U$ is open, is said to be continuously differentiable, or of class $C^1$, if there is a continuous map $D f \colon U \to \mathcal{L}(\mathbb{R}^m, \mathbb{R}^n)$ with the property that for each $x \in U$ and $h \in \mathbb{R}^m$

$$ D f(x) h = \lim_{t \to 0} \frac{f(x + t h) - f(x)}{t}. $$
Note that for $x \in U$ and $h \in \mathbb{R}^m$ there is an open interval $(-\epsilon,\epsilon)$ such that $x + t h \in U$ for all $t \in (-\epsilon,\epsilon)$, and so the limit makes sense.
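As a small worked illustration of this finite-dimensional definition (the function below is chosen here purely as an example and is not part of the original discussion), take $f \colon \mathbb{R}^2 \to \mathbb{R}$, $f(x_1,x_2) = x_1^2 x_2$. Expanding the difference quotient in the direction $h = (h_1,h_2)$ gives

$$ \frac{f(x + t h) - f(x)}{t} = 2 x_1 x_2 h_1 + x_1^2 h_2 + t\,(2 x_1 h_1 h_2 + x_2 h_1^2) + t^2 h_1^2 h_2, $$

so that, letting $t \to 0$,

$$ D f(x) h = 2 x_1 x_2 h_1 + x_1^2 h_2 . $$

The resulting map $x \mapsto D f(x) = (2 x_1 x_2,\; x_1^2)$ is continuous into $\mathcal{L}(\mathbb{R}^2,\mathbb{R}) \cong \mathbb{R}^2$, so this $f$ is of class $C^1$. In finite dimensions there is only one reasonable topology on the target of $D f$, which is why no choice is visible here.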
In infinite dimensions the difficulty with extending the standard definition lies in choosing a topology on the space of continuous linear maps, and this becomes more evident with higher derivatives. The definition therefore depends on that choice. Moreover, one need not use a topology at all: the definition also makes sense with a convergence structure on the space of linear maps.
Let $E$ and $F$ be locally convex topological vector spaces. Let $U \subseteq E$ be an open set. Let $\mathcal{L}(E,F)$ be the space of continuous linear maps from $E$ to $F$. Let $\Lambda$ be a convergence structure on $\mathcal{L}(E,F)$, and write $\mathcal{L}_\Lambda(E,F)$ for $\mathcal{L}(E,F)$ equipped with $\Lambda$. A continuous function $f \colon U \to F$ is said to be differentiable of class $C^1_\Lambda$ if there exists a continuous mapping $D f \colon U \to \mathcal{L}_\Lambda(E,F)$, called the derivative of $f$, such that for every $(x,h) \in U \times E$

$$ \lim_{t \to 0} \frac{f(x + t h) - f(x)}{t} = D f(x) h. $$
We define $C^1_\Lambda(U,F)$ to be the set of functions $f \colon U \to F$ which are of class $C^1_\Lambda$.
There are a variety of convergence structures and topologies on $\mathcal{L}(E,F)$ that can be used. Some of them with particular properties are gathered in the list below. For these, the notation is condensed slightly as indicated.
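In the following, given semi-norms $\alpha$ on $E$ and $\beta$ on $F$ we define a semi-norm $\rho_{\beta,\alpha}$ on $\mathcal{L}(E,F)$ (possibly taking the value $\infty$) by

$$ \rho_{\beta,\alpha}(u) = \sup\{\beta(u(x)) : \alpha(x) \le 1\}. $$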
$C^1_\Theta$: the translation-invariant convergence structure with filters:
$\mathcal{F} \to 0$ if there is a semi-norm $\alpha$ on $E$ such that for each semi-norm $\beta$ on $F$ and $\epsilon \gt 0$ there is some $Q \in \mathcal{F}$ with $\sup_{u \in Q} \rho_{\beta,\alpha}(u) \le \epsilon$.
$C^1_\Delta$: (Marinescu’s convergence structure) the colimit in the category of convergence vector spaces of the following family of spaces. Let $\Phi$ denote the family of mappings from the set of continuous semi-norms on $F$ to that on $E$. For $\phi \in \Phi$ define:
$C^1_\Pi$: the compatible convergence structure with filters:
$\mathcal{F} \to 0$ if for each semi-norm $\beta$ on $F$ there is a semi-norm $\alpha$ on $E$ such that for each $\epsilon \gt 0$ there is some $Q \in \mathcal{F}$ with $\sup\{\rho_{\beta,\alpha}(u) | u \in Q\} \le \epsilon$.
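The definitions of $\Theta$ and $\Pi$ differ only in the order of the quantifiers over the semi-norms. Written out schematically (a restatement of the two conditions above, not an additional assumption), $\mathcal{F} \to 0$ means:

$$ \Theta \colon \quad \exists \alpha \;\; \forall \beta \;\; \forall \epsilon \gt 0 \;\; \exists Q \in \mathcal{F} \colon \quad \sup_{u \in Q} \rho_{\beta,\alpha}(u) \le \epsilon, $$

$$ \Pi \colon \quad \forall \beta \;\; \exists \alpha \;\; \forall \epsilon \gt 0 \;\; \exists Q \in \mathcal{F} \colon \quad \sup_{u \in Q} \rho_{\beta,\alpha}(u) \le \epsilon. $$

In $\Theta$ a single $\alpha$ must work for every $\beta$, whereas in $\Pi$ the choice of $\alpha$ may depend on $\beta$. Consequently every filter converging to $0$ in $\Theta$ also converges to $0$ in $\Pi$, so a function of class $C^1_\Theta$ is in particular of class $C^1_\Pi$.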
$C^1_{q b}$: the quasi-bounded convergence structure. This has filters:
$\mathcal{F} \to 0$ if $\mathcal{F}(\mathcal{B}) \to 0$ in $F$ for every quasi-bounded filter $\mathcal{B}$ on $E$.
Recall that a filter $\mathcal{B}$ on $E$ is quasi-bounded if $\mathcal{V} \cdot \mathcal{B} \to 0$ where $\mathcal{V}$ is the neighbourhood filter of $0$ in $\mathbb{R}$.
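Two standard sources of quasi-bounded filters may help to fix ideas. The following is a sketch using only the definition just recalled; the notation $[B]$ for the filter of all supersets of a subset $B \subseteq E$, and $W$ for a neighbourhood of $0$ in $E$, are introduced here for illustration. For the filter $[B]$ one has

$$ \mathcal{V} \cdot [B] \to 0 \quad \Longleftrightarrow \quad \text{for every } W \text{ there is } \delta \gt 0 \text{ with } t B \subseteq W \text{ whenever } |t| \lt \delta \quad \Longleftrightarrow \quad B \text{ is bounded}, $$

so the quasi-bounded filters of the form $[B]$ correspond exactly to the bounded subsets of $E$. Secondly, if $\mathcal{G} \to x$ in $E$ then, by joint continuity of scalar multiplication,

$$ \mathcal{V} \cdot \mathcal{G} \to 0 \cdot x = 0, $$

so every convergent filter is quasi-bounded. Thus convergence in the quasi-bounded structure requires, in particular, uniform convergence on bounded sets as well as convergence along convergent filters.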
$C^1_c$: the continuous convergence structure. This is the coarsest convergence structure on $\mathcal{L}(E,F)$ which makes the evaluation map $\mathcal{L}(E,F) \times E \to F$ continuous.
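In terms of filters, the continuous convergence structure admits the following description. This is the usual characterisation of continuous convergence, written in the notation of the other items of this list; the filter $\mathcal{G}$ is introduced here for illustration. A filter $\mathcal{F}$ on $\mathcal{L}(E,F)$ converges to $u$ if and only if

$$ \mathcal{F}(\mathcal{G}) \to u(x) \text{ in } F \quad \text{for every } x \in E \text{ and every filter } \mathcal{G} \to x \text{ in } E, $$

where $\mathcal{F}(\mathcal{G})$ denotes the filter generated by the sets $\{v(y) : v \in Q,\ y \in G\}$ for $Q \in \mathcal{F}$ and $G \in \mathcal{G}$. This is precisely the statement that the evaluation map sends the product filter $\mathcal{F} \times \mathcal{G}$ to a filter converging to $u(x)$.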
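$C^1_{\mathcal{S}}$: the topology of uniform convergence on a family $\mathcal{S}$ of bounded subsets of $E$, in particular:

$C^1_s$: $\mathcal{S}$ is the family of finite subsets (simple, or pointwise, convergence);

$C^1_k$: $\mathcal{S}$ is the family of compact subsets;

$C^1_{p k}$: $\mathcal{S}$ is the family of precompact subsets;

$C^1_b$: $\mathcal{S}$ is the family of all bounded subsets.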
The relationships between the various definitions of $C^1_?$ are displayed in the following diagram.
Let us extract some particular cases:
An important question to ask of the various definitions of continuously differentiable is whether they satisfy the chain rule. The following result provides the basis for this.
Let $E$, $F$, $G$ be LCTVS, let $U \subseteq E$ and $V \subseteq F$ be open sets, let $\Lambda$ and $\Lambda'$ be convergence structures on $\mathcal{L}(E,F)$ and $\mathcal{L}(E,G)$ respectively.
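Assume that the composition map

$$ \mathcal{L}_\Lambda(E,F) \times \mathcal{L}_k(F,G) \to \mathcal{L}_{\Lambda'}(E,G), \qquad (u,v) \mapsto v \circ u, $$

is continuous.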
Let $f \colon U \to F$ and $g \colon V \to G$ be functions with $f(U) \subseteq V$. Suppose that $f$ is of class $C^1_\Lambda$ and $g$ of class $C^1_k$. Then $g \circ f$ is of class $C^1_{\Lambda'}$.
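The role of the continuity hypothesis on composition can be sketched as follows. This is an outline of the usual argument rather than a full proof: it assumes that the pointwise chain rule has already been established, and it reads $\mathcal{L}_k(F,G)$ as $\mathcal{L}(F,G)$ with the structure for which $g$ is of class $C^1_k$. The pointwise identity is

$$ D(g \circ f)(x) = D g(f(x)) \circ D f(x), \qquad x \in U, $$

so $D(g \circ f)$ factors as

$$ U \;\stackrel{(D f,\; D g \circ f)}{\longrightarrow}\; \mathcal{L}_\Lambda(E,F) \times \mathcal{L}_k(F,G) \;\stackrel{\circ}{\longrightarrow}\; \mathcal{L}_{\Lambda'}(E,G). $$

The first map is continuous because $D f$ is continuous and $D g \circ f$ is a composite of the continuous maps $f$ and $D g$. The second map is continuous by assumption. Hence $D(g \circ f) \colon U \to \mathcal{L}_{\Lambda'}(E,G)$ is continuous, which is what being of class $C^1_{\Lambda'}$ requires beyond the pointwise identity.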
The chain rule holds for each of $C^1_c$, $C^1_{q b}$, $C^1_\Pi$, $C^1_\Delta$, $C^1_\Theta$.
The following partial chain rules also hold:
A minor wrinkle enters the story with higher derivatives: these are multilinear maps $E^n \to F$, and so not only are there different notions of convergence to put on the spaces of such maps, there are also different possible meanings of the statement that a multilinear map is continuous. When dealing with one of the topologies (defined by some family $\mathcal{S}$ of bounded sets), we will end up with derivatives in $\mathcal{H}^n_{\mathcal{S}}(E,F)$ rather than $\mathcal{L}^n_{\mathcal{S}}(E,F)$, in the notation of the page on continuous multilinear operators.
However, we can start with a very weak notion to get the ball rolling. Let $L^n(E,F)$ be the space of all (not necessarily continuous) $n$-linear maps $E^n \to F$. We equip it with the topology of simple convergence.
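Let $E$ and $F$ be LCTVS, $U \subseteq E$ an open subset, $p \in \mathbb{N}$. A function $f \colon U \to F$ is said to be weakly $p$-times differentiable if there exist functions $D^k f \colon U \to L^k(E,F)$ for $k = 0, 1, \dots, p$ such that $D^0 f = f$ (identifying $L^0(E,F)$ with $F$) and for each $k = 0, 1, \dots, p-1$, $x \in U$, and $h, h_1, \dots, h_k \in E$

$$ D^{k+1} f(x)(h, h_1, \dots, h_k) = \lim_{t \to 0} \frac{D^k f(x + t h)(h_1, \dots, h_k) - D^k f(x)(h_1, \dots, h_k)}{t}. $$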
Note that we don’t assume that the $D^k f$ are continuous. If some are continuous then some nice properties ensue. For example, if $D^p f$ is continuous then each $D^k f$ for $k \le p$ is totally symmetric.
From the definition of weakly $p$-times differentiable we can define the various classes of continuously $p$-times differentiable. For the definition of $\mathcal{H}^n_{\mathcal{S}}(E,F)$ see the page on continuous multilinear operator.
Let $E$ and $F$ be LCTVS, $U \subseteq E$ an open subset, $p \in \mathbb{N}$. Let $\mathcal{S}$ be a family of bounded sets in $E$ which covers $E$. A function $f \colon U \to F$ is said to be differentiable of class $C^p_{\mathcal{S}}$ if $f$ is weakly $p$-times differentiable and, for $k = 0,1,\dots, p$, the map $D^k f$ takes values in $\mathcal{H}^k_{\mathcal{S}}(E,F)$ and $D^k f \colon U \to \mathcal{H}^k_{\mathcal{S}}(E,F)$ is continuous.
Using convergence structures, we have:
Let $E$ and $F$ be LCTVS, $U \subseteq E$ an open subset, $p \in \mathbb{N}$. Let $\Lambda$ be one of the convergence structures $\Lambda_c$, $\Lambda_{q b}$, $\Pi$, $\Delta$, or $\Theta$ on the spaces $\mathcal{L}^k(E,F)$ of continuous $k$-linear maps, $k \le p$. A function $f \colon U \to F$ is said to be differentiable of class $C^p_\Lambda$ if $f$ is weakly $p$-times differentiable and, for $k = 0,1,\dots, p$, the map $D^k f$ takes values in $\mathcal{L}^k(E,F)$ and $D^k f \colon U \to \mathcal{L}^k_\Lambda(E,F)$ is continuous.
For fixed $p$, the relationships between the various $C^p_?$ are the same as for $p = 1$. For varying $p$ we have:
Let $E$ and $F$ be LCTVS, $U \subseteq E$ an open set. If $f \colon U \to F$ is of class $C^{p+1}_c$ then $f$ is of class $C^p_\Pi$. Whence we have:
If $E$ is metrisable (resp. Fréchet) and $f$ is of class $C^{p+1}_k$ (resp. $C^{p+1}_s$) then $f$ is of class $C^p_\Pi$.
We make the obvious definition for a smooth function:
Let $\Lambda$ be one of the convergence structures $\Lambda_s$, $\Lambda_k$, $\Lambda_{p k}$, $\Lambda_b$, $\Lambda_c$, $\Lambda_{q b}$, $\Pi$, $\Delta$, or $\Theta$. A function $f \colon U \to F$ is said to be differentiable of class $C^\infty_\Lambda$ if it is of class $C^p_\Lambda$ for all $p \in \mathbb{N}$.
We have the following table of relationships.
Therefore, for Banach spaces all the notions of “smooth function” collapse, whilst for Fréchet spaces all of the notions coarser than $\Pi$ collapse.