See also differentiation.
A function (map) is differentiable at some point if it can be well approximated by a linear map near that point. The approximating linear maps at different points together form the derivative of the map.
One may then ask whether the derivative itself is differentiable, and so on. This leads to a hierarchy of ever more differentiable maps, starting with continuous maps and progressing through maps that are $n$ times (continuously) differentiable to those that are infinitely differentiable, and finally to those that are analytic. Infinitely differentiable maps are sometimes called smooth.
Differentiability is first defined directly for maps between (open subsets of) a Cartesian space. These differentiable maps can then be used to define the notion of differentiable manifold, and then a more general notion of differentiable map between differentiable manifolds, forming a category called Diff. We have a parallel hierarchy of ever more differentiable manifolds and ever more differentiable maps between them. Since every more differentiable manifold has an underlying less differentiable manifold, we may always consider maps that are less differentiable than the manifolds between which they run.
In all of the following $\mathbb{R}$ denotes the real numbers and $\mathbb{R}^n$, for $n \in \mathbb{N}$, their $n$-fold Cartesian product. For $i \in \{1, \cdots, n\}$ we write
$$pr_i \;\colon\; \mathbb{R}^n \longrightarrow \mathbb{R}$$
for the projection map onto the $i$th factor.
When considering convergence of sequences of elements of these sets we regard them as Euclidean metric spaces with the Euclidean norm
$${\Vert x \Vert} \coloneqq \sqrt{\sum_{i=1}^n (x_i)^2} \,.$$
The open subsets of the corresponding metric topology are the unions of open balls in $\mathbb{R}^n$.
(differentiation of real-valued functions on Cartesian space)
Let $n \in \mathbb{N}$ and let $U \subset \mathbb{R}^n$ be an open subset.
Then a function $f \;\colon\; U \longrightarrow \mathbb{R}$ is called differentiable at $x\in U$ if there exists a linear map $d f_x : \mathbb{R}^n \to \mathbb{R}$ such that the following limit exists and equals zero as $h$ approaches zero “from all directions at once”:
$$\lim_{h\to 0} \frac{f(x+h)-f(x) - d f_x(h)}{\Vert h\Vert} = 0 \,.$$
This means that for all $\epsilon \in (0,\infty)$ there exists an open subset $V\subseteq U$ containing $x$ such that whenever $x+h\in V$ and $h \neq 0$ we have $\left\vert\frac{f(x+h)-f(x) - d f_x(h)}{\Vert h\Vert}\right\vert \lt \epsilon$.
We say that $f$ is differentiable on a subset $I$ of $U$ if $f$ is differentiable at every $x\in I$, and differentiable (tout court) if $f$ is differentiable on all of $U$. We say that $f$ is continuously differentiable if it is differentiable and $d f$ is a continuous function.
The map $d f_x$ is called the derivative or differential of $f$ at $x$.
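The limit condition in def. 1 can be probed numerically. The following is a minimal sketch, not part of the definition: the sample function $f(x,y) = x^2 y + \sin y$, the base point and the direction are all hypothetical choices made for illustration, and the candidate differential is assembled from hand-computed partial derivatives.

```python
import math

def f(p):
    """Hypothetical sample function f(x, y) = x^2 * y + sin(y)."""
    x, y = p
    return x * x * y + math.sin(y)

def df(p, h):
    """Candidate differential df_p(h), built from hand-computed partial derivatives."""
    x, y = p
    return 2 * x * y * h[0] + (x * x + math.cos(y)) * h[1]

def remainder_ratio(p, h):
    """|f(p+h) - f(p) - df_p(h)| / ||h||, the quantity that must tend to 0."""
    q = (p[0] + h[0], p[1] + h[1])
    return abs(f(q) - f(p) - df(p, h)) / math.hypot(h[0], h[1])

p = (1.0, 2.0)             # arbitrary base point
direction = (0.6, -0.8)    # arbitrary unit vector

# Shrink h along the chosen direction and record the ratio each time.
ratios = [remainder_ratio(p, (t * direction[0], t * direction[1]))
          for t in (1e-1, 1e-2, 1e-3)]
print(ratios)
```

The printed ratios shrink roughly in proportion to $\Vert h\Vert$. A numerical experiment along one direction is only evidence, of course; the definition requires the limit over all $h \to 0$.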
(Notation and Terminology)
If $n=1$, as in classical one-variable calculus, then $d f_x$ in def. 1 may be identified with a real number, and that number is also called the derivative of $f$ at $x$ and often written $f'(x)$. (In that case, the notation $d f$ is generally still reserved for the corresponding linear map, with its input denoted by $d x$, so that we have $d f = f'(x) d x$.)
(equivalent formulation)
An equivalent way to state def. 1 is to say that
$$f(x+h) = f(x) + d f_x(h) + E(h)\,\Vert h\Vert$$
where $E$ is a function such that $\lim_{h\to 0}E(h) = 0$. This is easy to see; just let $E(h) = \frac{f(x+h)-f(x) - d f_x(h)}{\Vert h\Vert}$.
Another equivalent way to say it is that
$$f(x+h) = f(x) + d f_x(h) + \sum_{i=1}^n E_i(h)\, h_i$$
where $E_i$ are functions such that $\lim_{h\to 0}E_i(h) = 0$. For if this is true, then $E(h) = \frac{1}{\Vert h\Vert}(E_1(h) h_1 + \cdots + E_n(h) h_n)$ satisfies the previous definition. Conversely, if the previous definition holds, then defining $E_i(h) = \frac{h_i}{\Vert h \Vert} E(h)$ satisfies this definition.
A weaker notion of differentiability is the following:
Let $n \in \mathbb{N}$ and $U \subset \mathbb{R}^n$ an open subset.
Then a function $f \colon U \longrightarrow \mathbb{R}$ is said to have a directional derivative in the direction of $v \in \mathbb{R}^n$ at $x\in U$ if the limit
$$\lim_{h\to 0} \frac{f(x + h v) - f(x)}{h}$$
exists. Here $h$ is just a real number.
Historically, the term ‘directional derivative’ was reserved for the case where $v$ is a unit vector (or else the derivative above was divided by $\|v\|$). The general concept involves less structure and is more important, but it has no other established name. If $v$ is a standard basis vector $e_i$, then the directional derivative is called a partial derivative with respect to the corresponding coordinate, and often written $\frac{\partial f}{\partial x_i}$ or $f_{x_i}$.
If $f$ is differentiable at $x$ in the sense of def. 1, then $d f_x(v)$ is its directional derivative along $v$. In particular, the coordinates of $d f_x$ are the partial derivatives of $f$.
In general, $f$ may have all partial derivatives, and even all directional derivatives, without being differentiable; see the examples below. However, if $f$ has all partial derivatives and they are continuous as functions of $x$, then in fact $f$ is differentiable (and indeed continuously differentiable).
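As a small illustration of the relation between def. 2 and def. 1, the sketch below approximates a directional derivative by its defining difference quotient and compares it with the value of the differential on the same vector. The sample function $f(x,y) = e^x y$, the base point and the vector $v$ are hypothetical choices, and the partial derivatives are computed by hand.

```python
import math

def f(x, y):
    """Hypothetical sample function f(x, y) = exp(x) * y."""
    return math.exp(x) * y

def directional_quotient(x, y, v, h):
    """Difference quotient (f(p + h v) - f(p)) / h from the definition."""
    return (f(x + h * v[0], y + h * v[1]) - f(x, y)) / h

def differential(x, y, v):
    """df_p(v) = (df/dx) v_1 + (df/dy) v_2 with hand-computed partials."""
    return math.exp(x) * y * v[0] + math.exp(x) * v[1]

p = (0.0, 1.0)
v = (3.0, -2.0)   # deliberately not a unit vector

approx = directional_quotient(p[0], p[1], v, 1e-6)
exact = differential(p[0], p[1], v)
print(approx, exact)
```

The two numbers agree up to an error of the order of the step $h$, as expected for a differentiable function. No normalization of $v$ is needed, in line with the general notion of directional derivative used here.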
(differentiation of functions between Cartesian spaces)
Let $n_1, n_2 \in \mathbb{N}$ and let $U\subseteq \mathbb{R}^{n_1}$ be an open subset.
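Then a function $f \;\colon\; U \longrightarrow \mathbb{R}^{n_2}$ is differentiable if for all $i \in \{1, \cdots, n_2\}$ the component function
$$f_i \;\colon\; U \overset{f}{\longrightarrow} \mathbb{R}^{n_2} \overset{pr_i}{\longrightarrow} \mathbb{R}$$
(the composite of $f$ with the projection $pr_i$ onto the $i$th factor) is differentiable in the sense of def. 1.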
In this case, the derivatives $d f_i \colon \mathbb{R}^{n_1} \to \mathbb{R}$ of the $f_i$ assemble into a linear map of the form
$$d f_x \;\colon\; \mathbb{R}^{n_1} \longrightarrow \mathbb{R}^{n_2} \,.$$
For $X$ and $Y$ differentiable manifolds, a function $f \colon X \longrightarrow Y$ is called differentiable if, for $\left\{ \mathbb{R}^{n} \underoverset{\simeq}{\phi_i}{\to} U_i \subset X\right\}_{i \in I}$ an atlas for $X$ and $\left\{ \mathbb{R}^{n'} \underoverset{\simeq}{\psi_j}{\to} V_j \subset Y\right\}_{j \in J}$ an atlas for $Y$, for all $i \in I$ and $j \in J$ the function
$$\mathbb{R}^n \supset \; (f\circ \phi_i)^{-1}(V_j) \overset{\phi_i}{\longrightarrow} f^{-1}(V_j) \overset{f}{\longrightarrow} V_j \overset{\psi_j^{-1}}{\longrightarrow} \mathbb{R}^{n'}$$
is differentiable in the sense of def. 3.
For $U \subset \mathbb{R}^n$ an open subset and $f \;\colon\; U\to \mathbb{R}^m$ a differentiable function (def. 3), we may regard its differential $d f$ as a function from a subset of $U$ (the points where $f$ is differentiable) to the space $L(\mathbb{R}^n,\mathbb{R}^m)$ of linear maps. Since $L(\mathbb{R}^n,\mathbb{R}^m) \cong \mathbb{R}^{n m}$ is again a Cartesian space, we may then ask whether $d f$ is differentiable.
We can then iterate, obtaining the following hierarchy of differentiability. Because iterated differentiability by itself is not very useful, and a differentiable map is necessarily continuous, one generally includes continuity of the last assumed derivatives.
The map $f$ is continuous or $C^0$ if it is a continuous map between underlying topological spaces. We begin with this since a differentiable map is necessarily continuous.
The map $f$ is continuously differentiable or $C^1$ on $U$ if it is differentiable at all points of $U$ and the resulting map $d f$ is continuous. (At this point, if we generalize to infinite-dimensional spaces, we get a variety of notions of when the differential is ‘continuous’; see continuously differentiable map for discussion.)
The map $f$ is twice differentiable if it is differentiable and its derivative $d f$ is differentiable. A twice differentiable map must be continuously differentiable. Similarly, $f$ is twice continuously differentiable or $C^2$ if it is twice differentiable and the second derivative $d d f$ is continuous.
By recursion, $f$ is $n + 1$ times differentiable if $f$ is $n$ times differentiable and the $n$th derivative of $f$ is differentiable. Similarly, $f$ is $n$ times continuously differentiable or $C^n$ if $f$ is $n$ times differentiable and the $n$th derivative of $f$ is continuous.
The map $f$ is smooth or infinitely differentiable or $C^\infty$ if it is $n$ times differentiable for all $n$, or equivalently if it is $C^n$ for all $n$. (There is no difference between infinite differentiability and infinite continuous differentiability.) One may also define this notion coinductively: $f$ is infinitely differentiable if it is differentiable and its derivative $d f$ is infinitely differentiable.
One step higher, we may ask whether $f$ is analytic or $C^\omega$.
If $f:U\to \mathbb{R}^m$ is twice differentiable with $U\subseteq \mathbb{R}^n$, its second derivative
$$d(d f) \;\colon\; U \longrightarrow L(\mathbb{R}^n, L(\mathbb{R}^n,\mathbb{R}^m))$$
is a function from $U$ into the space of bilinear maps from $\mathbb{R}^n\times \mathbb{R}^n$ to $\mathbb{R}^m$.
If $f:U\to \mathbb{R}^m$ is twice differentiable, then its second derivative $d(d f)$ lands in the space of symmetric bilinear maps, i.e. for any $x\in U$ and $v,w\in \mathbb{R}^n$ we have
$$d(d f)_x(v,w) = d(d f)_x(w,v) \,.$$
It suffices to assume $m=1$; otherwise we just consider it componentwise. Define a function $g:\mathbb{R}\to \mathbb{R}$ by
Then by the chain rule, $g$ is differentiable and
where $E_1, E_2, E \to 0$ as $(v,w)\to 0$. Now the mean value theorem tells us that for some $\xi\in(0,1)$ we have
But the “second-order difference” $f(x+v+w) - f(x+v) - f(x+w) + f(x)$ is manifestly symmetric in $v$ and $w$, so we have
where $\lim_{(v,w)\to 0} E = 0$. But bilinearity of the LHS then implies that it is identically zero.
The components of the bilinear map $d(d f)$ are the second-order partial derivatives $\frac{\partial^2 f}{\partial x_i \partial x_j}$ of $f$. Thus, this theorem says that if $f$ is twice differentiable, then the mixed partials are equal,
$$\frac{\partial^2 f}{\partial x_i \partial x_j} = \frac{\partial^2 f}{\partial x_j \partial x_i} \,.$$
The second-order partial derivatives may exist without the mixed partials being equal; see below for a counterexample. However, the theorem shows that if we require $f$ to actually be twice differentiable rather than merely having second-order partial derivatives, then this cannot happen.
In particular, we have the following corollary, which is more commonly found in textbooks.
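If $f$ has first and second-order partial derivatives, and the latter are continuous in a neighborhood of $x$, then the mixed partial derivatives are equal,
$$\frac{\partial^2 f}{\partial x_i \partial x_j} = \frac{\partial^2 f}{\partial x_j \partial x_i} \,.$$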
Continuity of the second-order partials implies that the first-order partials are differentiable, and hence so is the differential $d f$.
Note that the proof of the theorem implies that if $f$ is twice differentiable at $x$, then there exists a bilinear map $\partial^2 f_x : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}^m$ such that
where $\lim_{v,w\to 0} E(v,w) = 0$. This is a condition that makes sense for an arbitrary $f$ without assuming differentiability, and if it holds, then the bilinear map $\partial^2 f_x$ must be symmetric.
Moreover, as explained here, if $f$ is differentiable in a neighborhood of $x$ and satisfies this condition at $x$, then it is in fact twice differentiable at $x$. To see this, it suffices to show that each coordinate of $d f$ is differentiable, so let $w=\delta e_i$, with $e_i$ a unit basis vector and $\delta$ a real number $\neq 0$. Then we have
Taking the limit as $\delta\to 0$ and using differentiability of $f$ at $x$ and $x+v$ (for sufficiently small $v$), we get
Now take the limit as $v\to 0$; on the left we get $0$ by assumption, so the function $y\mapsto d f_y(e_i)$ is differentiable at $x$ with derivative $\partial^2 f_x(-,e_i)$. Thus, $d f$ is differentiable at $x$.
On the other hand, this condition by itself does not even imply that $f$ is continuous. For instance, if $f$ is a $\mathbb{Q}$-linear map $\mathbb{R}\to \mathbb{R}$, then the second-order difference $f(x+v+w) - f(x+v) - f(x+w) + f(x)$ is identically zero.
Let $f:U\to \mathbb{R}$ be differentiable, with $U\subseteq \mathbb{R}^n$ open. Instead of asking whether $d f : U \to L(\mathbb{R}^n,\mathbb{R})$ is differentiable, we can ask whether its exponential transpose $d f : U\times \mathbb{R}^n \to \mathbb{R}$ is differentiable. (Note that $U\times \mathbb{R}^n$ is the tangent bundle of $U\subseteq \mathbb{R}^n$.) This amounts to asking that for $x\in U$ and $v\in \mathbb{R}^n$, we have
$$d f_{x+w}(v+h) = d f_x(v) + d^2 f_{(x,v)}(w,h) + E(w,h)\,\Vert (w,h)\Vert$$
for a linear map $d^2 f_{(x,v)}:\mathbb{R}^{2n} \to \mathbb{R}$, where $\lim_{(w,h)\to 0} E = 0$. Setting $h=0$, we see that this implies that $d f : U \to L(\mathbb{R}^n,\mathbb{R})$ is differentiable, with differential given by $d(d f)_x(w,v) = d^2 f_{(x,v)}(w,0)$ for any $v$. And setting $w=0$, we obtain $d f_x(h) = d^2 f_{(x,v)}(0,h)$ for any $v$. Thus, we can write
$$d^2 f_{(x,v)}(w,h) = \partial^2 f_x(v,w) + d f_x(h)$$
where $\partial^2 f_x$ is the symmetric bilinear map from the previous section. Conversely, if $f$ is twice differentiable, then using linearity and continuity of $d f$ it is easy to see that the above condition holds.
Therefore, these two kinds of twice-differentiability of $f$ are equivalent as conditions on $f$, but the resulting second differential is different. In the second case, we get
$$d^2 f = \sum_{i,j} \frac{\partial^2f}{\partial x_i \partial x_j} d x_i \, d x_j + \sum_i \frac{\partial f}{\partial x_i} d^2 x_i$$
rather than merely the first term $\sum_{i,j} \frac{\partial^2f}{\partial x_i \partial x_j} d x_i \, d x_j$. There are two advantages to the second approach (asking that $d f : U\times \mathbb{R}^n \to \mathbb{R}$ be differentiable).
Firstly, we can reformulate it in terms of $f$ by asking that there exists a linear map $d f_x : \mathbb{R}^n \to \mathbb{R}^m$ and a bilinear map $\partial^2 f_x : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}^m$ such that
where $\lim_{v,w,h \to 0} E(v,w,h) = 0$. This holds if $f$ is twice differentiable, for then we can write
and $d(d f)_x(v+w,h)$ can also be incorporated into the error term. Of course, setting $h=0$ we obtain the characterization of twice-differentiability from the previous section. But setting $v=w=0$, we find that $f$ is differentiable at $x$ with derivative $d f_x$. So here we have a direct characterization of the second derivative which also implies that the first derivative exists (although it seems that it doesn’t imply differentiability in a neighborhood of $x$, so that the resulting “second derivative” may not actually be the derivative of the first derivative).
Secondly, the virtue of a second differential incorporating the first derivatives is that like the first differential $d f$, but unlike the bilinear map $\partial^2 f$, it satisfies Cauchy's invariant rule. This means that we can express the chain rule for second differentials of composite maps simply by substitution: if $y = f(u)$ and $u = g(x)$, then finding $d^2 y$ in terms of $d u$ and $d^2 u$, and $d u$ and $d^2 u$ in terms of $d x$ and $d^2 x$, and substituting, gives the correct expression for $d^2 y$ in terms of $d x$ and $d^2 x$.
In fact, this can be proven using the above direct characterization of the second differential, in essentially the same way that we prove the ordinary chain rule for first derivatives. We sketch the proof for a composite $g \circ f$, omitting the explicit error terms. We can write
$$g(f(x+v+w+h)) - g(f(x+v)) - g(f(x+w)) + g(f(x))$$
as
$$g(f(x)+v'+w'+h') - g(f(x)+v') - g(f(x)+w') + g(f(x))$$
where $v' = f(x+v) - f(x)$ and $w' = f(x+w) - f(x)$ and $h' = f(x+v+w+h) - f(x+v) - f(x+w) + f(x)$. Now by extra-strong twice differentiability of $g$, this is approximately equal to
$$\partial^2 g_{f(x)}(v',w') + d g_{f(x)}(h') \,.$$
But by differentiability of $f$, we have $v' \approx d f_x(v)$ and $w' \approx d f_x(w)$, while by extra-strong twice differentiability of $f$ we have $h' \approx \partial^2 f_x(v,w) + d f_x(h)$. Thus, we obtain approximately
$$\partial^2 g_{f(x)}(d f_x(v), d f_x(w)) + d g_{f(x)}(\partial^2 f_x(v,w)) + d g_{f(x)}(d f_x(h))$$
which is exactly what we would get by substitution.
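In one variable the substitution rule can be checked numerically. The sketch below is only an illustration under stated assumptions: the sample functions $f(x) = x^2$ and $g(u) = \sin u$ and the point $x = 0.7$ are hypothetical choices, and the composite is taken in the order $g \circ f$ used in the proof sketch. Substituting $d u = f'(x)\, d x$ and $d^2 u = f''(x)\, d x^2 + f'(x)\, d^2 x$ into $d^2 y = g''(u)\, d u^2 + g'(u)\, d^2 u$ predicts the coefficients of $d x^2$ and $d^2 x$, which are then compared with finite-difference derivatives of the composite.

```python
import math

def first_derivative(fn, x, h=1e-5):
    """Central difference approximation of fn'(x)."""
    return (fn(x + h) - fn(x - h)) / (2 * h)

def second_derivative(fn, x, h=1e-4):
    """Central difference approximation of fn''(x)."""
    return (fn(x + h) - 2 * fn(x) + fn(x - h)) / (h * h)

x = 0.7
u = x * x                               # u = f(x) with f(x) = x^2
f1, f2 = 2 * x, 2.0                     # f'(x), f''(x), computed by hand
g1, g2 = math.cos(u), -math.sin(u)      # g'(u), g''(u) for g = sin

# Coefficients obtained purely by substitution into d^2 y.
coeff_dx2 = g2 * f1 ** 2 + g1 * f2      # multiplies dx^2
coeff_d2x = g1 * f1                     # multiplies d^2 x

def composite(t):
    """The composite y = g(f(t)) = sin(t^2)."""
    return math.sin(t * t)

numeric_dx2 = second_derivative(composite, x)
numeric_d2x = first_derivative(composite, x)
print(coeff_dx2, numeric_dx2)
print(coeff_d2x, numeric_d2x)
```

The coefficient of $d x^2$ matches the second derivative of the composite and the coefficient of $d^2 x$ matches its first derivative, which is what Cauchy's invariant rule for the second differential asserts in this case.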
If $X$ and $Y$ are $C^k$-differentiable manifolds, then we may define what it means for a map $f:X\to Y$ to be $n$ times differentiable, or $C^n$, for any $n\le k$, by asking that it yield such a map when restricted to any charts. The most common case is when $X$ and $Y$ are smooth (infinitely differentiable) manifolds, so that we can define $C^n$ functions between them for all $n\le \infty$. (Analytic manifolds, which are necessary in order to define analyticity of $f$, are somewhat rarer.)
A differentiable map between manifolds induces a map between their tangent bundles $d f : T X \to T Y$; this operation extends to a functor from $C^{k+1}$ manifolds and $C^{k+1}$ maps to $C^k$ manifolds and $C^k$ maps. See differentiation for more.
The function $f:\mathbb{R}^2 \to \mathbb{R}$ defined by
$$f(x,y) = \begin{cases} \frac{y^3}{x^2+y^2} & (x,y) \neq (0,0) \\ 0 & (x,y) = (0,0) \end{cases}$$
is continuous everywhere, and has directional derivatives (def. 2) at $(0,0)$ in all directions, but is not differentiable at $(0,0)$ in the sense of def. 1. Note that the directional derivative along the line $y=m x$, in the direction $(1,m)$, is $\frac{m^3}{1+m^2}$, which is not a linear function of the direction.
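The failure of linearity can be seen numerically. This sketch assumes that the function in question is $f(x,y) = y^3/(x^2+y^2)$ with $f(0,0) = 0$, which is the homogeneous function matching the stated directional derivative $\frac{m^3}{1+m^2}$; the helper names are hypothetical.

```python
def f(x, y):
    """Assumed example: y^3 / (x^2 + y^2), extended by 0 at the origin."""
    if x == 0.0 and y == 0.0:
        return 0.0
    return y ** 3 / (x * x + y * y)

def directional_derivative_at_origin(v, h=1e-6):
    """Difference quotient (f(h v) - f(0)) / h at the origin."""
    return (f(h * v[0], h * v[1]) - f(0.0, 0.0)) / h

d_e1 = directional_derivative_at_origin((1.0, 0.0))
d_e2 = directional_derivative_at_origin((0.0, 1.0))
d_sum = directional_derivative_at_origin((1.0, 1.0))   # direction e1 + e2

print(d_e1, d_e2, d_sum)
```

The values come out close to $0$, $1$ and $\tfrac{1}{2}$ respectively. If the directional derivatives were the values of a linear map, the third would have to equal the sum of the first two, so no linear map $d f_{(0,0)}$ can reproduce them.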
One may wonder whether existence of a linear map $d f_x$ is enough, but this is also not the case. The function $f:\mathbb{R}^2 \to \mathbb{R}$ defined by
has all directional derivatives at $(0,0)$ equaling $0$, so that in particular there is a linear map $d f_{(0,0)}$ whose values are the directional derivatives. But it is not differentiable at $(0,0)$; in fact, it is not even continuous at $(0,0)$. (Thus it also provides an example of a discontinuous function which has all directional derivatives.)
Mike Shulman: Is there a function that is continuous in a neighborhood of $(0,0)$ and has all directional derivatives there equaling $0$, but is not differentiable?
Let $k$ be a natural number, and consider the function
$$f_k(x) \coloneqq x^k \sin(1/x)$$
from the real line to itself, with $f_k(0) \coloneqq 0$. Away from $0$, $f_k$ is smooth (even analytic); but at $0$, $f_0$ is not continuous, $f_1$ is continuous but not differentiable, $f_2$ is differentiable but not continuously differentiable, and so on:
* If $k = 2 n$ is even, then $f_k$ is differentiable $n$ times but not continuously differentiable $n$ times;
* If $k = 2 n + 1$ is odd, then $f_k$ is continuously differentiable $n$ times but not differentiable $n + 1$ times.
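The first two steps of this pattern can be observed numerically. The sketch below assumes $f_k(x) = x^k \sin(1/x)$ with $f_k(0) = 0$, and samples at the hypothetical points $h = 1/(\pi(n + \tfrac{1}{2}))$ and $x = 1/(\pi n)$, chosen so that the sine or cosine factor is $\pm 1$.

```python
import math

def f(k, x):
    """f_k(x) = x^k sin(1/x), extended by 0 at x = 0."""
    return 0.0 if x == 0.0 else x ** k * math.sin(1.0 / x)

def quotient(k, h):
    """Difference quotient of f_k at 0."""
    return (f(k, h) - f(k, 0.0)) / h

# At these points sin(1/h) alternates between +1 and -1.
hs = [1.0 / (math.pi * (n + 0.5)) for n in (10, 11, 1000, 1001)]
q1 = [quotient(1, h) for h in hs]   # keeps jumping between +1 and -1
q2 = [quotient(2, h) for h in hs]   # has size |h|, so tends to 0

def f2_prime(x):
    """Derivative of f_2 away from 0, computed by hand."""
    return 2 * x * math.sin(1.0 / x) - math.cos(1.0 / x)

# At x = 1/(pi n) the cosine term is +1 or -1 while the sine term vanishes.
d2 = [f2_prime(1.0 / (math.pi * n)) for n in (1000, 1001)]

print(q1, q2, d2)
```

The quotients of $f_1$ do not settle, so $f_1$ has no derivative at $0$; those of $f_2$ shrink to $0$, so $f_2'(0) = 0$; but $f_2'$ keeps taking values near $\pm 1$ arbitrarily close to $0$, so it is not continuous there.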
Similarly, in two dimensions we can consider functions such as
$$f(x,y) = (x^2+y^2) \sin\left(\frac{1}{\sqrt{x^2+y^2}}\right)$$
together with $f(0,0) = 0$. This is smooth away from $0$, and once differentiable at $0$, even in the strong sense that it is well-approximated by a linear function near $0$. However, its derivative is not continuous at $0$. In particular, this shows that the converse of the theorem “if the partial derivatives exist and are continuous at a point, then the function is differentiable there” fails.
The function $f:\mathbb{R}^2 \to \mathbb{R}$ defined by
$$f(x,y) = \frac{x y (x^2 - y^2)}{x^2 + y^2}$$
plus $f(0,0) = 0$, has mixed second-order partial derivatives $\frac{\partial^2f}{\partial x \partial y}$ and $\frac{\partial^2f}{\partial y \partial x}$ that both exist but are not equal at $(0,0)$ (nor are they continuous at $(0,0)$). Therefore, by the theorem proven above, it is not twice differentiable at $(0,0)$.
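The inequality of the mixed partials can be made concrete with nested finite differences. This sketch assumes the standard counterexample $f(x,y) = x y (x^2 - y^2)/(x^2+y^2)$ with $f(0,0) = 0$; the step sizes are hypothetical choices, with the inner step much smaller than the outer one so that the inner quotient approximates a genuine first partial derivative.

```python
def f(x, y):
    """Assumed counterexample: x y (x^2 - y^2) / (x^2 + y^2), with f(0, 0) = 0."""
    if x == 0.0 and y == 0.0:
        return 0.0
    return x * y * (x * x - y * y) / (x * x + y * y)

def fx(x, y, h=1e-7):
    """Central difference approximation of the partial derivative in x."""
    return (f(x + h, y) - f(x - h, y)) / (2 * h)

def fy(x, y, h=1e-7):
    """Central difference approximation of the partial derivative in y."""
    return (f(x, y + h) - f(x, y - h)) / (2 * h)

k = 1e-3  # outer step, much larger than the inner step h

# Differentiate df/dx in the y-direction, and df/dy in the x-direction, at the origin.
d_dy_of_fx = (fx(0.0, k) - fx(0.0, -k)) / (2 * k)
d_dx_of_fy = (fy(k, 0.0) - fy(-k, 0.0)) / (2 * k)

print(d_dy_of_fx, d_dx_of_fy)
```

The two results are close to $-1$ and $+1$, reflecting that $\frac{\partial f}{\partial x}(0,y) = -y$ while $\frac{\partial f}{\partial y}(x,0) = x$ for this function.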
While differentiability means approximability by a linear function, twice differentiability does not mean approximability by a quadratic function. For example, the function
$$f(x) = x^3 \sin(1/x)$$
(with $f(0) =0$, to make it continuous there) is not twice differentiable at $x=0$, but it is well-approximated by the quadratic polynomial $p(x) = 0$ in the sense that their difference is $o(x^2)$ as $x\to 0$. Even worse, the function
$$f(x) = e^{-1/x^2} \sin\left(e^{1/x^2}\right)$$
(with $f(0) = 0$) is not twice differentiable at $x=0$, but it is well-approximated by a polynomial of any finite degree $n$ (namely, the zero polynomial), in the sense that their difference is $o(x^n)$ as $x\to 0$.
In some contexts, it is useful to say that functions such as these have “pointwise second derivatives”. More precisely, we say that a function $f$ has a pointwise $k^{th}$ derivative at $a$ if there exists a polynomial $p$ of degree at most $k$ such that
$$\lim_{x\to a} \frac{f(x) - p(x)}{(x-a)^k} = 0 \,.$$
In this case, the pointwise $k^{th}$ derivative is $f^{(k)}_{pt}(a) = p^{(k)}(a)$. Thus we would say that while $f(x) = x^3 \sin(1/x)$ does not have a second derivative at $0$, it does have a pointwise second derivative at $0$, and $f_{pt}''(0) = 0$. See e.g. this MSE answer.
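The contrast between the pointwise second derivative and the ordinary one can be checked numerically for $f(x) = x^3 \sin(1/x)$. In this sketch the derivative $f'$ away from $0$ is computed by hand, $f'(0) = 0$, and the sample points $x = 1/(\pi n)$ are a hypothetical choice that makes $\cos(1/x) = \pm 1$.

```python
import math

def f(x):
    """f(x) = x^3 sin(1/x), extended by 0 at x = 0."""
    return 0.0 if x == 0.0 else x ** 3 * math.sin(1.0 / x)

def f_prime(x):
    """Hand-computed derivative: 3 x^2 sin(1/x) - x cos(1/x), and 0 at x = 0."""
    if x == 0.0:
        return 0.0
    return 3 * x * x * math.sin(1.0 / x) - x * math.cos(1.0 / x)

xs = [1.0 / (math.pi * n) for n in (1000, 1001)]

# Pointwise condition with p = 0 and k = 2: |f(x) - 0| / x^2 is at most |x|.
taylor_ratio = [abs(f(x)) / x ** 2 for x in xs]

# Ordinary second derivative at 0 would be the limit of (f'(x) - f'(0)) / x.
slope_of_derivative = [(f_prime(x) - f_prime(0.0)) / x for x in xs]

print(taylor_ratio, slope_of_derivative)
```

The first list is tiny, consistent with $f(x) = o(x^2)$ and $f_{pt}''(0) = 0$, while the second alternates between values near $-1$ and $+1$, so the difference quotient of $f'$ at $0$ has no limit.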
In the definition of strong twice-differentiability, we cannot replace the symmetric bilinear map $\partial^2 f_x(v,w)$ by the corresponding quadratic form $Q_x(v) = \partial^2f_x(v,v)$. In other words, if we suppose that
where $\lim_{v\to 0} E(v) = 0$, for some quadratic form $Q_x$, it does not follow that $f$ is twice differentiable at $x$, even in one dimension. The same old counterexample
$$f(x) = x^3 \sin(1/x)$$
(with $f(0)=0$) works: we have
so $E(v) = v (4\sin(\frac{1}{2v}) - \sin(\frac{1}{v})) \to 0$.
continuous function, differentiable function, continuously differentiable function, smooth function, analytic function