nLab differentiable map

Differentiable maps

See also differentiation and derivative.

Idea

A function (map) is differentiable at some point if it can be well approximated by a linear map near that point. The approximating linear maps at different points together form the derivative of the map.

One may then ask whether the derivative itself is differentiable, and so on. This leads to a hierarchy of ever more differentiable maps, starting with continuous maps and progressing through maps that are $n$ times (continuously) differentiable to those that are infinitely differentiable, and finally to those that are analytic. Infinitely differentiable maps are sometimes called smooth.

Differentiability is first defined directly for maps between (open subsets of) a Cartesian space. These differentiable maps can then be used to define the notion of differentiable manifold, and then a more general notion of differentiable map between differentiable manifolds, forming a category called Diff. We have a parallel hierarchy of ever more differentiable manifolds and ever more differentiable maps between them. Since every more differentiable manifold has an underlying less differentiable manifold, we may always consider maps that are less differentiable than the manifolds between which they run.

Definitions

In all of the following, $\mathbb{R}$ denotes the real numbers and $\mathbb{R}^n$, for $n \in \mathbb{N}$, their $n$-fold Cartesian product. For $i \in \{1, \cdots, n\}$ we write

$$ \begin{array}{ccc} \mathbb{R}^n &\overset{pr_i}{\longrightarrow}& \mathbb{R} \\ \vec v = (v_1, \cdots, v_n) &\mapsto& v_i \end{array} $$

for the projection map onto the $i$th factor.

When considering convergence of sequences of elements of these sets we regard them as Euclidean metric spaces with the Euclidean norm

$$ {\Vert -\Vert} \;\colon\; \mathbb{R}^n \longrightarrow [0,\infty) \subset \mathbb{R} \,. $$

The open subsets of the corresponding metric topology are the unions of open balls in $\mathbb{R}^n$.

In the real numbers

Epsilon-delta definition

A function $f:\mathbb{R} \to \mathbb{R}$ is differentiable if it comes with a function $\frac{d f}{d x}:\mathbb{R} \to \mathbb{R}$ and a function $M_f:\mathbb{Q}_+ \to \mathbb{Q}_+$ on the positive rational numbers, such that

  • for every positive rational number $\epsilon \in \mathbb{Q}_+$, for every real number $h \in \mathbb{R}$ such that $0 < | h | < M_f(\epsilon)$, and for every real number $x \in \mathbb{R}$,
    $\left|f(x + h) - f(x) - \frac{d f}{d x}(x)\, h\right| < \epsilon |h|$

Infinitesimal definition

Given a predicate $P$ on the real numbers $\mathbb{R}$, let $I$ denote the set of all elements in $\mathbb{R}$ for which $P$ holds. A partial function $f:\mathbb{R} \to \mathbb{R}$ is equivalently a function $f:I \to \mathbb{R}$ for any such predicate $P$ and set $I$.

A function $f:I \to \mathbb{R}$ is differentiable at a subset $S \subseteq I$ with injection $j:S \hookrightarrow \mathbb{R}$ if it has a function $\frac{d f}{d x}:S \to \mathbb{R}$ such that for all Archimedean ordered Artinian local $K$-algebras $A$ with ring homomorphism $h:K \to A$ and nilradical $D$ such that $\epsilon^2 = 0$ for all $\epsilon \in D$, for all nilpotent elements $\epsilon \in D$, and for all $a \in S$,

$$ f_A(h(j(a)) + \epsilon) = h(f(a)) + h\left(\frac{d f}{d x}(a)\right) \epsilon $$

A function $f:I \to \mathbb{R}$ is differentiable at an element $a \in I$ if it is differentiable at the singleton subset $\{a\}$, and a function $f:I \to \mathbb{R}$ is differentiable if it is differentiable at the improper subset of $I$.

From a Cartesian space to the real numbers

Definition

(differentiation of real-valued functions on Cartesian space)

Let $n \in \mathbb{N}$ and let $U \subset \mathbb{R}^n$ be an open subset.

Then a function $f \;\colon\; U \longrightarrow \mathbb{R}$ is called differentiable at $x\in U$ if there exists a linear map $d f_x : \mathbb{R}^n \to \mathbb{R}$ such that the following limit exists and vanishes as $h$ approaches zero “from all directions at once”:

$$ \lim_{h\to 0} \frac{f(x+h)-f(x) - d f_x(h)}{\Vert h\Vert} = 0. $$

This means that for all $\epsilon \in (0,\infty)$ there exists an open subset $V\subseteq U$ containing $x$ such that whenever $x+h\in V$ with $h \neq 0$ we have $\left\vert \frac{f(x+h)-f(x) - d f_x(h)}{\Vert h\Vert} \right\vert < \epsilon$.

We say that $f$ is differentiable on a subset $I$ of $U$ if $f$ is differentiable at every $x\in I$, and differentiable (tout court) if $f$ is differentiable on all of $U$. We say that $f$ is continuously differentiable if it is differentiable and $d f$ is a continuous function.

The map $d f_x$ is called the derivative or differential of $f$ at $x$.
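The limit condition in this definition can be probed numerically. The following Python sketch is only an illustration under stated assumptions: the function `f`, its candidate differential `df`, and the helper `worst_quotient` are hypothetical choices made here, not part of the definition. It samples $h$ on small circles around a point and reports the largest value of the quotient, which should shrink with the radius.

```python
import math

def f(x, y):
    # Illustrative smooth function (an assumption, not from the text).
    return math.sin(x) * y + x * x

def df(x, y, hx, hy):
    # Candidate differential of f at (x, y), applied to h = (hx, hy).
    return (math.cos(x) * y + 2 * x) * hx + math.sin(x) * hy

def quotient(x, y, hx, hy):
    # |f(x+h) - f(x) - df_x(h)| / ||h||, the quantity whose limit must be 0.
    norm = math.hypot(hx, hy)
    return abs(f(x + hx, y + hy) - f(x, y) - df(x, y, hx, hy)) / norm

def worst_quotient(x, y, radius, directions=64):
    # Sample h on a circle of the given radius, mimicking "all directions at once".
    angles = (2 * math.pi * k / directions for k in range(directions))
    return max(
        quotient(x, y, radius * math.cos(t), radius * math.sin(t))
        for t in angles
    )

if __name__ == "__main__":
    for r in (1e-1, 1e-2, 1e-3):
        print(r, worst_quotient(0.7, -1.3, r))
```

A finite sample of directions and radii cannot prove differentiability; it only shows the expected decay of the error quotient for this particular choice of `f`.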

Remark

(Notation and Terminology)

If $n=1$, as in classical one-variable calculus, then $d f_x$ in the definition above may be identified with a real number, and that number is also called the derivative of $f$ at $x$ and often written $f'(x)$. (In that case, the notation $d f$ is generally still reserved for the corresponding linear map, with its input denoted by $d x$, so that we have $d f = f'(x) d x$.)

Remark

(equivalent formulation)

An equivalent way to state the definition above is to say that

$$ f(x+h) = f(x) + d f_x(h) + E(h){\Vert h\Vert} $$

where $E$ is a function such that $\lim_{h\to 0}E(h) = 0$. This is easy to see; just let $E(h) = \frac{f(x+h)-f(x) - d f_x(h)}{\Vert h\Vert}$.

Another equivalent way to say it is that

$$ f(x+h) = f(x) + d f_x(h) + E_1(h) h_1 + \cdots + E_n(h) h_n $$

where the $E_i$ are functions such that $\lim_{h\to 0}E_i(h) = 0$. For if this is true, then $E(h) = \frac{1}{\Vert h\Vert}(E_1(h) h_1 + \cdots + E_n(h) h_n)$ satisfies the previous definition. Conversely, if the previous definition holds, then defining $E_i(h) = \frac{h_i}{\Vert h \Vert} E(h)$ satisfies this definition.

Partial and directional derivatives

A weaker notion of differentiability is the following:

Definition

(directional derivative)

Let $n \in \mathbb{N}$ and let $U \subset \mathbb{R}^n$ be an open subset.

Then a function $f \colon U \longrightarrow \mathbb{R}$ is said to have a directional derivative in the direction of $v \in \mathbb{R}^n$ at $x\in U$ if the limit

$$ \lim_{h\to 0} \frac{f(x+h v)- f(x)}{h} $$

exists. Here $h$ is just a real number.

Historically, the term ‘directional derivative’ was reserved for the case where $v$ is a unit vector (otherwise one divides the derivative above by $\|v\|$), but the general concept involves less structure and is more important, and it has no other established name. If $v$ is a standard basis vector $e_i$, then the directional derivative is called a partial derivative with respect to the corresponding coordinate, and often written $\frac{\partial f}{\partial x_i}$ or $f_{x_i}$.

If $f$ is differentiable at $x$ in the sense of the definition of differentiation of real-valued functions on Cartesian space above, then $d f_x(v)$ is its directional derivative along $v$. In particular, the coordinates of $d f_x$ are the partial derivatives of $f$.

In general, $f$ may have all partial derivatives, and even all directional derivatives, without being differentiable; see the examples below. However, if $f$ has all partial derivatives and they are continuous as functions of $x$, then in fact $f$ is differentiable (and indeed continuously differentiable).
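For a differentiable function, the directional derivative along $v$ is the linear combination of the partial derivatives with coefficients $v_i$. The Python sketch below checks this numerically with one-sided difference quotients; the function `f`, the point `x0`, the vector `v`, and the helper names are illustrative assumptions made here, and the step size `h` is an arbitrary small number.

```python
import math

def directional_derivative(f, x, v, h=1e-6):
    # One-sided difference quotient (f(x + h v) - f(x)) / h for a small real h.
    shifted = [xi + h * vi for xi, vi in zip(x, v)]
    return (f(shifted) - f(x)) / h

def partial_derivative(f, x, i, h=1e-6):
    # Directional derivative along the i-th standard basis vector e_i.
    e_i = [1.0 if j == i else 0.0 for j in range(len(x))]
    return directional_derivative(f, x, e_i, h)

def f(p):
    # Illustrative differentiable function (an assumption, not from the text).
    x, y = p
    return x * x * y + math.exp(y)

x0 = [1.0, 0.5]
v = [2.0, -3.0]
along_v = directional_derivative(f, x0, v)
from_partials = sum(vi * partial_derivative(f, x0, i) for i, vi in enumerate(v))

if __name__ == "__main__":
    print(along_v, from_partials)
```

The two printed numbers agree up to the discretization error of the difference quotients, as expected when $d f_x$ exists.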

Functions between Cartesian spaces

Definition

(differentiation of functions between Cartesian spaces)

Let $n_1, n_2 \in \mathbb{N}$ and let $U\subseteq \mathbb{R}^{n_1}$ be an open subset.

Then a function $f \;\colon\; U \longrightarrow \mathbb{R}^{n_2}$ is differentiable if for all $i \in \{1, \cdots, n_2\}$ the component function

$$ f_i \;\colon\; U \overset{f}{\longrightarrow} \mathbb{R}^{n_2} \overset{pr_i}{\longrightarrow} \mathbb{R} $$

is differentiable in the sense of the definition of differentiation of real-valued functions on Cartesian space above.

In this case, the derivatives $d f_i \colon \mathbb{R}^{n_1} \to \mathbb{R}$ of the $f_i$ assemble into a linear map of the form

$$ d f_x \;\colon\; \mathbb{R}^{n_1} \to \mathbb{R}^{n_2} \,. $$

Functions between differentiable manifolds

Definition

For $X$ and $Y$ differentiable manifolds, a function $f \colon X \longrightarrow Y$ is called differentiable if, for $\left\{ \mathbb{R}^{n} \underset{\simeq}{\overset{\phi_i}{\to}} U_i \subset X\right\}_{i \in I}$ an atlas for $X$ and $\left\{ \mathbb{R}^{n'} \underset{\simeq}{\overset{\psi_j}{\to}} V_j \subset Y\right\}_{j \in J}$ an atlas for $Y$, for all $i \in I$ and $j \in J$ the function

$$ \mathbb{R}^n \supset (f\circ \phi_i)^{-1}(V_j) \overset{\phi_i}{\longrightarrow} f^{-1}(V_j) \overset{f}{\longrightarrow} V_j \overset{\psi_j^{-1}}{\longrightarrow} \mathbb{R}^{n'} $$

is differentiable in the sense of the definition of differentiation of functions between Cartesian spaces above.

Higher differentiability

Iterated differentiability

For $U \subset \mathbb{R}^n$ an open subset and $f \;\colon\; U\to \mathbb{R}^m$ a differentiable function (in the sense of the definition for functions between Cartesian spaces above), we may regard its differential $d f$ as a function from a subset of $U$ (the points where $f$ is differentiable) to the space $L(\mathbb{R}^n,\mathbb{R}^m)$ of linear maps. Since $L(\mathbb{R}^n,\mathbb{R}^m) \cong \mathbb{R}^{n m}$ is again a Cartesian space, we may then ask whether $d f$ is differentiable.

We can then iterate, obtaining the following hierarchy of differentiability. Because iterated differentiability by itself is not very useful, and a differentiable map is necessarily continuous, one generally includes continuity of the last assumed derivatives.

  • The map $f$ is continuous or $C^0$ if it is a continuous map between underlying topological spaces. We begin with this since a differentiable map is necessarily continuous.

  • The map $f$ is continuously differentiable or $C^1$ on $U$ if it is differentiable at all points of $U$ and the resulting map $d f$ is continuous. (At this point, if we generalize to infinite-dimensional spaces, we get a variety of notions of when the differential is ‘continuous’; see continuously differentiable map for discussion.)

  • The map $f$ is twice differentiable if it is differentiable and its derivative $d f$ is differentiable. A twice differentiable map must be continuously differentiable. Similarly, $f$ is twice continuously differentiable or $C^2$ if it is twice differentiable and the second derivative $d d f$ is continuous.

  • By recursion, $f$ is $n + 1$ times differentiable if $f$ is $n$ times differentiable and the $n$th derivative of $f$ is differentiable. Similarly, $f$ is $n$ times continuously differentiable or $C^n$ if $f$ is $n$ times differentiable and the $n$th derivative of $f$ is continuous.

  • The map $f$ is smooth or infinitely differentiable or $C^\infty$ if it is $n$ times differentiable for all $n$, or equivalently if it is $C^n$ for all $n$. (There is no difference between infinite differentiability and infinite continuous differentiability.) One may also define this notion coinductively: $f$ is infinitely differentiable if it is differentiable and its derivative $d f$ is infinitely differentiable.

  • One step higher, we may ask whether $f$ is analytic or $C^\omega$.

Uniform differentiability

Note that $f$ is differentiable on a set $U$ iff there is a function $f'$ on $U$ (necessarily unique, assuming that $U$ has no isolated points) such that

$$ \forall\, \epsilon > 0,\; \forall\, x \in U,\; \exists\, \delta > 0,\; \forall\, y \in U,\; {|{y - x}|} < \delta \;\Rightarrow\; {|{f(y) - f(x) - f'(x)(y - x)}|} < \epsilon \,{|{y - x}|} . $$

Reversing the order of the quantifiers, $f$ is uniformly differentiable on $U$ iff there is a function $f'$ on $U$ such that

$$ \forall\, \epsilon > 0,\; \exists\, \delta > 0,\; \forall\, x \in U,\; \forall\, y \in U,\; {|{y - x}|} < \delta \;\Rightarrow\; {|{f(y) - f(x) - f'(x)(y - x)}|} < \epsilon \,{|{y - x}|}. $$

In classical mathematics, $f$ is uniformly differentiable if and only if $f$ is differentiable and its derivative $f'$ is uniformly continuous; in other words, $f$ is uniformly differentiable iff $f$ is uniformly-continuously differentiable. The same is true in constructive mathematics as long as one assumes dependent choice. In the absence of dependent choice, however, the argument only goes one way, and uniform differentiability is stronger. Furthermore, just as pointwise continuity is not as well behaved constructively as uniform continuity, so pointwise differentiability is not as well behaved constructively as uniform differentiability. For this reason, uniform differentiability is particularly important in constructive mathematics.

In addition, one could talk about locally uniform differentiability: a continuously differentiable function on a set $U$ is locally uniformly differentiable if it is uniformly differentiable on every closed and bounded subset $V \subseteq U$.

Symmetry of higher derivatives

If $f:U\to \mathbb{R}^m$ is twice differentiable with $U\subseteq \mathbb{R}^n$, its second derivative

$$ d(d f) : U \to L(\mathbb{R}^n,L(\mathbb{R}^n,\mathbb{R}^m)) \cong Bilin(\mathbb{R}^n,\mathbb{R}^n;\mathbb{R}^m) $$

is a function from $U$ into the space of bilinear maps from $\mathbb{R}^n\times \mathbb{R}^n$ to $\mathbb{R}^m$.

Theorem

(Symmetry of higher derivatives). If $f:U\to \mathbb{R}^m$ is twice differentiable, then its second derivative $d(d f)$ lands in the space of symmetric bilinear maps, i.e. for any $x\in U$ and $v,w\in \mathbb{R}^n$ we have

$$ d(d f)_x(v,w) = d(d f)_x(w,v). $$
Proof

It suffices to assume $m=1$; otherwise we just consider it componentwise. Define a function $g:\mathbb{R}\to \mathbb{R}$ by

$$ g(\xi) = f(x+\xi v+w) - f(x+\xi v). $$

Then by the chain rule, $g$ is differentiable and

$$ \begin{aligned} g'(\xi) &= d f_{x+\xi v+w}(v) - d f_{x+\xi v}(v)\\ &= \Big(d f_{x+\xi v+w}(v) - d f_{x}(v)\Big) - \Big(d f_{x+\xi v}(v) - d f_x(v)\Big)\\ &= d(d f)_{x}(v,\xi v + w) + E_1 |v|\,|\xi v + w| - d(d f)_x(v,\xi v) - E_2 |v|\,|\xi v|\\ &= d(d f)_x(v,w) + E |v|\, (|v|+|w|) \end{aligned} $$

where $E_1, E_2, E \to 0$ as $(v,w)\to 0$. Now the mean value theorem tells us that for some $\xi\in(0,1)$ we have

$$ \begin{aligned} f(x+v+w) - f(x+v) - f(x+w) + f(x) &= g(1) - g(0)\\ &= g'(\xi)\\ &= d(d f)_x(v,w) + E (|v|+|w|)^2. \end{aligned} $$

But the “second-order difference” $f(x+v+w) - f(x+v) - f(x+w) + f(x)$ is manifestly symmetric in $v$ and $w$, so we have

$$ d(d f)_x(v,w) - d(d f)_x(w,v) = E (|v|+|w|)^2 $$

where $\lim_{(v,w)\to 0} E = 0$. But bilinearity of the LHS then implies that it is identically zero.

The components of the bilinear map $d(d f)$ are the second-order partial derivatives $\frac{\partial^2 f}{\partial x_i \partial x_j}$ of $f$. Thus, this theorem says that if $f$ is twice differentiable, then the mixed partials are equal,

$$ \frac{\partial^2 f}{\partial x_i \partial x_j} = \frac{\partial^2 f}{\partial x_j \partial x_i}. $$

The second-order partial derivatives may exist without the mixed partials being equal; see below for a counterexample. However, the theorem shows that if we require $f$ to actually be twice differentiable rather than merely having second-order partial derivatives, then this cannot happen.
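The second-order difference used in the proof can be evaluated directly. The Python sketch below is an illustration under assumptions made here: the smooth function `f`, the base point `x0`, and the small vectors `v`, `w` are arbitrary choices, and `hessian_form` is the bilinear map written out by hand for this particular `f`. It shows that the second-order difference is symmetric in $v$ and $w$ and close to the bilinear map built from the second-order partials.

```python
import math

def f(x, y):
    # Illustrative smooth function (an assumption, not from the text).
    return x * x * y + math.sin(y)

def second_difference(x, v, w):
    # f(x+v+w) - f(x+v) - f(x+w) + f(x) for a point x and vectors v, w in R^2.
    return (f(x[0] + v[0] + w[0], x[1] + v[1] + w[1])
            - f(x[0] + v[0], x[1] + v[1])
            - f(x[0] + w[0], x[1] + w[1])
            + f(x[0], x[1]))

def hessian_form(x, v, w):
    # Bilinear map assembled from the second-order partials of this particular f.
    fxx, fxy, fyy = 2 * x[1], 2 * x[0], -math.sin(x[1])
    return fxx * v[0] * w[0] + fxy * (v[0] * w[1] + v[1] * w[0]) + fyy * v[1] * w[1]

x0 = (1.0, 0.5)
v = (1e-3, 2e-3)
w = (-1e-3, 1e-3)

if __name__ == "__main__":
    print(second_difference(x0, v, w), second_difference(x0, w, v))
    print(hessian_form(x0, v, w))
```

The remaining discrepancy between the second-order difference and the bilinear form is of third order in the size of $v$ and $w$, consistent with the error term $E\,(|v|+|w|)^2$.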

In particular, we have the following corollary, which is more commonly found in textbooks.

Corollary

(Symmetry of continuous higher derivatives). If $f$ has first and second-order partial derivatives, and the latter are continuous in a neighborhood of $x$, then the mixed partial derivatives are equal,

$$ \frac{\partial^2 f}{\partial x_i \partial x_j} = \frac{\partial^2 f}{\partial x_j \partial x_i}. $$
Proof

Continuity of the second-order partials implies that the first-order partials are differentiable, and hence so is the differential $d f$.

A direct definition of the second derivative

Note that the proof of the theorem implies that if $f$ is twice differentiable at $x$, then there exists a bilinear map $\partial^2 f_x : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}^m$ such that

$$ f(x+v+w) - f(x+v) - f(x+w) + f(x) = \partial^2 f_x(v,w) + E(v,w) ({|v|+|w|})^2 $$

where $\lim_{v,w\to 0} E(v,w) = 0$. This is a condition that makes sense as a condition on an arbitrary $f$ without assuming differentiability, and if it holds, then the bilinear map $\partial^2 f_x$ must be symmetric.

Moreover, as explained here, if $f$ is differentiable in a neighborhood of $x$ and satisfies this condition at $x$, then it is in fact twice differentiable at $x$. To see this, it suffices to show that each coordinate of $d f$ is differentiable, so let $w=\delta e_i$, with $e_i$ a unit basis vector and $\delta \neq 0$ a real number. Then we have

$$ E(v,\delta e_i) = \frac{f(x+v+\delta e_i) - f(x+v) - f(x+\delta e_i) + f(x) - \partial^2 f_x(v,\delta e_i)}{{|v|}{\delta}} $$

Taking the limit as $\delta\to 0$ and using differentiability of $f$ at $x$ and $x+v$ (for sufficiently small $v$), we get

$$ \lim_{\delta \to 0} E(v,\delta e_i) = \frac{d f_{x+v}(e_i) - d f_x(e_i) - \partial^2 f_x(v,e_i)}{{|v|}}. $$

Now take the limit as $v\to 0$; on the left we get $0$ by assumption, so the function $y\mapsto d f_y(e_i)$ is differentiable at $x$ with derivative $\partial^2 f_x(-,e_i)$. Thus, $d f$ is differentiable at $x$.

On the other hand, this condition by itself does not even imply that $f$ is continuous. For instance, if $f$ is a $\mathbb{Q}$-linear map $\mathbb{R}\to \mathbb{R}$, then the second-order difference $f(x+v+w) - f(x+v) - f(x+w) + f(x)$ is identically zero.

Higher differentiability on tangent spaces and the chain rule

Let $f:U\to \mathbb{R}$ be differentiable, with $U\subseteq \mathbb{R}^n$ open. Instead of asking whether $d f : U \to L(\mathbb{R}^n,\mathbb{R})$ is differentiable, we can ask whether its exponential transpose $d f : U\times \mathbb{R}^n \to \mathbb{R}$ is differentiable. (Note that $U\times \mathbb{R}^n$ is the tangent bundle of $U\subseteq \mathbb{R}^n$.) This amounts to asking that for $x\in U$ and $v\in \mathbb{R}^n$, we have

$$ d f_{x+w}(v+h) - d f_x(v) = d^2 f_{(x,v)}(w,h) + E\,(|w|+|h|) $$

for a linear map $d^2 f_{(x,v)}:\mathbb{R}^{2n} \to \mathbb{R}$, where $\lim_{(w,h)\to 0} E = 0$. Setting $h=0$, we see that this implies that $d f : U \to L(\mathbb{R}^n,\mathbb{R})$ is differentiable, with differential $d(d f)_x = d^2 f_{(x,v)}(w,0)$ for any $v$. And setting $w=0$, we obtain $d f_x(h) = d^2 f_{(x,v)}(0,h)$ for any $v$. Thus, we can write

$$ d^2 f_{(x,v)}(w,h) = \partial^2 f_x(v,w) + d f_x(h) $$

where $\partial^2 f_x$ is the symmetric bilinear map from the previous section. Conversely, if $f$ is twice differentiable, then using linearity and continuity of $d f$ it is easy to see that the above condition holds.

Therefore, these two kinds of twice-differentiability of $f$ are equivalent as conditions on $f$, but the resulting second differential is different. In the second case, we get

$$ d^2f = \partial^2 f + d f = \sum_{i,j} \frac{\partial^2f}{\partial x_i \partial x_j} d x_i \, d x_j + \sum_i \frac{\partial f}{\partial x_i} d^2 x_i $$

rather than merely the first term $\sum_{i,j} \frac{\partial^2f}{\partial x_i \partial x_j} d x_i \, d x_j$. There are two advantages to the second approach (asking that $d f : U\times \mathbb{R}^n \to \mathbb{R}$ be differentiable).

Firstly, we can reformulate it in terms of $f$ by asking that there exists a linear map $d f_x : \mathbb{R}^n \to \mathbb{R}^m$ and a bilinear map $\partial^2 f_x : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}^m$ such that

$$ f(x+v+w+h) - f(x+v) - f(x+w) + f(x) = \partial^2 f_x(v,w) + d f_x(h) + E(v,w,h) \sqrt{{|v|}^2{|w|}^2 + {|h|}^2} $$

where $\lim_{v,w,h \to 0} E(v,w,h) = 0$. This holds if $f$ is twice differentiable, for then we can write

$$ \begin{aligned} f(x+v+w+h) &= f(x+v+w) + d f_{x+v+w}(h) + E {|h|}\\ &= f(x+v+w) + d f_x(h) + d(d f)_x(v+w,h) + E {|v+w|}{|h|} + E{|h|} \end{aligned} $$

and $d(d f)_x(v+w,h)$ can also be incorporated into the error term. Of course, setting $h=0$ we obtain the characterization of twice-differentiability from the previous section. But setting $v=w=0$, we find that $f$ is differentiable at $x$ with derivative $d f_x$. So here we have a direct characterization of the second derivative which also implies that the first derivative exists (although it seems that it doesn’t imply differentiability in a neighborhood of $x$, so that the resulting “second derivative” may not actually be the derivative of the first derivative).

Secondly, the virtue of a second differential incorporating the first derivatives is that like the first differential $d f$, but unlike the bilinear map $\partial^2 f$, it satisfies Cauchy's invariant rule. This means that we can express the chain rule for second differentials of composite maps simply by substitution: if $y = f(u)$ and $u = g(x)$, then finding $d^2 y$ in terms of $d u$ and $d^2 u$, and $d u$ and $d^2 u$ in terms of $d x$ and $d^2 x$, and substituting, gives the correct expression for $d^2 y$ in terms of $d x$ and $d^2 x$.

In fact, this can be proven using the above direct characterization of the second differential, in essentially the same way that we prove the ordinary chain rule for first derivatives. We sketch the proof, omitting the explicit error terms. We can write

$$ g(f(x+v+w+h)) - g(f(x+v)) - g(f(x+w)) + g(f(x)) $$

as

$$ g(f(x) + v' + w' + h') - g(f(x) + v') - g(f(x) + w') + g(f(x)) $$

where $v' = f(x+v) - f(x)$ and $w' = f(x+w) - f(x)$ and $h' = f(x+v+w+h) - f(x+v) - f(x+w) + f(x)$. Now by extra-strong twice differentiability of $g$, this is approximately equal to

$$ \partial^2 g_{f(x)}(v',w') + d g_{f(x)}(h'). $$

But by differentiability of $f$, we have $v' \approx d f_x(v)$ and $w' \approx d f_x(w)$, while by extra-strong twice differentiability of $f$ we have $h' \approx \partial^2 f_x(v,w) + d f_x(h)$. Thus, we obtain approximately

$$ \partial^2 g_{f(x)}(d f_x(v), d f_x(w)) + d g_{f(x)}(\partial^2 f_x(v,w) + d f_x(h)) $$

which is exactly what we would get by substitution.

For maps between manifolds

If $X$ and $Y$ are $C^k$-differentiable manifolds, then we may define what it means for a map $f:X\to Y$ to be $n$ times differentiable, or $C^n$, for any $n\le k$, by asking that it yield such a map when restricted to any charts. The most common case is when $X$ and $Y$ are smooth (infinitely differentiable) manifolds, so that we can define $C^n$ functions between them for all $n\le \infty$. (Analytic manifolds, which are necessary in order to define analyticity of $f$, are somewhat rarer.)

A differentiable map between manifolds induces a map between their tangent bundles $d f : T X \to T Y$; this operation extends to a functor from $C^{k+1}$ manifolds and $C^{k+1}$ maps to $C^k$ manifolds and $C^k$ maps. See differentiation for more.

Examples and non-examples

Differentiability versus partial differentiability

The function $f:\mathbb{R}^2 \to \mathbb{R}$ defined by

$$ f(x,y) = \begin{cases} \frac{y^3}{x^2+y^2} &\quad (x,y) \neq (0,0)\\ 0 &\quad (x,y) = (0,0) \end{cases} $$

is continuous everywhere, and has directional derivatives (in the sense of the definition above) at $(0,0)$ in all directions, but is not differentiable at $(0,0)$ in the sense of the definition of differentiation of real-valued functions on Cartesian space. Note that the (unnormalized) directional derivative along the vector $(a,b)$ is $\frac{b^3}{a^2+b^2}$, which is not linear in $(a,b)$.
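The failure of linearity can be seen with a few difference quotients at the origin. In this Python sketch the helper name `directional_at_origin` and the step `h` are illustrative assumptions made here; the function `f` is the one defined above.

```python
def f(x, y):
    # The example from the text: y^3 / (x^2 + y^2), extended by 0 at the origin.
    if x == 0.0 and y == 0.0:
        return 0.0
    return y ** 3 / (x * x + y * y)

def directional_at_origin(a, b, h=1e-6):
    # Difference quotient (f(h*(a, b)) - f(0, 0)) / h at the origin.
    return (f(h * a, h * b) - f(0.0, 0.0)) / h

if __name__ == "__main__":
    d10 = directional_at_origin(1.0, 0.0)
    d01 = directional_at_origin(0.0, 1.0)
    d11 = directional_at_origin(1.0, 1.0)
    # A linear differential would force d11 == d10 + d01.
    print(d10, d01, d11)
```

Since $f$ is homogeneous of degree one, the quotient does not depend on `h`: the values along $(1,0)$, $(0,1)$ and $(1,1)$ are approximately $0$, $1$ and $\frac{1}{2}$, and $\frac{1}{2} \neq 0 + 1$.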

One may wonder whether existence of a linear map $d f_x$ is enough, but this is also not the case. The function $f:\mathbb{R}^2 \to \mathbb{R}$ defined by

$$ f(x,y) = \begin{cases} \frac{y^3}{x} &\quad x\neq 0\\ 0 &\quad x=0 \end{cases} $$

has all directional derivatives at $(0,0)$ equaling $0$, so that in particular there is a linear map $d f_{(0,0)}$ whose values are the directional derivatives. But it is not differentiable at $(0,0)$; in fact, it is not even continuous at $(0,0)$. (Thus it also provides an example of a discontinuous function which has all directional derivatives.)
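Both claims can be illustrated numerically. In the Python sketch below, the helper names and the choice of the curve $t \mapsto (t^3, t)$ are assumptions made here for illustration: along straight lines through the origin the difference quotients shrink with the step, while along the cubic curve the function is constantly $1$.

```python
def f(x, y):
    # The example from the text: y^3 / x for x != 0, and 0 on the line x = 0.
    return 0.0 if x == 0.0 else y ** 3 / x

def directional_at_origin(a, b, h):
    # Difference quotient (f(h*(a, b)) - f(0, 0)) / h at the origin.
    return (f(h * a, h * b) - f(0.0, 0.0)) / h

def along_cubic(t):
    # Value of f at the point (t^3, t), which tends to the origin as t -> 0.
    return f(t ** 3, t)

if __name__ == "__main__":
    for h in (1e-2, 1e-4, 1e-6):
        print(h, directional_at_origin(1.0, 2.0, h))
    for t in (1e-1, 1e-2, 1e-3):
        print(t, along_cubic(t))
```

For $a \neq 0$ the quotient equals $h\, b^3 / a$, which tends to $0$, whereas $f(t^3, t) = 1$ for every $t \neq 0$, so $f$ has no limit at the origin.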

The discontinuity kind of gives this last one away; maybe a continuous function with all directional derivatives must be differentiable? But no, because if $W$ is a Weierstrass function (continuous everywhere but differentiable nowhere) from $\mathbb{R}$ to $\mathbb{R}$ with period $2\pi$, and $\operatorname{atan2}$ is a $2$-argument arctangent function from $\mathbb{R}^2$ to $\mathbb{R}$ (so that $\sqrt{x^2+y^2} \,\sin\,\operatorname{atan2}(y,x) = y$ and $\sqrt{x^2+y^2} \,\cos\,\operatorname{atan2}(y,x) = x$ for all $x$ and $y$), then set

$$ f(x,y) = (x^2 + y^2) W(\operatorname{atan2}(y,x)) . $$

Then $f$ is, like $W$, continuous everywhere and differentiable nowhere, yet the directional derivatives through the origin are all $0$. There are further examples (including one that is analytic except at the origin) at Math StackExchange question 1497043.

Differentiability versus continuous and higher differentiability

Let $k$ be a natural number, and consider the function

$$ f_k(x) \coloneqq x^k \sin(1/x) $$

from the real line to itself, with $f_k(0) \coloneqq 0$. Away from $0$, $f_k$ is smooth (even analytic); but at $0$, $f_0$ is not continuous, $f_1$ is continuous but not differentiable, $f_2$ is differentiable but not continuously differentiable, and so on:

  • If $k = 2 n$ is even, then $f_k$ is differentiable $n$ times but not continuously differentiable $n$ times;
  • If $k = 2 n + 1$ is odd, then $f_k$ is continuously differentiable $n$ times but not differentiable $n + 1$ times.

(For a fractional value of $k$, $f_k$ has the same behaviour as $f_{\lceil{k}\rceil}$.)
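The case $k = 2$ can be checked by hand and confirmed numerically. The Python sketch below uses helper names chosen here for illustration; the formula in `f2_prime_away_from_zero` is the product-rule derivative $2 x \sin(1/x) - \cos(1/x)$, valid for $x \neq 0$. The difference quotient at $0$ tends to $0$, while the derivative keeps taking values near $-1$ and $+1$ arbitrarily close to $0$.

```python
import math

def f2(x):
    # f_2(x) = x^2 sin(1/x), extended by f_2(0) = 0.
    return 0.0 if x == 0.0 else x * x * math.sin(1.0 / x)

def quotient_at_zero(h):
    # Difference quotient (f_2(h) - f_2(0)) / h = h sin(1/h).
    return (f2(h) - f2(0.0)) / h

def f2_prime_away_from_zero(x):
    # Product-rule derivative for x != 0.
    return 2 * x * math.sin(1.0 / x) - math.cos(1.0 / x)

if __name__ == "__main__":
    for h in (1e-2, 1e-4, 1e-6):
        print(h, quotient_at_zero(h))
    # Points approaching 0 where cos(1/x) is +1 or -1.
    for n in (10, 100, 1000):
        print(f2_prime_away_from_zero(1.0 / (2 * n * math.pi)),
              f2_prime_away_from_zero(1.0 / ((2 * n + 1) * math.pi)))
```

So $f_2'(0) = 0$ exists, but $f_2'$ has no limit at $0$, matching the even case $k = 2n$ with $n = 1$.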

Similarly, in two dimensions we can consider functions such as

$$ f(x,y) = (x^2+y^2)\sin\left(\frac{1}{\sqrt{x^2+y^2}}\right) $$

together with $f(0,0) = 0$. This is smooth away from $0$, and once differentiable at $0$, even in the strong sense that it is well-approximated by a linear function near $0$. However, its derivative is not continuous at $0$. In particular, this shows that the converse of the theorem “if the partial derivatives exist and are continuous at a point, then the function is differentiable there” fails, even in higher dimensions.

Uniform differentiability is stronger than continuous differentiability but independent of twice differentiability. For an example that is uniformly differentiable but not twice differentiable, use $f_3$ again. For an example that is twice differentiable (and hence continuously differentiable) but not uniformly differentiable, use $x \mapsto x^3$. However, on a compact domain, any continuously differentiable function (and a fortiori any twice differentiable function) must be uniformly differentiable (at least in classical and intuitionistic mathematics).

Symmetry of the second partial derivatives

The function $f:\mathbb{R}^2 \to \mathbb{R}$ defined by

$$ f(x,y) = \frac{x y (x^2-y^2)}{x^2+y^2} $$

plus $f(0,0) = 0$, has second-order partial derivatives $\frac{\partial^2f}{\partial x \partial y}$ and $\frac{\partial^2f}{\partial y \partial x}$ that both exist but are not equal at $(0,0)$ (nor are they continuous at $(0,0)$). Therefore, by the theorem proven above, it is not twice differentiable at $(0,0)$.
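The two mixed partials at the origin can be estimated with nested symmetric difference quotients. This Python sketch is only a numerical illustration: the helper names and the step sizes `h` (inner) and `k` (outer) are assumptions chosen here so that the inner step is much smaller than the outer one.

```python
def f(x, y):
    # The example from the text, extended by 0 at the origin.
    if x == 0.0 and y == 0.0:
        return 0.0
    return x * y * (x * x - y * y) / (x * x + y * y)

def fx(x, y, h=1e-8):
    # Symmetric difference quotient for the partial derivative in x.
    return (f(x + h, y) - f(x - h, y)) / (2 * h)

def fy(x, y, h=1e-8):
    # Symmetric difference quotient for the partial derivative in y.
    return (f(x, y + h) - f(x, y - h)) / (2 * h)

def d_dy_of_fx_at_origin(k=1e-3):
    # Differentiate in x first, then in y, at (0, 0).
    return (fx(0.0, k) - fx(0.0, -k)) / (2 * k)

def d_dx_of_fy_at_origin(k=1e-3):
    # Differentiate in y first, then in x, at (0, 0).
    return (fy(k, 0.0) - fy(-k, 0.0)) / (2 * k)

if __name__ == "__main__":
    print(d_dy_of_fx_at_origin(), d_dx_of_fy_at_origin())
```

This matches the hand computation: the partial derivative in $x$ at $(0,y)$ is $-y$ and the partial derivative in $y$ at $(x,0)$ is $x$, so the two orders of differentiation give $-1$ and $+1$ at the origin.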

Twice differentiability versus quadratic approximation

While differentiability means approximability by a linear function, twice differentiability does not mean approximability by a quadratic function. For example, the function

$$ f(x) = x^3 \sin(1/x) $$

(with $f(0) =0$, to make it continuous there) is not twice differentiable at $x=0$, but it is well-approximated by the quadratic polynomial $p(x) = 0$ in the sense that their difference is $o(x^2)$ as $x\to 0$. Even worse, the function

$$ f(x) = e^{-1/x^2} \sin(e^{1/x^2}) $$

is not twice differentiable at $x=0$, but it is well-approximated by a polynomial of any finite degree $n$ (namely, the zero polynomial), in the sense that their difference is $o(x^n)$ as $x\to 0$.

In some contexts, it is useful to say that functions such as these have “pointwise second derivatives”. More precisely, we say that a function $f$ has a pointwise $k^{th}$ derivative at $a$ if it is continuous at $a$ and there exists a polynomial $p$ of degree $k$ such that

$$ \lim_{x\to a} \frac{f(x) - p(x)}{(x-a)^k} = 0. $$

(We have to state continuity explicitly in order to match the usual notion of differentiability when $k = 1$, since nothing in this limit constrains the value of $f(a)$.) In this case, the pointwise $k^{th}$ derivative is $f^{(k)}_{pt}(a) = p^{(k)}(a)$ (and it follows that all lower-order pointwise derivatives match as well). Thus we would say that while $f(x) = x^3 \sin(1/x)$ does not have a second derivative at $0$, it does have a pointwise second derivative at $0$, and $f_{pt}''(0) = 0$. Similarly, $e^{-1/x^2} \sin(e^{1/x^2})$ is pointwise smooth. See e.g. this MSE answer.

The polynomial $p$ may be thought of as the $k$th-order Taylor polynomial of $f$ at $a$, and the limit above is Taylor’s theorem with the Peano remainder. (Thus the subscript $pt$ can be thought of as standing for ‘Peano–Taylor’ as well as for ‘pointwise’.) Note that while $k$-times pointwise differentiability is weaker than $k$-times differentiability for $k > 1$ and equivalent for $k = 1$, it is stronger for $k = 0$ (since even $0$-times pointwise differentiability requires continuity).

The second derivative as a quadratic form

In the definition of strong twice-differentiability, we cannot replace the symmetric bilinear map $\partial^2 f_x(v,w)$ by the corresponding quadratic form $Q_x(v) = \partial^2f_x(v,v)$. In other words, if we suppose that

$$ f(x+2v) - 2 f(x+v) + f(x) = Q_x(v) + E(v) {|v|}^2 $$

where $\lim_{v\to 0} E(v) = 0$, for some quadratic form $Q_x$, it does not follow that $f$ is twice differentiable at $x$, even in one dimension. The same old counterexample

$$ f(x) = x^3 \sin(1/x) $$

(with $f(0)=0$) works: taking $Q_0 = 0$, we have

$$ f(0+2v) - 2 f(0+v) + f(0)= 2v^3\left(4\sin\left(\frac{1}{2v}\right) - \sin\left(\frac{1}{v}\right)\right) $$

so $E(v) = 2 v \left(4\sin\left(\frac{1}{2v}\right) - \sin\left(\frac{1}{v}\right)\right) \to 0$.
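Both halves of this counterexample can be observed numerically. In the Python sketch below the helper names are illustrative assumptions made here, and `f_prime` is the product-rule derivative $3 x^2 \sin(1/x) - x \cos(1/x)$ for $x \neq 0$ (with $f'(0) = 0$). The error term $E(v)$ is computed directly from the second-order difference with $Q_0 = 0$ and shrinks with $v$, while the quotient $f'(h)/h$, whose limit would be $f''(0)$, keeps oscillating.

```python
import math

def f(x):
    # f(x) = x^3 sin(1/x), extended by f(0) = 0.
    return 0.0 if x == 0.0 else x ** 3 * math.sin(1.0 / x)

def E(v):
    # (f(0 + 2v) - 2 f(0 + v) + f(0) - Q(v)) / |v|^2 with Q = 0.
    return (f(2 * v) - 2 * f(v) + f(0.0)) / (v * v)

def f_prime(x):
    # Product-rule derivative for x != 0.
    return 3 * x * x * math.sin(1.0 / x) - x * math.cos(1.0 / x)

if __name__ == "__main__":
    for v in (1e-2, 1e-4, 1e-6):
        print(v, E(v))
    # (f'(h) - f'(0)) / h at points approaching 0 where cos(1/h) is +1 or -1.
    for n in (10, 100, 1000):
        h_even = 1.0 / (2 * n * math.pi)
        h_odd = 1.0 / ((2 * n + 1) * math.pi)
        print(f_prime(h_even) / h_even, f_prime(h_odd) / h_odd)
```

The values of `E(v)` are bounded in absolute value by $10\,|v|$, while the second loop prints numbers near $-1$ and $+1$, so the difference quotient of $f'$ at $0$ does not converge.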

References

Early account, in the context of Cohomotopy, cobordism theory and the Pontryagin-Thom construction:

  • Lev Pontrjagin, Chapter I of: Smooth manifolds and their applications in Homotopy theory, Trudy Mat. Inst. im Steklov, No 45, Izdat. Akad. Nauk. USSR, Moscow, 1955 (AMS Translation Series 2, Vol. 11, 1959) (doi:10.1142/9789812772107_0001, pdf)

Last revised on December 1, 2023 at 20:14:06. See the history of this page for a list of all contributions to it.