Basics of Algebra, Topology, and Differential Calculus by Jean Gallier - HTML preview


The following results can be shown.

Proposition 28.15. Let $A$ be an open subset of $\mathbb{R}^n$, and let $f : A \to \mathbb{R}^m$ be a function.
For every $a \in A$, $f : A \to \mathbb{R}^m$ is a submersion at $a$ iff there exists an open subset $U$ of $A$
containing $a$, an open subset $W \subseteq \mathbb{R}^{n-m}$, and a diffeomorphism $\varphi : U \to f(U) \times W$, such that

$$f = \pi_1 \circ \varphi,$$

where $\pi_1 : f(U) \times W \to f(U)$ is the first projection. Equivalently,

$$(f \circ \varphi^{-1})(y_1, \ldots, y_m, \ldots, y_n) = (y_1, \ldots, y_m).$$

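$$
\begin{array}{ccc}
U \subseteq A & \xrightarrow{\ \varphi\ } & f(U) \times W \\
 & {\scriptstyle f}\searrow & \downarrow{\scriptstyle \pi_1} \\
 & & f(U) \subseteq \mathbb{R}^m
\end{array}
$$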

Furthermore, if $f$ is a submersion at every point of $A$, the image of every open subset of $A$ under $f$ is an
open subset of $\mathbb{R}^m$. (The same result holds for $\mathbb{C}^n$ and $\mathbb{C}^m$.)
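As a concrete illustration of the normal form in Proposition 28.15, the following sketch takes n = 2 and m = 1. The map f(x, y) = x^2 + y and the diffeomorphism phi(x, y) = (f(x, y), x) are hypothetical choices made for this example only (they do not come from the text); here U = R^2, f(U) = R and W = R. The code checks, at a few sample points, that f composed with the inverse of phi is the first projection.

```python
# Illustration only: f, phi and phi_inv are hypothetical choices, not from the text.

def f(x, y):
    """A submersion R^2 -> R: Df(x, y) = (2x, 1) is never the zero map."""
    return x * x + y

def phi(x, y):
    """Candidate diffeomorphism phi(x, y) = (f(x, y), x) from R^2 to R x R."""
    return (f(x, y), x)

def phi_inv(s, t):
    """Inverse of phi: from s = x^2 + y and t = x we get (x, y) = (t, s - t^2)."""
    return (t, s - t * t)

def check_normal_form(points, tol=1e-12):
    """Check that f o phi^{-1} = pi_1 and that phi o phi^{-1} is the identity."""
    for (s, t) in points:
        x, y = phi_inv(s, t)
        if abs(f(x, y) - s) > tol:          # (f o phi^{-1})(s, t) should equal s
            return False
        s2, t2 = phi(x, y)
        if abs(s2 - s) > tol or abs(t2 - t) > tol:
            return False
    return True

sample = [(0.0, 0.0), (1.5, -2.0), (-3.25, 0.5), (10.0, 3.0)]
print(check_normal_form(sample))
```

A finite sample of points is of course not a proof; the identity (f o phi^{-1})(s, t) = t^2 + (s - t^2) = s can be verified by hand.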

Proposition 28.16. Let $A$ be an open subset of $\mathbb{R}^n$, and let $f : A \to \mathbb{R}^m$ be a function.
For every $a \in A$, $f : A \to \mathbb{R}^m$ is an immersion at $a$ iff there exists an open subset $U$ of
$A$ containing $a$, an open subset $V$ containing $f(a)$ such that $f(U) \subseteq V$, an open subset $W$
containing $0$ such that $W \subseteq \mathbb{R}^{m-n}$, and a diffeomorphism $\varphi : V \to U \times W$, such that

$$\varphi \circ f = \mathrm{in}_1,$$


CHAPTER 28. DIFFERENTIAL CALCULUS

where in1 : U → U × W is the injection map such that in1(u) = (u, 0), or equivalently,

(ϕ ◦ f)(x1, . . . , xn) = (x1, . . . , xn, 0, . . . , 0).

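$$
\begin{array}{ccc}
U \subseteq A & \xrightarrow{\ f\ } & f(U) \subseteq V \\
 & {\scriptstyle \mathrm{in}_1}\searrow & \downarrow{\scriptstyle \varphi} \\
 & & U \times W
\end{array}
$$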

(The same result holds for $\mathbb{C}^n$ and $\mathbb{C}^m$.)
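To make the normal form of Proposition 28.16 concrete, here is a small sketch with n = 1 and m = 2. The immersion f(t) = (t, t^2) and the diffeomorphism phi(x, y) = (x, y - x^2) are hypothetical choices for illustration (they are not taken from the text); with U = R, V = R^2 and W = R, the code checks at sample points that phi composed with f is the injection u -> (u, 0).

```python
# Illustration only: f, phi and phi_inv are hypothetical choices, not from the text.

def f(t):
    """An immersion R -> R^2 (a parabola): Df(t) = (1, 2t) is injective."""
    return (t, t * t)

def phi(x, y):
    """Diffeomorphism of R^2 that straightens the parabola onto the first axis."""
    return (x, y - x * x)

def phi_inv(u, w):
    """Inverse of phi."""
    return (u, w + u * u)

def in1(u):
    """Injection map in1(u) = (u, 0)."""
    return (u, 0.0)

def check_immersion_form(ts, tol=1e-12):
    """Check that (phi o f)(t) = in1(t) at each sample parameter t."""
    for t in ts:
        got = phi(*f(t))
        want = in1(t)
        if abs(got[0] - want[0]) > tol or abs(got[1] - want[1]) > tol:
            return False
    return True

params = [-2.0, -0.5, 0.0, 0.25, 3.0]
print(check_immersion_form(params))
```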

28.4 Tangent Spaces and Differentials

In this section, we discuss briefly a geometric interpretation of the notion of derivative. We

consider sets of points defined by a differentiable function. This is a special case of the notion

of a (differential) manifold.

Given two normed affine spaces E and F , let A be an open subset of E, and let f : A → F

be a function.

Definition 28.9. Given f : A → F as above, its graph Γ(f) is the set of all points

Γ(f ) = {(x, y) ∈ E × F | x ∈ A, y = f(x)}.

If Df is defined on A, we say that Γ(f ) is a differential submanifold of E × F of equation

y = f (x).

It should be noted that this is a very particular kind of differential manifold.

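Example 28.3. If $E = \mathbb{R}$ and $F = \mathbb{R}^2$, letting $f = (g, h)$, where $g : \mathbb{R} \to \mathbb{R}$ and $h : \mathbb{R} \to \mathbb{R}$,
$\Gamma(f)$ is a curve in $\mathbb{R}^3$, of equations $y = g(x)$, $z = h(x)$. When $E = \mathbb{R}^2$ and $F = \mathbb{R}$, $\Gamma(f)$ is a
surface in $\mathbb{R}^3$, of equation $z = f(x, y)$.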

We now define the notion of affine tangent space in a very general way. Next, we will see

what it means for manifolds Γ(f ), as in Definition 28.9.

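Definition 28.10. Given a normed affine space $E$, given any nonempty subset $M$ of $E$,
given any point $a \in M$, we say that a vector $u \in \vec{E}$ (the vector space associated with $E$) is tangent at $a$ to $M$ if there exist a
sequence $(a_n)_{n \in \mathbb{N}}$ of points in $M$ converging to $a$, and a sequence $(\lambda_n)_{n \in \mathbb{N}}$, with $\lambda_n \in \mathbb{R}$ and
$\lambda_n \geq 0$, such that the sequence $(\lambda_n (a_n - a))_{n \in \mathbb{N}}$ converges to $u$.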

The set of all vectors tangent at a to M is called the family of tangent vectors at a to

M and the set of all points of E of the form a + u where u belongs to the family of tangent

vectors at a to M is called the affine tangent family at a to M .
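As a small worked example of this definition (the set $M$ and the sequences below are our own choices for illustration, not taken from the text), consider the parabola in $\mathbb{R}^2$ and the point $a = (0, 0)$:

```latex
M = \{(x, x^2) \mid x \in \mathbb{R}\}, \qquad a = (0, 0), \qquad
a_n = \Bigl(\tfrac{1}{n}, \tfrac{1}{n^2}\Bigr), \qquad \lambda_n = n \quad (n \geq 1).

\lambda_n (a_n - a) = n \Bigl(\tfrac{1}{n}, \tfrac{1}{n^2}\Bigr)
                    = \Bigl(1, \tfrac{1}{n}\Bigr) \longrightarrow (1, 0)
\quad \text{as } n \to \infty .
```

Thus $u = (1, 0)$ is tangent at $a$ to $M$. Taking $a_n = (-1/n, 1/n^2)$ instead shows that $(-1, 0)$ is tangent as well, which is consistent with the tangent vectors at the vertex of the parabola forming the horizontal line.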

28.5. SECOND-ORDER AND HIGHER-ORDER DERIVATIVES


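Clearly, $0$ is always tangent, and if $u$ is tangent, then so is every $\lambda u$, for $\lambda \in \mathbb{R}$, $\lambda \geq 0$. If
$u \neq 0$, then the sequence $(\lambda_n)_{n \in \mathbb{N}}$ must tend towards $+\infty$ (since $a_n - a$ tends to $0$). We have the following proposition.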

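Proposition 28.17. Let $E$ and $F$ be two normed affine spaces, let $A$ be an open subset of
$E$, let $a \in A$, and let $f : A \to F$ be a function. If $Df(a)$ exists, then the family of tangent
vectors at $(a, f(a))$ to $\Gamma$ is a subspace $\vec{T}_a(\Gamma)$ of $\vec{E} \times \vec{F}$, defined by the condition (equation)

$$(u, v) \in \vec{T}_a(\Gamma) \quad \text{iff} \quad v = Df(a)(u),$$

and the affine tangent family at $(a, f(a))$ to $\Gamma$ is an affine variety $T_a(\Gamma)$ of $E \times F$, defined
by the condition (equation)

$$(x, y) \in T_a(\Gamma) \quad \text{iff} \quad y = f(a) + Df(a)(x - a),$$

where $\Gamma$ is the graph of $f$. (Here $\vec{T}_a(\Gamma)$ denotes the family of tangent vectors and $T_a(\Gamma)$ the affine tangent family.)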

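The proof is actually rather simple. Writing $\vec{T}_a(\Gamma)$ for the family of tangent vectors and $T_a(\Gamma)$ for the
affine tangent family, we have $T_a(\Gamma) = (a, f(a)) + \vec{T}_a(\Gamma)$, and since $\vec{T}_a(\Gamma)$ is a
subspace of $\vec{E} \times \vec{F}$, the set $T_a(\Gamma)$ is an affine variety. Thus, the affine tangent space at a
point $(a, f(a))$ is a familiar object, a line, a plane, etc.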

As an illustration, when $E = \mathbb{R}^2$ and $F = \mathbb{R}$, the affine tangent plane at the point $(a, b, c)$
to the surface of equation $z = f(x, y)$, is defined by the equation

$$z = c + \frac{\partial f}{\partial x}(a, b)(x - a) + \frac{\partial f}{\partial y}(a, b)(y - b).$$

If $E = \mathbb{R}$ and $F = \mathbb{R}^2$, the tangent line at $(a, b, c)$, to the curve of equations $y = g(x)$,
$z = h(x)$, is defined by the equations

$$y = b + Dg(a)(x - a),$$

$$z = c + Dh(a)(x - a).$$
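The tangent plane equation above can be explored numerically. The following is a minimal sketch, assuming a sample function f(x, y) = x^2 y + sin(y) and the base point (1, 0.5), both chosen here for illustration; the partial derivatives are approximated by central differences rather than computed exactly. It prints the gap between the surface and the plane divided by the size of the displacement, which should shrink as the displacement does.

```python
import math

def f(x, y):
    """Sample surface z = f(x, y); a hypothetical choice for illustration."""
    return x * x * y + math.sin(y)

def partials(func, a, b, step=1e-6):
    """Central-difference approximations of the two partial derivatives at (a, b)."""
    fx = (func(a + step, b) - func(a - step, b)) / (2 * step)
    fy = (func(a, b + step) - func(a, b - step)) / (2 * step)
    return fx, fy

def tangent_plane(func, a, b):
    """Return the affine map (x, y) -> c + fx*(x - a) + fy*(y - b), with c = func(a, b)."""
    c = func(a, b)
    fx, fy = partials(func, a, b)
    return lambda x, y: c + fx * (x - a) + fy * (y - b)

a, b = 1.0, 0.5
fx, fy = partials(f, a, b)
plane = tangent_plane(f, a, b)

ratios = []
for t in (1e-1, 1e-2, 1e-3):
    gap = abs(f(a + t, b + t) - plane(a + t, b + t))
    ratios.append(gap / t)      # should tend to 0 with t
    print(t, gap / t)
```

The decreasing ratios reflect the defining property of the derivative: the error of the affine approximation is negligible compared with the displacement.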

Thus, derivatives and partial derivatives have the intended geometric interpretation
as tangent spaces. Of course, in order to deal with this topic properly, we really would
have to go deeper into the study of (differential) manifolds.

We now briefly consider second-order and higher-order derivatives.

28.5 Second-Order and Higher-Order Derivatives

Given two normed affine spaces E and F , and some open subset A of E, if Df (a) is defined

for every a ∈ A, then we have a mapping Df : A → L(E; F ). Since L(E; F ) is a normed

vector space, if Df exists on an open subset U of A containing a, we can consider taking

the derivative of Df at some a ∈ A. If D(Df)(a) exists for every a ∈ A, we get a mapping


D2f : A → L(E; L(E; F )), where D2f(a) = D(Df)(a), for every a ∈ A. If D2f(a) exists,

then for every u ∈ E,

D2f (a)(u) = D(Df )(a)(u) = Du(Df )(a) ∈ L(E; F ).

Recall from Proposition 26.46, that the map app from L(E; F ) × E to F , defined such

that for every L ∈ L(E; F ), for every v ∈ E,

app(L, v) = L(v),

is a continuous bilinear map. Thus, in particular, given a fixed v ∈ E, the linear map

appv : L(E; F ) → F , defined such that appv(L) = L(v), is a continuous map.

Also recall from Proposition 28.6, that if h : A → G is a function such that Dh(a) exists,
and k : G → H is a continuous linear map, then, D(k ◦ h)(a) exists, and

k(Dh(a)(u)) = D(k ◦ h)(a)(u),

that is,

k(Duh(a)) = Du(k ◦ h)(a).

Applying these two facts to h = Df , and to k = appv, we have

Du(Df )(a)(v) = Du(appv ◦ Df)(a).

But (appv ◦ Df)(x) = Df(x)(v) = Dvf(x), for every x ∈ A, that is, appv ◦ Df = Dvf on A.

So, we have

Du(Df )(a)(v) = Du(Dvf )(a),

and since D2f (a)(u) = Du(Df )(a), we get

D2f (a)(u)(v) = Du(Dvf )(a).

Thus, when D2f (a) exists, Du(Dvf )(a) exists, and

D2f (a)(u)(v) = Du(Dvf )(a),

for all u, v ∈ E. We also denote Du(Dvf)(a) by D2u,vf(a), or DuDvf(a).

Recall from Proposition 26.45, that the map from $L_2(E, E; F)$ to $L(E; L(E; F))$ defined
such that $g \mapsto \varphi$, where

$$\varphi(u)(v) = g(u, v)$$

for all $u, v \in E$, is an isomorphism of vector spaces. Thus, we will consider $D^2 f(a) \in L(E; L(E; F))$ as a continuous
bilinear map in $L_2(E, E; F)$, and we will write $D^2 f(a)(u, v)$, instead of $D^2 f(a)(u)(v)$.


Then, the above discussion can be summarized by saying that when D2f (a) is defined,

we have

D2f (a)(u, v) = DuDvf (a).

When $E$ has finite dimension and $(a_0, (e_1, \ldots, e_n))$ is a frame for $E$, we denote $D_{e_j} D_{e_i} f(a)$
by $\dfrac{\partial^2 f}{\partial x_i \partial x_j}(a)$, when $i \neq j$, and we denote $D_{e_i} D_{e_i} f(a)$ by $\dfrac{\partial^2 f}{\partial x_i^2}(a)$.

The following important lemma attributed to Schwarz can be shown, using Lemma 28.11.

Given a bilinear map f : E × E → F , recall that f is symmetric, if

f (u, v) = f (v, u),

for all u, v ∈ E.

Lemma 28.18. (Schwarz’s lemma) Given two normed affine spaces $E$ and $F$, given any
open subset $A$ of $E$, given any $f : A \to F$, for every $a \in A$, if $D^2 f(a)$ exists, then $D^2 f(a) \in L_2(E, E; F)$
is a continuous symmetric bilinear map. As a corollary, if $E$ is of finite dimension
$n$, and $(a_0, (e_1, \ldots, e_n))$ is a frame for $E$, we have

$$\frac{\partial^2 f}{\partial x_i \partial x_j}(a) = \frac{\partial^2 f}{\partial x_j \partial x_i}(a).$$

Remark: There is a variation of the above lemma which does not assume the existence of

D2f (a), but instead assumes that DuDvf and DvDuf exist on an open subset containing a

and are continuous at a, and concludes that DuDvf (a) = DvDuf (a). This is just a different

result which does not imply Lemma 28.18, and is not a consequence of Lemma 28.18.

When $E = \mathbb{R}^2$, the mere existence of $\dfrac{\partial^2 f}{\partial x \partial y}(a)$ and $\dfrac{\partial^2 f}{\partial y \partial x}(a)$ is not sufficient to ensure the
existence of $D^2 f(a)$.
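The remark above can be illustrated with a classical example, which is our choice and is not taken from the text: f(x, y) = xy(x^2 - y^2)/(x^2 + y^2) with f(0, 0) = 0. Both iterated partial derivatives exist at the origin, but they take the values -1 and +1; since Schwarz’s lemma would force them to agree if D2f(0, 0) existed, D2f(0, 0) cannot exist. The sketch below only approximates these values by nested central differences (the step sizes are assumptions); the exact values follow by hand from D_x f(0, y) = -y and D_y f(x, 0) = x.

```python
def f(x, y):
    """Classical counterexample; defined to be 0 at the origin."""
    if x == 0.0 and y == 0.0:
        return 0.0
    return x * y * (x * x - y * y) / (x * x + y * y)

def d_x(func, step):
    """Return (x, y) -> central-difference approximation of the partial in x."""
    return lambda x, y: (func(x + step, y) - func(x - step, y)) / (2 * step)

def d_y(func, step):
    """Return (x, y) -> central-difference approximation of the partial in y."""
    return lambda x, y: (func(x, y + step) - func(x, y - step)) / (2 * step)

inner, outer = 1e-7, 1e-3   # inner step much smaller than outer step

dy_dx = d_y(d_x(f, inner), outer)(0.0, 0.0)   # approximates D_y(D_x f)(0, 0)
dx_dy = d_x(d_y(f, inner), outer)(0.0, 0.0)   # approximates D_x(D_y f)(0, 0)

print(dy_dx, dx_dy)
```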

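When $E$ is of finite dimension $n$ and $(a_0, (e_1, \ldots, e_n))$ is a frame for $E$, if $D^2 f(a)$ exists,
for every $u = u_1 e_1 + \cdots + u_n e_n$ and $v = v_1 e_1 + \cdots + v_n e_n$ in $E$, since $D^2 f(a)$ is a symmetric
bilinear form, we have

$$D^2 f(a)(u, v) = \sum_{i=1, j=1}^{n} u_i v_j \frac{\partial^2 f}{\partial x_i \partial x_j}(a),$$

which can be written in matrix form as:

$$
D^2 f(a)(u, v) = U^\top
\begin{pmatrix}
\dfrac{\partial^2 f}{\partial x_1^2}(a) & \dfrac{\partial^2 f}{\partial x_1 \partial x_2}(a) & \cdots & \dfrac{\partial^2 f}{\partial x_1 \partial x_n}(a) \\
\dfrac{\partial^2 f}{\partial x_1 \partial x_2}(a) & \dfrac{\partial^2 f}{\partial x_2^2}(a) & \cdots & \dfrac{\partial^2 f}{\partial x_2 \partial x_n}(a) \\
\vdots & \vdots & \ddots & \vdots \\
\dfrac{\partial^2 f}{\partial x_1 \partial x_n}(a) & \dfrac{\partial^2 f}{\partial x_2 \partial x_n}(a) & \cdots & \dfrac{\partial^2 f}{\partial x_n^2}(a)
\end{pmatrix}
V
$$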


where U is the column matrix representing u, and V is the column matrix representing v,

over the frame (a0, (e1, . . . , en)).

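The above symmetric matrix is called the Hessian of $f$ at $a$. If $F$ itself is of finite
dimension, and $(b_0, (v_1, \ldots, v_m))$ is a frame for $F$, then $f = (f_1, \ldots, f_m)$, and each component
$D^2 f(a)_i(u, v)$ of $D^2 f(a)(u, v)$ ($1 \leq i \leq m$), can be written as

$$
D^2 f(a)_i(u, v) = U^\top
\begin{pmatrix}
\dfrac{\partial^2 f_i}{\partial x_1^2}(a) & \dfrac{\partial^2 f_i}{\partial x_1 \partial x_2}(a) & \cdots & \dfrac{\partial^2 f_i}{\partial x_1 \partial x_n}(a) \\
\dfrac{\partial^2 f_i}{\partial x_1 \partial x_2}(a) & \dfrac{\partial^2 f_i}{\partial x_2^2}(a) & \cdots & \dfrac{\partial^2 f_i}{\partial x_2 \partial x_n}(a) \\
\vdots & \vdots & \ddots & \vdots \\
\dfrac{\partial^2 f_i}{\partial x_1 \partial x_n}(a) & \dfrac{\partial^2 f_i}{\partial x_2 \partial x_n}(a) & \cdots & \dfrac{\partial^2 f_i}{\partial x_n^2}(a)
\end{pmatrix}
V
$$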

Thus, we could describe the vector $D^2 f(a)(u, v)$ in terms of an $mn \times mn$-matrix consisting
of $m$ diagonal blocks, which are the above Hessians, and the row matrix $(U^\top, \ldots, U^\top)$ ($m$
times) and the column matrix consisting of $m$ copies of $V$.
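The relation between the iterated directional derivative, the Hessian matrix and the bilinear form can be checked numerically. Below is a minimal sketch for a real-valued function on R^3; the function f, the point a and the vectors u, v are hypothetical choices, and all derivatives are approximated by nested central differences (so agreement holds only up to discretization and rounding error).

```python
import math

def f(x):
    """Sample smooth function R^3 -> R; a hypothetical choice for illustration."""
    return x[0] ** 2 * x[1] + math.sin(x[1] * x[2]) + x[0] * x[2] ** 3

def shift(x, u, t):
    """Return the point x + t*u."""
    return [xi + t * ui for xi, ui in zip(x, u)]

def directional(func, u, step=1e-4):
    """Return x -> central-difference approximation of D_u func(x)."""
    def d_u(x):
        return (func(shift(x, u, step)) - func(shift(x, u, -step))) / (2 * step)
    return d_u

def hessian(func, a, step=1e-4):
    """Matrix whose (i, j) entry approximates D_{e_i} D_{e_j} func(a)."""
    n = len(a)
    basis = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    return [[directional(directional(func, basis[j], step), basis[i], step)(a)
             for j in range(n)] for i in range(n)]

def bilinear(H, u, v):
    """Compute U^T H V for a square matrix H given as a list of rows."""
    return sum(u[i] * H[i][j] * v[j] for i in range(len(u)) for j in range(len(v)))

a = [0.5, -1.0, 2.0]
u = [1.0, 2.0, -1.0]
v = [0.0, 1.0, 3.0]

H = hessian(f, a)
lhs = directional(directional(f, v), u)(a)   # approximates D_u D_v f(a)
rhs = bilinear(H, u, v)                      # U^T H V
print(lhs, rhs)
```

Up to numerical error, the two printed values agree and the computed matrix H is symmetric, as Schwarz’s lemma predicts for this smooth f.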

We now indicate briefly how higher-order derivatives are defined. Let $m \geq 2$. Given
a function $f : A \to F$ as before, for any $a \in A$, if the derivatives $D^i f$ exist on $A$ for all
$i$, $1 \leq i \leq m - 1$, by induction, $D^{m-1} f$ can be considered to be a continuous function
$D^{m-1} f : A \to L_{m-1}(E^{m-1}; F)$, and we define

$$D^m f(a) = D(D^{m-1} f)(a).$$

Then, $D^m f(a)$ can be identified with a continuous $m$-multilinear map in $L_m(E^m; F)$. We
can then show (as we did before), that if $D^m f(a)$ is defined, then

$$D^m f(a)(u_1, \ldots, u_m) = D_{u_1} \ldots D_{u_m} f(a).$$

When $E$ is of finite dimension $n$ and $(a_0, (e_1, \ldots, e_n))$ is a frame for $E$, if $D^m f(a)$ exists,
for every $j_1, \ldots, j_m \in \{1, \ldots, n\}$, we denote $D_{e_{j_m}} \ldots D_{e_{j_1}} f(a)$ by

$$\frac{\partial^m f}{\partial x_{j_1} \ldots \partial x_{j_m}}(a).$$

Given an $m$-multilinear map $f \in L_m(E^m; F)$, recall that $f$ is symmetric if

$$f(u_{\pi(1)}, \ldots, u_{\pi(m)}) = f(u_1, \ldots, u_m),$$

for all $u_1, \ldots, u_m \in E$, and all permutations $\pi$ on $\{1, \ldots, m\}$. Then, the following generalization
of Schwarz’s lemma holds.


Lemma 28.19. Given two normed affine spaces $E$ and $F$, given any open subset $A$ of $E$,
given any $f : A \to F$, for every $a \in A$, for every $m \geq 1$, if $D^m f(a)$ exists, then $D^m f(a) \in L_m(E^m; F)$
is a continuous symmetric $m$-multilinear map. As a corollary, if $E$ is of finite
dimension $n$, and $(a_0, (e_1, \ldots, e_n))$ is a frame for $E$, we have

$$\frac{\partial^m f}{\partial x_{j_1} \ldots \partial x_{j_m}}(a) = \frac{\partial^m f}{\partial x_{j_{\pi(1)}} \ldots \partial x_{j_{\pi(m)}}}(a),$$

for every $j_1, \ldots, j_m \in \{1, \ldots, n\}$, and for every permutation $\pi$ on $\{1, \ldots, m\}$.

If $E$ is of finite dimension $n$, and $(a_0, (e_1, \ldots, e_n))$ is a frame for $E$, $D^m f(a)$ is a symmetric
$m$-multilinear map, and we have

$$D^m f(a)(u_1, \ldots, u_m) = \sum_{j} u_{1, j_1} \cdots u_{m, j_m} \frac{\partial^m f}{\partial x_{j_1} \ldots \partial x_{j_m}}(a),$$

where $j$ ranges over all functions $j : \{1, \ldots, m\} \to \{1, \ldots, n\}$, for any $m$ vectors

$$u_j = u_{j,1} e_1 + \cdots + u_{j,n} e_n.$$

The concept of C1-function is generalized to the concept of Cm-function, and Theorem

28.12 can also be generalized.

Definition 28.11. Given two normed affine spaces $E$ and $F$, and an open subset $A$ of $E$,
for any $m \geq 1$, we say that a function $f : A \to F$ is of class $C^m$ on $A$ or a $C^m$-function on
$A$ if $D^k f$ exists and is continuous on $A$ for every $k$, $1 \leq k \leq m$. We say that $f : A \to F$
is of class $C^\infty$ on $A$ or a $C^\infty$-function on $A$ if $D^k f$ exists and is continuous on $A$ for every
$k \geq 1$. A $C^\infty$-function (on $A$) is also called a smooth function (on $A$). A $C^m$-diffeomorphism
$f : A \to B$ between $A$ and $B$ (where $A$ is an open subset of $E$ and $B$ is an open subset
of $F$) is a bijection between $A$ and $B = f(A)$, such that both $f : A \to B$ and its inverse
$f^{-1} : B \to A$ are $C^m$-functions.

Equivalently, $f$ is a $C^m$-function on $A$ if $f$ is a $C^1$-function on $A$ and $Df$ is a $C^{m-1}$-function on $A$.
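To see that the classes of the definition above are genuinely different, consider the function f(x) = x|x| on the real line, an example of our own choosing rather than one from the text. Its derivative is Df(x) = 2|x|, which is continuous, so f is of class C1; but Df is not differentiable at 0, so f is not of class C2. The sketch below compares one-sided difference quotients at 0 (the step size is an assumption).

```python
def f(x):
    """f(x) = x|x|: differentiable everywhere, a hypothetical illustrative choice."""
    return x * abs(x)

def df(x):
    """Exact derivative of f; continuous on R but with a corner at 0."""
    return 2 * abs(x)

def one_sided_slopes(g, a, step=1e-6):
    """Left and right difference quotients of g at a."""
    left = (g(a) - g(a - step)) / step
    right = (g(a + step) - g(a)) / step
    return left, right

f_left, f_right = one_sided_slopes(f, 0.0)     # both close to 0 = Df(0)
df_left, df_right = one_sided_slopes(df, 0.0)  # close to -2 and +2: no second derivative at 0

print(f_left, f_right)
print(df_left, df_right)
```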

We have the following theorem giving a necessary and sufficient condition for $f$ to be a
$C^m$-function on $A$. A generalization to the case where $E = (E_1, a_1) \oplus \cdots \oplus (E_n, a_n)$ also
holds.

Theorem 28.20. Given two normed affine spaces $E$ and $F$, where $E$ is of finite dimension
$n$, and where $(a_0, (u_1, \ldots, u_n))$ is a frame of $E$, given any open subset $A$ of $E$, given any
function $f : A \to F$, for any $m \geq 1$, the function $f$ is a $C^m$-function on $A$ iff every
partial derivative $D_{u_{j_k}} \ldots D_{u_{j_1}} f$ (or $\dfrac{\partial^k f}{\partial x_{j_1} \ldots \partial x_{j_k}}(a)$) is defined and continuous on $A$, for all
$k$, $1 \leq k \leq m$, and all $j_1, \ldots, j_k \in \{1, \ldots, n\}$. As a corollary, if $F$ is of finite dimension $p$,
and $(b_0, (v_1, \ldots, v_p))$ is a frame of $F$, the derivative $D^m f$ is defined and continuous on $A$ iff
every partial derivative $D_{u_{j_k}} \ldots D_{u_{j_1}} f_i$ (or $\dfrac{\partial^k f_i}{\partial x_{j_1} \ldots \partial x_{j_k}}(a)$) is defined and continuous on $A$,
for all $k$, $1 \leq k \leq m$, for all $i$, $1 \leq i \leq p$, and all $j_1, \ldots, j_k \in \{1, \ldots, n\}$.

When $E = \mathbb{R}$ (or $E = \mathbb{C}$), for any $a \in E$, $D^m f(a)(1, \ldots, 1)$ is a vector in $F$, called
the $m$th-order vector derivative. As in the case $m = 1$, we will usually identify the multilinear
map $D^m f(a)$ with the vector $D^m f(a)(1, \ldots, 1)$. Some notational conventions can
also be introduced to simplify the notation of higher-order derivatives, and we discuss such
conventions very briefly.

Recall that when $E$ is of finite dimension $n$, and $(a_0, (e_1, \ldots, e_n))$ is a frame for $E$, $D^m f(a)$
is a symmetric $m$-multilinear map, and we have

$$D^m f(a)(u_1, \ldots, u_m) = \sum_{j} u_{1, j_1} \cdots u_{m, j_m} \frac{\partial^m f}{\partial x_{j_1} \ldots \partial x_{j_m}}(a),$$

where $j$ ranges over all functions $j : \{1, \ldots, m\} \to \{1, \ldots, n\}$, for any $m$ vectors

$$u_j = u_{j,1} e_1 + \cdots + u_{j,n} e_n.$$

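We can then group the various occurrences of $\partial x_{j_k}$ corresponding to the same variable $x_{j_k}$,
and this leads to the notation

$$\Bigl(\frac{\partial}{\partial x_1}\Bigr)^{\alpha_1} \Bigl(\frac{\partial}{\partial x_2}\Bigr)^{\alpha_2} \cdots \Bigl(\frac{\partial}{\partial x_n}\Bigr)^{\alpha_n} f(a),$$

where $\alpha_1 + \alpha_2 + \cdots + \alpha_n = m$.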

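If we denote $(\alpha_1, \ldots, \alpha_n)$ simply by $\alpha$, then we denote

$$\Bigl(\frac{\partial}{\partial x_1}\Bigr)^{\alpha_1} \Bigl(\frac{\partial}{\partial x_2}\Bigr)^{\alpha_2} \cdots \Bigl(\frac{\partial}{\partial x_n}\Bigr)^{\alpha_n} f$$

by

$$\partial^\alpha f, \quad \text{or} \quad \Bigl(\frac{\partial}{\partial x}\Bigr)^{\alpha} f.$$

If $\alpha = (\alpha_1, \ldots, \alpha_n)$, we let $|\alpha| = \alpha_1 + \alpha_2 + \cdots + \alpha_n$, $\alpha! = \alpha_1! \cdots \alpha_n!$, and if $h = (h_1, \ldots, h_n)$,
we denote $h_1^{\alpha_1} \cdots h_n^{\alpha_n}$ by $h^\alpha$.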
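The multi-index conventions just introduced are easy to mirror in code. The following is a small sketch; the helper names `order`, `multi_factorial` and `power` are hypothetical names chosen for this illustration and do not come from the text.

```python
from math import factorial, prod

def order(alpha):
    """|alpha| = alpha_1 + ... + alpha_n."""
    return sum(alpha)

def multi_factorial(alpha):
    """alpha! = alpha_1! * ... * alpha_n!."""
    return prod(factorial(k) for k in alpha)

def power(h, alpha):
    """h^alpha = h_1^alpha_1 * ... * h_n^alpha_n."""
    return prod(hi ** ai for hi, ai in zip(h, alpha))

alpha = (2, 0, 1)          # differentiate twice in x_1 and once in x_3
h = (2.0, 3.0, 5.0)

print(order(alpha), multi_factorial(alpha), power(h, alpha))
```

With these conventions, a term such as $h^\alpha / \alpha!$ of a multivariate Taylor expansion can be computed as `power(h, alpha) / multi_factorial(alpha)`.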

In the next section, we survey various versions of Taylor’s formula.

28.6. TAYLOR’S FORMULA, FAÀ DI BRUNO’S FORMULA


28.6 Taylor’s formula, Faà di Bruno’s formula

We discuss, without proofs, several versions of Taylor’s formula. The hypotheses required in
each version become increasingly strong. The first version can be viewed as a generalization
of the notion of derivative. Given an $m$-linear map $f : E^m \to F$, for any vector $h \in E$, we
abbreviate

$$f(\underbrace{h, \ldots, h}_{m})$$

by $f(h^m)$. The version of Taylor’s formula given next is sometimes referred to as the formula
of Taylor–Young.

Theorem 28.21. (Taylor–Young) Given two normed affine spaces $E$ and $F$, for any open
subset $A \subseteq E$, for any function $f : A \to F$, for any $a \in A$, if $D^k f$ exists in $A$ for all $k$,
$1 \leq k \leq m - 1$, and if $D^m f(a)$ exists, then we have:

$$f(a + h) = f(a) + \frac{1}{1!} D^1 f(a)(h) + \cdots + \frac{1}{m!} D^m f(a)(h^m) + \|h\|^m \epsilon(h),$$

for any $h$ such that $a + h \in A$, and where $\lim_{h \to 0,\, h \neq 0} \epsilon(h) = 0$.
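The content of the Taylor–Young formula can be observed numerically for m = 2. The sketch below assumes the sample function f(x, y) = exp(x) sin(y), the base point a = (0.3, 0.7) and the direction (1, -2), all chosen here for illustration; its first and second derivatives are entered by hand. It evaluates the remainder divided by the squared Euclidean norm of h for shrinking h, which plays the role of the function tending to 0 in the theorem.

```python
import math

def f(x, y):
    """Sample function R^2 -> R; a hypothetical choice for illustration."""
    return math.exp(x) * math.sin(y)

def taylor2(a, b, h, k):
    """f(a) + D^1 f(a)(h)/1! + D^2 f(a)(h^2)/2! at the point (a, b), increment (h, k)."""
    ex, s, c = math.exp(a), math.sin(b), math.cos(b)
    d1 = ex * s * h + ex * c * k                                # D^1 f(a)(h)
    d2 = ex * s * h * h + 2 * ex * c * h * k - ex * s * k * k   # D^2 f(a)(h^2)
    return ex * s + d1 / 1.0 + d2 / 2.0

def epsilon(a, b, h, k):
    """Remainder divided by ||h||^2 (Euclidean norm), for (h, k) != (0, 0)."""
    norm = math.hypot(h, k)
    return (f(a + h, b + k) - taylor2(a, b, h, k)) / norm ** 2

a, b = 0.3, 0.7
direction = (1.0, -2.0)
eps_values = [epsilon(a, b, t * direction[0], t * direction[1])
              for t in (1e-1, 1e-2, 1e-3)]
print(eps_values)
```

The printed values shrink roughly in proportion to the size of h, which is what one expects here since this particular f is smooth and the neglected terms are of third order.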

The above version of Taylor’s formula has applications to the study of relative maxima

(or minima) of real-valued functions. It is also used to study the local properties of curves

and surfaces.

The next version of Taylor’s formula can be viewed as a generalization of Lemma 28.11.

It is sometimes called the Taylor formula with Lagrange remainder or generalized mean value

theorem.

Theorem 28.22. (Generalized mean value theorem) Let $E$ and $F$ be two normed affine
spaces, let $A$ be an open subset of $E$, and let $f : A \to F$ be a function on $A$. Given any
$a \in A$ and any $h \neq 0$ in $E$, if the closed segment $[a, a + h]$ is contained in $A$, $D^k f$ exists in
$A$ for all $k$, $1 \leq k \leq m$, $D^{m+1} f(x)$ exists at every point $x$ of the open segment $]a, a + h[$, and

$$\max_{x \in ]a, a+h[} \bigl\| D^{m+1} f(x) \bigr\| \leq M,$$

for some $M \geq 0$, then

$$\Bigl\| f(a + h) - f(a) - \Bigl( \frac{1}{1!} D^1 f(a)(h) + \cdots + \frac{1}{m!} D^m f(a)(h^m) \Bigr) \Bigr\| \leq M \frac{\|h\|^{m+1}}{(m + 1)!}.$$

As a corollary, if $L : E^{m+1} \to F$ is a continuous $(m + 1)$-linear map, then

$$\Bigl\| f(a + h) - f(a) - \Bigl( \frac{1}{1!} D^1 f(a)(h) + \cdots + \frac{1}{m!} D^m f(a)(h^m) + \frac{L(h^{m+1})}{(m + 1)!} \Bigr) \Bigr\| \leq M \frac{\|h\|^{m+1}}{(m + 1)!},$$

where $M = \max_{x \in ]a, a+h[} \bigl\| D^{m+1} f(x) - L \bigr\|$.
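In the simplest setting E = F = R, the inequality of Theorem 28.22 can be checked directly. The sketch below uses f = sin, a hypothetical choice made because every derivative of sin is bounded by 1 in absolute value, so that M = 1 is admissible for every m; the point a and the increment h are also arbitrary sample values.

```python
import math

def sin_derivative(k, x):
    """k-th derivative of sin at x, using the identity sin^(k)(x) = sin(x + k*pi/2)."""
    return math.sin(x + k * math.pi / 2)

def taylor_polynomial(a, h, m):
    """f(a) + D^1 f(a)(h)/1! + ... + D^m f(a)(h^m)/m! for f = sin."""
    return sum(sin_derivative(k, a) * h ** k / math.factorial(k)
               for k in range(m + 1))

def remainder_bound(h, m, big_m=1.0):
    """Right-hand side M * |h|^(m+1) / (m+1)! of the inequality."""
    return big_m * abs(h) ** (m + 1) / math.factorial(m + 1)

a, h = 0.4, 0.8
checks = []
for m in range(1, 7):
    error = abs(math.sin(a + h) - taylor_polynomial(a, h, m))
    bound = remainder_bound(h, m)
    checks.append(error <= bound + 1e-15)   # tiny slack for rounding
    print(m, error, bound)
```

Each printed error stays below the corresponding bound, and both decrease rapidly as m grows.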


The above theorem is sometimes stated under the slightly stronger assumption that f is

a Cm-function on A. If f : A → R is a real-valued function, Theorem 28.22 can be refined a

little bit. This version is often called the formula of Taylor–MacLaurin.

Theorem 28.23. (Taylor–MacLaurin) Let $E$ be a normed affine space, let $A$ be an open
subset of $E$, and let $f : A \to \mathbb{R}$ be a real-valued function on $A$. Given any $a \in A$ and any
$h \neq 0$ in $E$, if the closed segment $[a, a + h]$ is contained in $A$, if $D^k f$ exists in $A$ for all $k$,
$1 \leq k \leq m$, and $D^{m+1} f(x)$ exists at every point $x$ of the open segment $]a, a + h[$, then there