Basics of Algebra, Topology, and Differential Calculus by Jean Gallier - HTML preview

/ Home / Mathematics (Academic) / Basics of Algebra, Topology, and Differential Calculus

PLEASE NOTE: This is an HTML preview only and some elements such as links or page numbers may be incorrect.
Download the book in PDF, ePub, Kindle for a complete version.

Chapter 25

The Rational Canonical Form and

Other Normal Forms

25.1

The Rational Canonical Form

Let E be a finite-dimensional vector space over a field K, and let f : E → E be an endomor-

phism of E. We know from Section 24.5 that there is a K[X]-module Ef associated with f ,

and that Mf is a finitely generated torsion module over the PID K[X]. In this chapter, we

show how Theorems from Sections 24.6 and 24.7 yield important results about the structure

of the linear map f .

Recall that the annihilator of a subspace V is an ideal (p) uniquely defined by a monic

polynomial p called the minimal polynomial of V .

Our first result is obtained by translating the primary decomposition theorem, Theorem

24.26. It is not too surprising that we obtain again Theorem 22.7!

Theorem 25.1. (Primary Decomposition Theorem) Let f : E → E be a linear map on the

finite-dimensional vector space E over the field K. Write the minimal polynomial m of f as

m = pr1

1 · · · prk ,

where the pi are distinct irreducible monic polynomials over K, and the ri are positive inte-

gers. Let

Wi = Ker (pi(f )ri),

i = 1, . . . , k.

Then

(a) E = W1 ⊕ · · · ⊕ Wk.

(b) Each Wi is invariant under f and the projection from W onto Wi is given by a poly-

nomial in f .

711

712

CHAPTER 25. NORMAL FORMS; THE RATIONAL CANONICAL FORM

Next, we apply the Invariant Factors Decomposition Theorem, Theorem 24.38, to Ef .

This theorem says that Ef is isomorphic to a direct sum

Ef ≈ K[X]/(p1) ⊕ · · · ⊕ K[X]/(pm)

of m ≤ n cyclic modules, where the pj are uniquely determined monic polynomials of degree

at least 1, such that

pm | pm−1 | · · · | p1.

Each cyclic module K[X]/(pi) is isomorphic to a cyclic subspace for f , say Vi, whose minimal

polynomial is pi.

It is customary to renumber the polynomials pi as follows. The n polynomials q1, . . . , qn

are defined by:

if 1 ≤ j ≤ n − m

qj(X) =

pn−j+1(X) if n − m + 1 ≤ j ≤ n.

Then, we see that

q1 | q2 | · · · | qn,

where the first n − m polynomials are equal to 1, and we have the direct sum

E = E1 ⊕ · · · ⊕ En,

where Ei is a cyclic subspace for f whose minimal polynomial is qi. In particular, Ei = (0)

for i = 1, . . . , n − m. Theorem 24.38 also says that the minimal polynomial of f is qn = p1.

We sum all this up in the following theorem.

Theorem 25.2. (Cyclic Decomposition Theorem, First Version) Let f : E → E be an

endomorphism on a K-vector space of dimension n.

There exist n monic polynomials

q1, . . . , qn ∈ K[X] such that

q1 | q2 | · · · | qn,

and E is the direct sum

E = E1 ⊕ · · · ⊕ En

of cyclic subspaces Ei = Z(ui; f ) for f , such that the minimal polynomial of the restriction

of f to Ei is qi. The polynomials qi satisfying the above conditions are unique, and qn is the

minimal polynomial of f .

In view of translation point (4) at the beginning of Section 24.5, we know that over the

basis

(ui, f(ui), . . . , f ni−1(ui))

25.1. THE RATIONAL CANONICAL FORM

713

of the cyclic subspace Ei = Z(ui; f ), with ni = deg(qi), the matrix of the restriction of f to

Ei is the companion matrix of pi(X), of the form

0

· · · 0

−a 

· · · 0

−a



1 

0

· · · 0

−a 



2 

 .

 .

 .

. . ... ... ..

.. 







. .



0

. 0 −an 

i−2

· · · 1 −ani−1

If we put all these bases together, we obtain a block matrix whose blocks are of the above

form. Therefore, we proved the following result.

Theorem 25.3. (Rational Canonical Form, First Version) Let f : E → E be an endomor-

phism on a K-vector space of dimension n. There exist n monic polynomials q1, . . . , qn ∈

K[X] such that

q1 | q2 | · · · | qn,

with q1 = · · · = qn−m = 1, and a basis of E such that the matrix X of f is a block matrix of

the form

A



n−m+1

· · ·



An−m+2 · · ·

0 





X =



. .

..  ,







· · · A





n−1

0 

· · ·

where each Ai is the companion matrix of qi. The polynomials qi satisfying the above condi-

tions are unique, and qn is the minimal polynomial of f .

Definition 25.1. A matrix X as in Theorem 25.3 is called a matrix in rational form. The

polynomials q1, . . . , qn arising in Theorems 25.2 and 25.3 are called the similarity invariants

(or invariant factors) of f .

Theorem 25.3 shows that every matrix is similar to a matrix in rational form. Such a

matrix is unique.

By Proposition 24.20, two linear maps f and f are similar iff there is an isomorphism

between Ef and E , and thus by the uniqueness part of Theorem 24.38, iff they have the

same similarity invariants q1, . . . , qn.

Proposition 25.4. If E and E are two finite-dimensional vector spaces and if f : E → E

and f : E → E are two linear maps, then f and f are similar iff they have the same

similarity invariants.

The effect of extending the fied K to a field L is the object of the next proposition.

714

CHAPTER 25. NORMAL FORMS; THE RATIONAL CANONICAL FORM

Proposition 25.5. Let f : E → E be a linear map on a K-vector space E, and let (q1, . . . , qn)

be the similarity invariants of f . If L is a field extension of K (which means that K ⊆ L),

and if E(L) = L ⊗K E is the vector space obtained by extending the scalars, and f(L) = 1L ⊗ f

the linear map of E(L) induced by f , then the similarity invariants of f(L) are (q1, . . . , qn)

viewed as polynomials in L[X].

Proof. We know that Ef is isomorphic to the direct sum

Ef ≈ K[X]/(q1K[X]) ⊕ · · · ⊕ K[X]/(qnK[X]),

so by tensoring with L[X] and using Propositions 24.12 and 23.7, we get

L[X] ⊗K[X] Ef ≈ L[X] ⊗K[X] (K[X]/(q1K[X]) ⊕ · · · ⊕ K[X]/(qnK[X]))

≈ L[X] ⊗K[X] (K[X]/(q1K[X])) ⊕ · · · ⊕ L[X] ⊗K[X] (K[X]/(qnK[X]))

≈ (K[X]/(q1K[X])) ⊗K[X] L[X] ⊕ · · · ⊕ (K[X]/(qnK[X])) ⊗K[X] L[X].

However, by Proposition 24.14, we have isomorphisms

(K[X]/(qiK[X])) ⊗K[X] L[X] ≈ L[X]/(qiL[X]),

so we get

L[X] ⊗K[X] Ef ≈ L[X]/(q1L[X]) ⊕ · · · ⊕ L[X]/(qnL[X]).

Since Ef is a K[X]-module, the L[X] module L[X] ⊗K[X] Ef is the module obtained from

Ef by the ring extension K[X] ⊆ L[X], and since f is a K[X]-linear map of Ef , it becomes

f(L[X]) on L[X] ⊗K[X] Ef , which is the same as f(L) viewed as an L-linear map of the space

E(L) = L ⊗K E, so L[X] ⊗K[X] Ef is actually isomorphic to E(L)f , and we have

(L)

E(L)f

≈ L[X]/(q

(L)

1L[X ]) ⊕ · · · ⊕ L[X ]/(qnL[X ]),

which shows that (q1, . . . , qn) are the similarity invariants of f(L).

Proposition justifies the terminology “invariant” in similarity invariants. Indeed, under

a field extension K ⊆ L, the similarity invariants of f(L) remain the same. This is not true

of the elementary divisors, which depend on the field; indeed, an irreducible polynomial

p ∈ K[X] may split over L[X]. Since qn is the minimal polynomial of f, the above reasoning

also shows that the minimal polynomial of f(L) remains the same under a field extension.

Proposition 25.5 has the following corollary.

Proposition 25.6. Let K be a field and let L ⊇ K be a field extension of K. For any

two square matrices X and Y over K, if there is an invertible matrix Q over L such that

Y = QXQ−1, then there is an invertible matrix P over K such that Y = P XP −1.

25.1. THE RATIONAL CANONICAL FORM

715

Recall from Theorem 24.21 that the sequence of K[X]-linear maps

/ E[X]

/ E[X] σ / E

is exact, and as a consequence, Ef is isomorphic to the quotient of E[X] by Im(X1 − f).

Furthermore, because E is a vector space, E[X] is a free module with basis (1⊗u1, . . . , 1⊗un),

where (u1, . . . , un) is a basis of E. By Theorem 24.38, we have an isomorphism

Ef ≈ K[X]/(q1K[X]) ⊕ · · · ⊕ K[X]/(qnK[X]),

and by Proposition 24.39, E[X]/Im(X1 − f) isomorphic to a direct sum

E[X]/Im(X1 − f) ≈ K[X]/(p1K[X]) ⊕ · · · ⊕ K[X]/(pmK[X]),

where p1, . . . , pm are the invariant factors of Im(X1−f) with respect to E[X]. Since E[X] ≈

E[X]/Im(X1 − f), by the uniqueness part of Theorem 24.38 and because the polynomials

are monic, we must have m = n and pi = qi, for i = 1, . . . , n. Therefore, we proved the

following crucial fact:

Proposition 25.7. For any linear map f : E → E over a K-vector space E of dimension n,

the similarity invariants of f are equal to the invariant factors of Im(X1 − f) with respect

to E[X].

Proposition 25.7 is the key to computing the similarity invariants of a linear map. This

can be done using a procedure to convert XI − U to its Smith normal form. Propositions

25.7 and 24.44 yield the following result.

Proposition 25.8. For any linear map f : E → E over a K-vector space E of dimension n,

if (q1, . . . , qn) are the similarity invariants of f , for any matrix U representing f with respect

to any basis, then for k = 1, . . . , n the product

dk(X) = q1(X) · · · qk(X)

is the gcd of the k × k-minors of the matrix XI − U.

Note that the matrix XI − U is nonother than the matrix that yields the characteristic

polynomial χf (X) = det(XI − U) of f.

Proposition 25.9. For any linear map f : E → E over a K-vector space E of dimension

n, if (q1, . . . , qn) are the similarity invariants of f , then the following properties hold:

(1) If χf (X) is the characteristic polynomial of f , then

χf (X) = q1(X) · · · qn(X).

716

CHAPTER 25. NORMAL FORMS; THE RATIONAL CANONICAL FORM

(2) The minimal polynomial m(X) = qn(X) of f divides the characteristic polynomial

χf X) of f .

(3) The characteristic polynomial χf X) divides m(X)n.

(4) E is cyclic for f iff m(X) = χ(X).

Proof. Property (1) follows from Proposition 25.8 for k = n. It also follows from Theorem

25.3 and the fact that for the companion matrix associated with qi, the characteristic poly-

nomial of this matrix is also qi. Property (2) is obvious from (1). Since each qi divides qi+1,

each qi divides qn, so their product χf (X) divides m(X)n = qn(X)n. The last condition says

that q1 = · · · = qn−1 = 1, which means that Ef has a single summand.

Observe that Proposition 25.9 yields another proof of the Cayley–Hamilton Theorem. It

also implies that a linear map is nilpotent iff its characteristic polynomial is Xn.

25.2

The Rational Canonical Form, Second Version

Let us now translate the Elementary Divisors Decomposition Theorem, Theorem 24.45, in

terms of Ef . We obtain the following result.

Theorem 25.10. (Cyclic Decomposition Theorem, Second Version) Let f : E → E be an

endomorphism on a K-vector space of dimension n. Then, E is the direct sum of of cyclic

subspaces Ej = Z(uj; f ) for f , such that the minimal polynomial of Ej is of the form pni,j ,

for some irreducible monic polynomials p1, . . . , pt ∈ K[X] and some positive integers ni,j,

such that for each i = 1, . . . , t, there is a sequence of integers

1 ≤ ni,1, . . . , ni,1 < ni,2, . . . , ni,2 < · · · < ni,s , . . . , n ,

i,si

mi,1

mi,2

mi,si

with si ≥ 1, and where ni,j occurs mi,j ≥ 1 times, for j = 1, . . . , si. Furthermore, the monic

polynomials pi and the integers r, t, ni,j, si, mi,j are uniquely determined.

Note that there are µ =

mi,j cyclic subspaces Z(uj; f ). Using bases for the cyclic

subspaces Z(uj; f ) as in Theorem 25.3, we get the following theorem.

Theorem 25.11. (Rational Canonical Form, Second Version) Let f : E → E be an en-

domorphism on a K-vector space of dimension n. There exist t distinct irreducible monic

polynomials p1, . . . , pt ∈ K[X] and some positive integers ni,j, such that for each i = 1, . . . , t,

there is a sequence of integers

1 ≤ ni,1, . . . , ni,1 < ni,2, . . . , ni,2 < · · · < ni,s , . . . , n ,

i,si

mi,1

mi,2

mi,si

25.3. THE JORDAN FORM REVISITED

717

with si ≥ 1, and where ni,j occurs mi,j ≥ 1 times, for j = 1, . . . , si, and there is a basis of E

such that the matrix X of f is a block matrix of the form

A



· · ·

 0

A2 · · ·

0 





X =

 ..

. .

..  ,





 0

· · · A





µ−1

0 

· · ·

Aµ

where each Aj is the companion matrix of some pni,j , and µ =

i,j . The monic polyno-

mials p1, . . . , pt and the integers r, t, ni,j, si, mi,j are uniquely determined

The polynomials pni,j are called the elementary divisors of f (and X). These polynomials

are factors of the minimal polynomial.

As we pointed earlier, unlike the similarity invariants, the elementary divisors may change

when we pass to a field extension.

We will now consider the special case where all the irreducible polynomials pi are of the

form X − λi; that is, when are the eigenvalues of f belong to K. In this case, we find again

the Jordan form.

25.3

The Jordan Form Revisited

In this section, we assume that all the roots of the minimal polynomial of f belong to K.

This will be the case if K is algebraically closed. The irreducible polynomials pi of Theorem

25.10 are the polynomials X − λi, for the distinct eigenvalues λi of f. Then, each cyclic

subspace Z(uj; f ) has a minimal polynomial of the form (X − λ)m, for some eigenvalue λ of

f and some m ≥ 1. It turns out that by choosing a suitable basis for the cyclic subspace

Z(uj; f ), the matrix of the restriction of f to Z(uj; f ) is a Jordan block.

Proposition 25.12. Let E be a finite-dimensional K-vector space and let f : E → E be a

linear map. If E is a cyclic K[X]-module and if (X − λ)n is the minimal polynomial of f,

then there is a basis of E of the form

((f − λid)n−1(u), (f − λid)n−2(u), . . . , (f − λid)(u), u),

for some u ∈ E. With respect to this basis, the matrix of f is the Jordan block

λ 1

· · · 0

0 λ

· · · 0





 .

. 





n(λ) =







. .



0

1

0 0

· · · λ

718

CHAPTER 25. NORMAL FORMS; THE RATIONAL CANONICAL FORM

Proof. Since E is a cyclic K[X]-module, there is some u ∈ E so that E is generated by

u, f (u), f 2(u), . . ., which means that every vector in E is of the form p(f )(u), for some

polynomial, p(X). We claim that u, f (u), . . . , f n−2(u), f n−1(u) generate E, which implies

that the dimension of E is at most n.

This is because if p(X) is any polynomial of degree at least n, then we can divide p(X)

by (X − λ)n, obtaining

p = (X − λ)nq + r,

where 0 ≤ deg(r) < n, and as (X − λ)n annihilates E, we get

p(f )(u) = r(f )(u),

which means that every vector of the form p(f )(u) with p(X) of degree ≥ n is actually a

linear combination of u, f (u), . . . , f n−2(u), f n−1(u).

We claim that the vectors

u, (f − λid)(u), . . . , (f − λid)n−2(u)(f − λid)n−1(u)

are linearly independent. Indeed, if we had a nontrivial linear combination

a0(f − λid)n−1(u) + a1(f − λid)n−2(u) + · · · + an−2(f − λid)(u) + an−1u = 0,

then the polynomial

a0(X − λ)n−1 + a1(X − λ)n−2 + · · · + an−2(X − λ) + an−1

of degree at most n − 1 would annihilate E, contradicting the fact that (X − λ)n is the

minimal polynomial of f (and thus, of smallest degree). Consequently, as the dimension of

E is at most n,

((f − λid)n−1(u), (f − λid)n−2(u), . . . , (f − λid)(u), u),

is a basis of E and since u, f (u), . . . , f n−2(u), f n−1(u) span E,

(u, f (u), . . . , f n−2(u), f n−1(u))

is also a basis of E.

Let us see how f acts on the basis

((f − λid)n−1(u), (f − λid)n−2(u), . . . , (f − λid)(u), u).

If we write f = f − λid + λid, as (f − λid)n annihilates E, we get

f ((f − λid)n−1(u)) = (f − λid)n(u) + λ(f − λid)n−1(u) = λ(f − λid)n−1(u)

and

f ((f − λid)k(u)) = (f − λid)k+1(u) + λ(f − λid)k(u),

0 ≤ k ≤ n − 2.

But this means precisely that the matrix of f in this basis is the Jordan block Jn(λ).

25.4. THE SMITH NORMAL FORM

719

Combining Theorem 25.11 and Proposition 25.12, we obtain a strong version of the

Jordan form.

Theorem 25.13. (Jordan Canonical Form) Let E be finite-dimensional K-vector space.

The following properties are equivalent:

(1) The eigenvalues of f all belong to K.

(2) There is a basis of E in which the matrix of f is upper (or lower) triangular.

(3) There exist a basis of E in which the matrix A of f is Jordan matrix. Furthermore, the

number of Jordan blocks Jr(λ) appearing in A, for fixed r and λ, is uniquely determined

by f .

Proof. The implication (1) =⇒ (3) follows from Theorem 25.11 and Proposition 25.12. The

implications (3) =⇒ (2) and (2) =⇒ (1) are trivial.

Compared to Theorem 22.14, the new ingredient is the uniqueness assertion in (3), which

is not so easy to prove.

Observe that the minimal polynomial of f is the least common multiple of the polynomials

(X − λ)r associated with the Jordan blocks Jr(λ) appearing in A, and the characteristic

polynomial of A is the product of these polynomials.

We now return to the problem of computing effectively the similarity invariants of a

matrix A. By Proposition 25.7, this is equivalent to computing the invariant factors of

XI − A. In principle, this can be done using Proposition 24.42. A procedure to do this

effectively for the ring A = K[X] is to convert XI − A to its Smith normal form. This will

also yield the rational canonical form for A.

25.4

The Smith Normal Form

The Smith normal form is the special case of Proposition 24.42 applied to the PID K[X]

where K is a field, but it also says that the matrices P and Q are products of elementary

matrices. It turns out that such a result holds for any Euclidean ring, and the proof is

basically the same.

Recall from Definition 20.9 that a Euclidean ring is an integral domain A such that there

exists a function σ : A → N with the following property: For all a, b ∈ A with b = 0, there

are some q, r ∈ A such that

a = bq + r

and σ(r) < σ(b).

Note that the pair (q, r) is not necessarily unique.

We make use of the elementary row and column operations P (i, k), Ei,j;β, and Ei,λ de-

scribed in Chapter 6, where we require the scalar λ used in Ei,λ to be a unit.

720

CHAPTER 25. NORMAL FORMS; THE RATIONAL CANONICAL FORM

Theorem 25.14. If M is an m × n matrix over a Euclidean ring A, then there exist some

invertible n × n matrix P and some invertible m × m matrix Q, where P and Q are products

of elementary matrices, and a m × n matrix D of the form

α



· · ·

0 · · · 0



· · ·

0 · · · 0

 ..

. .

..

 .

· · · .





D =  0

· · · α



0 · · · 0





 0

· · ·

0 · · · 0





 .

.. ..

..

 .

· · ·

. 

· · ·

0 · · · 0

for some nonzero αi ∈ A, such that

(1) α1 | α2 | · · · | αr, and

(2) M = QDP −1.

Proof. We follow Jacobson’s proof [57] (Chapter 3, Theorem 3.8). We proceed by induction

on m + n.

If m = n = 1, let P = (1) and Q = (1).

For the induction step, if M = 0, let P = In and Q = Im. If M = 0, the stategy is to

apply a sequence of elementary transformations that converts M to a matrix of the form

α



0 · · · 0

 0



M =  .



 ..