Other Normal Forms
25.1
The Rational Canonical Form
Let E be a finite-dimensional vector space over a field K, and let f : E → E be an endomor-
phism of E. We know from Section 24.5 that there is a K[X]-module Ef associated with f ,
and that Mf is a finitely generated torsion module over the PID K[X]. In this chapter, we
show how Theorems from Sections 24.6 and 24.7 yield important results about the structure
of the linear map f .
Recall that the annihilator of a subspace V is an ideal (p) uniquely defined by a monic
polynomial p called the minimal polynomial of V .
Our first result is obtained by translating the primary decomposition theorem, Theorem
24.26. It is not too surprising that we obtain again Theorem 22.7!
Theorem 25.1. (Primary Decomposition Theorem) Let f : E → E be a linear map on the
finite-dimensional vector space E over the field K. Write the minimal polynomial m of f as
m = pr1
1 · · · prk ,
k
where the pi are distinct irreducible monic polynomials over K, and the ri are positive inte-
gers. Let
Wi = Ker (pi(f )ri),
i = 1, . . . , k.
Then
(a) E = W1 ⊕ · · · ⊕ Wk.
(b) Each Wi is invariant under f and the projection from W onto Wi is given by a poly-
nomial in f .
(c) The minimal polynomial of the restriction f | Wi of f to Wi is pri.
i
711
712
CHAPTER 25. NORMAL FORMS; THE RATIONAL CANONICAL FORM
Next, we apply the Invariant Factors Decomposition Theorem, Theorem 24.38, to Ef .
This theorem says that Ef is isomorphic to a direct sum
Ef ≈ K[X]/(p1) ⊕ · · · ⊕ K[X]/(pm)
of m ≤ n cyclic modules, where the pj are uniquely determined monic polynomials of degree
at least 1, such that
pm | pm−1 | · · · | p1.
Each cyclic module K[X]/(pi) is isomorphic to a cyclic subspace for f , say Vi, whose minimal
polynomial is pi.
It is customary to renumber the polynomials pi as follows. The n polynomials q1, . . . , qn
are defined by:
1
if 1 ≤ j ≤ n − m
qj(X) =
pn−j+1(X) if n − m + 1 ≤ j ≤ n.
Then, we see that
q1 | q2 | · · · | qn,
where the first n − m polynomials are equal to 1, and we have the direct sum
E = E1 ⊕ · · · ⊕ En,
where Ei is a cyclic subspace for f whose minimal polynomial is qi. In particular, Ei = (0)
for i = 1, . . . , n − m. Theorem 24.38 also says that the minimal polynomial of f is qn = p1.
We sum all this up in the following theorem.
Theorem 25.2. (Cyclic Decomposition Theorem, First Version) Let f : E → E be an
endomorphism on a K-vector space of dimension n.
There exist n monic polynomials
q1, . . . , qn ∈ K[X] such that
q1 | q2 | · · · | qn,
and E is the direct sum
E = E1 ⊕ · · · ⊕ En
of cyclic subspaces Ei = Z(ui; f ) for f , such that the minimal polynomial of the restriction
of f to Ei is qi. The polynomials qi satisfying the above conditions are unique, and qn is the
minimal polynomial of f .
In view of translation point (4) at the beginning of Section 24.5, we know that over the
basis
(ui, f(ui), . . . , f ni−1(ui))
25.1. THE RATIONAL CANONICAL FORM
713
of the cyclic subspace Ei = Z(ui; f ), with ni = deg(qi), the matrix of the restriction of f to
Ei is the companion matrix of pi(X), of the form
0
0
0
· · · 0
−a
0
1
0
0
· · · 0
−a
1
0
1
0
· · · 0
−a
2
.
.
.
.
.
.
.
. . ... ... ..
..
. .
0
0
0
. 0 −an
i−2
0
0
0
· · · 1 −ani−1
If we put all these bases together, we obtain a block matrix whose blocks are of the above
form. Therefore, we proved the following result.
Theorem 25.3. (Rational Canonical Form, First Version) Let f : E → E be an endomor-
phism on a K-vector space of dimension n. There exist n monic polynomials q1, . . . , qn ∈
K[X] such that
q1 | q2 | · · · | qn,
with q1 = · · · = qn−m = 1, and a basis of E such that the matrix X of f is a block matrix of
the form
A
n−m+1
0
· · ·
0
0
0
An−m+2 · · ·
0
0
X =
.
.
.
.
.
..
..
. .
..
.. ,
0
0
· · · A
n−1
0
0
0
· · ·
0
An
where each Ai is the companion matrix of qi. The polynomials qi satisfying the above condi-
tions are unique, and qn is the minimal polynomial of f .
Definition 25.1. A matrix X as in Theorem 25.3 is called a matrix in rational form. The
polynomials q1, . . . , qn arising in Theorems 25.2 and 25.3 are called the similarity invariants
(or invariant factors) of f .
Theorem 25.3 shows that every matrix is similar to a matrix in rational form. Such a
matrix is unique.
By Proposition 24.20, two linear maps f and f are similar iff there is an isomorphism
between Ef and E , and thus by the uniqueness part of Theorem 24.38, iff they have the
f
same similarity invariants q1, . . . , qn.
Proposition 25.4. If E and E are two finite-dimensional vector spaces and if f : E → E
and f : E → E are two linear maps, then f and f are similar iff they have the same
similarity invariants.
The effect of extending the fied K to a field L is the object of the next proposition.
714
CHAPTER 25. NORMAL FORMS; THE RATIONAL CANONICAL FORM
Proposition 25.5. Let f : E → E be a linear map on a K-vector space E, and let (q1, . . . , qn)
be the similarity invariants of f . If L is a field extension of K (which means that K ⊆ L),
and if E(L) = L ⊗K E is the vector space obtained by extending the scalars, and f(L) = 1L ⊗ f
the linear map of E(L) induced by f , then the similarity invariants of f(L) are (q1, . . . , qn)
viewed as polynomials in L[X].
Proof. We know that Ef is isomorphic to the direct sum
Ef ≈ K[X]/(q1K[X]) ⊕ · · · ⊕ K[X]/(qnK[X]),
so by tensoring with L[X] and using Propositions 24.12 and 23.7, we get
L[X] ⊗K[X] Ef ≈ L[X] ⊗K[X] (K[X]/(q1K[X]) ⊕ · · · ⊕ K[X]/(qnK[X]))
≈ L[X] ⊗K[X] (K[X]/(q1K[X])) ⊕ · · · ⊕ L[X] ⊗K[X] (K[X]/(qnK[X]))
≈ (K[X]/(q1K[X])) ⊗K[X] L[X] ⊕ · · · ⊕ (K[X]/(qnK[X])) ⊗K[X] L[X].
However, by Proposition 24.14, we have isomorphisms
(K[X]/(qiK[X])) ⊗K[X] L[X] ≈ L[X]/(qiL[X]),
so we get
L[X] ⊗K[X] Ef ≈ L[X]/(q1L[X]) ⊕ · · · ⊕ L[X]/(qnL[X]).
Since Ef is a K[X]-module, the L[X] module L[X] ⊗K[X] Ef is the module obtained from
Ef by the ring extension K[X] ⊆ L[X], and since f is a K[X]-linear map of Ef , it becomes
f(L[X]) on L[X] ⊗K[X] Ef , which is the same as f(L) viewed as an L-linear map of the space
E(L) = L ⊗K E, so L[X] ⊗K[X] Ef is actually isomorphic to E(L)f , and we have
(L)
E(L)f
≈ L[X]/(q
(L)
1L[X ]) ⊕ · · · ⊕ L[X ]/(qnL[X ]),
which shows that (q1, . . . , qn) are the similarity invariants of f(L).
Proposition justifies the terminology “invariant” in similarity invariants. Indeed, under
a field extension K ⊆ L, the similarity invariants of f(L) remain the same. This is not true
of the elementary divisors, which depend on the field; indeed, an irreducible polynomial
p ∈ K[X] may split over L[X]. Since qn is the minimal polynomial of f, the above reasoning
also shows that the minimal polynomial of f(L) remains the same under a field extension.
Proposition 25.5 has the following corollary.
Proposition 25.6. Let K be a field and let L ⊇ K be a field extension of K. For any
two square matrices X and Y over K, if there is an invertible matrix Q over L such that
Y = QXQ−1, then there is an invertible matrix P over K such that Y = P XP −1.
25.1. THE RATIONAL CANONICAL FORM
715
Recall from Theorem 24.21 that the sequence of K[X]-linear maps
ψ
0
/ E[X]
/ E[X] σ / E
/
f
0
is exact, and as a consequence, Ef is isomorphic to the quotient of E[X] by Im(X1 − f).
Furthermore, because E is a vector space, E[X] is a free module with basis (1⊗u1, . . . , 1⊗un),
where (u1, . . . , un) is a basis of E. By Theorem 24.38, we have an isomorphism
Ef ≈ K[X]/(q1K[X]) ⊕ · · · ⊕ K[X]/(qnK[X]),
and by Proposition 24.39, E[X]/Im(X1 − f) isomorphic to a direct sum
E[X]/Im(X1 − f) ≈ K[X]/(p1K[X]) ⊕ · · · ⊕ K[X]/(pmK[X]),
where p1, . . . , pm are the invariant factors of Im(X1−f) with respect to E[X]. Since E[X] ≈
E[X]/Im(X1 − f), by the uniqueness part of Theorem 24.38 and because the polynomials
are monic, we must have m = n and pi = qi, for i = 1, . . . , n. Therefore, we proved the
following crucial fact:
Proposition 25.7. For any linear map f : E → E over a K-vector space E of dimension n,
the similarity invariants of f are equal to the invariant factors of Im(X1 − f) with respect
to E[X].
Proposition 25.7 is the key to computing the similarity invariants of a linear map. This
can be done using a procedure to convert XI − U to its Smith normal form. Propositions
25.7 and 24.44 yield the following result.
Proposition 25.8. For any linear map f : E → E over a K-vector space E of dimension n,
if (q1, . . . , qn) are the similarity invariants of f , for any matrix U representing f with respect
to any basis, then for k = 1, . . . , n the product
dk(X) = q1(X) · · · qk(X)
is the gcd of the k × k-minors of the matrix XI − U.
Note that the matrix XI − U is nonother than the matrix that yields the characteristic
polynomial χf (X) = det(XI − U) of f.
Proposition 25.9. For any linear map f : E → E over a K-vector space E of dimension
n, if (q1, . . . , qn) are the similarity invariants of f , then the following properties hold:
(1) If χf (X) is the characteristic polynomial of f , then
χf (X) = q1(X) · · · qn(X).
716
CHAPTER 25. NORMAL FORMS; THE RATIONAL CANONICAL FORM
(2) The minimal polynomial m(X) = qn(X) of f divides the characteristic polynomial
χf X) of f .
(3) The characteristic polynomial χf X) divides m(X)n.
(4) E is cyclic for f iff m(X) = χ(X).
Proof. Property (1) follows from Proposition 25.8 for k = n. It also follows from Theorem
25.3 and the fact that for the companion matrix associated with qi, the characteristic poly-
nomial of this matrix is also qi. Property (2) is obvious from (1). Since each qi divides qi+1,
each qi divides qn, so their product χf (X) divides m(X)n = qn(X)n. The last condition says
that q1 = · · · = qn−1 = 1, which means that Ef has a single summand.
Observe that Proposition 25.9 yields another proof of the Cayley–Hamilton Theorem. It
also implies that a linear map is nilpotent iff its characteristic polynomial is Xn.
25.2
The Rational Canonical Form, Second Version
Let us now translate the Elementary Divisors Decomposition Theorem, Theorem 24.45, in
terms of Ef . We obtain the following result.
Theorem 25.10. (Cyclic Decomposition Theorem, Second Version) Let f : E → E be an
endomorphism on a K-vector space of dimension n. Then, E is the direct sum of of cyclic
subspaces Ej = Z(uj; f ) for f , such that the minimal polynomial of Ej is of the form pni,j ,
i
for some irreducible monic polynomials p1, . . . , pt ∈ K[X] and some positive integers ni,j,
such that for each i = 1, . . . , t, there is a sequence of integers
1 ≤ ni,1, . . . , ni,1 < ni,2, . . . , ni,2 < · · · < ni,s , . . . , n ,
i
i,si
mi,1
mi,2
mi,si
with si ≥ 1, and where ni,j occurs mi,j ≥ 1 times, for j = 1, . . . , si. Furthermore, the monic
polynomials pi and the integers r, t, ni,j, si, mi,j are uniquely determined.
Note that there are µ =
mi,j cyclic subspaces Z(uj; f ). Using bases for the cyclic
subspaces Z(uj; f ) as in Theorem 25.3, we get the following theorem.
Theorem 25.11. (Rational Canonical Form, Second Version) Let f : E → E be an en-
domorphism on a K-vector space of dimension n. There exist t distinct irreducible monic
polynomials p1, . . . , pt ∈ K[X] and some positive integers ni,j, such that for each i = 1, . . . , t,
there is a sequence of integers
1 ≤ ni,1, . . . , ni,1 < ni,2, . . . , ni,2 < · · · < ni,s , . . . , n ,
i
i,si
mi,1
mi,2
mi,si
25.3. THE JORDAN FORM REVISITED
717
with si ≥ 1, and where ni,j occurs mi,j ≥ 1 times, for j = 1, . . . , si, and there is a basis of E
such that the matrix X of f is a block matrix of the form
A
1
0
· · ·
0
0
0
A2 · · ·
0
0
X =
.
.
.
.
.
..
..
. .
..
.. ,
0
0
· · · A
µ−1
0
0
0
· · ·
0
Aµ
where each Aj is the companion matrix of some pni,j , and µ =
m
i
i,j . The monic polyno-
mials p1, . . . , pt and the integers r, t, ni,j, si, mi,j are uniquely determined
The polynomials pni,j are called the elementary divisors of f (and X). These polynomials
i
are factors of the minimal polynomial.
As we pointed earlier, unlike the similarity invariants, the elementary divisors may change
when we pass to a field extension.
We will now consider the special case where all the irreducible polynomials pi are of the
form X − λi; that is, when are the eigenvalues of f belong to K. In this case, we find again
the Jordan form.
25.3
The Jordan Form Revisited
In this section, we assume that all the roots of the minimal polynomial of f belong to K.
This will be the case if K is algebraically closed. The irreducible polynomials pi of Theorem
25.10 are the polynomials X − λi, for the distinct eigenvalues λi of f. Then, each cyclic
subspace Z(uj; f ) has a minimal polynomial of the form (X − λ)m, for some eigenvalue λ of
f and some m ≥ 1. It turns out that by choosing a suitable basis for the cyclic subspace
Z(uj; f ), the matrix of the restriction of f to Z(uj; f ) is a Jordan block.
Proposition 25.12. Let E be a finite-dimensional K-vector space and let f : E → E be a
linear map. If E is a cyclic K[X]-module and if (X − λ)n is the minimal polynomial of f,
then there is a basis of E of the form
((f − λid)n−1(u), (f − λid)n−2(u), . . . , (f − λid)(u), u),
for some u ∈ E. With respect to this basis, the matrix of f is the Jordan block
λ 1
0
· · · 0
0 λ
1
· · · 0
.
.
.
.
.
J
.
.
.
.
.
n(λ) =
.
.
.
.
.
.
. .
0
0
0
.
1
0 0
0
· · · λ
718
CHAPTER 25. NORMAL FORMS; THE RATIONAL CANONICAL FORM
Proof. Since E is a cyclic K[X]-module, there is some u ∈ E so that E is generated by
u, f (u), f 2(u), . . ., which means that every vector in E is of the form p(f )(u), for some
polynomial, p(X). We claim that u, f (u), . . . , f n−2(u), f n−1(u) generate E, which implies
that the dimension of E is at most n.
This is because if p(X) is any polynomial of degree at least n, then we can divide p(X)
by (X − λ)n, obtaining
p = (X − λ)nq + r,
where 0 ≤ deg(r) < n, and as (X − λ)n annihilates E, we get
p(f )(u) = r(f )(u),
which means that every vector of the form p(f )(u) with p(X) of degree ≥ n is actually a
linear combination of u, f (u), . . . , f n−2(u), f n−1(u).
We claim that the vectors
u, (f − λid)(u), . . . , (f − λid)n−2(u)(f − λid)n−1(u)
are linearly independent. Indeed, if we had a nontrivial linear combination
a0(f − λid)n−1(u) + a1(f − λid)n−2(u) + · · · + an−2(f − λid)(u) + an−1u = 0,
then the polynomial
a0(X − λ)n−1 + a1(X − λ)n−2 + · · · + an−2(X − λ) + an−1
of degree at most n − 1 would annihilate E, contradicting the fact that (X − λ)n is the
minimal polynomial of f (and thus, of smallest degree). Consequently, as the dimension of
E is at most n,
((f − λid)n−1(u), (f − λid)n−2(u), . . . , (f − λid)(u), u),
is a basis of E and since u, f (u), . . . , f n−2(u), f n−1(u) span E,
(u, f (u), . . . , f n−2(u), f n−1(u))
is also a basis of E.
Let us see how f acts on the basis
((f − λid)n−1(u), (f − λid)n−2(u), . . . , (f − λid)(u), u).
If we write f = f − λid + λid, as (f − λid)n annihilates E, we get
f ((f − λid)n−1(u)) = (f − λid)n(u) + λ(f − λid)n−1(u) = λ(f − λid)n−1(u)
and
f ((f − λid)k(u)) = (f − λid)k+1(u) + λ(f − λid)k(u),
0 ≤ k ≤ n − 2.
But this means precisely that the matrix of f in this basis is the Jordan block Jn(λ).
25.4. THE SMITH NORMAL FORM
719
Combining Theorem 25.11 and Proposition 25.12, we obtain a strong version of the
Jordan form.
Theorem 25.13. (Jordan Canonical Form) Let E be finite-dimensional K-vector space.
The following properties are equivalent:
(1) The eigenvalues of f all belong to K.
(2) There is a basis of E in which the matrix of f is upper (or lower) triangular.
(3) There exist a basis of E in which the matrix A of f is Jordan matrix. Furthermore, the
number of Jordan blocks Jr(λ) appearing in A, for fixed r and λ, is uniquely determined
by f .
Proof. The implication (1) =⇒ (3) follows from Theorem 25.11 and Proposition 25.12. The
implications (3) =⇒ (2) and (2) =⇒ (1) are trivial.
Compared to Theorem 22.14, the new ingredient is the uniqueness assertion in (3), which
is not so easy to prove.
Observe that the minimal polynomial of f is the least common multiple of the polynomials
(X − λ)r associated with the Jordan blocks Jr(λ) appearing in A, and the characteristic
polynomial of A is the product of these polynomials.
We now return to the problem of computing effectively the similarity invariants of a
matrix A. By Proposition 25.7, this is equivalent to computing the invariant factors of
XI − A. In principle, this can be done using Proposition 24.42. A procedure to do this
effectively for the ring A = K[X] is to convert XI − A to its Smith normal form. This will
also yield the rational canonical form for A.
25.4
The Smith Normal Form
The Smith normal form is the special case of Proposition 24.42 applied to the PID K[X]
where K is a field, but it also says that the matrices P and Q are products of elementary
matrices. It turns out that such a result holds for any Euclidean ring, and the proof is
basically the same.
Recall from Definition 20.9 that a Euclidean ring is an integral domain A such that there
exists a function σ : A → N with the following property: For all a, b ∈ A with b = 0, there
are some q, r ∈ A such that
a = bq + r
and σ(r) < σ(b).
Note that the pair (q, r) is not necessarily unique.
We make use of the elementary row and column operations P (i, k), Ei,j;β, and Ei,λ de-
scribed in Chapter 6, where we require the scalar λ used in Ei,λ to be a unit.
720
CHAPTER 25. NORMAL FORMS; THE RATIONAL CANONICAL FORM
Theorem 25.14. If M is an m × n matrix over a Euclidean ring A, then there exist some
invertible n × n matrix P and some invertible m × m matrix Q, where P and Q are products
of elementary matrices, and a m × n matrix D of the form
α
1
0
· · ·
0
0 · · · 0
0
α
2
· · ·
0
0 · · · 0
..
..
. .
..
..
..
.
.
.
.
.
· · · .
D = 0
0
· · · α
r
0 · · · 0
0
0
· · ·
0
0 · · · 0
.
.
..
..
.. ..
..
.
.
· · ·
.
.
.
.
0
0
· · ·
0
0 · · · 0
for some nonzero αi ∈ A, such that
(1) α1 | α2 | · · · | αr, and
(2) M = QDP −1.
Proof. We follow Jacobson’s proof [57] (Chapter 3, Theorem 3.8). We proceed by induction
on m + n.
If m = n = 1, let P = (1) and Q = (1).
For the induction step, if M = 0, let P = In and Q = Im. If M = 0, the stategy is to
apply a sequence of elementary transformations that converts M to a matrix of the form
α
1
0 · · · 0
0
M = .
..