Nothing is beautiful but the true.
—Hermann Minkowski
9.1 Inner Products, Euclidean Spaces
So far, the framework of vector spaces allows us to deal with ratios of vectors and linear
combinations, but there is no way to express the notion of length of a line segment or to talk
about orthogonality of vectors. A Euclidean structure allows us to deal with metric notions
such as orthogonality and length (or distance).
This chapter covers the bare bones of Euclidean geometry. Deeper aspects of Euclidean
geometry are investigated in Chapter 10. One of our main goals is to give the basic properties
of the transformations that preserve the Euclidean structure, rotations and reflections, since
they play an important role in practice. Euclidean geometry is the study of properties
invariant under certain affine maps called rigid motions. Rigid motions are the maps that
preserve the distance between points.
We begin by defining inner products and Euclidean spaces. The Cauchy–Schwarz inequality and the Minkowski inequality are shown. We define orthogonality of vectors and of
subspaces, orthogonal bases, and orthonormal bases. We prove that every finite-dimensional
Euclidean space has orthonormal bases. The first proof uses duality, and the second one
the Gram–Schmidt orthogonalization procedure. The QR-decomposition for invertible matrices is shown as an application of the Gram–Schmidt procedure. Linear isometries (also
called orthogonal transformations) are defined and studied briefly. We conclude with a short
section in which some applications of Euclidean geometry are sketched. One of the most
important applications, the method of least squares, is discussed in Chapter 17.
For a more detailed treatment of Euclidean geometry, see Berger [6, 7], Snapper and
Troyer [95], or any other book on geometry, such as Pedoe [85], Coxeter [24], Fresnel [38],
Tisseron [105], or Cagnac, Ramis, and Commeau [17]. Serious readers should consult Emil
Artin’s famous book [2], which contains an in-depth study of the orthogonal group, as well
as other groups arising in geometry. It is still worth consulting some of the older classics,
such as Hadamard [51, 52] and Rouché and de Comberousse [86]. The first edition of [51]
was published in 1898, and finally reached its thirteenth edition in 1947! In this chapter it is
assumed that all vector spaces are defined over the field R of real numbers unless specified
otherwise (in a few cases, over the complex numbers C).
First, we define a Euclidean structure on a vector space. Technically, a Euclidean structure over a vector space E is provided by a symmetric bilinear form on the vector space
satisfying some extra properties. Recall that a bilinear form ϕ : E × E → R is definite if for every u ∈ E, u ≠ 0 implies that ϕ(u, u) ≠ 0, and positive if for every u ∈ E, ϕ(u, u) ≥ 0.
Definition 9.1. A Euclidean space is a real vector space E equipped with a symmetric
bilinear form ϕ : E × E → R that is positive definite. More explicitly, ϕ: E × E → R
satisfies the following axioms:
ϕ(u1 + u2, v) = ϕ(u1, v) + ϕ(u2, v),
ϕ(u, v1 + v2) = ϕ(u, v1) + ϕ(u, v2),
ϕ(λu, v) = λϕ(u, v),
ϕ(u, λv) = λϕ(u, v),
ϕ(u, v) = ϕ(v, u),
u ≠ 0 implies that ϕ(u, u) > 0.
The real number ϕ(u, v) is also called the inner product (or scalar product) of u and v. We
also define the quadratic form associated with ϕ as the function Φ : E → R+ such that
Φ(u) = ϕ(u, u),
for all u ∈ E.
Since ϕ is bilinear, we have ϕ(0, 0) = 0, and since it is positive definite, we have the
stronger fact that
ϕ(u, u) = 0 iff u = 0,
that is, Φ(u) = 0 iff u = 0.
Given an inner product ϕ : E × E → R on a vector space E, we also denote ϕ(u, v) by
u · v or ⟨u, v⟩ or (u|v),
and √Φ(u) by ‖u‖.
Example 9.1. The standard example of a Euclidean space is Rⁿ, under the inner product · defined such that
(x1, . . . , xn) · (y1, . . . , yn) = x1y1 + x2y2 + · · · + xnyn.
This Euclidean space is denoted by Eⁿ.
There are other examples.
Example 9.2. For instance, let E be a vector space of dimension 2, and let (e1, e2) be a
basis of E. If a > 0 and b² − ac < 0, the bilinear form defined such that
ϕ(x1e1 + y1e2, x2e1 + y2e2) = ax1x2 + b(x1y2 + x2y1) + cy1y2
yields a Euclidean structure on E. In this case,
Φ(xe1 + ye2) = ax² + 2bxy + cy².
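To make Example 9.2 concrete, here is a small numerical sketch (in Python with NumPy; the values of a, b, c are arbitrary choices satisfying a > 0 and b² − ac < 0, not taken from the text):

```python
import numpy as np

# Gram matrix of the bilinear form of Example 9.2 over the basis (e1, e2);
# the values below are arbitrary, chosen so that a > 0 and b^2 - ac < 0.
a, b, c = 2.0, 1.0, 3.0
G = np.array([[a, b],
              [b, c]])

def phi(u, v):
    return u @ G @ v  # phi(u, v) = u^T G v in coordinates

rng = np.random.default_rng(0)
for _ in range(1000):
    u = rng.standard_normal(2)
    assert phi(u, u) > 0  # positive definiteness on (almost surely) nonzero vectors
```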
Example 9.3. Let C[a, b] denote the set of continuous functions f : [a, b] → R. It is easily
checked that C[a, b] is a vector space of infinite dimension. Given any two functions f, g ∈
C[a, b], let
⟨f, g⟩ = ∫_a^b f(t)g(t) dt.
We leave it as an easy exercise to check that ⟨−, −⟩ is indeed an inner product on C[a, b]. In the case where a = −π and b = π (or a = 0 and b = 2π; this makes basically no difference), one
should compute
⟨sin px, sin qx⟩, ⟨sin px, cos qx⟩, and ⟨cos px, cos qx⟩,
for all natural numbers p, q ≥ 1. The outcome of these calculations is what makes Fourier
analysis possible!
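As a quick illustration, these orthogonality relations can be checked by numerical integration; the following sketch (assuming NumPy and SciPy are available) verifies them for small p and q:

```python
import numpy as np
from scipy.integrate import quad

def inner(f, g):
    # <f, g> = integral of f(t) g(t) over [-pi, pi]
    return quad(lambda t: f(t) * g(t), -np.pi, np.pi)[0]

for p in range(1, 4):
    for q in range(1, 4):
        s = inner(lambda t: np.sin(p * t), lambda t: np.sin(q * t))
        c = inner(lambda t: np.cos(p * t), lambda t: np.cos(q * t))
        m = inner(lambda t: np.sin(p * t), lambda t: np.cos(q * t))
        assert abs(m) < 1e-7                                 # <sin px, cos qx> = 0
        assert abs(s - (np.pi if p == q else 0.0)) < 1e-7    # <sin px, sin qx>
        assert abs(c - (np.pi if p == q else 0.0)) < 1e-7    # <cos px, cos qx>
```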
Example 9.4. Let E = Mn(R) be the vector space of real n×n matrices. If we view a matrix
A ∈ Mn(R) as a “long” column vector obtained by concatenating together its columns, we
can define the inner product of two matrices A, B ∈ Mn(R) as
⟨A, B⟩ = ∑_{i,j=1}^n aij bij,
which can be conveniently written as
⟨A, B⟩ = tr(A⊤B) = tr(B⊤A).
Since this can be viewed as the Euclidean product on Rⁿ², it is an inner product on Mn(R). The corresponding norm
‖A‖_F = √(tr(A⊤A))
is the Frobenius norm (see Section 7.2).
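A short numerical check of the two expressions for the inner product of Example 9.4, and of the Frobenius norm (an illustrative NumPy sketch):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 4))
B = rng.standard_normal((4, 4))

ip_entrywise = np.sum(A * B)        # sum_{i,j} a_ij b_ij
ip_trace = np.trace(A.T @ B)        # tr(A^T B)
assert np.isclose(ip_entrywise, ip_trace)

# The induced norm is the Frobenius norm.
assert np.isclose(np.linalg.norm(A, 'fro'), np.sqrt(np.trace(A.T @ A)))
```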
Let us observe that ϕ can be recovered from Φ. Indeed, by bilinearity and symmetry, we
have
Φ(u + v) = ϕ(u + v, u + v)
= ϕ(u, u + v) + ϕ(v, u + v)
= ϕ(u, u) + 2ϕ(u, v) + ϕ(v, v)
= Φ(u) + 2ϕ(u, v) + Φ(v).
Thus, we have
ϕ(u, v) = (1/2)[Φ(u + v) − Φ(u) − Φ(v)].
We also say that ϕ is the polar form of Φ.
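A quick sanity check of the polar form, here for the standard inner product on R⁵ (an illustrative NumPy sketch):

```python
import numpy as np

rng = np.random.default_rng(2)
u, v = rng.standard_normal(5), rng.standard_normal(5)

Phi = lambda w: w @ w  # quadratic form of the standard inner product
polar = 0.5 * (Phi(u + v) - Phi(u) - Phi(v))
assert np.isclose(polar, u @ v)  # the polar form recovers phi(u, v) = u . v
```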
If E is finite-dimensional and if ϕ : E × E → R is a bilinear form on E, given any basis
(e1, . . . , en) of E, we can write x = ∑_{i=1}^n xi ei and y = ∑_{j=1}^n yj ej, and we have
ϕ(x, y) = ϕ(∑_{i=1}^n xi ei, ∑_{j=1}^n yj ej) = ∑_{i,j=1}^n xi yj ϕ(ei, ej).
If we let G be the matrix G = (ϕ(ei, ej)), and if x and y are the column vectors associated
with (x1, . . . , xn) and (y1, . . . , yn), then we can write
ϕ(x, y) = x⊤Gy = y⊤G⊤x.
Furthermore, observe that ϕ is symmetric iff G = G⊤, and ϕ is positive definite iff the matrix
G is positive definite, that is,
x⊤Gx > 0 for all x ∈ Rⁿ, x ≠ 0.
The matrix G associated with an inner product is called the Gram matrix of the inner
product with respect to the basis (e1, . . . , en).
Conversely, if A is a symmetric positive definite n × n matrix, it is easy to check that the
bilinear form
⟨x, y⟩ = x⊤Ay
is an inner product. If we make a change of basis from the basis (e1, . . . , en) to the basis
(f1, . . . , fn), and if the change of basis matrix is P (where the jth column of P consists of
the coordinates of fj over the basis (e1, . . . , en)), then with respect to coordinates x′ and y′ over the basis (f1, . . . , fn), we have
⟨x, y⟩ = x⊤Gy = x′⊤P⊤GP y′,
so the matrix of our inner product over the basis (f1, . . . , fn) is P⊤GP. We summarize these
facts in the following proposition.
Proposition 9.1. Let E be a finite-dimensional vector space, and let (e1, . . . , en) be a basis
of E.
1. For any inner product ⟨−, −⟩ on E, if G = (⟨ei, ej⟩) is the Gram matrix of the inner product ⟨−, −⟩ w.r.t. the basis (e1, . . . , en), then G is symmetric positive definite.
2. For any change of basis matrix P, the Gram matrix of ⟨−, −⟩ with respect to the new basis is P⊤GP.
3. If A is any n × n symmetric positive definite matrix, then
⟨x, y⟩ = x⊤Ay
is an inner product on E.
We will see later that a symmetric matrix is positive definite iff its eigenvalues are all
positive.
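The following sketch illustrates Proposition 9.1 numerically (NumPy; the matrices G and P are randomly generated for illustration): the Gram matrix transforms as P⊤GP under a change of basis, and its eigenvalues are indeed all positive:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 3
M = rng.standard_normal((n, n))
G = M.T @ M + n * np.eye(n)       # a symmetric positive definite Gram matrix
P = rng.standard_normal((n, n))   # change of basis matrix (invertible with probability 1)

# x', y' are coordinates over the new basis; x = P x', y = P y' over the old one.
xp, yp = rng.standard_normal(n), rng.standard_normal(n)
x, y = P @ xp, P @ yp

assert np.isclose(x @ G @ y, xp @ (P.T @ G @ P) @ yp)  # new Gram matrix is P^T G P
assert np.all(np.linalg.eigvalsh(G) > 0)               # eigenvalues all positive
```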
One of the very important properties of an inner product ϕ is that the map u → √Φ(u) is a norm.
Proposition 9.2. Let E be a Euclidean space with inner product ϕ, and let Φ be the corre-
sponding quadratic form. For all u, v ∈ E, we have the Cauchy–Schwarz inequality
ϕ(u, v)² ≤ Φ(u)Φ(v),
the equality holding iff u and v are linearly dependent.
We also have the Minkowski inequality
√Φ(u + v) ≤ √Φ(u) + √Φ(v),
the equality holding iff u and v are linearly dependent, where in addition, if u ≠ 0 and v ≠ 0, then u = λv for some λ > 0.
Proof. For any vectors u, v ∈ E, we define the function T : R → R such that
T (λ) = Φ(u + λv),
for all λ ∈ R. Using bilinearity and symmetry, we have
Φ(u + λv) = ϕ(u + λv, u + λv)
= ϕ(u, u + λv) + λϕ(v, u + λv)
= ϕ(u, u) + 2λϕ(u, v) + λ²ϕ(v, v)
= Φ(u) + 2λϕ(u, v) + λ²Φ(v).
Since ϕ is positive definite, Φ is nonnegative, and thus T (λ) ≥ 0 for all λ ∈ R. If Φ(v) = 0,
then v = 0, and we also have ϕ(u, v) = 0. In this case, the Cauchy–Schwarz inequality is
trivial, and v = 0 and u are linearly dependent.
Now, assume Φ(v) > 0. Since T (λ) ≥ 0, the quadratic equation
λ²Φ(v) + 2λϕ(u, v) + Φ(u) = 0
cannot have distinct real roots, which means that its discriminant
∆ = 4(ϕ(u, v)² − Φ(u)Φ(v))
is null or negative, which is precisely the Cauchy–Schwarz inequality
ϕ(u, v)² ≤ Φ(u)Φ(v).
If
ϕ(u, v)² = Φ(u)Φ(v),
then the above quadratic equation has a double root λ0, and we have Φ(u + λ0v) = 0. If
λ0 = 0, then ϕ(u, v) = 0, and since Φ(v) > 0, we must have Φ(u) = 0, and thus u = 0. In this
case, of course, u = 0 and v are linearly dependent. Finally, if λ0 ≠ 0, since Φ(u + λ0v) = 0
implies that u + λ0v = 0, then u and v are linearly dependent. Conversely, it is easy to check
that we have equality when u and v are linearly dependent.
The Minkowski inequality
√Φ(u + v) ≤ √Φ(u) + √Φ(v)
is equivalent to
Φ(u + v) ≤ Φ(u) + Φ(v) + 2√(Φ(u)Φ(v)).
However, we have shown that
2ϕ(u, v) = Φ(u + v) − Φ(u) − Φ(v),
and so the above inequality is equivalent to
ϕ(u, v) ≤ √(Φ(u)Φ(v)),
which is trivial when ϕ(u, v) ≤ 0, and follows from the Cauchy–Schwarz inequality when
ϕ(u, v) ≥ 0. Thus, the Minkowski inequality holds. Finally, assume that u ≠ 0 and v ≠ 0, and that
√Φ(u + v) = √Φ(u) + √Φ(v).
When this is the case, we have
ϕ(u, v) = √(Φ(u)Φ(v)),
and we know from the discussion of the Cauchy–Schwarz inequality that the equality holds
iff u and v are linearly dependent. The Minkowski inequality is an equality when u or v is null. Otherwise, if u ≠ 0 and v ≠ 0, then u = λv for some λ ≠ 0, and since
ϕ(u, v) = λϕ(v, v) = √(Φ(u)Φ(v)),
by positivity, we must have λ > 0.
Note that the Cauchy–Schwarz inequality can also be written as
|ϕ(u, v)| ≤ √Φ(u) √Φ(v).
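A numerical illustration of the Cauchy–Schwarz inequality, and of the equality case for linearly dependent vectors (a NumPy sketch using the standard inner product):

```python
import numpy as np

rng = np.random.default_rng(4)
u, v = rng.standard_normal(6), rng.standard_normal(6)

# |u . v| <= ||u|| ||v||
assert abs(u @ v) <= np.linalg.norm(u) * np.linalg.norm(v) + 1e-12

# Equality holds for linearly dependent vectors.
w = 2.5 * u
assert np.isclose(abs(u @ w), np.linalg.norm(u) * np.linalg.norm(w))
```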
Remark: It is easy to prove that the Cauchy–Schwarz and the Minkowski inequalities still
hold for a symmetric bilinear form that is positive, but not necessarily definite (i.e., ϕ(u, u) ≥ 0 for all u ∈ E). However, u and v need not be linearly dependent when the equality holds.
The Minkowski inequality
√Φ(u + v) ≤ √Φ(u) + √Φ(v)
shows that the map u → √Φ(u) satisfies the convexity inequality (also known as the triangle
inequality), condition (N3) of Definition 7.1, and since ϕ is bilinear and positive definite, it
also satisfies conditions (N1) and (N2) of Definition 7.1, and thus it is a norm on E. The
norm induced by ϕ is called the Euclidean norm induced by ϕ.
Note that the Cauchy–Schwarz inequality can be written as
|u · v| ≤ ‖u‖ ‖v‖,
and the Minkowski inequality as
‖u + v‖ ≤ ‖u‖ + ‖v‖.
Remark: One might wonder if every norm on a vector space is induced by some Euclidean
inner product. In general, this is false, but remarkably, there is a simple necessary and
sufficient condition, which is that the norm must satisfy the parallelogram law:
‖u + v‖² + ‖u − v‖² = 2(‖u‖² + ‖v‖²).
If ⟨−, −⟩ is an inner product, then we have
‖u + v‖² = ‖u‖² + ‖v‖² + 2⟨u, v⟩,
‖u − v‖² = ‖u‖² + ‖v‖² − 2⟨u, v⟩,
and by adding and subtracting these identities, we get the parallelogram law and the equation
⟨u, v⟩ = (1/4)(‖u + v‖² − ‖u − v‖²),
which allows us to recover ⟨−, −⟩ from the norm.
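Both the parallelogram law and the recovery of the inner product from the norm are easy to check numerically for the Euclidean norm (an illustrative NumPy sketch):

```python
import numpy as np

rng = np.random.default_rng(5)
u, v = rng.standard_normal(4), rng.standard_normal(4)
norm = np.linalg.norm

# Parallelogram law for the Euclidean norm.
assert np.isclose(norm(u + v)**2 + norm(u - v)**2, 2 * (norm(u)**2 + norm(v)**2))

# Recovering the inner product from the norm.
assert np.isclose(0.25 * (norm(u + v)**2 - norm(u - v)**2), u @ v)
```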
Conversely, if ‖·‖ is a norm satisfying the parallelogram law, and if it comes from an inner product, then this inner product must be given by
⟨u, v⟩ = (1/4)(‖u + v‖² − ‖u − v‖²).
We need to prove that the above form is indeed symmetric and bilinear.
Symmetry holds because ‖u − v‖ = ‖−(u − v)‖ = ‖v − u‖. Let us prove additivity in the variable u. By the parallelogram law, we have
2(‖x + z‖² + ‖y‖²) = ‖x + y + z‖² + ‖x − y + z‖²,
which yields
‖x + y + z‖² = 2(‖x + z‖² + ‖y‖²) − ‖x − y + z‖²,
‖x + y + z‖² = 2(‖y + z‖² + ‖x‖²) − ‖y − x + z‖²,
where the second formula is obtained by swapping x and y. Then by adding up these equations, we get
‖x + y + z‖² = ‖x‖² + ‖y‖² + ‖x + z‖² + ‖y + z‖² − (1/2)‖x − y + z‖² − (1/2)‖y − x + z‖².
Replacing z by −z in the above equation, we get
‖x + y − z‖² = ‖x‖² + ‖y‖² + ‖x − z‖² + ‖y − z‖² − (1/2)‖x − y − z‖² − (1/2)‖y − x − z‖².
Since ‖x − y + z‖ = ‖−(x − y + z)‖ = ‖y − x − z‖ and ‖y − x + z‖ = ‖−(y − x + z)‖ = ‖x − y − z‖, by subtracting the last two equations, we get
⟨x + y, z⟩ = (1/4)(‖x + y + z‖² − ‖x + y − z‖²)
= (1/4)(‖x + z‖² − ‖x − z‖²) + (1/4)(‖y + z‖² − ‖y − z‖²)
= ⟨x, z⟩ + ⟨y, z⟩,
as desired.
Proving that
⟨λx, y⟩ = λ⟨x, y⟩ for all λ ∈ R
is a little tricky. The strategy is to prove the identity for λ ∈ Z, then to promote it to Q, and then to R by continuity.
Since
⟨−u, v⟩ = (1/4)(‖−u + v‖² − ‖−u − v‖²)
= (1/4)(‖u − v‖² − ‖u + v‖²)
= −⟨u, v⟩,
the property holds for λ = −1. By linearity and by induction, for any n ∈ N with n ≥ 1,
writing n = n − 1 + 1, we get
⟨λx, y⟩ = λ⟨x, y⟩ for all λ ∈ N,
and since the above also holds for λ = −1, it holds for all λ ∈ Z. For λ = p/q with p, q ∈ Z and q ≠ 0, we have
q⟨(p/q)u, v⟩ = ⟨pu, v⟩ = p⟨u, v⟩,
which shows that
⟨(p/q)u, v⟩ = (p/q)⟨u, v⟩,
and thus
⟨λx, y⟩ = λ⟨x, y⟩ for all λ ∈ Q.
To finish the proof, we use the fact that a norm is a continuous map x → ‖x‖. Then, the continuous function t → (1/t)⟨tu, v⟩ defined on R − {0} agrees with ⟨u, v⟩ on Q − {0}, so it is equal to ⟨u, v⟩ on R − {0}. The case λ = 0 is trivial, so we are done.
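As a counterpoint, a norm that fails the parallelogram law cannot come from any inner product; the sup norm on R² is a standard example, checked below (an illustrative NumPy sketch):

```python
import numpy as np

# The sup norm on R^2 violates the parallelogram law,
# so it is not induced by any inner product.
u = np.array([1.0, 0.0])
v = np.array([0.0, 1.0])
sup = lambda w: np.max(np.abs(w))

lhs = sup(u + v)**2 + sup(u - v)**2   # 1 + 1 = 2
rhs = 2 * (sup(u)**2 + sup(v)**2)     # 2(1 + 1) = 4
assert not np.isclose(lhs, rhs)
```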
We now define orthogonality.
9.2 Orthogonality, Duality, Adjoint of a Linear Map
An inner product on a vector space gives the ability to define the notion of orthogonality.
Families of nonnull pairwise orthogonal vectors must be linearly independent. They are
called orthogonal families. In a vector space of finite dimension it is always possible to find
orthogonal bases. This is very useful theoretically and practically. Indeed, in an orthogonal
basis, finding the coordinates of a vector is very cheap: It takes an inner product. Fourier
series make crucial use of this fact. When E has finite dimension, we prove that the inner
product on E induces a natural isomorphism between E and its dual space E∗. This allows
us to define the adjoint of a linear map in an intrinsic fashion (i.e., independently of bases).
It is also possible to orthonormalize any basis (certainly when the dimension is finite). We
give two proofs, one using duality, the other more constructive using the Gram–Schmidt
orthonormalization procedure.
Definition 9.2. Given a Euclidean space E, any two vectors u, v ∈ E are orthogonal, or
perpendicular, if u · v = 0. Given a family (ui)i∈I of vectors in E, we say that (ui)i∈I is orthogonal if ui · uj = 0 for all i, j ∈ I, where i ≠ j. We say that the family (ui)i∈I is orthonormal if ui · uj = 0 for all i, j ∈ I, where i ≠ j, and ‖ui‖ = √(ui · ui) = 1, for all i ∈ I.
For any subset F of E, the set
F ⊥ = {v ∈ E | u · v = 0, for all u ∈ F },
of all vectors orthogonal to all vectors in F , is called the orthogonal complement of F .
Since inner products are positive definite, observe that for any vector u ∈ E, we have
u · v = 0 for all v ∈ E iff u = 0.
It is immediately verified that the orthogonal complement F ⊥ of F is a subspace of E.
Example 9.5. Going back to Example 9.3 and to the inner product
⟨f, g⟩ = ∫_{−π}^{π} f(t)g(t) dt
on the vector space C[−π, π], it is easily checked that
⟨sin px, sin qx⟩ = π if p = q, p, q ≥ 1, and 0 if p ≠ q, p, q ≥ 1,
⟨cos px, cos qx⟩ = π if p = q, p, q ≥ 1, and 0 if p ≠ q, p, q ≥ 0,
and
⟨sin px, cos qx⟩ = 0
for all p ≥ 1 and q ≥ 0, and of course, ⟨1, 1⟩ = ∫_{−π}^{π} dx = 2π.
As a consequence, the family (sin px)p≥1 ∪ (cos qx)q≥0 is orthogonal. It is not orthonormal, but becomes so if we divide every trigonometric function by √π, and 1 by √(2π).
We leave the following two simple results as exercises.
Proposition 9.3. Given a Euclidean space E, for any family (ui)i∈I of nonnull vectors in
E, if (ui)i∈I is orthogonal, then it is linearly independent.
Proposition 9.4. Given a Euclidean space E, any two vectors u, v ∈ E are orthogonal iff
‖u + v‖² = ‖u‖² + ‖v‖².
One of the most useful features of orthonormal bases is that they afford a very simple
method for computing the coordinates of a vector over any basis vector. Indeed, assume
that (e1, . . . , em) is an orthonormal basis. For any vector
x = x1e1 + · · · + xmem,
if we compute the inner product x · ei, we get
x · ei = x1e1 · ei + · · · + xiei · ei + · · · + xmem · ei = xi,
since
ei · ej = 1 if i = j, and ei · ej = 0 if i ≠ j
is the property characterizing an orthonormal family. Thus,
xi = x · ei,
which means that xiei = (x · ei)ei is the orthogonal projection of x onto the subspace
generated by the basis vector ei. If the basis is orthogonal but not necessarily orthonormal,
then
xi = (x · ei)/(ei · ei) = (x · ei)/‖ei‖².
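The following sketch illustrates this computation (NumPy; an orthonormal basis is obtained here from a QR factorization for convenience): each coordinate costs one inner product, and (x · ei)ei is the projection onto the span of ei:

```python
import numpy as np

rng = np.random.default_rng(6)
# Columns of Q form an orthonormal basis of R^4 (obtained here via QR).
Q, _ = np.linalg.qr(rng.standard_normal((4, 4)))

x = rng.standard_normal(4)
coords = Q.T @ x                  # x_i = x . e_i: one inner product per coordinate
assert np.allclose(Q @ coords, x) # x = sum_i x_i e_i

e1 = Q[:, 0]
proj = (x @ e1) * e1              # orthogonal projection of x onto span(e1)
assert np.isclose((x - proj) @ e1, 0.0)  # the residual is orthogonal to e1
```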
All this is true even for an infinite orthonormal (or orthogonal) basis (ei)i∈I.
However, remember that every vector x is expressed as a linear combination
x = ∑_{i∈I} xi ei,
where the family of scalars (xi)i∈I has finite support, which means that xi = 0 for all
i ∈ I − J, where J is a finite set. Thus, even though the family (sin px)p≥1 ∪ (cos qx)q≥0 is
orthogonal (it is not orthonormal, but becomes so if we divide every trigonometric function by
√π, and 1 by √(2π); we won't because it looks messy!), the fact that a function f ∈ C⁰[−π, π]
can be written as a Fourier series as
f(x) = a0 + ∑_{k=1}^∞ (ak cos kx + bk sin kx)
does not mean that (sin px)p≥1 ∪ (cos qx)q≥0 is a basis of this vector space of functions,
because in general, the families (ak) and (bk) do not have finite support! In order for this
infinite linear combination to make sense, it is necessary to prove that the partial sums
a0 + ∑_{k=1}^n (ak cos kx + bk sin kx)
of the series converge to a limit when n goes to infinity. This requires a topology on the
space.
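For illustration, the Fourier coefficients can be computed by numerical integration and the partial sums compared with f; this sketch (assuming SciPy is available; the choice f(t) = t² is arbitrary) shows the error shrinking as n grows:

```python
import numpy as np
from scipy.integrate import quad

f = lambda t: t**2  # an arbitrary continuous function on [-pi, pi]

a0 = quad(f, -np.pi, np.pi)[0] / (2 * np.pi)
def ak(k): return quad(lambda t: f(t) * np.cos(k * t), -np.pi, np.pi)[0] / np.pi
def bk(k): return quad(lambda t: f(t) * np.sin(k * t), -np.pi, np.pi)[0] / np.pi

def partial_sum(t, n):
    return a0 + sum(ak(k) * np.cos(k * t) + bk(k) * np.sin(k * t)
                    for k in range(1, n + 1))

print(abs(partial_sum(1.0, 5) - f(1.0)))   # the error shrinks as n grows
print(abs(partial_sum(1.0, 50) - f(1.0)))
```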
A very important property of Euclidean spaces of finite dimension is that the inner
product induces a canonical bijection (i.e., independent of the choice of bases) between the
vector space E and its dual E∗.
Given a Euclidean space E, for any vector u ∈ E, let ϕu : E → R be the map defined
such that
ϕu(v) = u · v,
for all v ∈ E.
Since the inner product is bilinear, the map ϕu is a linear form in E∗. Thus, we have a map ♭ : E → E∗, defined such that
♭(u) = ϕu.
Theorem 9.5. Given a Euclidean space E, the map ♭ : E → E∗ defined such that
♭(u) = ϕu
is linear and injective. When E is also of finite dimension, the map ♭ : E → E∗ is a canonical isomorphism.
Proof. That ♭ : E → E∗ is a linear map follows immediately from the fact that the inner
product is bilinear. If ϕu = ϕv, then ϕu(w) = ϕv(w) for all w ∈ E, which by definition of ϕu
means that
u · w = v · w
for all w ∈ E, which by bilinearity is equivalent to
(v − u) · w = 0
for all w ∈ E, which implies that u = v, since the inner product is positive definite. Thus, ♭ : E → E∗ is injective. Finally, when E is of finite dimension n, we know that E∗ is also of dimension n, and then ♭ : E → E∗ is bijective.
The inverse of the isomorphism ♭ : E → E∗ is denoted by ♯ : E∗ → E.
As a consequence of Theorem 9.5, if E is a Euclidean space of finite dimension, every
linear form f ∈ E∗ corresponds to a unique u ∈ E such that
f (v) = u · v,
for every v ∈ E. In particular, if f is not the null form, the kernel of f, which is a hyperplane
H, is precisely the set of vectors that are orthogonal to u.
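The following sketch (NumPy; the Gram matrix G and the form f are randomly generated for illustration) realizes this correspondence concretely on Rⁿ: solving Gu = c, where ci = f(ei), produces the unique u with f(v) = ⟨u, v⟩:

```python
import numpy as np

rng = np.random.default_rng(7)
n = 3
M = rng.standard_normal((n, n))
G = M.T @ M + n * np.eye(n)   # Gram matrix of an inner product on R^n

c = rng.standard_normal(n)    # the linear form f(v) = c^T v, so c_i = f(e_i)
u = np.linalg.solve(G, c)     # the unique u with phi(u, v) = f(v) for all v

v = rng.standard_normal(n)
assert np.isclose(u @ G @ v, c @ v)  # f(v) = phi(u, v)
```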
Remarks:
(1) The “musical map” ♭ : E → E∗ is not surjective when E has infinite dimension. The