$= \overrightarrow{f(a)f(b)} + \overrightarrow{f(b)f(a + v)},$
$= \overrightarrow{f(a)f(a + v)}.$
Thus, $\overrightarrow{f(b)f(b + v)} = \overrightarrow{f(a)f(a + v)}$, which shows that the definition of $\vec{f}$ does not depend
on the choice of $a \in E$. The fact that $\vec{f}$ is unique is obvious: we must have
$\vec{f}(v) = \overrightarrow{f(a)f(a + v)}$.
The unique linear map $\vec{f} : \vec{E} \to \vec{E'}$ given by Lemma 19.7 is called the linear map associated
with the affine map $f$.
Note that the condition
\[ f(a + v) = f(a) + \vec{f}(v), \]
for every $a \in E$ and every $v \in \vec{E}$, can be stated equivalently as
\[ f(x) = f(a) + \vec{f}(\overrightarrow{ax}), \quad\text{or}\quad \overrightarrow{f(a)f(x)} = \vec{f}(\overrightarrow{ax}), \]
for all $a, x \in E$. Lemma 19.7 shows that for any affine map $f : E \to E'$, there are points
$a \in E$, $b \in E'$, and a unique linear map $\vec{f} : \vec{E} \to \vec{E'}$, such that
\[ f(a + v) = b + \vec{f}(v), \]
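for all $v \in \vec{E}$ (just let $b = f(a)$, for any $a \in E$). Affine maps for which $\vec{f}$ is the identity map
are called translations. Indeed, if $\vec{f} = \mathrm{id}$,
\[
\begin{aligned}
f(x) &= f(a) + \vec{f}(\overrightarrow{ax}) = f(a) + \overrightarrow{ax} = x + \overrightarrow{xa} + \overrightarrow{af(a)} + \overrightarrow{ax}\\
&= x + \overrightarrow{xa} + \overrightarrow{af(a)} - \overrightarrow{xa} = x + \overrightarrow{af(a)},
\end{aligned}
\]
and so
\[ \overrightarrow{xf(x)} = \overrightarrow{af(a)}, \]
which shows that $f$ is the translation induced by the vector $\overrightarrow{af(a)}$ (which does not depend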
on a).
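As a small numerical illustration of Lemma 19.7, the following sketch takes an affine map of the plane written in coordinates as x ↦ Ax + b and checks that the vector from f(a) to f(a + v) is the same for two different base points a. The matrix A, the vector b, and all helper names are assumptions chosen for the example; they do not come from the text.

```python
from fractions import Fraction as F

# Hypothetical affine map of the plane, f(x) = A x + b (values chosen for illustration).
A = [[F(2), F(1)], [F(0), F(3)]]
b = [F(1), F(-1)]

def f(x):
    """Apply the affine map x -> A x + b to a point given as a pair."""
    return [A[i][0] * x[0] + A[i][1] * x[1] + b[i] for i in range(2)]

def vec(p, q):
    """Vector from point p to point q."""
    return [q[0] - p[0], q[1] - p[1]]

def add(p, v):
    """Translate point p by vector v."""
    return [p[0] + v[0], p[1] + v[1]]

def linear_part(a, v):
    """Vector from f(a) to f(a + v); Lemma 19.7 says it does not depend on a."""
    return vec(f(a), f(add(a, v)))

v = [F(1), F(2)]
at_origin = linear_part([F(0), F(0)], v)
elsewhere = linear_part([F(5), F(-7)], v)
```

Both calls return the same vector, namely Av, which is the coordinate form of the associated linear map applied to v.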
Since an affine map preserves barycenters, and since an affine subspace V is closed under
barycentric combinations, the image $f(V)$ of $V$ is an affine subspace in $E'$. So, for example,
the image of a line is a point or a line, and the image of a plane is either a point, a line, or
a plane.
It is easily verified that the composition of two affine maps is an affine map. Also, given
affine maps $f : E \to E'$ and $g : E' \to E''$, we have
\[ g(f(a + v)) = g\bigl(f(a) + \vec{f}(v)\bigr) = g(f(a)) + \vec{g}\bigl(\vec{f}(v)\bigr), \]
which shows that $\overrightarrow{g \circ f} = \vec{g} \circ \vec{f}$. It is easy to show that an affine map $f : E \to E'$ is injective
iff $\vec{f} : \vec{E} \to \vec{E'}$ is injective, and that $f : E \to E'$ is surjective iff $\vec{f} : \vec{E} \to \vec{E'}$ is surjective. An
affine map $f : E \to E'$ is constant iff $\vec{f} : \vec{E} \to \vec{E'}$ is the null (constant) linear map equal to $0$
for all $v \in \vec{E}$.
If E is an affine space of dimension m and (a0, a1, . . . , am) is an affine frame for E, then
for any other affine space F and for any sequence (b0, b1, . . . , bm) of m + 1 points in F , there
is a unique affine map f : E → F such that f(ai) = bi, for 0 ≤ i ≤ m. Indeed, f must be
such that
f (λ0a0 + · · · + λmam) = λ0b0 + · · · + λmbm,
where λ0+· · ·+λm = 1, and this defines a unique affine map on all of E, since (a0, a1, . . . , am)
is an affine frame for E.
Using affine frames, affine maps can be represented in terms of matrices. We explain how
an affine map f : E → E is represented with respect to a frame (a0, . . . , an) in E, the more
general case where an affine map f : E → F is represented with respect to two affine frames
(a0, . . . , an) in E and (b0, . . . , bm) in F being analogous. Since
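\[ f(a_0 + x) = f(a_0) + \vec{f}(x) \]
for all $x \in \vec{E}$, we have
\[ \overrightarrow{a_0 f(a_0 + x)} = \overrightarrow{a_0 f(a_0)} + \vec{f}(x). \]
Since $x$, $\overrightarrow{a_0 f(a_0)}$, and $\overrightarrow{a_0 f(a_0 + x)}$ can be expressed as
\[
\begin{aligned}
x &= x_1 \overrightarrow{a_0a_1} + \cdots + x_n \overrightarrow{a_0a_n},\\
\overrightarrow{a_0 f(a_0)} &= b_1 \overrightarrow{a_0a_1} + \cdots + b_n \overrightarrow{a_0a_n},\\
\overrightarrow{a_0 f(a_0 + x)} &= y_1 \overrightarrow{a_0a_1} + \cdots + y_n \overrightarrow{a_0a_n},
\end{aligned}
\]
if $A = (a_{ij})$ is the $n \times n$ matrix of the linear map $\vec{f}$ over the basis $(\overrightarrow{a_0a_1}, \ldots, \overrightarrow{a_0a_n})$, letting $x$,
$y$, and $b$ denote the column vectors of components $(x_1, \ldots, x_n)$, $(y_1, \ldots, y_n)$, and $(b_1, \ldots, b_n)$,
\[ \overrightarrow{a_0 f(a_0 + x)} = \overrightarrow{a_0 f(a_0)} + \vec{f}(x) \]
is equivalent to
\[ y = Ax + b. \]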
Note that $b \neq 0$ unless $f(a_0) = a_0$. Thus, $f$ is generally not a linear transformation, unless it
has a fixed point, i.e., there is a point $a_0$ such that $f(a_0) = a_0$. The vector $b$ is the “translation
part” of the affine map. Affine maps do not always have a fixed point. Obviously, nonnull
translations have no fixed point. A less trivial example is given by the affine map
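\[
\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} \mapsto
\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} +
\begin{pmatrix} 1 \\ 0 \end{pmatrix}.
\]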
This map is a reflection about the x-axis followed by a translation along the x-axis. The
affine map
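\[
\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} \mapsto
\begin{pmatrix} 1 & -\sqrt{3} \\ \sqrt{3}/4 & 1/4 \end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} +
\begin{pmatrix} 1 \\ 1 \end{pmatrix}
\]
can also be written as
\[
\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} \mapsto
\begin{pmatrix} 2 & 0 \\ 0 & 1/2 \end{pmatrix}
\begin{pmatrix} 1/2 & -\sqrt{3}/2 \\ \sqrt{3}/2 & 1/2 \end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} +
\begin{pmatrix} 1 \\ 1 \end{pmatrix}
\]
which shows that it is the composition of a rotation of angle $\pi/3$, followed by a stretch (by a
factor of 2 along the x-axis, and by a factor of $\frac{1}{2}$ along the y-axis), followed by a translation.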
It is easy to show that this affine map has a unique fixed point. On the other hand, the
affine map
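\[
\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} \mapsto
\begin{pmatrix} 8/5 & -6/5 \\ 3/10 & 2/5 \end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} +
\begin{pmatrix} 1 \\ 1 \end{pmatrix}
\]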
has no fixed point, even though
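\[
\begin{pmatrix} 8/5 & -6/5 \\ 3/10 & 2/5 \end{pmatrix} =
\begin{pmatrix} 2 & 0 \\ 0 & 1/2 \end{pmatrix}
\begin{pmatrix} 4/5 & -3/5 \\ 3/5 & 4/5 \end{pmatrix},
\]
and the second matrix is a rotation of angle $\theta$ such that $\cos\theta = \frac{4}{5}$ and $\sin\theta = \frac{3}{5}$. For more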
on fixed points of affine maps, see the problems.
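The fixed-point claims for the three example maps can be checked by solving (A − I)x = −b, which is just the equation Ax + b = x rearranged. The sketch below does this for 2 × 2 systems only; the function name `fixed_points` and its return convention are assumptions made for this illustration. The rational examples use exact arithmetic, while the example involving √3 uses floating point, so its determinant test is only approximate.

```python
import math
from fractions import Fraction as F

def fixed_points(A, b):
    """Classify the fixed points of the plane affine map x -> A x + b.

    Returns ("unique", (x, y)), ("none", None), or ("many", None).
    """
    # A fixed point satisfies (A - I) x = -b.
    m00, m01 = A[0][0] - 1, A[0][1]
    m10, m11 = A[1][0], A[1][1] - 1
    r0, r1 = -b[0], -b[1]
    det = m00 * m11 - m01 * m10
    if det != 0:
        # Cramer's rule for the 2 x 2 system.
        return ("unique", ((r0 * m11 - m01 * r1) / det,
                           (m00 * r1 - m10 * r0) / det))
    if m00 == m01 == m10 == m11 == 0:
        # A is the identity: every point is fixed iff b = 0.
        consistent = (r0 == 0 and r1 == 0)
    else:
        # Rank 1: solvable iff the remaining 2 x 2 minors of [A - I | -b] vanish.
        consistent = (m00 * r1 - m10 * r0 == 0) and (m01 * r1 - m11 * r0 == 0)
    return ("many" if consistent else "none", None)

# Reflection about the x-axis followed by a translation along the x-axis.
reflection = fixed_points([[F(1), F(0)], [F(0), F(-1)]], [F(1), F(0)])
# Rotation of angle pi/3, then a stretch, then a translation.
s = math.sqrt(3)
rot_stretch = fixed_points([[1.0, -s], [s / 4, 0.25]], [1.0, 1.0])
# The map with linear part [[8/5, -6/5], [3/10, 2/5]].
third = fixed_points([[F(8, 5), F(-6, 5)], [F(3, 10), F(2, 5)]], [F(1), F(1)])
```

In the first and third cases A − I is singular and the system is inconsistent, so there is no fixed point; in the second case A − I is invertible, which gives exactly one fixed point.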
There is a useful trick to convert the equation y = Ax + b into what looks like a linear
equation. The trick is to consider an (n + 1) × (n + 1) matrix. We add 1 as the (n + 1)th
component to the vectors x, y, and b, and form the (n + 1) × (n + 1) matrix
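\[ \begin{pmatrix} A & b \\ 0 & 1 \end{pmatrix} \]
so that $y = Ax + b$ is equivalent to
\[
\begin{pmatrix} y \\ 1 \end{pmatrix} =
\begin{pmatrix} A & b \\ 0 & 1 \end{pmatrix}
\begin{pmatrix} x \\ 1 \end{pmatrix}.
\]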
This trick is very useful in kinematics and dynamics, where A is a rotation matrix. Such
affine maps are called rigid motions.
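A minimal sketch of this trick, in exact arithmetic: it builds the (n + 1) × (n + 1) matrix from A and b and compares the result with the direct computation Ax + b. The helper names `homogeneous` and `matvec` and the sample point x are assumptions made for the example; A and b are taken from the last example map above.

```python
from fractions import Fraction as F

def homogeneous(A, b):
    """Build the (n+1) x (n+1) matrix [[A, b], [0, 1]] from A (n x n) and b (length n)."""
    n = len(A)
    rows = [list(A[i]) + [b[i]] for i in range(n)]
    rows.append([0] * n + [1])
    return rows

def matvec(M, x):
    """Plain matrix-vector product."""
    return [sum(M[i][j] * x[j] for j in range(len(x))) for i in range(len(M))]

A = [[F(8, 5), F(-6, 5)], [F(3, 10), F(2, 5)]]
b = [F(1), F(1)]
x = [F(5), F(10)]

# Direct evaluation of y = A x + b.
direct = [matvec(A, x)[i] + b[i] for i in range(2)]
# Same map as a single matrix product on the lifted vector (x, 1).
lifted = matvec(homogeneous(A, b), x + [1])
```

The last component of `lifted` is always 1, and the first n components agree with `direct`. One practical benefit of the lifted form is that composing affine maps becomes ordinary matrix multiplication.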
If $f : E \to E'$ is a bijective affine map, given any three collinear points $a, b, c$ in $E$,
with $a \neq b$, where, say, $c = (1 - \lambda)a + \lambda b$, since $f$ preserves barycenters, we have $f(c) =
(1 - \lambda)f(a) + \lambda f(b)$, which shows that $f(a), f(b), f(c)$ are collinear in $E'$. There is a converse
to this property, which is simpler to state when the ground field is $K = \mathbb{R}$. The converse
states that given any bijective function $f : E \to E'$ between two real affine spaces of the
same dimension $n \geq 2$, if $f$ maps any three collinear points to collinear points, then $f$ is
affine. The proof is rather long (see Berger [6] or Samuel [87]).
Given three collinear points $a, b, c$, where $a \neq c$, we have $b = (1 - \beta)a + \beta c$ for some
unique $\beta$, and we define the ratio of the sequence $a, b, c$, as
\[ \mathrm{ratio}(a, b, c) = \frac{\beta}{1 - \beta} = \frac{\overrightarrow{ab}}{\overrightarrow{bc}}, \]
provided that $\beta \neq 1$, i.e., $b \neq c$. When $b = c$, we agree that $\mathrm{ratio}(a, b, c) = \infty$. We warn our
readers that other authors define the ratio of $a, b, c$ as $-\mathrm{ratio}(a, b, c) = \frac{\overrightarrow{ba}}{\overrightarrow{bc}}$. Since affine maps
preserve barycenters, it is clear that affine maps preserve the ratio of three points.
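The following sketch computes the ratio of three collinear points in coordinates and checks on one example that an affine map leaves it unchanged. The function names, the sample affine map, and the convention of returning `None` for ∞ are assumptions made for the illustration; collinearity of the inputs is assumed rather than verified.

```python
from fractions import Fraction as F

def ratio(a, b, c):
    """ratio(a, b, c) = ab / bc for collinear points a, b, c with a != c.

    Returns None to stand for infinity when b == c.
    """
    if b == c:
        return None
    # Any coordinate in which b and c differ gives the same quotient.
    k = next(i for i in range(len(b)) if b[i] != c[i])
    return F(b[k] - a[k]) / F(c[k] - b[k])

def f(x):
    """Hypothetical affine map x -> A x + t with A = [[2, 1], [0, 3]], t = (1, -1)."""
    return (2 * x[0] + x[1] + 1, 3 * x[1] - 1)

a, c = (F(0), F(0)), (F(4), F(8))
beta = F(1, 4)
# b = (1 - beta) a + beta c lies on the line through a and c.
b = tuple((1 - beta) * ai + beta * ci for ai, ci in zip(a, c))

before = ratio(a, b, c)
after = ratio(f(a), f(b), f(c))
```

Here `before` equals β/(1 − β), and `after` has the same value because f(b) is the barycenter of f(a) and f(c) with the same weights.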
19.8 Affine Groups
We now take a quick look at the bijective affine maps. Given an affine space E, the set of
affine bijections f : E → E is clearly a group, called the affine group of E, and denoted by
$\mathrm{GA}(E)$. Recall that the group of bijective linear maps of the vector space $\vec{E}$ is denoted by
$\mathrm{GL}(\vec{E})$. Then, the map $f \mapsto \vec{f}$ defines a group homomorphism $L : \mathrm{GA}(E) \to \mathrm{GL}(\vec{E})$. The
kernel of this map is the set of translations on $E$.
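The subset of all linear maps of the form $\lambda\,\mathrm{id}_{\vec{E}}$, where $\lambda \in \mathbb{R} - \{0\}$, is a subgroup
of $\mathrm{GL}(\vec{E})$, and is denoted by $\mathbb{R}^{*}\mathrm{id}_{\vec{E}}$ (where $\lambda\,\mathrm{id}_{\vec{E}}(u) = \lambda u$, and $\mathbb{R}^{*} = \mathbb{R} - \{0\}$). The
subgroup $\mathrm{DIL}(E) = L^{-1}(\mathbb{R}^{*}\mathrm{id}_{\vec{E}})$ of $\mathrm{GA}(E)$ is particularly interesting. It turns out that it
is the disjoint union of the translations and of the dilatations of ratio $\lambda \neq 1$. The elements
of $\mathrm{DIL}(E)$ are called affine dilatations.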
Given any point $a \in E$, and any scalar $\lambda \in \mathbb{R}$, a dilatation or central dilatation (or
homothety) of center $a$ and ratio $\lambda$ is a map $H_{a,\lambda}$ defined such that
\[ H_{a,\lambda}(x) = a + \lambda\,\overrightarrow{ax}, \]
for every $x \in E$.
Remark: The terminology does not seem to be universally agreed upon. The terms affine
dilatation and central dilatation are used by Pedoe [85]. Snapper and Troyer use the term
dilation for an affine dilatation and magnification for a central dilatation [95]. Samuel uses
homothety for a central dilatation, a direct translation of the French “homothétie” [87]. Since
dilation is shorter than dilatation and somewhat easier to pronounce, perhaps we should use
that!
Observe that $H_{a,\lambda}(a) = a$, and when $\lambda \neq 0$ and $x \neq a$, $H_{a,\lambda}(x)$ is on the line defined by
$a$ and $x$, and is obtained by “scaling” $\overrightarrow{ax}$ by $\lambda$.
Figure 19.12 shows the effect of a central dilatation of center $d$. The triangle $(a, b, c)$ is
magnified to the triangle $(a', b', c')$. Note how every line is mapped to a parallel line.
When $\lambda = 1$, $H_{a,1}$ is the identity. Note that $\overrightarrow{H_{a,\lambda}} = \lambda\,\mathrm{id}_{\vec{E}}$. When $\lambda \neq 0$, it is clear that
$H_{a,\lambda}$ is an affine bijection. It is immediately verified that
\[ H_{a,\lambda} \circ H_{a,\mu} = H_{a,\lambda\mu}. \]
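The composition rule for dilatations with a common center is easy to test numerically. In this sketch the function `H` returns the central dilatation as a Python closure; the name and the sample center, ratios, and point are assumptions made for the example.

```python
from fractions import Fraction as F

def H(a, lam):
    """Central dilatation of center a and ratio lam: x -> a + lam * (x - a)."""
    def apply(x):
        return tuple(ai + lam * (xi - ai) for ai, xi in zip(a, x))
    return apply

a = (F(1), F(2))
x = (F(4), F(-3))
lam, mu = F(3), F(1, 2)

lhs = H(a, lam)(H(a, mu)(x))      # (H_{a,lam} o H_{a,mu})(x)
rhs = H(a, lam * mu)(x)           # H_{a,lam*mu}(x)
swapped = H(a, mu)(H(a, lam)(x))  # composition in the other order
```

Since λµ = µλ, the same computation shows that two dilatations with the same center commute.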
Figure 19.12: The effect of a central dilatation
We have the following useful result.
Lemma 19.8. Given any affine space $E$, for any affine bijection $f \in \mathrm{GA}(E)$, if $\vec{f} = \lambda\,\mathrm{id}_{\vec{E}}$,
for some $\lambda \in \mathbb{R}^{*}$ with $\lambda \neq 1$, then there is a unique point $c \in E$ such that $f = H_{c,\lambda}$.
Proof. The proof is straightforward, and is omitted. It is also given in Gallier [41].
Clearly, if $\vec{f} = \mathrm{id}_{\vec{E}}$, the affine map $f$ is a translation. Thus, the group of affine dilatations
$\mathrm{DIL}(E)$ is the disjoint union of the translations and of the dilatations of ratio $\lambda \neq 0, 1$. Affine
dilatations can be given a purely geometric characterization.
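In coordinates, the center in Lemma 19.8 can be found explicitly. Assuming a frame has been chosen so that f is written as x ↦ λx + b with λ ≠ 1, solving λc + b = c gives c = b/(1 − λ). The sketch below checks this on one example; the function names and sample values are hypothetical, and this is an illustration under that coordinate assumption, not the proof omitted in the text.

```python
from fractions import Fraction as F

def center(lam, b):
    """Fixed point of x -> lam * x + b, assuming lam != 1."""
    if lam == 1:
        raise ValueError("a map with linear part id is a translation, not a central dilatation")
    return tuple(bi / (1 - lam) for bi in b)

def f(lam, b, x):
    """The affine map x -> lam * x + b."""
    return tuple(lam * xi + bi for xi, bi in zip(x, b))

def H(c, lam, x):
    """Central dilatation of center c and ratio lam applied to x."""
    return tuple(ci + lam * (xi - ci) for ci, xi in zip(c, x))

lam, b = F(3), (F(2), F(-4))
c = center(lam, b)
x = (F(5), F(1))
via_f = f(lam, b, x)
via_H = H(c, lam, x)
```

The two evaluations agree at the sample point, and c is fixed by f, as expected of the center.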
Another point worth mentioning is that affine bijections preserve the ratio of volumes of
parallelotopes. Indeed, given any basis $B = (u_1, \ldots, u_m)$ of the vector space $\vec{E}$ associated
with the affine space $E$, given any $m + 1$ affinely independent points $(a_0, \ldots, a_m)$, we can
compute the determinant $\det_B(\overrightarrow{a_0a_1}, \ldots, \overrightarrow{a_0a_m})$ w.r.t. the basis $B$. For any bijective affine
map $f : E \to E$, since
\[ \det\nolimits_B\bigl(\vec{f}(\overrightarrow{a_0a_1}), \ldots, \vec{f}(\overrightarrow{a_0a_m})\bigr) = \det(\vec{f})\,\det\nolimits_B(\overrightarrow{a_0a_1}, \ldots, \overrightarrow{a_0a_m}) \]
and the determinant of a linear map is intrinsic (i.e., depends only on $\vec{f}$, and not on the
particular basis $B$), we conclude that the ratio
\[ \frac{\det_B\bigl(\vec{f}(\overrightarrow{a_0a_1}), \ldots, \vec{f}(\overrightarrow{a_0a_m})\bigr)}{\det_B(\overrightarrow{a_0a_1}, \ldots, \overrightarrow{a_0a_m})} = \det(\vec{f}) \]
is independent of the basis $B$. Since $\det_B(\overrightarrow{a_0a_1}, \ldots, \overrightarrow{a_0a_m})$ is the volume of the parallelotope
spanned by $(a_0, \ldots, a_m)$, where the parallelotope spanned by any point $a$ and the vectors
(u1, . . . , um) has unit volume (see Berger [6], Section 9.12), we see that affine bijections
preserve the ratio of volumes of parallelotopes. In fact, this ratio is independent of the
choice of the parallelotopes of unit volume. In particular, the affine bijections f ∈ GA(E)
such that $\det(\vec{f}) = 1$ preserve volumes. These affine maps form a subgroup $\mathrm{SA}(E)$ of
GA(E) called the special affine group of E. We now take a glimpse at affine geometry.
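To make the volume statement concrete in the plane, the sketch below compares the signed area of a parallelogram before and after applying an affine map and checks that the quotient is the determinant of the linear part, independently of the parallelogram. The matrix, translation, points, and helper names are assumptions chosen for the example.

```python
from fractions import Fraction as F

A = [[F(2), F(1)], [F(0), F(3)]]   # hypothetical linear part, det = 6
t = (F(1), F(-1))                  # hypothetical translation part

def f(x):
    """Affine map x -> A x + t."""
    return (A[0][0] * x[0] + A[0][1] * x[1] + t[0],
            A[1][0] * x[0] + A[1][1] * x[1] + t[1])

def det2(u, v):
    """Determinant of the 2 x 2 matrix with columns u and v."""
    return u[0] * v[1] - u[1] * v[0]

def area(a0, a1, a2):
    """Signed area of the parallelogram spanned by a0a1 and a0a2."""
    u = (a1[0] - a0[0], a1[1] - a0[1])
    v = (a2[0] - a0[0], a2[1] - a0[1])
    return det2(u, v)

det_A = det2((A[0][0], A[1][0]), (A[0][1], A[1][1]))
frames = [((F(0), F(0)), (F(1), F(0)), (F(0), F(1))),
          ((F(1), F(1)), (F(4), F(2)), (F(2), F(5)))]
# Area quotient image / original for each affinely independent triple.
quotients = [area(*(f(p) for p in pts)) / area(*pts) for pts in frames]
```

Both quotients equal det A, so the ratio of the areas of the two parallelograms is the same before and after the map.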
19.9 Affine Geometry: A Glimpse
In this section we state and prove three fundamental results of affine geometry. Roughly
speaking, affine geometry is the study of properties invariant under affine bijections. We now
prove one of the oldest and most basic results of affine geometry, the theorem of Thales.
Lemma 19.9. Given any affine space $E$, if $H_1, H_2, H_3$ are any three distinct parallel hyperplanes,
and $A$ and $B$ are any two lines not parallel to $H_i$, letting $a_i = H_i \cap A$ and $b_i = H_i \cap B$,
then the following ratios are equal:
\[ \frac{\overrightarrow{a_1a_3}}{\overrightarrow{a_1a_2}} = \frac{\overrightarrow{b_1b_3}}{\overrightarrow{b_1b_2}} = \rho. \]
Conversely, for any point $d$ on the line $A$, if $\frac{\overrightarrow{a_1d}}{\overrightarrow{a_1a_2}} = \rho$, then $d = a_3$.
Proof. Figure 19.13 illustrates the theorem of Thales. We sketch a proof, leaving the details
as an exercise. Since $H_1, H_2, H_3$ are parallel, they have the same direction $\vec{H}$, a hyperplane
in $\vec{E}$. Let $u \in \vec{E} - \vec{H}$ be any nonnull vector such that $A = a_1 + \mathbb{R}u$. Since $A$ is not parallel to
$H$, we have $\vec{E} = \vec{H} \oplus \mathbb{R}u$, and thus we can define the linear map $p : \vec{E} \to \mathbb{R}u$, the projection
on $\mathbb{R}u$ parallel to $\vec{H}$. This linear map induces an affine map $f : E \to A$, by defining $f$ such
that
\[ f(b_1 + w) = a_1 + p(w), \]
for all $w \in \vec{E}$. Clearly, $f(b_1) = a_1$, and since $H_1, H_2, H_3$ all have direction $\vec{H}$, we also have
$f(b_2) = a_2$ and $f(b_3) = a_3$. Since $f$ is affine, it preserves ratios, and thus
\[ \frac{\overrightarrow{a_1a_3}}{\overrightarrow{a_1a_2}} = \frac{\overrightarrow{b_1b_3}}{\overrightarrow{b_1b_2}}. \]
The converse is immediate.
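A quick check of the theorem of Thales in the plane, where hyperplanes are lines: the sketch takes three horizontal lines as H1, H2, H3, intersects them with two transversal lines A and B, and compares the two ratios. The choice of horizontal lines, the specific lines A and B, and the helper names are assumptions made to keep the example short.

```python
from fractions import Fraction as F

def meet_horizontal(p, u, h):
    """Intersection of the line p + t*u with the horizontal line y = h (requires u[1] != 0)."""
    t = (h - p[1]) / u[1]
    return (p[0] + t * u[0], p[1] + t * u[1])

def quotient(p, q, r):
    """Scalar s with pr = s * pq, for collinear points p, q, r and p != q."""
    k = 0 if q[0] != p[0] else 1
    return (r[k] - p[k]) / (q[k] - p[k])

heights = [F(0), F(1), F(3)]            # H1: y = 0, H2: y = 1, H3: y = 3
A = ((F(0), F(0)), (F(1), F(1)))        # line A as (point, direction)
B = ((F(2), F(0)), (F(-1), F(2)))       # line B as (point, direction)

a1, a2, a3 = (meet_horizontal(A[0], A[1], h) for h in heights)
b1, b2, b3 = (meet_horizontal(B[0], B[1], h) for h in heights)

rho_A = quotient(a1, a2, a3)            # a1a3 / a1a2
rho_B = quotient(b1, b2, b3)            # b1b3 / b1b2
```

Both ratios come out equal to 3, which is the ratio of the distances between the parallel lines.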
We also have the following simple lemma, whose proof is left as an easy exercise.
Lemma 19.10. Given any affine space $E$, given any two distinct points $a, b \in E$, and for
any affine dilatation $f$ different from the identity, if $a' = f(a)$, $D = \langle a, b \rangle$ is the line passing
through $a$ and $b$, and $D'$ is the line parallel to $D$ and passing through $a'$, the following are
equivalent:
Figure 19.13: The theorem of Thales
Figure 19.14: Pappus’s theorem (affine version)
(i) $b' = f(b)$;
(ii) If $f$ is a translation, then $b'$ is the intersection of $D'$ with the line parallel to $\langle a, a' \rangle$
passing through $b$;
If $f$ is a dilatation of center $c$, then $b' = D' \cap \langle c, b \rangle$.
The first case is the parallelogram law, and the second case follows easily from Thales’
theorem.
We are now ready to prove two classical results of affine geometry, Pappus’s theorem and
Desargues’s theorem. Actually, these results are theorems of projective geometry, and we
are stating affine versions of these important results. There are stronger versions that are
best proved using projective geometry.
Lemma 19.11. Given any affine plane $E$, any two distinct lines $D$ and $D'$, then for any
distinct points $a, b, c$ on $D$ and $a', b', c'$ on $D'$, if $a, b, c, a', b', c'$ are distinct from the intersection
of $D$ and $D'$ (if $D$ and $D'$ intersect) and if the lines $\langle a, b' \rangle$ and $\langle a', b \rangle$ are parallel,
and the lines $\langle b, c' \rangle$ and $\langle b', c \rangle$ are parallel, then the lines $\langle a, c' \rangle$ and $\langle a', c \rangle$ are parallel.
Proof. Pappus’s theorem is illustrated in Figure 19.14. If $D$ and $D'$ are not parallel, let $d$
be their intersection. Let $f$ be the dilatation of center $d$ such that $f(a) = b$, and let $g$ be the
dilatation of center $d$ such that $g(b) = c$. Since the lines $\langle a, b' \rangle$ and $\langle a', b \rangle$ are parallel, and
the lines $\langle b, c' \rangle$ and $\langle b', c \rangle$ are parallel, by Lemma 19.10 we have $a' = f(b')$ and $b' = g(c')$.
However, we observed that dilatations with the same center commute, and thus $f \circ g = g \circ f$,
and thus, letting $h = g \circ f$, we get $c = h(a)$ and $a' = h(c')$. Again, by Lemma 19.10, the
lines $\langle a, c' \rangle$ and $\langle a', c \rangle$ are parallel. If $D$ and $D'$ are parallel, we use translations instead of
dilatations.
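The argument with commuting dilatations can be replayed on a concrete configuration. In the sketch below, D is the x-axis, D′ is the line y = x, and d is the origin; the points a′, b′ are built from c′ with the dilatations f and g used in the proof, and the three parallelism relations are then checked with a cross product. All coordinates and helper names are assumptions made for the example.

```python
from fractions import Fraction as F

def H(center, lam):
    """Central dilatation of the given center and ratio."""
    def apply(x):
        return tuple(ci + lam * (xi - ci) for ci, xi in zip(center, x))
    return apply

def parallel(p, q, r, s):
    """True iff the lines <p, q> and <r, s> are parallel (cross product of directions is 0)."""
    ux, uy = q[0] - p[0], q[1] - p[1]
    vx, vy = s[0] - r[0], s[1] - r[1]
    return ux * vy - uy * vx == 0

d = (F(0), F(0))                                        # intersection of D and D'
a, b, c = (F(1), F(0)), (F(2), F(0)), (F(6), F(0))      # points on D
f = H(d, F(2))                                          # f(a) = b
g = H(d, F(3))                                          # g(b) = c

c1 = (F(1), F(1))                                       # c' on D'
b1 = g(c1)                                              # b' = g(c')
a1 = f(b1)                                              # a' = f(b')

hyp1 = parallel(a, b1, a1, b)                           # <a, b'> parallel to <a', b>
hyp2 = parallel(b, c1, b1, c)                           # <b, c'> parallel to <b', c>
concl = parallel(a, c1, a1, c)                          # <a, c'> parallel to <a', c>
```

With these values both hypotheses hold and so does the conclusion, in agreement with h = g ∘ f sending a to c and c′ to a′.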
There is a converse to Pappus’s theorem, which yields a fancier version of Pappus’s
theorem, but it is easier to prove it using projective geometry. It should be noted that
in axiomatic presentations of projective geometry, Pappus’s theorem is equivalent to the
commutativity of the ground field K (in the present case, K = R). We now prove an affine
version of Desargues’s theorem.
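Lemma 19.12. Given any affine space $E$, and given any two triangles $(a, b, c)$ and $(a', b', c')$,
where $a, b, c, a', b', c'$ are all distinct, if $\langle a, b \rangle$ and $\langle a', b' \rangle$ are parallel and $\langle b, c \rangle$ and $\langle b', c' \rangle$
are parallel, then $\langle a, c \rangle$ and $\langle a', c' \rangle$ are parallel iff the lines $\langle a, a' \rangle$, $\langle b, b' \rangle$, and $\langle c, c' \rangle$ are either
parallel or concurrent (i.e., intersect in a common point).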
Proof. We prove half of the lemma, the direction in which it is assumed that $\langle a, c \rangle$ and $\langle a', c' \rangle$
are parallel, leaving the converse as an exercise. Since the lines $\langle a, b \rangle$ and $\langle a', b' \rangle$ are parallel,
the points $a, b, a', b'$ are coplanar. Thus, either $\langle a, a' \rangle$ and $\langle b, b' \rangle$ are parallel, or they have
some intersection $d$. We consider the second case where they intersect, leaving the other
case as an easy exercise. Let $f$ be the dilatation of center $d$ such that $f(a) = a'$. By Lemma
19.10, we get $f(b) = b'$. If $f(c) = c''$, again by Lemma 19.10 twice, the lines $\langle b, c \rangle$ and $\langle b', c'' \rangle$
are parallel, and the lines $\langle a, c \rangle$ and $\langle a', c'' \rangle$ are parallel. From this it follows that $c'' = c'$.
Indeed, recall that $\langle b, c \rangle$ and $\langle b', c' \rangle$ are parallel, and similarly $\langle a, c \rangle$ and $\langle a', c' \rangle$ are parallel.
Thus, the lines $\langle b', c'' \rangle$ and $\langle b', c' \rangle$ are identical, and similarly the lines $\langle a', c'' \rangle$ and $\langle a', c' \rangle$ are
identical. Since $\overrightarrow{a'c'}$ and $\overrightarrow{b'c'}$ are linearly independent, these lines have a unique intersection,
which must be $c'' = c'$.
The direction where it is assumed that the lines $\langle a, a' \rangle$, $\langle b, b' \rangle$ and $\langle c, c' \rangle$ are either parallel
or concurrent is left as an exercise (in fact, the proof is quite similar).
Desargues’s theorem is illustrated in Figure 19.15.
There is a fancier version of Desargues’s theorem, but it is easier to prove it using pro-
jective geometry. It should be noted that in axiomatic presentations of projective geometry,
Desargues’s theorem is related to the associativity of the ground field K (in the present
case, K = R). Also, Desargues’s theorem yields a geometric characterization of the affine
dilatations. An affine dilatation f on an affine space E is a bijection that maps every line
D to a line f (D) parallel to D. We leave the proof as an exercise.
19.10 Affine Hyperplanes
We now consider affine forms and affine hyperplanes. In Section 19.5 we observed that the
set L of solutions of an equation
ax + by = c
Figure 19.15: Desargues’s theorem (affine version)
is an affine subspace of $\mathbb{A}^2$ of dimension 1, in fact, a line (provided that $a$ and $b$ are not both
null). It would be equally easy to show that the set $P$ of solutions of an equation
\[ ax + by + cz = d \]
is an affine subspace of $\mathbb{A}^3$ of dimension 2, in fact, a plane (provided that $a, b, c$ are not all
null). More generally, the set $H$ of solutions of an equation
\[ \lambda_1 x_1 + \cdots + \lambda_m x_m = \mu \]
is an affine subspace of $\mathbb{A}^m$, and if $\lambda_1, \ldots, \lambda_m$ are not all null, it turns out that it is a subspace
of dimension $m - 1$ called a hyperplane.
We can interpret the equation
λ1x1 + · · · + λmxm = µ
in terms of the map $f : \mathbb{R}^m \to \mathbb{R}$ defined such that
\[ f(x_1, \ldots, x_m) = \lambda_1 x_1 + \cdots + \lambda_m x_m - \mu \]
for all $(x_1, \ldots, x_m) \in \mathbb{R}^m$. It is immediately verified that this map is affine, and the set $H$ of
solutions of the equation
\[ \lambda_1 x_1 + \cdots + \lambda_m x_m = \mu \]
is the null set, or kernel, of the affine map $f : \mathbb{A}^m \to \mathbb{R}$, in the sense that
\[ H = f^{-1}(0) = \{x \in \mathbb{A}^m \mid f(x) = 0\}, \]
where x = (x1, . . . , xm).
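As an illustration, the sketch below defines one such affine form on three-dimensional space, checks that it preserves a barycentric combination, and checks that a point on the line through two points of the kernel H stays in H. The coefficients λ = (1, 2, −1), the value µ = 3, and the sample points are assumptions made for the example.

```python
from fractions import Fraction as F

lam = (F(1), F(2), F(-1))   # hypothetical coefficients lambda_1..lambda_3
mu = F(3)                   # hypothetical right-hand side

def f(x):
    """Affine form f(x) = lambda_1 x_1 + ... + lambda_m x_m - mu."""
    return sum(l * xi for l, xi in zip(lam, x)) - mu

def combo(t, p, q):
    """Barycentric combination (1 - t) p + t q of two points."""
    return tuple((1 - t) * pi + t * qi for pi, qi in zip(p, q))

p = (F(3), F(0), F(0))      # f(p) = 0, so p is in H
q = (F(0), F(0), F(-3))     # f(q) = 0, so q is in H
r = (F(1), F(1), F(1))      # f(r) = -1, so r is not in H

t = F(1, 3)
affine_lhs = f(combo(t, p, r))
affine_rhs = (1 - t) * f(p) + t * f(r)
on_line = f(combo(F(5), p, q))   # another point of the line through p and q
```

The equality of `affine_lhs` and `affine_rhs` is the barycenter-preservation property at one sample; `on_line` vanishes because H = f⁻¹(0) is closed under barycentric combinations.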
Thus, it is interesting to consider affine forms, which are just affine maps f : E → R
from an affine space to R. Unlike linear forms f∗, for which Ker f∗ is never empty (since it
always contains the vector 0), it is possible that $f^{-1}(0) = \emptyset$ for an affine form $f$. Given an
affine map $f : E \to \mathbb{R}$, we also denote $f^{-1}(0)$ by $\operatorname{Ker} f$, and we call it the kernel of $f$. Recall
that an (affine) hyperplane is an affine subspace of codimension 1. The relationship between
affine hyperplanes and affine forms is given by the following lemma.
Lemma 19.13. Let E be an affine space. The following properties hold:
(a) Given any nonconstant affine form f : E → R, its kernel H = Ker f is a hyperplane.
(b) For any hyperplane H in E, there is a nonconstant affine form f : E → R such that
H = Ker f . For any other affine form g : E → R such that H = Ker g, there is some
$\lambda \in \mathbb{R}$ such that $g = \lambda f$ (with $\lambda \neq 0$).
(c) Given any hyperplane H in E and any (nonconstant) affine form f : E → R such that
$H = \operatorname{Ker} f$, every hyperplane $H'$ parallel to $H$ is defined by a