20.1
Multisets
This chapter contains a review of polynomials and their basic properties. First, multisets
are defined. Polynomials in one variable are defined next. The notion of a polynomial
function in one argument is defined. Polynomials in several variable are defined, and so is
the notion of a polynomial function in several arguments. The Euclidean division algorithm is
presented, and the main consequences of its existence are derived. Ideals are defined, and the
characterization of greatest common divisors of polynomials in one variables (gcd’s) in terms
of ideals is shown. We also prove the Bezout identity. Next, we consider the factorization of
polynomials in one variables into irreducible factors. The unique factorization of polynomials
in one variable into irreducible factors is shown. Roots of polynomials and their multiplicity
are defined. It is shown that a nonnull polynomial in one variable and of degree m over an
integral domain has at most m roots. The chapter ends with a brief treatment of polynomial
interpolation: Lagrange, Newton, and Hermite interpolants are introduced.
In this chapter, it is assumed that all rings considered are commutative. Recall that a
(commutative) ring A is an integral domain (or an entire ring) if 1 = 0, and if ab = 0, then
either a = 0 or b = 0, for all a, b ∈ A. This second condition is equivalent to saying that if
a = 0 and b = 0, then ab = 0. Also, recall that a = 0 is not a zero divisor if ab = 0 whenever
b = 0. Observe that a field is an integral domain.
Our goal is to define polynomials in one or more indeterminates (or variables) X1, . . . , Xn,
with coefficients in a ring A. This can be done in several ways, and we choose a definition
that has the advantage of extending immediately from one to several variables. First, we
need to review the notion of a (finite) multiset.
Definition 20.1. Given a set I, a (finite) multiset over I is any function M : I → N such
that M (i) = 0 for finitely many i ∈ I. The multiset M such that M(i) = 0 for all i ∈ I is
the empty multiset, and it is denoted by 0. If M (i) = k = 0, we say that i is a member of
M of multiplicity k. The union M1 + M2 of two multisets M1 and M2 is defined such that
(M1 + M2)(i) = M1(i) + M2(i), for every i ∈ I. If I is finite, say I = {1, . . . , n}, the multiset
531
532
CHAPTER 20. POLYNOMIALS, IDEALS AND PID’S
M such that M (i) = ki for every i, 1 ≤ i ≤ n, is denoted by k1 · 1 + · · · + kn · n, or more
simply, by (k1, . . . , kn), and deg(k1 · 1 + · · · + kn · n) = k1 + · · · + kn is the size or degree of
M . The set of all multisets over I is denoted by
(I)
(n)
N
, and when I = {1, . . . , n}, by N .
Intuitively, the order of the elements of a multiset is irrelevant, but the multiplicity of
each element is relevant, contrary to sets. Every i ∈ I is identified with the multiset Mi such
that M
(1)
i(i) = 1 and Mi(j) = 0 for j = i. When I = {1}, the set N
of multisets k · 1 can be
identified with N and {1}∗. We will denote k · 1 simply by k.
However, beware that when n ≥ 2, the set (n)
N
of multisets cannot be identified with the
set of strings in {1, . . . , n}∗, because multiset union is commutative, but concatenation
of strings in {1, . . . , n}∗ is not commutative when n ≥ 2. This is because in a multiset
k1 · 1 + · · · + kn · n, the order is irrelevant, whereas in a string, the order is relevant. For
example, 2 · 1 + 3 · 2 = 3 · 2 + 2 · 1, but 11222 = 22211, as strings over {1, 2}.
Nevertherless,
(n)
n
N
and the set N of ordered n-tuples under component-wise addition
are isomorphic under the map
k1 · 1 + · · · + kn · n → (k1, . . . , kn).
Thus, since the notation (k1, . . . , kn) is less cumbersome that k1 · 1 + · · · + kn · n, it will be
preferred. We just have to remember that the order of the ki is really irrelevant.
But when I is infinite, beware that
(I)
I
N
and the set N of ordered I-tuples are not
isomorphic.
We are now ready to define polynomials.
20.2
Polynomials
We begin with polynomials in one variable.
Definition 20.2. Given a ring A, we define the set PA(1) of polynomials over A in one
variable as the set of functions P : N → A such that P (k) = 0 for finitely many k ∈ N. The
polynomial such that P (k) = 0 for all k ∈ N is the null (or zero) polynomial and it is denoted
by 0. We define addition of polynomials, multiplication by a scalar, and multiplication of
polynomials, as follows: Given any three polynomials P, Q, R ∈ PA(1), letting ak = P (k),
bk = Q(k), and ck = R(k), for every k ∈ N, we define R = P + Q such that
ck = ak + bk,
R = λP such that
ck = λak,
where λ ∈ A,
20.2. POLYNOMIALS
533
and R = P Q such that
ck =
aibj.
i+j=k
We define the polynomial ek such that ek(k) = 1 and ek(i) = 0 for i = k. We also denote
e0 by 1 when k = 0. Given a polynomial P , the ak = P (k) ∈ A are called the coefficients
of P . If P is not the null polynomial, there is a greatest n ≥ 0 such that an = 0 (and thus,
ak = 0 for all k > n) called the degree of P and denoted by deg(P ). Then, P is written
uniquely as
P = a0e0 + a1e1 + · · · + anen.
When P is the null polynomial, we let deg(P ) = −∞.
There is an injection of A into PA(1) given by the map a → a1 (recall that 1 denotes e0).
There is also an injection of N into PA(1) given by the map k → ek. Observe that ek = ek1
(with e01 = e0 = 1). In order to alleviate the notation, we often denote e1 by X, and we call
X a variable (or indeterminate). Then, ek = ek1 is denoted by Xk. Adopting this notation,
given a nonnull polynomial P of degree n, if P (k) = ak, P is denoted by
P = a0 + a1X + · · · + anXn,
or by
P = anXn + an−1Xn−1 + · · · + a0,
if this is more convenient (the order of the terms does not matter anyway). Sometimes, it
will also be convenient to write a polynomial as
P = a0Xn + a1Xn−1 + · · · + an.
The set PA(1) is also denoted by A[X] and a polynomial P may be denoted by P (X).
In denoting polynomials, we will use both upper-case and lower-case letters, usually, P, Q,
R, S, p, q, r, s, but also f, g, h, etc., if needed (as long as no ambiguities arise).
Given a nonnull polynomial P of degree n, the nonnull coefficient an is called the leading
coefficient of P . The coefficient a0 is called the constant term of P . A polynomial of the
form akXk is called a monomial. We say that akXk occurs in P if ak = 0. A nonzero
polynomial P of degree n is called a monic polynomial (or unitary polynomial, or monic) if
an = 1, where an is its leading coefficient, and such a polynomial can be written as
P = Xn + an−1Xn−1 + · · · + a0
or
P = Xn + a1Xn−1 + · · · + an.
The choice of the variable X to denote e1 is standard practice, but there is nothing special
about X. We could have chosen Y , Z, or any other symbol, as long as no ambiguities
arise.
534
CHAPTER 20. POLYNOMIALS, IDEALS AND PID’S
Formally, the definition of PA(1) has nothing to do with X. The reason for using X is
simply convenience. Indeed, it is more convenient to write a polynomial as P = a0 + a1X +
· · · + anXn rather than as P = a0e0 + a1e1 + · · · + anen.
We have the following simple but crucial proposition.
Proposition 20.1. Given two nonnull polynomials P (X) = a0 +a1X +· · ·+amXm of degree
m and Q(X) = b0 + b1X + · · · + bnXn of degree n, if either am or bn is not a zero divisor,
then ambn = 0, and thus, P Q = 0 and
deg(P Q) = deg(P ) + deg(Q).
In particular, if A is an integral domain, then A[X] is an integral domain.
Proof. Since the coefficient of Xm+n in P Q is ambn, and since we assumed that either am or
an is not a zero divisor, we have ambn = 0, and thus, P Q = 0 and
deg(P Q) = deg(P ) + deg(Q).
Then, it is obvious that A[X] is an integral domain.
It is easily verified that A[X] is a commutative ring, with multiplicative identity 1X0 = 1.
It is also easily verified that A[X] satisfies all the conditions of Definition 2.9, but A[X] is
not a vector space, since A is not necessarily a field.
A structure satisfying the axioms of Definition 2.9 when K is a ring (and not necessarily a
field) is called a module. As we mentioned in Section 4.2, we will not study modules because
they fail to have some of the nice properties that vector spaces have, and thus, they are
harder to study. For example, there are modules that do not have a basis.
However, when the ring A is a field, A[X] is a vector space. But even when A is just a
ring, the family of polynomials (Xk)k∈ is a basis of A[X], since every polynomial P (X) can
N
be written in a unique way as P (X) = a0 + a1X + · · · + anXn (with P (X) = 0 when P (X)
is the null polynomial). Thus, A[X] is a free module.
Next, we want to define the notion of evaluating a polynomial P (X) at some α ∈ A. For
this, we need a proposition.
Proposition 20.2. Let A, B be two rings and let h : A → B be a ring homomorphism.
For any β ∈ B, there is a unique ring homomorphism ϕ : A[X] → B extending h such that
ϕ(X) = β, as in the following diagram (where we denote by h+β the map h+β : A∪{X} → B
such that (h + β)(a) = h(a) for all a ∈ A and (h + β)(X) = β):
A ∪ {X} ι /
h+β
%▲
▲
▲
▲
▲
▲
▲
▲
▲
▲
A[X]
ϕ
B
20.2. POLYNOMIALS
535
Proof. Let ϕ(0) = 0, and for every nonull polynomial P (X) = a0 + a1X + · · · + anXn, let
ϕ(P (X)) = h(a0) + h(a1)β + · · · + h(an)βn.
It is easily verified that ϕ is the unique homomorphism ϕ : A[X] → B extending h such that
ϕ(X) = β.
Taking A = B in Proposition 20.2 and h : A → A the identity, for every β ∈ A, there
is a unique homomorphism ϕβ : A[X] → A such that ϕβ(X) = β, and for every polynomial
P (X), we write ϕβ(P (X)) as P (β) and we call P (β) the value of P (X) at X = β. Thus, we
can define a function PA : A → A such that PA(β) = P (β), for all β ∈ A. This function is
called the polynomial function induced by P .
More generally, PB can be defined for any (commutative) ring B such that A ⊆ B. In
general, it is possible that PA = QA for distinct polynomials P, Q. We will see shortly
conditions for which the map P → PA is injective. In particular, this is true for A = R (in
general, any infinite integral domain). We now define polynomials in n variables.
Definition 20.3. Given n ≥ 1 and a ring A, the set PA(n) of polynomials over A in n
variables is the set of functions P :
(n)
N
→ A such that P (k1, . . . , kn) = 0 for finitely many
(k
(n)
1, . . . , kn) ∈ N
. The polynomial such that P (k1, . . . , kn) = 0 for all (k1, . . . , kn) is
the null (or zero) polynomial and it is denoted by 0. We define addition of polynomials,
multiplication by a scalar, and multiplication of polynomials, as follows: Given any three
polynomials P, Q, R ∈ PA(n), letting a(k1,...,kn) = P (k1, . . . , kn), b(k1,...,kn) = Q(k1, . . . , kn),
c
(n)
(k
, we define R = P + Q such that
1,...,kn) = R(k1, . . . , kn), for every (k1, . . . , kn) ∈ N
c(k1,...,kn) = a(k1,...,kn) + b(k1,...,kn),
R = λP , where λ ∈ A, such that
c(k1,...,kn) = λa(k1,...,kn),
and R = P Q, such that
c(k
a
1,...,kn) =
(i1,...,in)b(j1,...,jn).
(i1,...,in)+(j1,...,jn)=(k1,...,kn)
For every (k
(n)
1, . . . , kn) ∈ N
, we let e(k1,...,kn) be the polynomial such that
e(k1,...,kn)(k1, . . . , kn) = 1 and e(k1,...,kn)(h1, . . . , hn) = 0,
for (h1, . . . , hn) = (k1, . . . , kn). We also denote e(0,...,0) by 1. Given a polynomial P , the
a(k1,...,kn) = P (k1, . . . , kn) ∈ A, are called the coefficients of P . If P is not the null polynomial,
there is a greatest d ≥ 0 such that a
(n)
(k
, with d =
1,...,kn)
= 0 for some (k1, . . . , kn) ∈ N
k1 + · · · + kn, called the total degree of P and denoted by deg(P ). Then, P is written
uniquely as
P =
a(k1,...,kn)e(k1,...,kn).
(k
(n)
1,...,kn)∈N
When P is the null polynomial, we let deg(P ) = −∞.
536
CHAPTER 20. POLYNOMIALS, IDEALS AND PID’S
There is an injection of A into PA(n) given by the map a → a1 (where 1 denotes e(0,...,0)).
There is also an injection of (n)
N
into PA(n) given by the map (h1, . . . , hn) → e(h1,...,hn). Note
that e(h1,...,hn)e(k1,...,kn) = e(h1+k1,...,hn+kn). In order to alleviate the notation, let X1, . . . , Xn
be n distinct variables and denote e(0,...,0,1,0...,0), where 1 occurs in the position i, by Xi
(where 1 ≤ i ≤ n). With this convention, in view of e(h1,...,hn)e(k1,...,kn) = e(h1+k1,...,hn+kn), the
polynomial e(k1,...,kn) is denoted by Xk1
1 · · · X kn
n
(with e(0,...,0) = X01 · · · X0n = 1) and it is called
a primitive monomial . Then, P is also written as
P =
a(k1,...,kn)Xk1
1 · · · X kn
n .
(k
(n)
1,...,kn)∈N
We also denote PA(n) by A[X1, . . . , Xn]. A polynomial P ∈ A[X1, . . . , Xn] is also denoted
by P (X1, . . . , Xn).
As in the case n = 1, there is nothing special about the choice of X1, . . . , Xn as variables
(or indeterminates). It is just a convenience. After all, the construction of PA(n) has nothing
to do with X1, . . . , Xn.
Given a nonnull polynomial P of degree d, the nonnull coefficients a(k1,...,kn) = 0 such
that d = k1 + · · · + kn are called the leading coefficients of P . A polynomial of the form
a(k1,...,kn)Xk1
1 · · · X kn
n
is called a monomial . Note that deg(a(k1,...,kn)Xk1
1 · · · X kn
n ) = k1+· · ·+kn.
Given a polynomial
P =
a(k1,...,kn)Xk1
1 · · · X kn
n ,
(k
(n)
1,...,kn)∈N
a monomial a(k1,...,kn)Xk1
1 · · · X kn
n
occurs in the polynomial P if a(k1,...,kn) = 0.
A polynomial
P =
a(k1,...,kn)Xk1
1 · · · X kn
n
(k
(n)
1,...,kn)∈N
is homogeneous of degree d if
deg(Xk1
1 · · · X kn
n ) = d,
for every monomial a(k1,...,kn)Xk1
1 · · · X kn
n
occurring in P . If P is a polynomial of total degree
d, it is clear that P can be written uniquely as
P = P (0) + P (1) + · · · + P (d),
where P (i) is the sum of all monomials of degree i occurring in P , where 0 ≤ i ≤ d.
It is easily verified that A[X1, . . . , Xn] is a commutative ring, with multiplicative identity
1X01 · · · X0n = 1. It is also easily verified that A[X] is a module. When A is a field, A[X] is
a vector space.
Even when A is just a ring, the family of polynomials
(Xk1
1 · · · X kn
n )(k
(n)
1,...,kn)∈N
20.2. POLYNOMIALS
537
is a basis of A[X1, . . . , Xn], since every polynomial P (X1, . . . , Xn) can be written in a unique
way as
P (X1, . . . , Xn) =
a(k1,...,kn)Xk1
1 · · · X kn
n .
(k
(n)
1,...,kn)∈N
Thus, A[X1, . . . , Xn] is a free module.
Remark: The construction of Definition 20.3 can be immediately extended to an arbitrary
set I, and not just I = {1, . . . , n}. It can also be applied to monoids more general that (I)
N
.
Proposition 20.2 is generalized as follows.
Proposition 20.3. Let A, B be two rings and let h : A → B be a ring homomorphism. For
any β = (β1, . . . , βn) ∈ Bn, there is a unique ring homomorphism ϕ : A[X1, . . . , Xn] → B
extending h such that ϕ(Xi) = βi, 1 ≤ i ≤ n, as in the following diagram (where we denote
by h + β the map h + β : A ∪ {X1, . . . , Xn} → B such that (h + β)(a) = h(a) for all a ∈ A
and (h + β)(Xi) = βi, 1 ≤ i ≤ n):
A ∪ {X1, . . . , Xn} ι /
h+β
)❚
❚
❚
❚
❚
❚
❚
❚
❚
❚
❚
❚
❚
❚
❚
❚
❚
A[X1, . . . , Xn]
ϕ
B
Proof. Let ϕ(0) = 0, and for every nonull polynomial
P (X1, . . . , Xn) =
a(k1,...,kn)Xk1
1 · · · X kn
n ,
(k
(n)
1,...,kn)∈N
let
ϕ(P (X1, . . . , Xn)) =
h(a(k1,...,kn))βk1
1 · · · βkn
n .
It is easily verified that ϕ is the unique homomorphism ϕ : A[X1, . . . , Xn] → B extending h
such that ϕ(Xi) = βi.
Taking A = B in Proposition 20.3 and h : A → A the identity, for every β1, . . . , βn ∈ A,
there is a unique homomorphism ϕ : A[X1, . . . , Xn] → A such that ϕ(Xi) = βi, and for
every polynomial P (X1, . . . , Xn), we write ϕ(P (X1, . . . , Xn)) as P (β1, . . . , βn) and we call
P (β1, . . . , βn) the value of P (X1, . . . , Xn) at X1 = β1, . . . , Xn = βn. Thus, we can define a
function PA : An → A such that PA(β1, . . . , βn) = P (β1, . . . , βn), for all β1, . . . , βn ∈ A. This
function is called the polynomial function induced by P .
More generally, PB can be defined for any (commutative) ring B such that A ⊆ B. As
in the case of a single variable, it is possible that PA = QA for distinct polynomials P, Q.
We will see shortly that the map P → PA is injective when A = R (in general, any infinite
integral domain).
538
CHAPTER 20. POLYNOMIALS, IDEALS AND PID’S
Given any nonnull polynomial P (X1, . . . , Xn) =
(k
(n) a(k
1,...,kn)∈N
1,...,kn)X k1
1 · · · X kn
n
in
A[X1, . . . , Xn], where n ≥ 2, P (X1, . . . , Xn) can be uniquely written as
P (X1, . . . , Xn) =
Qk (X
n
1, . . . , Xn−1)X kn
n ,
where each polynomial Qk (X
n
1, . . . , Xn−1) is in A[X1, . . . , Xn−1]. Thus, even if A is a field,
A[X1, . . . , Xn−1] is not a field, which confirms that it is useful (and necessary!) to consider
polynomials over rings that are not necessarily fields.
It is not difficult to show that A[X1, . . . , Xn] and A[X1, . . . , Xn−1][Xn] are isomorphic
rings. This way, it is often possible to prove properties of polynomials in several variables
X1, . . . , Xn, by induction on the number n of variables. For example, given two nonnull
polynomials P (X1, . . . , Xn) of total degree p and Q(X1, . . . , Xn) of total degree q, since we
assumed that A is an integral domain, we can prove that
deg(P Q) = deg(P ) + deg(Q),
and that A[X1, . . . , Xn] is an integral domain.
Next, we will consider the division of polynomials (in one variable).
20.3
Euclidean Division of Polynomials
We know that every natural number n ≥ 2 can be written uniquely as a product of powers of
prime numbers and that prime numbers play a very important role in arithmetic. It would be
nice if every polynomial could be expressed (uniquely) as a product of “irreducible” factors.
This is indeed the case for polynomials over a field. The fact that there is a division algorithm
for the natural numbers is essential for obtaining many of the arithmetical properties of the
natural numbers. As we shall see next, there is also a division algorithm for polynomials in
A[X], when A is a field.
Proposition 20.4. Let A be a ring, let f (X), g(X) ∈ A[X] be two polynomials of degree
m = deg(f ) and n = deg(g) with f (X) = 0, and assume that the leading coefficient am of
f (X) is invertible. Then, there exist unique polynomials q(X) and r(X) in A[X] such that
g = f q + r
and
deg(r) < deg(f ) = m.
Proof. We first prove the existence of q and r. Let
f = amXm + am−1Xm−1 + · · · + a0,
and
g = bnXn + bn−1Xn−1 + · · · + b0.
If n < m, then let q = 0 and r = g. Since deg(g) < deg(f ) and r = g, we have deg(r) <
deg(f ).
20.3. EUCLIDEAN DIVISION OF POLYNOMIALS
539
If n ≥ m, we proceed by induction on n. If n = 0, then g = b0, m = 0, f = a0 = 0, and
we let q = a−1
0 b0 and r = 0. Since deg(r) = deg(0) = −∞ and deg(f ) = deg(a0) = 0 because
a0 = 0, we have deg(r) < deg(f ).
If n ≥ 1, since n ≥ m, note that
g1(X) = g(X) − bna−1
m X n−mf (X )
= bnXn + bn−1Xn−1 + · · · + b0 − bna−1
m X n−m(amX m + am−1X m−1 + · · · + a0)
is a polynomial of degree deg(g1) < n, since the terms bnXn and bna−1
m X n−mamX m of degree
n cancel out. Now, since deg(g1) < n, by the induction hypothesis, we can find q1 and r
such that
g1 = f q1 + r and deg(r) < deg(f ) = m,
and thus,
g1(X) = g(X) − bna−1
m X n−mf (X ) = f (X )q1(X ) + r(X ),
from which, letting q(X) = bna−1
m X n−m + q1(X ), we get
g = f q + r
and deg(r) < m = deg(f ).
We now prove uniqueness. If
g = f q1 + r1 = f q2 + r2,
with deg(r1) < deg(f ) and deg(r2) < deg(f ), we get
f (q1 − q2) = r2 − r1.
If q2 − q1 = 0, since the leading coefficient am of f is invertible, by Proposition 20.1, we have
deg(r2 − r1) = deg(f(q1 − q2)) = deg(f) + deg(q2 − q1),
and so, deg(r2−r1) ≥ deg(f), which contradicts the fact that deg(r1) < deg(f) and deg(r2) <
deg(f ). Thus, q1 = q2, and then also r1 = r2.
It should be noted that the proof of Proposition 20.4 actually provides an algorithm for
finding the quotient q and the remainder r of the division of g by f . This algorithm is
called the Euclidean algorithm, or division algorithm. Note that the division of g by f is
always possible when f is a monic polynomial, since 1 is invertible. Also, when A is a field,
am = 0 is always invertible, and thus, the division can always be performed. We say that f
divides g when r = 0 in the result of the division g = f q + r. We now draw some important
consequences of the existence of the Euclidean algorithm.
540
CHAPTER 20. POLYNOMIALS, IDEALS AND PID’S
20.4
Ideals, PID’s, and Greatest Common Divisors
First, we introduce the fundamental concept of an ideal.
Definition 20.4. Given a ring A, an ideal of A is any nonempty subset I of A satisfying
the following two properties:
(ID1) If a, b ∈ I, then b − a ∈ I.
(ID2) If a ∈ I, then ax ∈ I for every x ∈ A.
An ideal I is a principal ideal if there is some a ∈ I, called a generator, such that
I = {ax | x ∈ A}.
The equality I = {ax | x ∈ A} is also written as I = aA or as I =