rule. If $n \neq 0$ is an integer and $y = x^{1/n}$, then $y^n = x$, and so, assuming $y$ is
differentiable,
\[ \frac{d}{dx} y^n = \frac{d}{dx} x. \tag{1.7.45} \]
CHAPTER 1. DERIVATIVES
Hence
\[ n y^{n-1} \frac{dy}{dx} = 1, \tag{1.7.46} \]
from which it follows that
\[ \frac{dy}{dx} = \frac{1}{n y^{n-1}} = \frac{1}{n} y^{1-n} = \frac{1}{n} x^{\frac{1-n}{n}} = \frac{1}{n} x^{\frac{1}{n} - 1}, \tag{1.7.47} \]
showing that the power rule works for rational powers of the form $\frac{1}{n}$. Note that
the above derivation is not complete since we began with the assumption that
$y = x^{1/n}$ is differentiable. Although it is beyond the scope of this text, it may be
shown that this assumption is justified for $x > 0$ if $n$ is even, and for all $x \neq 0$
if $n$ is odd.
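Now if $m \neq 0$ is also an integer, we have, using the chain rule as above,
\[
\begin{aligned}
\frac{d}{dx} x^{\frac{m}{n}} &= \frac{d}{dx} \left( x^{\frac{1}{n}} \right)^m \\
&= m \left( x^{\frac{1}{n}} \right)^{m-1} \frac{1}{n} x^{\frac{1}{n} - 1} \\
&= \frac{m}{n} x^{\frac{m-1}{n} + \frac{1}{n} - 1} \\
&= \frac{m}{n} x^{\frac{m}{n} - 1}.
\end{aligned}
\tag{1.7.48}
\]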
Hence we now see that the power rule holds for any non-zero rational exponent.
Theorem 1.7.10. If $r \neq 0$ is any rational number, then
\[ \frac{d}{dx} x^r = r x^{r-1}. \tag{1.7.49} \]
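As a quick numerical sanity check of the power rule for rational exponents, the following Python sketch compares $r x^{r-1}$ with a symmetric difference quotient at a few positive values of $x$. The helper names (`power_rule`, `central_difference`, `max_relative_error`), the sample points, and the step size `h` are illustrative assumptions, and a small real step only approximates the infinitesimal $dx$ used in the text.

```python
from fractions import Fraction


def power_rule(r, x):
    """Derivative of x**r predicted by the power rule: r * x**(r - 1)."""
    return float(r) * x ** (float(r) - 1.0)


def central_difference(f, x, h=1e-6):
    """Symmetric difference quotient, a numerical stand-in for dy/dx."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


def max_relative_error(r, points=(0.5, 1.0, 2.0, 7.5)):
    """Largest relative gap between the two derivative estimates at x > 0."""
    worst = 0.0
    for x in points:
        exact = power_rule(r, x)
        approx = central_difference(lambda t: t ** float(r), x)
        worst = max(worst, abs(approx - exact) / abs(exact))
    return worst


if __name__ == "__main__":
    # Hypothetical sample exponents; any non-zero rational r may be used.
    for r in (Fraction(1, 2), Fraction(2, 3), Fraction(-3, 4), Fraction(7, 5)):
        print(r, max_relative_error(r))
```

Only positive sample points are used, so the check does not probe the cases $x \le 0$ where differentiability depends on the exponent.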
Example 1.7.18. With $r = \frac{1}{2}$ in the previous theorem, we have
\[ \frac{d}{dx} \sqrt{x} = \frac{1}{2} x^{-\frac{1}{2}} = \frac{1}{2\sqrt{x}}, \]
in agreement with our earlier direct computation.
Example 1.7.19. If $y = x^{\frac{2}{3}}$, then
\[ \frac{dy}{dx} = \frac{2}{3} x^{-\frac{1}{3}} = \frac{2}{3 x^{\frac{1}{3}}}. \]
Note that $\frac{dy}{dx}$ is not defined at $x = 0$, in agreement with our earlier result showing
that $y$ is not differentiable at $0$.
Exercise 1.7.16. Find the derivative of $f(x) = 5x^{\frac{4}{5}}$.
We may now generalize (1.7.44) as follows: If $u$ is a differentiable function of $x$ and $r \neq 0$ is a rational number, then
\[ \frac{d}{dx} u^r = r u^{r-1} \frac{du}{dx}. \tag{1.7.50} \]
1.7. PROPERTIES OF DERIVATIVES
Example 1.7.20. If $f(x) = \sqrt{x^2 + 1}$, then
\[ f'(x) = \frac{1}{2} (x^2 + 1)^{-\frac{1}{2}} (2x) = \frac{x}{\sqrt{x^2 + 1}}. \]
Example 1.7.21. If
\[ g(t) = \frac{1}{t^4 + 5}, \]
then
\[ g'(t) = (-1)(t^4 + 5)^{-2}(4t^3) = -\frac{4t^3}{(t^4 + 5)^2}. \]
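The generalized power rule can also be applied mechanically. The sketch below is a swapped-in technique, not the text's method: it uses dual numbers (forward-mode automatic differentiation), carrying a value $u$ together with $\frac{du}{dx}$ and applying $\frac{d}{dx} u^r = r u^{r-1} \frac{du}{dx}$ whenever a power is taken. The class `Dual` and the helpers `_lift`, `derivative`, `sqrt_example`, and `reciprocal_example` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
import math


@dataclass(frozen=True)
class Dual:
    """A value u paired with its derivative du/dx at a single point."""
    value: float
    deriv: float

    def __add__(self, other):
        other = _lift(other)
        return Dual(self.value + other.value, self.deriv + other.deriv)

    __radd__ = __add__

    def __mul__(self, other):
        # Product rule: (uv)' = u'v + uv'
        other = _lift(other)
        return Dual(self.value * other.value,
                    self.deriv * other.value + self.value * other.deriv)

    __rmul__ = __mul__

    def __pow__(self, r):
        # Generalized power rule: d(u^r)/dx = r u^(r-1) du/dx, r a constant.
        return Dual(self.value ** r, r * self.value ** (r - 1) * self.deriv)


def _lift(c):
    """Treat a plain number as a constant, whose derivative is 0."""
    return c if isinstance(c, Dual) else Dual(float(c), 0.0)


def derivative(func, x):
    """Evaluate func at x with dx/dx = 1 and read off the derivative."""
    return func(Dual(float(x), 1.0)).deriv


def sqrt_example(x):
    """f(x) = sqrt(x^2 + 1), written as a power with r = 1/2."""
    return (x * x + 1) ** 0.5


def reciprocal_example(t):
    """g(t) = 1 / (t^4 + 5), written as a power with r = -1."""
    return (t ** 4 + 5) ** -1


if __name__ == "__main__":
    print(derivative(sqrt_example, 2.0), 2.0 / math.sqrt(5.0))
    print(derivative(reciprocal_example, 1.0), -4.0 / 36.0)
```

The sketch assumes the base of every power is positive at the point of evaluation, which holds for both sample functions.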
Exercise 1.7.17. Find the derivative of
\[ y = \frac{4}{\sqrt{x^2 + 4}}. \]
Exercise 1.7.18. Find the derivative of $f(x) = (x^2 + 3x - 5)^{10} (3x^4 - 6x + 4)^{12}$.
1.7.7 Trigonometric functions
If $y = \sin(x)$ and $w = \cos(x)$, then, for any infinitesimal $dx$,
\[
\begin{aligned}
dy &= \sin(x + dx) - \sin(x) \\
&= \sin(x)\cos(dx) + \sin(dx)\cos(x) - \sin(x) \\
&= \sin(x)(\cos(dx) - 1) + \cos(x)\sin(dx)
\end{aligned}
\tag{1.7.51}
\]
and
\[
\begin{aligned}
dw &= \cos(x + dx) - \cos(x) \\
&= \cos(x)\cos(dx) - \sin(x)\sin(dx) - \cos(x) \\
&= \cos(x)(\cos(dx) - 1) - \sin(x)\sin(dx).
\end{aligned}
\tag{1.7.52}
\]
Hence, if $dx \neq 0$,
\[ \frac{dy}{dx} = \cos(x) \frac{\sin(dx)}{dx} - \sin(x) \frac{1 - \cos(dx)}{dx} \tag{1.7.53} \]
and
\[ \frac{dw}{dx} = -\sin(x) \frac{\sin(dx)}{dx} - \cos(x) \frac{1 - \cos(dx)}{dx}. \tag{1.7.54} \]
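Now from (1.5.13) we know that
\[ 0 \le 1 - \cos(dx) \le \frac{(dx)^2}{2}, \tag{1.7.55} \]
and so, for $dx > 0$,
\[ 0 \le \frac{1 - \cos(dx)}{dx} \le \frac{dx}{2}, \tag{1.7.56} \]
with both inequalities reversed when $dx < 0$.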
Hence
\[ \frac{1 - \cos(dx)}{dx} \tag{1.7.57} \]
is an infinitesimal. Moreover, from (1.5.36), we know that
\[ \frac{\sin(dx)}{dx} \simeq 1. \tag{1.7.58} \]
Hence
\[ \frac{dy}{dx} \simeq \cos(x)(1) - \sin(x)(0) = \cos(x) \tag{1.7.59} \]
and
\[ \frac{dw}{dx} \simeq -\sin(x)(1) - \cos(x)(0) = -\sin(x). \tag{1.7.60} \]
That is, we have shown the following.
Theorem 1.7.11. For all real values $x$,
\[ \frac{d}{dx} \sin(x) = \cos(x) \tag{1.7.61} \]
and
\[ \frac{d}{dx} \cos(x) = -\sin(x). \tag{1.7.62} \]
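The two quotients that drive this derivation can be watched numerically. The following Python sketch tabulates $\frac{\sin(h)}{h}$ and $\frac{1 - \cos(h)}{h}$ for shrinking positive real steps $h$, which stand in for the infinitesimal $dx$. The function names are illustrative assumptions, and the second quotient is computed through the identity $1 - \cos(h) = 2\sin^2(h/2)$, a choice made here only to limit floating-point round-off.

```python
import math


def sine_quotient(h):
    """sin(h) / h, which the text shows is infinitesimally close to 1."""
    return math.sin(h) / h


def cosine_quotient(h):
    """(1 - cos(h)) / h, computed as 2 sin^2(h/2) / h to limit round-off."""
    return 2.0 * math.sin(h / 2.0) ** 2 / h


if __name__ == "__main__":
    for k in range(1, 7):
        h = 10.0 ** (-k)
        print(f"h={h:.0e}  sin(h)/h={sine_quotient(h):.12f}  "
              f"(1-cos h)/h={cosine_quotient(h):.3e}  h/2={h / 2:.3e}")
```

For each positive $h$ in the table, the last two columns illustrate the bound $0 \le \frac{1 - \cos(h)}{h} \le \frac{h}{2}$.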
Example 1.7.22. Using the chain rule,
\[ \frac{d}{dx} \cos(4x) = -\sin(4x) \frac{d}{dx}(4x) = -4\sin(4x). \]
Example 1.7.23. If $f(t) = \sin^2(t)$, then, again using the chain rule,
\[ f'(t) = 2\sin(t) \frac{d}{dt} \sin(t) = 2\sin(t)\cos(t). \]
Example 1.7.24. If $g(x) = \cos(x^2)$, then
\[ g'(x) = -\sin(x^2)(2x) = -2x\sin(x^2). \]
Example 1.7.25. If $f(x) = \sin^3(4x)$, then, using the chain rule twice,
\[ f'(x) = 3\sin^2(4x) \frac{d}{dx} \sin(4x) = 12\sin^2(4x)\cos(4x). \]
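Chain-rule computations like these are easy to get slightly wrong, so a numerical cross-check can be useful. The Python sketch below pairs each function from Examples 1.7.22 through 1.7.25 with the derivative that the chain rule and Theorem 1.7.11 give for it, and measures the largest gap from a symmetric difference quotient. The table `EXAMPLES`, the helper names, the sample points, and the step size are assumptions made for illustration.

```python
import math


def central_difference(f, x, h=1e-6):
    """Symmetric difference quotient, a numerical stand-in for the derivative."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


# Each entry: (function, derivative obtained from the chain rule).
EXAMPLES = {
    "1.7.22": (lambda x: math.cos(4 * x),
               lambda x: -4 * math.sin(4 * x)),
    "1.7.23": (lambda t: math.sin(t) ** 2,
               lambda t: 2 * math.sin(t) * math.cos(t)),
    "1.7.24": (lambda x: math.cos(x * x),
               lambda x: -2 * x * math.sin(x * x)),
    "1.7.25": (lambda x: math.sin(4 * x) ** 3,
               lambda x: 12 * math.sin(4 * x) ** 2 * math.cos(4 * x)),
}


def largest_gap(name, points=(-1.3, 0.0, 0.4, 2.0)):
    """Largest absolute difference between formula and numeric estimate."""
    f, f_prime = EXAMPLES[name]
    return max(abs(central_difference(f, x) - f_prime(x)) for x in points)


if __name__ == "__main__":
    for name in EXAMPLES:
        print(name, largest_gap(name))
```

A gap near round-off level supports a formula; it does not prove it.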
Exercise 1.7.19. Find the derivatives of
$y = \cos(3t + 6)$ and $w = \sin^2(t)\cos^2(4t)$.
Exercise 1.7.20. Verify the following:
(a) $\frac{d}{dt} \tan(t) = \sec^2(t)$
(b) $\frac{d}{dt} \cot(t) = -\csc^2(t)$
(c) $\frac{d}{dt} \sec(t) = \sec(t)\tan(t)$
(d) $\frac{d}{dt} \csc(t) = -\csc(t)\cot(t)$
Exercise 1.7.21. Find the derivative of $y = \sec^2(3t)$.
Exercise 1.7.22. Find the derivative of $f(t) = \tan^2(3t)$.
1.8 A geometric interpretation of the derivative
Recall that if $y = f(x)$, then, for any real number $\Delta x$,
\[ \frac{\Delta y}{\Delta x} = \frac{f(x + \Delta x) - f(x)}{\Delta x} \tag{1.8.1} \]
is the average rate of change of $y$ with respect to $x$ over the interval $[x, x + \Delta x]$
(see (1.2.7)). Now if the graph of $y$ is a straight line, that is, if $f(x) = mx + b$ for some real numbers $m$ and $b$, then (1.8.1) is $m$, the slope of the line. In fact, a straight line is characterized by the fact that (1.8.1) is the same for any values of $x$ and $\Delta x$. Moreover, (1.8.1) remains the same when $\Delta x$ is infinitesimal; that is, the derivative of $y$ with respect to $x$ is the slope of the line.
For other differentiable functions $f$, the value of (1.8.1) depends upon both $x$ and $\Delta x$. However, for infinitesimal values of $\Delta x$, the shadow of (1.8.1), that is, the derivative $\frac{dy}{dx}$, depends on $x$ alone. Hence it is reasonable to think of $\frac{dy}{dx}$
as the slope of the curve $y = f(x)$ at a point $x$. Whereas the slope of a straight
line is constant from point to point, for other differentiable functions the value
of the slope of the curve will vary from point to point.
If $f$ is differentiable at a point $a$, we call the line with slope $f'(a)$ passing
through $(a, f(a))$ the tangent line to the graph of $f$ at $(a, f(a))$. That is, the
tangent line to the graph of $y = f(x)$ at $x = a$ is the line with equation
\[ y = f'(a)(x - a) + f(a). \tag{1.8.2} \]
Hence a tangent line to the graph of a function $f$ is a line through a point on
the graph of $f$ whose slope is equal to the slope of the graph at that point.
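Example 1.8.1. If $f(x) = x^5 - 6x^2 + 5$, then
\[ f'(x) = 5x^4 - 12x. \]
In particular, $f'\left(-\frac{1}{2}\right) = \frac{101}{16}$ and $f\left(-\frac{1}{2}\right) = \frac{111}{32}$, and so the equation of the line tangent to the
graph of $f$ at $x = -\frac{1}{2}$ is
\[ y = \frac{101}{16}\left(x + \frac{1}{2}\right) + \frac{111}{32}. \]
See Figure 1.8.1.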
Figure 1.8.1: A tangent line to the graph of $f(x) = x^5 - 6x^2 + 5$, showing the curve $y = x^5 - 6x^2 + 5$ and the line $y = \frac{101}{16}\left(x + \frac{1}{2}\right) + \frac{111}{32}$
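The arithmetic in this example can be reproduced exactly with rational numbers. The Python sketch below evaluates $f$ and $f'$ at $a = -\frac{1}{2}$ using the standard-library `Fraction` type and rewrites $y = f'(a)(x - a) + f(a)$ in slope-intercept form. The function name `tangent_line` and the slope-intercept return format are assumptions made for illustration.

```python
from fractions import Fraction


def f(x):
    """f(x) = x^5 - 6x^2 + 5."""
    return x ** 5 - 6 * x ** 2 + 5


def f_prime(x):
    """f'(x) = 5x^4 - 12x, from the power rule."""
    return 5 * x ** 4 - 12 * x


def tangent_line(a):
    """Return (slope, intercept) so that the tangent at x = a is
    y = slope * x + intercept, equivalent to y = f'(a)(x - a) + f(a)."""
    slope = f_prime(a)
    intercept = f(a) - slope * a
    return slope, intercept


if __name__ == "__main__":
    a = Fraction(-1, 2)
    slope, intercept = tangent_line(a)
    print("f(a) =", f(a))          # 111/32
    print("slope =", slope)        # 101/16
    print("intercept =", intercept)  # 53/8
```

Expanding the tangent-line equation by hand gives the same intercept: $\frac{101}{16} \cdot \frac{1}{2} + \frac{111}{32} = \frac{212}{32} = \frac{53}{8}$.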
Exercise 1.8.1. Find an equation for the line tangent to the graph of
$f(x) = 3x^4 - 6x + 3$
at $x = 2$.
Exercise 1.8.2. Find an equation for the line tangent to the graph of
$y = 3\sin^2(x)$
at $x = \frac{\pi}{4}$.
1.9 Increasing, decreasing, and local extrema
Recall that the slope of a line is positive if, and only if, the line rises from left
to right. That is, if m > 0, f (x) = mx + b, and u < v, then
\[
\begin{aligned}
f(v) &= mv + b \\
&= mv - mu + mu + b \\
&= m(v - u) + mu + b \\
&> mu + b \\
&= f(u).
\end{aligned}
\tag{1.9.1}
\]
We should expect that an analogous statement holds for differentiable functions:
if $f$ is differentiable and $f'(x) > 0$ for all $x$ in an interval $(a, b)$, then $f(v) > f(u)$ for any $v > u$ in $(a, b)$. This is in fact the case, although the inference requires
establishing a direct connection between slope at a point and the average slope
over an interval, or, in terms of rates of change, between the instantaneous
rate of change at a point and the average rate of change over an interval. The
mean-value theorem makes this connection.
1.9.1 The mean-value theorem
Recall that the extreme value property tells us that a continuous function on a
closed interval must attain both a minimum and a maximum value. Suppose f
is continuous on [a, b], differentiable on (a, b), and f attains a maximum value
at c with a < c < b. In particular, for any infinitesimal dx, f (c) ≥ f (c + dx),
and so, equivalently, f (c + dx) − f (c) ≤ 0. It follows that if dx > 0,
\[ \frac{f(c + dx) - f(c)}{dx} \le 0, \tag{1.9.2} \]
and if $dx < 0$,
\[ \frac{f(c + dx) - f(c)}{dx} \ge 0. \tag{1.9.3} \]
Since both of these values must be infinitesimally close to the same real number, namely $f'(c)$,
it must be the case that
\[ \frac{f(c + dx) - f(c)}{dx} \simeq 0. \tag{1.9.4} \]
That is, we must have $f'(c) = 0$. A similar result holds if $f$ has a minimum at
$c$, and so we have the following basic result.
Theorem 1.9.1. If $f$ is differentiable on $(a, b)$ and attains a maximum, or a
minimum, value at a point $c$ in $(a, b)$, then $f'(c) = 0$.
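The sign argument behind this theorem can be seen with real step sizes in place of infinitesimals. In the Python sketch below, the sample function $f(x) = 3 - (x - 1)^2$, which has its maximum value at $c = 1$, is an assumption chosen for illustration, as are the names `f` and `difference_quotient`. Quotients taken with positive steps are at most 0, those with negative steps are at least 0, and both shrink toward 0.

```python
def f(x):
    """Assumed sample function with a maximum value of 3 at c = 1."""
    return 3.0 - (x - 1.0) ** 2


def difference_quotient(func, c, h):
    """(func(c + h) - func(c)) / h for a non-zero real step h."""
    return (func(c + h) - func(c)) / h


if __name__ == "__main__":
    c = 1.0
    for k in range(1, 11):
        h = 0.5 ** k  # powers of 1/2 are exactly representable as floats
        right = difference_quotient(f, c, h)   # step to the right of c
        left = difference_quotient(f, c, -h)   # step to the left of c
        print(f"h={h:<12} right={right:<14} left={left}")
```

The only real number that is both a limit of non-positive values and a limit of non-negative values is 0, which matches the conclusion $f'(c) = 0$.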
Now suppose $f$ is continuous on $[a, b]$, differentiable on $(a, b)$, and $f(a) =
f(b)$. If $f$ is a constant function, then $f'(c) = 0$ for all $c$ in $(a, b)$. If $f$ is not
constant, then there is a point $c$ in $(a, b)$ at which $f$ attains either a maximum
or a minimum value, and so $f'(c) = 0$. In either case, we have the following
result, known as Rolle's theorem.
Theorem 1.9.2. If $f$ is continuous on $[a, b]$, differentiable on $(a, b)$, and $f(a) =
f(b)$, then there is a real number $c$ in $(a, b)$ for which $f'(c) = 0$.
More generally, suppose $f$ is continuous on $[a, b]$ and differentiable on $(a, b)$.
Let
\[ g(x) = f(x) - \frac{f(b) - f(a)}{b - a}(x - a) - f(a). \tag{1.9.5} \]
Note that $g(x)$ is the difference between $f(x)$ and the corresponding $y$ value
on the line passing through $(a, f(a))$ and $(b, f(b))$. Moreover, $g$ is continuous
on $[a, b]$, differentiable on $(a, b)$, and $g(a) = 0 = g(b)$. Hence Rolle's theorem
applies to $g$, so there must exist a point $c$ in $(a, b)$ for which $g'(c) = 0$. Now
\[ g'(x) = f'(x) - \frac{f(b) - f(a)}{b - a}, \tag{1.9.6} \]
Figure 1.9.1: Graph of $f(x) = x^3 - 3x + 1$ with its tangent line at $x = \sqrt{\frac{4}{3}}$
so we must have
\[ 0 = g'(c) = f'(c) - \frac{f(b) - f(a)}{b - a}. \tag{1.9.7} \]
That is,
\[ f'(c) = \frac{f(b) - f(a)}{b - a}, \tag{1.9.8} \]
which is our desired connection between instantaneous and average rates of
change, known as the mean-value theorem.
Theorem 1.9.3. If $f$ is continuous on $[a, b]$ and differentiable on $(a, b)$, then
there exists a real number $c$ in $(a, b)$ for which
\[ f'(c) = \frac{f(b) - f(a)}{b - a}. \tag{1.9.9} \]
Example 1.9.1. Consider the function $f(x) = x^3 - 3x + 1$ on the interval $[0, 2]$.
By the mean-value theorem, there must exist at least one point $c$ in $(0, 2)$ for
which
\[ f'(c) = \frac{f(2) - f(0)}{2 - 0} = \frac{3 - 1}{2} = 1. \]
Now $f'(x) = 3x^2 - 3$, so $f'(c) = 1$ implies $3c^2 - 3 = 1$. Hence $c = \sqrt{\frac{4}{3}}$, the positive root being the one that lies in $(0, 2)$. Note
that this implies that the tangent line to the graph of $f$ at $x = \sqrt{\frac{4}{3}}$ is parallel
to the line through the endpoints of the graph of $f$, that is, the points $(0, 1)$ and
$(2, 3)$. See Figure 1.9.1.
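When the equation $f'(c) = \frac{f(b) - f(a)}{b - a}$ cannot be solved by hand, a point $c$ can often be located numerically. The Python sketch below uses bisection on $f'(x) - \text{slope}$. Bisection is a swapped-in technique, not part of the theorem, and it assumes that $f'$ is continuous and that $f'(x) - \text{slope}$ has opposite signs at $a$ and $b$, neither of which the mean-value theorem guarantees. The name `mean_value_point` and the tolerance are illustrative.

```python
import math


def mean_value_point(f, f_prime, a, b, tol=1e-12):
    """Bisection for a c in (a, b) with f'(c) = (f(b) - f(a)) / (b - a).

    Assumes f' is continuous and f'(x) - slope changes sign between a and b.
    """
    slope = (f(b) - f(a)) / (b - a)
    lo, hi = a, b
    g_lo = f_prime(lo) - slope
    if g_lo * (f_prime(hi) - slope) > 0:
        raise ValueError("no sign change of f'(x) - slope on [a, b]")
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        g_mid = f_prime(mid) - slope
        if g_lo * g_mid <= 0:
            hi = mid              # a root lies in [lo, mid]
        else:
            lo, g_lo = mid, g_mid  # a root lies in [mid, hi]
    return (lo + hi) / 2.0


def f(x):
    """f(x) = x^3 - 3x + 1, as in Example 1.9.1."""
    return x ** 3 - 3 * x + 1


def f_prime(x):
    """f'(x) = 3x^2 - 3."""
    return 3 * x ** 2 - 3


if __name__ == "__main__":
    c = mean_value_point(f, f_prime, 0.0, 2.0)
    print(c, math.sqrt(4.0 / 3.0))
```

For this example the numerical result can be compared directly with the exact value $\sqrt{\frac{4}{3}}$.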
1.9.2 Increasing and decreasing functions
The preceding discussion leads us to the following definition and theorem.
Definition 1.9.1. We say a function f is increasing on an interval I if, whenever
a < b are points in I, f (a) < f (b). Similarly, we say f is decreasing on I if,
whenever a < b are points in I, f (a) > f (b).
Now suppose $f$ is continuous on an interval $I$ and $f'(x) > 0$ for every $x$ in $I$
which is not an endpoint of $I$. Then given any $a < b$ in $I$, by the mean-value
theorem there exists a point $c$ between $a$ and $b$ for which
\[ \frac{f(b) - f(a)}{b - a} = f'(c) > 0. \tag{1.9.10} \]
Since $b - a > 0$, this implies that $f(b) > f(a)$. Hence $f$ is increasing on $I$. A
similar argument shows that $f$ is decreasing on $I$ if $f'(x) < 0$ for every $x$ in $I$
which is not an endpoint of $I$.
Theorem 1.9.4. Suppose $f$ is defined and continuous on an interval $I$. If $f'(x) > 0$ for every
$x$ in $I$ which is not an endpoint of $I$, then $f$ is increasing on $I$. If $f'(x) < 0$ for
every $x$ in $I$ which is not an endpoint of $I$, then $f$ is decreasing on $I$.
Example 1.9.2. Let $f(x) = 2x^3 - 3x^2 - 12x + 1$. Then
\[ f'(x) = 6x^2 - 6x - 12 = 6(x^2 - x - 2) = 6(x - 2)(x + 1). \]
Hence $f'(x) = 0$ when $x = -1$ and when $x = 2$. Now $x - 2 < 0$ for $x < 2$
and $x - 2 > 0$ for $x > 2$, while $x + 1 < 0$ for $x < -1$ and $x + 1 > 0$ when
$x > -1$. Thus $f'(x) > 0$ when $x < -1$ and when $x > 2$, and $f'(x) < 0$ when
$-1 < x < 2$. It follows that $f$ is increasing on the intervals $(-\infty, -1)$ and
$(2, \infty)$, and decreasing on the interval $(-1, 2)$.
Note that the theorem requires only that we know the sign of $f'$ at points
inside a given interval, not at the endpoints. Hence it actually allows us to make
the slightly stronger statement that f is increasing on the intervals (−∞, −1]
and [2, ∞), and decreasing on the interval [−1, 2].
Since f is increasing on (−∞, −1] and decreasing on [−1, 2], the point (−1, 8)
must be a high point on the graph of f , although not necessarily the highest
point on the graph. We say that f has a local maximum of 8 at x = −1.
Similarly, f is decreasing on [−1, 2] and increasing on [2, ∞), and so the point
(2, −19) must be a low point on the graph of f , although, again, not necessarily
the lowest point on the graph. We say that f has a local minimum of −19 at
x = 2. From this information, we can begin to see why the graph of f looks as
it does in Figure 1.9.2.
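The sign analysis in this example follows a pattern that is easy to automate: split the real line at the points where the derivative vanishes and test one point in each piece. The Python sketch below does this for $f'(x) = 6(x - 2)(x + 1)$. The function name `sign_chart`, the `pad` parameter, and the output format are assumptions made for illustration, and the method is valid only when the supplied list contains every point where the derivative is zero or undefined.

```python
def f_prime(x):
    """Derivative from Example 1.9.2: 6(x - 2)(x + 1)."""
    return 6.0 * (x - 2.0) * (x + 1.0)


def sign_chart(derivative, critical_points, pad=1.0):
    """Label each open interval cut out by the critical points.

    One test point is sampled per interval, so the list must contain
    every point where the derivative is zero or undefined.
    """
    pts = sorted(critical_points)
    edges = [float("-inf")] + pts + [float("inf")]
    # One sample left of the first point, one between each adjacent pair,
    # and one right of the last point.
    samples = ([pts[0] - pad]
               + [(u + v) / 2.0 for u, v in zip(pts, pts[1:])]
               + [pts[-1] + pad])
    chart = []
    for left, right, x in zip(edges, edges[1:], samples):
        label = "increasing" if derivative(x) > 0 else "decreasing"
        chart.append(((left, right), label))
    return chart


if __name__ == "__main__":
    for interval, label in sign_chart(f_prime, [-1.0, 2.0]):
        print(interval, label)
```

The chart reports open intervals; as noted above, the theorem allows the endpoints $-1$ and $2$ to be included as well.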
Definition 1.9.2. We say f has a local maximum at a point c if there exists an
interval (a, b) containing c for which f (c) ≥ f (x) for all x in (a, b). Similarly, we
say f has a local minimum at a point c if there exists an interval (a, b) containing
c for which f (c) ≤ f (x) for all x in (a, b). We say f has a local extremum at c
if f has either a local maximum or a local minimum at c.
Figure 1.9.2: Graph of $f(x) = 2x^3 - 3x^2 - 12x + 1$
We may now rephrase Theorem 1.9.1 as follows.
Theorem 1.9.5. If $f$ is differentiable at $c$ and has a local extremum at $c$, then
$f'(c) = 0$.
As illustrated in the preceding example, we may identify local minimums of
a function f by locating those points at which f changes from decreasing to
increasing, and local maximums by locating those points at which f changes
from increasing to decreasing.
Example 1.9.3. Let $f(x) = x + 2\sin(x)$. Then $f'(x) = 1 + 2\cos(x)$, and so
$f'(x) < 0$ when, and only when,
\[ \cos(x) < -\frac{1}{2}. \]
For $0 \le x \le 2\pi$, this occurs when, and only when,
\[ \frac{2\pi}{3} < x < \frac{4\pi}{3}. \]
Since the cosine function has period $2\pi$, it follows that $f'(x) < 0$ when, and
only when, $x$ is in an interval of the form
\[ \left( \frac{2\pi}{3} + 2\pi n, \frac{4\pi}{3} + 2\pi n \right) \]
for $n = 0, \pm 1, \pm 2, \ldots$. Hence $f$ is decreasing on these intervals and increasing
on intervals of the form
\[ \left( -\frac{2\pi}{3} + 2\pi n, \frac{2\pi}{3} + 2\pi n \right), \]
Figure 1.9.3: Graph of $f(x) = x + 2\sin(x)$
$n = 0, \pm 1, \pm 2, \ldots$. It now follows that $f$ has a local maximum at every point of
the form
\[ x = \frac{2\pi}{3} + 2\pi n \]
and a local minimum at every point of the form
\[ x = \frac{4\pi}{3} + 2\pi n. \]
From this information, we can begin to see why the graph of $f$ looks as it does
in Figure 1.9.3.
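The classification of these points can be checked numerically by looking at the sign of $f'$ just to the left and just to the right of each candidate. The Python sketch below applies that test to $f'(x) = 1 + 2\cos(x)$. The function name `classify` and the offset `eps` are assumptions made for illustration, and a fixed real offset is only a stand-in for examining the sign of $f'$ on whole intervals.

```python
import math


def f_prime(x):
    """Derivative of f(x) = x + 2 sin(x)."""
    return 1.0 + 2.0 * math.cos(x)


def classify(derivative, c, eps=1e-3):
    """First-derivative test using the sign of f' at c - eps and c + eps."""
    left, right = derivative(c - eps), derivative(c + eps)
    if left > 0 and right < 0:
        return "local maximum"   # increasing, then decreasing
    if left < 0 and right > 0:
        return "local minimum"   # decreasing, then increasing
    return "neither"


if __name__ == "__main__":
    for n in (-1, 0, 1):
        x_max = 2.0 * math.pi / 3.0 + 2.0 * math.pi * n
        x_min = 4.0 * math.pi / 3.0 + 2.0 * math.pi * n
        print(n, classify(f_prime, x_max), classify(f_prime, x_min))
```

At a point such as $x = 0$, where $f'$ is positive on both sides, the test reports neither kind of extremum.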
Exercise 1.9.1. Find the intervals where $f(x) = x^3 - 6x$ is increasing and
the intervals where $f$ is decreasing. Use this information to identify any local
maximums or local minimums of $f$.
Exercise 1.9.2. Find the intervals where $f(x) = 5x^3 - 3x^5$ is increasing and
the intervals where $f$ is decreasing. Use this information to identify any local
maximums or local minimums of $f$.
Exercise 1.9.3. Find the intervals where $f(x) = x + \sin(x)$ is increasing and
the intervals where $f$ is decreasing. Use this information to identify any local
maximums or local minimums of $f$.
1.10 Optimization
Optimization problems, that is, problems in which we seek to find the greatest or
smallest value of some quantity, are common in the applications of mathematics.
Because of the extreme-value property, there is a straightforward algorithm
for solving optimization problems involving continuous functions on closed and
bounded intervals. Hence we will treat this case first before considering functions
on other intervals.
Recall that if $f(c)$ is the maximum, or minimum, value of $f$ on some interval
$I$, $c$ is not an endpoint of $I$, and $f$ is differentiable at $c$, then $f'(c) = 0$. Consequently, points at which the
derivative vanishes will play an important role in our work on optimization.
Definition 1.10.1. We call a real number $c$ where $f'(c) = 0$ a stationary point
of $f$.
1.10.1 Optimization on a closed interval
Suppose f is a continuous function on a closed and bounded interval [a, b]. By
the extreme-value property, f attains a maximum, as well as a minimum value,
on [a, b]. In particular, there is a real number c in [a, b] such that f (c) ≥ f (x)
for all x in [a, b]. If c is in (a, b) and f is differentiable at c, then we mu