
steepest descent technique, which is based on an idea that we discussed in Section 2.4. Recall

that the negative gradient −∇ f gives the direction of the fastest rate of decrease of a function

f. The crux of the steepest descent idea, then, is that starting from some initial point, you

move a certain amount in the direction of −∇ f at that point. Wherever that takes you


becomes your new point, and you then just keep repeating that procedure until eventually

(hopefully) you reach the point where f has its smallest value. There is a “pure” steepest

descent method, and a multitude of variations on it that improve the rate of convergence,

ease of calculation, etc. In fact, Newton’s algorithm can be interpreted as a modified steepest

descent method. For more discussion of this, and of nonlinear programming in general, see

BAZARAA, SHERALI and SHETTY.
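To make the idea concrete, here is a minimal sketch of the "pure" steepest descent method in Python. The book's own program listing is not reproduced in this preview, so the function names, the fixed step size alpha, and the stopping tolerance below are our choices for illustration; the sketch uses the function f(x, y) = (x − 2)^4 + (x − 2y)^2 from Exercise 1 below.

```python
# A minimal "pure" steepest descent sketch. The fixed step size alpha is an
# illustrative choice; practical variants pick it adaptively (e.g. line search).

def grad_f(x, y):
    # Gradient of f(x, y) = (x - 2)**4 + (x - 2*y)**2
    fx = 4*(x - 2)**3 + 2*(x - 2*y)
    fy = -4*(x - 2*y)
    return fx, fy

def steepest_descent(x, y, alpha=0.05, tol=1e-6, max_iter=100000):
    for _ in range(max_iter):
        fx, fy = grad_f(x, y)
        if (fx**2 + fy**2)**0.5 < tol:      # stop when the gradient is nearly zero
            break
        x, y = x - alpha*fx, y - alpha*fy   # move a small amount along -grad f
    return x, y

print(steepest_descent(0.0, 3.0))   # slowly approaches the minimum at (2, 1)
```

Even on this simple function the pure method crawls once it gets near the minimum (the gradient of (x − 2)^4 is tiny when x is close to 2), which is exactly the kind of slow convergence that the variations mentioned above try to improve.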

Exercises

C

1. Recall Example 2.21 from the previous section, where we showed that the point (2, 1) was a global minimum for the function f(x, y) = (x − 2)^4 + (x − 2y)^2. Notice that our computer program can be modified fairly easily to use this function (just change the return values in the fx, fy, fxx, fyy and fxy function definitions to use the appropriate partial derivative).

Either modify that program or write one of your own in a programming language of your

choice to show that Newton’s algorithm does lead to the point (2, 1). First use the initial

point (0, 3), then use the initial point (3, 2), and compare the results. Make sure that your

program attempts to do 100 iterations of the algorithm. Did anything strange happen

when your program ran? If so, how do you explain it? (Hint: Something strange should happen.)
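If you do not have the book's program at hand, the following Python sketch implements a standard form of Newton's algorithm for minimizing a function of two variables: the update (x, y) ← (x, y) − H⁻¹∇f written out componentwise with D = fxx·fyy − fxy^2. The partial derivatives are named fx, fy, fxx, fyy, fxy as in the exercise; everything else (the newton function, the printing loop) is our own choice, not the book's listing.

```python
# Partial derivatives of f(x, y) = (x - 2)**4 + (x - 2*y)**2.
def fx(x, y):  return 4*(x - 2)**3 + 2*(x - 2*y)
def fy(x, y):  return -4*(x - 2*y)
def fxx(x, y): return 12*(x - 2)**2 + 2
def fyy(x, y): return 8
def fxy(x, y): return -4

def newton(x, y, iterations=100):
    for n in range(iterations):
        gx, gy = fx(x, y), fy(x, y)
        a, b, c = fxx(x, y), fyy(x, y), fxy(x, y)
        D = a*b - c*c            # determinant of the Hessian
        if D == 0:               # for this f, D = 96*(x - 2)**2, so D -> 0 as x -> 2
            print("D = 0 at iteration", n, "- stopping")
            break
        x, y = x - (gx*b - gy*c)/D, y - (gy*a - gx*c)/D
        print(n + 1, x, y)
    return x, y

newton(0.0, 3.0)   # then try newton(3.0, 2.0) and compare
```

Note that for this particular f the Hessian determinant works out to D = 96(x − 2)^2, which vanishes as the iterates approach x = 2; that degeneracy is a plausible source of the "strange" behavior the hint alludes to, and is why the division guard above matters.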

2. There is a version of Newton's algorithm for solving a system of two equations

f₁(x, y) = 0 and f₂(x, y) = 0,

where f₁(x, y) and f₂(x, y) are smooth real-valued functions:

Pick an initial point (x₀, y₀). For n = 0, 1, 2, 3, ..., define:

$$x_{n+1} = x_n - \frac{\begin{vmatrix} f_1(x_n, y_n) & f_2(x_n, y_n) \\ \dfrac{\partial f_1}{\partial y}(x_n, y_n) & \dfrac{\partial f_2}{\partial y}(x_n, y_n) \end{vmatrix}}{D(x_n, y_n)}\,, \qquad y_{n+1} = y_n + \frac{\begin{vmatrix} f_1(x_n, y_n) & f_2(x_n, y_n) \\ \dfrac{\partial f_1}{\partial x}(x_n, y_n) & \dfrac{\partial f_2}{\partial x}(x_n, y_n) \end{vmatrix}}{D(x_n, y_n)}\,,$$

where

$$D(x_n, y_n) = \frac{\partial f_1}{\partial x}(x_n, y_n)\,\frac{\partial f_2}{\partial y}(x_n, y_n) - \frac{\partial f_1}{\partial y}(x_n, y_n)\,\frac{\partial f_2}{\partial x}(x_n, y_n)\,.$$

Then the sequence of points $(x_n, y_n)_{n=1}^{\infty}$ converges to a solution. Write a computer program that uses this algorithm to find approximate solutions to the system of equations

$$f_1(x, y) = \sin(xy) - x - y = 0 \quad\text{and}\quad f_2(x, y) = e^{2x} - 2x + 3y = 0\,.$$

Show that you get two different solutions when using (0, 0) and (1, 1) for the initial point (x₀, y₀).
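A direct translation of this iteration into Python might look like the following minimal sketch (the function names and the fixed count of 100 iterations are our own choices):

```python
import math

# The system f1(x, y) = 0, f2(x, y) = 0 and its partial derivatives.
def f1(x, y):  return math.sin(x*y) - x - y
def f2(x, y):  return math.exp(2*x) - 2*x + 3*y

def f1x(x, y): return y*math.cos(x*y) - 1   # partial f1 / partial x
def f1y(x, y): return x*math.cos(x*y) - 1   # partial f1 / partial y
def f2x(x, y): return 2*math.exp(2*x) - 2   # partial f2 / partial x
def f2y(x, y): return 3.0                   # partial f2 / partial y

def newton_system(x, y, iterations=100):
    for _ in range(iterations):
        D = f1x(x, y)*f2y(x, y) - f1y(x, y)*f2x(x, y)
        # The two 2x2 determinants from the formula above, each divided by D;
        # both components of the tuple are evaluated at the old (x, y).
        x, y = (x - (f1(x, y)*f2y(x, y) - f2(x, y)*f1y(x, y))/D,
                y + (f1(x, y)*f2x(x, y) - f2(x, y)*f1x(x, y))/D)
    return x, y

print(newton_system(0.0, 0.0))   # one solution
print(newton_system(1.0, 1.0))   # should give a different solution
```

In practice you would stop once f₁ and f₂ are both within some tolerance of 0; the fixed 100 iterations here simply mirror the loop suggested in Exercise 1.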


2.7 Constrained Optimization: Lagrange Multipliers

In Sections 2.5 and 2.6 we were concerned with finding maxima and minima of functions

without any constraints on the variables (other than being in the domain of the function).

What would we do if there were constraints on the variables? The following example illustrates a simple case of this type of problem.

Example 2.24. For a rectangle whose perimeter is 20 m, find the dimensions that will maximize the area.

Solution: The area A of a rectangle with width x and height y is A = xy. The perimeter P of the rectangle is then given by the formula P = 2x + 2y. Since we are given that the perimeter

P = 20, this problem can be stated as:

Maximize : f(x, y) = xy

given : 2x + 2y = 20

The reader is probably familiar with a simple method, using single-variable calculus, for

solving this problem. Since we must have 2 x + 2 y = 20, then we can solve for, say, y in

terms of x using that equation. This gives y = 10 − x, which we then substitute into f to get

f(x, y) = xy = x(10 − x) = 10x − x^2. This is now a function of x alone, so we now just have to maximize the function f(x) = 10x − x^2 on the interval [0, 10]. Since f′(x) = 10 − 2x = 0 ⇒ x = 5

and f ′′(5) = −2 < 0, then the Second Derivative Test tells us that x = 5 is a local maximum

for f , and hence x = 5 must be the global maximum on the interval [0,10] (since f = 0 at

the endpoints of the interval). So since y = 10 − x = 5, then the maximum area occurs for a

rectangle whose width and height both are 5 m.

Notice in the above example that the ease of the solution depended on being able to solve

for one variable in terms of the other in the equation 2x + 2y = 20. But what if that were not

possible (which is often the case)? In this section we will use a general method, called the

Lagrange multiplier method,^10 for solving constrained optimization problems:

Maximize (or minimize) : f(x, y) (or f(x, y, z))

given : g(x, y) = c (or g(x, y, z) = c) for some constant c

The equation g( x, y) = c is called the constraint equation, and we say that x and y are constrained by g( x, y) = c. Points ( x, y) which are maxima or minima of f ( x, y) with the condition that they satisfy the constraint equation g( x, y) = c are called constrained maximum

or constrained minimum points, respectively. Similar definitions hold for functions of three

variables.

The Lagrange multiplier method for solving such problems can now be stated:

^10 Named after the French mathematician Joseph Louis Lagrange (1736-1813).


Theorem 2.7. Let f(x, y) and g(x, y) be smooth functions, and suppose that c is a scalar constant such that ∇g(x, y) ≠ 0 for all (x, y) that satisfy the equation g(x, y) = c. Then to solve the constrained optimization problem

Maximize (or minimize) : f(x, y)

given : g(x, y) = c ,

find the points (x, y) that solve the equation ∇f(x, y) = λ∇g(x, y) for some constant λ (the number λ is called the Lagrange multiplier). If there is a constrained maximum or minimum, then it must be such a point.

A rigorous proof of the above theorem requires use of the Implicit Function Theorem, which is beyond the scope of this text.^11 Note that the theorem only gives a necessary condition for a point to be a constrained maximum or minimum. Whether a point (x, y) that satisfies ∇f(x, y) = λ∇g(x, y) for some λ actually is a constrained maximum or minimum can sometimes be determined by the nature of the problem itself. For instance, in Example 2.24 it was clear that there had to be a global maximum.

So how can you tell when a point that satisfies the condition in Theorem 2.7 really is a constrained maximum or minimum? The answer is that it depends on the constraint function g(x, y), together with any implicit constraints. It can be shown^12 that if the constraint equation g(x, y) = c (plus any hidden constraints) describes a bounded set B in R^2, then the constrained maximum or minimum of f(x, y) will occur either at a point (x, y) satisfying ∇f(x, y) = λ∇g(x, y) or at a "boundary" point of the set B.

In Example 2.24 the constraint equation 2x + 2y = 20 describes a line in R^2, which by itself is not bounded. However, there are "hidden" constraints, due to the nature of the problem, namely 0 ≤ x, y ≤ 10, which cause that line to be restricted to a line segment in R^2 (including the endpoints of that line segment), which is bounded.

Example 2.25. For a rectangle whose perimeter is 20 m, use the Lagrange multiplier method

to find the dimensions that will maximize the area.

Solution: As we saw in Example 2.24, with x and y representing the width and height,

respectively, of the rectangle, this problem can be stated as:

Maximize : f(x, y) = xy

given : g(x, y) = 2x + 2y = 20

Then solving the equation ∇f(x, y) = λ∇g(x, y) for some λ means solving the equations

^11 See TAYLOR and MANN, § 6.8 for more detail.

^12 Again, see TAYLOR and MANN.


$$\frac{\partial f}{\partial x} = \lambda\,\frac{\partial g}{\partial x} \quad\text{and}\quad \frac{\partial f}{\partial y} = \lambda\,\frac{\partial g}{\partial y}\,,$$

namely:

$$y = 2\lambda\,, \qquad x = 2\lambda$$

The general idea is to solve for λ in both equations, then set those expressions equal (since

they both equal λ) to solve for x and y. Doing this we get

$$\frac{y}{2} = \lambda = \frac{x}{2} \quad\Rightarrow\quad x = y\,,$$

so now substitute either of the expressions for x or y into the constraint equation to solve for x and y:

$$20 = g(x, y) = 2x + 2y = 2x + 2x = 4x \quad\Rightarrow\quad x = 5 \quad\Rightarrow\quad y = 5$$

There must be a maximum area, since the minimum area is 0 and f(5, 5) = 25 > 0, so the point (5, 5) that we found (called a constrained critical point) must be the constrained maximum.

∴ The maximum area occurs for a rectangle whose width and height both are 5 m.
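As a cross-check, constrained critical points like this can also be found symbolically. Here is a minimal sketch in Python using the SymPy library (assuming SymPy is installed; the variable names are ours). It solves the Lagrange conditions ∇f = λ∇g together with the constraint:

```python
from sympy import symbols, solve

x, y, lam = symbols('x y lam', real=True)
f = x*y            # the area to maximize
g = 2*x + 2*y      # the perimeter, constrained to equal 20

# Lagrange conditions grad f = lam * grad g, plus the constraint g = 20.
equations = [f.diff(x) - lam*g.diff(x),   # y - 2*lam = 0
             f.diff(y) - lam*g.diff(y),   # x - 2*lam = 0
             g - 20]
print(solve(equations, [x, y, lam], dict=True))   # x = 5, y = 5, lam = 5/2
```

The single constrained critical point (5, 5) agrees with the calculation above, and the multiplier λ = 5/2 will reappear at the end of this section.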

Example 2.26. Find the points on the circle x^2 + y^2 = 80 which are closest to and farthest from the point (1, 2).

Solution: The distance d from any point (x, y) to the point (1, 2) is

$$d = \sqrt{(x-1)^2 + (y-2)^2}\,,$$

and minimizing the distance is equivalent to minimizing the square of the distance. Thus the problem can be stated as:

Maximize (and minimize) : f(x, y) = (x − 1)^2 + (y − 2)^2

given : g(x, y) = x^2 + y^2 = 80

Solving ∇f(x, y) = λ∇g(x, y) means solving the following equations:

$$2(x - 1) = 2\lambda x\,, \qquad 2(y - 2) = 2\lambda y$$

Note that x ≠ 0, since otherwise we would get −2 = 0 in the first equation. Similarly, y ≠ 0. So we can solve both equations for λ as follows:

$$\frac{x-1}{x} = \lambda = \frac{y-2}{y} \quad\Rightarrow\quad xy - y = xy - 2x \quad\Rightarrow\quad y = 2x$$


Substituting this into g(x, y) = x^2 + y^2 = 80 yields 5x^2 = 80, so x = ±4. So the two constrained critical points are (4, 8) and (−4, −8). Since f(4, 8) = 45 and f(−4, −8) = 125, and since there must be points on the circle closest to and farthest from (1, 2), then it must be the case that (4, 8) is the point on the circle closest to (1, 2) and (−4, −8) is the farthest from (1, 2) (see Figure 2.7.1).

[Figure 2.7.1: the circle x^2 + y^2 = 80 with the points (1, 2), (4, 8) and (−4, −8)]

Notice that since the constraint equation x^2 + y^2 = 80 describes a circle, which is a bounded set in R^2, then we were guaranteed that the constrained critical points we found were indeed the constrained maximum and minimum.

The Lagrange multiplier method can be extended to functions of three variables.

Example 2.27.

Maximize (and minimize) : f(x, y, z) = x + z

given : g(x, y, z) = x^2 + y^2 + z^2 = 1

Solution: Solve the equation ∇f(x, y, z) = λ∇g(x, y, z):

$$1 = 2\lambda x\,, \qquad 0 = 2\lambda y\,, \qquad 1 = 2\lambda z$$

The first equation implies λ ≠ 0 (otherwise we would have 1 = 0), so we can divide by λ in the second equation to get y = 0, and we can divide by λ in the first and third equations to get x = 1/(2λ) = z. Substituting these expressions into the constraint equation g(x, y, z) = x^2 + y^2 + z^2 = 1 yields the constrained critical points (1/√2, 0, 1/√2) and (−1/√2, 0, −1/√2). Since f(1/√2, 0, 1/√2) > f(−1/√2, 0, −1/√2), and since the constraint equation x^2 + y^2 + z^2 = 1 describes a sphere (which is bounded) in R^3, then (1/√2, 0, 1/√2) is the constrained maximum point and (−1/√2, 0, −1/√2) is the constrained minimum point.

So far we have not attached any significance to the value of the Lagrange multiplier λ. We

needed λ only to find the constrained critical points, but made no use of its value. It turns

out that λ gives an approximation of the change in the value of the function f ( x, y) that we

wish to maximize or minimize, when the constant c in the constraint equation g( x, y) = c is

changed by 1.


For example, in Example 2.25 we showed that the constrained optimization problem

Maximize : f(x, y) = xy

given : g(x, y) = 2x + 2y = 20

had the solution (x, y) = (5, 5), and that λ = x/2 = y/2. Thus, λ = 2.5. In a similar fashion we

could show that the constrained optimization problem

Maximize : f (