If $\phi$ is a convex functional on a (possibly infinite-dimensional) vector space $V$, then we say $\phi_{\mathrm{sg}}$ is a subgradient for $\phi$ at $v \in V$ if $\phi_{\mathrm{sg}}$ is a linear functional on $V$, and we have
\[
\phi(z) \geq \phi(v) + \phi_{\mathrm{sg}}(z - v) \quad \text{for all } z \in V. \tag{13.3}
\]
The subdifferential $\partial\phi(v)$ consists of all subgradients of $\phi$ at $v$; note that it is a set of linear functionals on $V$.
If $V = \mathbf{R}^n$, then every linear functional on $V$ has the form $z \mapsto g^T z$ for some vector $g \in \mathbf{R}^n$, and our two definitions of subgradient are therefore the same, provided we ignore the distinction between the vector $g \in \mathbf{R}^n$ and the linear functional on $\mathbf{R}^n$ given by the inner product with $g$.
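As a concrete numerical check of the finite-dimensional case, the following Python snippet verifies the subgradient inequality (13.3) for the $\ell_1$ norm; the function, the point $x$, and the particular subgradient $g$ are illustrative choices of our own, not taken from the text. (For $f(x) = \|x\|_1$, any $g$ with $g_i = \operatorname{sign}(x_i)$ where $x_i \neq 0$ and $g_i \in [-1,1]$ where $x_i = 0$ is a subgradient.)

```python
import numpy as np

# Illustrative example (not from the text): for f(x) = ||x||_1, the vector
# g below is a subgradient at x.  We check the defining inequality
#     f(z) >= f(x) + g^T (z - x)   for all z
# on a batch of random test points.

rng = np.random.default_rng(0)

def f(x):
    return np.abs(x).sum()

x = np.array([1.0, -2.0, 0.0])
g = np.array([1.0, -1.0, 0.3])   # 0.3 is in [-1, 1] for the zero component

for _ in range(1000):
    z = rng.normal(size=3) * 5
    assert f(z) >= f(x) + g @ (z - x) - 1e-12

print("subgradient inequality holds at all test points")
```

Changing the third component of $g$ to any value outside $[-1, 1]$ would make the check fail, since such a $g$ is no longer a subgradient.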
13.1.2  Quasigradients
For quasiconvex functions, there is a concept analogous to the subgradient. Suppose $\phi : \mathbf{R}^n \to \mathbf{R}$ is quasiconvex, which we recall from section 6.2.2 means that
\[
\phi(\lambda x + (1-\lambda)\tilde{x}) \leq \max\{\phi(x), \phi(\tilde{x})\} \quad \text{for all } 0 \leq \lambda \leq 1, \; x, \tilde{x} \in \mathbf{R}^n.
\]
We say that $g$ is a quasigradient for $\phi$ at $x$ if
\[
\phi(z) \geq \phi(x) \quad \text{whenever } g^T(z - x) \geq 0. \tag{13.4}
\]
This simply means that the hyperplane $g^T(z - x) = 0$ forms a simple cut for $\phi$, exactly as in figure 13.2: if we are searching for a minimizer of $\phi$, we can rule out the half-space $g^T(z - x) > 0$.
If $\phi$ is differentiable and $\nabla\phi(x) \neq 0$, then $\nabla\phi(x)$ is a quasigradient; if $\phi$ is convex, then (13.2) shows that any subgradient is also a quasigradient. It can be shown that every quasiconvex function has at least one quasigradient at every point. Note that the length of a quasigradient is irrelevant (for our purposes): all that matters is its direction, or equivalently, the cutting-plane for $\phi$ that it determines.
Any algorithm for convex optimization that uses only the cutting-planes that are determined by subgradients will also work for quasiconvex functions, if we substitute quasigradients for subgradients. It is not possible to form any deep-cut for a quasiconvex function.
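To sketch how a quasigradient yields a valid cut, the following Python check uses $f(x) = \sqrt{|x|}$, a quasiconvex but nonconvex scalar function of our own choosing (its sublevel sets are intervals, but it has no subgradient at most points). At $x = 4$, $g = 1$ is a quasigradient, and we verify condition (13.4) on a grid.

```python
import math

# Illustrative example (not from the text): f(x) = sqrt(|x|) is
# quasiconvex but not convex.  At x = 4 the scalar g = 1 is a
# quasigradient: whenever g*(z - x) >= 0 we must have f(z) >= f(x).
# That half-space is exactly what the simple cut rules out when
# searching for a minimizer.

def f(x):
    return math.sqrt(abs(x))

x, g = 4.0, 1.0
for i in range(-100, 101):
    z = i / 5.0               # grid over [-20, 20]
    if g * (z - x) >= 0:      # the half-space ruled out by the cut
        assert f(z) >= f(x)

print("quasigradient cut (13.4) verified on the grid")
```

Note that, as the text observes, only the direction of $g$ matters here: replacing $g = 1$ by any positive multiple leaves the cut, and the check, unchanged.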
In the infinite-dimensional case, we will say that a linear functional $\phi_{\mathrm{qg}}$ on $V$ is a quasigradient for the quasiconvex functional $\phi$ at $v \in V$ if
\[
\phi(z) \geq \phi(v) \quad \text{whenever } \phi_{\mathrm{qg}}(z - v) \geq 0.
\]
As discussed above, this agrees with our definition for $V = \mathbf{R}^n$, provided we do not distinguish between vectors and the associated inner-product linear functionals.
13.1.3  Subgradients and Directional Derivatives
In this section we briefly discuss the directional derivative, a concept of differential calculus that is more familiar than the subgradient. We will not use this concept in the optimization algorithms we present in the next chapter; we mention it because it is used in descent methods, the most common algorithms for optimization.
We define the directional derivative of $\phi$ at $x$ in the direction $\delta x$ as
\[
\phi'(x, \delta x) = \lim_{h \searrow 0} \frac{\phi(x + h\,\delta x) - \phi(x)}{h}
\]
(the notation $h \searrow 0$ means that $h$ converges to 0 from above). It can be shown
that for convex $\phi$ this limit always exists. Of course, if $\phi$ is differentiable at $x$, then
\[
\phi'(x, \delta x) = \nabla\phi(x)^T \delta x.
\]
We say that $\delta x$ is a descent direction for $\phi$ at $x$ if $\phi'(x, \delta x) < 0$.
The directional derivative tells us how $\phi$ changes if $x$ is moved slightly in the direction $\delta x$, since for small $h$,
\[
\phi\left( x + h\,\frac{\delta x}{\|\delta x\|} \right) \approx \phi(x) + h\,\frac{\phi'(x, \delta x)}{\|\delta x\|}.
\]
The steepest descent direction of $\phi$ at $x$ is defined as
\[
\delta x_{\mathrm{sd}} = \operatorname*{argmin}_{\|\delta x\| = 1} \phi'(x, \delta x).
\]
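Continuing the small example above (our own, not from the text), the steepest descent direction can be approximated by brute force: for $f(x) = \max(x_1, x_2)$ at $x = (1, 1)$, minimizing $f'(x, d) = \max(d_1, d_2)$ over the unit circle gives $d = (-1/\sqrt{2}, -1/\sqrt{2})$.

```python
import math

# Brute-force sketch: minimize the directional derivative
# f'(x, d) = max(d1, d2) of f(x) = max(x1, x2) at x = (1, 1)
# over unit vectors d = (cos t, sin t).

best, best_d = float("inf"), None
for k in range(100000):
    t = 2 * math.pi * k / 100000
    d = (math.cos(t), math.sin(t))
    val = max(d)                 # exact f'(x, d) at the kink
    if val < best:
        best, best_d = val, d

# The minimizer is d = (-1/sqrt(2), -1/sqrt(2)), with value -1/sqrt(2).
assert abs(best - (-1 / math.sqrt(2))) < 1e-3
assert abs(best_d[0] - best_d[1]) < 1e-2
print("steepest descent direction approx:", best_d)
```

This illustrates the remark that follows: finding the steepest descent direction is an optimization problem in its own right, whereas producing a single subgradient is usually much cheaper.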
In general the directional derivatives, descent directions, and the steepest descent direction of $\phi$ at $x$ can be described in terms of the subdifferential at $x$ (see the Notes and References at the end of the chapter). In many cases it is considerably more difficult to find a descent direction or the steepest descent direction of $\phi$ at $x$ than a single subgradient of $\phi$ at $x$.
If $\phi$ is differentiable at $x$, and $\nabla\phi(x) \neq 0$, then $-\nabla\phi(x)$ is a descent direction for $\phi$ at $x$. It is not true, however, that the negative of any nonzero subgradient provides a descent direction: we can have $g \in \partial\phi(x)$, $g \neq 0$, but $-g$ not a descent direction for $\phi$ at $x$. As an example, the level curves of a convex function $\phi$ are shown in figure 13.4(a), together with a point $x$ and a nonzero subgradient $g$. Note that $\phi$ increases for any movement along the directions $\pm g$, so, in particular, $-g$ is not a descent direction. Negatives of the subgradients at non-optimal points are, however, descent directions for the distance to a (or any) minimizer, i.e., if $\phi(x^\star) = \min_z \phi(z)$ and $\psi(z) = \|z - x^\star\|$, then $\psi'(x, -g) < 0$ for any $g \in \partial\phi(x)$. Thus, moving slightly in the direction $-g$ will decrease the distance to the (any) minimizer $x^\star$, as shown in figure 13.4(b).
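These two properties can be seen numerically in a small example of our own (not the one in figure 13.4): for $f(x) = |x_1| + 2|x_2|$, with unique minimizer $x^\star = 0$, the vector $g = (1, 2)$ is a subgradient at $x = (1, 0)$, yet a small step along $-g$ increases $f$ while decreasing the distance to $x^\star$.

```python
import math

# Illustrative example: f(x) = |x1| + 2*|x2| has minimizer x* = 0.
# At x = (1, 0), g = (1, 2) is a subgradient (the second component may
# be anything in [-2, 2] since x2 = 0).  A small step along -g is NOT
# a descent direction for f, but it DOES decrease ||x - x*||.

def f(x):
    return abs(x[0]) + 2 * abs(x[1])

x, g, t = (1.0, 0.0), (1.0, 2.0), 0.01
step = (x[0] - t * g[0], x[1] - t * g[1])

assert f(step) > f(x)                      # f increases along -g ...
assert math.hypot(*step) < math.hypot(*x)  # ... but distance to 0 drops
print("f:", f(x), "->", f(step), "; dist:",
      math.hypot(*x), "->", math.hypot(*step))
```

This is exactly the behavior exploited in the next chapter: the iterates approach a minimizer even though the functional values need not decrease monotonically.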
A consequence of these properties is that the optimization algorithms described
in the next chapter do not necessarily generate sequences of decreasing functional
values (as would a descent method).