15.2.3 Pontryagin's Minimum Principle

Pontryagin's minimum principle^15.5 is closely related to the HJB equation and provides conditions that an optimal trajectory must satisfy. Keep in mind, however, that the minimum principle provides necessary conditions, but not sufficient conditions, for optimality. In contrast, the HJB equation offered sufficient conditions. Using the minimum principle alone, one is often not able to conclude that a trajectory is optimal. In some cases, however, it is quite useful for finding candidate optimal trajectories. Any trajectory that fails to satisfy the minimum principle cannot be optimal.

To understand the minimum principle, we first return to the case of discrete planning. As mentioned previously, the minimum principle is essentially given by (15.7). This can be considered as a specialization of the HJB equation to the special case of applying the optimal action

. This causes the $\min$ to disappear, but along with it the global properties of the HJB equation also vanish. The minimum principle expresses conditions along the optimal trajectory, as opposed to the cost-to-go function over the whole state space. Therefore, it can at best assure local optimality in the space of possible trajectories.

The minimum principle for the continuous case is essentially given by (15.15), which is the continuous-time counterpart to (15.7). However, it is usually expressed in terms of adjoint variables and a Hamiltonian function, in the spirit of Hamiltonian mechanics from Section 13.4.4.

Let $\lambda$ denote an

-dimensional vector of adjoint variables, which are defined as

Under the execution of the optimal action trajectory ${\tilde{u}}^*$ , the HJB equation implies that

The missing piece of information so far is how $\lambda$ evolves over time. It turns out that a system of the form

Remember that $\lambda$ is defined in (15.25) just to keep track of the change in

. It would be helpful to have an explicit form for (15.29). Suppose that

is selected by a feedback plan to yield $u^* = \pi ^*(x)$ . In this case, the Hamiltonian can be interpreted as a function of only

and $\lambda$ . Under this assumption, differentiating the Hamiltonian (15.26) with respect to

yields

The second term in (15.30) is actually ${\dot \lambda}_i$ , although it is hard to see at first. The total differential of $\lambda_i$ with respect to the state is

The transition equations given by ${\dot x}= f(x,u)$ and (15.36) specify the evolution of the system given by the minimum principle. These are analogous to Hamilton's equations (13.198), which were given in Section 13.4.4. The generalized momentum in that context becomes the adjoint variables here.

When applying the minimum principle, it is usually required to use the fact that the optimal action at all times must satisfy (15.28). Often, this is equivalently expressed as

Example 15..4 (Optimal Planning for the Double Integrator) Recall the double integrator system from Example 13.3. Let ${\ddot q} = u$ , ${\cal C}= {\mathbb{R}}$ , and $U = [-1,1] \cup \{u_T\}$ . Imagine a particle that moves in ${\mathbb{R}}$ . The action is a force in either direction and has at most unit magnitude. The state transition equation is ${\dot x}_1 = x_2$ and ${\dot x}_2 = u$ , and $X = {\mathbb{R}}^2$ . The task is to perform optimal motion planning between any two states ${x_{I}}, {x_{G}}\in X$ . From a given initial state ${x_{I}}$ , a goal state ${x_{G}}$ must be reached in minimum time. The cost functional is defined in this case as

for all $x \in X$ and and $u \in U$ such that $u \not = u_T$ .

Using (15.26), the Hamiltonian is defined as

$\displaystyle H(x,u,\lambda) = 1 + \lambda_1 x_2 + \lambda_2 u .$

(15.38)

The optimal action trajectory is obtained from (15.28) as

$\displaystyle u^*(t) = \operatornamewithlimits{argmin}_{u \in [-1,1]} \left\{ 1 + \lambda_1(t) x_2(t) + \lambda_2(t) u(t)\right\}.$

(15.39)

If $\lambda_2(t) < 0$ , then

, and if $\lambda_2(t) > 0$ , then

. Thus, the action may be assigned as $u^*(t) = -{\rm sgn}(\lambda_2(t))$ , if $\lambda_2(t) \not = 0$ . Note that these two cases are the ``bangs'' of the bang-bang control from Section 14.6.3, and they are also the extremal actions used for the planning algorithm in Section 14.4.1. At the boundary case in which $\lambda_2(t) = 0$ , any action in

may be chosen.

The only remaining task is to determine the values of the adjoint variables over time. The adjoint transition equation is obtained from (15.36) as ${\dot \lambda}_1 = 0$ and ${\dot \lambda}_2 = -\lambda_1$ . The solutions are $\lambda_1(t) = c_1$ and $\lambda_2(t) = c_2 - c_1 t$ , in which and are constants that can be determined at from (15.38) and (15.39). The optimal action depends only on the sign of $\lambda_2(t)$ . Since its solution is the equation of a line, it can change signs at most once. Therefore, there are four possible kinds of solutions, depending on the particular ${x_{I}}$ and ${x_{G}}$ :

Pure acceleration, , is applied for all time.
Pure deceleration, , is applied for all time.
Pure acceleration is applied up to some time and is followed immediately by pure deceleration until the final time.
Pure deceleration is applied up to some time followed immediately by pure acceleration until the final time.

For the last two cases,

is often called the switching time, at which point a discontinuity in ${\tilde{u}}^*$ occurs. These two are bang-bang solutions, which were described in Section 14.6.3. $\blacksquare$

This was one of the simplest possible examples, and the optimal solution was easily found because the adjoint variables are linear functions of time. Section 15.3 covers optimal solutions for the Dubins car, the Reeds-Shepp car, and the differential drive, all of which can be established using the minimum principle combined with some geometric arguments. As systems become more complicated, such analysis is unfortunately too difficult. In these cases, sampling-based methods, such as those of Chapter 14, must be used to determine optimal trajectories.

One common complication is the existence of singular arcs along the solution trajectory. These correspond to a degeneracy in

with respect to

over some duration of time. This could be caused, for example, by having

independent of

. In Example 15.4,

became independent of

when $\lambda_2(t) = 0$ ; however, there was no singular arc because this could only occur for an instant of time. If the duration had been longer, then there would be an interval of time over which the optimal action could not be determined. In general, if the Hessian (recall definition from (8.48)) of

with respect to

is a positive definite matrix, then there are no singular arcs (this is often called the Legendre-Clebsch condition). The minimum principle in this case provides a sufficient condition for local optimality in the space of possible state trajectories. If the Hessian is not positive definite for some interval

with

, then additional information is needed to determine the optimal trajectory over the singular arc from

Note that all of this analysis ignores the existence of obstacles. There is nothing to prevent the solutions from attempting to enter an obstacle region. The action set

and cost

can be adjusted to account for obstacles; however, determining an optimal solution from the minimum principle becomes virtually impossible, except in some special cases.

There are other ways to derive the minimum principle. Recall from Section 13.4.4 that Hamilton's equations can be derived from the Euler-Lagrange equation. It should not be surprising that the minimum principle can also be derived using variational principles [95,789]. The minimum principle can also be interpreted as a form of constrained optimization. This yields the interpretation of $\lambda$ as Lagrange multipliers. A very illuminating reference for further study of the minimum principle is Pontryagin's original works [801].