Example 9.17 was somewhat disheartening due to the existence of multiple Nash equilibria. In general, there could be any number of equilibria. How can each player know which one to play? If they each choose a different one, they are not guaranteed to fall into another equilibrium as in the case of saddle points of zero-sum games. Many of the equilibria can be eliminated by using Pareto optimality, which was explained in Section 9.1.1 and also appeared in Section 7.7.2 as a way to optimally coordinate multiple robots. The idea is to formulate the selection as a multi-objective optimization problem, which fits into Formulation 9.2.
Consider two-dimensional vectors of the form , in which
and
represent the costs
and
obtained under the
implementation of a Nash equilibrium denoted by
. For two
different equilibria
and
, the cost vectors
and
are obtained. In Example 9.17, these
were
and
. In general,
is said to be better than
if
,
, and at least
one of the inequalities is strict. In Example 9.17, the
equilibrium that produces
is clearly better than obtaining
because both players benefit.
The definition of ``better'' induces a partial ordering on the space
of Nash equilibria. It is only partial because some vectors are
incomparable. Consider, for example, and
. The first
one is preferable to
, and the second is preferred by
. In
game theory, the Nash equilibria that are minimal with respect to this
partial ordering are called admissible. They could alternatively be called
Pareto optimal.
The best situation is when a game has one Nash equilibrium. If there are multiple Nash equilibria, then there is some hope that only one of them is admissible. In this case, it is hoped that the rational players are intelligent enough to figure out that any nonadmissible equilibria should be discarded. Unfortunately, there are many games that have multiple admissible Nash equilibria. In this case, analysis of the game indicates that the players must communicate or collaborate in some way to eliminate the possibility of regret. Otherwise, regret is unavoidable in the worst case. It is also possible that there are no Nash equilibria, but, fortunately, by allowing randomized strategies, a randomized Nash equilibrium is always guaranteed to exist. This will be covered after the following two examples.
Both and
are Nash equilibria, which yield cost
vectors
and
, respectively. Neither solution is
better than the other; therefore, they are both admissible. There is
no way to avoid the possibility of regret unless the players cooperate
(you probably already knew this).
The following is one of the most famous nonzero-sum games.
The cost represents the number of years that the player will be
sentenced to prison. The cost matrices are assigned as
What should the players do? What would you do? If they could make a
coordinated decision, then it seems that a good choice would be for
both to refuse, which results in costs . In this case,
however, there would be regret because each player would think that he
had a chance to go free (receiving cost 0 by refusing). If they
were to play the game a second time, they might be inclined to change
their decisions.
The Nash equilibrium for this problem is for both of them to
cooperate, which results in . Thus, they pay a price for not
being able to communicate and coordinate their strategy. This
solution is also a security strategy for the players, because it
achieves the lowest cost using worst-case analysis.
Steven M LaValle 2020-08-14