Example 9.17 was somewhat disheartening due to the existence of multiple Nash equilibria. In general, there could be any number of equilibria. How can each player know which one to play? If they each choose a different one, they are not guaranteed to fall into another equilibrium as in the case of saddle points of zero-sum games. Many of the equilibria can be eliminated by using Pareto optimality, which was explained in Section 9.1.1 and also appeared in Section 7.7.2 as a way to optimally coordinate multiple robots. The idea is to formulate the selection as a multi-objective optimization problem, which fits into Formulation 9.2.
Consider two-dimensional vectors of the form , in which and represent the costs and obtained under the implementation of a Nash equilibrium denoted by . For two different equilibria and , the cost vectors and are obtained. In Example 9.17, these were and . In general, is said to be better than if , , and at least one of the inequalities is strict. In Example 9.17, the equilibrium that produces is clearly better than obtaining because both players benefit.
The definition of ``better'' induces a partial ordering on the space of Nash equilibria. It is only partial because some vectors are incomparable. Consider, for example, and . The first one is preferable to , and the second is preferred by . In game theory, the Nash equilibria that are minimal with respect to this partial ordering are called admissible. They could alternatively be called Pareto optimal.
The best situation is when a game has one Nash equilibrium. If there are multiple Nash equilibria, then there is some hope that only one of them is admissible. In this case, it is hoped that the rational players are intelligent enough to figure out that any nonadmissible equilibria should be discarded. Unfortunately, there are many games that have multiple admissible Nash equilibria. In this case, analysis of the game indicates that the players must communicate or collaborate in some way to eliminate the possibility of regret. Otherwise, regret is unavoidable in the worst case. It is also possible that there are no Nash equilibria, but, fortunately, by allowing randomized strategies, a randomized Nash equilibrium is always guaranteed to exist. This will be covered after the following two examples.
Both and are Nash equilibria, which yield cost
vectors and , respectively. Neither solution is
better than the other; therefore, they are both admissible. There is
no way to avoid the possibility of regret unless the players cooperate
(you probably already knew this).
The following is one of the most famous nonzero-sum games.
The cost represents the number of years that the player will be sentenced to prison. The cost matrices are assigned as
What should the players do? What would you do? If they could make a coordinated decision, then it seems that a good choice would be for both to refuse, which results in costs . In this case, however, there would be regret because each player would think that he had a chance to go free (receiving cost 0 by refusing). If they were to play the game a second time, they might be inclined to change their decisions.
The Nash equilibrium for this problem is for both of them to
cooperate, which results in . Thus, they pay a price for not
being able to communicate and coordinate their strategy. This
solution is also a security strategy for the players, because it
achieves the lowest cost using worst-case analysis.
Steven M LaValle 2020-08-14