Regret

Suppose that the players apply security strategies, $u^{*} = 2$ and $v^{*} = 4$. This results in a cost of $L(2,4)$. How do the players feel after the outcome? ${{\rm P}_1}$ may feel satisfied because, given that ${{\rm P}_2}$ selected $v^{*} = 4$, it received the lowest cost possible. On the other hand, ${{\rm P}_2}$ may regret its decision in light of the action chosen by ${{\rm P}_1}$. If it had known that $u = 2$ would be chosen, then it could have picked a different action $v$ and received a cost that is better for it than $L(2,4)$. If the game were to be repeated, then ${{\rm P}_2}$ would want to change its strategy in hopes of tricking ${{\rm P}_1}$ to obtain a higher reward.

Is there a way to keep both players satisfied? Any time there is a gap between $\underline{L}^*$ and $\overline{L}^*$, there is regret for one or both players. If $r_1$ and $r_2$ denote the amount of regret experienced by ${{\rm P}_1}$ and ${{\rm P}_2}$, respectively, then the total regret is

$\displaystyle r_1 + r_2 = \overline{L}^*- \underline{L}^*.$

(9.50)
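
One way to see why (9.50) holds is sketched here, under the assumption that each player's regret is measured as the difference between the cost it actually received and the best cost it could have obtained had it known the other player's action in advance. With $L(u,v)$ denoting the cost, ${{\rm P}_1}$ minimizing, and ${{\rm P}_2}$ maximizing, this gives

$\displaystyle r_1 = L(u^*,v^*) - \min_{u} L(u,v^*) \quad \mbox{and} \quad r_2 = \max_{v} L(u^*,v) - L(u^*,v^*).$

Because $u^*$ and $v^*$ are security strategies, $\max_{v} L(u^*,v) = \overline{L}^*$ and $\min_{u} L(u,v^*) = \underline{L}^*$. The $L(u^*,v^*)$ terms cancel when $r_1$ and $r_2$ are added, and only the gap between the upper and lower values remains.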

Thus, the only way to satisfy both players is to obtain upper and lower values such that $\underline{L}^*= \overline{L}^*$. These are properties of the game, however, and they are not up to the players to decide. For some games, the values are equal, but for many games, $\underline{L}^*< \overline{L}^*$. Fortunately, by using randomized strategies, the upper and lower values always coincide; this is covered in Section 9.3.3.
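
The relationship between regret and the gap between the upper and lower values can be checked numerically. The following is a minimal Python sketch, not code from the text: the function name `security_analysis`, the small cost matrix, and the regret definitions (cost received versus best cost in hindsight) are all assumptions made for illustration. Rows are actions of ${{\rm P}_1}$ (the minimizer), columns are actions of ${{\rm P}_2}$ (the maximizer), and indices start at 0.

```python
def security_analysis(L):
    """Compute security strategies, upper/lower values, and regrets.

    L[i][j] is the cost when P1 picks row i and P2 picks column j.
    P1 minimizes the cost; P2 maximizes it. Regret is assumed to be the
    difference between the cost received and the best cost in hindsight.
    """
    rows = range(len(L))
    cols = range(len(L[0]))

    # P1's security strategy: the row whose worst-case (maximum) cost is smallest.
    u_star = min(rows, key=lambda i: max(L[i]))
    upper = max(L[u_star])  # upper value of the game

    # P2's security strategy: the column whose worst-case (minimum) cost is largest.
    v_star = max(cols, key=lambda j: min(L[i][j] for i in rows))
    lower = min(L[i][v_star] for i in rows)  # lower value of the game

    outcome = L[u_star][v_star]
    r1 = outcome - lower  # P1 could have paid as little as `lower` against v_star
    r2 = upper - outcome  # P2 could have forced as much as `upper` against u_star
    return {"u_star": u_star, "v_star": v_star, "upper": upper,
            "lower": lower, "outcome": outcome, "r1": r1, "r2": r2}


# Hypothetical 2x2 cost matrix, chosen only for illustration.
example = [[3, 0],
           [1, 2]]
result = security_analysis(example)
print(result)
# Total regret equals the gap between the upper and lower values.
assert result["r1"] + result["r2"] == result["upper"] - result["lower"]
```

For this hypothetical matrix, the upper value is 2 and the lower value is 1, so the total regret is 1. All of it falls on ${{\rm P}_2}$, which mirrors the situation described above: ${{\rm P}_1}$ received the lowest cost possible against the chosen column, whereas ${{\rm P}_2}$ could have done better in hindsight.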

Steven M LaValle 2020-08-14