Regret

Suppose that the players apply security strategies, $ u^{*} = 2$ and $ v^{*} = 4$. This results in a cost of $ L(2,4) = 1$. How do the players feel after the outcome? $ {{\rm P}_1}$ may feel satisfied because, given that $ {{\rm P}_2}$ selected $ v^{*} = 4$, it received the lowest cost possible. On the other hand, $ {{\rm P}_2}$ may regret its decision in light of the action chosen by $ {{\rm P}_1}$. If it had known that $ u=2$ would be chosen, then it could have picked $ v = 2$ to receive cost $ L(2,2) = 2$, which is better for $ {{\rm P}_2}$, the maximizer, than $ L(2,4) = 1$. If the game were to be repeated, then $ {{\rm P}_2}$ would want to change its strategy in hopes of tricking $ {{\rm P}_1}$ and obtaining a higher reward.

Is there a way to keep both players satisfied? Any time there is a gap between $ \underline{L}^*$ and $ \overline{L}^*$, there is regret for one or both players. If $ r_1$ and $ r_2$ denote the amount of regret experienced by $ {{\rm P}_1}$ and $ {{\rm P}_2}$, respectively, then the total regret is

$\displaystyle r_1 + r_2 = \overline{L}^*- \underline{L}^*.$ (9.50)
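One way to see why this holds, assuming that each player's regret is measured as the difference between the cost actually obtained and the best cost that player could have obtained had it known the other player's action: because $ v^{*}$ is a security strategy for $ {{\rm P}_2}$, the best that $ {{\rm P}_1}$ could have done against it is $ \min_u L(u,v^{*}) = \underline{L}^*$, and because $ u^{*}$ is a security strategy for $ {{\rm P}_1}$, the best that $ {{\rm P}_2}$ could have done against it is $ \max_v L(u^{*},v) = \overline{L}^*$. Under that assumption,

$\displaystyle r_1 = L(u^{*},v^{*}) - \underline{L}^*, \qquad r_2 = \overline{L}^*- L(u^{*},v^{*}),$

and the outcome $ L(u^{*},v^{*})$ cancels in the sum. In the example above, if $ v = 2$ is indeed the best response of $ {{\rm P}_2}$ to $ u = 2$, then $ r_1 = 1 - 1 = 0$ and $ r_2 = 2 - 1 = 1$, so the total regret is $ 1$.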

Thus, the only way to satisfy both players is to obtain upper and lower values such that $ \underline{L}^*= \overline{L}^*$. These are properties of the game, however, and they are not up to the players to decide. For some games the values are equal, but for many games $ \underline{L}^*< \overline{L}^*$. Fortunately, by using randomized strategies, the upper and lower values always coincide; this is covered in Section 9.3.3.
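The relationship between security strategies, the upper and lower values, and regret can be checked numerically. The following is a minimal sketch in Python, not code from the text. The cost matrix `A` is hypothetical: it was chosen only so that it reproduces the numbers quoted in the example ($ u^{*} = 2$, $ v^{*} = 4$, $ L(2,4) = 1$, $ L(2,2) = 2$). The function names `security_strategies` and `regrets` are also assumptions made for illustration, and regret is taken to mean the gap between the cost obtained and the best cost available in hindsight.

```python
# Hypothetical cost matrix: rows are actions of P1 (the minimizer),
# columns are actions of P2 (the maximizer). It is NOT taken from the
# text; it merely matches the numbers quoted in the example.
A = [
    [3, 0,  4, 2],
    [0, 2, -1, 1],
    [5, 1,  2, 3],
]


def security_strategies(A):
    """Return (u_star, v_star, upper, lower) using 0-based indices."""
    # Worst case for P1 in each row: P2 picks the largest cost.
    row_worst = [max(row) for row in A]
    # Worst case for P2 in each column: P1 picks the smallest cost.
    col_worst = [min(col) for col in zip(*A)]
    upper = min(row_worst)   # upper value of the game
    lower = max(col_worst)   # lower value of the game
    return row_worst.index(upper), col_worst.index(lower), upper, lower


def regrets(A, u, v):
    """Regret of each player in hindsight, given the actions played."""
    outcome = A[u][v]
    r1 = outcome - min(row[v] for row in A)  # P1 wanted a lower cost
    r2 = max(A[u]) - outcome                 # P2 wanted a higher cost
    return r1, r2


u_star, v_star, upper, lower = security_strategies(A)
r1, r2 = regrets(A, u_star, v_star)

# Report actions with 1-based numbering, as in the text.
print("u* =", u_star + 1, " v* =", v_star + 1)
print("upper value =", upper, " lower value =", lower)
print("r1 =", r1, " r2 =", r2, " total regret =", r1 + r2)
```

For this particular matrix only $ {{\rm P}_2}$ experiences regret, and the total equals the gap between the upper and lower values. Replacing `A` with a matrix that has a saddle point would make both regrets zero.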

Steven M LaValle 2020-08-14