- Suppose that a single-stage decision-making problem is defined in which there are two objectives and a continuous set of actions, $U$. The cost vector is $L(u)$. Determine the set of Pareto-optimal actions.
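The specific action interval and cost vector were not preserved in this copy, so the following sketch uses a hypothetical interval $U = [-2, 3]$ and two hypothetical quadratic objectives. It approximates the Pareto-optimal set by sampling the action interval and discarding dominated actions:

```python
def pareto_optimal_actions(actions, cost):
    """Return the sampled actions whose cost vectors are nondominated.

    An action is dominated if some other action is no worse in every
    objective and strictly better in at least one.
    """
    pts = [(u, cost(u)) for u in actions]
    keep = []
    for u, c in pts:
        dominated = any(
            all(dj <= cj for dj, cj in zip(d, c))
            and any(dj < cj for dj, cj in zip(d, c))
            for v, d in pts
            if v != u
        )
        if not dominated:
            keep.append(u)
    return keep

# Hypothetical instance: U = [-2, 3], L(u) = (u^2, (u - 1)^2).
U = [-2.0 + 0.01 * k for k in range(501)]
pareto = pareto_optimal_actions(U, lambda u: (u * u, (u - 1.0) ** 2))
print(min(pareto), max(pareto))  # approximately the interval [0, 1]
```

For these two objectives the first is minimized at 0 and the second at 1, so every action between them trades one objective against the other and the Pareto set is the whole interval $[0, 1]$.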
- Let $L$ be the given cost matrix.
- Use nondeterministic reasoning to find the minimax decision and worst-case cost.
- Use probabilistic reasoning to find the best expected-case decision and expected cost.
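The matrix for this exercise did not survive in this copy, so the sketch below substitutes a hypothetical 3x3 cost matrix and an assumed uniform prior to illustrate both computations side by side:

```python
# Rows are the decision maker's actions; columns are nature's choices.
# The entries are hypothetical, standing in for the matrix in Exercise 2.
L = [
    [1, 3, 2],
    [2, 2, 2],
    [4, 0, 1],
]

# Nondeterministic (worst-case) reasoning: minimize the maximum row cost.
worst = [max(row) for row in L]
minimax_action = min(range(len(L)), key=lambda i: worst[i])
minimax_cost = worst[minimax_action]

# Probabilistic reasoning: minimize expected cost under a prior over nature.
P = [1 / 3, 1 / 3, 1 / 3]  # assumed uniform prior
expected = [sum(p * c for p, c in zip(P, row)) for row in L]
best_action = min(range(len(L)), key=lambda i: expected[i])

print(minimax_action, minimax_cost)        # action 1, worst-case cost 2
print(best_action, expected[best_action])  # action 2, expected cost 5/3
```

Note that the two criteria can disagree, as they do here: the conservative row wins under worst-case reasoning, while a riskier row has lower expected cost.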

- Many reasonable decision rules are possible, other than those
considered in this chapter.
- Exercise 2(a) reflects extreme pessimism. Suppose instead that extreme optimism is used. Select the choice that optimizes the best-case cost for the matrix in Exercise 2.
- One approach is to develop a coefficient of optimism, $\alpha \in [0, 1]$, which allows one to interpolate between the two extreme scenarios. Thus, a decision, $u \in U$, is chosen by minimizing

$$\alpha \Big[ \min_{\theta \in \Theta} L(u, \theta) \Big] + (1 - \alpha) \Big[ \max_{\theta \in \Theta} L(u, \theta) \Big] .$$

Determine the optimal decision for this scenario under all possible choices for $\alpha \in [0, 1]$. Give your answer as a list of choices, each with a specified range of $\alpha$.
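As a sketch of the kind of answer expected, the code below sweeps $\alpha$ over a fine grid for a hypothetical cost matrix (the original matrix was not preserved) and records each $\alpha$ value at which the optimal action changes:

```python
L = [
    [1, 5],
    [2, 3],
    [4, 2],
]  # hypothetical stand-in for the matrix in Exercise 2

def hurwicz(row, alpha):
    """Interpolate between extreme optimism (alpha=1) and pessimism (alpha=0)."""
    return alpha * min(row) + (1 - alpha) * max(row)

changes = []
prev = None
for step in range(1001):
    alpha = step / 1000
    best = min(range(len(L)), key=lambda i: hurwicz(L[i], alpha))
    if best != prev:
        changes.append((round(alpha, 3), best))
        prev = best
print(changes)  # (alpha where the optimal action changes, new optimal action)
```

For this matrix each row's score is linear in $\alpha$, so the answer is a small list of actions, each optimal on one interval of $\alpha$; here the pessimistic choice gives way to the optimistic one at $\alpha = 2/3$.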

- Suppose that after making a decision, you observe the choice made by nature. How does the cost that you received compare with the best cost that could have been obtained if you had chosen something else, given this choice by nature? This difference in costs can be considered as *regret* or minimum ``Doh!'' (Psychologists have argued that some people make choices based on minimizing regret; it reflects how badly you wish you had done something else after making the decision.)
- Develop an expression for the worst-case regret, and use it to make a minimax regret decision using the matrix from Exercise 2.
- Develop an expression for the expected regret, and use it to make a minimum expected regret decision.
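Both regret criteria can be sketched numerically. Since the matrix from Exercise 2 was not preserved, the code below uses a hypothetical matrix and an assumed uniform prior; regret is computed column by column against the best achievable cost for each choice of nature:

```python
L = [
    [1, 3, 2],
    [2, 2, 2],
    [4, 0, 1],
]  # hypothetical stand-in for the matrix in Exercise 2

# Regret of action i against nature's choice j: how much worse L[i][j] is
# than the best cost achievable in column j.
col_best = [min(L[i][j] for i in range(len(L))) for j in range(len(L[0]))]
R = [[L[i][j] - col_best[j] for j in range(len(L[0]))] for i in range(len(L))]

# Minimax regret: minimize the worst-case regret over nature's choices.
worst_regret = [max(row) for row in R]
minimax_regret_action = min(range(len(R)), key=lambda i: worst_regret[i])

# Expected regret under an assumed uniform prior over nature's choices.
P = [1 / 3, 1 / 3, 1 / 3]
expected_regret = [sum(p * r for p, r in zip(P, row)) for row in R]
min_expected_action = min(range(len(R)), key=lambda i: expected_regret[i])

print(R)
print(minimax_regret_action, min_expected_action)
```

The regret matrix has a zero in every column (the action that would have been best in hindsight), and the two regret criteria can pick different actions, just as minimax and expected cost can.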

- Using the matrix from Exercise 2, consider the set of all probability distributions for nature. Characterize the set of all distributions for which the minimax decision and the best expected decision result in the same choice. This indicates how to provide reverse justification for priors.
- Consider a Bayesian decision-theory scenario with cost function $L(u, \theta)$. Show that the decision rule never changes if $L(u, \theta)$ is replaced by $a L(u, \theta) + b$, for any $a > 0$ and any $b \in \mathbb{R}$.
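A quick numerical sanity check of this invariance, using a hypothetical two-action cost table and an assumed prior: a positive scaling and an arbitrary shift change the expected costs but not their ordering, so the Bayes decision is unchanged.

```python
# Hypothetical two-action, three-outcome Bayesian decision problem.
L = {
    "u1": [1.0, 4.0, 2.0],
    "u2": [3.0, 0.0, 5.0],
}
P = [0.5, 0.3, 0.2]  # assumed prior over nature's three choices

def bayes_decision(cost, a=1.0, b=0.0):
    """Pick the action minimizing E[a * L(u, theta) + b]."""
    exp = {u: sum(p * (a * c + b) for p, c in zip(P, row))
           for u, row in cost.items()}
    return min(exp, key=exp.get)

base = bayes_decision(L)
# For several choices of a > 0 and b, the decision is unchanged.
for a, b in [(2.0, 0.0), (0.5, -3.0), (10.0, 7.0)]:
    assert bayes_decision(L, a, b) == base
print(base)
```

This is only a spot check, of course; the exercise asks for a proof, which follows from the linearity of expectation and the monotonicity of $x \mapsto a x + b$ for $a > 0$.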
- Suppose that there are two classes, $\omega_1$ and $\omega_2$, with given prior probabilities $P(\omega_1)$ and $P(\omega_2)$. The observation space, $Y$, is $\mathbb{R}$. Recall from probability theory that the normal (or Gaussian) probability density function is

$$p(y) = \frac{1}{\sqrt{2\pi}\,\sigma} \, e^{-(y-\mu)^2 / (2\sigma^2)} ,$$

in which $\mu$ denotes the mean and $\sigma^2$ denotes the variance. Suppose that $p(y \mid \omega_1)$ and $p(y \mid \omega_2)$ are normal densities with specified means and variances. Find the optimal classification rule, $\gamma : Y \rightarrow \{\omega_1, \omega_2\}$. You are welcome to solve the problem numerically (by computer) or graphically (by careful function plotting). In either case, carefully explain how you arrived at the answer.
- Let $L$ be the given cost matrix.
- Use nondeterministic reasoning to find the minimax decision and worst-case cost.
- Use probabilistic reasoning to find the best expected-case decision and expected cost.
- Characterize the set of all probability distributions for which the minimax decision and the best expected decision result in the same choice.
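For the two-class Gaussian classification exercise above, the means, variances, and priors were not preserved in this copy; the sketch below assumes hypothetical values (equal priors, unit variances, means 0 and 3) and evaluates the Bayes rule pointwise by comparing $P(\omega)\, p(y \mid \omega)$:

```python
import math

def normal_pdf(y, mu, sigma):
    """Normal density with mean mu and standard deviation sigma."""
    return math.exp(-(y - mu) ** 2 / (2 * sigma ** 2)) / (
        math.sqrt(2 * math.pi) * sigma
    )

# Hypothetical parameters standing in for the ones given in the exercise.
prior = {"w1": 0.5, "w2": 0.5}
params = {"w1": (0.0, 1.0), "w2": (3.0, 1.0)}  # (mean, std dev)

def classify(y):
    """Bayes rule: pick the class maximizing P(w) * p(y | w)."""
    return max(prior, key=lambda w: prior[w] * normal_pdf(y, *params[w]))

# With equal priors and equal variances the boundary is the midpoint, 1.5.
print(classify(1.4), classify(1.6))
```

With unequal variances the decision boundary is no longer a single midpoint (the log-likelihood ratio becomes quadratic in $y$), which is exactly the structure the exercise asks you to uncover.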

- In a *constant-sum game*, the costs for any $u \in U$ and $v \in V$ add to yield

$$L_1(u, v) + L_2(u, v) = c \qquad (9.96)$$

for some constant $c$ that is independent of $u$ and $v$. Show that any constant-sum game can be transformed into a zero-sum game, and that saddle point solutions can be found using techniques for the zero-sum formulation.
- Formalize Example 9.7 as a zero-sum game, and compute security strategies for the players. What is the expected value of the game?
- Suppose that for two zero-sum games, there exists some nonzero constant $c$ for which the cost matrix of one game is obtained by multiplying all entries of the cost matrix of the other by $c$. Prove that these two games must have the same deterministic and randomized saddle points.
- In the same spirit as Exercise 11, prove that two zero-sum games have the same deterministic and randomized saddle points if a constant $c$ is added to all entries of the cost matrix.
- Prove that multiple Nash equilibria of a nonzero-sum game specified by matrices $A$ and $B$ are interchangeable if $(A, B)$ as a game yields the same Nash equilibria as the zero-sum game $(A, -A)$.
- Analyze the game of Rock-Paper-Scissors for three players. For each player, assign a cost of $1$ for losing, $0$ for a tie, and $-1$ for winning. Specify the cost functions. Is it possible to avoid regret? Does it have a deterministic Nash equilibrium? Can you find a randomized Nash equilibrium?
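The cost functions depend on how wins are defined with three players; the sketch below adopts one common convention (if exactly two distinct throws appear, players holding the winning throw get cost $-1$ and the rest get $+1$; otherwise everyone ties at $0$) and searches for deterministic Nash equilibria by brute force:

```python
from itertools import product

BEATS = {"R": "S", "P": "R", "S": "P"}  # key beats value
MOVES = ["R", "P", "S"]

def costs(profile):
    """Cost vector for one joint throw: -1 win, 0 tie, +1 lose (assumed rule)."""
    thrown = set(profile)
    if len(thrown) != 2:  # all the same, or all three distinct: a tie
        return [0, 0, 0]
    a, b = thrown
    winner = a if BEATS[a] == b else b
    return [-1 if m == winner else 1 for m in profile]

def is_pure_nash(profile):
    """No player can lower its own cost by a unilateral deviation."""
    base = costs(profile)
    for i in range(3):
        for alt in MOVES:
            if alt == profile[i]:
                continue
            dev = list(profile)
            dev[i] = alt
            if costs(dev)[i] < base[i]:
                return False
    return True

pure_nash = [p for p in product(MOVES, repeat=3) if is_pure_nash(p)]
print(pure_nash)  # under this convention: no deterministic Nash equilibrium
```

Under this convention every profile admits a profitable deviation (a loser can switch to a tying or winning throw, and any tie can be broken), so the brute-force search returns an empty list.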
- Compute the randomized equilibrium point for the following
zero-sum game:

Indicate the randomized strategies for the players and the resulting expected value of the game.

**Implementations**

- Consider estimating the value of an unknown parameter, $\theta \in \mathbb{R}$. The prior probability density is a normal,

$$p(\theta) = \frac{1}{\sqrt{2\pi}\,\sigma} \, e^{-(\theta-\mu)^2 / (2\sigma^2)} ,$$

with given $\mu$ and $\sigma$. Suppose that a sequence, $y_1$, $y_2$, $\ldots$, $y_k$, of observations is made and that each $p(y_i \mid \theta)$ is a normal density with mean $\theta$ and a given variance. Suppose that $u$ represents your guess of the parameter value. The task is to select $u$ to minimize the expectation of the cost $L(u, \theta)$. Suppose that the ``true'' value of $\theta$ is given. Determine $u^*$, the action that minimizes the expected cost after observing $y_i$ for every $i \in \{1, \ldots, k\}$.
- Determine $u^*$ for each of the three given values of $k$.
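The numerical parameters of this exercise were not preserved in this copy. As a sketch under assumed values, and assuming a squared-error cost (under which the expected-cost-minimizing guess $u^*$ is the posterior mean), the following applies the standard conjugate normal-normal update after each simulated observation:

```python
import random

# Assumed parameters (the originals were not preserved in this copy).
prior_mean, prior_var = 0.0, 16.0  # prior N(mu, sigma^2) over theta
obs_var = 4.0                      # each y_i ~ N(theta, obs_var)
true_theta = 5.0

def posterior_update(mean, var, y):
    """Conjugate normal-normal update of the posterior after observing y."""
    new_var = 1.0 / (1.0 / var + 1.0 / obs_var)
    new_mean = new_var * (mean / var + y / obs_var)
    return new_mean, new_var

random.seed(0)
mean, var = prior_mean, prior_var
for k in range(1, 101):
    y = random.gauss(true_theta, obs_var ** 0.5)
    mean, var = posterior_update(mean, var, y)
    if k in (1, 10, 100):
        # Under squared-error cost, u* is the posterior mean.
        print(f"k={k}: u* = {mean:.3f}, posterior variance = {var:.3f}")
```

The posterior variance shrinks deterministically as $1/(1/\sigma_0^2 + k/\sigma_y^2)$, so the guess $u^*$ concentrates around the true parameter as more observations arrive.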

- Implement an algorithm that computes a randomized saddle point
for zero-sum games. Assume that one player has no more than two
actions and the other may have any finite number of actions.
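A minimal sketch of the requested implementation, assuming the two-action player is the row player who minimizes (mixing with probability $p$ on the first row) and the column player maximizes: the expected cost against each column is a line in $p$, so the optimal mix is found exactly by evaluating the upper envelope of these lines at their pairwise intersections and at the endpoints.

```python
def solve_2xn(A):
    """Randomized saddle point of a 2 x n zero-sum game (row player minimizes).

    With probability p on row 0, the expected cost against column j is
    v_j(p) = p * A[0][j] + (1 - p) * A[1][j].  The column player plays the
    maximizing column, so the value is min_p max_j v_j(p), attained at an
    endpoint of [0, 1] or where two of the lines intersect.
    """
    n = len(A[0])
    line = lambda j, p: p * A[0][j] + (1 - p) * A[1][j]

    candidates = {0.0, 1.0}
    for j in range(n):
        for k in range(j + 1, n):
            denom = (A[0][j] - A[1][j]) - (A[0][k] - A[1][k])
            if denom != 0:
                p = (A[1][k] - A[1][j]) / denom
                if 0.0 <= p <= 1.0:
                    candidates.add(p)

    best_p = min(candidates, key=lambda p: max(line(j, p) for j in range(n)))
    value = max(line(j, best_p) for j in range(n))
    return best_p, value

# Example: a matching-pennies-like game; the optimal mix is p = 0.5, value 0.
p, v = solve_2xn([[1, -1], [-1, 1]])
print(p, v)
```

This returns only the row player's security strategy and the game value; the column player's equilibrium mix can be recovered from the (at most two) columns that are active at `best_p`.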
- Suppose that a $K$-stage decision-making problem is defined using multiple objectives. There is a finite state space $X$ and a finite action set $U(x)$ for each $x \in X$. A state transition equation, $x_{k+1} = f(x_k, u_k)$, gives the next state from a current state and input. There are $N$ cost functionals of the form

$$L_i = \sum_{k=1}^{K} l_i(x_k, u_k) + l_{F,i}(x_F) , \qquad (9.99)$$

in which $F = K + 1$. Assume that $l_{F,i}(x_F) = 0$ if $x_F \in X_G$ (for some goal region $X_G \subset X$) and $l_{F,i}(x_F) = \infty$ otherwise. Assume that there is no termination action (which simplifies the problem). Develop a value-iteration approach that finds the complete set of Pareto-optimal plans as efficiently as possible. If two or more plans produce the same cost vector, then only one representative needs to be returned.
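One way to sketch such a method, on a small hypothetical instance with two objectives: run backward value iteration in which each state stores a *set* of nondominated cost vectors instead of a single scalar, pruning dominated vectors (and duplicate values) at every stage.

```python
def dominated(a, b):
    """True if vector b is at least as good as a everywhere, better somewhere."""
    return all(y <= x for x, y in zip(a, b)) and any(y < x for x, y in zip(a, b))

def prune(vectors):
    """Keep only nondominated cost vectors, one representative per value."""
    out = []
    for v in set(vectors):
        if not any(dominated(v, w) for w in vectors if w != v):
            out.append(v)
    return out

def pareto_value_iteration(X, U, f, l, K):
    """Backward value iteration over sets of nondominated cost vectors.

    l(x, u) returns a tuple of per-stage costs (two objectives here); plans
    that cannot reach the goal contribute no vectors at all.
    """
    V = {x: [(0.0, 0.0)] if x in GOAL else [] for x in X}  # final stage
    for _ in range(K):
        V = {
            x: prune([
                tuple(c + d for c, d in zip(l(x, u), tail))
                for u in U(x)
                for tail in V[f(x, u)]
            ])
            for x in X
        }
    return V

# Tiny hypothetical instance: states 0..3, move right by 1 or 2, goal = {3},
# with a zero-cost "stay" action (u = 0) available only at the goal.
GOAL = {3}
X = [0, 1, 2, 3]
U = lambda x: (([1, 2] if x + 2 <= 3 else [1]) if x < 3 else [0])
f = lambda x, u: x + u
l = lambda x, u: (1.0 if u else 0.0, (0.0, 1.0, 3.0)[u])  # (steps, effort)
V = pareto_value_iteration(X, U, f, l, K=3)
print(sorted(V[0]))  # nondominated (steps, effort) vectors from state 0
```

Here a long step costs more effort but saves a stage, so from state 0 two incomparable cost vectors survive: three cheap steps, or two steps including one expensive jump. The pruning at every stage is what keeps the sets, and hence the computation, as small as possible.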

Steven M LaValle 2020-08-14