This section briefly introduces the basic ideas of a framework that
has been highly popular in the artificial intelligence community in
recent years. It was developed and used primarily by machine learning
researchers [19,930], and therefore this section is
called reinforcement learning. The problem generally involves
computing optimal plans for probabilistic infinite-horizon problems.
The basic idea is to combine the problems of learning the probability
distribution,
, and computing the optimal plan into the
same algorithm.