Markov decision process
A Markov decision process is defined by the following data:
- A set $S$ of states,
- A set $A$ of actions (which may depend on the state, i.e. for each state $s$ we have a set $A_s$),
- A model which determines, for each pair of states $s, s'$ and each action $a$, the probability of transitioning from $s$ to $s'$ given action $a$, i.e. $P(s' \mid s, a)$, and
- A reward $R(s)$ for being at a given state $s$.

When provided with this data, we look for a policy $\pi \colon S \to A$ that determines which action we take in each state. A concrete sketch of this data follows below.
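As a minimal sketch (not from the original), the definition above translates directly into a small data structure. All names here (`MDP`, `transition`, `reward`, the two-state weather example) are illustrative assumptions, not part of the source.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Hashable

State = Hashable
Action = Hashable

@dataclass(frozen=True)
class MDP:
    """Bundles the data defining a Markov decision process."""
    states: frozenset                                    # the set S of states
    actions: Callable[[State], frozenset]                # state-dependent action sets A_s
    transition: Callable[[State, Action, State], float] # P(s' | s, a)
    reward: Callable[[State], float]                     # reward R(s) for being at state s

# A policy assigns an action to each state.
Policy = Dict[State, Action]

# Hypothetical two-state example: "move" switches states with probability 0.8,
# while "stay" switches only with probability 0.1.
def weather_transition(s: State, a: Action, s2: State) -> float:
    p_switch = 0.8 if a == "move" else 0.1
    return p_switch if s2 != s else 1.0 - p_switch

weather = MDP(
    states=frozenset({"sunny", "rainy"}),
    actions=lambda s: frozenset({"stay", "move"}),
    transition=weather_transition,
    reward=lambda s: 1.0 if s == "sunny" else 0.0,
)

# One possible policy: stay in the rewarding state, leave the other.
policy: Policy = {"sunny": "stay", "rainy": "move"}
```

Representing the model as a function $P(s' \mid s, a)$ mirrors the definition directly; for small state spaces a lookup table (e.g. a nested dictionary) would serve equally well.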