Markov decision process

A Markov decision process is defined by the following data:

  • A set of states $S$,
  • A set of actions $A$ (which might depend on the state, i.e. for each state $s \in S$ we have a set $A_s$),
  • A model which determines, for each pair of states $s, s' \in S$ and each action $a \in A_s$, the transition probability from $s$ to $s'$ given action $a$, i.e. $P(s' \mid s, a)$, and
  • A reward $R(s)$ for being at a given state $s$.

When provided with this data, we are looking for a policy $\pi$ that determines what action $\pi(s) \in A_s$ we take in each state $s$.
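
To make the definition concrete, here is a minimal sketch of this data in Python for a hypothetical three-state example; the names (states, actions, transitions, reward, policy) and the particular numbers are illustrative, not part of the definition above.

    # A toy MDP: states, per-state action sets, transition model, rewards.
    # transitions[(s, a)] maps each successor state s' to P(s' | s, a).
    states = ["s0", "s1", "s2"]

    actions = {                      # A_s may differ from state to state
        "s0": {"stay", "go"},
        "s1": {"stay", "go"},
        "s2": set(),                 # terminal state: no actions available
    }

    transitions = {
        ("s0", "go"):   {"s1": 0.8, "s0": 0.2},
        ("s0", "stay"): {"s0": 1.0},
        ("s1", "go"):   {"s2": 0.9, "s1": 0.1},
        ("s1", "stay"): {"s0": 1.0},
    }

    reward = {"s0": 0.0, "s1": 0.0, "s2": 1.0}   # R(s)

    # A policy picks an action pi(s) in A_s for each non-terminal state.
    policy = {"s0": "go", "s1": "go"}

    # Sanity check: each transition distribution is over legal actions
    # and sums to 1.
    for (s, a), dist in transitions.items():
        assert a in actions[s]
        assert abs(sum(dist.values()) - 1.0) < 1e-9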