Markov decision process

A Markov decision process is defined by the following data:

  • A set of states $S$,
  • A set of actions $A$ (which might depend on the state, i.e. for each state $s \in S$ we have a set $A_s$),
  • A model which determines, for each pair of states $s, s' \in S$ and each action $a \in A_s$, the transition probability from $s$ to $s'$ given action $a$, i.e. $P(s' \mid s, a)$, and
  • A reward $R(s)$ for being at a given state $s$.

When provided with this data, we are looking for a policy $\pi$ that determines what action $\pi(s) \in A_s$ we take in each state $s$.
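
To make the definition concrete, here is a minimal sketch of this data in Python for a hypothetical three-state example; the names (states, actions, transitions, reward, policy) and the particular numbers are illustrative, not part of the definition above.

    # A toy MDP: states, per-state action sets, transition model, rewards.
    # transitions[(s, a)] maps each successor state s' to P(s' | s, a).
    states = ["s0", "s1", "s2"]

    actions = {                      # A_s may differ from state to state
        "s0": {"stay", "go"},
        "s1": {"stay", "go"},
        "s2": set(),                 # terminal state: no actions available
    }

    transitions = {
        ("s0", "go"):   {"s1": 0.8, "s0": 0.2},
        ("s0", "stay"): {"s0": 1.0},
        ("s1", "go"):   {"s2": 0.9, "s1": 0.1},
        ("s1", "stay"): {"s0": 1.0},
    }

    reward = {"s0": 0.0, "s1": 0.0, "s2": 1.0}   # R(s)

    # A policy picks an action pi(s) in A_s for each non-terminal state.
    policy = {"s0": "go", "s1": "go"}

    # Sanity check: each transition distribution is over legal actions
    # and sums to 1.
    for (s, a), dist in transitions.items():
        assert a in actions[s]
        assert abs(sum(dist.values()) - 1.0) < 1e-9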