Alex's Notes


Policy (MDP)

May 14, 2025 (1 min read)

reinforcement-learning


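In a Markov decision process with state set $S$ and action set $A$, a policy describes how an actor will behave in a given situation. It is given by a function $\pi: S \rightarrow A$ where $\pi(s) \in A_s$, with $A_s \subseteq A$ the set of actions available in state $s$. This concept can be extended to a probabilistic policy. Let $\Delta(A)$ be the set of probability distributions over $A$. Then a probabilistic policy is given by $\pi: S \rightarrow \Delta(A)$ where, if $\pi(s)(a)$ is non-zero, then $a \in A_s$. In other words, a probabilistic policy only places probability on actions that are available in the current state.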
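As a minimal sketch of the two kinds of policy, the Python below represents a deterministic policy as a mapping from states to actions and a probabilistic policy as a mapping from states to distributions over actions. The toy states, actions, probabilities, and helper names (`AVAILABLE_ACTIONS`, `is_valid_deterministic`, `is_valid_probabilistic`, `sample_action`) are all hypothetical and chosen only for illustration; they do not come from the note.

```python
import random

# Hypothetical toy MDP: the actions available in each state.
AVAILABLE_ACTIONS = {
    "low_battery": ["recharge", "wait"],
    "high_battery": ["search", "wait"],
}

# Deterministic policy: exactly one available action per state.
deterministic_policy = {
    "low_battery": "recharge",
    "high_battery": "search",
}

# Probabilistic policy: a probability distribution over actions per state.
probabilistic_policy = {
    "low_battery": {"recharge": 0.9, "wait": 0.1},
    "high_battery": {"search": 0.7, "wait": 0.3},
}


def is_valid_deterministic(policy, available):
    """Check that every state is mapped to an action available in that state."""
    return all(
        state in policy and policy[state] in actions
        for state, actions in available.items()
    )


def is_valid_probabilistic(policy, available, tol=1e-9):
    """Check that each state has a distribution supported on available actions."""
    for state, actions in available.items():
        dist = policy.get(state)
        if dist is None:
            return False
        if any(p < 0 for p in dist.values()):
            return False
        if abs(sum(dist.values()) - 1.0) > tol:
            return False
        # Any action with non-zero probability must be available in this state.
        if any(p > 0 and a not in actions for a, p in dist.items()):
            return False
    return True


def sample_action(policy, state, rng=random):
    """Draw one action for `state` according to a probabilistic policy."""
    dist = policy[state]
    actions = list(dist)
    weights = [dist[a] for a in actions]
    return rng.choices(actions, weights=weights, k=1)[0]
```

A deterministic policy can be viewed as the special case of a probabilistic one that puts probability 1 on a single action, which is why the second validity check subsumes the first in spirit. Calling `sample_action(probabilistic_policy, "low_battery")` returns either `"recharge"` or `"wait"`, never an action unavailable in that state.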


