Quality function (RL)
Similar to the value function, except that a quality function accounts for both the state and the action. So functionally
this is the quality of taking action a when you are in state s. Given a policy π,
we can calculate the ideal quality function to be
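
One standard way to write such a quality function is as the expected discounted return obtained by taking action a in state s and following the policy π afterwards. This is a sketch under assumptions: the discount factor \gamma (between 0 and 1) and the per-step reward r_t are symbols introduced here for illustration, not taken from the text above. If "ideal" is read as "optimal", the second line gives the corresponding quantity, the best value achievable over all policies.

```latex
% Quality of taking action a in state s, then following policy \pi
% (\gamma and r_t are assumed symbols: discount factor and reward at step t)
Q^{\pi}(s, a) = \mathbb{E}_{\pi}\left[ \sum_{t=0}^{\infty} \gamma^{t} r_{t} \,\middle|\, s_{0} = s,\ a_{0} = a \right]

% Optimal quality function: the maximum over all policies
Q^{*}(s, a) = \max_{\pi} Q^{\pi}(s, a)
```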