Reinforcement learning

This is a branch of Machine Learning applied in a setting where a learner has a number of actions to take in a given state - each action will lead to a different reward and we want to maximise it. You are provided with previous transitions and want to build a policy in the sense of a Markov decision process.