Q-learning
Q-learning is a class of value-function-based reinforcement learning algorithms. It learns the Q-function incrementally. We use the transition model in which the learner chooses the action at each iteration.
Pick an initial estimate of the Q-function.
Pick a learning rate.
Lastly, pick a rule for choosing an action in a given state.
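The three choices above can be sketched in a few lines. This is only an illustration under assumptions the notes do not fix: a tabular Q-function initialised to zero, a constant learning rate, and an epsilon-greedy rule for picking actions. The names `N_STATES`, `N_ACTIONS`, `ALPHA`, `EPSILON` and `choose_action` are hypothetical.

```python
import random

N_STATES, N_ACTIONS = 4, 2   # hypothetical problem size
ALPHA = 0.1                  # learning rate (assumed constant)
EPSILON = 0.2                # exploration probability (assumed)

# Initial estimate: Q(s, a) = 0 for every state-action pair.
Q = [[0.0] * N_ACTIONS for _ in range(N_STATES)]

def choose_action(Q, state, epsilon, rng):
    """Epsilon-greedy: explore with probability epsilon, otherwise act greedily."""
    if rng.random() < epsilon:
        return rng.randrange(len(Q[state]))
    row = Q[state]
    # Index of the largest estimate; ties resolve to the lowest action index.
    return max(range(len(row)), key=row.__getitem__)
```

Other action-selection rules would fit the same slot; epsilon-greedy is used here only because it is short to write down.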
Then we update the estimate incrementally after every observed transition.
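The incremental update is usually written as follows. The notation is an assumption, since these notes do not fix one: $Q_t$ is the estimate at time $t$, $\alpha_t$ the learning rate, $\gamma$ a discount factor, and $(s_t, a_t, r_t, s_{t+1})$ the observed state, action, reward and next state.

```latex
Q_{t+1}(s_t, a_t) = (1 - \alpha_t)\, Q_t(s_t, a_t)
  + \alpha_t \left( r_t + \gamma \max_{a'} Q_t(s_{t+1}, a') \right)
```

All other entries of the estimate are left unchanged at step $t$.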
Note that as time advances, the current state changes.
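A minimal sketch of the whole loop, assuming a tabular estimate and a hypothetical four-state chain in which action 1 moves right, action 0 moves left, and reaching the last state pays reward 1 and ends the episode. The environment, `ALPHA`, `GAMMA` and the uniformly random choice of actions are all assumptions made for illustration, not part of the notes.

```python
import random

N_STATES, N_ACTIONS = 4, 2      # hypothetical chain: states 0..3, state 3 is terminal
ALPHA, GAMMA = 0.5, 0.9         # assumed learning rate and discount factor

def step(state, action):
    """Hypothetical deterministic environment: action 1 moves right, action 0 moves left."""
    next_state = min(state + 1, N_STATES - 1) if action == 1 else max(state - 1, 0)
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward

def q_learning(episodes, seed=0):
    rng = random.Random(seed)
    Q = [[0.0] * N_ACTIONS for _ in range(N_STATES)]   # initial estimate
    for _ in range(episodes):
        state = 0
        while state != N_STATES - 1:
            action = rng.randrange(N_ACTIONS)           # uniformly random exploration
            next_state, reward = step(state, action)
            # Move the current estimate toward reward + discounted best next value.
            target = reward + GAMMA * max(Q[next_state])
            Q[state][action] += ALPHA * (target - Q[state][action])
            state = next_state                          # the current state changes over time
    return Q

Q = q_learning(episodes=500)
# Greedy action in each non-terminal state, read off the learned table.
greedy = [max(range(N_ACTIONS), key=Q[s].__getitem__) for s in range(N_STATES - 1)]
```

In this toy chain the greedy policy read off `Q` should prefer action 1 (move right) in every non-terminal state, even though the actions taken during learning were chosen at random.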
Correctness
There is a theorem stating that, for a Markov decision process, if we apply