Gradient descent

This algorithm uses differentiation to minimise an error function for a model. Suppose we have a model with some parameters θ, and suppose the error function E(θ) is differentiable with respect to θ. Then we pick a learning rate η and repeatedly perturb the weights by −η∇E(θ) until E settles in some local minimum.
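The update rule above can be sketched in a few lines. This is a minimal illustration, not a production implementation: the error function E(θ) = (θ − 3)² and its gradient are toy examples chosen so the minimiser is known.

```python
def grad_descent(grad, theta, lr=0.1, steps=100):
    """Minimal gradient descent: repeatedly step against the gradient."""
    for _ in range(steps):
        theta = theta - lr * grad(theta)  # perturb weights by -lr * dE/dtheta
    return theta

# E(theta) = (theta - 3)^2 has gradient 2 * (theta - 3) and minimum at theta = 3
theta_min = grad_descent(lambda t: 2 * (t - 3), theta=0.0)
```

Starting from θ = 0, the iterates converge toward 3, the minimiser of this quadratic.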

Note that convergence is to a local minimum rather than a global one. There are techniques that help with this, such as restarting from several random initialisations, but in general it is a hard problem to verify that you are in a global minimum.
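One common workaround is random restarts: run gradient descent from several random starting points and keep the result with the lowest error. A sketch under assumed toy settings (the function, step size, and start range below are illustrative, not from the source):

```python
import random

def best_of_restarts(error, grad, n_restarts=20, lr=0.01, steps=2000):
    """Run gradient descent from several random starts; keep the lowest-error result."""
    best = None
    for _ in range(n_restarts):
        theta = random.uniform(-2.0, 2.0)  # illustrative start range
        for _ in range(steps):
            theta -= lr * grad(theta)
        if best is None or error(theta) < error(best):
            best = theta
    return best

# E(t) = t^4 - 3t^2 + t has a local minimum near t = 1 (E = -1)
# and a deeper global minimum near t = -1.37 (E ~ -3.5).
E = lambda t: t**4 - 3 * t**2 + t
dE = lambda t: 4 * t**3 - 6 * t + 2
best = best_of_restarts(E, dE)
```

A single run started near t = 1 gets stuck in the shallower basin; with enough restarts, at least one start is very likely to land in the global basin.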

Run time

Run time is affected by the complexity of the model and by how many iterations are needed to converge. There are also various techniques for changing the learning rate over time, often called learning rate schedules.
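One simple schedule of this kind is step decay, where the learning rate is halved every fixed number of steps. This is just one common choice among many (others include exponential and cosine decay), and the parameter names here are illustrative:

```python
def step_decay(initial_lr, step, drop=0.5, every=10):
    """Halve the learning rate every `every` steps (step decay schedule)."""
    return initial_lr * (drop ** (step // every))

# With initial_lr = 0.1: steps 0-9 use 0.1, steps 10-19 use 0.05, and so on.
```

Inside the training loop, the current step number would be passed in to look up the rate for that update.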