Backward propagation of errors (Back propagation)
This is the specific implementation of gradient descent applied to neural networks. The calculation has two stages:
- Forward pass: evaluate the neural network on the training data and compute the error at the output.
- Backward propagation of that error: apply the chain rule for differentiation to each of the perceptrons, layer by layer from the output back to the input.
For this to work, every perceptron needs a differentiable activation function.
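The sketch below illustrates these two stages on an assumed two-layer network with sigmoid activations and a squared-error loss; all names (W1, W2, lr, and so on) are illustrative choices, not taken from the text above.

```python
# Minimal backpropagation sketch: forward pass, then chain-rule backward pass,
# then a gradient-descent update. Assumes a 2-layer sigmoid network and
# squared-error loss purely for illustration.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_prime(z):
    s = sigmoid(z)
    return s * (1.0 - s)          # derivative needed for the chain rule

rng = np.random.default_rng(0)
x = rng.normal(size=(3, 1))       # one training example with 3 features
y = np.array([[1.0]])             # its target output

W1 = rng.normal(size=(4, 3)); b1 = np.zeros((4, 1))   # hidden layer weights
W2 = rng.normal(size=(1, 4)); b2 = np.zeros((1, 1))   # output layer weights
lr = 0.1                          # gradient-descent step size

for step in range(100):
    # Forward pass: evaluate the network and keep intermediate values.
    z1 = W1 @ x + b1; a1 = sigmoid(z1)
    z2 = W2 @ a1 + b2; a2 = sigmoid(z2)
    loss = 0.5 * np.sum((a2 - y) ** 2)

    # Backward pass: chain rule applied from the output layer back to the input.
    delta2 = (a2 - y) * sigmoid_prime(z2)          # dL/dz2
    dW2 = delta2 @ a1.T; db2 = delta2
    delta1 = (W2.T @ delta2) * sigmoid_prime(z1)   # dL/dz1
    dW1 = delta1 @ x.T; db1 = delta1

    # Gradient-descent update of all weights and biases.
    W2 -= lr * dW2; b2 -= lr * db2
    W1 -= lr * dW1; b1 -= lr * db1
```

Note that the backward pass reuses the quantities (z1, z2, a1, a2) stored during the forward pass; this is why the two stages are computed together for each example.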