Bagging

This is one of the simplest ensemble learning methods, yet it outperforms classic single-model approaches on certain problems.

Suppose we are in the usual supervised modelling framework, with training data $D$. Bagging is the following process:

  1. We choose a set of modelling algorithms $A_1, \dots, A_n$ that could fit the problem - these could all be the same algorithm.
  2. We take random subsets $D_i \subseteq D$ of the training data, for $i = 1, \dots, n$, sampled with replacement - so two samples could contain the same data point.
  3. We then train a model $M_i$ using algorithm $A_i$ with training data $D_i$, for $i = 1, \dots, n$.
  4. Then we have some method of averaging these models over our problem space - for example $M(x) = \frac{1}{n} \sum_{i=1}^{n} M_i(x)$ - to produce our final model $M$.
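The steps above can be sketched in Python. This is a minimal sketch, not a library implementation: the names `bag`, `predict`, and `fit_mean` are hypothetical, and the "algorithm" here is just fitting a constant so the whole thing stays self-contained.

```python
import random

def bag(train, algorithm, n):
    """Train n models, each on a bootstrap sample of the training data."""
    models = []
    for _ in range(n):
        # Step 2: sample with replacement, same size as the original data.
        sample = [random.choice(train) for _ in train]
        # Step 3: train using the chosen algorithm on this sample.
        models.append(algorithm(sample))
    return models

def predict(models, x):
    # Step 4: average the individual models' predictions.
    return sum(m(x) for m in models) / len(models)

# Toy "algorithm": fit a constant model (the mean of the targets).
def fit_mean(sample):
    avg = sum(y for _, y in sample) / len(sample)
    return lambda x: avg

random.seed(0)
train = [(x, 2 * x) for x in range(10)]  # targets 0, 2, ..., 18
models = bag(train, fit_mean, n=25)
print(predict(models, 5))  # close to the target mean of 9
```

Any real learner with a `fit`/`predict` shape can be dropped in for `fit_mean` without changing the bagging loop.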

Example

Suppose we want to use polynomial regression on a simple function, with training data $D = \{(x_1, y_1), \dots, (x_m, y_m)\}$.

Instead of running it once, we could randomly select some bootstrap samples $D_1, \dots, D_n \subseteq D$ and then train a model $M_i$ using polynomial regression on each $D_i$.

Then we set our final model to be the average $M(x) = \frac{1}{n} \sum_{i=1}^{n} M_i(x)$.
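A sketch of this example with NumPy: the target function, noise level, degree, and number of models are all assumed for illustration, and `np.polyfit`/`np.polyval` stand in for the polynomial regression step.

```python
import numpy as np

rng = np.random.default_rng(0)

# Training data D: a simple function (sin) plus noise -- an assumed example.
x = np.linspace(0.0, 1.0, 40)
y = np.sin(2 * np.pi * x) + rng.normal(0.0, 0.1, x.size)

n, degree = 20, 5
models = []
for _ in range(n):
    # D_i: a bootstrap sample of D, drawn with replacement.
    idx = rng.integers(0, x.size, x.size)
    # M_i: polynomial regression fitted to D_i.
    models.append(np.polyfit(x[idx], y[idx], degree))

def M(t):
    """Final model: the average of the n polynomial fits."""
    return np.mean([np.polyval(coeffs, t) for coeffs in models], axis=0)

print(M(0.25))  # should land near sin(pi/2) = 1
```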

Correctness

Bagging tends to reduce overfitting, so it can help algorithms that are particularly prone to it, like high-degree polynomial regression.
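A toy illustration of the variance reduction behind this claim, under an assumed simplified setup where each "model" just estimates a mean: averaging several bootstrap-trained estimates has a smaller spread than a single bootstrap-trained estimate.

```python
import random
import statistics

random.seed(0)
data = [random.gauss(0.0, 1.0) for _ in range(30)]  # fixed training data

def boot_mean(d):
    """One 'model': the mean of a bootstrap sample of d."""
    return statistics.mean(random.choice(d) for _ in d)

# Repeat both procedures many times to estimate their spread.
singles = [boot_mean(data) for _ in range(300)]
bagged = [statistics.mean(boot_mean(data) for _ in range(10))
          for _ in range(300)]

# The bagged estimator (average of 10 models) varies less than one model.
print(statistics.stdev(singles), statistics.stdev(bagged))
```

The same effect is what tames a wiggly polynomial fit: each $M_i$ overfits its own sample, but their average is much more stable.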