Bagging
This is one of the simplest ensemble learning methods, but it outperforms classic modelling methods on certain problems.
Suppose we are in the modelling framework, with training data $D$. Bagging (short for bootstrap aggregating) is the following process:
- We choose a set of modelling algorithms $A_1, \dots, A_n$ that could fit the problem - these could all be the same.
- We take random subsets $D_1, \dots, D_n$ of $D$, sampled with replacement - so a subset could contain the same data point twice.
- We then train a model $m_i$ using algorithm $A_i$ with training data $D_i$, for $i = 1, \dots, n$.
- Then we have some method of averaging the models $m_1, \dots, m_n$ over our problem space to produce our final model $\hat{m}$.
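The steps above could be sketched as follows; the function names `bagging` and `ensemble_predict`, and the toy `fit_mean` algorithm, are illustrative assumptions, not part of the text.

```python
import random

def bagging(algorithms, data, subset_size, seed=0):
    """Train one model per algorithm, each on its own bootstrap sample."""
    rng = random.Random(seed)
    models = []
    for fit in algorithms:
        # Draw subset_size points with replacement, so duplicates can occur.
        sample = [rng.choice(data) for _ in range(subset_size)]
        models.append(fit(sample))
    return models

def ensemble_predict(models, x):
    """Aggregation step: average the individual models' predictions."""
    return sum(m(x) for m in models) / len(models)

# Toy usage: every algorithm is the same trivial "fit the mean" learner.
data = [(x, 2 * x + 1) for x in range(10)]

def fit_mean(sample):
    avg = sum(y for _, y in sample) / len(sample)
    return lambda x: avg

models = bagging([fit_mean] * 5, data, subset_size=6)
prediction = ensemble_predict(models, 3.0)
```

Using the same algorithm five times on different bootstrap samples, as here, is the most common form of bagging in practice.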
Example
Suppose we want to use polynomial regression on a simple function with training data $D$. Instead of running it once on all of $D$, we could randomly select some subsets $D_1, \dots, D_n \subseteq D$ (with replacement) and fit a polynomial $m_i$ to each $D_i$. Then we set our final model to the average
$$\hat{m}(x) = \frac{1}{n} \sum_{i=1}^{n} m_i(x).$$
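A minimal sketch of this example with NumPy, assuming a noisy target $f(x) = \sin(x)$ and degree-5 polynomials (both choices are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

# Noisy samples of a simple target function f(x) = sin(x).
xs = np.linspace(0, 3, 40)
ys = np.sin(xs) + rng.normal(scale=0.2, size=xs.shape)

# Fit one polynomial per bootstrap sample of the training data.
n_models, degree = 20, 5
coeffs = []
for _ in range(n_models):
    idx = rng.integers(0, len(xs), size=len(xs))  # sample with replacement
    coeffs.append(np.polyfit(xs[idx], ys[idx], degree))

def bagged_predict(x):
    # Final model: average the individual polynomials' predictions.
    return np.mean([np.polyval(c, x) for c in coeffs], axis=0)
```

Each polynomial on its own may chase the noise in its bootstrap sample, but their average is typically much smoother.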
Correctness
Bagging tends to reduce overfitting: averaging models trained on different subsets lowers the variance of the final model while leaving its bias roughly unchanged. It can therefore help algorithms that are particularly prone to overfitting, such as high-degree polynomial regression.