Boosting
Boosting is an Ensemble learning method that uses multiple instances of another model to average them out to make a classifier that is better than any one individual one.
Suppose we are in the modelling framework where
We set
Let
Set our learning rate
Other choices of
Now define
Note that if
Once we have trained for some large number times
Pseudocode
Name(classifier, N, T):
Input:
classifier - some way to train a classifier given some training data and a distrubution.
N - the number of itterations we want to do.
T - pairs of points (a_t, b_t) to train on.
Output:
a moddel f on the space A to {-1, 1}.
1. Set D: T -> [0,1] to be D(t) = 1/|T|
2. For i = 1, ... , number_of_itterations do
2.1. let h_i = classifier(T, D)
2.2. Set alpha_i = calculate_learning_rate(h_i, T, D)
2.2. Set D = Update_distrubution(h_i, T, D, alpha_i)
3. Return sgn(sum(alpha_t h_t(x) for t in T))
calculate_learning_rate(h, T, D):
Input:
h - A classifier mapping points in A to a real number
T - pairs of points (a_t, b_t) to train on.
D - A distrubution over T
Output:
a learning rate for this itteration
1. Set e = 0
2. For each t in T
2.1. If b_t != sgn(h(a_t)) then
2.1.1 Set e = e + D(t)
3. Return ln((1-e)/e)/2
Update_distrubution(h, T, D, alpha)
Input:
h - A classifier mapping points in A to a real number
T - pairs of points (a_t, b_t) to train on.
D - A distrubution over T
alhpa - a learning rate
Output:
a new distrubution over T
1. For each t in T
2.1. Set C_t = D(t) exp(- alpha b_t h(a_t))
2. Set Z = sum(C_t for t in T)
3. Intialise new distrubution over T D'
4. For each t in T
4.1. Set D'(t) = C_t/Z
5. Return D'