Perceptron rule

This is a way to train a perceptron using the binary step activation function, to classify training data $T = \{(x, y)\}$ where $x \in \mathbb{R}^n$ and $y \in \{0, 1\}$. We assume the perceptron we are training is the function $f(x) = \mathbb{1}[w \cdot x \ge \theta]$ defined by weights $w \in \mathbb{R}^n$ and a threshold $\theta$. For this we require a sufficiently small learning rate $\eta > 0$. The idea is to replay the data, nudging the weights $w$ of the perceptron slowly towards a split that classifies the training data correctly.

To simplify the treatment of $\theta$ we just extend $x$ to $x' = (x_1, \ldots, x_n, -1) \in \mathbb{R}^{n+1}$, where all of $T$ has a $-1$ (i.e. $x_{n+1} = -1$) in the additional position. Then we set $w_{n+1} = \theta$ and compare $w \cdot x$ to zero, i.e. we check if $\sum_{i=1}^{n+1} w_i x_i \ge 0$ instead of checking if $\sum_{i=1}^{n} w_i x_i \ge \theta$.
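A minimal sketch of this trick in Python (assuming NumPy; the helper name extend and the numbers are illustrative, not from the text):

import numpy as np

def extend(x):
    # Append the constant -1 so the threshold theta becomes an ordinary weight.
    return np.append(x, -1.0)

x = np.array([2.0, 3.0])
w = np.array([0.5, -0.25])
theta = 1.0

w_ext = np.append(w, theta)  # w' = (w_1, ..., w_n, theta)
assert (w @ x >= theta) == (w_ext @ extend(x) >= 0.0)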

Then, checking a random training example $t = (x, y) \in T$ with current weights $w$, we set $\hat{y}_t = \mathbb{1}[w \cdot x \ge 0]$, where $\mathbb{1}$ is the indicator function mapping true to $1$ and false to $0$. Then we change the weights by $\Delta w_i = \eta (y - \hat{y}_t) x_i$ and redefine $w_i := w_i + \Delta w_i$. Note as $y, \hat{y}_t \in \{0, 1\}$ we have $y - \hat{y}_t \in \{-1, 0, 1\}$.
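One such update step as a Python sketch (the weights and example here are made up for illustration):

import numpy as np

def update(w, x, y, eta):
    # One perceptron-rule step on the extended example (x, y).
    y_hat = float(w @ x >= 0.0)       # indicator: true -> 1, false -> 0
    return w + eta * (y - y_hat) * x  # w_i := w_i + eta (y - y_hat) x_i

# A misclassified positive example: w.x < 0 but y = 1,
# so each weight gets nudged by +eta * x_i.
w = np.array([-1.0, 0.0, 0.0])  # last entry plays the role of theta
x = np.array([2.0, 1.0, -1.0])  # extended with x_{n+1} = -1
print(update(w, x, y=1.0, eta=0.5))  # -> [ 0.   0.5 -0.5]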

Pseudocode

perceptron_rule(T):
	Input: Training data T = {((x_1, x_2, ..., x_n, x_{n+1} = -1), y)}
	Output: perceptron weights w_1, w_2, ..., w_n, w_{n+1} = theta.
1. Set w_i = 1 for i = 1, 2, ..., n+1.
2. hat{y}_t = set_y_hat(T, w_i)
3. While there exists t in T such that hat{y}_t not = y_t, letting count be the number of iterations so far:
	3.1. Choose random t = ((x_1, ..., x_{n+1}), y_t) in T such that hat{y}_t not = y_t
	3.2. Set eta = 1/count
	3.3. for i = 1, ..., n+1
		3.3.1. Set delta w_i = eta (y_t - hat{y}_t) x_i
		3.3.2. Set w_i = w_i + delta w_i
	3.4. hat{y}_t = set_y_hat(T, w_i)
4. Return w_1, ..., w_{n+1}
 
set_y_hat(T, w_i):
	Input: Training data T = {((x_1, x_2, ..., x_n, x_{n+1} = -1), y)} and
		perceptron weights w_1, w_2, ..., w_n, w_{n+1} = theta.
	Output: For each t in T a new hat{y}_t
1. for t = ((x_1, ..., x_{n+1}), y_t) in T:
	1.1. Set hat{y}_t = bool(sum_{i=1}^{n+1} w_i x_i >= 0)
2. return hat{y}_t for all t in T
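A runnable sketch of this pseudocode in Python, assuming NumPy, examples already extended with x_{n+1} = -1, and the conventions above (weights initialised to 1, eta = 1/count); the max_iters cap is my addition, a safety net for data that is not linearly separable:

import numpy as np

def set_y_hat(T, w):
    # Recompute hat{y}_t = 1[w . x >= 0] for every (x, y) in T.
    return [float(w @ x >= 0.0) for x, _ in T]

def perceptron_rule(T, rng=None, max_iters=100_000):
    # Returns weights w_1, ..., w_n, w_{n+1} = theta.
    rng = rng or np.random.default_rng()
    w = np.ones(len(T[0][0]))                 # step 1: w_i = 1
    y_hat = set_y_hat(T, w)                   # step 2
    for count in range(1, max_iters + 1):     # step 3
        wrong = [i for i, (x, y) in enumerate(T) if y_hat[i] != y]
        if not wrong:
            break                             # every example classified correctly
        i = rng.choice(wrong)                 # step 3.1: random misclassified t
        x, y = T[i]
        eta = 1.0 / count                     # step 3.2
        w = w + eta * (y - y_hat[i]) * x      # steps 3.3.1 and 3.3.2
        y_hat = set_y_hat(T, w)               # step 3.4
    return w                                  # step 4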

Runtime

To make sure $\eta$ is low enough we might reduce it slowly over time (though we have to make sure the sequence of learning rates $\eta_t$ converges to zero whilst the sum $\sum_t \eta_t$ diverges to infinity) and then stop when $\hat{y}_t = y_t$ for all our training data. In the pseudocode above I chose the sequence $\eta_t = 1/\text{count}$ for our learning rate, where $\text{count}$ is the iteration count.
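The harmonic schedule used above satisfies both requirements:

$$\eta_t = \frac{1}{t} \longrightarrow 0 \qquad\text{and}\qquad \sum_{t=1}^{\infty} \frac{1}{t} = \infty,$$

since the harmonic series diverges.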

This will only stop if $T$ is linearly separable. Moreover, the perceptron rule using the binary step activation converges in finite time if the dataset is linearly separable (this is the perceptron convergence theorem).
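As a quick sanity check, the sketch above can be run on a toy separable dataset (the AND function; this data and code are illustrative):

import numpy as np

# AND gate, linearly separable, each input extended with x_{n+1} = -1.
T = [(np.array([0.0, 0.0, -1.0]), 0.0),
     (np.array([0.0, 1.0, -1.0]), 0.0),
     (np.array([1.0, 0.0, -1.0]), 0.0),
     (np.array([1.0, 1.0, -1.0]), 1.0)]

w = perceptron_rule(T)
print(set_y_hat(T, w))  # -> [0.0, 0.0, 0.0, 1.0] once converged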