Week 3 - Instance Based Learning

Instance-based learning

Instance-based learning is a branch of machine learning where, instead of generalising from the training data immediately and never referring back to it, the model stores the training instances themselves and looks them up when asked to predict a value.


k-nearest neighbour

This algorithm is an Instance-based learning model that uses the closest neighbours of a query point to determine what its value should be.

In the modelling framework, suppose we are given:

  • training data $T \subseteq A \times B$,
  • a metric $d$ on $A$,
  • $k$, the number of points to consider, and
  • an averaging mechanism $V$ that takes a weighted set of values in $B$ and returns another value in $B$, like the weighted geometric mean.

Then, to model our data: when given a new input point $a \in A$, we find its $k$-nearest neighbours in our training data and apply $V$ to their associated values in $B$.

This model is very simple - however, a lot of the complexity comes from defining $d$ and $V$.

Pseudocode

KNN(T, d, k, V, a):
	Input:
		T our training data, which is pairs of points from A and B
		d a metric on the space A
		k the number of points to average
		V a way of deciding how to pick b out of the neighbours
		a in A, the query point
	Output:
		b in B, the model's prediction
1. Set N = get_k_nearest_neighbours(T, d, k, a)
2. Set b = V({(b', d(a', a)) for (a', b') in N})
3. Return b
 
get_k_nearest_neighbours(T, d, k, a):
	Input:
		T our training data, which is pairs of points from A and B
		d a metric on the space A
		k the number of points to return
		a in A, the query point
	Output:
		N subset of T, the k-nearest neighbours to a
1. Let X be an empty list
2. For each (a', b') in T:
	2.1. Append ((a', b'), d(a', a)) to X
3. Sort X by the second component (the distance).
4. Let N = {t for the first k elements (t, d') in X}
     # You can also let N include elements of equal distance as the k'th
     # element
5. Return N
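
Below is a minimal Python sketch of this pseudocode, assuming the training data is a plain list of (a, b) pairs and that the metric d and voting function V are passed in as ordinary functions. The names are illustrative, and the query is a brute-force linear scan rather than the sorted-data setup assumed in the run-time table later.

```python
# Minimal brute-force k-NN sketch (names and structure are illustrative).

def get_k_nearest_neighbours(T, d, k, a):
    """Return the k training pairs (a', b') whose inputs are closest to a."""
    # Pair each training example with its distance to the query point.
    X = [((a_prime, b_prime), d(a_prime, a)) for (a_prime, b_prime) in T]
    # Sort by distance and keep the first k pairs.
    X.sort(key=lambda item: item[1])
    return [t for (t, _) in X[:k]]

def knn(T, d, k, V, a):
    """Predict a value for a by letting V combine the k nearest neighbours."""
    N = get_k_nearest_neighbours(T, d, k, a)
    # V receives (value, distance) pairs so it can weight by distance.
    return V([(b_prime, d(a_prime, a)) for (a_prime, b_prime) in N])

# Example usage: 1-D inputs, absolute-difference metric, unweighted mean vote.
if __name__ == "__main__":
    T = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (10.0, 20.0)]
    d = lambda x, y: abs(x - y)
    mean_vote = lambda pairs: sum(b for b, _ in pairs) / len(pairs)
    print(knn(T, d, k=2, V=mean_vote, a=2.4))  # averages the values at 2.0 and 3.0
```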

Run time

This algorithm is a lazy learner and puts off compute until it is executed. This means learning is essentially $O(1)$ (just storing the data), but execution can take a while.

Correctness

This model has some Bias (modelling) towards the representations it prefers:

  • Near points are similar,
  • The function it models is smooth with respect to the distance,
  • All features equally matter,
  • All points equally matter, and
  • Input points are represented by the training data.
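
As a small, hypothetical illustration of the "all features equally matter" bias: with an unweighted Euclidean metric, rescaling one feature (for example by changing its units) can change which training point is nearest, and so change the prediction.

```python
# Hypothetical illustration: rescaling a feature changes the nearest neighbour
# under an unweighted Euclidean metric.
import math

def euclidean(x, y):
    return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))

query = (1.0, 1.0)
a = (1.0, 3.0)   # differs mostly in the second feature
b = (2.5, 1.0)   # differs mostly in the first feature

print(euclidean(a, query), euclidean(b, query))  # a is farther: 2.0 vs 1.5

# Multiply the second feature by 0.1 (e.g. a change of units) and a becomes
# the nearest point instead, so the prediction can change.
scale = lambda p: (p[0], p[1] * 0.1)
print(euclidean(scale(a), scale(query)), euclidean(scale(b), scale(query)))
```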

Run times

Quite often Instance-based learning is called "lazy" learning, as you are delaying computation until query time. You can see this in the table below, which assumes $n$ training points and that the input training data is stored sorted.

| Algorithm | Running time | Space |
| --- | --- | --- |
| 1-NN learning | $O(1)$ | $O(n)$ |
| 1-NN query | $O(\log n)$ | $O(1)$ |
| $k$-NN learning | $O(1)$ | $O(n)$ |
| $k$-NN query | $O(\log n + k)$ | $O(1)$ |
| Linear regression learning | $O(n)$ | $O(1)$ |
| Linear regression query | $O(1)$ | $O(1)$ |

Why is linear regression $O(n)$ time to learn?
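
A rough sketch of where the $O(\log n)$ query figure comes from, assuming one-dimensional inputs kept in a sorted list (an illustrative setup, not part of the original notes): a binary search locates where the query would sit, and only the bracketing points need to be compared.

```python
# Illustrative 1-NN query on sorted 1-D data: binary search is O(log n),
# then only the two bracketing points need to be compared.
import bisect

def nn_query_sorted(xs, a):
    """Return the value in the sorted list xs closest to a."""
    i = bisect.bisect_left(xs, a)
    candidates = xs[max(i - 1, 0):i + 1]  # at most two neighbours to check
    return min(candidates, key=lambda x: abs(x - a))

xs = [0.5, 1.0, 2.0, 4.0, 8.0]
print(nn_query_sorted(xs, 2.9))  # 2.0 is the closest stored point
```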

The curse of dimensionality

In the modelling framework, the curse of dimensionality states that as the dimension of our feature space grows, the amount of training data we need to generalise accurately grows exponentially.

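A quick, hypothetical back-of-the-envelope example: to keep training points within a spacing of 0.1 along every axis of the unit cube, you need roughly $10^d$ points, so the required data grows exponentially with the dimension $d$.

```python
# Illustrative arithmetic for the curse of dimensionality: covering the unit
# cube with a grid at spacing 0.1 needs 10**d points, exponential in d.
points_per_axis = 10  # spacing of 0.1 on the unit interval
for d in (1, 2, 3, 5, 10):
    print(d, points_per_axis ** d)
```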

Locally weighted regression

You can use k-nearest neighbour in regression problems along with an algorithm like polynomial regression replacing your voting function $V$. This means at each query point you construct a local polynomial fit to its neighbours, and these local fits piece together into a larger curve.

This has a nice property: whilst the modelling paradigm of your regression might limit each local fit to certain types of functions, the overall output can be a far more complex function.
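
A minimal sketch of this idea, assuming numpy is available and using an unweighted quadratic fit over the k nearest neighbours; the function name, the value of k, and the polynomial degree are illustrative choices rather than anything from the notes.

```python
# Illustrative locally weighted regression: fit a small polynomial to the
# k nearest training points of each query and evaluate it at the query.
import numpy as np

def local_poly_predict(xs, ys, query, k=5, degree=2):
    """Predict a value at `query` from the k nearest xs using a local polynomial fit."""
    xs, ys = np.asarray(xs, dtype=float), np.asarray(ys, dtype=float)
    # Pick the k training inputs closest to the query point.
    idx = np.argsort(np.abs(xs - query))[:k]
    # Fit a low-degree polynomial to just those neighbours.
    coeffs = np.polyfit(xs[idx], ys[idx], deg=degree)
    return np.polyval(coeffs, query)

# Example: data from a sine curve; each query gets its own local quadratic,
# so the overall prediction can trace a curve no single quadratic could.
xs = np.linspace(0.0, 6.0, 40)
ys = np.sin(xs)
print(local_poly_predict(xs, ys, query=1.5))  # close to sin(1.5) ~ 0.997
```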