The name comes from regression to the mean: the technique was first used to study that phenomenon, and the word regression then stuck for the technique itself.
Polynomial regression
In the modelling framework, saying we are doing polynomial regression is saying that we are picking the modelling paradigm of functions of the form

$$f(x) = a_0 + a_1 x + a_2 x^2 + \dots + a_n x^n$$

where $a_0, a_1, \dots, a_n \in \mathbb{R}$.
Linear regression is the special case of polynomial regression where the degree of the polynomial is 1. Therefore we are restricted to the modelling paradigm of

$$f(x) = a_0 + a_1 x$$

where $a_0, a_1 \in \mathbb{R}$.
Whilst this may seem more restrictive than polynomial regression, with a mutation of the input space linear regression is just as flexible: mapping $x \mapsto (x, x^2, \dots, x^n)$ turns a degree-$n$ polynomial fit into a linear fit in the mutated space.
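As a sketch of this idea (the data and names below are made up for illustration), the snippet mutates a 1-dimensional input into the features $(1, x, x^2)$ and then does an ordinary linear least-squares fit, recovering a quadratic:

```python
import numpy as np

# Made-up data drawn from y = 2x^2 plus noise, purely for illustration.
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=50)
y = 2 * x**2 + rng.normal(scale=0.1, size=50)

# Mutate the 1-d input into the features (1, x, x^2)...
X = np.stack([np.ones_like(x), x, x**2], axis=1)

# ...then an ordinary *linear* least-squares fit in the mutated
# space is exactly a degree-2 polynomial fit in the original space.
coeffs, *_ = np.linalg.lstsq(X, y, rcond=None)
print(coeffs)  # roughly [0, 0, 2]
```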
The mean squared error (MSE) is the scaled square of the $\ell^2$-norm of the difference between two points in $\mathbb{R}^n$, i.e.

$$\operatorname{MSE}(y, \hat{y}) = \frac{1}{n} \lVert y - \hat{y} \rVert_2^2 = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2.$$

This is normally applied in the context of machine learning to assess models against some testing data. In the modelling framework we would say the MSE is our objective function.
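A minimal sketch of the MSE in numpy (the function name and data are illustrative):

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean squared error: the squared l2 distance between two
    points of R^n, scaled by 1/n."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return np.mean((y_true - y_pred) ** 2)

print(mse([1.0, 2.0, 3.0], [1.5, 2.0, 2.5]))  # (0.25 + 0 + 0.25) / 3 ≈ 0.1667
```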
Calculate polynomial regression coefficients for MSE
To demonstrate the calculation of the coefficients for polynomial regression with MSE, suppose we have 1-dimensional input and training data $(x_1, y_1), \dots, (x_n, y_n)$. For an order-$m$ polynomial we are looking for coefficients $a_i$ for $0 \le i \le m$ that roughly do the following

$$\begin{pmatrix} 1 & x_1 & x_1^2 & \cdots & x_1^m \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & x_n & x_n^2 & \cdots & x_n^m \end{pmatrix} \begin{pmatrix} a_0 \\ \vdots \\ a_m \end{pmatrix} \approx \begin{pmatrix} y_1 \\ \vdots \\ y_n \end{pmatrix},$$

i.e. $X a \approx y$ with $X$ the Vandermonde matrix of the inputs. In a completely not mathematically sound way we can do the following rearrangement

$$X a \approx y \implies X^T X a \approx X^T y \implies a \approx (X^T X)^{-1} X^T y,$$

completing our fuzzy maths. We use this as our definition of $a$, and moreover it has the nice property that it actually minimises the MSE. This can expand out to be multi-variate by increasing the size of $X$ and $a$ respectively.
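As a sketch, this rearrangement translates directly into numpy; the helper name and test data below are made up for illustration.

```python
import numpy as np

def poly_coefficients(x, y, order):
    """Fit a 1-d polynomial via the normal equations
    a = (X^T X)^{-1} X^T y, with X the Vandermonde matrix."""
    X = np.vander(x, N=order + 1, increasing=True)  # columns: 1, x, x^2, ...
    return np.linalg.solve(X.T @ X, X.T @ y)

# Made-up data drawn from y = 1 + 3x plus noise.
rng = np.random.default_rng(1)
x = rng.uniform(0, 1, size=30)
y = 1 + 3 * x + rng.normal(scale=0.05, size=30)
print(poly_coefficients(x, y, order=1))  # roughly [1, 3]
```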
As we increase the degree of the fitted polynomial, the fit to the points we are training on gets better. However, past a certain point the utility of the curve outside of these points gets worse. This is easy to see by eye, but how can we infer it computationally?
Cross validation
Suppose we are training a model on some training data $T$. We partition $T$ into folds $T_1, \dots, T_k$. Then cross validation is the practice of training the model on all but one fold $T_i$ and then assessing it on $T_i$ using our objective function. The practice usually involves doing this for each of the $k$ folds in turn and picking the model with the least error.
When using cross validation to assess the accuracy of our fit in the example before, you can see it agrees with our intuition: whilst the high-order approximations are a closer fit for the training data, they are a worse fit for the held-out data. Therefore we can use cross validation to pick the best polynomial without breaking the integrity of the testing data.
Generally you need to find the right sweet spot between underfitting and overfitting by varying the degree of the polynomial.
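A minimal sketch of using $k$-fold cross validation to pick the degree, assuming made-up noisy quadratic data (`cv_error` is a hypothetical helper, and `np.polyfit`/`np.polyval` stand in for the normal-equations fit above):

```python
import numpy as np

def cv_error(x, y, order, k=5):
    """Mean k-fold cross validation MSE of a degree-`order` fit."""
    folds = np.array_split(np.arange(len(x)), k)
    errors = []
    for fold in folds:
        train = np.setdiff1d(np.arange(len(x)), fold)
        # Train on every fold except the held-out one...
        coeffs = np.polyfit(x[train], y[train], deg=order)
        # ...then assess on the held-out fold with our objective, MSE.
        pred = np.polyval(coeffs, x[fold])
        errors.append(np.mean((y[fold] - pred) ** 2))
    return np.mean(errors)

# Made-up noisy quadratic data.
rng = np.random.default_rng(2)
x = rng.uniform(-1, 1, size=60)
y = x**2 - x + rng.normal(scale=0.1, size=60)

for order in range(1, 9):
    print(order, cv_error(x, y, order))  # error should bottom out near order 2
```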
Suppose we are in the modelling framework and have been given a discrete input variable $x$ for a regression problem. If $x$ has $k$ possible values $v_1, \dots, v_k$ we introduce the $(k-1)$-dimensional space $\mathbb{R}^{k-1}$. Then we define the transformation

$$T(v_i) = \begin{cases} e_i & \text{for } 1 \le i \le k-1, \\ 0 & \text{for } i = k, \end{cases}$$

where $e_1, \dots, e_{k-1}$ are the basis vectors of $\mathbb{R}^{k-1}$.

We would then replace $x$ by $T(x)$ in the input space.
In the case of boolean input variables, this is as simple as saying $T(x)$ is $0$ for False and $1$ for True.
You may wonder why we do not expand this to a $k$-dimensional space, with one basis vector per value. This would ruin the independence of the variables, as we would know that the coordinates always have to add to 1 - this linear dependence causes instability in our polynomial regression.
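A minimal sketch of this transformation, with a hypothetical `dummy_encode` helper and a made-up three-valued variable:

```python
import numpy as np

def dummy_encode(value, values):
    """Map a discrete variable with k possible values into R^(k-1):
    the first k-1 values go to basis vectors, the last to the origin."""
    k = len(values)
    vec = np.zeros(k - 1)
    i = values.index(value)
    if i < k - 1:
        vec[i] = 1.0
    return vec

colours = ["red", "green", "blue"]  # made-up discrete variable
for c in colours:
    print(c, dummy_encode(c, colours))
# red [1. 0.]
# green [0. 1.]
# blue [0. 0.]
```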