Non-linear methods
Transformation of the input space.
- Polynomial regression: Use polynomial features,
- Step functions: Cut region up and take mean,
- Regression splines: Cut region up and fit polynomials with smooth intersections,
- What is the difference between this and Boosting?
- Smoothing splines: Similar to regression splines, but fitted by penalising roughness,
- Local regression: Similar to splines but with overlapping boundaries,
- Generalised Additive Models (GAM): Combining all the above,
Polynomial regression
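In the modelling framework, saying we are doing polynomial regression means picking the modelling paradigm of functions that are polynomials in the input features; the highest power that appears is the degree of our polynomial. If the input $x$ was 1-dimensional, a degree-$d$ model would be:
$$\hat{f}(x) = \beta_0 + \beta_1 x + \beta_2 x^2 + \dots + \beta_d x^d,$$
where $\beta_0, \dots, \beta_d$ are the parameters to fit. Here we will focus on the 1-dimensional case.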
This has issues with error explosion outside the range of the training data: the highest-degree terms dominate there, so predictions diverge quickly.
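A minimal sketch of the extrapolation problem, assuming NumPy and made-up data (a noisy sine curve on $[0, 3]$); the degree and the evaluation ranges are illustrative choices, not part of the notes.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical training data: a sine curve observed with noise on [0, 3].
x_train = np.linspace(0.0, 3.0, 40)
y_train = np.sin(x_train) + rng.normal(scale=0.1, size=x_train.size)

degree = 5  # illustrative choice
# Design matrix with columns 1, x, x^2, ..., x^degree.
X_train = np.vander(x_train, degree + 1, increasing=True)
beta, *_ = np.linalg.lstsq(X_train, y_train, rcond=None)

def predict(x):
    """Evaluate the fitted polynomial at the points x."""
    return np.vander(np.atleast_1d(x), degree + 1, increasing=True) @ beta

# Error against the noise-free target inside and outside the training range.
x_inside = np.linspace(0.0, 3.0, 50)
x_outside = np.linspace(6.0, 8.0, 50)
err_inside = np.max(np.abs(predict(x_inside) - np.sin(x_inside)))
err_outside = np.max(np.abs(predict(x_outside) - np.sin(x_outside)))
print(f"max error inside [0, 3]:  {err_inside:.3f}")
print(f"max error outside [6, 8]: {err_outside:.3f}")
```

Polynomial regression is just linear regression on the powers of $x$, which is why an ordinary least-squares solver is enough here.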
Step functions
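Suppose we have a function $f$ that we want to model. A step function method models it with a piecewise-constant function: we split the domain into intervals and define a separate constant on each one.
One of the simplest step function methods is to choose meaningful intervals $I_1, \dots, I_K$ in your domain, then set the prediction on each interval to the mean $\bar{y}_k$ of the training targets whose inputs fall in it:
$$\hat{f}(x) = \sum_{k=1}^{K} \bar{y}_k \, \mathbf{1}[x \in I_k].$$
To apply this we one-hot encode the interval each input falls in and run linear regression on the resulting indicator columns.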
We should only use this if there are natural breakpoints where we can input domain knowledge.
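A small sketch of a step function fit, assuming NumPy; the data and the breakpoints at 3 and 7 are invented for illustration. It shows that least squares on one-hot interval indicators gives the per-interval means.

```python
import numpy as np

# Hypothetical data: inputs in [0, 10) with hand-picked breakpoints at 3 and 7.
x = np.array([0.5, 1.0, 2.5, 3.5, 4.0, 6.0, 7.5, 8.0, 9.5])
y = np.array([1.0, 1.2, 0.8, 3.0, 3.2, 2.8, 5.0, 5.5, 4.5])
breakpoints = np.array([3.0, 7.0])

# Interval index of each input: 0 for x < 3, 1 for 3 <= x < 7, 2 for x >= 7.
interval = np.digitize(x, breakpoints)

# One-hot encode the interval index (one indicator column per interval).
one_hot = np.eye(len(breakpoints) + 1)[interval]

# Least squares on the indicators (no intercept) recovers the interval means.
coef, *_ = np.linalg.lstsq(one_hot, y, rcond=None)

def predict(x_new):
    """Predict with the constant fitted for the interval containing x_new."""
    return coef[np.digitize(x_new, breakpoints)]

print(coef)
```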
Piece-wise polynomials
Here, instead of fitting a constant value on each interval, we fit a polynomial.
Basis function
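Basis functions $b_1, \dots, b_K$ are fixed transformations of the input, and the model is a linear combination of them: $f(x) = \beta_0 + \sum_{k=1}^{K} \beta_k b_k(x)$. These form a basis of your function space.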
Regression splines
A degree $D$ spline is a piecewise polynomial of degree $D$ whose derivatives up to order $D - 1$ are continuous at each knot.
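Cubic spline
A cubic spline with knots $\xi_1, \dots, \xi_K$ is given by
$$f(x) = \beta_0 + \beta_1 x + \beta_2 x^2 + \beta_3 x^3 + \sum_{k=1}^{K} \beta_{3+k} (x - \xi_k)_+^3$$
where
$$(x - \xi)_+^3 = \begin{cases} (x - \xi)^3 & \text{if } x > \xi, \\ 0 & \text{otherwise.} \end{cases}$$
The functions $1, x, x^2, x^3, (x - \xi_1)_+^3, \dots, (x - \xi_K)_+^3$ form a basis of the cubic splines with these knots.
This generalises to power $D$ by using $1, x, \dots, x^D$ and $(x - \xi_k)_+^D$.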
Degrees of freedom
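There will be $4 + K$ degrees of freedom, where $K$ is the number of knots.
In comparison, piecewise cubic polynomial regression with $K$ knots and no continuity constraints has $4(K + 1)$ degrees of freedom; the three continuity constraints at each knot remove $3K$ of them.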
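A minimal sketch of fitting a cubic spline by least squares, assuming NumPy and the truncated power basis (columns $1, x, x^2, x^3$ plus one term $\max(x - \text{knot}, 0)^3$ per knot). The data, the knot positions, and the helper name `cubic_spline_basis` are made up for illustration.

```python
import numpy as np

def cubic_spline_basis(x, knots):
    """Columns 1, x, x^2, x^3, then max(x - knot, 0)^3 for each knot."""
    x = np.asarray(x, dtype=float)
    powers = [x**p for p in range(4)]
    truncated = [np.clip(x - k, 0.0, None) ** 3 for k in knots]
    return np.column_stack(powers + truncated)

rng = np.random.default_rng(0)
x = np.sort(rng.uniform(0.0, 10.0, 200))
y = np.sin(x) + rng.normal(scale=0.2, size=x.size)

knots = np.array([2.5, 5.0, 7.5])  # K = 3 knots, chosen by hand
B = cubic_spline_basis(x, knots)
beta, *_ = np.linalg.lstsq(B, y, rcond=None)

print(B.shape[1])  # number of fitted coefficients: 4 + K
fitted = B @ beta
```

The number of columns in `B` is the number of free parameters, which is where the $4 + K$ count comes from.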
Natural cubic spline
A natural cubic spline is a cubic spline that is constrained to be linear beyond the boundary knots, i.e. before the first knot and after the last knot.
Choosing knots
- Place more knots where your function varies more rapidly.
- Place knots in a uniform way, e.g. at quantiles of the data.
- Use cross validation.
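One way to place knots uniformly with respect to the data is to put them at evenly spaced quantiles. A short sketch, assuming NumPy and hypothetical skewed inputs; the number of knots is an arbitrary choice.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.exponential(scale=2.0, size=500)  # hypothetical skewed inputs

n_knots = 4
# Interior quantile levels 0.2, 0.4, 0.6, 0.8: equal shares of data between knots.
levels = np.linspace(0.0, 1.0, n_knots + 2)[1:-1]
knots = np.quantile(x, levels)

# Number of observations in each of the n_knots + 1 intervals.
counts = np.bincount(np.digitize(x, knots), minlength=n_knots + 1)
print(knots, counts)
```

With skewed inputs the knots bunch up where the data is dense, so every interval has roughly the same number of points to fit.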
Smoothing splines
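First choose a loss function, such as the residual sum of squares:
$$\sum_{i=1}^{n} (y_i - g(x_i))^2.$$
If $g$ is unconstrained we can drive this to zero by interpolating every training point, which overfits.
Smoothing splines avoid this by minimising the following penalised loss function
$$\sum_{i=1}^{n} (y_i - g(x_i))^2 + \lambda \int g''(t)^2 \, dt, \qquad \lambda \ge 0.$$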
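If we solve this then the minimiser has the following properties:
- It is a piecewise cubic polynomial with knots at the unique values of $x_1, \dots, x_n$,
- It has continuous first and second derivatives at the knots,
- It is linear outside the boundary knots,
This is a natural cubic spline, but not the same one we would get using the basis-function technique above with knots at $x_1, \dots, x_n$. It is a shrunken version of that spline, where the tuning parameter $\lambda$ (the weight on the roughness penalty) controls the level of shrinkage.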
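Choosing $\lambda$
$\lambda$ controls the smoothness and hence the effective degrees of freedom.
- As $\lambda$ increases from zero to infinity, the fit goes from interpolating every training point to the least-squares straight line, i.e. the effective degrees of freedom go from $n$ to 2.
- Use cross validation.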
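The effect of the penalty weight can be illustrated with a simplified discrete stand-in for a smoothing spline: on equally spaced inputs, penalise the squared second differences of the fitted values instead of the integral of the squared second derivative. This is a swapped-in technique (a second-difference smoother), not the smoothing spline itself; the helper name and data are hypothetical and NumPy is assumed.

```python
import numpy as np

def second_difference_smoother(y, lam):
    """Minimise sum((y - g)^2) + lam * sum(second differences of g)^2.

    Returns the fitted values g and the effective degrees of freedom,
    i.e. the trace of the matrix that maps y to g.
    """
    n = len(y)
    D = np.diff(np.eye(n), n=2, axis=0)  # (n - 2) x n second-difference matrix
    S = np.linalg.inv(np.eye(n) + lam * D.T @ D)  # g = S @ y
    return S @ y, np.trace(S)

rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 30)  # equally spaced inputs
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.2, size=x.size)

for lam in (0.0, 1.0, 1e8):
    g, df = second_difference_smoother(y, lam)
    print(f"lambda={lam:g}: effective df = {df:.2f}")
```

With no penalty the fit reproduces the data exactly, and with a very large penalty only straight lines (zero second differences) survive, which mirrors the two extremes described above.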
Local regression
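- Choose the number of points to consider (the span).
- For a target value $x_0$, find the training points closest to it.
- Weight those points with a kernel centred at $x_0$, e.g. a Gaussian, so that closer points count more.
- Apply weighted linear regression to these points to get a local line.
- Then use this line to make the prediction at $x_0$.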
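The steps above can be sketched as follows, assuming NumPy. The function name `local_linear_predict`, the default of 20 neighbours, and the rule for setting the Gaussian width are all illustrative assumptions rather than anything prescribed in the notes.

```python
import numpy as np

def local_linear_predict(x0, x, y, n_neighbours=20, bandwidth=None):
    """Predict at x0 with a Gaussian-weighted linear fit to the nearest points."""
    dist = np.abs(x - x0)
    nearest = np.argsort(dist)[:n_neighbours]  # steps 1-2: closest points
    xs, ys, ds = x[nearest], y[nearest], dist[nearest]
    if bandwidth is None:
        # Assumption: scale the Gaussian by the farthest neighbour's distance.
        bandwidth = max(ds.max() / 2.0, 1e-12)
    w = np.exp(-0.5 * (ds / bandwidth) ** 2)  # step 3: Gaussian weights
    # Step 4: weighted least squares for intercept and slope.
    A = np.column_stack([np.ones_like(xs), xs])
    sw = np.sqrt(w)
    coef, *_ = np.linalg.lstsq(A * sw[:, None], ys * sw, rcond=None)
    return coef[0] + coef[1] * x0  # step 5: evaluate the local line at x0

rng = np.random.default_rng(0)
x = np.sort(rng.uniform(0.0, 10.0, 200))
y = np.sin(x) + rng.normal(scale=0.1, size=x.size)

grid = np.linspace(1.0, 9.0, 5)
preds = np.array([local_linear_predict(x0, x, y) for x0 in grid])
print(np.round(preds, 2))
```

A separate weighted regression is solved for every prediction point, so the cost grows with the number of points you want to predict.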
GAM
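Here we apply a different model to each variable and add the results together: $y = \beta_0 + f_1(x_1) + \dots + f_p(x_p) + \epsilon$, where each $f_j$ can be any of the methods above.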
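A minimal sketch of an additive model with two variables, assuming NumPy and invented data: a cubic polynomial is used for the first variable and a step function for the second, and both are fitted jointly by least squares. Real GAM software typically uses smoothers and backfitting or penalised fitting; this joint least-squares fit is a simplification.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 300
x1 = rng.uniform(-2.0, 2.0, n)
x2 = rng.uniform(0.0, 3.0, n)
# Hypothetical additive truth: quadratic in x1 plus a step in x2.
y = x1**2 + np.where(x2 < 1.5, 0.0, 2.0) + rng.normal(scale=0.1, size=n)

# f1: cubic polynomial in x1 (no constant column; the intercept is shared).
F1 = np.column_stack([x1, x1**2, x1**3])
# f2: step function in x2 with one breakpoint; the first interval is the baseline.
F2 = (x2 >= 1.5).astype(float)[:, None]

X = np.column_stack([np.ones(n), F1, F2])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

intercept, f1_coef, f2_coef = beta[0], beta[1:4], beta[4:]
print(np.round(beta, 2))
```

Because the model is additive, the contribution of each variable can be read off and plotted separately from its own block of coefficients.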