Week 1 - Chapter 1 Machine Learning
These are my notes from the first chapter of MachineLearningTomMitchell.pdf.
1.1 Well-posed learning problems
Machine Learning
A computer program is said to learn from experience E
with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.
Checkers
If we want a program to learn at checkers we simply need to define T, P, and E!
- Task T: playing checkers,
- Performance measure P: percentage of games won against opponents, and
- Training experience E: playing practice games against itself.
In what follows we use Checkers to provide an example of the process you go through to design a learning system.
Training experience
There are several important ways to classify training experience.
For example, when the payoff comes:
- Direct training examples - examples of exactly the correct decision at the time it is made.
    - In the checkers example: given a board position, the right move to make.
    - Normally the best kind of training, but harder to get.
- Indirect training examples - examples that lead to an outcome, but give no feedback at the point of each decision.
    - In the checkers example: a previous game where a sequence of moves is known to lead to a win or loss.
    - Indirect training faces the problem of credit assignment: which moves were correct?
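A naive way to picture the credit-assignment problem is to spread a game's final outcome back over its moves. This is my own illustration, not a method from the book; the discounting scheme and move names are arbitrary assumptions.

```python
def assign_credit(moves, outcome, discount=0.9):
    """Naive credit-assignment sketch (my own illustration): spread a
    game's final outcome (+1 win, -1 loss) back over its moves, giving
    later moves more of the credit via exponential discounting."""
    n = len(moves)
    credits = {}
    for i, move in enumerate(moves):
        # moves nearer the end of the game receive more of the final reward
        credits[move] = outcome * discount ** (n - 1 - i)
    return credits

# hypothetical three-move game that ended in a win
game_credits = assign_credit(["e3-f4", "g5-e3", "d2-f4"], outcome=1.0)
```

Real systems (including the trick Mitchell uses later in this chapter) solve credit assignment more cleverly, but this shows the shape of the problem: the only signal is at the end, and it must be shared out somehow.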
Or how much autonomy the learner has over this process:
- Teacher-led learning - a sequence of training examples built by the teacher.
    - In the checkers example: feeding the program a selection of games or game states.
- Student-led learning - the student can ask questions about particularly confusing positions and get the correct decision.
    - In the checkers example: this might come from allowing the machine to play itself.
Lastly, how representative the training examples are of the test distribution.
- If a computer only plays itself, it might not be able to beat the common ways a human plays.
- It is often assumed that the training data follows the same distribution as the testing data.
- However, one of the goals of machine learning is to generalise beyond the training data.
- In practice, this is one of the most frequently violated assumptions of the theory.
Designing a learning system
Now we must make 3 more choices:
- the exact type of knowledge to be learned,
- a representation for this target knowledge, and
- a learning mechanism.
Type of knowledge to be learnt
This normally comes in the form of a target function.
For example, in the checkers example you may think the best thing to learn would be a function that chooses the best move for a given state of the board.
It is important to be specific with the target function, as it may help simplify design decisions later.
In the example above, where the target function V assigns a numeric value to each board state b, we can make explicit expectations on V:
- if b is a final board state that is won, then V(b) = 100,
- if b is a final board state that is lost, then V(b) = -100,
- if b is a final board state that is drawn, then V(b) = 0, and
- else V(b) = V(b'),
where b' is the end state reachable from b assuming optimum play.
The function as you actually write it down, V̂, might only ever be an approximation of this ideal V.
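The case analysis for the ideal target function can be sketched as code. The board values (+100 win, -100 loss, 0 draw) are the book's; the helper functions `final_value` and `best_final_state` are hypothetical placeholders, since actually computing optimal play is intractable.

```python
def ideal_target_V(board, final_value, best_final_state):
    """Sketch of the ideal target function V for checkers.
    `final_value(b)` is a hypothetical helper returning 'win', 'loss',
    or 'draw' for a finished game and None otherwise; `best_final_state(b)`
    is assumed to return the final board reached from b under optimal play."""
    result = final_value(board)
    if result == "win":
        return 100
    if result == "loss":
        return -100
    if result == "draw":
        return 0
    # non-terminal: value of the best final state reachable from here
    return ideal_target_V(best_final_state(board), final_value, best_final_state)
```

Writing it this way makes the problem obvious: the "else" branch requires knowing the outcome of optimal play, which is exactly what we cannot compute. Hence the learned approximation below.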
Choosing a representation of this target knowledge
This is equivalent to choosing a modelling paradigm.
In checkers you could represent V̂ as
- a literal table from board states to values,
- a neural network, or
- a function of predefined board features.
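The third option, a linear function of hand-picked board features, is the one the book develops: V̂(b) = w0 + w1·x1 + … + wn·xn. A minimal sketch, with an illustrative (assumed) feature vector of piece counts:

```python
def v_hat(features, weights):
    """Linear representation sketch: V̂(b) = w0 + w1*x1 + ... + wn*xn.
    `features` are the predefined board features xi; `weights` holds
    the constant term w0 followed by one weight per feature."""
    w0, *ws = weights
    return w0 + sum(w * x for w, x in zip(ws, features))

# hypothetical features: [black pieces, red pieces, black kings, red kings]
opening_board = [12, 12, 0, 0]
weights = [0.5, 1.0, -1.0, 2.0, -2.0]
v_hat(opening_board, weights)  # → 0.5
```

The learning problem then reduces to finding good values for the weights, which is what the learning mechanism below does.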
A learning mechanism
Now, given a function V̂, we need
- training data to build V̂, of the form ⟨b, V_train(b)⟩,
- a method to evaluate this approximation - the objective function, and
- an algorithm to learn.
In the example it may be confusing how we define training data, as the training experience is the program playing itself. Here we use a trick to make the training data using the current approximation itself: V_train(b) ← V̂(Successor(b)), where Successor(b) is the next board state at which it is the program's turn to move.
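The book's training-value trick, V_train(b) ← V̂(Successor(b)), can be sketched as follows. I simplify by treating consecutive recorded boards as successors (in the book, Successor skips the opponent's reply), and the final board gets the actual game outcome.

```python
def make_training_values(game_boards, v_hat, final_outcome):
    """Sketch of the training-value trick: estimate the value of each
    board as the current model's estimate of the board that follows it,
    V_train(b) <- V̂(Successor(b)). The last board in the game gets the
    actual outcome (e.g. +100 for a win), anchoring the estimates."""
    examples = []
    for board, successor in zip(game_boards, game_boards[1:]):
        examples.append((board, v_hat(successor)))
    examples.append((game_boards[-1], final_outcome))
    return examples
```

The surprising part, which Mitchell notes, is that bootstrapping training values from the model's own (initially poor) estimates can still converge, because the true outcome at the end of each game keeps leaking backwards through the estimates.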
The evaluation process for a linear function is commonly something like mean squared error (MSE), measuring the error of the predictions over the training examples: E = Σ over ⟨b, V_train(b)⟩ of (V_train(b) − V̂(b))².
In the example we would choose an algorithm appropriate for the representation of V̂; for a linear function, the book uses the least mean squares (LMS) weight-update rule.
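The LMS rule from the book updates each weight in proportion to the prediction error: w_i ← w_i + η·(V_train(b) − V̂(b))·x_i. A minimal sketch of one update step (the learning-rate value is arbitrary):

```python
def lms_update(weights, features, v_train, lr=0.1):
    """One step of the LMS weight-update rule for a linear V̂:
    w_i <- w_i + lr * (V_train(b) - V̂(b)) * x_i.
    A constant feature x0 = 1 is prepended so the same rule also
    updates the intercept w0."""
    xs = [1.0] + list(features)
    prediction = sum(w * x for w, x in zip(weights, xs))
    error = v_train - prediction
    return [w + lr * error * x for w, x in zip(weights, xs)]
```

Each step nudges the weights to reduce the squared error on that one example, so repeated passes over the training data perform a stochastic descent on the MSE above.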
Summary
A lot of learning systems follow the design below.
Transclude of learning_system_diagram
- The performance system: Runs an instance of the task and returns a new solution.
- The critic: Uses that instance to generate training data.
- Generaliser: Uses this training data to update the model.
- Experiment generator: Decides what new experience would be best for the next round of training.
Whilst not all these components are required, training systems tend to follow this cycle.
In our example:
- The performance system would play a game starting from the provided state.
- The critic would transform that game into training data.
- The generaliser would run the learning algorithm on it.
- The experiment generator could either start a game from the beginning or from a state of particular confusion for the model.
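The four-component cycle can be sketched as a loop; the component functions here are placeholders standing in for the roles described above, not an API from the book.

```python
def training_cycle(initial_state, performance_system, critic,
                   generaliser, experiment_generator, rounds=3):
    """Sketch of the four-component learning cycle. Each round:
    run the task once, turn the resulting trace into training data,
    update the model, then choose the next experiment to run."""
    state = initial_state
    for _ in range(rounds):
        trace = performance_system(state)    # play one game from `state`
        data = critic(trace)                 # game trace -> training examples
        generaliser(data)                    # update the model on the examples
        state = experiment_generator(trace)  # pick the next starting state
    return state
```

For checkers, `performance_system` would be self-play using the current V̂, `critic` the training-value trick above, and `generaliser` repeated LMS updates.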
Learning problems can be summarised as searching through large "hypothesis spaces" to find the hypothesis that best fits the training examples.
Issues in Machine Learning
- What algorithms perform best for a given task?
- How much training data does it take to make a learning process converge?
- What is the most useful experience to learn from next?
- What function is worth learning for a given task?
- Can the learner automatically alter its representation to reduce restriction bias?