Lec05: Instance-Based Learning


INSTANCE-BASED LEARNING

• Instance-based learning methods simply store the training examples instead of
  learning an explicit description of the target function.
– Generalizing the examples is postponed until a new instance must be classified.
– When a new instance is encountered, its relationship to the stored examples is
examined in order to assign a target function value for the new instance.
• Instance-based learning includes nearest neighbor, locally weighted
regression and case-based reasoning methods.
• Instance-based methods are sometimes referred to as lazy learning
methods because they delay processing until a new instance must be
classified.
• A key advantage of lazy learning is that instead of estimating the target
function once for the entire instance space, these methods can estimate
it locally and differently for each new instance to be classified.
k-Nearest Neighbor Learning
• The k-Nearest Neighbor learning algorithm assumes all instances correspond to
  points in the n-dimensional space $\mathbb{R}^n$.
• The nearest neighbors of an instance are defined in terms of Euclidean
distance.
• The Euclidean distance between the instances $x_i = \langle x_{i1}, \ldots, x_{in} \rangle$
  and $x_j = \langle x_{j1}, \ldots, x_{jn} \rangle$ is:

  $d(x_i, x_j) = \sqrt{\sum_{r=1}^{n} (x_{ir} - x_{jr})^2}$
• For a given query instance $x_q$, $f(x_q)$ is estimated from the function values of
  the $k$ nearest neighbors of $x_q$.

k-Nearest Neighbor Learning
• Store all training examples $\langle x_i, f(x_i) \rangle$.
• Calculate $f(x_q)$ for a given query instance $x_q$ using its $k$ nearest neighbors.
• Nearest neighbor ($k = 1$):
  – Locate the nearest training example $x_n$, and estimate $f(x_q)$ as
    $f(x_q) \leftarrow f(x_n)$
• $k$-Nearest neighbor:
  – Locate the $k$ nearest training examples, and estimate $f(x_q)$ as follows.
  – If the target function is real-valued, take the mean of the $f$-values of the
    $k$ nearest neighbors:
    $f(x_q) \leftarrow \frac{1}{k} \sum_{i=1}^{k} f(x_i)$
  – If the target function is discrete-valued, take a vote among the $f$-values of
    the $k$ nearest neighbors.
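As a concrete illustration, here is a minimal sketch of the k-NN prediction step in
Python; the function and variable names are our own, not from the lecture, and the
inputs are assumed to be NumPy arrays:

```python
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x_q, k=3, real_valued=False):
    """Estimate f(x_q) from the k nearest training examples."""
    # Euclidean distance from the query point to every stored example
    dists = np.sqrt(((X_train - x_q) ** 2).sum(axis=1))
    nearest = np.argsort(dists)[:k]              # indices of the k nearest neighbors
    values = [y_train[i] for i in nearest]
    if real_valued:
        return np.mean(values)                   # mean of f-values (regression)
    return Counter(values).most_common(1)[0][0]  # majority vote (classification)
```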
When To Consider Nearest Neighbor
• Instances map to points in $\mathbb{R}^n$
• Fewer than 20 attributes per instance
• Lots of training data
• Advantages
– Training is very fast
– Learn complex target functions
– Can handle noisy data
– Does not lose any information
• Disadvantages
– Slow at query time
– Easily fooled by irrelevant attributes

Distance-Weighted kNN
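A common formulation weights the contribution of each of the $k$ neighbors by the
inverse square of its distance to the query, so closer neighbors count more:

  $\hat{f}(x_q) \leftarrow \frac{\sum_{i=1}^{k} w_i f(x_i)}{\sum_{i=1}^{k} w_i}$, where $w_i = \frac{1}{d(x_q, x_i)^2}$

A minimal sketch for the real-valued case, assuming inverse-square weights (our
choice of a standard variant) and returning the stored value exactly when the query
coincides with a training point:

```python
import numpy as np

def distance_weighted_knn(X_train, y_train, x_q, k=5):
    """Real-valued distance-weighted k-NN: closer neighbors get larger weights."""
    dists = np.sqrt(((X_train - x_q) ** 2).sum(axis=1))
    nearest = np.argsort(dists)[:k]
    if dists[nearest[0]] == 0:                    # exact match: avoid division by zero
        return y_train[nearest[0]]
    w = 1.0 / dists[nearest] ** 2                 # inverse-square distance weights
    return np.dot(w, y_train[nearest]) / w.sum()  # weighted mean of f-values
```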

Curse of Dimensionality
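The issue, in brief: nearest-neighbor distances are computed over all $n$ attributes,
so as $n$ grows (especially when many attributes are irrelevant) the distances between
points become increasingly similar and the "nearest" neighbor loses meaning. A tiny
demonstration of this distance concentration (our own example, not from the lecture):

```python
import numpy as np

rng = np.random.default_rng(0)
for n in (2, 10, 100, 1000):
    X = rng.random((500, n))                        # 500 random points in [0, 1]^n
    d = np.sqrt(((X[1:] - X[0]) ** 2).sum(axis=1))  # distances to the first point
    # As n grows, the nearest and farthest points become almost equally far away
    print(f"n={n:4d}  min/max distance ratio = {d.min() / d.max():.3f}")
```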

Locally Weighted Regression
• kNN forms a local approximation to $f$ for each query point $x_q$.
• Why not form an explicit approximation $\hat{f}(x)$ for the region surrounding $x_q$?
  $\Rightarrow$ Locally Weighted Regression
• Locally weighted regression uses nearby or distance-weighted training examples to
form this local approximation to f.
• We might approximate the target function in the neighborhood surrounding $x_q$ using
  a linear function, a quadratic function, or a multilayer neural network.
• The phrase "locally weighted regression" is called
– local because the function is approximated based only on data near the query
point,
– weighted because the contribution of each training example is weighted by its
distance from the query point, and
– regression because this is the term used widely in the statistical learning
community for the problem of approximating real-valued functions.

Locally Weighted Regression
• Given a new query instance $x_q$, the general approach in locally weighted
  regression is to construct an approximation $\hat{f}$ that fits the training
  examples in the neighborhood surrounding $x_q$.
• This approximation is then used to calculate $\hat{f}(x_q)$, which is output as the
  estimated target value for the query instance.

Locally Weighted Linear Regression

• The kernel function $K$ is a function of distance that is used to determine the
  weight of each training example.
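One standard instantiation fits a linear model around each query, with example
weights given by a Gaussian kernel. This sketch is our own: the closed-form weighted
least-squares fit and the bandwidth parameter tau are assumptions, not taken from the
lecture:

```python
import numpy as np

def lwlr_predict(X_train, y_train, x_q, tau=0.5):
    """Fit a linear model around x_q, weighting examples by a Gaussian kernel K."""
    m = X_train.shape[0]
    A = np.hstack([np.ones((m, 1)), X_train])  # design matrix with intercept column
    a_q = np.concatenate([[1.0], x_q])
    d2 = ((X_train - x_q) ** 2).sum(axis=1)    # squared distances to the query
    w = np.exp(-d2 / (2 * tau ** 2))           # kernel weights K(d(x_q, x))
    WA = A * w[:, None]                        # rows of A scaled by their weights
    # Weighted least squares: theta = (A^T W A)^{-1} A^T W y
    theta = np.linalg.solve(A.T @ WA, A.T @ (w * y_train))
    return a_q @ theta
```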

Radial Basis Functions
• One approach to function approximation that is closely related to distance-weighted
regression and also to artificial neural networks is learning with radial basis functions.
• The learned hypothesis is a function of the form

  $\hat{f}(x) = w_0 + \sum_{u=1}^{k} w_u \, K_u(d(x_u, x))$

  where each kernel function $K_u$ is commonly chosen to be the Gaussian
  $K_u(d(x_u, x)) = e^{-\frac{1}{2\sigma_u^2} d^2(x_u, x)}$.
Radial Basis Function Networks

• Each hidden unit produces an activation determined by a Gaussian function centered
  at some instance $x_u$. Therefore, its activation will be close to zero unless the
  input $x$ is near $x_u$.
• The output unit produces a linear combination of the hidden unit activations.
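A minimal sketch of such a network, assuming the Gaussian centers are fixed in
advance (e.g., chosen training instances) and the output weights are fit by least
squares; all names below are our own:

```python
import numpy as np

def rbf_features(X, centers, sigma=1.0):
    """Hidden-layer activations: one Gaussian unit per center."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2 * sigma ** 2))

def rbf_fit(X_train, y_train, centers, sigma=1.0):
    """Fit the linear output weights (including bias w_0) by least squares."""
    H = rbf_features(X_train, centers, sigma)
    H = np.hstack([np.ones((H.shape[0], 1)), H])  # bias column for w_0
    w, *_ = np.linalg.lstsq(H, y_train, rcond=None)
    return w

def rbf_predict(X, centers, w, sigma=1.0):
    H = rbf_features(X, centers, sigma)
    H = np.hstack([np.ones((H.shape[0], 1)), H])
    return H @ w                                  # linear combination of activations
```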
Case-based reasoning
• Instance-based methods
– lazy
– classification based on classifications of near (similar) instances
– data: points in n-dim. space
• Case-based reasoning
– as above, but data represented in symbolic form
• New distance metrics required

Lazy & eager learning
• Lazy: generalize at query time
– kNN, CBR
• Eager: generalize before seeing query
– Radial basis, ID3, …
• Difference
– eager must create a global approximation
– lazy can create many local approximations
– lazy can represent more complex functions using the same hypothesis space H
  (e.g., H = linear functions)
