Lec05 InstanceBased
• For a given query instance xq, f(xq) is estimated from the function values of
the k nearest neighbors of xq
Machine Learning 2
k-Nearest Neighbor Learning
• Store all training examples <xi,f(xi)>
• Calculate f(xq) for a given query instance xq using k-nearest neighbor
• Nearest neighbor (k = 1):
– Locate the nearest training example xn, and estimate f(xq) as
– f(xq) ← f(xn)
• k-Nearest neighbor:
– Locate the k nearest training examples, and estimate f(xq) as follows:
– If the target function is real-valued, take the mean of the f-values of the k
nearest neighbors:
f(xq) ← (1/k) · Σi=1..k f(xi)
– If the target function is discrete-valued, take a majority vote among the
f-values of the k nearest neighbors.
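The two cases above can be sketched as follows. This is a minimal illustration, not the slide's own code; `knn_classify` and `knn_regress` are hypothetical names, and Euclidean distance is assumed:

```python
import math
from collections import Counter

def euclidean(a, b):
    """Standard distance metric assumed for nearest-neighbor search."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def knn_classify(xq, examples, k=3):
    """Discrete-valued target: majority vote among the f-values of the
    k nearest stored training examples <x_i, f(x_i)>."""
    nearest = sorted(examples, key=lambda ex: euclidean(xq, ex[0]))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def knn_regress(xq, examples, k=3):
    """Real-valued target: mean of the f-values of the k nearest neighbors."""
    nearest = sorted(examples, key=lambda ex: euclidean(xq, ex[0]))[:k]
    return sum(fx for _, fx in nearest) / k
```

Note that all work happens at query time: "training" is just storing the examples.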
When To Consider Nearest Neighbor
• Instances map to points in Rn
• Fewer than 20 attributes per instance
• Lots of training data
• Advantages
– Training is very fast
– Learn complex target functions
– Can handle noisy data
– Does not lose any information (all training examples are retained)
• Disadvantages
– Slow at query time
– Easily fooled by irrelevant attributes
Distance-Weighted kNN
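The slide's equations are not reproduced in this text. A minimal sketch of the standard distance-weighted variant, assuming inverse-square distance weights (so closer neighbors count more, and an exact match wins outright); the function name is hypothetical:

```python
import math
from collections import defaultdict

def distance_weighted_knn(xq, examples, k=3):
    """Classify xq by a distance-weighted vote among its k nearest
    neighbors, each neighbor weighted by 1 / d(xq, xi)^2."""
    def dist(a, b):
        return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))
    neighbors = sorted(examples, key=lambda ex: dist(xq, ex[0]))[:k]
    votes = defaultdict(float)
    for xi, label in neighbors:
        d = dist(xq, xi)
        if d == 0:
            return label  # query coincides with a training point
        votes[label] += 1.0 / d ** 2
    return max(votes, key=votes.get)
```

With distance weighting it is also reasonable to let k range over all training examples, since distant examples contribute almost nothing.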
Curse of Dimensionality
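The slide's figures are not reproduced in this text. One standard way to illustrate the problem kNN faces in high dimensions is distance concentration: as dimensions are added, the nearest and farthest points become nearly equidistant, so "nearest" carries little information. A small demo under that assumption (function name is hypothetical):

```python
import math
import random

def distance_concentration(dim, n_points=200, seed=0):
    """Ratio of nearest to farthest distance from the origin for
    uniformly random points in [0,1]^dim; approaches 1 as dim grows."""
    rng = random.Random(seed)
    pts = [[rng.random() for _ in range(dim)] for _ in range(n_points)]
    dists = [math.sqrt(sum(x * x for x in p)) for p in pts]
    return min(dists) / max(dists)

# The contrast between nearest and farthest shrinks with dimension.
for dim in (1, 10, 100):
    print(dim, round(distance_concentration(dim), 3))
```

This is also why irrelevant attributes hurt kNN: they add dimensions that dilute the distance signal from the relevant ones.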
Locally Weighted Regression
• kNN forms a local approximation to f for each query point xq
• Why not form an explicit approximation f(x) for the region surrounding xq?
Locally Weighted Regression
• Locally weighted regression uses nearby or distance-weighted training examples to
form this local approximation to f.
• We might approximate the target function in the neighborhood surrounding xq using a
linear function, a quadratic function, or a multilayer neural network.
• The phrase "locally weighted regression" gets its name as follows:
– local because the function is approximated based only on data near the query
point,
– weighted because the contribution of each training example is weighted by its
distance from the query point, and
– regression because this is the term used widely in the statistical learning
community for the problem of approximating real-valued functions.
Locally Weighted Regression
• Given a new query instance xq, the general approach in locally
weighted regression is to construct an approximation f that fits the
training examples in the neighborhood surrounding xq.
• This approximation is then used to calculate the value f(xq), which is
output as the estimated target value for the query instance.
Locally Weighted Linear Regression
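The slide's derivation is not reproduced in this text. A minimal sketch, assuming the 1-D linear hypothesis f(x) = w0 + w1·x fitted anew at each query by Gaussian-weighted least squares (kernel width `tau` and the function names are assumptions for this illustration):

```python
import math

def gaussian_kernel(dist, tau):
    """Training examples near the query get weight near 1; far ones near 0."""
    return math.exp(-dist * dist / (2 * tau * tau))

def lwlr_predict(xq, X, y, tau=0.5):
    """Fit f(x) = w0 + w1*x by weighted least squares centered on xq,
    then return the fitted value at xq."""
    w = [gaussian_kernel(abs(x - xq), tau) for x in X]
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, X)) / sw       # weighted mean of x
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw       # weighted mean of y
    num = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, X, y))
    den = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, X))
    w1 = num / den if den else 0.0
    w0 = my - w1 * mx
    return w0 + w1 * xq
```

A new local fit is computed for every query, which is why the approach remains a lazy method despite using an explicit parametric form.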
Radial Basis Functions
• One approach to function approximation that is closely related to distance-weighted
regression and also to artificial neural networks is learning with radial basis functions.
• The learned hypothesis is a function of the form
f(x) = w0 + Σu=1..k wu · Ku(d(xu, x))
where each xu is an instance from X and the kernel function Ku(d(xu, x))
decreases as the distance d(xu, x) increases.
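The hypothesis is a constant w0 plus a weighted sum of kernel outputs, one kernel centered on each instance xu. A minimal sketch, assuming Gaussian kernels in 1-D and externally supplied weights (training the wu, typically by linear regression over the kernel outputs, is a separate step; function names are hypothetical):

```python
import math

def gaussian(dist, sigma=1.0):
    """K_u(d) = exp(-d^2 / (2*sigma^2)): largest at the center, falls off with distance."""
    return math.exp(-dist * dist / (2 * sigma * sigma))

def rbf_predict(x, centers, weights, w0=0.0, sigma=1.0):
    """f(x) = w0 + sum_u w_u * K_u(d(x_u, x)) with Gaussian kernels."""
    return w0 + sum(wu * gaussian(abs(x - xu), sigma)
                    for wu, xu in zip(weights, centers))
```

Each kernel contributes only near its center, so the learned function is a smooth blend of local bumps.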
Radial Basis Function Networks
Lazy & eager learning
• Lazy: generalize at query time
– kNN, CBR
• Eager: generalize before seeing query
– Radial basis, ID3, …
• Difference
– eager must commit to a single global approximation
– lazy can form many local approximations
– lazy can represent more complex functions using the same hypothesis
space H (e.g., H = linear functions)