Live 1 - AI - K Nearest Neighbors
k - Nearest Neighbors
Pham Viet Cuong - Dept. Control Eng. & Automation, FEEE, HCMUT
k - Nearest Neighbors
• Uses ‘feature similarity’ to predict the value of a new data point
• The new data point is assigned a value based on how closely it matches the points in the training set
k - Nearest Neighbors
• kNN requires:
  - An integer k
  - A set of labeled examples (the training data)
  - A metric to measure “closeness”
• Example 1: Classification (a worked sketch follows this list)
  - 2D, 2 classes (sea bass vs. salmon)
  - k = 3
  - Euclidean distance
  - The 3 nearest neighbors are 2 sea bass and 1 salmon, so the majority vote labels the test point as sea bass
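To make Example 1 concrete, the sketch below uses a hypothetical toy training set (the actual data points behind the slide's figure are not given); it finds the 3 nearest neighbors of a test point under Euclidean distance and shows the 2-sea-bass / 1-salmon vote.

```python
import math

# Hypothetical 2-D training set (feature_1, feature_2) -> label;
# these numbers are illustrative, not taken from the slide.
training_data = [
    ((4.0, 2.0), "sea bass"),
    ((4.5, 2.5), "sea bass"),
    ((5.0, 1.0), "sea bass"),
    ((2.5, 4.0), "salmon"),
    ((1.5, 6.0), "salmon"),
]
test_point = (3.5, 3.0)
k = 3

def euclidean(a, b):
    """Euclidean distance between two 2-D points."""
    return math.sqrt((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2)

# Sort training points by distance to the test point and keep the k nearest.
neighbors = sorted(training_data, key=lambda item: euclidean(item[0], test_point))[:k]
print([label for _, label in neighbors])
# -> ['sea bass', 'sea bass', 'salmon']  => majority vote: sea bass
```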
k - Nearest Neighbors
• Example 2: Classification
  - 2D
  - 3 classes
  - k = 5
  - Euclidean distance
k - Nearest Neighbors
• Example 3: Classification
  - Three-class 2D problem
  - Non-linearly separable
  - k = 5
  - Euclidean distance
k - Nearest Neighbors
• Example 4: Classification
  - Three-class 2D problem
  - Non-linearly separable
  - k = 5
  - Euclidean distance
k - Nearest Neighbors
• Algorithm (a code sketch follows these steps)
  - Step 1: Load the training data and the test data
  - Step 2: Choose k
  - Step 3:
    · Calculate the distance between the test point and every training point
    · Identify the k nearest neighbors
    · Use the class labels of the nearest neighbors to determine the class label of the test point (e.g., by majority vote)
  - Step 4: End
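A minimal Python sketch of Steps 1–4, assuming numeric feature vectors, Euclidean distance, and a majority vote; the function and variable names are illustrative, not from the slides.

```python
import math
from collections import Counter

def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_classify(train_points, train_labels, test_point, k=3):
    """Classify one test point by majority vote over its k nearest neighbors."""
    # Step 3: distance from the test point to every training point
    distances = [euclidean(p, test_point) for p in train_points]
    # Indices of the k nearest neighbors
    nearest = sorted(range(len(distances)), key=lambda i: distances[i])[:k]
    # Majority vote over the neighbors' class labels
    votes = Counter(train_labels[i] for i in nearest)
    return votes.most_common(1)[0][0]

# Step 1: load (here, hard-code) training and test data; Step 2: choose k
X_train = [(1.0, 1.0), (1.2, 0.8), (5.0, 5.0), (5.2, 4.8)]
y_train = ["A", "A", "B", "B"]
print(knn_classify(X_train, y_train, (1.1, 0.9), k=3))  # -> "A"
```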
k - Nearest Neighbors
• How to choose k?
  - If an infinite number of samples were available, the larger k is, the better
  - In practice, the number of samples is finite
  - Rule of thumb: k = sqrt(n), where n is the number of training examples (sketched below)
  - k = 1 is efficient, but can be sensitive to “noise”
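A small sketch of the k = sqrt(n) rule of thumb; rounding k up to an odd value (to avoid voting ties in a two-class problem) is an extra assumption, not stated on the slide.

```python
import math

def rule_of_thumb_k(n_samples):
    """Pick k near sqrt(n); forcing k to be odd (two-class tie-breaking) is an assumption."""
    k = max(1, round(math.sqrt(n_samples)))
    return k if k % 2 == 1 else k + 1

print(rule_of_thumb_k(100))  # sqrt(100) = 10 -> bumped to 11
```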
k - Nearest Neighbors
• How to choose k?
  - A larger k may improve performance, but too large a k destroys locality
  - Smaller k: higher variance (less stable)
  - Larger k: higher bias (less precise)
k - Nearest Neighbors
• How well does kNN work?
  - kNN works well when plenty of samples are available
k - Nearest Neighbors
• Common distance metrics (definitions below):
  - Minkowski distance
  - Manhattan distance
  - Euclidean distance
  - Chebyshev distance
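For feature vectors x, y ∈ R^d, the standard definitions of these four metrics are (the slide itself does not reproduce the formulas):

```latex
\begin{align*}
d_{\text{Minkowski}}(x, y) &= \Big( \sum_{i=1}^{d} |x_i - y_i|^p \Big)^{1/p} \\
d_{\text{Manhattan}}(x, y) &= \sum_{i=1}^{d} |x_i - y_i|          && (p = 1) \\
d_{\text{Euclidean}}(x, y) &= \sqrt{\sum_{i=1}^{d} (x_i - y_i)^2} && (p = 2) \\
d_{\text{Chebyshev}}(x, y) &= \max_{i} |x_i - y_i|                && (p \to \infty)
\end{align*}
```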
k - Nearest Neighbors
• Best distance?
k - Nearest Neighbors
• Euclidean distance
k - Nearest Neighbors
• Feature normalization (sketch below)
  - Linearly scale each feature to zero mean and unit variance
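A minimal pure-Python sketch of this scaling (a per-feature z-score); in practice the means and standard deviations would typically be estimated on the training set and reused to scale test points.

```python
from statistics import mean, pstdev

def zscore_columns(X):
    """Scale each feature (column) of X to zero mean and unit variance."""
    columns = list(zip(*X))                            # rows -> columns
    mus = [mean(col) for col in columns]               # per-feature mean
    sigmas = [pstdev(col) or 1.0 for col in columns]   # per-feature std (guard against 0)
    return [[(v - mu) / s for v, mu, s in zip(row, mus, sigmas)] for row in X]

X = [[180.0, 2.1], [160.0, 5.5], [170.0, 3.3]]  # e.g. two features on very different scales
print(zscore_columns(X))
```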
k - Nearest Neighbors
• Feature weighting
  - Scale each feature by a weight w_i reflecting its importance for classification (see the weighted distance below)
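One common way to write this is a weighted Euclidean distance (a standard formulation; the slide does not show the exact form it uses):

```latex
d_w(x, y) = \sqrt{\sum_{i=1}^{d} w_i \,(x_i - y_i)^2}, \qquad w_i \ge 0
```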
k - Nearest Neighbors
• Computational complexity
  - The basic kNN algorithm stores all training examples
  - Each query requires a distance computation against every stored example, which is very expensive for a large number of samples
k - Nearest Neighbors
• kNN is a lazy learning algorithm
  - It defers data processing until it receives a request to classify unlabeled data
  - It replies to a request for information by combining its stored training data
  - It discards the constructed answer and any intermediate results
  - Lazy algorithms have lower computational cost than eager algorithms during training, but greater storage requirements and higher computational cost at recall time
k - Nearest Neighbors
• Advantages
  - Can be applied to data from any distribution
  - Very simple and intuitive
  - Gives good classification if the number of samples is large enough
  - Uses local information, which can yield highly adaptive behavior
  - Very easy to parallelize
• Disadvantages
  - Choosing k can be tricky
  - The test stage is computationally expensive
  - Needs a large number of samples for accuracy
  - Large storage requirements
  - Highly susceptible to the curse of dimensionality
k - Nearest Neighbors
• Sources
  - https://www.csd.uwo.ca/courses/CS4442b/L3-ML-knn.pdf
  - http://research.cs.tamu.edu/prism/lectures/pr/pr_l8.pdf
  - http://web.iitd.ac.in/~bspanda/KNN%20presentation.pdf
  - V. B. Surya Prasath et al., “Effects of Distance Measure Choice on KNN Classifier Performance: A Review,” Big Data, vol. 7, doi: 10.1089/big.2018.0175