Introduction To Kmeans
Recall the first property of clusters – it states that the points within a cluster should be
similar to each other. So, our aim here is to minimize the distance between the
points within a cluster.
We have these 8 points and we want to apply k-means to create clusters for these
points. Here’s how we can do it.
Here you can see that the points which are closer to the red point are assigned to the
red cluster whereas the points which are closer to the green point are assigned to the
green cluster.
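The assignment step described above can be sketched in a few lines of Python. The coordinates below are hypothetical, chosen only to illustrate 8 points and two starting centroids (the "red" and "green" points); they are not taken from the text.

```python
from math import dist

# Hypothetical coordinates for the 8 points (illustrative values only)
points = [(1.0, 1.0), (1.5, 2.0), (2.5, 3.5), (5.0, 7.0),
          (3.5, 5.0), (4.5, 5.0), (3.5, 4.5), (1.0, 0.6)]

# Two starting centroids: the "red" point and the "green" point
centroids = [(1.0, 1.0), (5.0, 7.0)]

# Assign each point to the cluster of its nearest centroid
# (0 = red cluster, 1 = green cluster), using Euclidean distance
labels = [min(range(len(centroids)), key=lambda j: dist(p, centroids[j]))
          for p in points]
print(labels)
```

Points near (1, 1) land in the red cluster and points near (5, 7) land in the green cluster, exactly as described above.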
Computing the centroids and assigning every point to a cluster based on its distance from the nearest centroid together make up a single iteration. But wait – when should we stop this process? It can’t run till eternity, right?
We can stop the algorithm if the centroids of newly formed clusters are not changing.
Even after multiple iterations, if we are getting the same centroids for all the clusters, we
can say that the algorithm is not learning any new pattern and it is a sign to stop the
training.
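This first stopping criterion can be expressed as a simple check between the centroids of two consecutive iterations. The helper name and tolerance below are assumptions for illustration:

```python
def centroids_unchanged(old, new, tol=1e-9):
    """True when no centroid has moved more than `tol` since the last iteration."""
    return all(abs(a - b) < tol
               for old_c, new_c in zip(old, new)
               for a, b in zip(old_c, new_c))

old = [(2.0, 2.0), (4.0, 5.0)]
new = [(2.0, 2.0), (4.0, 5.0)]  # identical after another iteration
print(centroids_unchanged(old, new))  # True: safe to stop training
```

A small tolerance is used rather than exact equality because centroid coordinates are floating-point averages and may jitter negligibly between iterations.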
Another clear sign that we should stop the training process is when the points remain in the same cluster even after running the algorithm for multiple iterations.
Finally, we can stop the training when the maximum number of iterations is reached. Suppose we have set the maximum number of iterations to 100. The process will then run for at most 100 iterations before stopping.
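Putting all three stopping criteria together, a minimal k-means loop might look like the sketch below. The initialization (picking the first k points as centroids) is a simplifying assumption for brevity; real implementations pick starting centroids randomly or via k-means++.

```python
from math import dist

def kmeans(points, k=2, max_iters=100):
    """Minimal k-means sketch combining all three stopping criteria."""
    # Naive but deterministic initialization: first k points as centroids
    centroids = [points[i] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(max_iters):                     # stop 3: max iterations reached
        # Assignment step: each point joins its nearest centroid's cluster
        new_labels = [min(range(k), key=lambda j: dist(p, centroids[j]))
                      for p in points]
        # Update step: each centroid becomes the mean of its cluster's points
        new_centroids = []
        for j in range(k):
            members = [p for p, lbl in zip(points, new_labels) if lbl == j]
            if not members:                        # keep an empty cluster's centroid
                new_centroids.append(centroids[j])
                continue
            new_centroids.append(tuple(sum(coord) / len(members)
                                       for coord in zip(*members)))
        # stop 1 & 2: centroids and cluster assignments both unchanged
        if new_centroids == centroids and new_labels == labels:
            break
        centroids, labels = new_centroids, new_labels
    return centroids, labels
```

On well-separated data the loop typically breaks long before the iteration cap, because the centroids and assignments stop changing after only a few passes.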