|written 5 months ago by||• modified 5 months ago|
The k-means clustering method is an unsupervised machine learning technique used to identify clusters of data objects in a dataset.
There are many different types of clustering methods, but k-means is one of the oldest and most approachable.
These traits make implementing k-means clustering in Python reasonably straightforward, even for novice programmers and data scientists.
In clustering, the goal is to identify which observations are alike and group them accordingly.
With this in mind, k-means is one of the simplest and most widely used clustering methods for partitioning a dataset into k groups.
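To make the idea concrete, here is a minimal sketch of the usual assign-then-update loop (often called Lloyd's algorithm), written in plain Python. The function name `kmeans`, its parameters, and the toy data are assumptions made for illustration only; a real project would more likely rely on an established library implementation.

```python
import math
import random


def kmeans(points, k, max_iter=100, seed=0):
    """Minimal k-means sketch on a list of equal-length numeric tuples."""
    rng = random.Random(seed)
    # Initialise centroids with k distinct points chosen at random.
    centroids = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid (Euclidean).
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid becomes the mean of its assigned points.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                # An empty cluster keeps its previous centroid.
                new_centroids.append(centroids[c])
        # Stop once the centroids no longer move.
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return labels, centroids


# Hypothetical toy data: two well-separated groups of three points each.
data = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (8.0, 8.0), (8.2, 7.9), (7.9, 8.1)]
labels, centroids = kmeans(data, k=2)
print(labels)
print(centroids)
```

On this toy data the first three points end up in one cluster and the last three in the other. Which cluster gets index 0 depends on the random initialisation.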
Disadvantages of K-means Clustering:
The algorithm requires the number of clusters (centers), k, to be specified in advance.
The algorithm breaks down on non-linearly separable data and does not handle noisy data and outliers well.
It is not directly applicable to categorical data, since it only works when a mean can be defined.
Also, Euclidean distance can weight the underlying features unequally, for example when they are measured on different scales.
The algorithm is not invariant to non-linear transformations, i.e. it gives different results for different representations of the same data.
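Two of these points, the sensitivity to outliers and the unequal weighting by Euclidean distance, can be illustrated with a small sketch. The helper names (`nearest`, `mean`, `rescale`) and all numbers below are made up for the example; the rescaling simply divides the second feature by 1000 and stands in for whatever normalisation one would actually choose.

```python
import math


def nearest(point, centroids):
    """Return the index of the centroid closest to point (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: math.dist(point, centroids[i]))


def mean(points):
    """Component-wise mean of a list of equal-length tuples."""
    return tuple(sum(col) / len(points) for col in zip(*points))


# 1) Outliers: a single extreme value drags the centroid away from the bulk.
cluster = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1)]
centroid_clean = mean(cluster)
centroid_with_outlier = mean(cluster + [(50.0, 50.0)])

# 2) Feature scale: the second feature is in much larger units than the first,
#    so it dominates the distance until it is rescaled.
centroids_raw = [(0.0, 0.0), (1.0, 1000.0)]
point_raw = (0.9, 400.0)


def rescale(p):
    # Hypothetical rescaling: divide the second feature by 1000.
    return (p[0], p[1] / 1000.0)


raw_choice = nearest(point_raw, centroids_raw)
scaled_choice = nearest(rescale(point_raw), [rescale(c) for c in centroids_raw])

print(centroid_clean, centroid_with_outlier)
print(raw_choice, scaled_choice)
```

Here the single outlier moves the centroid from roughly (1, 1) to beyond (13, 13). The same point is assigned to centroid 0 on the raw features and to centroid 1 after rescaling, which is why features are usually normalised before running k-means.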