Fuzzy clustering - AbsoluteAstronomy.com

Fuzzy clustering is a class of algorithms for cluster analysis in which the allocation of data points to clusters is not "hard" (all-or-nothing) but "fuzzy" in the same sense as fuzzy logic.

Explanation of clustering

Data clustering is the process of dividing data elements into classes or clusters so that items in the same class are as similar as possible, and items in different classes are as dissimilar as possible. Depending on the nature of the data and the purpose for which clustering is being used, different measures of similarity may be used to place items into classes, where the similarity measure controls how the clusters are formed. Examples of measures that can be used in clustering include distance, connectivity, and intensity.

In hard clustering, data is divided into distinct clusters, where each data element belongs to exactly one cluster. In fuzzy clustering (also referred to as soft clustering), data elements can belong to more than one cluster, and associated with each element is a set of membership levels. These indicate the strength of the association between that data element and a particular cluster. Fuzzy clustering is a process of assigning these membership levels, and then using them to assign data elements to one or more clusters.
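The difference between the two kinds of assignment can be illustrated with a small sketch. The membership levels below are invented for illustration, and the function names harden and clusters_above are hypothetical; the sketch also assumes that each element's membership levels sum to 1, which the text above does not require.

```python
# Hypothetical membership levels for three data elements and two clusters.
# Each list holds one element's levels; they are assumed here to sum to 1.
memberships = {
    "a": [0.9, 0.1],
    "b": [0.5, 0.5],
    "c": [0.2, 0.8],
}

def harden(levels):
    """Hard assignment: index of the cluster with the highest level
    (ties go to the lowest index)."""
    return max(range(len(levels)), key=lambda j: levels[j])

def clusters_above(levels, threshold):
    """Soft assignment: every cluster whose level reaches the threshold."""
    return [j for j, u in enumerate(levels) if u >= threshold]

hard = {name: harden(levels) for name, levels in memberships.items()}
soft = {name: clusters_above(levels, 0.4) for name, levels in memberships.items()}

print(hard)  # each element in exactly one cluster
print(soft)  # element "b" is placed in both clusters
```

In this sketch the hard assignment forces element "b" into a single cluster even though its membership levels are equal, while the thresholded soft assignment keeps it in both.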

One of the most widely used fuzzy clustering algorithms is the Fuzzy C-Means (FCM) algorithm (Bezdek 1981). The FCM algorithm attempts to partition a finite collection of n elements $X = \{x_1, \dots, x_n\}$ into a collection of c fuzzy clusters with respect to some given criterion. Given a finite set of data, the algorithm returns a list of c cluster centres $C = \{c_1, \dots, c_c\}$ and a partition matrix $U = (u_{ij})$, with $u_{ij} \in [0, 1]$, $i = 1, \dots, n$, $j = 1, \dots, c$, where each element u_ij tells the degree to which element x_i belongs to cluster c_j. Like the k-means algorithm, the FCM aims to minimize an objective function. The standard function is:

$$J_m(U, C) = \sum_{i=1}^{n} \sum_{j=1}^{c} u_{ij}^{m} \, \lVert x_i - c_j \rVert^2$$

which differs from the k-means objective function by the addition of the membership values u_ij and the fuzzifier m. The fuzzifier m determines the level of cluster fuzziness. A large m results in smaller memberships u_ij and hence fuzzier clusters. In the limit m → 1, the memberships u_ij converge to 0 or 1, which implies a crisp partitioning. In the absence of experimentation or domain knowledge, m is commonly set to 2. The basic FCM algorithm takes as input n data points (x_1, ..., x_n) to be clustered, a number c of clusters with centres (c_1, ..., c_c), and the level of cluster fuzziness m.
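The relationship to the k-means objective can be checked numerically. The sketch below evaluates the standard FCM objective, the sum over all points x_i and clusters c_j of u_ij^m times the squared distance between x_i and c_j, and compares it with the k-means sum of squared distances. The data, the memberships, and the function names fcm_objective and kmeans_objective are made up for illustration; Euclidean distance is assumed.

```python
import math

def fcm_objective(points, centres, u, m):
    """Sum over points i and clusters j of u[i][j]**m * ||x_i - c_j||**2."""
    return sum(
        (u[i][j] ** m) * math.dist(x, c) ** 2
        for i, x in enumerate(points)
        for j, c in enumerate(centres)
    )

def kmeans_objective(points, centres, labels):
    """Sum of squared distances from each point to its assigned centre."""
    return sum(math.dist(x, centres[k]) ** 2 for x, k in zip(points, labels))

# Hypothetical toy data: four points on a line and two centres.
points = [(0.0,), (1.0,), (4.0,), (5.0,)]
centres = [(0.5,), (4.5,)]

# Crisp memberships (only 0 or 1) reproduce the k-means objective.
crisp_u = [[1, 0], [1, 0], [0, 1], [0, 1]]
crisp_value = fcm_objective(points, centres, crisp_u, m=2)
kmeans_value = kmeans_objective(points, centres, labels=[0, 0, 1, 1])

# Fuzzy memberships spread each point's weight over both clusters.
fuzzy_u = [[0.9, 0.1], [0.9, 0.1], [0.1, 0.9], [0.1, 0.9]]
fuzzy_value = fcm_objective(points, centres, fuzzy_u, m=2)

print(crisp_value, kmeans_value, fuzzy_value)
```

With memberships restricted to 0 and 1 the two objectives coincide for any m, which is one way to see k-means as the crisp special case of the FCM objective.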

Fuzzy c-means clustering

In fuzzy clustering, each point has a degree of belonging to clusters, as in fuzzy logic, rather than belonging completely to just one cluster. Thus, points on the edge of a cluster may be in the cluster to a lesser degree than points in the center of the cluster. An overview and comparison of different fuzzy clustering algorithms is available.

Any point x has a set of coefficients w_k(x) giving its degree of being in the kth cluster. With fuzzy c-means, the centroid of a cluster is the mean of all points, weighted by their degree of belonging to the cluster raised to the power m:

$$c_k = \frac{\sum_x w_k(x)^m \, x}{\sum_x w_k(x)^m}$$

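The degree of belonging, w_k(x), is related inversely to the distance from x to the cluster centre as calculated on the previous pass. It also depends on a parameter m that controls how much weight is given to the closest centre. Writing d(c_j, x) for the distance from x to the centre of the jth cluster:

$$w_k(x) = \frac{1}{\sum_j \left( \frac{d(c_k, x)}{d(c_j, x)} \right)^{2/(m-1)}}$$

The fuzzy c-means algorithm is very similar to the k-means algorithm: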

Choose a number of clusters.
Assign randomly to each point coefficients for being in the clusters.
Repeat until the algorithm has converged (that is, the coefficients' change between two iterations is no more than ε, the given sensitivity threshold):
- Compute the centroid for each cluster, using the formula above.
- For each point, compute its coefficients of being in the clusters, using the formula above.
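
The steps above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not a reference implementation: it assumes Euclidean distance, the standard FCM updates (centroids weighted by coefficients raised to the power m, coefficients normalised to sum to 1 for each point), and m > 1. The function names, the max_iter safeguard, and the seed parameter are choices made for this sketch.

```python
import math
import random

def update_centroids(points, w, m):
    """Centroid of each cluster: mean of all points weighted by w[i][k]**m."""
    n_clusters, dim = len(w[0]), len(points[0])
    centroids = []
    for k in range(n_clusters):
        weights = [row[k] ** m for row in w]
        total = sum(weights)
        centroids.append(tuple(
            sum(wt * p[d] for wt, p in zip(weights, points)) / total
            for d in range(dim)
        ))
    return centroids

def update_memberships(points, centroids, m):
    """Coefficients inversely related to distance, summing to 1 per point."""
    exponent = 2.0 / (m - 1.0)
    w = []
    for p in points:
        dists = [math.dist(p, c) for c in centroids]
        if min(dists) == 0.0:
            # The point coincides with a centroid: full membership there.
            hits = [1.0 if d == 0.0 else 0.0 for d in dists]
            w.append([h / sum(hits) for h in hits])
        else:
            w.append([
                1.0 / sum((dk / dj) ** exponent for dj in dists)
                for dk in dists
            ])
    return w

def fuzzy_c_means(points, n_clusters, m=2.0, eps=1e-6, max_iter=300, seed=0):
    rng = random.Random(seed)
    # Random initial coefficients, normalised so each row sums to 1.
    w = []
    for _ in points:
        row = [rng.random() + 1e-9 for _ in range(n_clusters)]
        s = sum(row)
        w.append([v / s for v in row])
    for _ in range(max_iter):
        centroids = update_centroids(points, w, m)
        new_w = update_memberships(points, centroids, m)
        # Largest change of any coefficient between two iterations.
        change = max(
            abs(a - b)
            for new_row, old_row in zip(new_w, w)
            for a, b in zip(new_row, old_row)
        )
        w = new_w
        if change <= eps:
            break
    return centroids, w

# Hypothetical data: two well-separated groups of three points each.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (8.0, 8.0), (8.0, 9.0), (9.0, 8.0)]
centroids, w = fuzzy_c_means(points, n_clusters=2)
for p, row in zip(points, w):
    print(p, [round(v, 3) for v in row])
```

Because the initial coefficients are random, a different seed may swap the order of the clusters or, on harder data, lead to a different result.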

Like k-means, the algorithm minimizes intra-cluster variance, and it has the same problems: the minimum found is a local minimum, and the results depend on the initial choice of weights.

The expectation-maximization algorithm is a more statistically formalized method which includes some of these ideas, such as partial membership in classes.

Fuzzy c-means has been an important tool in image processing for clustering objects in an image. In the 1970s, mathematicians introduced a spatial term into the FCM algorithm to improve the accuracy of clustering under noise.

The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.
