FCM Data Clustering

Cluster data using the fuzzy c-means algorithm in the Live Editor

Since R2025a

Description

The FCM Data Clustering task clusters data using the fuzzy c-means (FCM) algorithm, where each data point belongs to a cluster to a degree that is specified by a membership grade. For example, a data point that lies close to the center of a cluster has a high degree of membership in that cluster, and a data point that lies far from the center of a cluster has a low degree of membership in that cluster. The FCM Data Clustering task automatically generates MATLAB® code for your live script. For more information about Live Editor tasks, see Add Interactive Tasks to a Live Script.

The task returns these output arguments from the fcm function:

centers — Cluster centers
U — Fuzzy partition matrix indicating the degree of membership of each data point in each cluster
objFcn — Objective function values for each clustering iteration
info — Detailed clustering results
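
As a rough command-line counterpart to the task, the following sketch calls fcm directly and collects the same four outputs. It assumes that the Fuzzy Logic Toolbox functions fcm and fcmOptions are available and that `load fcmdata` provides the fcmdata3 variable used in the example on this page. The code that the task generates may differ.

```matlab
% Sketch only: assumes Fuzzy Logic Toolbox (fcm, fcmOptions) is installed.
load fcmdata                           % sample data sets used in the example on this page
options = fcmOptions(Verbose=false);   % other options keep their default values
[centers,U,objFcn,info] = fcm(fcmdata3,options);

size(centers)   % one row per cluster, one column per feature
size(U)         % assumed layout: one row per cluster, one column per data point
numel(objFcn)   % one objective function value per iteration
```

The exact values depend on the random initialization, so they can vary from run to run.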

For more information on the FCM algorithm, see Fuzzy Clustering.

FCM Data Clustering Live Task showing a sample cluster plot for data with three clusters

Open the Task

To add the FCM Data Clustering task to a live script in the MATLAB Editor:

On the Live Editor tab, select Task > FCM Data Clustering.
In a code block in the script, enter a relevant keyword, such as fcm or clustering. Select FCM Data Clustering from the suggested command completions.

Examples


Cluster Data Using FCM in Live Editor


Use the FCM Data Clustering task in the Live Editor to interactively cluster data using the fuzzy c-means (FCM) algorithm. You can experiment with different clustering configurations, such as the number of clusters or distance metric.

Load the five sample data sets. These data sets have different numbers of clusters and data distributions.

load fcmdata

Each data set contains two columns that represent the two features for each data point.

To cluster the data, open the FCM Data Clustering task in the Live Editor. On the Live Editor tab, select Task > FCM Data Clustering.

Select the data to cluster. For this example, under Input data, select fcmdata3.

FCM Data Clustering task showing the Input data drop-down expanded and the pointer over the third entry, fcmdata3.

Under Clustering Options, configure the clustering algorithm. For this example, set the Number of clusters to [2 3 4 5]. The task computes clusters for each cluster count in Number of clusters and returns the clustering results for the optimal number of clusters.

Keep the remaining options at their default values.
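
A command-line sketch of the same configuration might look like the following. It assumes the fcmOptions property name NumClusters and the fcm function from Fuzzy Logic Toolbox, and it is not the code that the task generates.

```matlab
% Sketch only: assumes Fuzzy Logic Toolbox (fcm, fcmOptions) is installed.
load fcmdata
options = fcmOptions(NumClusters=[2 3 4 5],Verbose=false);
[centers,U,objFcn,info] = fcm(fcmdata3,options);

% centers holds the result for the cluster count judged optimal,
% so its row count is one of the requested values.
numOptimal = size(centers,1)
```

Here, numOptimal is a hypothetical variable name introduced for illustration.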

FCM Data Clustering task with the Clustering Options section expanded. The Number of clusters parameter is highlighted with four cluster values, 2, 3, 4, and 5.

To cluster the data, click the Run current section button. The task clusters the data and plots the results. The task also returns the cluster centers, partition matrix, and objective function values as centers, U, and objFcn, respectively.

The nondiagonal plot shows each data point classified into the cluster for which it has the highest membership value. The diagonal axes show the marginal cluster membership sets for each feature.
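
The classification rule described above, assigning each point to the cluster with the highest membership value, can be sketched in a few lines. The partition matrix here is made up for illustration and uses one row per cluster and one column per data point.

```matlab
% Hypothetical 3-by-5 partition matrix: rows are clusters, columns are data points.
U = [0.80 0.10 0.30 0.05 0.45
     0.15 0.70 0.30 0.05 0.35
     0.05 0.20 0.40 0.90 0.20];

% The membership grades of each data point sum to 1.
assert(all(abs(sum(U,1) - 1) < 1e-12))

% Index of the cluster with the highest membership for each data point.
[maxU,clusterIdx] = max(U,[],1);
disp(clusterIdx)   % clusterIdx is [1 2 3 3 1]
```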

The clustering terminates after around 12 iterations. The output argument objFcn returns the objective function value for each iteration. The final minimum objective function value is around 4.6.

To improve the clustering results and reduce the final objective function value, cluster the data using Mahalanobis distance rather than the default Euclidean distance. The Mahalanobis distance metric generally performs better for nonspherical clusters.

Under Distance metric, select Mahalanobis.

For this clustering operation, display the cluster centers. Under Display Results, select Show cluster centers.

To avoid overwriting the previous clustering results, in the top section of the task, modify the output argument names to centers2, U2, objFcn2, and info2.

Also, since the previous clustering operation found that four clusters was optimal, set Number of clusters to 4.

At the top of the task, the output argument names are changed. Under Clustering Options, Number of clusters is set to 4, the Distance metric value is Mahalanobis, and the Show cluster centers parameter is selected.

Run the task to cluster the data. The resulting marginal cluster membership values have sharper transitions. Also, the minimum objective function value in objFcn2 is around 0.17, which is significantly lower than the value of around 4.6 from the first clustering operation.
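
For reference, a command-line sketch of this second configuration is shown here. It assumes the fcmOptions property names NumClusters and DistanceMetric and the value "mahalanobis" from Fuzzy Logic Toolbox. The code that the task generates may differ, and the objective function values vary with the random initialization.

```matlab
% Sketch only: assumes Fuzzy Logic Toolbox (fcm, fcmOptions) is installed.
load fcmdata
options2 = fcmOptions(NumClusters=4, ...
    DistanceMetric="mahalanobis", ...
    Verbose=false);
[centers2,U2,objFcn2,info2] = fcm(fcmdata3,options2);

% Smallest objective function value reached during clustering.
minObjFcn2 = min(objFcn2)
```

Here, options2 and minObjFcn2 are hypothetical names introduced for illustration.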


Parameters


Select Data

`Input data` — Data set to be clustered
matrix

Specify input data as a matrix with N_d rows, where N_d is the number of data points. The number of columns in the data is equal to the data dimensionality, that is, the number of features in each data point.

Clustering Options

`Number of clusters` — Number of clusters
`auto` (default) | integer greater than 1 | vector of integers

Number of clusters to create, N_c, specified as one of these values:

auto — Cluster the data ten times (N_c = 2 through 11).
Integer greater than 1 — Cluster the data once using the specified number of clusters.
Vector of integers greater than 1 — Cluster the data multiple times, once for each value in the vector.

When Number of clusters is auto or a vector, the task returns cluster centers for the optimal number of clusters, which it determines using a validity index. The output argument info returns the clustering results for the other values of N_c.

`Exponent` — Exponent for fuzzy partition matrix
`2` (default) | scalar greater than 1

This parameter controls the amount of fuzzy overlap between clusters, with larger values indicating a greater degree of overlap.

If your data set is wide with significant overlap between potential clusters, then the calculated cluster centers can be very close to each other. In this case, each data point has approximately the same degree of membership in all clusters. To improve your clustering results, decrease this value, which limits the amount of fuzzy overlap during clustering.
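
To see how the exponent changes the overlap, the following sketch evaluates the standard FCM membership update for a single data point at made-up distances from two cluster centers. It uses plain MATLAB and does not call the task or fcm. The variable m stands for the Exponent value.

```matlab
% Distances from one data point to two cluster centers (made-up values).
d = [1; 2];

for m = [1.5 2 4]                        % candidate Exponent values
    p = 2/(m - 1);
    % Standard FCM update: u(i) = 1 / sum_k (d(i)/d(k))^p
    u = 1 ./ sum((d ./ d.').^p, 2);
    fprintf("m = %.1f: u = [%.3f %.3f]\n", m, u(1), u(2));
end
% For m = 2, the memberships are [0.8 0.2].
% Smaller m pushes them toward [1 0]; larger m pushes them toward [0.5 0.5].
```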

`Maximum iterations` — Maximum number of iterations
`100` (default) | positive integer

Maximum number of iterations for the FCM algorithm, specified as a positive integer.

`Minimum improvement` — Minimum improvement in objective function
`1e-05` (default) | positive scalar

Minimum improvement in the objective function between two consecutive iterations, specified as a positive scalar. The FCM algorithm stops when the objective function improves by an amount less than Minimum improvement.
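
The two stopping conditions, Maximum iterations and Minimum improvement, can be illustrated with a minimal Euclidean FCM loop written in plain MATLAB. This is a sketch of the classical algorithm on made-up data, not the implementation that the task or fcm uses. All variable names are hypothetical.

```matlab
rng(0)                                    % reproducible illustration
data = [randn(50,2); randn(50,2) + 4];    % made-up data: two 2-D blobs
Nc = 2; m = 2;                            % number of clusters and exponent
maxIter = 100; minImprovement = 1e-5;     % same values as the task defaults

U = rand(Nc,size(data,1));                % random initial partition matrix
U = U ./ sum(U,1);                        % memberships of each point sum to 1
objFcn = zeros(maxIter,1);

for iter = 1:maxIter
    W = U.^m;
    centers = (W*data) ./ sum(W,2);       % membership-weighted means
    D = zeros(Nc,size(data,1));
    for i = 1:Nc
        % Euclidean distance from every data point to center i
        D(i,:) = sqrt(sum((data - centers(i,:)).^2, 2)).';
    end
    D = max(D,eps);                       % guard against division by zero
    objFcn(iter) = sum(W .* D.^2, "all"); % objective function value
    T = D.^(-2/(m - 1));
    U = T ./ sum(T,1);                    % standard FCM membership update
    % Stop when the objective changes by less than minImprovement.
    if iter > 1 && abs(objFcn(iter-1) - objFcn(iter)) < minImprovement
        break
    end
end
objFcn = objFcn(1:iter);                  % keep only the iterations that ran
```

If the loop reaches maxIter before the improvement test succeeds, it stops anyway, which is the role of the Maximum iterations parameter.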

`Distance metric` — Method for computing distance
`Euclidean` (default) | `Mahalanobis` | `Fuzzy maximum likelihood estimation`

Select one of these methods for computing the distance between data points and cluster centers:

Euclidean — Compute distance using a Euclidean distance metric, which corresponds to the classical FCM algorithm.
Mahalanobis — Compute distance using a Mahalanobis distance metric, which corresponds to the Gustafson-Kessel FCM algorithm.
Fuzzy maximum likelihood estimation — Compute distance using fuzzy maximum likelihood estimation (FMLE), which corresponds to the Gath-Geva FCM algorithm.
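
The difference between the first two metrics can be illustrated with a small plain MATLAB sketch. The cluster center, covariance matrix, and data points are made up. The Gustafson-Kessel algorithm also estimates a covariance matrix for each cluster from the membership values, which this sketch does not show.

```matlab
c  = [0 0];            % made-up cluster center
S  = [4 0; 0 0.25];    % made-up covariance: cluster elongated along feature 1
x1 = [2 0];            % point along the long axis of the cluster
x2 = [0 2];            % point across the short axis of the cluster

% Hypothetical helper functions for the two distance computations.
euclidDist = @(x) sqrt((x - c)*(x - c).');
mahalDist  = @(x) sqrt((x - c)/S*(x - c).');   % (x-c)*inv(S)*(x-c)'

[euclidDist(x1) euclidDist(x2)]   % [2 2]: both points are equally far
[mahalDist(x1)  mahalDist(x2)]    % [1 4]: x1 is much closer in cluster shape terms
```

Because the Mahalanobis distance accounts for the shape of the cluster, it can separate elongated clusters that a Euclidean metric treats as spherical.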

`Custom cluster centers` — Initial cluster centers
`[]` (default) | matrix

Specify an initial estimate of the cluster centers as an N_c-by-N_f matrix, where N_c is the number of clusters and N_f is the number of data features.

When Custom cluster centers is empty, the FCM algorithm randomly initializes the cluster center values.
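
At the command line, an initial estimate can be supplied in a similar way. The following sketch assumes the fcmOptions property name ClusterCenters from Fuzzy Logic Toolbox and uses the first four rows of fcmdata3 as an arbitrary starting guess. In practice, you would choose centers based on prior knowledge of the data.

```matlab
% Sketch only: assumes Fuzzy Logic Toolbox (fcm, fcmOptions) is installed.
load fcmdata
initCenters = fcmdata3(1:4,:);        % arbitrary 4-by-2 initial guess (N_c-by-N_f)
options = fcmOptions(NumClusters=4, ...
    ClusterCenters=initCenters, ...
    Verbose=false);
[centers,U] = fcm(fcmdata3,options);
```

Here, initCenters is a hypothetical variable name. With fixed initial centers, repeated runs start from the same point rather than from a random initialization.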

`Verbose` — Information display flag
off (default) | on

Select this parameter to display the objective function value during clustering.

Display Results

`Select to show matrix of cluster plots` — Plot clustering results
on (default) | off

Select this parameter to plot the clustering results.

`Show results for optimal cluster configuration` — Plot optimal clustering result
on (default) | off

Select this parameter to plot the results for the optimal number of clusters. The cluster plots show results that correspond to the centers and U output arguments.

Dependencies

To enable this parameter, select the Select to show matrix of cluster plots parameter.
When you select this parameter, the task clears the Specify a cluster configuration parameter.

`Specify a cluster configuration` — Clustering result to plot
off (default) | on

Select this parameter to plot the results for a specified number of clusters, which you enter in the text box. The cluster plots show results for the corresponding elements of the FuzzyPartitionMatrix and ClusterCenters fields of the info output argument.

Dependencies

To enable this parameter, select the Select to show matrix of cluster plots parameter.
If the Number of clusters parameter is an integer, you can select only that number of clusters for plotting.
When you select this parameter, the task clears the Show results for optimal cluster configuration parameter.

`Show cluster centers` — Display cluster centers
off (default) | on

Select this parameter to display the cluster centers in the plots.

Dependencies

To enable this parameter, select the Select to show matrix of cluster plots parameter.

`Show legend` — Display legend
off (default) | on

Select this parameter to display a legend in the cluster plot.

Dependencies

To enable this parameter, select the Select to show matrix of cluster plots parameter.

Version History

Introduced in R2025a
