Linkages other than Ward in evalclusters

Hi,
Is there any way to change the linkage to any desired linkage (e.g., average, complete, single, centroid, etc.) in evalclusters?
Right now the documentation says that the linkage is set to Ward by default.
Please advise.

Answers (1)

Hamoon on 14 Sep 2015
Why don't you use linkage or clusterdata?

4 Comments

Tigran on 14 Sep 2015
Edited: Tigran on 14 Sep 2015
I do use linkage for the clustering itself.
However, after clustering the data I need to evaluate/validate the resulting clusters. In other words, I would like to find the optimal number of clusters for each linkage used. Internal validation criteria (e.g., Calinski-Harabasz, Silhouette, or Davies-Bouldin) take care of this. But evalclusters sets the linkage to Ward by default (if Euclidean distance is used as the metric), and I was wondering if this can be modified.
Here is an excerpt from the documentation: 'If Clust is 'linkage', and Distance is either 'sqEuclidean' or 'Euclidean', then the clustering algorithm uses Euclidean distance and Ward linkage.'
Hamoon on 14 Sep 2015
Edited: Hamoon on 14 Sep 2015
Yes, you can change it. You can define a function handle that uses clusterdata for that. Look at this code, for example:
myfunc = @(x,k) clusterdata(x,'linkage','average','maxclust',k);
eva = evalclusters(x,myfunc,'CalinskiHarabasz',...
'klist',[1:6]);
In myfunc, x is the input data and k is the number of clusters. evalclusters then evaluates the clustering produced by myfunc (here, linkage with the average method) for each value of k in 'klist'.
You can change the linkage options in myfunc. For example, you can write this:
myfunc = @(x,k) clusterdata(x,'linkage','weighted','maxclust',k);
Check this example:
load fisheriris;
myfunc = @(x,k) clusterdata(x,'linkage','weighted','maxclust',k);
eva = evalclusters(meas,myfunc,'CalinskiHarabasz',...
'klist',[1:6]);
To find out which options you have for myfunc when you want to use linkage, check the documentation for linkage and clusterdata.
Is it clear enough?
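To get the suggested number of clusters for each linkage, the same function-handle idea can be wrapped in a loop. This is only a sketch under a few assumptions: it uses the Fisher iris data from the example above, the variable names methods and optimalK are made up for illustration, and 'klist' starts at 2 because the Calinski-Harabasz criterion is not defined for a single cluster.

```matlab
% Sketch: compare the number of clusters suggested for several linkage methods.
load fisheriris;                                   % provides meas (150-by-4)
methods = {'single','complete','average','weighted'};  % linkage methods to try
optimalK = zeros(1, numel(methods));               % illustrative result vector

for i = 1:numel(methods)
    m = methods{i};
    % evalclusters calls myfunc once per value in 'klist' and passes it as k
    myfunc = @(x,k) clusterdata(x,'linkage',m,'maxclust',k);
    eva = evalclusters(meas,myfunc,'CalinskiHarabasz','klist',2:6);
    optimalK(i) = eva.OptimalK;                    % k with the best criterion value
    fprintf('%-9s optimal k = %d\n', m, eva.OptimalK);
end
```

Swapping 'CalinskiHarabasz' for 'silhouette' or 'DaviesBouldin' should work the same way, since the criterion only sees the cluster labels returned by myfunc.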
Hello, can't we estimate the number of clusters k before doing the clustering? This is unclear to me, especially why 'maxclust' is used with k clusters without knowing the best clustering method and number of clusters in advance. Could you please explain in more detail?
Chris on 14 Feb 2017
Edited: Chris on 14 Feb 2017
I think for 'maxclust' you put the maximum number of clusters you want evalclusters to test, i.e. the maximum value of 'klist'. Please correct me if I'm wrong.
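One way to check what 'maxclust' does in this setup is to call myfunc directly. As far as the clusterdata documentation goes, 'maxclust' is the maximum number of clusters formed in that single call, and evalclusters calls myfunc once for every value in 'klist', passing that value as k. The sketch below assumes the Fisher iris data and the average-linkage myfunc from the example above; the loop range 2:4 is arbitrary.

```matlab
% Sketch: inspect what myfunc returns for individual values of k.
load fisheriris;                                   % provides meas (150-by-4)
myfunc = @(x,k) clusterdata(x,'linkage','average','maxclust',k);

for k = 2:4
    labels = myfunc(meas,k);                       % one cluster index per row of meas
    fprintf('k = %d -> %d distinct cluster labels\n', k, numel(unique(labels)));
end
```

So k is not chosen in advance: evalclusters tries each candidate from 'klist' and reports the one with the best criterion value in eva.OptimalK.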


Asked: 14 Sep 2015
Edited: 14 Feb 2017
