
Kernel Kmeans

version 1.8.0.0 (4.1 KB) by Mo Chen
kernel kmeans algorithm

12 Downloads

Updated 11 Mar 2017

This function implements the kernel k-means algorithm. With the linear kernel (i.e., the inner product), the algorithm is equivalent to standard k-means. Several nonlinear kernel functions are also provided. By request, a prediction function for out-of-sample inference is included as well. Try the following code for a demo:
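clear; close all;
d = 2;                                    % data dimension
k = 3;                                    % number of clusters
n = 500;                                  % number of samples
[X,label] = kmeansRnd(d,k,n);             % random clustered data, one column per sample
init = ceil(k*rand(1,n));                 % random initial labels in 1..k
[y,mse,model] = knKmeans(X,init,@knLin);  % kernel k-means with the linear kernel
plotClass(X,y)                            % plot the data colored by cluster label
idx = 1:2:n;                              % select every other sample
Xt = X(:,idx);
t = knKmeansPred(model, Xt);              % predict labels for the selected samples
plotClass(Xt,t)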
This function is now a part of the PRML toolbox (http://www.mathworks.com/matlabcentral/fileexchange/55826-pattern-recognition-and-machine-learning-toolbox).
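
For readers who want to see the idea without the toolbox, here is a minimal sketch of kernel k-means working only on a Gram matrix. It is not the knKmeans implementation from this submission: the toy data, the variable names (E, T, D) and the iteration cap of 100 are assumptions made for illustration. It uses the linear kernel, so it should behave like standard k-means.

```matlab
% Minimal kernel k-means sketch (illustrative only, not the submitted knKmeans).
X = [randn(2,50), randn(2,50) + 6];   % d x n toy data: two separated blobs (assumed)
n = size(X,2);
k = 2;
K = X' * X;                           % linear kernel: n x n Gram matrix
label = ceil(k * rand(1,n));          % random initial assignment, as in the demo
for iter = 1:100                      % iteration cap is an arbitrary choice
    [~,~,label] = unique(label);      % renumber labels consecutively (drops empty clusters)
    label = reshape(label, 1, []);    % force a row vector in any release
    kc = max(label);                  % number of non-empty clusters
    E = full(sparse(1:n, label, 1, n, kc));   % n x kc cluster indicator matrix
    E = bsxfun(@rdivide, E, sum(E,1));        % each column now averages over its cluster
    T = E' * K;                               % T(c,i) = mean of K(j,i) over j in cluster c
    % Squared feature-space distance to each cluster mean, minus the constant K(i,i)
    D = bsxfun(@minus, diag(T * E), 2 * T);   % kc x n
    [~, newLabel] = min(D, [], 1);            % assign each point to the nearest mean
    if isequal(newLabel, label), break; end   % stop when assignments no longer change
    label = newLabel;
end
```

Swapping the line that builds K for another kernel (for example a Gaussian kernel computed from pairwise distances) leaves the rest of the loop unchanged, because the update only ever touches K.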

Comments and Ratings (22)

can anyone send me the full code via email ducduy55@gmail.com?

Xiaodong, did you solve the problem you asked about? I have the same question!

Okey Erik

can anyone send me the full code on kernel k-means in matlab?

Xiaodong

Can this code return the cluster centers?

how to plot the original data with color?

Jung, Tajana, I found the same problem in my Matlab version R2014b.
The function "unique" on line 18 changes the variable "label" from a row vector to a column vector.
This caused the error in my case.

To fix it, transpose the variable "label" after line 18.

label = label'.

Hope it helps
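
A variant of the fix above that does not depend on which orientation unique returns, shown on a made-up label vector. This is a sketch, not the code shipped in the submission: reshape forces a row vector whether the index output comes back as a row or as a column, whereas a plain transpose is only correct in releases that return a column.

```matlab
label = [3 3 7 7 7 1];            % example labels with gaps (assumed data)
[~, ~, label] = unique(label);    % third output: index into the sorted unique values
label = reshape(label, 1, []);    % always 1 x n, regardless of MATLAB release
disp(label)                       % 2 2 3 3 3 1
```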

Jung and Tajana, I ran the sample code without any problems but I'm wondering if your data array is the right size (2 X 1000).

In either case, you might want to see if the 'any' function in your MATLAB version can compare a vector to a scalar or to another vector. If you get the same 'Matrix dimensions must agree' error then that may be the problem.

Jung, were you able to resolve this issue? I'm also having quite a bit of trouble with this, and can't resolve the error.

Hi,

I get the following error message when I paste the 'sample code' from the description. Thanks.

>> load data;
K=x'*x; % use linear kernel
label=knkmeans(K,3);
spread(x,label)
Error using ~=
Matrix dimensions must agree.

Error in knkmeans (line 26)
end

Ajay Singh

Santiago

Mo Chen

Hi, Phillip,
Your computation is right, just not very efficient. Check my new code.

Phillip

Hello, I encountered the same problem as John. Luckily I had the book, so I added the following code. The energy is the sum-of-squares clustering cost function. I have been optimizing my kernel hyper-parameters to minimise this energy, and it has been working fairly well, so thanks. I'm not an expert, so I could be wrong.

A=zeros(size(S'));
for i=1:1:size(K,1)
A(i,label(i))=1;
end
D=diag(1./sum(A));
energy = trace(K)-trace(sqrt(D)*A'*K*A*sqrt(D));
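
The trace expression above can be checked on a tiny example against a per-cluster sum that avoids the n-by-k matrix products. This is only a sketch under assumed toy data; it is not the "new code" referred to in the reply, and the names energyTrace and energyLoop are made up here.

```matlab
X = [0 1 10 11; 0 0 0 0];               % d x n toy data (assumed): two tight pairs
K = X' * X;                             % linear kernel
label = [1 1 2 2];                      % one cluster per pair
n = numel(label);
k = max(label);

% Trace form, as in the comment above
A = full(sparse(1:n, label, 1, n, k));  % n x k indicator matrix
D = diag(1 ./ sum(A,1));                % inverse cluster sizes on the diagonal
energyTrace = trace(K) - trace(sqrt(D) * A' * K * A * sqrt(D));

% Equivalent per-cluster form: subtract the mean within-cluster kernel mass
energyLoop = trace(K);
for c = 1:k
    idx = (label == c);
    energyLoop = energyLoop - sum(sum(K(idx, idx))) / nnz(idx);
end
```

With the linear kernel both values equal the ordinary within-cluster sum of squares, which is 1 for this data (each pair contributes 0.5).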

The code appears broken to me:

>> load data;
K=x'*x; % use linear kernel
label=knkmeans(K,3);
scatterd(x,label)
??? Undefined function or variable 'val'.

Error in ==> knkmeans at 31
energy = sum(val)+trace(K);

Mo Chen

Hi mathieu,
As indicated in the description, this algorithm is explained in
reference: [1] Kernel Methods for Pattern Analysis
by John Shawe-Taylor, Nello Cristianini

Seyed Salim

Mathieu, you can refer to Pattern Recognition and Machine Learning by Bishop (2006). Alternatively, this one is free: www-stat.stanford.edu/~hastie/Papers/ESLII.pdf

Mathieu

I see. From reading the code alone I can't work out the principles behind the algorithm. Is there a reference I could get from the web, or do you advise buying the book?

Mo Chen

This happens with standard k-means too, and it is caused by the nature of the algorithm: when you set a very large k, some clusters may become empty after several iterations.
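
To make the effect concrete, here is a small sketch that counts how many points each requested cluster holds after an assignment step. The label vector is made up for illustration; it stands in for the assignments produced partway through a run with k = 5.

```matlab
k = 5;                                       % number of clusters requested
label = [1 1 4 4 4 1 4];                     % assignments after some iteration (assumed)
counts = accumarray(label(:), 1, [k 1])';    % points per requested cluster
nonEmpty = nnz(counts);                      % clusters that still own at least one point
disp(counts)                                 % 3 0 0 4 0
disp(nonEmpty)                               % 2
```

Once a cluster has no points it has no mean to attract points back, so it stays empty, which is why the number of clusters returned can be smaller than the number requested.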

Mathieu

It seems that if I request N clusters, the algorithm outputs k clusters with k<=N, and most of the time k<<N. I was wondering if this is by construction. If so, could you provide me with an explanation?

Updates

1.8.0.0

tweak

1.7.0.0

update description

1.7.0.0

fix incompatibility issue due to the stupid API change of function unique()

1.7.0.0

Improve the code and fix a bug of returning energy

1.6.0.0

n/a

1.5.0.0

fix a minor bug of returning energy

1.2.0.0

remove empty clusters

1.1.0.0

add sample data and detail description

MATLAB Release Compatibility
Created with R2016b
Compatible with any release
Platform Compatibility
Windows macOS Linux