zheng on 10 Mar 2011
[IDX,C] = kmeans(X,k,param1,val1)
Here, 'Start' is param1 and a matrix is val1. It is the method used to choose the initial cluster centroid positions.
The MATLAB help explains it as: "k-by-p matrix of centroid starting locations. In this case, you can pass in [] for k, and kmeans infers k from the first dimension of the matrix."
Here is the call I am trying to use:
[IDX,C]=kmeans(data,[],'Distance','sqEuclidean','emptyaction','singleton','Start',data);
Question 1: Is "data" the matrix that the help refers to?
Question 2: If it is, I run into a new problem, shown below:
??? Error using ==> NaN
Out of memory. Type HELP MEMORY for your options.
Error in ==> kmeans at 298
if online, Del = NaN(n,k); end % reassignment criterion
In my case, the dimensions of data are 334795x2.

### Accepted Answer

Matt Tearle on 23 Mar 2011
The 'start' parameter defines the initial centroid locations. As the help explains, it should be k-by-p, where k is the number of groups you're splitting the data into and p is the number of columns in the data. If you use the data matrix itself, then k is the same as the number of data points! That is, you're asking kmeans to group n points into n groups! The result will be that each data point is in its own group. With 300,000 points, you'll also run out of memory, it seems.
This is what you're trying to do:
X = rand(20,2);
g = kmeans(X,[],'start',X)
gscatter(X(:,1),X(:,2),g)
This is what you should be doing:
g = kmeans(X,[],'start',[0.25,0.75;0.25,0.25;0.75,0.25;0.75,0.75])
gscatter(X(:,1),X(:,2),g)
Note that the data (X) is 20-by-2. The starting matrix is 4-by-2, so kmeans makes 4 groups out of the 20 points.
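To see why the original call runs out of memory, here is a rough back-of-the-envelope sketch of the size of the n-by-k array from the error message (Del = NaN(n,k)). It assumes the array holds doubles at 8 bytes each; the value kGood = 4 is only an illustrative choice, not something from the question.

```matlab
% Size of the n-by-k double array that kmeans tries to allocate.
n = 334795;           % number of data points (rows of data)
bytesPerDouble = 8;   % assumption: one double occupies 8 bytes

kBad  = n;            % 'Start' = data, so k equals n
kGood = 4;            % hypothetical small k, for comparison only

bytesBad  = n * kBad  * bytesPerDouble;
bytesGood = n * kGood * bytesPerDouble;

fprintf('k = n: about %.0f GB\n', bytesBad / 1e9);
fprintf('k = 4: about %.1f MB\n', bytesGood / 1e6);
```

With k equal to n, the array alone would need on the order of 900 GB, while a handful of clusters needs only a few megabytes.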
##### 1 Comment
Vladimir Borgiani on 29 Dec 2015
Is there any way to run k-means so that the centroids it finds are chosen from the input points rather than computed as cluster means? For example, we supply a matrix of X and Y positions and instruct k-means to find the N best centroids, with the constraint that each centroid is an existing location.
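One option that may fit this request is k-medoids clustering, which restricts each cluster center to be one of the input points. The sketch below assumes the kmedoids function from the Statistics and Machine Learning Toolbox is available (it is a different algorithm from kmeans, not a kmeans option); the data X and the value N = 5 are made-up stand-ins.

```matlab
% Hypothetical stand-in data: 200 points with X and Y positions.
rng(1);                        % fix the seed so the run is repeatable
X = rand(200, 2);
N = 5;                         % number of centers to find (assumed value)

% kmedoids returns the medoids C and their row indices midx in X,
% so every row of C is one of the original locations.
[idx, C, ~, ~, midx] = kmedoids(X, N);

% Check that each center is an existing data point.
allExisting = isequal(C, X(midx, :));
```

Here idx holds the cluster index of each point, and midx tells you which rows of X were chosen as centers.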


### More Answers (3)

zheng on 17 Mar 2011

Tom Lane on 31 Mar 2011
You have written
kmeans(data,[],'Distance','sqEuclidean','emptyaction','singleton','Start',data)
You don't want to specify "data" as both the input data and the starting guess at the centroid locations. Suppose you want kmeans with k=7. You want the 'Start' value to have 7 rows, one for each centroid.
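As a minimal sketch of the k=7 case, one way to build a 7-row 'Start' matrix is to pick 7 distinct rows of the data at random. The random data here is only a stand-in for the real matrix, and the row-sampling strategy is an assumption for illustration, not the only reasonable choice.

```matlab
% Stand-in data; replace with the real n-by-2 matrix.
rng(0);                        % fix the seed so the run is repeatable
data = rand(1000, 2);
k = 7;                         % number of clusters, as in the text above

% Pick k distinct rows of the data as initial centroids (a k-by-p matrix).
startRows = randperm(size(data, 1), k);
startCentroids = data(startRows, :);

% k is passed as [] and inferred from the 7 rows of the 'Start' matrix.
[IDX, C] = kmeans(data, [], 'Distance', 'sqEuclidean', ...
    'EmptyAction', 'singleton', 'Start', startCentroids);
```

C comes back as a 7-by-2 matrix of centroids and IDX assigns each row of data to one of the 7 clusters. Passing 'Start','sample' together with an explicit k asks kmeans to do a similar random selection of rows itself.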

Check out the color reduction using kmeans example here: http://imageprocessingblog.com/?p=178