K-means for Wine data set

Ganesh on 27 Aug 2013
Answered: Paul Munro on 21 Feb 2023
Hi,
I ran the kmeans command on the Wine data set from the UCI repository. This data set contains a chemical analysis of 178 wines derived from three different cultivars. Wine type is based on 13 continuous features.
Here's the command:

    load 'wine_data.txt';
    [IDX,C,sumd,D] = kmeans(wine_data,3,...
        'start','sample',...
        'Replicates',100,...
        'maxiter',1000, 'display','final');

The final best total sum of distances is 2.37069e+06. This is far from the K-means solution reported in the literature, which is around 18,061. Is MATLAB's K-means solution stuck in a local minimum? Please advise. Thanks.
1 Comment
the cyclist on 27 Aug 2013
For anyone who is interested in helping out on this one, the data set is here: http://archive.ics.uci.edu/ml/datasets/Wine


Answers (4)

Shashank Prasanna on 27 Aug 2013
Ganesh, what distance metric does the 'literature' use?
The kmeans default is 'sqEuclidean'. You have to make sure you are comparing the same metric. Try changing it to cityblock or any of the other options:
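As a minimal sketch of why the metric matters, the snippet below runs kmeans on the same data with two values of the 'Distance' option and prints both totals. The matrix X is a synthetic stand-in (an assumption, not the wine data), and the Statistics and Machine Learning Toolbox is assumed to be installed.

```matlab
rng(0);  % reproducible synthetic data and k-means starts

% Synthetic stand-in (assumption): two well-separated groups in 2-D.
X = [randn(50,2); randn(50,2) + 5];

% Same data, same k, two different distance metrics.
[~,~,sumdSq]   = kmeans(X, 2, 'Distance','sqeuclidean', 'Replicates',5);
[~,~,sumdCity] = kmeans(X, 2, 'Distance','cityblock',   'Replicates',5);

% Each total is in the units of its own metric, so the two numbers
% should not be compared with each other directly.
totalSq   = sum(sumdSq);
totalCity = sum(sumdCity);
fprintf('sqeuclidean total: %g\n', totalSq);
fprintf('cityblock total:   %g\n', totalCity);
```

A total reported in a paper is only comparable to the value of sum(sumd) when both were computed with the same metric on identically prepared data.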

Ganesh on 27 Aug 2013
Thanks for the reply, Shashank. The literature used 'sqEuclidean', and so did I.
1 Comment
tryhard on 29 Aug 2013
Could you post a link to the relevant article? I get the same result you do. It seems they might have performed some sort of pre-processing on the data.
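One quick way to see whether pre-processing could change the result is to check how much each column contributes to the total squared Euclidean spread. The sketch below does this on a synthetic stand-in matrix (an assumption; substitute the loaded wine_data matrix to check the real columns).

```matlab
rng(0);  % reproducible synthetic data

% Synthetic stand-in (assumption): three unit-scale columns and one
% column whose values are in the thousands.
X = [randn(100,3), 1000 + 300*randn(100,1)];

% Each column's share of the total variance. With squared Euclidean
% distance, a column with a large share dominates the clustering.
colVar = var(X);
share  = colVar / sum(colVar);
disp(share);
```

If one column holds nearly all of the variance, kmeans on the raw data is effectively clustering on that column alone.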



gheorghe gardu on 1 Nov 2015
Could you post the MATLAB code that you used? Thank you in advance.

Paul Munro on 21 Feb 2023
The large distance sum you report makes me think that you did not rescale the data. Variable 13 is in the thousands and will overwhelm the effect of the other variables. You will probably get better results if you rescale the variables separately (Z scoring for example).
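A minimal sketch of that suggestion, assuming the Statistics and Machine Learning Toolbox: z-score each column with zscore, then run kmeans with the same options as in the question. The matrix X here is a synthetic stand-in (an assumption) with one column in the thousands; replace it with wine_data to try it on the real data.

```matlab
rng(0);  % reproducible synthetic data and k-means starts

% Synthetic stand-in (assumption): three groups in two unit-scale
% columns, plus one large-scale column unrelated to the groups.
n = 60;
g = [zeros(n,2); 4*ones(n,2); [4*ones(n,1), -4*ones(n,1)]];
X = [g + randn(3*n,2), 1000 + 300*randn(3*n,1)];

% Options mirroring the call in the question (fewer replicates).
opts = {'Start','sample', 'Replicates',20, 'MaxIter',1000};

% Raw data: the large-scale column dominates the distances.
[~,~,sumdRaw] = kmeans(X, 3, opts{:});

% Z-scored data: every column has mean 0 and standard deviation 1.
Xz = zscore(X);
[idxZ,~,sumdZ] = kmeans(Xz, 3, opts{:});

fprintf('raw total sum of distances:      %g\n', sum(sumdRaw));
fprintf('z-scored total sum of distances: %g\n', sum(sumdZ));
```

The two totals are in different units, so the drop by itself does not show a better clustering; it only shows that a total in the millions is what raw, unscaled columns produce. Whether the literature value was obtained with z-scoring or some other rescaling is not stated in this thread.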
