I think I have a clue, but it would be good if somebody from the MathWorks team could verify it.
So my clue is this:
- kmeans needs to choose some initial cluster positions. It can randomly select k of the input points as a start.
- If you set rng(seed) with a constant seed, you will always get the same row indices of the data matrix as the starting cluster positions.
- If you shuffle the input data (the point locations are the same, only their order in the data structure changes), then even with rng(seed) and a constant seed you will get the same row indices, but the points stored at those indices are different!
- That means kmeans starts from different centroids and can converge to a different result for shuffled input data points.
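To make the argument above concrete, here is a minimal sketch of the effect in plain Python (standard library only). It is not MATLAB's kmeans code and does not claim to reproduce its internals; `initial_centroids` is a hypothetical helper that only assumes the starting centroids are chosen by drawing random row indices from the data.

```python
import random

def initial_centroids(points, k, seed):
    """Hypothetical helper: pick k input points as starting centroids by row index."""
    rng = random.Random(seed)                      # fixed seed, like rng(seed) in the post
    indices = rng.sample(range(len(points)), k)    # depends only on seed, k and the row count
    return indices, [points[i] for i in indices]

# Ten distinct 2-D points, and the same points stored in reversed row order.
points = [(float(i), float(i * i)) for i in range(10)]
shuffled = points[::-1]

idx_a, start_a = initial_centroids(points, 3, seed=42)
idx_b, start_b = initial_centroids(shuffled, 3, seed=42)

print(idx_a == idx_b)      # True: the same row indices are drawn
print(start_a == start_b)  # False: those rows now hold different points
```

The index draw never looks at the data values, so fixing the seed fixes the rows but not the points stored in them. If the real initialization works this way, reordering the rows changes the starting centroids even though the seed is unchanged.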
This would also explain my puzzle in another question: https://www.mathworks.com/matlabcentral/answers/448832-bug-evalclusters-is-sensitive-to-rows-points-order
What do you think, MathWorks experts? :) Does k-means select input data points as the starting centroid locations?