Unsupervised clustering of categorical data

Daniel Guignard on 23 Nov 2021
Answered: Pratyush Roy on 1 Dec 2021
Hi everyone,
I want to cluster a time-series dataset that has 30 timepoints and more than 50,000 rows. The data are categorical: each value is an integer from 1 to 6, and each integer stands for a different category.
The problem with my current clustergram method, which uses the Euclidean distance metric, is that it treats the category codes as numbers, so category 5 ends up closer to 6 than to, say, 1. I don't want that, because the categories are unordered and not related to each other. How can I remove this bias from the clustering?
I hope my question is clear. Thanks for your help!
Image Analyst on 23 Nov 2021
Could be clearer if you attached a .mat file with your table, as many rows as will fit into 5 MB (attachment size limit).

Pratyush Roy on 1 Dec 2021
Hi Daniel,
The link here might be helpful for clustering categorical or non-numeric data.
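To illustrate the bias described in the question, here is a minimal sketch (in Python, standard library only) comparing Euclidean distance with Hamming distance on rows of category codes. Hamming distance is one common choice for unordered categories, not necessarily the approach the linked page recommends. The function names and the three toy rows are made up for illustration. In MATLAB, `pdist` offers a `'hamming'` option that computes the same quantity.

```python
from math import sqrt


def euclidean(a, b):
    """Euclidean distance: treats category codes as ordinary numbers."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def hamming(a, b):
    """Fraction of timepoints at which the two rows hold different categories."""
    if len(a) != len(b):
        raise ValueError("rows must have the same number of timepoints")
    return sum(x != y for x, y in zip(a, b)) / len(a)


# Hypothetical toy rows: 4 timepoints each, categories coded 1..6.
row_a = [5, 5, 5, 5]
row_b = [6, 6, 6, 6]
row_c = [1, 1, 1, 1]

# Euclidean distance depends on the numeric gap between codes,
# so row_b looks much closer to row_a than row_c does.
print(euclidean(row_a, row_b), euclidean(row_a, row_c))

# Hamming distance only asks "same category or not",
# so row_b and row_c are equally far from row_a.
print(hamming(row_a, row_b), hamming(row_a, row_c))
```

With a distance like this, relabeling the categories (for example swapping codes 1 and 6) leaves every pairwise distance unchanged, which is the property the question asks for. One-hot encoding each timepoint before applying Euclidean distance would be another way to get a similar effect.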
Hope this helps!




