How to group different sensors' data based on their similarities?

I have multiple sensors’ data over one year. I wanted to know if there are any unsupervised methods to divide and group sensors’ data that have close characteristics/behavior.
For example, if I have electricity consumption data for 1000 buildings stored in a table with 1000 columns, how can I divide or cluster these columns so that those with similar characteristics are placed in the same group?
I appreciate your time in advance.
Thank you.
Time                         D1         D2         D3         D4    D5    ....    Dn
____________________    _______    _______    _______    _______    __    ....    __
01-Jan-2020 00:00:00     2.9675     32.502     23.454     3.5067     .             .
01-Jan-2020 00:01:00     -6.298    -96.793    -64.711    -9.9581     .             .
01-Jan-2020 00:02:00    -5.5285    -75.355     -54.29     -8.215     .             .
01-Jan-2020 00:03:00    -1.4514    -34.475    -24.879     -3.468     .             .
01-Jan-2020 00:04:00     3.9736     66.112     42.284      6.639     .             .
01-Jan-2020 00:05:00     3.1481     64.577     41.262     6.9614     .             .
01-Jan-2020 00:06:00    -44.042    -699.24    -414.33    -75.339     .             .
01-Jan-2020 00:07:00     4.4172     69.015     37.355     6.6763     .             .
01-Jan-2020 00:08:00     23.509      284.8     186.89     32.597     .             .
01-Jan-2020 00:09:00     17.329     214.71     124.45     20.634     .             .
6 Comments
Walter Roberson on 25 Jun 2022
Edited: dpb on 25 Jun 2022
Principal component analysis and cross-correlation might help.
smoa on 25 Jun 2022
Edited: smoa on 25 Jun 2022
Thank you @Walter Roberson for your suggestions. I will try corr(x) to see their correlation and perhaps find those that are close to each other.
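A minimal sketch of that correlation-based idea, under stated assumptions: the matrix X below is synthetic stand-in data (not the real sensor table), the number of groups (2) is picked by hand, and corr, squareform, linkage and cluster come from Statistics and Machine Learning Toolbox. Correlation ignores scale and offset, so columns that follow the same shape at different magnitudes still end up close to each other.

```matlab
% Synthetic stand-in for the sensor table: rows are time steps, columns are sensors.
rng(0);                                   % reproducible example
t = (0:999)';
base1 = sin(2*pi*t/100);                  % first hypothetical behaviour
base2 = cos(2*pi*t/37);                   % second hypothetical behaviour
X = [base1*[1 10 5], base2*[2 8 4]] + 0.1*randn(1000, 6);

R = corr(X);                              % 6x6 matrix of pairwise Pearson correlations
D = 1 - R;                                % correlation distance: 0 means perfectly correlated
D = (D + D')/2;                           % enforce exact symmetry against round-off
D(1:size(D,1)+1:end) = 0;                 % enforce an exact zero diagonal

Z = linkage(squareform(D), 'average');    % agglomerative tree built from the distances
groups = cluster(Z, 'maxclust', 2);       % cut the tree into 2 groups (assumed number)
disp(groups')                             % one group label per sensor column
```

With a real table, X could be taken from the numeric columns (for example with table2array), and dendrogram(Z) can help choose where to cut the tree instead of fixing the number of groups up front.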

Answers (1)

Abhas on 28 May 2025
Edited: Abhas on 28 May 2025
Hi @smoa,
You can use several unsupervised learning methods in MATLAB to cluster your building electricity consumption data by similar characteristics. Here are some approaches that suit your scenario:
1. K-means Clustering: This is a good starting point for your use case as it:
  • Groups buildings with similar consumption patterns
  • Identifies representative centroids for each cluster
  • Is efficient for large datasets (1000 buildings)
  • Provides clear membership assignments
2. Hierarchical Clustering: This creates a dendrogram that shows:
  • Relationships between all buildings
  • How clusters merge at different similarity levels
  • Flexibility to choose the number of clusters after analysis
  • Good for exploring the natural grouping structure
3. PCA + Clustering: This two-step approach will:
  • Reduce the dimensionality of your time series data
  • Identify the most important consumption patterns
  • Make clustering more effective by removing noise
  • Improve visualization of the clusters
4. Dynamic Time Warping (DTW): Particularly useful for energy data because:
  • It handles temporal shifts in consumption patterns
  • Buildings with similar patterns but different peak times can be grouped
  • It's more robust to phase differences than Euclidean distance
5. Spectral Clustering: Good for identifying complex relationships:
  • Can find non-convex cluster shapes
  • Often performs better on complex real-world data
  • Considers the global structure of your dataset
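As a rough sketch of how items 1 and 3 might be combined: the example below standardises each building's series, reduces it with pca, and clusters the scores with kmeans. The data are synthetic, and the number of components (3) and clusters (2) are assumptions chosen for illustration, not recommendations for the real dataset. zscore, pca and kmeans require Statistics and Machine Learning Toolbox.

```matlab
% Synthetic stand-in: 500 time steps (rows) x 10 buildings (columns).
rng(1);                                         % reproducible example
t = (0:499)';
profileA = sin(2*pi*t/50);                      % hypothetical smooth consumption profile
profileB = sign(sin(2*pi*t/125));               % hypothetical on/off consumption profile
X = [profileA*ones(1,5), profileB*ones(1,5)] ...
    .* (1 + 9*rand(1,10)) + 0.2*randn(500,10);  % random magnitude per building, plus noise

Xn = zscore(X);              % standardise each column so magnitude does not dominate
F  = Xn';                    % clustering functions expect one row per building

[~, score] = pca(F, 'NumComponents', 3);     % keep 3 principal component scores (assumed)
idx = kmeans(score, 2, 'Replicates', 10);    % 2 clusters (assumed), best of 10 restarts
disp(idx')                                   % cluster label for each building
```

For a full year of minute-level readings, each building has several hundred thousand samples, so it may be worth aggregating first (for example to hourly or daily values) before running PCA; that step is left out here.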
Refer to the MathWorks documentation links below to learn more about each method:
  1. K-Means: https://www.mathworks.com/help/stats/kmeans.html
  2. Hierarchical: https://www.mathworks.com/help/stats/hierarchical-clustering.html
  3. PCA: https://www.mathworks.com/help/stats/pca.html
  4. DTW: https://www.mathworks.com/help/signal/ref/dtw.html
  5. Spectral Clustering: https://www.mathworks.com/help/stats/spectral-clustering.html
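For the DTW option (item 4), one possible sketch is to build a pairwise DTW distance matrix and pass it to hierarchical clustering. The signals below are synthetic (single-peak versus double-peak profiles with shifted timing), the number of groups is assumed, and dtw requires Signal Processing Toolbox. The double loop costs one dtw call per pair of columns, so for 1000 long series it is likely to need downsampling or a subset first.

```matlab
% Synthetic profiles: columns 1-3 have one peak, columns 4-6 have two peaks,
% each at a different time of the record.
t = (1:200)';
bump = @(c) exp(-((t - c).^2) / (2*5^2));        % Gaussian bump centred at sample c
S = [bump(60), bump(100), bump(140), ...
     bump(40)+bump(120), bump(60)+bump(140), bump(80)+bump(160)];

n = size(S, 2);
D = zeros(n);                                    % pairwise DTW distances
for i = 1:n-1
    for j = i+1:n
        D(i,j) = dtw(S(:,i), S(:,j));            % warping absorbs the timing shift
        D(j,i) = D(i,j);                         % keep the matrix symmetric
    end
end

Z = linkage(squareform(D), 'average');           % hierarchical clustering on DTW distances
groups = cluster(Z, 'maxclust', 2);              % 2 groups (assumed number)
disp(groups')
```

Here the shifted single-peak profiles land in one group and the double-peak profiles in the other, which plain Euclidean distance on the raw samples would not guarantee.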
I hope this helps!

Release

R2022a