Azzera filtri
Azzera filtri

about high dimension and low sample using PCA

1 visualizzazione (ultimi 30 giorni)
Yimin Chen
Yimin Chen il 8 Nov 2016
Risposto: Aditya il 27 Giu 2024
I am using PCA to detect the abnormality in time-series data. Currently I have high dimension and low sample dataset (15*530 data matrix). I am wondering if I can use PCA to obtain the statistic such as T^2 and SPE. I noticed that some articles stated that it is improper to use PCA to obtain the statistics under such case.

Risposte (1)

Aditya
Aditya il 27 Giu 2024
Using PCA to detect abnormalities in time-series data, especially with a high-dimensional and low-sample dataset, can be challenging. The primary concern is that PCA may not provide reliable results when the number of features (dimensions) significantly exceeds the number of samples. This is because PCA relies on the covariance matrix, which can be poorly estimated in such scenarios.
% Simulate high-dimensional, low-sample data
rng(0);
DATASET = rand(15, 530);
% Apply PCA
[coeff, score, latent] = pca(DATASET);
% Calculate T² statistic
T2 = sum((score ./ sqrt(latent')).^2, 2);
% Calculate SPE (Q-statistic)
reconstructed = score * coeff';
SPE = sum((DATASET - reconstructed).^2, 2);
% Set threshold for T² and SPE (e.g., 95% confidence level)
alpha = 0.05;
T2_threshold = chi2inv(1 - alpha, size(coeff, 2));
SPE_threshold = prctile(SPE, 95);
% Detect abnormalities
abnormal_T2 = T2 > T2_threshold;
abnormal_SPE = SPE > SPE_threshold;
disp('Abnormalities detected by T²:');
disp(abnormal_T2);
disp('Abnormalities detected by SPE:');
disp(abnormal_SPE);

Categorie

Scopri di più su Dimensionality Reduction and Feature Extraction in Help Center e File Exchange

Tag

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by