
permutation in regression learner app

Parisa Ahmadi Ghomroudi on 12 Oct 2021
Answered: Shubham on 15 May 2024
I am using the Statistics and Machine Learning Toolbox to find the best model to predict my variable of interest. After finding my best model, how can I run a permutation test to assess the p-value?
  2 Comments
Ive J on 24 Oct 2021
how can I compute permutation to assess the p value... What exactly do you mean by computing permutation?
Parisa Ahmadi Ghomroudi on 25 Oct 2021
Thank you for your reply. The best model for my data is a boosted tree, and I want to find the predictor importance of my data by permutation. I could not find an option for this in the MATLAB toolbox. I tried oobPermutedPredictorImportance, but it seems to be suitable only for BaggedEnsemble models.


Answers (1)

Shubham on 15 May 2024
Hi Parisa,
To run a permutation test for the significance (p-value) of your best model's performance in MATLAB with the Statistics and Machine Learning Toolbox, shuffle the responses of your dataset many times and recompute the model's performance for each shuffle. Comparing the original model's performance against the distribution of performances from these permutations estimates how likely it is to observe that performance by chance.
Here's a step-by-step guide to performing a permutation test:
1. Fit Your Best Model
First, fit your model using the original dataset. This involves selecting your predictors (features) and response variable, then training the model accordingly.
% Assuming X are your predictors and Y is your response variable
bestModel = fitlm(X, Y); % Example for a linear model, adjust according to your model type
% Compute the performance of your model, e.g., R-squared, RMSE, accuracy...
originalPerformance = bestModel.Rsquared.Ordinary; % Adjust this metric as needed
2. Permutation Test
To perform the permutation test, you will shuffle the response variable Y multiple times, refit the model for each shuffled dataset, and compute its performance metric.
numPermutations = 1000; % Number of permutations
performanceShuffled = zeros(numPermutations, 1); % Preallocate array for performance metrics
for i = 1:numPermutations
    % Shuffle the response variable
    Y_shuffled = Y(randperm(length(Y)));
    % Fit the model to the shuffled dataset
    modelShuffled = fitlm(X, Y_shuffled); % Adjust for your model type
    % Compute the performance metric for the shuffled model
    performanceShuffled(i) = modelShuffled.Rsquared.Ordinary; % Adjust metric as needed
end
3. Compute the P-value
After obtaining the distribution of performance metrics from the shuffled datasets, you can compute the p-value as the proportion of times the shuffled models' performances equal or exceed the performance of the original model.
pValue = sum(performanceShuffled >= originalPerformance) / numPermutations;
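One caveat with the proportion above: if no shuffled model reaches the original performance, it returns exactly 0, which overstates the evidence from a finite number of permutations. A common refinement counts the observed statistic as one of the permutations. The sketch below uses made-up numbers (the values in performanceShuffled and originalPerformance are hypothetical, chosen only for illustration) to compare the two estimates.

```matlab
% Hypothetical null statistics from 9 permutations (illustrative values only)
performanceShuffled = [0.02; 0.10; 0.05; 0.31; 0.07; 0.01; 0.12; 0.04; 0.09];
originalPerformance = 0.35;   % hypothetical observed R-squared
numPermutations = numel(performanceShuffled);

% Number of shuffled fits that do at least as well as the original fit
numAsExtreme = sum(performanceShuffled >= originalPerformance);

% Plain proportion: can be exactly zero
naiveP = numAsExtreme / numPermutations;

% Add-one estimate: the observed statistic counts as one permutation,
% so the smallest attainable value is 1/(numPermutations + 1)
correctedP = (numAsExtreme + 1) / (numPermutations + 1);

fprintf('naive p = %.3f, add-one p = %.3f\n', naiveP, correctedP);
```

With 1000 permutations the two estimates differ very little, but the add-one form never reports a p-value of zero.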
Notes
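  • The choice of performance metric (e.g., R-squared, RMSE, accuracy) depends on your model type and the nature of your prediction task (regression or classification). The >= comparison in step 3 assumes a metric where higher is better, such as R-squared or accuracy; for an error metric such as RMSE, count the shuffled models whose error is less than or equal to the original model's error instead.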
  • Adjust the model fitting function (fitlm in the example) according to the type of model you're using (e.g., fitglm for generalized linear models, fitctree for decision trees, etc.).
  • A low p-value (typically <0.05) suggests that the observed model performance is unlikely to be due to chance, indicating a potentially significant relationship captured by the model.
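Since the question mentions a boosted tree, here is a minimal sketch of the same test with fitrensemble in place of fitlm. It rests on several assumptions that are not from the original question: the data are synthetic, the ensemble uses 30 LSBoost cycles, and the statistic is the 5-fold cross-validated MSE rather than in-sample R-squared. A flexible learner such as a boosted tree can fit shuffled responses closely in-sample, so a cross-validated loss is usually the more informative statistic. Because lower MSE is better, the comparison uses <=.

```matlab
rng(0);                                    % reproducible synthetic data (assumption)
n = 120;
X = randn(n, 4);
Y = 2*X(:,1) - X(:,2) + 0.5*randn(n, 1);   % only the first two predictors carry signal

% 5-fold cross-validated MSE of a boosted regression ensemble
cvMSE = @(Xin, Yin) kfoldLoss(fitrensemble(Xin, Yin, ...
    'Method', 'LSBoost', 'NumLearningCycles', 30, 'KFold', 5));

observedMSE = cvMSE(X, Y);

numPermutations = 50;                      % kept small for speed; use more in practice
nullMSE = zeros(numPermutations, 1);
for i = 1:numPermutations
    % Shuffling Y breaks any relationship between X and Y
    nullMSE(i) = cvMSE(X, Y(randperm(n)));
end

% Lower MSE is better, so count shuffles that do at least as well (<=);
% the +1 terms count the observed statistic as one permutation
pValue = (sum(nullMSE <= observedMSE) + 1) / (numPermutations + 1);
fprintf('observed CV MSE = %.3f, p = %.4f\n', observedMSE, pValue);
```

Each permutation refits the cross-validated ensemble, so the run time grows with numPermutations, the number of folds, and the number of learning cycles.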
This approach provides a non-parametric way to assess the significance of your model's predictive ability, complementing traditional statistical tests and confidence intervals that assume specific data distributions.
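For the follow-up in the comments (predictor importance by permutation for a boosted tree, where oobPermutedPredictorImportance is not available), one possible manual approach is to permute one predictor column at a time on held-out data and record how much the loss increases. The following is only a sketch under stated assumptions: the data are synthetic, the 30% holdout split, the 50 LSBoost cycles, and the 20 repeats are arbitrary choices, and the importance is defined here as the mean increase in test MSE.

```matlab
rng(1);                                   % reproducible synthetic data (assumption)
n = 300;
X = randn(n, 3);
Y = 3*X(:,1) + 0.1*randn(n, 1);           % only predictor 1 carries signal

% Hold out 30% of the rows for evaluating the fitted model
holdout = cvpartition(n, 'HoldOut', 0.3);
Xtrain = X(training(holdout), :);   Ytrain = Y(training(holdout));
Xtest  = X(test(holdout), :);       Ytest  = Y(test(holdout));

mdl = fitrensemble(Xtrain, Ytrain, 'Method', 'LSBoost', 'NumLearningCycles', 50);
baselineMSE = loss(mdl, Xtest, Ytest);    % test MSE with intact predictors

numRepeats = 20;                          % permutations per predictor
importance = zeros(1, size(X, 2));
for j = 1:size(X, 2)
    increase = zeros(numRepeats, 1);
    for r = 1:numRepeats
        Xperm = Xtest;
        % Permute only column j; the model itself is not refit
        Xperm(:, j) = Xtest(randperm(size(Xtest, 1)), j);
        increase(r) = loss(mdl, Xperm, Ytest) - baselineMSE;
    end
    importance(j) = mean(increase);       % mean increase in test MSE
end
disp(importance)
```

A predictor the model relies on shows a large increase in loss when permuted, while an irrelevant one stays near zero. Unlike the permutation test above, the model is fitted once and only the test predictors are shuffled.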
