How to use Feature selection in MATLAB

Hugo on 20 Jan 2022
Answered: Akanksha on 28 Apr 2025
Hi,
I have a CSV file with 10 columns and 1000 lines. Apart from the headers (1st line), all the values are numeric. I would like to run feature selection on the first 9 columns, using the last column (column 10) as the target.
I have three questions about my problem:
- Which feature selection method, from those shown here: https://www.mathworks.com/discovery/feature-selection.html
is most suitable?
- Which function should be used for feature selection?
- How can I set up my code to obtain results from feature selection?
Thank you,

Answers (1)

Akanksha on 28 Apr 2025
Hey @Hugo,
Answering your queries:
1. Which feature selection method should be used? The best method depends on your data and your goal. For regression problems (i.e. when the target is numeric, as in your case), common choices are:
  • ReliefF (in its regression variant, RReliefF),
  • F-test (filter method),
  • LASSO regression (embedded method) and
  • Sequential Feature Selection (wrapper method).
For classification problems (i.e. when the target is categorical), common choices are:
  • ReliefF (for classification),
  • F-test/ANOVA,
  • Sequential Feature Selection.
Sequential Feature Selection (using sequentialfs) is a good general-purpose choice: it works for both regression and classification and can be paired with any model, although it is slower than filter methods because it refits the model for every candidate feature subset.
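As a quick first pass before running a wrapper method, the F-test filter from the list above can rank all 9 columns in a single call. The following is a minimal sketch, assuming the Statistics and Machine Learning Toolbox is installed (fsrftest is available from R2020a). The data here is synthetic and only stands in for your CSV; the choice of columns 2 and 7 as the informative ones is made up for illustration.

```matlab
% Synthetic stand-in for the CSV (assumption): 1000 rows, 9 features.
rng(0);                                   % make the random data reproducible
X = randn(1000, 9);
Y = 3*X(:, 2) - 2*X(:, 7) + 0.5*randn(1000, 1);   % only columns 2 and 7 matter

% Univariate F-test ranking for a numeric target.
% idx    : column indices sorted from most to least important
% scores : -log(p-value) per column, in the ORIGINAL column order
[idx, scores] = fsrftest(X, Y);

disp('Columns ranked by F-test importance:');
disp(idx);
disp('Scores of the ranked columns:');
disp(scores(idx));
```

Because the F-test looks at one feature at a time, it is fast but cannot detect features that only matter in combination, so it is best treated as a screening step rather than a final answer.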
2. Which function should be used for feature selection?
MATLAB R2021a provides several functions for feature selection. The most general and commonly used are:
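  • sequentialfs: Sequential feature selection for regression or classification, driven by a criterion function that you supply.
  • relieff: Ranks features using the ReliefF algorithm (RReliefF when the target is numeric).
  • lasso: Performs LASSO regression and selects features by shrinking the coefficients of unhelpful features to exactly zero.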
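To give a feel for how relieff and lasso are called, here is a short sketch under the same assumptions as the question (9 numeric features, numeric target). The data is synthetic, and the neighbor count of 10 and the 5-fold cross-validation are illustrative choices, not recommendations from the original answer.

```matlab
% Synthetic stand-in for the CSV (assumption): only columns 2 and 7 matter.
rng(0);
X = randn(1000, 9);
Y = 3*X(:, 2) - 2*X(:, 7) + 0.5*randn(1000, 1);

% ReliefF for a numeric target, using 10 nearest neighbors.
% ranks   : column indices sorted from most to least important
% weights : importance weight per column, in the original column order
[ranks, weights] = relieff(X, Y, 10, 'method', 'regression');
disp('ReliefF ranking:');
disp(ranks);

% LASSO with 5-fold cross-validation. B has one column of coefficients
% per regularization strength (lambda).
[B, FitInfo] = lasso(X, Y, 'CV', 5);

% Index1SE picks the sparsest model whose cross-validated MSE is within
% one standard error of the minimum; nonzero coefficients are "selected".
lassoSelected = find(B(:, FitInfo.Index1SE) ~= 0)';
disp('Columns kept by LASSO:');
disp(lassoSelected);
```

relieff returns a ranking, so you still have to decide how many top features to keep, whereas lasso gives you a subset directly once a lambda is chosen.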
3. Below is the sample code that will help you achieve your results in MATLAB R2021a.
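% Feature selection example for regression (MATLAB R2021a)
% Requires Statistics and Machine Learning Toolbox
% 1. Load data from CSV (readmatrix normally detects and skips the header line)
data = readmatrix('yourfile.csv'); % Replace with your actual filename
X = data(:, 1:9); % Features (first 9 columns)
Y = data(:, 10); % Target (last column)
% 2. Define the criterion function for sequentialfs.
% It must return the SUM of squared errors on the test fold, not the mean:
% sequentialfs adds the returned values over all folds and divides by the
% total number of test observations, which gives the cross-validated MSE.
fun = @(Xtrain, Ytrain, Xtest, Ytest) ...
    sum((Ytest - predict(fitlm(Xtrain, Ytrain), Xtest)).^2);
% 3. Run sequential feature selection with 5-fold cross-validation
opts = statset('Display','iter'); % Show progress
[fs, history] = sequentialfs(fun, X, Y, 'cv', 5, 'options', opts);
% 4. Display selected features (fs is a logical mask over the 9 columns)
disp('Selected feature columns:');
disp(find(fs));
% 5. Plot feature selection history (one point per feature added)
figure;
plot(history.Crit, 'o-');
xlabel('Number of features');
ylabel('Cross-validated MSE');
title('Feature selection history');
grid on;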
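Since your CSV has a header line, you may also want the selected features reported by name and a final model fitted on just those columns. The sketch below is one possible way to do that, not part of the original answer. It writes a small temporary CSV so that it runs on its own; the column names x1..x9 and y, and the mask fs selecting columns 2 and 7, are hypothetical stand-ins for your headers and for the output of sequentialfs.

```matlab
% Build a small CSV with a header row so the example is self-contained.
% The names x1..x9 and y are hypothetical stand-ins for your headers.
rng(0);
data = randn(200, 10);
data(:, 10) = 3*data(:, 2) - 2*data(:, 7) + 0.5*randn(200, 1);
names = [arrayfun(@(k) sprintf('x%d', k), 1:9, 'UniformOutput', false), {'y'}];
csvFile = [tempname '.csv'];
writetable(array2table(data, 'VariableNames', names), csvFile);

% readtable keeps the header names, unlike readmatrix.
T = readtable(csvFile);

% Stand-in for the logical mask returned by sequentialfs (assumption).
fs = false(1, 9);
fs([2 7]) = true;

% Map the mask back to the header names.
selectedNames = T.Properties.VariableNames(fs);
disp('Selected features:');
disp(selectedNames);

% Refit on the selected columns only. When given a table, fitlm treats
% the last variable as the response.
mdl = fitlm(T(:, [find(fs), 10]));
disp(mdl);

delete(csvFile);                          % remove the temporary file
```

With your own file you would replace the first block with T = readtable('yourfile.csv') and use the fs returned by sequentialfs instead of the stand-in mask.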
Hope this helps!

Release

R2021a
