version (4.59 MB) by Ke Yan
Implementation and wrappers of ~40 common pattern recognition algorithms.


Updated 26 Apr 2016

Yet ANother pattern recognition toolbox.
>>Feature processing

>>Classification
Logistic regression (LR), softmax
support vector machine (SVM)
random forest (RF)
K nearest neighbors (KNN)
Bayes, Mahalanobis distance
artificial neural networks (ANN)
extreme learning machine (ELM)

>>Regression
(Kernel) ridge regression
support vector regression (SVR)
least squares, robust fitting, quadratic fitting
partial least squares (PLS)
step-wise fit
random forest (RF)
artificial neural networks (ANN)

>>Feature selection
Correlation coefficients, Fisher ratio
minimum redundancy maximal relevance (mRMR)
single feature predictor
sequential forward selection (SFS)
genetic algorithm (GA)
random forest (RF)
step-wise fit
SVM-RFE (original linear and kernel version)

>>Representative sample selection (active learning)
Cluster centers
transductive experimental design (TED)
locally linear reconstruction (LLR)
Kennard-Stone algorithm (KS)

* Unified and simple interface;
* Convenient to observe and change algorithm parameters;
* Extensible: the simple file structure makes it easy to modify the algorithms.


>>Feature processing
[Xnew, model] = ftProc_xxx_tr(X,Y,param) % training
Xnew = ftProc_xxx_te(model,X) % test

>>Classification
model = classf_xxx_tr(X,Y,param) % training
[pred,prob] = classf_xxx_te(model,Xtest) % test, return the predicted labels and probabilities (optional)
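
As a rough illustration of the training/test pair above, the sketch below puts a nearest-class-mean classifier behind the same kind of interface. It is not taken from the toolbox: the names demo_classf_interface, classf_ncm_tr, and classf_ncm_te are hypothetical, the param argument is accepted but ignored, and the returned prob values are uncalibrated scores rather than true probabilities. Save it as demo_classf_interface.m to run it.

```matlab
function [pred, prob] = demo_classf_interface()
% Illustrative only: classf_ncm_tr / classf_ncm_te are hypothetical names
% that mimic the classf_xxx_tr / classf_xxx_te pattern.
rng(0);                                        % reproducible toy data
X     = [randn(50,2) - 2; randn(50,2) + 2];    % two Gaussian blobs
Y     = [ones(50,1); 2*ones(50,1)];            % class labels 1 and 2
Xtest = [-2 -2; 2 2];                          % one point near each blob center

model        = classf_ncm_tr(X, Y, struct());  % training
[pred, prob] = classf_ncm_te(model, Xtest);    % test
end

function model = classf_ncm_tr(X, Y, param) %#ok<INUSD>
% Store one mean vector per class; param is accepted but unused here.
model.classes = unique(Y);
nClass = numel(model.classes);
model.means = zeros(nClass, size(X,2));
for k = 1:nClass
    model.means(k,:) = mean(X(Y == model.classes(k), :), 1);
end
end

function [pred, prob] = classf_ncm_te(model, Xtest)
% Squared Euclidean distance from every test sample to every class mean.
nClass = numel(model.classes);
d2 = zeros(size(Xtest,1), nClass);
for k = 1:nClass
    delta = bsxfun(@minus, Xtest, model.means(k,:));
    d2(:,k) = sum(delta.^2, 2);
end
[~, idx] = min(d2, [], 2);                     % closest class mean wins
pred = model.classes(idx);
% Pseudo-probabilities: normalized exp(-distance), not calibrated.
w = exp(-bsxfun(@minus, d2, min(d2, [], 2)));
prob = bsxfun(@rdivide, w, sum(w, 2));
end
```

The model struct is the only thing passed from training to test, which is what lets the two phases be called separately.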

>>Regression
model = regress_xxx_tr(X,Y,param) % training
rv = regress_xxx_te(model,Xtest) % test, return the predicted values

>>Feature selection
[ftRank,ftScore] = ftSel_xxx(ft,target,param) % return the feature rank (or subset) and scores (optional)
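
To make the two outputs concrete, here is a minimal sketch of a filter-style selector that returns a rank and a score per feature. The function name demo_ftsel_fisher is hypothetical, and the score definition (between-class scatter divided by within-class scatter, computed per feature) is one common form of the Fisher ratio, assumed here for illustration; the toolbox's own definition may differ.

```matlab
function [ftRank, ftScore] = demo_ftsel_fisher(ft, target)
% Hypothetical illustration of the [ftRank, ftScore] output convention.
% ft: nSamples-by-nFeatures matrix, target: nSamples-by-1 class labels.
classes = unique(target);
nFt     = size(ft, 2);
mu      = mean(ft, 1);                 % overall mean of each feature
between = zeros(1, nFt);               % between-class scatter per feature
within  = zeros(1, nFt);               % within-class scatter per feature
for k = 1:numel(classes)
    Xk  = ft(target == classes(k), :);
    muK = mean(Xk, 1);
    between = between + size(Xk,1) * (muK - mu).^2;
    within  = within  + sum(bsxfun(@minus, Xk, muK).^2, 1);
end
ftScore = between ./ max(within, eps); % guard against zero within-class scatter
[~, ftRank] = sort(ftScore, 'descend');% most discriminative feature first
end
```

With this convention, ft(:, ftRank(1:k)) keeps the k highest-scoring features.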

>>Representative sample selection (active learning)
smpList = smpSel_xxx(X,nSel,param) % return the indices of the selected samples
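
The sketch below shows what such a selector might look like for the Kennard-Stone algorithm listed earlier: start from the two samples farthest apart, then repeatedly add the sample whose distance to its nearest already-selected sample is largest. The function name demo_smpsel_ks is hypothetical, Euclidean distance is assumed, and the full pairwise distance matrix is built in memory, so this form is only suitable for small data sets.

```matlab
function smpList = demo_smpsel_ks(X, nSel)
% Kennard-Stone sketch: greedily pick samples far from those already chosen.
n    = size(X, 1);
nSel = min(nSel, n);
% Pairwise squared Euclidean distances (n-by-n).
G  = X * X';
sq = diag(G);
D  = max(bsxfun(@plus, sq, sq') - 2*G, 0);   % clamp tiny negative rounding errors
% Start with the two samples that are farthest apart.
[~, lin] = max(D(:));
[i, j]   = ind2sub([n n], lin);
smpList  = [i, j];
minDist  = min(D(:, i), D(:, j));   % distance to the nearest selected sample
while numel(smpList) < nSel
    minDist(smpList) = -Inf;        % never reselect a sample
    [~, next] = max(minDist);
    smpList(end+1) = next;          %#ok<AGROW>
    minDist = min(minDist, D(:, next));
end
smpList = smpList(1:nSel);          % handles nSel < 2
end
```

The returned indices can be used directly, for example X(smpList, :) for the selected samples.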

Please see test.m for sample usages.

In addition, there are three uniform wrappers: ftProc_, classf_, and regress_. Each accepts an algorithm name string as input and combines the training and test phases in a single call.
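
The idea of a name-driven wrapper can be sketched as follows. This is not the toolbox's wrapper: the function name demo_regress_wrapper, its argument order, and the two algorithm names 'mean' and 'ls' are assumptions made for illustration. It only shows how a string can select a training/test pair that is then run back to back.

```matlab
function rv = demo_regress_wrapper(Xtrain, Ytrain, Xtest, algoName)
% Hypothetical wrapper: pick a train/test pair by name, then run both phases.
addOnes = @(X) [ones(size(X,1),1), X];       % prepend an intercept column
switch lower(algoName)
    case 'mean'   % baseline: always predict the training mean
        trainFn = @(X,Y) mean(Y);
        testFn  = @(model,X) repmat(model, size(X,1), 1);
    case 'ls'     % ordinary least squares with an intercept
        trainFn = @(X,Y) addOnes(X) \ Y;
        testFn  = @(model,X) addOnes(X) * model;
    otherwise
        error('demo:unknownAlgo', 'Unknown algorithm name: %s', algoName);
end
model = trainFn(Xtrain, Ytrain);   % training phase
rv    = testFn(model, Xtest);      % test phase
end
```

For example, with X = (1:5)' and Y = 2*X + 1, the call demo_regress_wrapper(X, Y, [6; 7], 'ls') returns values close to 13 and 15, while 'mean' returns 7 for both rows.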

Please find more details at

Cite As

Ke Yan (2021). YAN-PRTools (, GitHub. Retrieved .

Comments and Ratings (6)


Quoc Pham

Thank you, Ke Yan, for your kind reply.
If this line of code is correct:
Y_hat = classRF_predict(X_trn,model);
then my model's performance is as follows:
training accuracy is 100%, while test accuracy is 75%.
In this case, do you think the model is overfitting?

Ke Yan

Quoc, I think you are correct to use
Y_hat = classRF_predict(X_trn,model);
to get the training set accuracy.

Quoc Pham

Hi Ke Yan,
I really appreciate your contribution.
Based on your code, I modified and applied random forest classification to my case.
The code is as follows:
% read files
Xtrn = xlsread('510classcalipca.xlsx'); % training dataset, including features and class labels
Xtst = xlsread('510classvalipca.xlsx'); % test dataset, including features and class labels
X_trn = Xtrn(:,1:6); % features of the training dataset
Y_trn = Xtrn(:,34); % class labels of the training dataset
X_tst = Xtst(:,1:6); % features of the test dataset
Y_tst = Xtst(:,34); % class labels of the test dataset
model = classRF_train(X_trn,Y_trn, 449,2);
Y_hat = classRF_predict(X_tst,model);
fprintf('accuracy %f\n', length(find(Y_hat==Y_tst))/length(Y_tst));

Could you help me with this question?
How can I evaluate the performance of the model on the training dataset? Is it correct to simply modify the code like this:
Y_hat = classRF_predict(X_trn,model);
fprintf('accuracy %f\n', length(find(Y_hat==Y_trn))/length(Y_trn));
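
The train-versus-test comparison discussed in this thread can be tried without the random forest package. The sketch below is a stand-in, not the code from the thread: it swaps in a 1-nearest-neighbor classifier so that it runs in base MATLAB, uses synthetic data, and the names demo_train_vs_test and predict1nn are hypothetical. It evaluates the same fitted model once on the training rows and once on held-out rows.

```matlab
function [accTrn, accTst] = demo_train_vs_test()
% Stand-in example: 1-nearest-neighbor replaces the random forest so the
% sketch needs no extra package. All data are synthetic.
rng(1);
n = 200;
X = [randn(n,2); randn(n,2) + 1.5];      % two overlapping classes
Y = [ones(n,1); 2*ones(n,1)];

idx  = randperm(2*n);                    % random 70/30 split
nTrn = round(0.7 * 2*n);
trn  = idx(1:nTrn);
tst  = idx(nTrn+1:end);

% Same model, two evaluation sets.
accTrn = mean(predict1nn(X(trn,:), Y(trn), X(trn,:)) == Y(trn));
accTst = mean(predict1nn(X(trn,:), Y(trn), X(tst,:)) == Y(tst));
fprintf('training accuracy %f\n', accTrn);
fprintf('test accuracy     %f\n', accTst);
end

function pred = predict1nn(Xtrn, Ytrn, Xq)
% Label each query row with the label of its closest training row.
pred = zeros(size(Xq,1), 1);
for i = 1:size(Xq,1)
    d2 = sum(bsxfun(@minus, Xtrn, Xq(i,:)).^2, 2);
    [~, j] = min(d2);
    pred(i) = Ytrn(j);
end
end
```

In this stand-in, every training row is its own nearest neighbor, so training accuracy is 100% by construction and says little on its own; the held-out accuracy is the figure to compare between models.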

MATLAB Release Compatibility
Created with R2011a
Compatible with any release
Platform Compatibility
Windows macOS Linux
