What is 'categorical predictor' in decision tree for regression

5 visualizzazioni (ultimi 30 giorni)
Hi, I'd like to use Matlab's own example for the question. Please refer to https://uk.mathworks.com/help/stats/fitrtree.html for the original example.
>> load carsmall
>> whos
Name Size Bytes Class Attributes
Acceleration 100x1 800 double
Cylinders 100x1 800 double
Displacement 100x1 800 double
Horsepower 100x1 800 double
MPG 100x1 800 double
Mfg 100x13 2600 char
Model 100x33 6600 char
Model_Year 100x1 800 double
Origin 100x7 1400 char
Weight 100x1 800 double
>> tree = fitrtree([Weight, Cylinders],MPG,...
'categoricalpredictors',2,'MinParentSize',20,...
'PredictorNames',{'W','C'})
tree =
RegressionTree
PredictorNames: {'W' 'C'}
ResponseName: 'Y'
CategoricalPredictors: 2
ResponseTransform: 'none'
NumObservations: 94
Properties, Methods
What exactly is the Categorical Predictors in this case and why it is 2?

Risposta accettata

Adam Danz
Adam Danz il 10 Gen 2019
Modificato: Adam Danz il 10 Gen 2019
Matlab's fitrtree() function returns a regression tree object. Read more about this object and its properties here:
As you'll read in the link above, the "CategoricalPredictors" contains index values corresponding to the columns of the categorical predictor data (if none of the predictors are categorical, this will be empty []).
So, why is it CategoricalPredictors equal to 2?
Now read about the function you're using fitrtree()
One of the name-value pairs (<- link) is 'CategoricalPredictors' which, is specified in your call to fitrtree() as 2. That's because you have two predictors being treated as categorical variables, [Weight, Cylinders].
  3 Commenti
Salad Box
Salad Box il 15 Gen 2019
Actually, as I read a few times, 'the "CategoricalPredictors" contains index values corresponding to the columns of the categorical predictor data', 'index value' (or the 'entry') means if index value is 1, that is the first column of the predictor data, in this case, it is 'Weight'; if 'index value' is 2, that is the second column of the predictor data, in this case, it is 'Cylinders'.
So
tree = fitrtree([Weight, Cylinders],MPG,...
'categoricalpredictors',2,'MinParentSize',20,...
'PredictorNames',{'W','C'})
indicates 'Cylinders' (the 2nd column in the predictor data) is a categorical predictor. In fact, there are only 4, 6, 8 cylinders, the number is not countinuous.

Accedi per commentare.

Più risposte (0)

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by