What is the difference between oobPredict and predict with an ensemble of bagged decision trees?

Question

0 votes

1- I am using both functions to predict a response with a random forest, but predict gives a higher percentage of explained variance than oobPredict. Why is that? I think there is something fundamental that I have not yet fully grasped.

2- If these methods differ in how they weigh the trees, how can I make them homogeneous?

3- Can oobPredict be used in some way to make predictions on a new set of data?

Answer 1

Malay Agarwal on 26 Aug 2024

Edited: Malay Agarwal on 26 Aug 2024

1 vote

Hi @Faranak,

The "oobPredict" function gives a more realistic estimate of the model's performance. For each training sample, it uses only the trees for which that sample was out-of-bag, that is, the trees that did not see the sample during training. Because those trees never fit the sample, their predictions on it tend to be less accurate and reflect how the model behaves on unseen data. This leads to a lower percentage of explained variance.

On the other hand, the "predict" function uses all the trees to obtain a prediction for a sample. If the sample is from the training set, many of the trees saw it during training (about 63% of them on average when each tree is grown on a bootstrap sample the same size as the training set), so the ensemble partly reproduces what it has already fit and appears to account for more of the variance in the dataset.

This is similar to having a training set and a validation set when training a neural network (https://en.wikipedia.org/wiki/Training,_validation,_and_test_data_sets). The network typically reports a higher error and explains less of the variance on the validation set, since it was not trained on those samples. The out-of-bag samples act as a validation set, since only the trees that have not seen a sample during training have a say in its final prediction.
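The gap between the two numbers can be reproduced outside MATLAB. The following is a minimal Python sketch, not the TreeBagger implementation: it assumes a toy noisy sine dataset and uses 1-nearest-neighbour regressors as a stand-in for fully grown trees (both reproduce their own training points exactly). All names in it (`predict_all`, `predict_oob`, `bags`) are made up for illustration.

```python
import math
import random

random.seed(0)

# Toy regression data: a noisy sine curve on 60 evenly spaced points.
n = 60
xs = [i / n for i in range(n)]
ys = [math.sin(2 * math.pi * x) + random.gauss(0, 0.3) for x in xs]

# Bagging: every learner gets a bootstrap sample (drawn with replacement).
# Only the distinct indices matter for a nearest-neighbour learner.
num_learners = 50
bags = [sorted({random.randrange(n) for _ in range(n)}) for _ in range(num_learners)]


def learner_predict(bag, x):
    """Return the response of the in-bag point closest to x."""
    nearest = min(bag, key=lambda j: abs(xs[j] - x))
    return ys[nearest]


def predict_all(i):
    """Average over every learner, like calling predict on training data."""
    return sum(learner_predict(bag, xs[i]) for bag in bags) / num_learners


def predict_oob(i):
    """Average only over learners whose bootstrap sample excludes point i."""
    votes = [learner_predict(bag, xs[i]) for bag in bags if i not in bag]
    return sum(votes) / len(votes) if votes else None


def r_squared(y_true, y_pred):
    """Fraction of variance explained by the predictions."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Keep only points that were out of bag for at least one learner.
rows = [i for i in range(n) if predict_oob(i) is not None]
y_true = [ys[i] for i in rows]
r2_all = r_squared(y_true, [predict_all(i) for i in rows])
r2_oob = r_squared(y_true, [predict_oob(i) for i in rows])
in_bag_share = sum(len(bag) for bag in bags) / (n * num_learners)

print(f"share of points each learner saw: {in_bag_share:.2f}")
print(f"R^2 with all learners (predict-style): {r2_all:.3f}")
print(f"R^2 with OOB learners only (oobPredict-style): {r2_oob:.3f}")
```

In this sketch the all-learner residual for a training point is the out-of-bag residual scaled by the fraction of learners that did not see the point, which is why the all-learner explained variance comes out higher on training data.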

This is explained in the documentation of "oobPredict" (https://www.mathworks.com/help/stats/treebagger.oobpredict.html#bu0qyz1-2), albeit in a less direct manner:

"For each observation that is out of bag for at least one tree, oobPredict composes the weighted mean of the class posterior probabilities by selecting the trees in which the observation is out of bag."

I don't think there is any way to make the outputs more homogeneous, since for a training sample "oobPredict" always uses a different set of trees than the "predict" function does. You can try experimenting with the "TreeWeights" name-value argument, but I think that is unlikely to work: it only defines how the trees are weighted when their predictions are combined, and does not affect which trees take part in the prediction.

Coming to your last question, the "oobPredict" function does not support making predictions on new data. It exists only to evaluate the model's performance by giving a less biased estimate of its error. For new data, please use the "predict" function.
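To see why reweighting alone is unlikely to reconcile the two outputs, here is a small Python sketch of the arithmetic. It is an illustration under assumptions, not MATLAB code: the per-tree predictions, the out-of-bag flags, and the helper names `weighted_mean` and `aggregate` are all hypothetical.

```python
def weighted_mean(values, weights):
    """Weighted average; assumes the weights sum to a positive number."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def aggregate(tree_preds, tree_weights, use_tree):
    """Combine per-tree predictions, but only for trees flagged in use_tree."""
    kept = [(p, w) for p, w, u in zip(tree_preds, tree_weights, use_tree) if u]
    return weighted_mean([p for p, _ in kept], [w for _, w in kept])


# Hypothetical per-tree predictions for one training observation.
tree_preds = [2.0, 4.0, 6.0, 8.0]
# True where the observation was out of bag for that tree.
is_oob = [False, True, False, True]
all_trees = [True] * len(tree_preds)

for weights in ([1.0, 1.0, 1.0, 1.0], [1.0, 3.0, 1.0, 1.0]):
    full = aggregate(tree_preds, weights, all_trees)
    oob = aggregate(tree_preds, weights, is_oob)
    print(f"weights={weights}: all trees -> {full:.3f}, OOB trees -> {oob:.3f}")
```

Changing the weights moves both results, but the out-of-bag average still ignores the trees that saw the observation, so the two aggregates are computed over different sets of trees.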
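One way to picture this: out-of-bag prediction needs to know which trees trained on an observation, and that bookkeeping only exists for training rows. A new observation was seen by no tree, so averaging over all trees is already the "unseen data" prediction. The Python sketch below illustrates the idea under assumptions (toy linear data, 1-nearest-neighbour learners standing in for trees, hypothetical function names); it is not how TreeBagger is implemented.

```python
import random

random.seed(1)

# Toy training set: y is roughly 3x plus noise, for x = 0..29.
train_x = [float(i) for i in range(30)]
train_y = [3.0 * x + random.gauss(0, 1.0) for x in train_x]
n = len(train_x)

# Each learner keeps the indices of its bootstrap sample (its "bag").
bags = [{random.randrange(n) for _ in range(n)} for _ in range(25)]


def learner_predict(bag, x):
    """1-nearest-neighbour stand-in for a tree trained on the bag."""
    nearest = min(sorted(bag), key=lambda j: abs(train_x[j] - x))
    return train_y[nearest]


def predict(x):
    """All learners vote; works for any x, seen or unseen."""
    return sum(learner_predict(bag, x) for bag in bags) / len(bags)


def oob_predict(i):
    """Only defined for training index i, because it needs bag membership."""
    votes = [learner_predict(bag, train_x[i]) for bag in bags if i not in bag]
    return sum(votes) / len(votes) if votes else None


new_x = 12.5  # not in the training set, so no learner was trained on it
trees_that_saw_new_x = sum(1 for bag in bags if new_x in {train_x[j] for j in bag})
print("learners that trained on new_x:", trees_that_saw_new_x)
print("predict(new_x):", round(predict(new_x), 2))
```

Note that `oob_predict` takes a training index rather than a feature value: there is no bag membership to look up for a point that was never part of the training set.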

Hope this helps!

1 Comment

Faranak on 29 Aug 2024

Thanks a lot Malay. Your answers made a lot of points clearer to me.
