MATLAB: How will there be any out-of-bag predictions if each replica in the TreeBagger has the data set size equal to the number of observations in the training set

MATLAB, out-of-bag, predictions, treebagger

How will there be any out-of-bag predictions if each replica in the TreeBagger has the data set size equal to the number of observations in the training set?
If the "InBagFraction" is "1" by default, then how much data is sampled from the entire set for each tree?
If this fraction refers to how much data is sampled for each tree, and by default it is 1, then how would there be any out of bag predictions?

Best Answer

The "InBagFraction" property specifies the fraction of the input data that is sampled to grow each new tree. By default, that sampling is done with replacement.
If the value of this property is "1", each bootstrap replica contains the same total number of observations as the training set, but not the same set of observations. Because each replica is drawn with replacement, some observations are selected more than once and others are not selected at all. The observations that a given tree never sees are the out-of-bag observations for that tree.
For example, let N be the number of observations in the data passed to "TreeBagger". If "InBagFraction" is 1, N observations are sampled out of N. With the default value of "SampleWithReplacement", the sampling is done with replacement, so each observation is left out of a given replica with probability (1 - 1/N)^N, which is about 0.37 for large N. Roughly a third of the data is therefore out of bag for each tree, and out-of-bag predictions are available.
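
To see this effect without fitting a model, here is a minimal sketch that mimics a single bootstrap replica using only base MATLAB functions. It is not how "TreeBagger" is implemented internally; the sample size N = 1000 and the variable names inBagIdx and oobIdx are assumptions chosen for illustration.

```matlab
% Simulate one bootstrap replica: draw N row indices out of N with replacement.
% N and the variable names are hypothetical, chosen only for this illustration.
rng(0);                                  % fix the seed so the run is repeatable
N = 1000;                                % assumed number of training observations

inBagIdx = randi(N, N, 1);               % N draws from 1..N, duplicates allowed
oobIdx   = setdiff((1:N)', inBagIdx);    % rows that were never drawn are out of bag

fprintf('Draws in replica:       %d\n', numel(inBagIdx));
fprintf('Distinct rows in bag:   %d\n', numel(unique(inBagIdx)));
fprintf('Out-of-bag rows:        %d\n', numel(oobIdx));
fprintf('Out-of-bag fraction:    %.3f\n', numel(oobIdx) / N);
fprintf('Expected (1 - 1/N)^N:   %.3f\n', (1 - 1/N)^N);
```

The replica always holds N draws, yet the number of distinct rows in it is smaller than N, and the remainder is out of bag. On a fitted model, the "OOBIndices" property (populated when "OOBPrediction" is set to "on") should report the same information for each tree.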