Can someone explain the following syntax (especially the part in bold), which is used to determine the optimal tree depth of a regression or classification tree?

Generate minimum leaf occupancies for classification trees from 10 to 100, spaced exponentially apart:
leafs = logspace(1,2,10);
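For reference, here is a minimal sketch of what this call produces, assuming the standard behaviour of logspace(a,b,n), which returns n points between 10^a and 10^b that are evenly spaced on a logarithmic scale. The variable leafs_alt is only an illustrative name and is not part of the documentation example.

```matlab
% logspace(1,2,10) returns 10 points from 10^1 to 10^2,
% evenly spaced in the exponent
leafs = logspace(1,2,10);

% Equivalent construction: evenly spaced exponents, then raise 10 to them
leafs_alt = 10.^linspace(1,2,10);

% The two vectors agree up to floating-point rounding
disp(max(abs(leafs - leafs_alt)));

% Approximate values: 10.0 12.9 16.7 21.5 27.8 35.9 46.4 59.9 77.4 100.0
disp(leafs);
```

Note that most of these values are not integers, so the candidate leaf sizes are denser near 10 and sparser near 100.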
Create cross-validated classification trees for the ionosphere data with minimum leaf occupancies from leafs:
rng('default')
N = numel(leafs);
err = zeros(N,1);
for n=1:N
    t = ClassificationTree.fit(X,Y,'crossval','on',...
        'minleaf',leafs(n));
    err(n) = kfoldLoss(t);
end
plot(leafs,err);
xlabel('Min Leaf Size');
ylabel('cross-validated error');
You can also find it at http://www.mathworks.de/de/help/stats/classification-trees-and-regression-trees.html#bsw6p3v. Also, does anyone know how the default tree depth is determined? Any help with this question is very welcome, thank you! :)
Best Answer