where $\hat{\pi}_{ic} = \frac{1}{|\mathcal{D}_i|} \sum_{n \in \mathcal{D}_i} \mathbb{I}(y_n = c)$ is the empirical distribution over class labels for node $i$.
Given this, we can then compute the Gini index
$$G_i = \sum_{c=1}^{C} \hat{\pi}_{ic}\,(1 - \hat{\pi}_{ic}) = 1 - \sum_{c=1}^{C} \hat{\pi}_{ic}^2$$
This is the expected error rate. To see this, note that $\hat{\pi}_{ic}$ is the probability that a random entry in the leaf belongs to class $c$, and $1 - \hat{\pi}_{ic}$ is the probability it would be misclassified.
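As a concrete check of this formula, here is a minimal sketch that computes the Gini index of a node from the labels it contains; the function name and the example class counts are illustrative, not from the original text.

```python
import numpy as np

def gini_index(labels):
    """Gini index of the labels reaching one node: 1 - sum_c pi_c^2,
    i.e. the expected error rate under the node's empirical distribution."""
    _, counts = np.unique(labels, return_counts=True)
    pi = counts / counts.sum()        # empirical class distribution pi_ic
    return 1.0 - np.sum(pi ** 2)

# Example: a node containing 9 examples of class 0 and 1 example of class 1.
print(gini_index(np.array([0] * 9 + [1])))   # 1 - (0.9^2 + 0.1^2) = 0.18
```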
The disadvantage of bootstrap sampling is that each base model only sees, on average, about 63% of the unique training examples: the probability that a given example is never selected in $N$ draws with replacement is $(1 - 1/N)^N \approx e^{-1} \approx 0.37$ for large $N$.
The roughly 37% of training instances that are not used by a given base model are called out-of-bag (oob) instances.
We can use the predicted performance of the base model on these oob instances as an estimate of test set performance.
This provides a useful alternative to cross validation.
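As a rough check of these numbers, the sketch below draws one bootstrap sample from a toy dataset, verifies that only about 63% of the examples appear in it, and scores a tree fit on that sample against its oob instances. The dataset, seed, and choice of scikit-learn tree are all illustrative assumptions.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
N = 10_000
X = rng.normal(size=(N, 5))
y = (X[:, 0] + X[:, 1] > 0).astype(int)    # toy labels, illustrative only

# Draw one bootstrap sample and check what fraction of examples it contains.
idx = rng.integers(0, N, size=N)
in_bag = np.unique(idx)
print(len(in_bag) / N)                      # roughly 0.63, so ~37% are oob

# Fit a base model on the bootstrap sample and score it on its oob instances.
oob = np.setdiff1d(np.arange(N), in_bag)
tree = DecisionTreeClassifier().fit(X[idx], y[idx])
print(tree.score(X[oob], y[oob]))           # accuracy on the oob instances
```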
The main advantage of bootstrap is that it prevents the ensemble from relying too much on any individual training example, which enhances robustness and generalization.
Bagging does not always improve performance. In particular, it relies on the base models being unstable estimators (such as decision trees), so that omitting some of the data significantly changes the resulting model fit.
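Below is a minimal hand-rolled bagging sketch that illustrates this: many trees, each fit on a different bootstrap sample, are combined by majority vote and compared against a single tree fit on all the data. The dataset, hyperparameters, and use of scikit-learn trees are illustrative assumptions.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split

# Toy nonlinear dataset (illustrative only).
rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 10))
y = (np.sin(X[:, 0]) + X[:, 1] * X[:, 2] > 0).astype(int)
Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)

# Bagging: fit each tree on its own bootstrap sample, then majority-vote.
preds = []
for m in range(100):
    idx = rng.integers(0, len(Xtr), size=len(Xtr))        # bootstrap sample
    tree = DecisionTreeClassifier(random_state=m).fit(Xtr[idx], ytr[idx])
    preds.append(tree.predict(Xte))
bagged = (np.mean(preds, axis=0) > 0.5).astype(int)

# Compare against a single (unstable) tree fit on the full training set.
single = DecisionTreeClassifier(random_state=0).fit(Xtr, ytr).predict(Xte)
print("single tree:", (single == yte).mean(), "bagged:", (bagged == yte).mean())
```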
Random forests: decorrelate the base learners by learning trees based on a randomly chosen subset of input variables (at each node of the tree), as well as a randomly chosen subset of data cases.
Empirically, random forests often work much better than bagged decision trees, in particular when many input features are irrelevant, since restricting each split to a random feature subset decorrelates the trees.
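One quick way to see this effect is to compare a forest whose splits consider every feature (essentially bagged trees) with one that restricts each split to a random subset, on data containing many irrelevant inputs. The dataset sizes, hyperparameters, and use of scikit-learn below are illustrative assumptions, not part of the original text.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score

# Data with many irrelevant inputs: 100 features, only 5 informative.
X, y = make_classification(n_samples=2000, n_features=100, n_informative=5,
                           n_redundant=0, random_state=0)

# max_features=None   -> each split considers every feature (bagged trees)
# max_features='sqrt' -> each split considers a random subset (random forest)
for max_features in [None, 'sqrt']:
    clf = RandomForestClassifier(n_estimators=200, max_features=max_features,
                                 random_state=0)
    print(max_features, cross_val_score(clf, X, y, cv=3).mean())
```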
Boosting is an algorithm for sequentially fitting additive models, where each new base model is trained to compensate for the errors of the ensemble built so far.
The key property is that, as long as each weak learner performs better than chance on its (reweighted) training data, the final ensemble can achieve much higher accuracy than any individual base model.
Forward stagewise additive modeling: sequentially optimize the objective for general (differentiable) loss functions, solving at step $m$
$$(\beta_m, \theta_m) = \operatorname*{argmin}_{\beta, \theta} \sum_{n=1}^{N} \ell\big(y_n,\; f_{m-1}(x_n) + \beta F(x_n; \theta)\big)$$
We then set
$$f_m(x) = f_{m-1}(x) + \beta_m F(x; \theta_m)$$
so each iteration adds one new term to the model without re-fitting the parameters of previously added base models.
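The sketch below instantiates forward stagewise additive modeling for squared loss, a specific illustrative choice: fitting the new base model then amounts to fitting the current residuals, and the coefficient $\beta_m$ has a closed-form line-search solution. The dataset, tree depth, and number of steps are arbitrary assumptions.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Toy 1-d regression problem (illustrative only).
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(500, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=500)

f = np.zeros(500)                      # current ensemble prediction f_{m-1}(x)
ensemble = []
for m in range(50):
    residual = y - f                   # for squared loss, fit the residuals
    F_m = DecisionTreeRegressor(max_depth=2).fit(X, residual)
    pred = F_m.predict(X)
    # Closed-form line search: beta minimizing sum_n (residual_n - beta*pred_n)^2
    beta_m = np.dot(residual, pred) / (np.dot(pred, pred) + 1e-12)
    f = f + beta_m * pred              # f_m = f_{m-1} + beta_m * F_m
    ensemble.append((beta_m, F_m))     # earlier terms are never re-fit

print("training MSE:", np.mean((y - f) ** 2))
```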
If we use the exponential loss $\ell(\tilde{y}, f(x)) = \exp(-\tilde{y} f(x))$, with labels $\tilde{y}_n \in \{-1, +1\}$ and a base classifier that returns a hard label $F(x) \in \{-1, +1\}$, the resulting method is known as discrete AdaBoost. At step $m$, the objective to minimize becomes
$$L_m = \sum_{n=1}^{N} \omega_{n,m} \exp\big(-\beta\, \tilde{y}_n F(x_n)\big)$$
where $\omega_{n,m} \triangleq \exp(-\tilde{y}_n f_{m-1}(x_n))$ is a weight applied to data case $n$. The optimal base classifier to add is
$$F_m = \operatorname*{argmin}_{F} \sum_{n=1}^{N} \omega_{n,m}\, \mathbb{I}(\tilde{y}_n \neq F(x_n))$$
This can be found by applying the weak learner to a weighted version of the dataset, with weights $\omega_{n,m}$. Substituting $F_m$ back into the objective and solving for $\beta$ gives
$$\beta_m = \frac{1}{2} \log \frac{1 - \mathrm{err}_m}{\mathrm{err}_m}$$
where $\mathrm{err}_m = \frac{\sum_{n} \omega_{n,m}\, \mathbb{I}(\tilde{y}_n \neq F_m(x_n))}{\sum_{n} \omega_{n,m}}$ is the weighted error rate. The weights for the next iteration then become
$$\omega_{n,m+1} = \omega_{n,m}\, e^{-\beta_m \tilde{y}_n F_m(x_n)} = \omega_{n,m}\, e^{2\beta_m \mathbb{I}(\tilde{y}_n \neq F_m(x_n))}\, e^{-\beta_m}$$
Thus we see that we exponentially increase the weights of misclassified examples. The resulting algorithm, shown in Algorithm 8, is known as AdaBoost.
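The following is a minimal sketch of discrete AdaBoost built directly from the updates above, using depth-1 trees (decision stumps) as the weak learner; the dataset, number of rounds, and use of scikit-learn stumps are illustrative assumptions.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Toy dataset with labels in {-1, +1} (illustrative only).
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 10))
y = np.where(X[:, 0] + X[:, 1] > 0, 1, -1)

N, M = len(y), 100
w = np.full(N, 1.0 / N)                    # omega_{n,1}: start with uniform weights
stumps, betas = [], []
for m in range(M):
    # Weak learner on the weighted dataset.
    stump = DecisionTreeClassifier(max_depth=1).fit(X, y, sample_weight=w)
    pred = stump.predict(X)
    miss = (pred != y)
    err = np.dot(w, miss) / w.sum()        # weighted error rate err_m
    beta = 0.5 * np.log((1 - err) / (err + 1e-12))
    w = w * np.exp(2 * beta * miss)        # exponentially upweight mistakes
    w = w / w.sum()                        # the common e^{-beta_m} factor cancels
    stumps.append(stump)
    betas.append(beta)

# Final classifier: sign of the weighted sum of weak learners.
f = sum(b * s.predict(X) for b, s in zip(betas, stumps))
print("training accuracy:", np.mean(np.sign(f) == y))
```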