Random Forest is a bagging algorithm rather than a boosting algorithm.
Bagging and boosting are two opposite ways to achieve low error.
We know that error can be decomposed into bias and variance. An overly complex model has low bias but large variance, while an overly simple model has low variance but large bias; both lead to high error, but for two different reasons. As a result, two different ways of attacking the problem suggest themselves (and occurred to Breiman and others): variance reduction for complex models, or bias reduction for simple models, which correspond to random forest and boosting respectively.
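For concreteness, the decomposition the answer refers to is the standard bias-variance decomposition of expected squared error at a point $x$, where $\hat{f}$ is the fitted model, $f$ the true function, and $\sigma^2$ the irreducible noise variance:

$$\mathbb{E}\big[(y - \hat{f}(x))^2\big] = \underbrace{\big(\mathbb{E}[\hat{f}(x)] - f(x)\big)^2}_{\text{bias}^2} + \underbrace{\mathrm{Var}\big[\hat{f}(x)\big]}_{\text{variance}} + \sigma^2.$$

Random forest attacks the middle term with low-bias trees; boosting attacks the first term with low-variance learners.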
Random forest reduces the variance of a large number of "complex" models with low bias. Note that the component models are not "weak" models but overly complex ones: if you read about the algorithm, the underlying trees are grown "somewhat" as large as possible. The trees are independent, parallel models, and additional random feature selection is introduced to make them even less correlated, which makes the method perform better than ordinary bagging and earns it the name "random".
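A minimal sketch of this idea with scikit-learn (the dataset here is synthetic, just for illustration): the trees are grown fully (`max_depth=None`, i.e. low bias, high variance), and `max_features="sqrt"` is the extra random feature selection that de-correlates them so that averaging reduces variance.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic data, purely illustrative.
X, y = make_classification(n_samples=500, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Fully grown (low-bias, high-variance) trees, de-correlated by
# sampling sqrt(n_features) candidate features at each split.
rf = RandomForestClassifier(n_estimators=200, max_depth=None,
                            max_features="sqrt", random_state=0)
rf.fit(X_tr, y_tr)
score = rf.score(X_te, y_te)  # averaging many deep trees keeps bias low, cuts variance
```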
Boosting, by contrast, reduces the bias of a large number of "small" models with low variance. They are "weak" models, as you quoted. The components form something like a "chain" or "nested" iterative model over the bias at each level, so they are not independent parallel models; each model is built on top of all the previous small models by reweighting. That is the so-called "boosting", one model at a time.
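The mirror-image sketch in scikit-learn (again on synthetic data): here each component is a decision stump (`max_depth=1`, i.e. high bias, low variance), and the stumps are fitted sequentially, each one correcting the errors of the ensemble built so far.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

# Synthetic data, purely illustrative.
X, y = make_classification(n_samples=500, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Decision stumps (high-bias, low-variance weak learners), fitted one
# after another; each new stump targets what the previous ones got wrong.
gb = GradientBoostingClassifier(n_estimators=200, max_depth=1,
                                learning_rate=0.1, random_state=0)
gb.fit(X_tr, y_tr)
score = gb.score(X_te, y_te)  # chaining many weak learners drives the bias down
```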
Breiman's papers and books discuss trees, random forests, and boosting at length; they will help you understand the principles behind these algorithms.
I'm sure this is a typo. This document appears to be lecture notes from Dr. Hastie at Stanford. Please look at Dr. Hastie's book at the link below: on page 351, Table 10.1 has an accurate comparison and comprehensive background for machine learning methods, although it does not include gradient boosting in the comparison.
http://www.stanford.edu/~hastie/local.ftp/Springer/OLD/ESLII_print4.pdf
Best Answer
I would use whichever one performed better out of sample.
So far, I've found it impossible to tell a priori which model will be better for a novel problem.
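One common way to operationalize "whichever performed better out of sample" is cross-validation. A sketch with scikit-learn (the synthetic dataset and the two specific model configurations are just placeholders for whatever you are actually comparing):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Placeholder data; substitute your own problem here.
X, y = make_classification(n_samples=500, n_features=20, random_state=0)

models = {
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "boosting": GradientBoostingClassifier(n_estimators=100, max_depth=1,
                                           random_state=0),
}

# Mean cross-validated accuracy is the out-of-sample criterion;
# keep whichever model wins on this problem.
cv_scores = {name: cross_val_score(m, X, y, cv=5).mean()
             for name, m in models.items()}
best = max(cv_scores, key=cv_scores.get)
```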