Boosting Techniques

So far, we’ve discussed five different boosting algorithms: AdaBoost, Gradient Boosting, XGBoost, LightGBM and CatBoost.
Among them, XGBoost, LightGBM and CatBoost are the most important, as they generally produce more accurate results with faster execution times.
In Part 5, we’ve already compared the performance and execution time of XGBoost and LightGBM. There, we found that:
LightGBM can sometimes outperform XGBoost, but not always. However, LightGBM is about 7 times faster than XGBoost!
It is time to do a similar comparison of CatBoost vs XGBoost and CatBoost vs LightGBM. At the end of this post, I’ll also give some guidelines to help you choose the right boosting algorithm for your task.
CatBoost vs XGBoost
Here, we consider two factors: performance and execution time. We build CatBoost and XGBoost regression models on the California housing dataset.
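Below is a minimal sketch of how such a comparison can be set up, assuming the xgboost, catboost and scikit-learn packages are installed. The hyperparameters (500 trees, defaults otherwise) are illustrative assumptions; exact scores and timings will vary with library versions and hardware.

```python
import time

from sklearn.datasets import fetch_california_housing
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error
from xgboost import XGBRegressor
from catboost import CatBoostRegressor

# Load the California housing dataset and hold out a test split
X, y = fetch_california_housing(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

models = {
    "XGBoost": XGBRegressor(n_estimators=500, random_state=42),
    "CatBoost": CatBoostRegressor(n_estimators=500, random_state=42, verbose=0),
}

for name, model in models.items():
    start = time.time()
    model.fit(X_train, y_train)           # train and measure wall-clock time
    elapsed = time.time() - start
    rmse = mean_squared_error(y_test, model.predict(X_test)) ** 0.5
    print(f"{name}: RMSE = {rmse:.4f}, training time = {elapsed:.2f} s")
```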

XGBoost has slightly outperformed CatBoost. However, CatBoost is about 3.5 times faster than XGBoost!
CatBoost vs LightGBM
Here too, we consider the same two factors. This time, we build CatBoost and LightGBM regression models on the California housing dataset.
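The same timing loop from the previous snippet can be reused for this comparison; only the model dictionary changes. A sketch, assuming the lightgbm package is installed:

```python
from catboost import CatBoostRegressor
from lightgbm import LGBMRegressor

# Swap the models; the fit/predict/timing loop stays exactly the same.
models = {
    "CatBoost": CatBoostRegressor(n_estimators=500, random_state=42, verbose=0),
    "LightGBM": LGBMRegressor(n_estimators=500, random_state=42),
}
```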

LightGBM has slightly outperformed CatBoost, and it is about 2 times faster!
Conclusion
When we consider execution time, LightGBM wins hands down! It is about 7 times faster than XGBoost and about 2 times faster than CatBoost!
When we consider performance, XGBoost is slightly better than the other two. However, selecting the right boosting technique depends on many factors. Here are some guidelines to help you choose the right boosting algorithm for your task.
Guidelines to select the right boosting technique
- Any boosting technique is generally much better than decision trees and random forests, except that boosting cannot parallelize the training of its trees the way a random forest can.
- You can start with a basic boosting technique such as AdaBoost or Gradient Boosting, and then move to an enhanced technique such as XGBoost.
- LightGBM and CatBoost are great alternatives to XGBoost.
- If you have larger datasets, consider using LightGBM or CatBoost. LightGBM is the best option.
- If your dataset has categorical features, consider using LightGBM or CatBoost. Both can handle categorical features that are not already encoded. CatBoost is the best option for dealing with categorical features (see the sketch after this list).
- XGBoost generalizes somewhat better than the other boosting techniques, and its performance is strong.
- LightGBM has the fastest execution time.
- When we consider both performance and execution time together, LightGBM is the best option.
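As a small illustration of native categorical handling, here is a sketch of CatBoost fitting raw string columns directly. The toy DataFrame and the column names (median_income, ocean_proximity, price) are hypothetical and only serve to show the cat_features parameter; no one-hot or label encoding is done beforehand.

```python
import pandas as pd
from catboost import CatBoostRegressor

# Toy data with one raw (unencoded) categorical column
df = pd.DataFrame({
    "median_income": [8.3, 7.2, 5.6, 3.8],
    "ocean_proximity": ["NEAR BAY", "INLAND", "NEAR OCEAN", "INLAND"],
    "price": [4.5, 3.5, 3.4, 2.7],
})

model = CatBoostRegressor(iterations=100, verbose=0)
# cat_features tells CatBoost which columns to treat as categorical;
# CatBoost encodes them internally during training.
model.fit(
    df[["median_income", "ocean_proximity"]],
    df["price"],
    cat_features=["ocean_proximity"],
)
print(model.predict(df[["median_income", "ocean_proximity"]]))
```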
One of the major drawbacks of boosting techniques is that they can easily overfit, since they are tree-based algorithms. We’ve already discussed a few techniques to address the problem of overfitting.

One of the best techniques to address overfitting in boosting algorithms is early stopping. It is a kind of regularization that we’ve discussed in this article. I’ll also post a separate article describing how we can use early stopping, especially with boosting algorithms.
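For a taste of what that looks like, here is a minimal sketch of early stopping with XGBoost on a held-out validation split. This uses the API of recent xgboost versions (1.6+), where early_stopping_rounds is passed to the constructor; older versions pass it to fit() instead. LightGBM and CatBoost expose similar options.

```python
from sklearn.datasets import fetch_california_housing
from sklearn.model_selection import train_test_split
from xgboost import XGBRegressor

X, y = fetch_california_housing(return_X_y=True)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Training stops once validation RMSE has not improved for 50 rounds,
# which caps the number of trees and helps prevent overfitting.
model = XGBRegressor(
    n_estimators=2000, early_stopping_rounds=50, eval_metric="rmse"
)
model.fit(X_train, y_train, eval_set=[(X_val, y_val)], verbose=False)
print("Best iteration:", model.best_iteration)
```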
This is the end of today’s post. You can sign up for a membership through the following link to get full access to every story I write, and I will receive a portion of your membership fee.
Thank you so much for your continuous support! See you in the next story. Happy learning to everyone!
Special credit goes to Arnaud Mesureur on Unsplash, who provides me with a nice cover image for this post.
Rukshan Pramoditha 2021–11–03