
Can LightGBM Outperform XGBoost?

Boosting algorithms in machine learning – Part 5

Boosting Techniques

Photo by Doug Duffee on Unsplash

Up to now, we’ve discussed three boosting techniques: AdaBoost, Gradient Boosting and XGBoost. We already know that XGBoost is the most powerful of the boosting algorithms we’ve discussed so far.

LightGBM (Light Gradient Boosting Machine) is a great alternative to XGBoost. It was developed by Microsoft and first released in 2016.

Because LightGBM takes a different approach when growing new trees in the ensemble, it has the following distinctive features:

  • Runs faster than XGBoost
  • Uses less memory
  • Can handle large datasets
  • Can handle missing values in the dataset
  • Can handle categorical features that are not already encoded

Today, we’ll learn how to install the LightGBM library and explore its Python API. Most importantly, we’ll build two models on the same dataset, one with XGBoost and one with LightGBM, and compare each algorithm’s performance and execution time.

Let’s get started!

Installation of LightGBM

We can install LightGBM by running the following command in the Anaconda prompt or a Google Colab cell.

pip install lightgbm

Then run the following command to verify the installation.

import lightgbm

If you didn’t get any error message, you have successfully installed LightGBM!

Python API of LightGBM

The Python API of LightGBM consists of a few sub-APIs. We’ll discuss two of them:

Scikit-learn API

This API is used to implement the LightGBM algorithm with the familiar Scikit-learn .fit()/.predict() paradigm. The most important classes in this API are LGBMRegressor, LGBMClassifier and LGBMRanker.

Plotting API

The LightGBM package provides some plotting methods under this API, such as plot_importance(), plot_metric() and plot_tree().

LightGBM VS XGBoost

Now, we’ll compare the performance and execution time of LightGBM and XGBoost. For this, we perform a regression task on the California housing dataset.

LightGBM

Here, we use the LGBMRegressor() class with its relevant hyperparameters. We also use 5-fold cross-validation to evaluate the model’s performance.

(Image by author)

The execution time of LightGBM is 0.608 seconds. The dataset has 20,640 rows and 8 features, so LightGBM is really fast.

XGBoost

Here, we use the XGBRegressor() class with its relevant hyperparameters. We also use 5-fold cross-validation to evaluate the model’s performance.

(Image by author)

The RMSE given by XGBoost is slightly better than the RMSE given by LightGBM. However, XGBoost takes 4.660 seconds to execute. That is about 7.7x slower than LightGBM!

Plot feature importances of LightGBM

Here, we use the plot_importance() function of the LightGBM plotting API to plot the feature importances of the LightGBM model that we created earlier.

# lgbm is the LGBMRegressor we created earlier
lgbm.fit(X, y)
lightgbm.plot_importance(lgbm)
(Image by author)

The features Population and AveBedrms do not seem very important to the model. So, you could drop these features and rebuild the model to try to reduce the RMSE.

Conclusion

Can LightGBM outperform XGBoost? Here is the answer.

In terms of predictive performance, LightGBM does not always outperform XGBoost, but it sometimes can. In terms of execution time, LightGBM is about 7 times faster than XGBoost!

In addition to its faster execution time, LightGBM has another nice feature: we can use categorical features directly (without encoding) with LightGBM. However, we need to convert them manually from the object datatype to the category datatype.

LightGBM is preferred over XGBoost in the following situations:

  • You want to train the model fast in a competition.
  • The dataset is large.
  • You don’t have time to encode categorical features (if any) in the dataset.

This is the end of today’s post. In Part 6, we’ll discuss CatBoost (Categorical Boosting), another alternative to XGBoost. See you in the next story. Happy learning to everyone!


My readers can sign up for a membership through the following link to get full access to every story I write and I will receive a portion of your membership fee.

Join Medium with my referral link – Rukshan Pramoditha

Thank you so much for your continuous support!

Special credit goes to Doug Duffee on Unsplash, who provided the nice cover image for this post.

Rukshan Pramoditha 2021–10–29

