
Easy Custom Losses for Tree Boosters using Pytorch


Why calculate first and second derivatives for your custom objectives when you can let PyTorch do it for you?

Tree Boosters such as Catboost, XGBoost and LightGBM are powerful tools, especially when tackling tabular data. They support a variety of losses out of the box, but sometimes you want to use a tailor-made loss, something with that special oomph to make your models shine.

Fortunately for us, all common tree boosting packages support custom objectives. Unfortunately for us, in order to build one you have to provide a function that calculates the first and second derivatives of your objective w.r.t the model’s output, and who wants to do that? We already have modern Deep Learning packages that can do it for us. Just [pip install treeboost_autograd](https://pypi.org/project/treeboost_autograd/) and then defining your custom loss for Catboost, XGBoost or LightGBM can be as easy as this:
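A rough sketch of how this can look (the import path and the loss_function argument name are assumptions about the package's API, and X_train / y_train stand in for your data):

```python
import torch
from catboost import CatBoostRegressor
from treeboost_autograd import CatboostObjective  # assumed import path

def absolute_error_loss(preds: torch.Tensor, targets: torch.Tensor) -> torch.Tensor:
    # any differentiable PyTorch expression works; no manual derivatives needed
    return torch.abs(preds - targets).sum()

custom_objective = CatboostObjective(loss_function=absolute_error_loss)
model = CatBoostRegressor(loss_function=custom_objective, eval_metric="MAE")
model.fit(X_train, y_train)
```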

PyTorch to the rescue

Let’s have torch.autograd do the heavy lifting. Assume you have a scalar objective value (e.g. minibatch MSE) and a 1-d vector of model predictions. First, use PyTorch to calculate the first derivative of the objective w.r.t. preds:
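A minimal, runnable sketch with toy values:

```python
import torch

# toy setup: 1-d predictions (must require grad) and matching targets
preds = torch.tensor([2.0, 3.0, 5.0], requires_grad=True)
targets = torch.tensor([2.5, 2.0, 6.0])
objective = torch.mean((preds - targets) ** 2)  # scalar objective, e.g. minibatch MSE

# first derivative of the scalar objective w.r.t. preds (same shape as preds)
deriv1, = torch.autograd.grad(objective, preds, create_graph=True)
```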

create_graph=True tells PyTorch to construct a computation graph for the derivative, allowing us to compute higher-order derivatives on top of it.

Then, treat each element of the first derivative as a scalar, and differentiate it again w.r.t. preds:
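Continuing the sketch from above:

```python
# second derivative: differentiate each element of deriv1 w.r.t. preds,
# keeping only the matching (diagonal) entry of the resulting gradient
deriv2 = torch.stack([
    torch.autograd.grad(deriv1[i], preds, retain_graph=True)[0][i]
    for i in range(len(preds))
])
```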

retain_graph=True tells PyTorch not to free the computation graph after calculating the derivative, since we need to use it multiple times.

That’s the gist of it. Now we just need to plug it into a Catboost-friendly objective class, and start using our custom losses!
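CatBoost expects a custom objective to expose a calc_ders_range method that returns (first derivative, second derivative) pairs per example. A simplified sketch of such a wrapper (not the package's actual implementation; sample weights are ignored for brevity) could look like this:

```python
import torch

class CatboostObjective:
    """Sketch: wraps a differentiable PyTorch loss as a CatBoost-friendly objective."""

    def __init__(self, loss_function):
        self.loss_function = loss_function

    def calc_ders_range(self, approxes, targets, weights=None):
        preds = torch.tensor(list(approxes), dtype=torch.float, requires_grad=True)
        targets = torch.tensor(list(targets), dtype=torch.float)

        objective = self.loss_function(preds, targets)
        deriv1, = torch.autograd.grad(objective, preds, create_graph=True)
        deriv2 = torch.stack([
            torch.autograd.grad(deriv1[i], preds, retain_graph=True)[0][i]
            for i in range(len(preds))
        ])

        # CatBoost maximizes its objective, so the derivatives of a loss are negated
        return list(zip((-deriv1).tolist(), (-deriv2).tolist()))
```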

Defining the custom loss

We’ll use the Boston House Prices toy dataset as an example. Let’s say we want our model to avoid undershooting the house price more than overshooting it, i.e. we want the loss to be harsher for predictions that are lower than the actual house price. Let x = (preds-targets):
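One possible choice (illustrative, not necessarily the exact formula behind the original plots) is a squared error that is weighted more heavily when x is negative, i.e. when the model undershoots:

```python
import torch

def dont_undershoot_loss(preds: torch.Tensor, targets: torch.Tensor) -> torch.Tensor:
    # x < 0 means the prediction is below the true price, i.e. undershooting
    x = preds - targets
    # illustrative asymmetric penalty: undershooting hurts 4x more than overshooting
    loss = torch.where(x < 0, 4 * x ** 2, x ** 2)
    return loss.sum()
```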


Training the model

Now all we need to do is plug our custom loss function into a CatboostObjective object and train a CatBoostRegressor.
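A sketch of that training step, reusing dont_undershoot_loss from above (the loss_function argument name is assumed, and the hyperparameters are illustrative):

```python
from catboost import CatBoostRegressor
from treeboost_autograd import CatboostObjective  # assumed import path

custom_objective = CatboostObjective(loss_function=dont_undershoot_loss)
model = CatBoostRegressor(loss_function=custom_objective, eval_metric="RMSE",
                          n_estimators=300, verbose=False)
model.fit(X_train, y_train)
```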

To see what our custom loss does, we can plot a histogram of the relative difference between the model predictions and the targets. As you can see, our custom "don’t undershoot" model indeed undershoots the target price much less than the model trained with the default symmetric RMSE loss. Naturally, our model is more prone to overshooting the target price.
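One way to produce such a histogram (assuming a held-out X_test / y_test and the trained model from above):

```python
import matplotlib.pyplot as plt

preds = model.predict(X_test)
relative_diff = (preds - y_test) / y_test

plt.hist(relative_diff, bins=30)
plt.xlabel("(prediction - target) / target")
plt.ylabel("count")
plt.title("Relative difference between predictions and targets")
plt.show()
```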


Using standard PyTorch losses

The same principle also works with standard PyTorch losses. Prefer reduction="sum" if you want to use the default Catboost hyperparameters for tuning. For example, using these custom losses is (almost) equivalent to using Catboost’s default losses for classification and regression:
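A sketch of what that could look like (same assumed CatboostObjective signature as above):

```python
import torch.nn as nn
from treeboost_autograd import CatboostObjective  # assumed import path

# regression: summed squared error behaves (almost) like CatBoost's default RMSE objective
regression_objective = CatboostObjective(loss_function=nn.MSELoss(reduction="sum"))

# binary classification: summed BCE-with-logits behaves (almost) like CatBoost's default Logloss
classification_objective = CatboostObjective(loss_function=nn.BCEWithLogitsLoss(reduction="sum"))
```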

Using it yourself


Okay, okay. No need to get all flustered over it. Just [pip install treeboost_autograd](https://pypi.org/project/treeboost_autograd/) and you’re good to go.

The full implementation can be found in the Git repo, along with ready-to-run examples of regression and binary classification for CatBoost, XGBoost and LightGBM.

[TomerRonen34/treeboost_autograd](https://github.com/TomerRonen34/treeboost_autograd)

Also check out this great answer by Diogo Pernes on the PyTorch Forums about how to calculate second-order derivatives.

