Creating, using and deploying a flexible custom estimator through PyCaret

In part 1 we learnt about estimators, Python class objects, the exponential function, the curve_fit function, positional argument packing/unpacking, and the enumerate function, and finally built a more customized & flexible regression estimator. You can visit Part 1 of this hands-on tutorial below:
In this part, we will primarily learn and build two things:
1️⃣ How to make a custom estimator sklearn compatible
2️⃣ How to integrate it with PyCaret
Let’s build it together, but before we touch these specific points, it’s good to add a few more things to our estimator class to improve performance and functionality.
1: If you recall from part 1, we implemented the exponential function in Python using an iterative approach. In some cases that may lead to slower execution. Luckily, Python has matrix operations available that are far more efficient and speedy when compared to an iterative approach. In this case, we can use the dot method available to us in the NumPy module. This is how we can do it:

np.array(X).dot(np.array(args[1:])) is equal to b*x1 + c*x2 + … for every row of X. This results in an array, y_, equal to the length of y. np.exp(y_) takes the exponential of every element in the array y_. Multiplying by args[0] then applies the very first coefficient, 'a', to each element of the array. The result is the output of the equation y = a * exp(b*x1 + c*x2 + …).
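Put together, a vectorized version of the exponential function from part 1 could look like the sketch below (assuming NumPy is imported as np and the same *args signature as in part 1):
import numpy as np

def exponential(X, *args):
    # b*x1 + c*x2 + ... for every row of X, in a single dot product
    y_ = np.array(X).dot(np.array(args[1:]))
    # a * exp(b*x1 + c*x2 + ...)
    return args[0] * np.exp(y_)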

2: The next thing we may want to do is to add a custom score method to our estimator so that we can access the results of the fit. This is a very simple thing. We will use sklearn's metrics to get a score function for mean absolute error. If you want, you can create a scoring function that suits your use case.
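A minimal sketch of such a score method, using sklearn's mean_absolute_error (it assumes the class already has a working predict method):
from sklearn.metrics import mean_absolute_error

def score(self, X, y):
    # mean absolute error between actuals and predictions (lower is better)
    return mean_absolute_error(y, self.predict(X))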
3: The last thing we would like to add is to make some space for arguments (hyperparameters) in our estimator class, through the __init__ method. Altering the values of our hyperparameters can impact the execution and the accuracy of the model. You simply pass the desired hyperparameter in the arguments of the __init__ function and bind it to self. We will add an argument called maxfev, which specifies how many times we allow the curve_fit function to alter the values of the coefficients while trying to find the global minimum.
This is how our code looks once we add these functionalities:
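A minimal sketch of the class with these three additions, assuming the curve_fit-based fitting from part 1 (the coefficient attribute name coef is an assumption for illustration):
import numpy as np
from scipy.optimize import curve_fit
from sklearn.metrics import mean_absolute_error

class ExponentialRegressor:

    def __init__(self, maxfev=300):
        # hyperparameter: how many evaluations curve_fit may use
        self.maxfev = maxfev

    def exponential(self, X, *args):
        # y = a * exp(b*x1 + c*x2 + ...), vectorized with a dot product
        return args[0] * np.exp(np.array(X).dot(np.array(args[1:])))

    def fit(self, X, y):
        # one starting coefficient per feature, plus the leading 'a'
        p0 = np.ones(np.array(X).shape[1] + 1)
        self.coef, _ = curve_fit(self.exponential, X, y, p0=p0, maxfev=self.maxfev)
        return self

    def predict(self, X):
        return self.exponential(X, *self.coef)

    def score(self, X, y):
        # mean absolute error of the predictions (lower is better)
        return mean_absolute_error(y, self.predict(X))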
We are now ready to take our estimator to the next level!
Custom Estimator & Sklearn
To make your estimator compatible with sklearn, we need to include a few things in our code. We are making it compatible with sklearn because that is a requirement of PyCaret. A complete and detailed list of requirements can be found here. Below are a few important things we need to accomplish:
1): Inherit the base and estimator sub-classes from sklearn into our class. Inheritance is a detailed topic that is out of the scope of this tutorial. In simple words, inheriting sklearn classes into our class will save us from creating all the necessary methods (functions) inside our class. To inherit, all we need is to import the BaseEstimator & RegressorMixin sub-classes from sklearn.base (because our estimator is a regressor; for classification we would use ClassifierMixin) and simply pass them as arguments in our estimator class, like this: class ExponentialRegressor(BaseEstimator, RegressorMixin):
2): Inside the __init__ method, all the arguments should have a default value.
3): The names of the hyperparameters (arguments of __init__) should be kept the same when we bind them to the class (when we create attributes), e.g.
def __init__(self, maxfev=300):
    # Wrong way
    self.maxf = maxfev
    # Right way: the name of the attribute should be the same as the parameter
    self.maxfev = maxfev
4): Do not take X or y through the __init__ method.
5): Do not alter the value of an attribute inside the __init__ method, e.g.
def __init__(self, maxfev=300):
    # Do not do this!
    if maxfev < 300:
        self.maxfev = 300
    else:
        self.maxfev = maxfev
6): Every estimator should have get_params & set_params methods. Well, we have already covered that by inheriting the BaseEstimator subclass.
7): Inside the fit method, we need to deploy a few checks coming straight from sklearn. One is check_X_y(), which checks that X & y are of the proper shape: X needs to be a 2D array, and y needs to be a 1D array. Second, if you are doing classification, you need to check the labels through the unique_labels() function. We don't need it here, since we are doing regression.
8): When the fit method is called, it should generate some attributes (at least one) with names ending in an underscore. The trailing underscore will be used to run another check in the predict method, to make sure that the fit method has been called before calling predict, e.g.
def fit(self, X, y):
    # the trailing underscore after self.X and self.y is required
    self.X_, self.y_ = check_X_y(X, y)
9): The fit method should return an instance of itself; in easy words, the return of fit should be self.
10): The predict method should also have some checks, similar to the fit method. One is the check_is_fitted() function, which checks whether the fit method has already been called. The other one is the check_array() function, which checks X for shape, non-empty & finite values, just like check_X_y().
This seems to be a long, boring list, but once you get the hang of it, you will note that it's not more than 4-5 lines of code when implemented. So cheer up and examine the code below. I am going to minimize comments because you have seen them multiple times by now.
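Below is a minimal sketch that folds the ten requirements above into our estimator; the helper name _exponential and the default maxfev=300 are assumptions carried over from earlier, not a definitive listing:
import numpy as np
from scipy.optimize import curve_fit
from sklearn.base import BaseEstimator, RegressorMixin
from sklearn.metrics import mean_absolute_error
from sklearn.utils.validation import check_X_y, check_array, check_is_fitted

class ExponentialRegressor(BaseEstimator, RegressorMixin):

    def __init__(self, maxfev=300):
        # rules 2 & 3: a default value, attribute name identical to the parameter
        self.maxfev = maxfev

    def _exponential(self, X, *args):
        # y = a * exp(b*x1 + c*x2 + ...), vectorized
        return args[0] * np.exp(np.array(X).dot(np.array(args[1:])))

    def fit(self, X, y):
        # rules 7 & 8: validate shapes, store attributes with a trailing underscore
        self.X_, self.y_ = check_X_y(X, y)
        p0 = np.ones(self.X_.shape[1] + 1)
        self.coef_, _ = curve_fit(self._exponential, self.X_, self.y_,
                                  p0=p0, maxfev=self.maxfev)
        # rule 9: fit returns self
        return self

    def predict(self, X):
        # rule 10: refuse to predict before fit, then validate X
        check_is_fitted(self)
        X = check_array(X)
        return self._exponential(X, *self.coef_)

    def score(self, X, y):
        # mean absolute error (lower is better)
        return mean_absolute_error(y, self.predict(X))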
We are now ready to jump to the final section of our journey! All you need is to save this py file in the folder/directory you want, OR just keep it in your Jupyter Notebook and continue coding.
Integration with PyCaret
Once our estimator is sklearn compatible, integration with PyCaret is really easy. You will proceed as usual with the setup of PyCaret, and can then pass your class to the create_model & compare_models functions. See the code snippet below:
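A sketch of those calls follows; the DataFrame name data and the column name 'target' are placeholders for your own dataset:
from pycaret.regression import setup, create_model, compare_models, tune_model, add_metric

# placeholders: 'data' is your pandas DataFrame, 'target' its label column
s = setup(data=data, target='target', session_id=123)

# an instance of our custom class goes straight into create_model
m = create_model(ExponentialRegressor())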
Now, if you run m = create_model(ExponentialRegressor()) you will see this:

Now run the compare_models command, specifying the models you want to run along with the regressor we built:
c = compare_models(include=['lr','ridge','br','et','rf','ada',ExponentialRegressor()])

Quite impressive results! At least for this data set we beat the AdaBoost and Random Forest regressors, and it is pretty quick too…
We can also tune our hyperparameter maxfev, by calling the tune_model function along with a custom grid for maxfev:
tune_model(m, custom_grid={'maxfev': [2000, 3000, 4000]})

The point is, you can now use many built-in features of PyCaret (ensemble_model, stack_models, etc.) on the model you built from scratch, with your own hands!
Bonus Material:
At this stage, you can create and add a custom metric to PyCaret. All you need is a function that captures the logic and returns a scalar (a single value). I want to measure the accuracy of my models in terms of their ability to predict within 20% of the actual value. If a prediction is within 20% of the actual, I will assign a value of 1, else 0. Then simply count the percentage of 1s. This is how you do it:
import numpy as np

# create a custom function
def confidence(y, y_pred):
    # 1 if the prediction is within 20% of the actual value, else 0
    score = np.where(np.abs(y_pred - y) / y <= .20, 1, 0)
    return np.sum(score) / len(y)

# add it to PyCaret
add_metric(id='CI', name='Confidence', score_func=confidence)
Now when we run the compare_models command:
c = compare_models(include=['lr','ridge','br','et','rf','ada',ExponentialRegressor()],sort='Confidence')
we get:

The code below is the final product of our hard work. The code is self-sufficient and you should be able to run it on your machine, given that you have installed all the required libraries.
That’s the end of this series. In this part, we learnt about matrix multiplication, added a score method, made our estimator sklearn compatible and finally we learnt how to integrate it with PyCaret. We also learnt how to create a custom metric along with our custom estimator and used them together in PyCaret.
You can follow me on Medium & connect with me on LinkedIn & visit my GitHub.
You may also be interested in:
Make your data science life easy with Docker
Custom Estimator With PyCaret, Part 1