5 Different Ways To Save Your Machine Learning Model

Simplifying The Process of Reusing Models

Photo by Fredy Jacob on Unsplash

Saving your trained Machine Learning models is an important step in the machine learning workflow: it permits you to reuse them in the future. For instance, it’s highly likely you’ll have to compare models to determine the champion model to take into production – saving the models when they are trained makes this process easier.

The alternative would be to train the model each time it needs to be used, which can significantly affect productivity, especially if the model takes a long time to train.

In this post, we will cover 5 different ways you can save your trained models.

1 Pickle

Pickle is one of the most popular ways to serialize objects in Python. You can use Pickle to serialize your trained machine learning model and save it to a file. At a later time, or in another script, you can deserialize the file to recover the trained model and use it to make predictions.

Here’s a utility function I created to save the model pipeline – I’ll also demonstrate its functionality in the training pipeline:
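A minimal sketch of what such a utility could look like follows. It is an illustration, not necessarily the author's exact code: the signature, the `file_path` argument, and the step that creates missing directories are assumptions.

```python
import pickle
from pathlib import Path


def save_pipeline(pipeline, file_path):
    """Serialize a fitted pipeline (or any picklable object) to file_path.

    Hypothetical helper: the argument names are illustrative assumptions.
    """
    file_path = Path(file_path)
    # Create the target directory if it does not exist yet.
    file_path.parent.mkdir(parents=True, exist_ok=True)
    # Pickle writes bytes, so the file must be opened in binary mode.
    with open(file_path, "wb") as f:
        pickle.dump(pipeline, f)
    return file_path
```

Returning the path makes it easy to log where the model was written.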

Let’s see how this works in our training pipeline script:

The training script calls the save_pipeline utility function once the pipeline has been trained.

To load the pipeline, I created another utility function called load_pipeline.
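A loader along these lines might look like the sketch below. The signature is an assumption made for illustration, not necessarily the author's exact code.

```python
import pickle


def load_pipeline(file_path):
    """Deserialize a previously pickled pipeline from file_path.

    Hypothetical helper: the argument name is an illustrative assumption.
    """
    # Open in binary mode to match how the file was written.
    with open(file_path, "rb") as f:
        return pickle.load(f)
```

Unpickling can execute arbitrary code, so only load files that come from a source you trust.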

In our prediction script, we load the pipeline into a variable called _fraud_detection_pipe: we can now use this variable as an instance of our trained pipeline object to make predictions. Let’s see the rest of the script…

Note: the script loads the trained model through load_pipeline before it makes any predictions.

2 Joblib

Joblib is an alternative to Pickle that we can use to save and load our models. It is widely used across the scientific Python ecosystem (scikit-learn depends on it) and is much more efficient on objects that carry large NumPy arrays – learn more about Joblib's benefits in this StackOverflow discussion.

"Joblib is a set of tools to provide lightweight pipelining in Python. In particular: transparent disk-caching of functions and lazy re-evaluation (memoize pattern); easy simple parallel computing."

To save our model with Joblib, we would only have to make a change to our save_pipeline() function.

Notice that we have imported joblib instead of pickle, and that the model pipeline is now serialized with joblib.
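As a rough sketch of the Joblib variant, the two helpers could be written as follows. The function signatures are assumptions for illustration; `joblib.dump` and `joblib.load` are the library's standard persistence calls.

```python
import joblib


def save_pipeline(pipeline, file_path):
    """Persist a fitted pipeline with joblib (hypothetical signature)."""
    # joblib.dump takes the object and the target path, and handles
    # opening and closing the file itself.
    joblib.dump(pipeline, file_path)


def load_pipeline(file_path):
    """Restore a pipeline that was saved with joblib.dump."""
    return joblib.load(file_path)
```

As with Pickle, only load files from a trusted source, since Joblib relies on pickling under the hood.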

Note: the interested reader can see the full code on my GitHub.

3 JSON

Another way we could save our model is with JSON. Unlike Joblib and Pickle, the JSON method does not necessarily save the fitted model directly. Instead, it saves all of the parameters required to rebuild the model, which makes it a good approach when you need full control over the saving and restoration process.
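One possible shape for such a class is sketched below. It assumes scikit-learn's `LogisticRegression` as the parent class and stores the hyperparameters plus the three fitted attributes needed for prediction. The `load_json()` method name and the JSON layout are illustrative assumptions rather than the author's exact implementation.

```python
import json

import numpy as np
from sklearn.linear_model import LogisticRegression


class MyLogisticRegression(LogisticRegression):
    """LogisticRegression with illustrative JSON save/load helpers."""

    def save_json(self, file_path):
        # NumPy arrays are not JSON serializable, so convert them to lists.
        # Note: non-default hyperparameters such as a class_weight dict
        # may need extra handling before they can be written as JSON.
        payload = {
            "params": self.get_params(),
            "classes_": self.classes_.tolist(),
            "coef_": self.coef_.tolist(),
            "intercept_": self.intercept_.tolist(),
        }
        with open(file_path, "w") as f:
            json.dump(payload, f)

    def load_json(self, file_path):
        with open(file_path) as f:
            payload = json.load(f)
        # Restore the hyperparameters first, then the fitted attributes.
        self.set_params(**payload["params"])
        self.classes_ = np.array(payload["classes_"])
        self.coef_ = np.array(payload["coef_"])
        self.intercept_ = np.array(payload["intercept_"])
        return self
```

Because the file is plain text, you can inspect or edit it by hand, but you are responsible for saving every attribute the model needs at prediction time.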

Note: The interested reader may wish to learn more about [inheritance](https://medium.com/geekculture/inheritance-getting-to-grips-with-oop-in-python-2ec35b52570#:~:text=the%20super()%20function.-,The%20super()%20function,-The%20super()) to understand how we built this class.

Now we can call the save_json() method on our MyLogisticRegression instance to save the parameters.

And we can call it in another script as follows:

You can use all of this data to reproduce the previously built model.

4 PMML

Predictive Model Markup Language (PMML) is another format practitioners use to save their machine learning models. In one respect it is more robust than Pickle: a PMML file is a standard XML document, so it does not depend on the Python class the model was created from, whereas a pickled model can only be loaded where that class is available.

We can also load the model as follows:

5 TensorFlow Keras

TensorFlow Keras can save a TensorFlow model in either the SavedModel or the HDF5 format.

Let’s build a simple model and save it with TensorFlow Keras – first, we will start by generating some training data.
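For illustration, synthetic training data could be generated with NumPy as in the sketch below. The shapes, the seed, and the labelling rule are arbitrary assumptions, not the data used in the original post.

```python
import numpy as np

# Fixed seed so the synthetic data is reproducible.
rng = np.random.default_rng(seed=42)

# 1,000 samples with 10 features each, drawn from a standard normal.
X_train = rng.normal(size=(1000, 10)).astype("float32")

# Binary label: 1 when the features of a sample sum to a positive value.
y_train = (X_train.sum(axis=1) > 0).astype("float32")
```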

Now let’s build a sequential model:

In the above code, we used the save() method on our sequential model instance and passed it the path we wanted to save the model to.

To load the model, we simply call the load_model() function from the keras.models module.
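The save and load steps can be sketched end to end as follows. The architecture, the toy data, and the file name are assumptions for illustration. The sketch uses the HDF5 format because recent Keras releases expect `save()` to receive a path with a file extension such as `.h5` or `.keras`.

```python
import os
import tempfile

import numpy as np
import tensorflow as tf

# Toy data (illustrative): 100 samples, 4 features, binary labels.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 4)).astype("float32")
y = (X.sum(axis=1) > 0).astype("float32")

# A small sequential model.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Dense(8, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy")
model.fit(X, y, epochs=2, verbose=0)

# Save in HDF5 format; the .h5 extension selects the format.
model_path = os.path.join(tempfile.mkdtemp(), "my_model.h5")
model.save(model_path)

# Restore the architecture, weights, and compile state from the file.
restored_model = tf.keras.models.load_model(model_path)
```

The restored model should produce the same predictions as the original, which is a quick sanity check worth running after any save.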

Wrap up

In this article, we’ve covered 5 different ways to save your machine learning models. It’s also important to record the version of Python you used to build the model, and the versions of the libraries you used – this data will ease the process of re-creating the environment the models were built in so they can be reproduced at a later date.
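One lightweight way to capture that information is sketched below, using only the standard library. The helper name and the package list are hypothetical; adapt them to whatever your project depends on.

```python
import platform
from importlib import metadata


def environment_snapshot(packages):
    """Return the Python version and installed versions of the given packages.

    Hypothetical helper: packages that are not installed map to None.
    """
    snapshot = {"python": platform.python_version()}
    for name in packages:
        try:
            snapshot[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            snapshot[name] = None
    return snapshot


# Example: store this dictionary next to the saved model file.
versions = environment_snapshot(["scikit-learn", "joblib", "tensorflow"])
```

Writing this dictionary to a JSON file alongside the model keeps the environment details and the artifact together.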

Thanks for reading.


Get an email whenever Kurtis Pykes publishes.

