Avoid Using "pip freeze" – Use "pipreqs" instead

Motivation
Package management is a best practice in the software development workflow because it facilitates the automation of software delivery.
Nowadays, most Data Scientists and Machine Learning Engineers have adopted this practice for their pipeline automation. Even though the practice itself is a good one, the approach most practitioners rely on is not always efficient: the use of [pip freeze](https://pip.pypa.io/en/stable/cli/pip_freeze/).
In this conceptual blog, you will understand that there is a better option than using pip freeze.
This is why you should not use pip freeze
Imagine that you are working on a project that requires 5 dependencies: dep1, dep2, dep3, dep4, and dep5. The reaction that most people will have when generating the dependencies file is to use the following magic command:
pip freeze > requirements.txt
But how can this be an issue?
The installation of most libraries requires other libraries which are automatically installed. Below is an illustration.
# Install the transformers library
pip install transformers
The installation of the transformers library generates the following message:
Successfully installed huggingface-hub-0.10.1 tokenizers-0.13.1 transformers-4.23.1
This means that two additional libraries, huggingface-hub-0.10.1 and tokenizers-0.13.1, have been installed along with the transformers library, and those two libraries will automatically be included in the requirements.txt file.
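If you want to see where such extra packages come from, here is a minimal sketch that reads the requirements a package declares in its metadata. It relies on the standard library's importlib.metadata; the helper names requirement_name and declared_requirements are made up for this illustration, and the result depends on what is installed in your environment.

```python
import re
from importlib import metadata


def requirement_name(requirement: str) -> str:
    """Return the bare project name at the start of a requirement string."""
    # Project names start with a letter or digit and may contain '.', '_' or '-'.
    match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", requirement.strip())
    return match.group(0) if match else ""


def declared_requirements(package: str) -> list[str]:
    """List the project names an installed package declares as requirements.

    The list also contains optional (extras) requirements, so it can be
    longer than what pip actually installed.
    """
    try:
        requires = metadata.requires(package) or []
    except metadata.PackageNotFoundError:
        return []  # the package is not installed in this environment
    return sorted({requirement_name(r) for r in requires})


# Prints an empty list if transformers is not installed.
print(declared_requirements("transformers"))
```

Every name in that list is a candidate to land in requirements.txt after pip freeze, even though you never asked for it directly.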

But I still do not see an issue! Don’t worry, we are getting there…
Now, imagine that the transformers library was upgraded and requires different libraries: new_lib-1.2.3 and another_new_lib-0.2.1. This means that the previous dependencies, huggingface-hub-0.10.1 and tokenizers-0.13.1, are not relevant anymore, right? At least for the project ❌.
Here is the problem 👇🏽
Generating the new version of the requirements.txt file will still include the old dependencies in addition to the new ones, because they remain installed in the environment. If you want to keep only the relevant dependencies, you will have to remove the old ones manually. Now imagine dealing with a project that requires 20, 30, or 50 libraries! That can quickly become a headache 🤯.
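To make the problem concrete, here is a small sketch of the manual cleanup you would otherwise do by hand. The frozen list mimics what pip freeze could print after the imaginary upgrade; the upgraded transformers version and the stale_entries helper are hypothetical and only serve the illustration.

```python
def stale_entries(frozen: list[str], needed: set[str]) -> list[str]:
    """Return the frozen lines whose package name is not in `needed`."""
    stale = []
    for line in frozen:
        # "name==version" -> "name"
        name = line.split("==", 1)[0].strip().lower()
        if name and name not in needed:
            stale.append(line)
    return stale


# Hypothetical `pip freeze` output after the upgrade described above.
frozen = [
    "another_new_lib==0.2.1",
    "huggingface-hub==0.10.1",
    "new_lib==1.2.3",
    "tokenizers==0.13.1",
    "transformers==9.9.9",  # placeholder for the upgraded version
]

# What the project needs now, according to the scenario.
needed = {"transformers", "new_lib", "another_new_lib"}

print(stale_entries(frozen, needed))
# ['huggingface-hub==0.10.1', 'tokenizers==0.13.1']
```

The catch is that you have to know the needed set yourself, which is exactly the information pip freeze does not have.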

In short, pip freeze is not smart enough to manage dependencies efficiently, and here are a few reasons:
→ pip freeze is only for "pip install": it is only aware of the packages installed using the pip install command. This means that packages installed using a different approach, such as poetry, setuptools, conda, etc., may not be included in the final requirements.txt file.

→ pip freeze does not account for dependency versioning conflicts: a project lifecycle is iterative, and hence could require completely new or upgraded versions of existing libraries. Using pip freeze saves all packages in the environment, including those that are not relevant to the project.
→ pip freeze grabs everything: if you are not using a virtual environment, pip freeze generates a requirements file containing all the libraries in the global environment, including those beyond the scope of your project.
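For the last point, a quick way to check whether you are inside a virtual environment before freezing is to compare two interpreter paths. This is a minimal sketch with a made-up helper name, in_virtualenv; it covers environments created with venv or recent virtualenv versions and does not detect conda environments.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter runs inside a venv-style environment."""
    # Inside a virtual environment, sys.prefix points at the environment,
    # while sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


if not in_virtualenv():
    print("No virtual environment detected: pip freeze would list global packages.")
```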
So, what can I do to solve these issues? Good question!
The answer is to use pipreqs 🎉
Pipreqs – a better alternative
[pipreqs](https://pypi.org/project/pipreqs/) starts by scanning all the Python files (.py) in your project, then generates the requirements.txt file based on the import statements in each Python file of the project. Because the file is built from what the code actually imports, it addresses the issues faced when using pip freeze.
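To give an intuition for the import-scanning idea, here is a minimal sketch built on the standard library's ast module. It is an illustration of the technique, not the actual pipreqs code, and the function name top_level_imports is made up for this example.

```python
import ast


def top_level_imports(source: str) -> set[str]:
    """Collect the top-level module names imported by a piece of Python code."""
    modules = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            # "import a.b as c" -> "a"
            for alias in node.names:
                modules.add(alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            # Skip relative imports such as "from . import helpers":
            # they point at the project itself, not at an installed package.
            if node.level == 0 and node.module:
                modules.add(node.module.split(".")[0])
    return modules


example = """
import os
import numpy as np
from transformers import pipeline
from . import helpers
"""

print(sorted(top_level_imports(example)))
# ['numpy', 'os', 'transformers']
```

A real tool still has more work to do after this step: standard library modules such as os must be filtered out, and an import name does not always match the package name on PyPI (for example, sklearn is installed as scikit-learn).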

The installation is straightforward with the following pip command.
pip install pipreqs
Once you have installed the library, you just need to provide the root location of your project and run this command to generate the requirements.txt file of the project.
pipreqs /<your_project_root_path>/
Sometimes you might want to update the requirements file. In this case, you need to use the --force option to force the regeneration of the file.
pipreqs --force /<your_project_root_path>/
Imagine that you want to ignore the libraries of some Python files from a specific subfolder. This can be achieved by using the --ignore option before specifying the subfolder that needs to be ignored.
pipreqs /<your_project_root_path>/ --ignore /<your_project_root_path>/folder_to_ignore/
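Conceptually, ignoring a subfolder means that the scan never descends into it. The sketch below shows one way to do that with os.walk; it is an assumption about the general idea rather than a description of pipreqs internals, and python_files is a made-up helper. The demo builds a throwaway project in a temporary directory.

```python
import os
import tempfile


def python_files(root: str, ignore: tuple[str, ...] = ()) -> list[str]:
    """Return the .py files under `root`, skipping the directories in `ignore`."""
    ignored = {os.path.abspath(path) for path in ignore}
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Editing dirnames in place stops os.walk from entering ignored folders.
        dirnames[:] = [
            d for d in dirnames
            if os.path.abspath(os.path.join(dirpath, d)) not in ignored
        ]
        for name in filenames:
            if name.endswith(".py"):
                found.append(os.path.join(dirpath, name))
    return sorted(found)


# Throwaway project layout used only for this demo.
with tempfile.TemporaryDirectory() as root:
    for rel in ("main.py", "folder_to_ignore/skip.py", "pkg/utils.py"):
        path = os.path.join(root, *rel.split("/"))
        os.makedirs(os.path.dirname(path), exist_ok=True)
        open(path, "w").close()

    kept = python_files(root, ignore=(os.path.join(root, "folder_to_ignore"),))
    relative = [os.path.relpath(p, root).replace(os.sep, "/") for p in kept]

print(relative)
# ['main.py', 'pkg/utils.py']
```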
Conclusion
Congratulations!🎉🍾 You have just learned about a new, yet efficient way to manage your project dependencies.