
“8 More” Python Best Practices for Writing Industry Standard Code

A few Python best practices I learnt after entering the software industry

Photo by Christina Morillo from Pexels

I recently read a good article by Anmol Tomar on Python Best Practices. It touches on simple yet effective best practices that can improve your code quality. Then I thought, "Well, I can add a few more to this list by Anmol." Hence, here I am, writing this blog post. I did not follow these best practices before joining the industry. But after being part of a team, I learnt them from my colleagues and wished I had known and followed these small steps earlier.

"Industry standard code" may sound fancy. But it is nothing other than readable, reusable, modifiable code. In industry, we work in a team. If I write some piece of code while working for a firm, the code I write is owned by the firm. Sometime later, a completely different person, working for the same firm, should be able to read my code, understand it, make improvements on it, or fix a bug in it, or integrate it with another piece of software. If that is not possible, software industry won’t scale beyond individuals. So as long as you are writing simple, understandable code that someone else can improve upon, you are writing "Industry standard code".

The 8 best practices I have listed here are in addition to what Anmol has mentioned in his article. Do read it as well. There is never an exhaustive list of "best practices." These are a few simple ones I learnt in the industry that helped me improve my coding and work in a collaborative environment. So never treat them as a complete set of best practices.

1. Don’t use random values in your code. Define them as constants.

Let’s say the piece of code you are writing has one line that converts mass (kg) to weight (N). It is a simple one-liner, where you have to multiply the mass variable by 9.8 m/s². So we tend to write a line of code like the one below.

Bad practice
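
A minimal sketch of what such a line might look like (the `mass` value is made up purely for illustration):

```python
# Hypothetical mass in kilograms, chosen only for illustration.
mass = 10.0

# The literal 9.8 is a "magic number": nothing here explains what it represents.
weight = mass * 9.8
```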

It is technically correct. But when someone reads your code without context, they will wonder why it is 9.8 and where it came from. So it is always good to define it as a named constant and use it subsequently.

Good practice
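
The same calculation with a named constant could look like this. The constant name `GRAVITATIONAL_ACCELERATION` is my own choice for this sketch, not a standard name:

```python
# Acceleration due to gravity in m/s^2, defined once near the top of the file.
GRAVITATIONAL_ACCELERATION = 9.8

# Hypothetical mass in kilograms, chosen only for illustration.
mass = 10.0

# The intent of the multiplication is now clear from the name alone.
weight = mass * GRAVITATIONAL_ACCELERATION
```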

Python doesn’t have constant definitions like most other languages. Hence we define a constant as just another variable, but we use ALL_CAPS as a naming convention to indicate that it is not meant to change. It is good practice to define constants at the top of the source file (after the imports). If the project is large and many source files share project-level constants, you can define all of them in a constants.py file and import them as necessary in the source files that need them.

2. Use verbs as function and method names

This might not sound so important, but hear me out. Let’s say you are writing a function to calculate the prime factors of a given number. Most probably we would define the function like this:

Bad practice
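
A sketch of a function named with a noun. The trial-division body is only my illustrative implementation; the point here is the name:

```python
def prime_factors(number):
    """Return the prime factors of number in ascending order."""
    factors = []  # cannot be called prime_factors: that name is the function
    divisor = 2
    while divisor * divisor <= number:
        # Divide out each prime as many times as it occurs.
        while number % divisor == 0:
            factors.append(divisor)
            number //= divisor
        divisor += 1
    if number > 1:
        # Whatever remains is itself a prime factor.
        factors.append(number)
    return factors


# The caller also has to invent some other name for the result.
result = prime_factors(12)
```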

Then when calling the function and assigning its output to a variable, we will be wondering what to name the variable because prime_factors is already taken. Same goes if we want to define a local variable inside the function. But if we ponder a little, almost always we write a function or method to "do something". Functions and methods perform an action on the data. Hence it is a good idea to name your functions and methods as <do>_<something>. In this particular example I would prefer the following as function name.

Good practice
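
One name that follows the <do>_<something> pattern is `calculate_prime_factors`. Treat the exact name and the trial-division body as an illustrative sketch:

```python
def calculate_prime_factors(number):
    """Return the prime factors of number in ascending order."""
    prime_factors = []  # the natural noun is now free for the local variable
    divisor = 2
    while divisor * divisor <= number:
        # Divide out each prime as many times as it occurs.
        while number % divisor == 0:
            prime_factors.append(divisor)
            number //= divisor
        divisor += 1
    if number > 1:
        # Whatever remains is itself a prime factor.
        prime_factors.append(number)
    return prime_factors


# The same noun also works for the caller's variable.
prime_factors = calculate_prime_factors(12)
```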

After that I would use prime_factors as the variable name to capture the result of the function when calling it.

3. Define members as private or protected as per the access scope requirements

Defining attributes and methods of a class as private or protected is not a language feature of Python. But it can be done through a universal convention. It is always a best practice to expose only the attributes and methods that are necessary. Everything else should be a protected or private member of the class. Protected members are the attributes and methods that are accessible from within the class and its sub-classes, whereas private members are accessible only from within the class itself.

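In Python, the following conventions are used to define protected and private members. Members prefixed with a single underscore _ are treated as protected, and the ones prefixed with a double underscore __ are treated as private. Inside a class, Python also mangles double-underscore names to _ClassName__name, which makes accidental access from outside harder but not impossible.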

Good practice
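
A small sketch of the convention. The `BankAccount` class and all of its members are hypothetical names invented for this example:

```python
class BankAccount:
    def __init__(self, owner, balance):
        self.owner = owner        # public: part of the class's interface
        self._balance = balance   # protected by convention (single underscore)
        self.__audit_log = []     # private: stored as _BankAccount__audit_log

    def deposit(self, amount):
        """Public method: the only supported way to add money."""
        self._validate_amount(amount)
        self._balance += amount
        self.__record(f"deposit {amount}")

    def get_balance(self):
        """Public read access to the protected balance."""
        return self._balance

    def _validate_amount(self, amount):
        # Protected helper: intended for this class and its sub-classes.
        if amount <= 0:
            raise ValueError("amount must be positive")

    def __record(self, entry):
        # Private helper: intended for this class only.
        self.__audit_log.append(entry)


account = BankAccount("Alice", 100)
account.deposit(50)
balance = account.get_balance()
```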

As I mentioned above, in Python this is just a convention. Still we have means to access these members. But a good programmer won’t. This indicates to subsequent programmers, who may extend your program, that these attributes and methods are not meant to be accessed from outside.

Even if you are not using Python in an object-oriented fashion (because most people don’t), you still can adopt the same convention for functions which are not meant to be imported elsewhere. Simply prefix such function names with underscores. This will indicate that this function is local to the file and not supposed to be imported elsewhere.
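
For plain functions, the same idea might look like the sketch below. Both function names are hypothetical:

```python
def _strip_and_lower(text):
    # Leading underscore: a helper local to this file. A wildcard import
    # skips such names when the module does not define __all__.
    return text.strip().lower()


def normalize_names(names):
    """Public function that other modules are expected to import."""
    return [_strip_and_lower(name) for name in names]


cleaned = normalize_names(["  Alice ", "BOB"])
```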

4. Don’t do import *

I have done it many times in the past. When I wanted to import a few components that I had defined elsewhere, I would simply do:

Bad practice
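
A sketch of the problem using two standard library modules, so that it runs as-is (in a real project these would more likely be your own modules):

```python
# Wildcard imports pull every public name into the current namespace.
from math import *
from cmath import *  # cmath's sqrt now silently shadows math's sqrt

# Which sqrt is this? The reader cannot tell without knowing both modules.
root = sqrt(16)
```

Here `root` ends up as a complex number, because the second import replaced the `sqrt` that came from `math`.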

When we do an import * in our code, all the public entities from the module are imported into the current namespace. This severely affects the readability of your code. When you have two such lines, the reader will lose track of where the entities you have used are imported from. Even worse, such an import may shadow names from an earlier import or from local definitions.

The best thing to do is to import the whole package or module if you are accessing a lot of entities. Then you can later access the classes or functions as <module>.<class>. If you are importing only one or two components, use the from <package> import <class> format. But use it sparingly.

Good practice
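
A sketch of both import styles, again using standard library modules so that it runs as-is:

```python
# Import whole modules when several of their names are used.
import cmath
import math

# Import a single name explicitly when only one or two are needed.
from fractions import Fraction

real_root = math.sqrt(16)       # clearly the real-valued square root
complex_root = cmath.sqrt(-16)  # clearly the complex-valued square root
half = Fraction(1, 2)
```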

5. Use code formatters (or linters)

What if I say a lot of the styling (formatting) best practices can be enforced or automatically applied to your code? Yes, start using a code formatter or linter. In an industrial setting, most organizations have a linter as part of the CI (Continuous Integration) pipeline. When you try to push some code to the organization’s repo, it is automatically linted, and if any prescribed best practices aren’t followed, the pipeline raises an error that prevents the code from being merged.

So it is better to start using one. Black is a good code formatter. It formats your code in a consistent style that follows PEP 8. Some IDEs have built-in code formatters which you can use with a simple shortcut. Linters do a bit more than code formatters. They are static code analyzers. They can be used to check whether proper variable naming conventions are used, check for likely bugs, etc. Pylint can be a starting point to get introduced to linters.

6. Write unit tests

This can be boooooooooring and time consuming. But it is essential when working in a team or on a long-term project. Writing proper unit tests makes your code "standardized". Unit testing is the mechanism through which the functionality of individual logical units of source code is tested. It lets you isolate logical errors and prevent them from propagating up. When working in a team, this is essential, because if functionally faulty code is merged, it can cause errors when interacting with other components. Trust me, writing proper unit tests is in fact a time saver in the long run. Python’s built-in unittest library is very good for unit testing your Python code.
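
A minimal sketch of what a unittest test case can look like. The function under test, `calculate_weight`, and the test names are hypothetical and exist only for this example:

```python
import unittest

GRAVITATIONAL_ACCELERATION = 9.8  # m/s^2


def calculate_weight(mass):
    """Convert a mass in kg to a weight in N."""
    if mass < 0:
        raise ValueError("mass must be non-negative")
    return mass * GRAVITATIONAL_ACCELERATION


class TestCalculateWeight(unittest.TestCase):
    def test_returns_weight_for_positive_mass(self):
        # assertAlmostEqual avoids failures from floating-point rounding.
        self.assertAlmostEqual(calculate_weight(10), 98.0)

    def test_zero_mass_has_zero_weight(self):
        self.assertEqual(calculate_weight(0), 0)

    def test_negative_mass_raises_value_error(self):
        with self.assertRaises(ValueError):
            calculate_weight(-1)
```

If this is saved in a file whose name starts with test_ (say, test_weight.py), running python -m unittest from the same folder should discover and run the three tests.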

7. Log the errors

Your code isn’t bug free. Even after linting, unit testing, and manual testing, it is highly likely to fail in production occasionally. Trust me, it is okay. It happens quite often in the industry. What matters is whether you have preemptively employed mechanisms to capture what went wrong and where. You can fix an error only if you have a pointer to what went wrong and where. So make error logging mandatory in your code. It is simple: use Python’s logging module.

Good practice
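
A minimal sketch using the standard logging module. The `divide` function is a hypothetical stand-in for whatever can fail in your code:

```python
import logging

# Logs go to stderr by default; pass filename="app.log" (a hypothetical
# path) to basicConfig to write them to a file instead.
logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s %(levelname)s %(name)s: %(message)s",
)
logger = logging.getLogger(__name__)


def divide(numerator, denominator):
    try:
        return numerator / denominator
    except ZeroDivisionError:
        # logger.exception logs at ERROR level and appends the traceback.
        logger.exception(
            "Division failed for numerator=%s, denominator=%s",
            numerator,
            denominator,
        )
        raise  # re-raise so the caller still sees the failure
```

Logging the error and then re-raising it keeps the record of what went wrong without hiding the failure from the calling code.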

Also use warning, info, and debug level logging as required by your project. Remember, the log file is your "blackbox", which you will have to use for a "postmortem" if your code fails in production.

8. Generate requirements.txt file with versions

This is more relatable if you are working in machine learning, data science, or any other domain with packages that are continuously evolving and getting major updates. You create a virtual environment, install some packages, work on a project, and deliver. Then you don’t touch the project for some time. After that, you are asked to make some improvements. When you try to recreate the environment, there is a significant possibility that your code will break. The reason is that packages go through updates which may, at times, contain breaking changes.

You can simply overcome this by following the best practices mentioned below.

  1. Have a separate virtual environment for each project
  2. Track the installed packages using a requirements.txt file per project
  3. When tracking the packages track them with the exact package versions.

The third point is very important. If you have the working package version always in the repo, every time you try to recreate the environment, the exact version of the package will be installed. Hence, there won’t be any breaking changes.

You can manually track the packages you install or if you are maintaining a proper minimal environment you can use the following shell command to generate the file automatically.

pip freeze > requirements.txt

Edit: As mentioned by Cesar Flores in the comments, using pip freeze may cause issues, as it tracks second-level dependencies as well. So, manually tracking the packages would be the best option.
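
If you track packages manually, you still need the exact installed versions. One possible way to look them up is the standard library’s importlib.metadata (Python 3.8+). This is only a sketch: the `pin_requirements` helper and the package list are hypothetical, and you would replace the list with the packages your project imports directly.

```python
from importlib.metadata import PackageNotFoundError, version


def pin_requirements(package_names):
    """Return 'name==version' lines for the given installed packages."""
    lines = []
    for name in package_names:
        try:
            lines.append(f"{name}=={version(name)}")
        except PackageNotFoundError:
            # Not installed in this environment, so there is nothing to pin.
            print(f"Skipping {name}: not installed")
    return lines


# Hypothetical list of the packages the project imports directly.
top_level_packages = ["numpy", "pandas"]
print("\n".join(pin_requirements(top_level_packages)))
```

The printed lines can then be pasted into requirements.txt, leaving out the second-level dependencies that pip freeze would add.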

If you need more information on requirements.txt, here is a good read.


So there you have it, 8 basic Python best practices that I learnt after entering the software industry. I hope that you found this article useful. Also, if you have any comments or more best practices that you think should be included, please share them in the comments section.

References

Python Best Practices for Writing Industry Standard Code

public, protected, private members in Python

Importing modules in Python – best practice

GitHub – psf/black: The uncompromising Python code formatter

pylint

unittest – Unit testing framework – Python 3.10.5 documentation

logging – Logging facility for Python – Python 3.10.5 documentation

The Python Requirements File and How to Create it

