I am sure you have all read about the importance of documenting your code. As a data scientist, I usually work in Jupyter notebooks during development, and the notes there are sufficient for me.
But, not going to lie, when I come back to a notebook weeks later to move the code to production, I have to scratch my head more often than I care to admit. Plus, sparse notes can be a challenge when you are working in a team or handing the code over to another team. 😅
It’s the little things you forget to put in your Jupyter notes that can cause the biggest headaches.
In this article, I will cover the 3 things you need to know and practice to make your documentation as complete as possible.
1. Comments
Add relevant comments to your code wherever they help, but be mindful of overdoing it by commenting on everything. The best comments are the ones you don’t have to write, because the code is already clear.
Generally speaking, there are two types of comments: single-line and multi-line comments. In Python, a comment starts with the hash symbol (#).
However, for multi-line comments, you could also use a multi-line string literal (''' '''). Since the string literal is not assigned to any variable, the Python interpreter evaluates it and throws the result away, so it effectively acts as a comment.
Should I use the hash style or the string literal style for multi-line comments? Just choose a style and stick to it. Personally, I go with hashes, because I am an error-prone person 😅 .
If you are not careful with the string literal style, you could accidentally turn a comment into a docstring, which would just muddy the documentation.
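To make that pitfall concrete, here is a minimal sketch. Both function names are made up for illustration. A bare string literal that happens to be the first statement in a function body becomes the function's docstring, while hash comments never do.

```python
def with_hash_comments(x):
    # Step 1: double the input.
    # Step 2: add one.
    return 2 * x + 1


def with_string_literal(x):
    '''
    Step 1: double the input.
    Step 2: add one.
    '''
    return 2 * x + 1


# Hash comments are invisible at runtime, so there is no docstring.
print(with_hash_comments.__doc__ is None)

# The string literal was meant as a comment, but because it is the first
# statement in the body, Python stores it as the function's docstring.
print("double the input" in with_string_literal.__doc__)
```

Both checks print True. Anything that reads docstrings, such as help() or a documentation generator, would now show those implementation notes as if they were the function's official description.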

How do you decide when it’s relevant to add some comments? I use this simple quote from Jeff as a north star.
Code can only tell you how the program works; comments can tell you why it works. – Jeff Atwood
When deciding whether to add a comment, ask yourself if the code explains the why behind it. If the why is ambiguous or hard to decipher at a glance, it might be a good candidate for some good old-fashioned comments. 👍
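As a small illustration of the how-versus-why idea, consider the hypothetical function below. The function and the scenario are assumptions made up for this example. The first comment only repeats what the code already says, while the second explains a decision the code cannot explain on its own.

```python
def average_rating(ratings):
    # A "how" comment that adds nothing: check if the list is empty.
    # A "why" comment that earns its place: a product with no reviews
    # should sort to the bottom of the listing, so we return 0.0
    # instead of letting the division below raise ZeroDivisionError.
    if not ratings:
        return 0.0
    return sum(ratings) / len(ratings)


print(average_rating([4, 5]))
print(average_rating([]))
```

If the first comment were deleted, nobody would miss it. If the second were deleted, the next reader would have to guess whether returning 0.0 is a bug or a deliberate choice.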
2. Explicit Typing
Be as explicit with your code as possible. This will go a long way to remove any ambiguity down the line. I am sure all of you are clear and explicit with your variable declarations, but not many are explicit with their function definitions.
The image below shows an example of explicit typing in a function definition.

Thanks to explicit typing, the signature of the function above tells us a lot about its inputs and outputs. For example:
The function has 3 inputs:
1. A variable called Dimensions of type List.
2. A variable called shape of type string.
3. A variable called unit_type of type string with a default value of 'metric'.
The function also has a return value of type float.
See how informative that is? 😄 Just by looking at that signature, we can tell a lot about the function.
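For reference, a signature matching the description above could look like the sketch below. The function name calculate_volume, the element type float inside the list, the lowercase spelling of dimensions, and the whole function body are my assumptions for illustration. Only the three parameters, their types, the default value, and the float return type come from the description.

```python
import math
from typing import List


def calculate_volume(dimensions: List[float], shape: str, unit_type: str = "metric") -> float:
    # Hypothetical body: only the signature is described in the article.
    if unit_type not in ("metric", "imperial"):
        raise ValueError(f"Unsupported unit_type: {unit_type!r}")
    if shape == "cuboid":
        length, width, height = dimensions
        volume = float(length * width * height)
    elif shape == "sphere":
        (radius,) = dimensions
        volume = 4.0 / 3.0 * math.pi * radius ** 3
    else:
        raise ValueError(f"Unsupported shape: {shape!r}")
    return volume


# The type hints are stored on the function itself, so editors and other
# tools can read them without running the body.
print(calculate_volume.__annotations__)
print(calculate_volume([2, 3, 4], "cuboid"))
```

Keep in mind that Python does not enforce these hints at runtime. They are documentation that editors and static type checkers can verify for you.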
3. Docstrings
Docstrings are your best friends when it comes to documentation. They provide a standardized way to define the overall utility, arguments, exceptions and so much more.
If you are using Visual Studio Code, please install the Python Docstring Generator extension (listed in the marketplace as autoDocstring). It will make documenting so much easier!

All you have to do is type """ on the line directly below a function definition and press the Enter key. It will then generate a template docstring and autofill parts of it using the typing information.
Pro-tip: Do the docstrings after you have completed the function. This way it will autofill as much as possible, including return types, exceptions, etc.
As an example, let’s apply docstrings to the function definition we used in the previous section.

Look at how the docstring extension in VS Code automatically generated some documentation using the information from the function signature. It also highlights the parts that need to be reviewed for your documentation to be thorough.
Once you have your docstrings done for the project, you can turn them into a simple and elegant static website using MkDocs. More on that some other day.
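In text form, a filled-in docstring for a function with the signature from the previous section might look like the sketch below. I am assuming the Google-style layout here, which is one common convention. The exact template you get depends on the extension's settings, and the function name, the descriptions, and the body are hypothetical.

```python
import math
from typing import List


def calculate_volume(dimensions: List[float], shape: str, unit_type: str = "metric") -> float:
    """Calculate the volume of a simple 3D shape.

    Args:
        dimensions (List[float]): Measurements of the shape, for example
            [length, width, height] for a cuboid or [radius] for a sphere.
        shape (str): Name of the shape, either "cuboid" or "sphere".
        unit_type (str, optional): Unit system of the measurements,
            either "metric" or "imperial". Defaults to "metric".

    Raises:
        ValueError: If shape or unit_type is not supported.

    Returns:
        float: The volume in cubic units of the given measurements.
    """
    # Hypothetical body, for illustration only.
    if unit_type not in ("metric", "imperial"):
        raise ValueError(f"Unsupported unit_type: {unit_type!r}")
    if shape == "cuboid":
        length, width, height = dimensions
        volume = float(length * width * height)
    elif shape == "sphere":
        (radius,) = dimensions
        volume = 4.0 / 3.0 * math.pi * radius ** 3
    else:
        raise ValueError(f"Unsupported shape: {shape!r}")
    return volume


# The docstring is stored on the function, which is what help() and
# documentation generators read.
print(calculate_volume.__doc__)
```

The extension can only prefill the structure, such as parameter names, types, and defaults. The descriptions are the part you still have to write yourself.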
PS: Ignore the squiggly line under the return value volume. It’s showing up because I didn’t declare a variable volume in the function. 😅
Final Thoughts
Documenting your code can go a long way. Not only does it help others understand what the code does, but it also makes you a better developer by making you rethink your code, forcing you to apply some standardisation across all your files, and making you a better communicator.
At first, documenting can feel like an extra challenge. But if you practice the points mentioned in this article and use some extensions in your IDE, it becomes simple and second nature after a while. 😃
I hope you found this article useful. Please reach out to me if you have any questions or if you think I can help.
Feel free to connect with me on Twitter as well. 😄
You might also like these articles by me:
How to Schedule a Serverless Google Cloud Function to Run Periodically
Machine Learning Model as a Serverless App using Google App Engine
Machine Learning Model as a Serverless Endpoint using Google Cloud Functions
The Only Data Science/Machine Learning Book I Recommend
3 Simple Side Hustles to Make Extra Income per Month as a Data Scientist