Cleaning, Refactoring, and Modularity: The “Must” Foundations to Improve your Python Code and Career

How some code habits can bring your development process and career to a whole new level

Published in

Towards Data Science

8 min read · Jul 19, 2020

“Any fool can write code that a computer can understand. Good programmers write code that humans can understand.”
Martin Fowler

We all know that during the rush of the development process, we are generally more focused on making our program work than on making it fully readable. This situation becomes even more complicated when we face a problem we have never seen before and have a tight deadline to deliver the work. Sometimes we have to turn to Stack Overflow to find a solution, or take some time to read all the similar questions until we find a new idea for working around the problem.

“When your program is a complete mess, but it does its job.”

I use Stack Overflow a lot, but at least for me, the time I spend looking for a solution eats into the time (usually long and hard) that I would otherwise take to make my code cleaner and more readable. If this happens to you too, maybe it’s because we focus more on “how to solve it” than on making the code accessible to other developers. Well… we want to make it work! And even when we do have time to format and document some things, who has never spent a few hours trying to understand code written years ago? The reason is simple: writing clean and readable code is hard and tiring. Still, we should always keep in mind the following sentence from the great “Uncle Bob”:

“The ratio of time spent reading versus writing is well over 10 to 1. We are constantly reading old code as part of the effort to write new code. …[Therefore,] making it easy to read makes it easier to write.”
Robert C. Martin, Clean Code: A Handbook of Agile Software Craftsmanship

We must keep in mind that as developers, programmers, software engineers, data scientists, and so on, our real audience is not computers but other programmers (including ourselves). As Uncle Bob’s quote points out, we usually spend more time reading documentation or other people’s code than writing new code, so why not invest more time in readability (however tiring it may be) and help yourself and others in the future?

It will not only make you a better programmer but also help with the scalability and maintainability of your product, reduce the number of bugs (this is real), and lower the complexity and risk of making changes or additions to the system. If none of this is enough for you, here is one more thought to help change old habits!

“Always code as if the guy who ends up maintaining your code will be a violent psychopath who knows where you live.”
John F. Woods

So let’s see some ways to do this. Below is a summary of some practices for achieving clean, high-quality code.

Refactoring

Refactoring means restructuring your code to improve its internal structure without changing its external behavior. The mindset behind it is: did you manage to make it work? Then go back to the beginning, clean up, and modularize your program! It may seem like a waste of time to do this early on, when you still have several features to add, but doing it in steps gives you the following advantages:

Reduces workload in the long run;
Makes the code easier to maintain;
Increases reusability;
Decreases the time refactoring takes in the future or on new projects (the more you do it, the faster you become);
If you try to do a better job than in your previous refactoring each time, you will soon master the skill;
The skill is highly valued in the job market and will make your profile stand out (just look at the “Desired to Have” sections of job posts on LinkedIn or Indeed).
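To make the idea of "same external behavior, better internal structure" concrete, here is a minimal before/after sketch. The function names and the sample data are hypothetical, invented only for this illustration:

```python
# Before: it works, but the names are cryptic and the filtering
# logic is buried inside the loop.
def avg(d):
    t = 0
    c = 0
    for x in d:
        if x is not None and x >= 0:
            t += x
            c += 1
    if c == 0:
        return 0.0
    return t / c


# After: same inputs, same outputs, clearer internal structure.
def is_valid_reading(reading):
    """Return True for readings that are present and non-negative."""
    return reading is not None and reading >= 0


def average_valid_readings(readings):
    """Return the mean of the valid readings, or 0.0 if there are none."""
    valid_readings = [r for r in readings if is_valid_reading(r)]
    if not valid_readings:
        return 0.0
    return sum(valid_readings) / len(valid_readings)


# A quick check that the refactoring did not change the behavior.
sample = [3, None, -1, 5]
assert avg(sample) == average_valid_readings(sample)
```

The final assertion is the important habit: keep a check like this (or a real test suite) around while you refactor, so you know the external behavior stayed the same.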

OK! I understand the advantages of refactoring, but how do I do it? Here are some ways to get started:

First, try to understand how code complexity works in Python and the metrics used to measure it (Lines of Code, Cyclomatic Complexity, Halstead Metrics, Maintainability Index, etc.);
Once you know how complexity works, most of what remains is renaming variables, modules, functions, classes, and methods, and checking whether you applied procedural programming anywhere in your code;
Finally, check for complexity anti-patterns… voilà, your program will now shine!
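To give a feeling for what a metric like Cyclomatic Complexity counts, here is a rough sketch built only on the standard library `ast` module. It is a simplified approximation (one plus the number of decision points), not the exact algorithm used by dedicated tools, and the function name `approximate_complexity` is my own invention for this example:

```python
import ast

# Node types treated as decision points in this simplified count.
DECISION_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.IfExp)


def approximate_complexity(source):
    """Return 1 plus the number of decision points found in the source."""
    tree = ast.parse(source)
    complexity = 1
    for node in ast.walk(tree):
        if isinstance(node, DECISION_NODES):
            complexity += 1
        elif isinstance(node, ast.BoolOp):
            # "a or b or c" has three operands and adds two extra paths.
            complexity += len(node.values) - 1
    return complexity


example = '''
def classify(age):
    if age < 0:
        return "invalid"
    elif age < 18 or age > 65:
        return "reduced fare"
    return "full fare"
'''

# One base path + "if" + "elif" + one "or" gives 4.
print(approximate_complexity(example))  # prints 4
```

The higher this number gets for a single function, the more paths you have to keep in your head (and test), which is usually a good hint that the function deserves to be split.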

If you want to go deeper into the topic, here is an excellent article that explains step by step how to achieve refactoring in Python, and here is a compilation of some Code Metrics available! If you prefer books, I advise the following:

Python Anti-Patterns (AWS);
Refactoring: Improving the Design of Existing Code (Martin Fowler);

Some good video lectures or workshops:

Measuring Python code complexity with wily (PyCon 2019);
Refactoring Python: Why and how to restructure your code (PyCon 2016).

Clean Code

“Clean Code” is not a method or a set of rules but a philosophy that brings together techniques to make code easier to write and read. Quoting Uncle Bob again:

“Clean code is not written by following a set of rules. You don’t become a software craftsman by learning a list of heuristics. Professionalism and craftsmanship come from values that drive disciplines.”
Robert C. Martin, Clean Code: A Handbook of Agile Software Craftsmanship

Remember the idea mentioned earlier, that under tight deadlines we focus on the result rather than on readable and clean code? Well, he makes his position very clear: the blame for bad code lies entirely with whoever wrote it.

“Nothing has a more profound and long-term degrading effect upon a development project than bad code. Bad schedules can be redone, bad requirements can be redefined. Bad team dynamics can be repaired. But bad code rots and ferments, becoming an inexorable weight that drags the team down.”
Robert C. Martin, Clean Code: A Handbook of Agile Software Craftsmanship

How do we solve this, then? Depending on where you are (how long have you been carrying bad habits down the road?), it can be easy or difficult. Eliminating bad habits is hard, but don’t give up: apply the following guidelines and over time you will get the hang of it!

Clean Code should be elegant to the point of being pleasing to read;
Clean Code must be descriptive and imply type: for example, use the “is_” or “has_” prefixes for booleans to make it clear that they are conditions;
Clean Code must be consistent but clearly differentiated: for example, “age_list” and “age” are easier to tell apart than “ages” and “age”;
Clean Code must avoid abbreviations and especially single letters. Use them only for counters and common math variables, and remember that if your team has different roles (Full Stack Engineers working with Data Scientists, for example), more descriptive names may be necessary;
Clean Code must show that long names are not the same as descriptive names: be descriptive only with relevant information;
Clean Code must keep lines to at most 79 characters, as PEP 8 recommends; learn how to break and indent a statement across multiple lines;
Clean Code must be well documented. In my opinion, the Google Style Example is the most complete, but you can find the style you like best and start using it;
Clean Code must use whitespace properly: organize your code with consistent indentation and separate sections with blank lines;
Clean Code must follow the PEP 8 guidelines for code layout;
Clean Code must follow the Law of Demeter for OOP.
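Several of the points above (boolean prefixes, distinguishable names, Google-style docstrings, and breaking long lines) fit in one small sketch. The functions and the discount rule below are hypothetical, made up only to show the naming and layout:

```python
def has_discount(customer_age, is_member):
    """Decide whether a customer qualifies for a discount.

    Args:
        customer_age (int): Age of the customer in years.
        is_member (bool): Whether the customer holds a membership.

    Returns:
        bool: True if the customer is a member or is 65 or older.
    """
    is_senior = customer_age >= 65
    return is_member or is_senior


def count_discounted_customers(age_list, membership_list):
    """Count how many customers qualify for a discount.

    Args:
        age_list (list[int]): Ages, one per customer.
        membership_list (list[bool]): Membership flags, one per customer.

    Returns:
        int: Number of customers for whom has_discount is True.
    """
    # A long expression broken across lines to stay within 79 characters.
    return sum(
        has_discount(age, is_member)
        for age, is_member in zip(age_list, membership_list)
    )


print(count_discounted_customers([70, 30, 20], [False, True, False]))
```

Notice that you can guess what each name holds before reading the body: “is_” and “has_” announce booleans, and “age_list” cannot be confused with a single “age”.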

Fortunately, some tools can help us keep our code clean! We can use linters to analyze the code and detect various categories of “lint”: code errors, dangerous code patterns, style violations, and potentially unintended results. Here is a compilation of the available linters for Python and how each one differs from the others. If you want to go deeper into the topic, I recommend these books:

[A MUST] Clean Code: A Handbook of Agile Software Craftsmanship (Robert C. Martin & Dean Wampler);
Effective Python: 59 Specific Ways to Write Better Python (Brett Slatkin);

If you prefer videos and workshops I suggest these:

Clean Code — Uncle Bob
Clean code in Python (PyCon CZ);
Transforming Code into Beautiful, Idiomatic Python (PyCon 2013)
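Going back to linters for a moment: to demystify what they do, here is a toy sketch of a single lint rule written with the standard library `ast` module. It only flags single-letter names that are assigned to, and both the rule and the `ALLOWED_SHORT_NAMES` set are my own assumptions for this example. Real linters are far more complete:

```python
import ast

# Short names tolerated by this toy rule (counters and math variables).
ALLOWED_SHORT_NAMES = {"i", "j", "k", "x", "y", "_"}


def find_short_names(source):
    """Return sorted (line, name) pairs for single-letter assigned names."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        is_assignment_target = (
            isinstance(node, ast.Name) and isinstance(node.ctx, ast.Store)
        )
        if (
            is_assignment_target
            and len(node.id) == 1
            and node.id not in ALLOWED_SHORT_NAMES
        ):
            findings.append((node.lineno, node.id))
    return sorted(findings)


snippet = "t = 0\nfor i in range(3):\n    t += i\ntotal = t\n"
for line_number, name in find_short_names(snippet):
    print(f"line {line_number}: single-letter name '{name}'")
```

The program never runs the snippet; it only inspects its structure. That is the core idea behind static analysis, and it is why linters can warn you before the code is ever executed.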

Modular Programming

Writing modular code is an important step in software development because it lets you reuse the same code, by referencing its module, to perform a specific action in different places in the program. This approach makes large programs easier to debug, increases code reusability and readability, improves reliability, and helps when several developers or teams work on the same program. From here on, I will assume that you already know how to structure a Python project; if you don’t, take a look here before continuing. In summary, to write modular code, follow these tips:

Don’t Repeat Yourself: Generalize and consolidate repeated code in functions or loops, avoid Spaghetti Code at all costs;
Abstract out Logic to Improve Readability: This improves readability with descriptive function names, but use with caution since you can over-engineer;
Minimize the Number of Entities: There are tradeoffs to having function calls instead of inline logic;
Functions Should do One Thing: If there’s an “and” in your function name, consider refactoring; as a rule of thumb, your function should also have fewer than 10 lines of code;
Arbitrary Variable Names can be More Effective in Certain Functions: Arbitrary variable names in general functions can actually make the code more readable;
Try to use Fewer than Three Arguments per Function: Remember we are modularizing to simplify our code and make it more efficient to work with. If your function has a lot of parameters, you may want to rethink how you are splitting this up.
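Here is a small sketch of how a few of these tips can look together: each function does one thing, repeated logic lives in reusable helpers, and related settings are grouped so that no function needs three or more arguments. All names (`ReportOptions`, `compute_total`, and so on) are hypothetical and exist only for this illustration:

```python
from dataclasses import dataclass


@dataclass
class ReportOptions:
    """Groups related settings so functions take fewer arguments."""
    currency: str = "USD"
    decimals: int = 2


def compute_total(prices):
    """Do one thing: add up the prices."""
    return sum(prices)


def format_amount(amount, options):
    """Do one thing: turn a number into a display string."""
    return f"{amount:.{options.decimals}f} {options.currency}"


def build_report_line(label, amount_text):
    """Do one thing: join a label with an already formatted amount."""
    return f"{label}: {amount_text}"


options = ReportOptions()
# The same helpers are reused for every category (Don't Repeat Yourself).
for label, prices in [("food", [12.5, 7.25]), ("books", [30])]:
    total = compute_total(prices)
    print(build_report_line(label, format_amount(total, options)))
# prints "food: 19.75 USD" and then "books: 30.00 USD"
```

There is a tradeoff here, as the tips above warn: three tiny functions are more entities to keep track of than one inline expression, so split only where the names genuinely make the code easier to read.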

An excellent article that I advise you to take a look at is this one, and a book that I consider a MUST is The Hitchhiker’s Guide to Python, which can be accessed for free here.

And Now What?

The problem of writing good code, although found in several IT fields and mainly among beginners, is something that has been discussed a lot in the field of Data Science. Much of this is due to the wide array of academic disciplines a data scientist is exposed to, which can leave gaps in some of the skills required for writing clean, high-quality code (principles of software engineering, paradigms, clean code, testing, logging, etc.). Despite this problem, we still have time to solve it. Let’s help ourselves and others adopt new coding habits!

I hope this article serves as a guide for those who want to improve not only their own code but also to lift their whole team along with them!