
Behavioural Code Analysis 101: Leveraging Forensic Techniques

Technical debt isn’t technical, and static code analysis alone is not nearly enough to find it.

Technical debt is much more than technical stuff, and behavioural analysis comes to the rescue by uncovering much of what creates it.

Behavioural Code Analysis is to Static Code Analysis as Cosmology is to Astronomy. It deals with the macroscale, the universe of the codebase – from its beginning to the current state, without forgetting the organizational apparatus behind the scenes.


Behavioural Code Analysis is about the macro. About the big-picture. Photo by Greg Rakozy on Unsplash

I’ve used Sonarqube to help measure the technical debt in the React project. As a static analysis tool, it reports back a list of the files with the highest technical debt. The worst one is renderer.js, with nine issues, including duplication and useless assignments. But is this really technical debt, and is it worth refactoring?

So my point here is that static analysis shouldn’t be the north star guiding our refactoring journey. It only tells us the symptoms our codebase has, not the disease. Terrible-looking code, once refactored, may easily return to its previous state if you don’t know why it was in that state in the first place.

Metrics such as the number of contributors, frequency of change, and change coupling are fundamental in revealing whether and why the code is rotting. You can’t just analyze file after file as individual pieces, as Sonarqube does; you have to analyze them as a network, through behavioural analysis.


Table Of Contents

  • Technical Debt Is Not Technical
  • Behavioural Code Analysis Techniques
  • Organizational Apparatus

Technical Debt Is Not Technical 😕

"A mess is not technical debt. It has no chance of paying in the future. A mess is always a loss." by Robert Martin.

Technical debt, just like financial debt, becomes more expensive over time unless we keep paying it down. We take it on as survival behaviour, a bet, to guarantee our software meets short-term business expectations. The downside is a handicapped implementation, one not built in a way that makes it easy to integrate new features in the future.

The more technical debt we have, the closer we get to technical bankruptcy, a state where integrating new features takes so stupidly long that they are no longer worthwhile.

Relationship between technical debt and project life-cycle.

Be mindful of Lehman’s Law of Continuing Change: without new features, your software becomes less useful. So how in the hell are useless assignments technical debt? Well, we can’t know just by looking at the code; we have to look at the version-control history, which tells us how hard this file has been to maintain.


Behavioural Code Analysis Techniques

Research shows the number of changes to a file is a good indicator of declining quality, which results in bugs. And why wouldn’t it be? If a class changes too frequently, perhaps it has too many responsibilities, or it’s poorly understood. Besides, the places where we work the most are where refactoring will be most useful.

Hotspots

Adam Tornhill calls them hotspots. React has a number of them; ExhaustiveDeps.js is one. These are potential refactoring targets, a breeding ground for buggy code. They are the result of code that is both complex and active.

Hotspots Of React Displayed Through Kibana.
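To make the idea concrete, here is a minimal sketch of how a hotspot ranking could be computed. It assumes the change history comes from `git log --name-only --pretty=format:` and uses lines of code as a crude stand-in for complexity. The file names, the line counts, and the choice of multiplying revisions by size are illustrative assumptions, not the exact scoring used by any particular tool.

```python
from collections import Counter

# Hypothetical output of: git log --name-only --pretty=format:
# One touched path per line, blank lines between commits.
SAMPLE_LOG = """
src/a.js
src/b.js

src/a.js

src/a.js
src/c.js
"""

# Hypothetical line counts; in practice a tool such as cloc could provide them.
SAMPLE_LOC = {"src/a.js": 400, "src/b.js": 900, "src/c.js": 50}


def change_frequencies(git_log_output: str) -> Counter:
    """Count how many commits touched each file."""
    return Counter(
        line.strip() for line in git_log_output.splitlines() if line.strip()
    )


def hotspots(revisions: Counter, lines_of_code: dict[str, int], top: int = 5):
    """Rank files by revisions * lines of code (an assumed, crude score)."""
    scores = {
        path: revisions[path] * loc
        for path, loc in lines_of_code.items()
        if revisions[path] > 0  # files that never changed are not hotspots
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)[:top]


revisions = change_frequencies(SAMPLE_LOG)
for path, score in hotspots(revisions, SAMPLE_LOC):
    print(f"{path}: {score}")
```

A large file that nobody touches drops out of the ranking, which is the point: only code that is both big and busy is worth a closer look.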

Complexity Trend

Then, we can start to determine the complexity trend of each potential refactoring target. Just like volcanoes, hotspots can be categorized as active (an increasing complexity trend), dormant (a constant complexity trend), or extinct (a complexity trend that has started to decrease). While the file ExhaustiveDeps.js is dormant, there are others, like ReactFiberWorkLoop.old.js, that are active. If left unattended, they will keep growing until they become unmaintainable.
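The volcano classification can be sketched in a few lines. The example below uses indentation depth as a language-agnostic proxy for complexity and compares the first and last measurements of a file’s history. The 10% tolerance and the function names are assumptions made for illustration; the historical versions of a file could be obtained with `git show <revision>:<path>`.

```python
def indentation_complexity(source: str, spaces_per_level: int = 2) -> int:
    """Sum the indentation levels of all non-blank lines (a rough proxy)."""
    total = 0
    for line in source.splitlines():
        if not line.strip():
            continue  # blank lines carry no complexity
        expanded = line.expandtabs(spaces_per_level)
        leading_spaces = len(expanded) - len(expanded.lstrip(" "))
        total += leading_spaces // spaces_per_level
    return total


def classify_trend(history: list[float], tolerance: float = 0.10) -> str:
    """Label a complexity history as 'active', 'dormant', or 'extinct'.

    The history is ordered from oldest to newest. The tolerance is an
    assumed threshold: relative changes smaller than it count as constant.
    """
    if len(history) < 2:
        return "dormant"
    baseline = max(history[0], 1)  # avoid dividing by zero
    change = (history[-1] - history[0]) / baseline
    if change > tolerance:
        return "active"
    if change < -tolerance:
        return "extinct"
    return "dormant"


# Hypothetical snapshots of the same file at three points in time.
snapshots = [
    "f()\n  g()\n",
    "f()\n  g()\n    h()\n",
    "f()\n  g()\n    h()\n      i()\n",
]
history = [indentation_complexity(text) for text in snapshots]
print(history, classify_trend(history))
```

Comparing only the endpoints is deliberately simple; a fitted slope over all revisions would be less sensitive to a single unusual commit.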

Change Coupling

ReactFiberWorkLoop.old.js has 2759 lines of code and more than 50 functions. Before trying to grasp what all of its functions do, we can apply another technique to understand the dynamics behind their evolution: Change Coupling. It’s a sort of shared logic between two functions or files, in which changing one of them likely means having to change the other. Notice that this is different from the usual "coupling" in software engineering because it’s invisible, implicit in the code. We can only notice it by looking at the changes made over time. Right away, a significantly high coupling of 47% stands out between the functions commitRootImpl and prepareFreshStack. Thus, this pair becomes our first refactoring candidate.

A high degree of coupling isn’t necessarily wrong. In fact, it’s expected between unit tests and the classes they test. In other contexts, though, we have to start asking ourselves whether the coupling is easy to perceive. For instance, by following the Principle of Proximity and grouping functions that change together, we convey information that cannot be expressed through code alone.
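As a rough illustration of how such a percentage can be derived, the sketch below counts how often two entities appear in the same commit and divides that by the average number of commits of the pair. This is one common definition, assumed here; the 47% above may have been computed differently. The commit data and file names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def change_coupling(commits: list[set[str]], min_shared: int = 2):
    """Return (a, b, shared_commits, coupling_percent), strongest first.

    Coupling is taken as shared commits divided by the average number of
    commits of the two entities, an assumed definition.
    """
    revisions = Counter()
    shared = Counter()
    for changed in commits:
        revisions.update(changed)
        # Sorting makes (a, b) and (b, a) count as the same pair.
        shared.update(combinations(sorted(changed), 2))

    results = []
    for (a, b), together in shared.items():
        if together < min_shared:
            continue  # a single co-change is too weak a signal
        average = (revisions[a] + revisions[b]) / 2
        results.append((a, b, together, round(100 * together / average, 1)))
    return sorted(results, key=lambda row: row[3], reverse=True)


# Hypothetical history: each set holds the entities changed in one commit.
commits = [
    {"a.js", "a.test.js"},
    {"a.js", "a.test.js", "b.js"},
    {"a.js"},
    {"b.js"},
    {"a.js", "a.test.js"},
]
for a, b, together, percent in change_coupling(commits):
    print(f"{a} <-> {b}: {together} shared commits, {percent}%")
```

The same function works at function level if each set holds function names instead of file paths, provided the history has been broken down that far.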

Altogether, techniques such as Hotspots, Complexity Trend, and Change Coupling give us a different perspective from the one Sonarqube initially gave us. This is because the paradigm behind Behavioural Analysis is different from Sonarqube’s: not all code is equally important from a maintenance perspective. Just because some code is bad doesn’t mean it’s a problem. The code we work on the most matters dramatically more.


Organizational Apparatus – the problem isn’t technical, it’s a social one.

There will always be technical debt. What matters is how we deal with it. Mother Nature has long taught us to adapt and learn to live with tomorrow’s unforeseen normal. And that is where the root cause lies, the Normalization of Deviance:

The gradual process through which unacceptable practice or standards become acceptable. As the deviant behavior is repeated without catastrophic results, it becomes the social norm for the organization.

Fix The Cause, Not The Symptom

In the software domain, this means that the moment we accept and continue to work with unusual functions, systems, or behaviours, they become the new normal, and future deviations become yet another new normal, again and again. This phenomenon knows no borders. Even NASA, where a single mistake may cost lives, has experienced it several times: Challenger, Columbia. Think about it: if this phenomenon has happened within NASA’s organizational culture, why in the hell wouldn’t it happen in our day-to-day projects? Nevertheless, the developer is the one to blame, because no one else is expected to stand up for good-looking, maintainable code.

"The extent to which you can implement new features without calling a grand staff meeting is the ultimate test of an architecture’s success." by Adam Tornhill.

The software industry’s workforce has kept growing since the 70s, and 20% growth worldwide is expected in the next four years, meaning that development is becoming more and more of a team effort.

Nevertheless, sociology has shown that the more members a team has, the more difficult it is to coordinate and the more likely communication gaps become, a cost known as Process Loss. Furthermore, we are more susceptible when working in groups, because our values and decision-making are influenced, whether we notice or not, by the group’s behaviour. This phenomenon is commonly known as the Bystander Effect and is explained through two main social aspects.

  • Pluralistic Ignorance. A group state in which a norm is publicly accepted but privately rejected. It happens because each person feels like the only one thinking that way and is afraid of being ostracized, so everyone fails to act. It’s common in relationships, in classrooms, where students are afraid to voice their doubts, and even in emergencies, where victims have a better chance of survival if a single bystander, rather than a crowd, is present.
  • Diffusion Of Responsibility. The state of feeling less responsible and less accountable in large groups. Responsibility is diluted from the individual to the group, but if everyone feels this way, no one takes on that responsibility.

The problem isn't technical, it's a social one. Photo by Javier Allegue Barros on Unsplash

More than focusing on what to refactor, we shouldn’t underestimate the organizational apparatus. Studies have shown that practising code review and having a principal maintainer have positive effects on code quality. The problem isn’t technical, it’s a social one. We shouldn’t rely only on ourselves because we’re subjective, biased, and adaptive. But we can make use of unbiased, objective heuristics derived from Behavioural Analysis that tell us what’s happening behind the scenes.
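One such heuristic, sketched below under simplifying assumptions, is to count how many authors have touched each file and how much of its history belongs to the most active one. A file with many authors and no clear principal maintainer is a candidate for the diffusion of responsibility described above. The author names and files are hypothetical; real pairs could be extracted from a git log that prints the author next to each changed file.

```python
from collections import Counter, defaultdict


def ownership(changes: list[tuple[str, str]]) -> dict[str, tuple[int, str, float]]:
    """Map each file to (author count, main author, main author's share).

    `changes` holds one (author, path) pair per file touched in a commit.
    The share is the fraction of the file's commits made by its main author.
    """
    authors_per_file = defaultdict(Counter)
    for author, path in changes:
        authors_per_file[path][author] += 1

    summary = {}
    for path, authors in authors_per_file.items():
        main_author, main_commits = authors.most_common(1)[0]
        share = round(main_commits / sum(authors.values()), 2)
        summary[path] = (len(authors), main_author, share)
    return summary


# Hypothetical change records.
changes = [
    ("alice", "core.js"),
    ("alice", "core.js"),
    ("bob", "core.js"),
    ("carol", "core.js"),
    ("bob", "utils.js"),
]
for path, (count, main_author, share) in ownership(changes).items():
    print(f"{path}: {count} authors, main author {main_author} ({share:.0%})")
```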


My learning path in Behavioural Code Analysis started with Adam Tornhill’s books, Your Code As A Crime Scene and Software Design X-Rays. It was a journey in which I learned much more about the Social Side Of Code and how powerful version-control tools are beyond tracking the code itself. Finally, it led me to develop a new open-source tool that implements most of the techniques outlined above.

pbmiguel/behavioural-code-analyser

