How to Communicate Clearly About Machine Learning.

Your choice of words matters.

Daniel Rothmann
Towards Data Science

--

It’s no secret that the field of machine learning is considered heavy on theory: math, statistics and computer science, mainly. While practitioners tend to enjoy the depth, colleagues and clients “outside the bubble” find it difficult to participate in the conversation. The result is bad communication.

Projects go bad when the communication goes bad.

Bad communication distorts expectations, disrupts planning and, in general, makes your team less efficient. As a practitioner, this is your problem too.

The Machine Learning Engineer Stereotype:

📊 Wrangler of Big Data™.

🧠 Wielder of ✨Magical Learning Algorithms✨.

🤔 Clever and curious.

😅 Difficult to talk to (if you’re not into ML).

Because what you’re doing seems complicated and unfamiliar, some of your collaborators won’t feel comfortable challenging you.

For this reason especially, it is on you to improve how you communicate about your work in order to break the “difficult-to-talk-to” stereotype.

Think Business Need.

The first step to communicating more clearly is to stop relating your work to Technical Need and begin relating it to Business Need instead.

Technical Need is your need to build a mathematically correct, statistically sound, technically robust, computationally efficient machine learning model.

That’s extremely important, but this information is not relevant to your non-technical collaborators. They care about Business Need: How is what you’re doing getting us closer to solving a practical business problem?

Say you’re working on a fraud detection system and your model has low precision, producing a lot of false positives. During a team meeting, you need to communicate your progress and plans for next steps.

The word “precision” doesn’t really mean anything to your colleagues, so don’t use it. Instead, you could say that your system is “raising a lot of false alarms”. Your next step might be to “adjust some parts of the system so that the rate of false alarms is reduced”.

This kind of language keeps your colleagues in the loop. It could also facilitate a helpful discussion about acceptable rates of false alarms.
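To make the translation concrete, here is a minimal Python sketch. It assumes you already have the counts from your confusion matrix; the function name `describe_alerts` and the numbers are made up for illustration. It keeps the internal metrics for you and produces the plain-language sentences for the meeting.

```python
def describe_alerts(true_positives: int, false_positives: int, false_negatives: int) -> dict:
    """Turn confusion-matrix counts into figures a non-technical audience can use."""
    alerts = true_positives + false_positives        # everything the system flagged
    actual_fraud = true_positives + false_negatives  # everything that really was fraud

    # Internal metrics: useful for you, not for the meeting.
    precision = true_positives / alerts if alerts else 0.0
    recall = true_positives / actual_fraud if actual_fraud else 0.0

    return {
        "precision": precision,
        "recall": recall,
        # Plain-language versions of the same information.
        "false_alarm_summary": f"{false_positives} of {alerts} alerts were false alarms",
        "missed_fraud_summary": f"{false_negatives} of {actual_fraud} fraud cases were missed",
    }


# Hypothetical counts, for illustration only.
report = describe_alerts(true_positives=50, false_positives=150, false_negatives=10)
print(report["false_alarm_summary"])   # 150 of 200 alerts were false alarms
print(report["missed_fraud_summary"])  # 10 of 60 fraud cases were missed
```

Both sentences carry the same information as the metrics, but in terms your colleagues can challenge and discuss.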

Make Meaningful KPIs.

Loss, accuracy, precision, recall, F1 scores, ROC curves and confusion matrices are great tools for measuring the performance of your model.

These metrics make a lot of sense when the performance of the model is a goal in itself, such as in research or prototypes to test if an idea is feasible.

That would be a rare goal for a real business case!

More often, you’re working with data to solve a business problem. To name a few examples, you might be trying to:

  • Automate a task to save on labor costs.
  • Help someone complete a task with fewer errors.
  • Improve product user experience by making smart suggestions.

Your work should be measured on the basis of the problem you’re solving.

When starting a new project, try to understand the problem from a Business Need perspective first. Which KPIs will accurately reflect if that need is being met? With that knowledge, you can make better decisions about the internal metrics you will need to work towards a solution.

Say you’re building a system to optimize a labor-intensive task so that costs can be reduced. Does it matter that your system is highly accurate if it actually makes the process slower? That wouldn’t really reduce costs.

A good KPI for this case could be something like Average Workflow Speed. Define a workflow and measure how long it takes. Measure the same workflow using your system (controlling for experience level). Did you make the process faster?
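As a rough sketch of how such a KPI could be computed in Python: the durations below are invented, and the function names are my own, not a standard. The idea is simply to time the same workflow with and without the system and compare the averages.

```python
from statistics import mean


def average_workflow_speed(durations_minutes: list[float]) -> float:
    """Average time to complete one run of the workflow, in minutes."""
    return mean(durations_minutes)


def speedup_percent(baseline: list[float], assisted: list[float]) -> float:
    """How much faster the assisted workflow is, as a percentage of the baseline."""
    base = average_workflow_speed(baseline)
    new = average_workflow_speed(assisted)
    return (base - new) / base * 100


# Hypothetical timings from people with comparable experience levels.
baseline_runs = [12.0, 15.0, 11.0, 14.0]  # without the system
assisted_runs = [9.0, 11.0, 10.0, 9.0]    # with the system

print(f"{speedup_percent(baseline_runs, assisted_runs):.1f}% faster")  # 25.0% faster
```

A negative result would mean the system slows people down, no matter how accurate the model is.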

You might learn that the model must reach a certain level of accuracy to be useful. Maybe you find out that the process is only really faster if the model is at least 75% accurate. If your goal is Accuracy, you’d likely try a model with higher capacity, increasing the computation cost. Instead, if your goal is Workflow Speed, you’ll find that there is a happy medium between a model that’s accurate enough and also fast to compute.
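Here is one way that trade-off might look in Python. This is a toy cost model under my own assumptions, not a general rule: every item costs the model’s compute time plus a quick human review, and every wrong prediction costs an extra manual correction. The candidate models and all timings are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    accuracy: float         # fraction of items the model gets right
    compute_seconds: float  # time the model needs per item


def expected_seconds_per_item(c: Candidate, review_seconds: float, correction_seconds: float) -> float:
    """Assumed cost model: compute + review, plus a correction whenever the model is wrong."""
    return c.compute_seconds + review_seconds + (1 - c.accuracy) * correction_seconds


# Hypothetical numbers, for illustration only.
MANUAL_SECONDS = 22.0      # doing the task entirely by hand
REVIEW_SECONDS = 5.0       # checking a model suggestion
CORRECTION_SECONDS = 60.0  # fixing a wrong suggestion

candidates = [
    Candidate("small", accuracy=0.70, compute_seconds=1.0),
    Candidate("medium", accuracy=0.80, compute_seconds=3.0),
    Candidate("large", accuracy=0.85, compute_seconds=10.0),
]


def cost(c: Candidate) -> float:
    return expected_seconds_per_item(c, REVIEW_SECONDS, CORRECTION_SECONDS)


# Optimizing for accuracy picks the biggest model...
most_accurate = max(candidates, key=lambda c: c.accuracy)
# ...while optimizing for workflow speed picks the cheapest overall workflow.
fastest_workflow = min(candidates, key=cost)
# Only some candidates actually beat doing the task by hand.
beats_manual = [c.name for c in candidates if cost(c) < MANUAL_SECONDS]

print(most_accurate.name)     # large
print(fastest_workflow.name)  # medium
print(beats_manual)           # ['medium']
```

With these made-up numbers, the most accurate model is too slow to help and the fastest model makes too many mistakes. Only the one in between speeds up the workflow.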

Settling on meaningful KPIs with your collaborators puts Business Need first and centers your communication around the actual problem you’re solving.

Use Consistent Terminology.

Using different words to describe the same thing can be a major source of confusion, making all communication less efficient.

When you start a project, you might want to hash these words out in collaboration with your team and your client.

For your project, try answering questions like:

What is a User? What is the System? What is a Model? What is the Pipeline? What is a Workflow? Make sure to write these definitions down.

Making everyone use the same words reduces the amount of communication that gets “lost in translation”. It does require some work up front, but the efficiency gained from the mutual understanding is worth it.

Keep It Simple, Silly.

Working with data isn’t always simple. In fact, information in data is often complex, nuanced and multi-faceted. But we should try to remember to KISS.

Now I love to KISS. This old acronym is often used in programming and it tells us to “Keep It Simple, Silly”.

Information Overload is a real thing, and it can be difficult to avoid when presenting a complex subject. Outputting a ton of information might make you look smart, but it’ll make your collaborators feel powerless.

The whole purpose of presenting your work to someone else is so that they can make a decision or take an action. If you overload them with information, they won’t be able to do their work effectively.

Coping with the many subtleties of data is your job. When communicating with your colleagues and clients, try focusing on which information is essential to the Business Need. It usually requires extra preparation to focus the information that way, so remember to allocate some time for it.

--

CTO @ Kanda. Technologist by trade and creator at heart. These are my thoughts on code, data, sound and beyond. I hope you find them useful.