A Machine Learning Engineer’s Must-Have Tools

From both technical and productivity aspects

Ceren Iyim
Towards Data Science


Photo by JESHOOTS.COM on Unsplash

I changed my career from SAP Consultant to Data Scientist roughly four years ago. After following a curriculum I designed for myself, I secured a Machine Learning Engineer role in the startup ecosystem within a year.

It isn’t easy to condense all that I’ve learned, the tools I’ve used, and the experiences I’ve had during these past four years into a single article. Nonetheless, I’ll highlight the ones that have particularly benefited me.

As I progressed in my role, I cultivated software development skills by using a variety of tools and following certain best practices on the job:

  1. Git and Version Control
  2. Writing Readable and Clean Code
  3. Exploring Different Development Tools

I will not only explain them in this article but also mention how they helped me improve my soft skills and productivity. Let’s get started 🚀

Git and Version Control

Git is an open-source version control system widely used in software development. It organizes projects and manages collaboration between developers working on the same project. I was not using Git when I was working solo; I was manually versioning my code and notebooks 🙃

When collaboration comes into play, Git becomes a necessity. It helps to track the progress of the project and fosters collaboration.

Git is a vast topic, and there are great sources out there for learning it (like this one). Today, I will focus on the concept of a “commit” and how it helped me organize my thought process.

A Git commit is like taking a snapshot of your code at a point in time.

One of the first things I learned in my early days was to keep my Git commits organized and my commit messages concise.

Later on, I realized that thinking beforehand about your commits and how to structure them also helps you to organize your work and design your code with a better logical pattern.

Here is an example of how Git commits can be organized in the context of Data Science from one of my recent projects:

Image by author

From the collaboration perspective, breaking down commits so that each one serves a single purpose will also help your colleagues review your code faster.
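As a rough illustration of "concise, single-purpose" commit messages, here is a minimal Python sketch. The `check_commit_subject` helper, the 50-character limit, and the "and" heuristic are all assumptions chosen for this example; they are common conventions, not rules enforced by Git, and the sample messages are hypothetical.

```python
def check_commit_subject(subject: str, max_length: int = 50) -> list[str]:
    """Return a list of style problems found in a commit subject line.

    Hypothetical helper: the checks below are conventions, not Git rules.
    """
    stripped = subject.strip()
    if not stripped:
        return ["subject is empty"]

    problems = []
    if len(stripped) > max_length:
        problems.append(f"subject is longer than {max_length} characters")
    if stripped.endswith("."):
        problems.append("subject ends with a period")
    # An "and" in the subject often hints that the commit does several things.
    if " and " in stripped.lower():
        problems.append("subject contains 'and'; consider splitting the commit")
    return problems


# Hypothetical commit messages, invented for illustration.
focused = "Add train/test split for churn dataset"
unfocused = "Fix scaler bug and add plots and refactor everything in the notebook."

print(check_commit_subject(focused))
print(check_commit_subject(unfocused))
```

A check like this could run in a pre-commit hook, but even applied mentally it nudges you to plan one purpose per commit before you start coding.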

Writing Readable and Clean Code

My previous mentor once told me, “Even your grandma should be able to read your code.” A disclaimer: this was not meant from an ageist perspective, but rather to imply that everyone should be able to read and understand your code easily.

Jokes aside, reflecting your thought process in your code and writing self-explanatory code will help anyone review your work and understand it faster.

I learned how to craft readable and clean code both on the job and by reading several well-known industry books:

  • Clean Code: A Handbook of Agile Software Craftsmanship by Robert C. Martin
  • A Philosophy of Software Design by John Ousterhout

“So if you want to go fast, if you want to get done quickly, if you want your code to be easy to write, make it easy to read.”

Robert C. Martin.

and an implementation of this practice from my recent work:

Believe me, your future self will thank you for writing readable and clean code, not only your team!
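A minimal, hypothetical sketch of the difference readability makes. The function names, the threshold logic, and the sample scores are invented for illustration and are not taken from any real project. Both functions behave identically; only the second one explains itself.

```python
# Harder to read: the names say nothing about intent.
def f(d, t):
    return [x for x in d if x > t]


# Easier to read: names, type hints, and a docstring carry the intent.
def filter_scores_above_threshold(
    scores: list[float], threshold: float
) -> list[float]:
    """Keep only the scores strictly greater than the threshold."""
    return [score for score in scores if score > threshold]


# Hypothetical model scores, invented for this example.
model_scores = [0.2, 0.9, 0.5]
confident_scores = filter_scores_above_threshold(model_scores, threshold=0.5)
print(confident_scores)
```

A reviewer can understand the second version from its signature alone, without tracing the body or asking what `d` and `t` stand for.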

Exploring Different Development Tools

When it comes to experimenting with machine learning models or prototyping solutions to test their viability, using notebooks is often the first choice.

And Jupyter notebooks are a great medium for that.

Before landing the Machine Learning Engineer role, I primarily worked on my projects using Jupyter notebooks. Because my previous team required PyCharm, that is where I met my first integrated development environment (IDE).

As a Data Scientist who mostly built one-off notebook solutions, I felt a bit overwhelmed by PyCharm’s user interface and its numerous functionalities.

Over time, PyCharm became second nature to me.

Its code completion and error-highlighting features have become indispensable helpers and have contributed a lot to my productivity.

Mastering Git within IDEs has significantly improved my ability to organize work, boosted my coding speed, and enabled me to focus on the task at hand more efficiently. Also, I’m always grateful to myself for writing clean code when I revisit it months later 😉

Thank you for reading! This blog post holds a special place for me. It signifies my return to blogging and technical writing 🤗

For comments or constructive feedback, you can reach out to me via the responses, Twitter, or LinkedIn!
