What Does Following “Best Practices” Mean, Anyway?

TDS Editors
Towards Data Science
3 min read · Apr 13, 2023


When new tools, research, and buzzwords arrive on the scene as quickly as they have in recent years, terms like “state of the art” and “best practices” can seem meaningless. Last month’s cutting-edge model might already feel outdated; workflows that made perfect sense for your last project can become obsolete by the time you launch your next one.

Sure: the constant push towards newer, better, and shinier things can generate excitement. It also has a tendency to induce anxiety and a fear of falling behind—and those are ingredients we don’t recommend adding to your data science learning stew. Instead, we love reminding ourselves (and anyone else who’ll listen) that gradual, cumulative improvement is more than enough for sustainable professional growth.

The articles we chose for you this week reflect the very same spirit — they focus on optimizing specific areas in your day-to-day tasks and developing robust habits. Enjoy!

  • How to set your data team up for success. Given the iterative (and occasionally repetitive) nature of data science work, Rebecca Vickery rightly reminds us that “it is highly important that whether you are a team of one data scientist working alone, or a large team working together, you develop a set of best practices.” She goes on to propose six ideas to help you formulate the right ones for your needs.
  • Why settle for mediocre plots? For better or for worse, people can’t read your mind—which means that even the best data analysis won’t make much of an impact if you present it in the form of hard-to-decipher visualizations. Aruna Pisharody comes to the rescue with a thorough, hands-on guide for producing publication-ready plots with LaTeX.
  • More effective experimentation is within reach. Eryk Lewinson’s latest contribution explores the use of DVC’s Visual Studio Code extension to transform your IDE into a machine learning experimentation platform. Here, Eryk demonstrates how you can monitor model performance and evaluate experiments with interactive plots.
  • Reap the benefits of streamlined code. “Companies and employers prefer optimized code that can easily scale and allow new developers to get on board quickly,” says Susan Maina. But how can you tell if your code is optimized? Susan’s tutorial walks us through Python’s magic commands and how to use them to test your code’s efficiency.
  • On continuous integration and how to implement it. For a clear and accessible primer on CI and its power to prevent failures and pain points in your machine learning pipelines, head right over to Khuyen Tran’s well-illustrated, step-by-step tutorial, which includes all the code you’ll need to get started.
  • Apply powerful lessons from a neighboring discipline. Prioritization is a perennial challenge for data professionals, who often have to juggle the competing needs of developers, marketers, business analysts, and others. Brian Roepke leans into the wisdom of successful product managers and shares insights that can help data scientists in their decision-making process.

Is that learning stew we mentioned earlier still simmering? We hope so — there are still some excellent reads we don’t want you to miss:

Thank you for spending time with us this week! If you enjoy the articles you read on TDS, consider becoming a Medium member. It’s a particularly great time to join for students, as many of you can now enjoy a substantial discount on a membership.

Until the next Variable,

TDS Editors


Building a vibrant data science and machine learning community. Share your insights and projects with our global audience: bit.ly/write-for-tds