Discover New and Exciting Voices on TDS

TDS Editors
Towards Data Science
3 min read · Oct 20, 2022


Few things thrill us more than connecting readers with the work of new TDS authors—our team reviews dozens of articles every week to find the best ones to share with you. Today, we celebrate our newest cohort of contributors: the articles in this edition of the Variable are all by data science writers we’ve welcomed in the past few weeks. And they’re all worth your time.

The lineup of recommended reads you’ll find here covers a diverse range of topics, approaches, and perspectives, and the people behind it come from just as wide a range of backgrounds and career paths. You’re bound to learn something new along the way. (And if browsing these posts inspires you to join our merry community of authors, we’d love to hear from you.)

  • So many language models, so little time. With the proliferation of large language models (LLMs) like BERT, LaMDA, and GPT-3, machine learning practitioners face an embarrassment of riches. Janna Lipenkova’s overview will help you find, select, and deploy the right one for your next NLP project.
  • How to make a scikit-learn pipeline your own. Talia Shrem joined the ranks of TDS authors just a few days ago with a memorable debut post. It offers a clear, detailed walkthrough of the process of customizing your data pipelines and creating a bespoke transformer.
  • The lofty pursuit of snowfall prediction. Bridging the gap between machine learning and geoscience, Fraser King (with coauthors George Duffy, Lisa Milani, Christopher G. Fletcher, Claire Pettersen, and Kerstin Ebell) presents DeepPrecip, a deep convolutional neural network that uses ground-radar data inputs to predict surface precipitation quantities.
  • Molecules, say hello to graph neural networks. Yunchao "Lance" Liu (刘运超) shares another groundbreaking research project (coauthored with Yu Wang, Oanh Vu, Rocco Moretti, Bobby Bodenheimer, Jens Meiler, and Tyler Derr), focused on the modeling of molecules as graphs, with the aim of facilitating future drug discovery.
  • Bolstering mental health services with data. For data scientists and analysts in the public-health sector, finding and making sense of available data can be an uphill battle. Sebastien Peytrignet recently shared a practical guide for using NHS mental health data, which you might find useful even if you’re not in the UK.
  • Variations on the Bayes classifier, unpacked. If you’re just learning how to work with classical classification methods (say it out loud: it’s fun!), nobody will hold it against you if you find it a bit confusing to distinguish between Gaussian Naive Bayes (GNB), Linear Discriminant Analysis (LDA), and Quadratic Discriminant Analysis (QDA). Francesca Argenziano’s clear explainer will set you on the right path.
  • Get familiar with anonymization. Working with personally identifiable information (PII) and navigating the increasingly complex regulatory landscape around it has become a required skill for many data practitioners. Lingzhen Chen wrote a helpful introduction to the topic, bringing together theory, best practices, and recommended resources.
  • Applying the right lessons across disciplines. Caitlin Ray’s career has straddled both data science and machine learning engineering, and transitioning between the two fields has left her with some sharp insights on problem-solving, storytelling, and the importance of simplicity.
  • Digging deep into bipartite graphs. As a trained archaeologist, James Scott Cardinal has experienced firsthand the challenges of solving the “big jigsaw puzzle” of excavation sites. Read James’s first TDS article to see how data science methods can have an outsized impact on a seemingly distant discipline.
  • An accessible approach to causality. Answering causal questions and becoming effective at inference are two essential skills for data professionals. Eden Zohar explores inverse propensity weighting (IPW) as one powerful way to determine the relation between a treatment and an outcome.

Thank you, as always, for your support and for the time you spend with our authors’ work. If you’d like to show your support in other meaningful ways, consider becoming a Medium member.

Until the next Variable,

TDS Editors


