Which Data Science Skill Are You Looking to Level Up?
In fast-changing fields like data science and machine learning, adding new skills to your toolkit might sometimes feel overwhelming: how do you choose your next step? Do you focus on something practical and job-related, or expand your horizon with the latest research? Do you explore a brand-new area, or build on an existing interest?
While we can’t answer these questions for you, thanks to our community of authors we can offer options—exciting, diverse, and often unexpected ones. Here are a few we wanted to highlight this week.
- Learn about the benefits of different approaches to model optimization. Bradley Stephen Shaw’s hands-on tutorial covers the always-important topic of model optimization from a fresh angle. He compares the results of tuning hyperparameters in a LightGBM model using a Bayesian approach to those he gets from careful feature engineering, and reaches a surprising conclusion.
- Make your charts more compelling. We’ve all internalized the importance of visual storytelling in data science, but even if your charts and graphs are already crisp, there’s always more to explore. Parul Pandey shows us how to inject visualizations with an extra dose of personality, creating xkcd-style charts that look hand-drawn (without sacrificing readability).
- Ensure all stakeholders can make sense of your repos. Even the smartest and hardest-working data team can waste hundreds of hours if nobody can find the stuff they need for a given task. Lucy Rothwell is here to the rescue. She introduces a speedy shortcut that makes it possible to set up a common structure for any repo, empowering colleagues across teams to self-serve the data and code they require.
- Find a more effective way to master Python. We can only learn something for the first time once, but the mistakes we make along the way can be valuable for our own growth—and for others’. Nicholas reflects on his early struggles with Python, and shares the lessons he learned so that data scientists who are just starting out might have a smoother experience than his.
- Fill in the gaps in your time-series forecasting knowledge. Sooner or later, most data scientists are tasked with taking historical data and using it to predict future scenarios. Matt Sosna’s deep dive into the world of ARIMA (Autoregressive Integrated Moving Average) models first lays the necessary foundations, then patiently walks us through several common practical applications.
- Get inspiration from a neat crop-mapping project. Even if your interests lie nowhere close to agriculture, crops, or labor issues, there’s always a lot to learn from an end-to-end project that solves a complex problem with a mix of technical know-how and creativity. Case in point: Madeline Lisaius’s work at The Rockefeller Foundation, using satellite data to map hand-harvested crops to better detect changes in production.
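If the ARIMA acronym in the time-series item above feels abstract, here is a rough, dependency-free sketch of two of its pieces: the "integrated" step (first differencing) and a one-lag autoregressive fit, which together amount to a bare-bones ARIMA(1,1,0) without an intercept. It is not taken from Matt Sosna's article; the function names `difference`, `fit_ar1`, and `forecast` are our own illustrative choices, and the moving-average part is left out entirely.

```python
def difference(series):
    """First differences: the 'I' (integrated) step with d = 1."""
    return [b - a for a, b in zip(series, series[1:])]


def fit_ar1(values):
    """Least-squares estimate of phi in x[t] = phi * x[t-1] (no intercept)."""
    num = sum(prev * cur for prev, cur in zip(values, values[1:]))
    den = sum(prev * prev for prev in values[:-1])
    return num / den if den else 0.0


def forecast(series, steps):
    """Model the differences as AR(1), then undo the differencing."""
    diffs = difference(series)
    phi = fit_ar1(diffs)
    last_level, last_diff = series[-1], diffs[-1]
    predictions = []
    for _ in range(steps):
        last_diff = phi * last_diff   # next predicted change
        last_level += last_diff       # add it back onto the level
        predictions.append(last_level)
    return predictions


# Hypothetical toy series whose increments halve at every step.
print(forecast([0, 8, 12, 14, 15], steps=2))
```

For real work, an established library implementation with proper estimation and diagnostics is likely the better choice; the sketch only shows how differencing and autoregression fit together.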
We hope you found some time this week to learn something new or to expand your knowledge in an area you care about; if you’d like to teach the TDS community something you are an expert in, you really should—we’d love to hear from you.
Until the next Variable,
TDS Editors
Recent additions to our curated topics:
Getting Started
- The Four Maturity Levels of ML Production Systems by Samuel Flender
- Phik (𝜙k)—Get Familiar with the Latest Correlation Coefficient by Eryk Lewinson
- Staying Competitive with Linear Models by Dr. Robert Kübler
Hands-On Tutorials
- Create Synthetic Time Series with Anomaly Signatures in Python by Tirthajyoti Sarkar
- How to Deliver an AI Pipeline by Fernando Tadao Ito
- An Introduction to Polynomial Regression by Xichu Zhang
Deep Dives
- Can We Use Stochastic Gradient Descent (SGD) on a Linear Regression Model? by Wei Yi
- 7 Embarrassingly Easy Ways to Speed Up Your Python Program by Joseph Robinson, PhD
- Using pgfplots to Make Economic Graphs in LaTeX by Arnav Bandekar