To Grow Your Data Skills, Find a Passion Project
Whether you call it tinkering, hands-on learning, or “I’m not really sure what I’m doing, but it’s fun,” it can be extremely valuable to find a topic you’re passionate about, gather some relevant data, and see what happens next. This week, we’re highlighting four fantastic passion projects that will help you learn a few new tricks—and might inspire your creativity along the way, too.
- Can we leverage data to tell us which European football league is the most competitive? Like many a football (or, fine, soccer) fan, John Ade-Ojo would find himself in constant debates about the comparative merits of England’s Premier League, Italy’s Serie A, Germany’s Bundesliga, and so on. Unlike most fans, John could put his data science skills to good use to settle this discussion.
- What does data show us about the evolution of women’s representation in movies? Alison Yuhan Yao’s latest article is, first and foremost, a detailed, data-backed analysis of an ever-timely question: are movies getting better at representing women both on- and off-screen? It’s also a comprehensive and patient guide to collecting data via an API, then analyzing and visualizing it. These skills are crucial to develop regardless of the topic you’re working on.
- Going back to The Office to learn about logistic regression. Whether you are a Dunder-Mifflin-forever superfan or have never watched a single episode of The Office, Will Crowley’s new tutorial shows how any topic can become fun and engaging with the right framing. Here, Will uses the fictional (yet nonetheless legendary) paper company to explain the ins and outs of lead scores and binary logistic regression models.
- Learn about reinforcement learning through the mechanics of a dice game. As Thomas Dybdahl Ahle tells us in his fascinating debut post on TDS, Liar’s Dice is a deceptively simple game. Trying to teach an AI to play it pushed him to explore concepts like counterfactual regret minimization and to tackle technical challenges like serving PyTorch models in the browser.
We publish new and excellent hands-on tutorials every single day, so if you’re ever in the mood to browse around, check out our dedicated column where we collect some of the best ones.
As always, there are too many great reads to share with you here, but we couldn’t sign off without recommending some of our other recent highlights:
- After an exciting year for geometric and graph ML in 2021, Michael Bronstein and Petar Veličković spoke to leading experts to see what’s in store for the field in 2022.
- As the size of datasets used in deep learning continues to grow, practitioners face new challenges. Suneeta Mall shares a comprehensive overview of recent advances that allow data scientists and ML engineers to scale up their operations.
- We love deep dives that bring together data, machine learning, and environmental impact. Daniel García’s latest post discusses waste-classification techniques and his work to build a classifier that would help authorities improve their sorting and recycling efforts.
If you’ve worked on an exciting project recently—something that inspired you to push your craft and try new approaches—we’d love to hear about it (and so would our readers).
Thank you for your time and for your support, and until the next Variable,
TDS Editors