With machine learning and AI research making strides daily, it can often feel like all the cool, innovative frontiers of the field are already spoken for. That’s especially the case if you’re just getting started. This week, we’d like to remind you (and ourselves, too) just how much room there still is to learn, grow, and improve in these areas. If you need a nudge to keep going or a dose of inspiration to help you launch your next project, here you go!
- Get familiar with three promising machine learning challenges. Yes, large language models and computer vision are more powerful than we could’ve ever imagined just a few years ago. But, as Vincent Vanhoucke makes clear in his recent post, there’s still so much to do. From the inverse video-game problem to the next phase of reinforcement learning, Vincent introduces the three directions he thinks the field will focus on in the coming months and years.
- Discover new ways to leverage ML and NLP to support mental-health service providers. COVID-19 has made the job of mental health professionals even tougher than usual: more people are experiencing anxiety and stress, and in-person connections are harder (if not impossible) to plan. Tiffany Meshkat and her colleagues at the Crisis Text Line used natural language processing methods to analyze the text messages of teenagers, allowing them to screen and triage the people reaching out as well as to detect patterns in the issues they brought up. (For a flavor of how text-based triage can work, see the first sketch after this list.)
- Learn about the future of open-ended reinforcement learning. Reinforcement learning (the subfield of ML in which an agent, motivated by rewards, is tasked with figuring out an environment and its rules) has made impressive progress in recent years. On a recent episode of the TDS Podcast, Jeremie Harris and his guest, DeepMind’s Max Jaderberg, discussed what’s next and how AI agents might soon be able to win games they had never encountered before. (A toy version of the agent-environment loop appears in the second sketch after this list.)
- Read new research that could revolutionize microbiological analysis. Using deep learning methods, Sylwia Majchrowska and Jarosław Pawłowski attempted to radically reduce the time it takes to identify and count microorganisms in Petri dishes. Their post walks us through their process, shares their results, and points to the promise of developing this approach further.
- Explore the problem of sustainable deep learning models. With massive models come massive costs—both financial and environmental. In his latest article, Intel Labs’ Gadi Singer reflects on the challenges of continued scientific and technological progress. Gadi proposes that companies and practitioners focus on a "tiered access structure," one that can empower us to "increase the capabilities and improve the results of AI technologies while minimizing power and system cost."
- Add hashing to your data science toolkit. If you’re in the mood for more hands-on tips and tricks this week, we won’t leave you empty-handed! Konstantin Kutzkov’s guide discusses machine learning techniques for the design of data-specific hash functions and walks you through their applications. (The last sketch after this list shows one simple data-dependent hashing scheme.) Enjoy!
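For readers curious what text-based triage might look like in code, here is a minimal, hypothetical sketch using scikit-learn. To be clear, this is not Crisis Text Line’s actual pipeline: the messages, labels, and model choice below are invented purely for illustration.

```python
# Illustrative only: a toy text-triage model, NOT Crisis Text Line's system.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical messages paired with invented urgency labels.
messages = [
    "i can't stop worrying about everything",
    "i feel completely hopeless and alone",
    "school is stressful but i'm managing okay",
    "nothing matters anymore and i want to disappear",
]
labels = ["moderate", "high", "low", "high"]

# TF-IDF features feeding a simple linear classifier.
triage = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),
    LogisticRegression(max_iter=1000),
)
triage.fit(messages, labels)

# Score incoming texts so the most urgent conversations surface first.
incoming = ["everything feels pointless lately"]
print(triage.predict(incoming))        # predicted urgency tier
print(triage.predict_proba(incoming))  # class probabilities, useful for ranking
```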
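And here is the reward-driven loop from the reinforcement-learning item above, boiled down to a toy example: tabular Q-learning on a five-state corridor. This is a minimal sketch of the general idea, not anything from the episode itself.

```python
# A toy version of the agent-environment loop: tabular Q-learning on a
# five-state corridor. Illustrative only.
import random

N_STATES, GOAL = 5, 4            # states 0..4; the episode ends at state 4
ACTIONS = [-1, +1]               # move left or move right
alpha, gamma, epsilon = 0.5, 0.9, 0.1
Q = [[0.0, 0.0] for _ in range(N_STATES)]  # one value per (state, action)

for episode in range(200):
    state = 0
    while state != GOAL:
        # Epsilon-greedy: usually exploit the best-known action, sometimes
        # explore (and break ties randomly early on, when all values are 0).
        if random.random() < epsilon or Q[state][0] == Q[state][1]:
            action = random.randrange(2)
        else:
            action = 0 if Q[state][0] > Q[state][1] else 1
        next_state = min(max(state + ACTIONS[action], 0), N_STATES - 1)
        reward = 1.0 if next_state == GOAL else 0.0
        # Q-learning update: nudge the estimate toward the observed reward
        # plus the discounted value of the best follow-up action.
        Q[state][action] += alpha * (
            reward + gamma * max(Q[next_state]) - Q[state][action]
        )
        state = next_state

print(Q)  # moving right (action index 1) should dominate in every state
```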
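Finally, a taste of the hashing item: one simple way to make a hash function data-specific is to learn projection directions from the data (here via PCA) and binarize the projections. This sketch is our own illustration of the idea, not code from Kutzkov’s guide, which goes considerably further.

```python
# A simple data-dependent hash: learn projection directions from the data
# via PCA, then binarize. A classic baseline, not the guide's own code.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 32))  # toy dataset: 1,000 points in 32 dimensions

def fit_pca_hash(X, n_bits=8):
    """Learn n_bits hashing directions: the top principal components."""
    mean = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:n_bits]

def hash_points(X, mean, directions):
    """Binary code per point: the sign of each learned projection."""
    return (X - mean) @ directions.T > 0  # shape (n_points, n_bits)

mean, directions = fit_pca_hash(X)
codes = hash_points(X, mean, directions)

# Nearby points tend to land in the same bucket, so the codes can serve as
# a coarse nearest-neighbor index: search only within the query's bucket.
query = X[0] + 0.01 * rng.normal(size=32)
q_code = hash_points(query[None, :], mean, directions)[0]
print((codes == q_code).all(axis=1).sum(), "points share the query's bucket")
```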
If you encountered something new and exciting in your work this week, we’d love to hear about it—leave a comment, or better yet: write a post about it. Thank you, as always, for supporting our authors’ work.
Until the next Variable, TDS Editors
Recent additions to our curated topics:
Getting Started
- The Art of Learning Data Science by Chanin Nantasenamat
- 5 Concrete Benefits of Bayesian Statistics by Renato Boemer
- To All Data Scientists – Don’t Live on the Edge, Have a Plan by Thushan Ganegedara
Hands-On Tutorials
- UMAP Dimensionality Reduction – An Incredibly Robust Machine Learning Algorithm by Saul Dobilas
- Determining the Optimal Pokemon Team for Pokemon Brilliant Diamond and Shining Pearl with Pulp by Jamshaid Shahir
- API Interaction with R by John Clements
- Time Series Forecast Error Metrics You Should Know by Konstantin Rink
Deep Dives
- Bias, Consistency, and Designing KPIs for Data-Driven Endeavors by Rohit Pandey
- Two Problems with the Random Walk by Xichu Zhang
- Python Scenario Analysis: Modeling Expert Estimates with the beta-PERT Distribution by Heiko Onnen
Thoughts and Theory
- Why Is the Closed Form of the Fibonacci Sequence Not Used in Competitive Programming? by Rohit Pandey
- The Shapley Value for ML Models by Divya Gopinath and David Kurokawa
- Dealing with Leaky, Missing Data in a Production Environment by Sam Wilson
- The Power of Constrained Language Models by Karel D’Oosterlinck