In just a few short months, large language models moved from the realm of specialized researchers into the everyday workflows of data and ML teams all over the world. Here at TDS, we’ve seen how, along with this transition, much of the focus has shifted into practical applications and hands-on solutions.
Jumping straight into tinkering mode can make a lot of sense for data professionals working in industry—time is precious, after all. Still, it’s always a good idea to establish a solid grasp of the inner workings of the technology we use and work on, and that’s precisely what our weekly highlights address.
Our recommended reads look both at the theoretical foundations of LLMs—specifically, the GPT family—and at the high-level questions their arrival raises. Even if you’re just a casual user of these models, we think you’ll enjoy these thoughtful explorations.
- The transformers architecture is the groundbreaking innovation that made GPT models possible in the first place. As Beatriz Stollnitz makes clear, "understanding the details of how they work is an important skill for every AI practitioner," and you’ll leave her thorough explainer with a crystal-clear idea of transformers’ power.
- Lily Hughes-Robinson offers a different approach for learning about transformers: one that focuses on the source code so you can build your knowledge intuitively from the ground up.
- How important is size when it comes to LLMs’ performance? Gadi Singer dives into this question in great detail as he surveys the latest crop of compact generative AI models. These contenders aim to compete with GPT-4 in accuracy, but at a lower cost and with greater potential for scalability.
- Of all the heated debates surrounding ChatGPT and similar tools, perhaps none has been more contentious than the question of LLMs’ supposed intelligence. Lan Chu tackles this topic head-on, and brings a refreshingly measured and pragmatic perspective to the conversation. (Spoiler alert: no, AI isn’t conscious; yes, it’s complicated.)
- "So, how can we move beyond perceiving LLMs like ChatGPT as magical black boxes? Physics may provide an answer." Tim Lou, PhD‘s latest article proposes a thought-provoking idea: that the equations that make language models tick are analogous to the laws of physics and the way they govern particles and forces.
We published so many fantastic articles on other topics in recent weeks; here are just a few we absolutely had to highlight.
- Who says summer reading has to be lightweight fluff? Our August Edition brings together an impressive collection of engaging, enlightening, and heat-proof posts.
- The missing ingredient in your marketing strategy might just be machine learning, says Elena K., whose debut TDS story is full of actionable tips and tricks.
- If you’re in the mood for another business-focused topic, you’re in luck: Matteo Courthoud is back with a new contribution that focuses on the interaction between churn and revenue.
- Turning back to the more practical side of working with LLMs, Felipe de Pontes Adachi outlines seven tactics for monitoring their behavior to ensure consistent performance.
- Anna Via’s new post encourages industry data practitioners to take a step back before launching an ML-centered project and to ask if a machine learning model is even necessary for the problem at hand.
Thank you for supporting our authors! If you enjoy the articles you read on TDS, consider becoming a Medium member – it unlocks our entire archive (and every other post on Medium, too).
We hope many of you are also planning to attend Medium Day on August 12 to celebrate the community and the stories that make it special – registration (which is free) is now open.
Until the next Variable,
TDS Editors