MONTHLY EDITION

As you learn about and practice Data Science, it becomes harder and harder to avoid abstraction – often in the form of an algorithm coded in a particular programming language, or the mathematical characterization of an idea. What I’d like to highlight here is that data science and its component disciplines (statistics, machine learning, etc.) have origins very closely tied to our own way of perceiving and thinking about the world. I’d like to invite you to relate every data science principle or big idea you come across to your life: What are the parallels between a given learning algorithm and how you yourself learn? Whether it’s supervised, unsupervised, or even reinforcement learning, these paradigms and their underlying techniques need not be as abstract as we often make them.
One way to do this is to seek out opportunities to talk about data science concepts in a way a wide audience can relate to. In my experience as a data educator, I often find myself reaching for simple scenarios that capture a potentially daunting idea in data science.
For example: I often characterize the concept of generalization in the context of tennis (or any other one-on-one sport). Say an individual chooses to train at a single academy, exclusively with the same coach, without ever playing anybody else for several years. After a while, this individual starts beating their coach consistently and decides to crown themselves the best tennis player on Earth. How realistic is this title? Can we dare say this individual has overfit their training data? After all, after playing their coach so much, they’re bound to memorize their patterns and exploit them. What if we were then to have this individual play against a player from another academy? How likely is it that our player will beat their rival?
Even if you don’t practice a sport yourself, I’m sure you can weigh the merits of playing against a diverse set of opponents. You may train on repetitive motions and techniques, but the ability to strategize, react, and perform under pressure is best developed by continuously facing a diverse set of opponents as you test your abilities. This is what I characterize as good generalization in learning.
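If you prefer to see the same story in code, here is a minimal, hypothetical sketch of the tennis analogy (not from any of the articles below): a high-capacity model "memorizes" a small training set, the way our player memorizes a single coach, and then struggles on fresh data from outside that academy. The data, polynomial degree, and error metric are all assumptions chosen purely for illustration.

```python
import numpy as np

# Hypothetical illustration of overfitting vs. generalization.
rng = np.random.default_rng(0)

# A small "one coach" training set and a larger "other academies" test set,
# both drawn from the same underlying game (a noisy sine curve).
x_train = rng.uniform(-1, 1, 15)
y_train = np.sin(3 * x_train) + rng.normal(0, 0.1, 15)
x_test = rng.uniform(-1, 1, 200)
y_test = np.sin(3 * x_test) + rng.normal(0, 0.1, 200)

# A degree-12 polynomial has enough capacity to nearly memorize 15 points.
coeffs = np.polyfit(x_train, y_train, deg=12)

def mse(x, y):
    """Mean squared error of the fitted polynomial on (x, y)."""
    return np.mean((np.polyval(coeffs, x) - y) ** 2)

print(f"train MSE: {mse(x_train, y_train):.4f}")  # near zero: beats the coach
print(f"test  MSE: {mse(x_test, y_test):.4f}")    # much larger: loses to the rival
```

Running this, the training error is tiny while the test error is far larger, which is the quantitative face of the "best player on Earth" who has only ever beaten one coach.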
Go ahead and challenge yourself to explain abstract concepts in simple, yet meaningful and correct ways. Try to explain them to a friend or colleague and gauge their interest and agreement. Take this feedback yourself and strive to become a better data science communicator. After all, clear and engaging communication is one of the most overlooked skills of the modern data scientist!
Can you think of other concepts worth communicating more accessibly? Exploration vs exploitation? Bias-Variance tradeoff? Independence in probability? Regularization? Ensembles? Gradient Descent? Share your ideas down below!
Sources that might interest you:
- https://arxiv.org/pdf/1702.07800.pdf
- https://www.frontiersin.org/articles/10.3389/fevo.2020.00082/full
- https://link.springer.com/article/10.1007/BF02478259
- https://archive.org/details/in.ernet.dli.2015.226341
Sergio E. Betancourt, Editorial Associate at Towards Data Science
Go Ahead, Change My (AI) Mind
Agency as a missing ingredient in the AI Fairness debate.
By Vincent Vanhoucke – 4 min read
Train Your Mind to Think Recursively in 5 Steps
How to solve recursive problems easily
By Sara A. Metwalli – 6 min read
A Simple Story to Explain Version Control to Anyone
Let’s build a house together… using git.
By Julia Di Russo – 6 min read
AlphaZero and the Beauty of the Artificial Mind
How self-learning AI could re-define our concepts of creativity
By Manuel Brenner – 11 min read
How to Explain Each Machine Learning Model at an Interview
Summarization of models from regression to SVMs to XGBoost
By Terence Shin – 6 min read
What is it like to be intelligent?
Exploring the depths of human consciousness
By Aki Ranin – 25 min read
Consciousness, Free Will & Artificial Intelligence
In defence of free will, how it is closely tied to consciousness, and why it matters to Artificial General Intelligence
By Miguel Pinto – 12 min read
PyTorch + SHAP = Explainable Convolutional Neural Networks
Learn how to explain predictions of convolutional neural networks with PyTorch and SHAP
By Dario Radečić – 4 min read
Crystal Clear Reinforcement Learning
Comprehensive & concise concepts of Reinforcement Learning
By Baijayanta Roy – 31 min read
New podcasts
- Anders Sandberg – Answering the Fermi Question: Is AI our Great Filter?
- Sarah Williams – What does ethical AI even mean?
- Ben Garfinkel – Superhuman AI and the future of democracy and government
- Margot Gerritsen – Does AI have to be understandable to be ethical?
We also thank all the great new writers who joined us recently: Yuichiro Tachibana (Tsuchiya), Lean Tran, Stephanie A., Zito Relova, Nicolo’ Lucchesi, Lakshmi Ajay, Faruk Cankaya, Violeta Mezeklieva, Juan Andrés Malaver, Avery Parkinson, Charlie Craine, Dr. Sohini Roychowdhury, Nick Caros, Cameron Trotter, Chen Karni, Laura Gorrieri, Pratik Kamath, John Bica, Jennifer Bland, Oscar Darias Plasencia, Guillaume Blot, Frankie Cancino, Streicher Louw, Adnan Haider, Egor Vorontsov, Taras Baranyuk, Alex Kalinins, Savanna Reid, Juan Gesino, Alexander Petrov, Núria Correa Mañas, Tomer Ronen, Vasili Shichou, Etienne Dilocker, Bruce Nguyen, Tal Baram, Sven Harris, Tim Lou, PhD, Wesley Liao, Mihir Gandhi, and many others. We invite you to take a look at their profiles and check out their work.