March Edition: Warm and fuzzy data science

Data Science need not be as cold and distant as the math makes it seem

MONTHLY EDITION

Photo by Bogdan Glisik from Pexels

As you learn about and practice data science, it becomes harder and harder to avoid abstraction, often in the form of an algorithm coded in a particular programming language or the mathematical characterization of an idea. What I’d like to highlight here is that data science and its component disciplines (statistics, machine learning, etc.) have origins closely tied to our own way of perceiving and thinking about the world. I’d like to invite you to relate every data science principle or big idea you come across to your own life: what are the parallels between a given learning algorithm and how you yourself learn? Whether the paradigm is supervised, unsupervised, or reinforcement learning, it and its underlying techniques need not be as abstract as we often make them seem.

One way to do this is to seek opportunities to talk about data science concepts in a way that a wide audience can relate to. In my experience as a data educator, I often need, and want, to come up with simple scenarios that capture a potentially daunting idea in data science.

For example, I often explain the concept of generalization in the context of tennis (or any other one-on-one sport). Say an individual trains at one tennis academy for several years, exclusively with the same coach, without ever playing anybody else. After a while, this individual starts beating their coach consistently and decides to crown themselves the best tennis player on Earth. How realistic is this title? Could we say this individual is overfitting, in the data science sense? After all, having played their coach so much, they are bound to have memorized the coach’s patterns and learned to exploit them. What if we then had this individual play against a player from another academy? How likely is it that our player would beat this rival?

Even if you don’t practice a sport yourself, I’m sure you can appreciate the merits of playing against a diverse set of opponents. You may train on repetitive motions and techniques, but the ability to strategize, react, and perform under pressure is best developed by continuously testing your abilities against a diverse set of opponents. This is what I characterize as good generalization in learning.
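
To make the analogy a little more concrete, here is a minimal Python sketch of the same idea. Everything in it is an assumption made for illustration: the function names (`fit_memorizer`, `fit_line`, `mse`) are hypothetical, and the "coach" and "rival" data are made-up toy numbers, not real measurements. The memorizer plays the role of the player who only ever faces their coach, while the straight line is a simpler learner that ignores the coach's quirks.

```python
# Toy illustration of overfitting vs. generalization.
# All names and data here are hypothetical and chosen for illustration only.

def fit_memorizer(xs, ys):
    """Memorize every training pair; predict by copying the nearest one."""
    pairs = list(zip(xs, ys))

    def predict(x):
        # Return the stored answer of the closest training point.
        _, nearest_y = min(pairs, key=lambda p: abs(p[0] - x))
        return nearest_y

    return predict


def fit_line(xs, ys):
    """Ordinary least-squares line: a simpler model that smooths over quirks."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept


def mse(model, xs, ys):
    """Mean squared error of a model on a set of points."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# "Coach" data: an underlying trend y = 2x plus a quirk alternating +1 / -1.
train_x = [float(i) for i in range(10)]
train_y = [2 * x + (1 if i % 2 == 0 else -1) for i, x in enumerate(train_x)]

# "Rival" data: the same trend at new points, without the coach's quirks.
test_x = [i + 0.25 for i in range(9)]
test_y = [2 * x for x in test_x]

memorizer = fit_memorizer(train_x, train_y)
line = fit_line(train_x, train_y)

# Map each model name to (training error, test error).
results = {
    "memorizer": (mse(memorizer, train_x, train_y), mse(memorizer, test_x, test_y)),
    "line": (mse(line, train_x, train_y), mse(line, test_x, test_y)),
}

for name, (train_err, test_err) in results.items():
    print(f"{name}: train MSE = {train_err:.3f}, test MSE = {test_err:.3f}")
```

In this toy setup the memorizer makes no errors at all against the "coach" data, yet does worse than the plain line against the "rival" data. That gap between training and test performance is the overfitting the tennis story describes.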

Go ahead and challenge yourself to explain abstract concepts in simple yet meaningful and correct ways. Try explaining them to a friend or colleague, and gauge their interest and agreement. Take their feedback on board and strive to become a better data science communicator. After all, clear and engaging communication is one of the most overlooked skills of the modern data scientist!

Can you think of other concepts worth communicating more accessibly? Exploration vs. exploitation? The bias-variance tradeoff? Independence in probability? Regularization? Ensembles? Gradient descent? Share your ideas down below!

Sergio E. Betancourt, Editorial Associate at Towards Data Science


Go Ahead, Change My (AI) Mind

Agency as a missing ingredient in the AI Fairness debate.

By Vincent Vanhoucke – 4 min read


Train Your Mind to Think Recursively in 5 Steps

How to solve recursive problems easily

By Sara A. Metwalli – 6 min read


A Simple Story to Explain Version Control to Anyone

Let’s build a house together… using git.

By Julia Di Russo – 6 min read


AlphaZero and the Beauty of the Artificial Mind

How self-learning AI could re-define our concepts of creativity

By Manuel Brenner – 11 min read


How to Explain Each Machine Learning Model at an Interview

Summarization of models from regression to SVMs to XGBoost

By Terence Shin – 6 min read


What is it like to be intelligent?

Exploring the depths of human consciousness

By Aki Ranin – 25 min read


Consciousness, Free Will & Artificial Intelligence

In defence of free will, how it is closely tied to consciousness, and why it matters to Artificial General Intelligence

By Miguel Pinto – 12 min read


PyTorch + SHAP = Explainable Convolutional Neural Networks

Learn how to explain predictions of convolutional neural networks with PyTorch and SHAP

By Dario Radečić – 4 min read


Crystal Clear Reinforcement Learning

Comprehensive & concise concepts of Reinforcement Learning

By Baijayanta Roy – 31 min read



We also thank all the great new writers who joined us recently: Yuichiro Tachibana (Tsuchiya), Lean Tran, Stephanie A., Zito Relova, Nicolo’ Lucchesi, Lakshmi Ajay, Faruk Cankaya, Violeta Mezeklieva, Juan Andrés Malaver, Avery Parkinson, Charlie Craine, Dr. Sohini Roychowdhury, Nick Caros, Cameron Trotter, Chen Karni, Laura Gorrieri, Pratik Kamath, John Bica, Jennifer Bland, Oscar Darias Plasencia, Guillaume Blot, Frankie Cancino, Streicher Louw, Adnan Haider, Egor Vorontsov, Taras Baranyuk, Alex Kalinins, Savanna Reid, Juan Gesino, Alexander Petrov, Núria Correa Mañas, Tomer Ronen, Vasili Shichou, Etienne Dilocker, Bruce Nguyen, Tal Baram, Sven Harris, Tim Lou, PhD, Wesley Liao, Mihir Gandhi, and many others. We invite you to take a look at their profiles and check out their work.

