MONTHLY EDITION

March Edition: Warm and fuzzy data science

Data Science need not be as cold and distant as the math makes it seem

TDS Editors
Towards Data Science
4 min read · Mar 1, 2021

Photo by Bogdan Glisik from Pexels

As you learn about and practice data science, it becomes harder and harder to avoid abstraction, often in the form of an algorithm coded in a particular programming language or the mathematical characterization of an idea. What I’d like to highlight here is that data science and its component disciplines (statistics, machine learning, etc.) have origins closely tied to our own way of perceiving and thinking about the world. I’d like to invite you to relate every data science principle or big idea you come across to your own life: What are the parallels between a given learning algorithm and how you yourself learn? Whether the paradigm is supervised, unsupervised, or reinforcement learning, it and its underlying techniques need not be as abstract as we often make them out to be.

One way to do this is to seek opportunities to talk about data science concepts in a way that a wide audience can relate to. In my experience as a data educator, I often find the need and the desire to come up with simple scenarios that capture a potentially daunting idea in data science.

For example: I often characterize the concept of generalization in the context of tennis (or any other one-on-one sport). Say an individual trains at a single tennis academy for several years, exclusively with the same coach, without ever playing anybody else. After a while, this individual starts beating their coach consistently and decides to crown themselves the best tennis player on Earth. How realistic is this title? Can we say this individual is an example of overfitting? After all, having played their coach so much, they’re bound to have memorized the coach’s patterns and learned to exploit them. What if we were then to have this individual play against a player from another academy? How likely is it that our player would beat this rival?

Even if you don’t practice a sport yourself, I’m sure you can weigh the merits of playing against a diverse set of opponents. You may train on repetitive motions and techniques, but the ability to strategize, react, and perform under pressure is best developed by continuously testing yourself against a diverse set of opponents. This is what I characterize as good generalization in learning.
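For readers who think in code, here is a minimal sketch of the tennis analogy in plain Python. Everything in it is an assumption made for illustration: the toy rule (`true_label`), the hypothetical `Memorizer` and `ThresholdLearner` classes, and the choice to let the memorizer guess 0 on anything it has not seen. The training set plays the role of the coach, and the held-out set plays the role of the rival from another academy.

```python
# Toy task (assumed for illustration): the label is 1 when x >= 50, else 0.
def true_label(x):
    return 1 if x >= 50 else 0

# "The coach": even numbers seen during training.
train = [(x, true_label(x)) for x in range(0, 100, 2)]
# "The rival from another academy": odd numbers never seen in training.
test = [(x, true_label(x)) for x in range(1, 100, 2)]


class Memorizer:
    """Stores every training example verbatim; guesses 0 on unseen inputs."""

    def fit(self, data):
        self.seen = dict(data)
        return self

    def predict(self, x):
        return self.seen.get(x, 0)


class ThresholdLearner:
    """Learns a single cut-off between the two classes."""

    def fit(self, data):
        highest_zero = max(x for x, y in data if y == 0)
        lowest_one = min(x for x, y in data if y == 1)
        self.threshold = (highest_zero + lowest_one) / 2
        return self

    def predict(self, x):
        return 1 if x > self.threshold else 0


def accuracy(model, data):
    """Fraction of examples the model labels correctly."""
    return sum(model.predict(x) == y for x, y in data) / len(data)


memorizer = Memorizer().fit(train)
learner = ThresholdLearner().fit(train)

for name, model in [("memorizer", memorizer), ("threshold learner", learner)]:
    print(f"{name}: train={accuracy(model, train):.2f}, "
          f"test={accuracy(model, test):.2f}")
```

Both models are perfect against the "coach", but only the one that learned the underlying pattern holds up against the "rival". The memorizer's perfect training score says little about how it will fare on unfamiliar inputs.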

Go ahead and challenge yourself to explain abstract concepts in simple, yet meaningful and correct ways. Try explaining them to a friend or colleague and gauge their interest and agreement. Take their feedback on board and strive to become a better data science communicator. After all, clear and engaging communication is one of the most overlooked skills of the modern data scientist!

Can you think of other concepts worth communicating more accessibly? Exploration vs. exploitation? The bias-variance tradeoff? Independence in probability? Regularization? Ensembles? Gradient descent? Share your ideas down below!

Sources that might interest you:
- https://arxiv.org/pdf/1702.07800.pdf
- https://www.frontiersin.org/articles/10.3389/fevo.2020.00082/full
- https://link.springer.com/article/10.1007/BF02478259
- https://archive.org/details/in.ernet.dli.2015.226341

Sergio E. Betancourt, Editorial Associate at Towards Data Science

Go Ahead, Change My (AI) Mind

Agency as a missing ingredient in the AI Fairness debate.

By Vincent Vanhoucke — 4 min read

Train Your Mind to Think Recursively in 5 Steps

How to solve recursive problems easily

By Sara A. Metwalli — 6 min read

A Simple Story to Explain Version Control to Anyone

Let’s build a house together… using git.

By Julia Di Russo — 6 min read

AlphaZero and the Beauty of the Artificial Mind

How self-learning AI could re-define our concepts of creativity

By Manuel Brenner — 11 min read

How to Explain Each Machine Learning Model at an Interview

Summarization of models from regression to SVMs to XGBoost

By Terence Shin — 6 min read

What is it like to be intelligent?

Exploring the depths of human consciousness

By Aki Ranin — 25 min read

Consciousness, Free Will & Artificial Intelligence

In defence of free will, how it is closely tied to consciousness, and why it matters to Artificial General Intelligence

By Miguel Pinto — 12 min read

PyTorch + SHAP = Explainable Convolutional Neural Networks

Learn how to explain predictions of convolutional neural networks with PyTorch and SHAP

By Dario Radečić — 4 min read

Crystal Clear Reinforcement Learning

Comprehensive & concise concepts of Reinforcement Learning

By Baijayanta Roy — 31 min read


Building a vibrant data science and machine learning community. Share your insights and projects with our global audience: bit.ly/write-for-tds