The world’s leading publication for data science, AI, and ML professionals.

Data Science Is Becoming More Tool-Oriented. Will It Kill The Science Behind?

Do not pass the point where a necessity becomes a burden.

Photo by Dan-Cristian Pădureț on Unsplash
Photo by Dan-Cristian Pădureț on Unsplash

The goal of Data Science is simply to create value out of data. The main components to achieve this goal are:

  • A problem to be solved with data
  • Data (obviously)
  • Ability to explore and understand the data
  • Tools

What I mean by the tools is the entire set of software tools that somehow prove to be useful in the data science workflow.

Excel is a widely used one. Python is the most commonly used Programming language in data science. SQL is what we use to manage relational databases. Tableau allows us for creating stunning dashboards. Databricks enable us to work with huge amounts of data.

And the list goes on with more specific tools focusing on certain tasks and operations.

More and more software tools and packages are being introduced constantly. I think there are mainly 3 motivations for creating such tools:

  • The size of data we deal with keeps increasing so new tools are created to work on large-scale data more efficiently
  • To simplify the routine operations
  • The competition in the industry. For instance, AWS, Microsoft, and Google are competing for getting the bigger slice from the "cloud" pie.

It is crystal clear that we need software tools to perform data science. They enable us to turn ideas into actions. However, the ever-increasing number of software tools and packages sometimes makes me feel exhausted.


Photo by Adam Winger on Unsplash
Photo by Adam Winger on Unsplash

In some cases, there is no significant difference between the performance of a particular set of tools. A company can just prefer one over another. As a result, which tool we need to learn only depends on the company we work for.

Whatever tool we use, the fundamentals of data science are the same. The products are built on the same principles and theory. Statistics, Bayes’ theorem, linear regression, and correlation do not change with respect to the tools we are using.

But, how we use a particular tool can be very different than the other ones so we need to spend some time learning it.

Time is the most valuable resource and it is limited. Even if we have the motivation to learn a new tool, we might just not have enough time.

Let’s say a part of my job is to create dashboards and I use Tableau at my current company. I change my job for some reason and my new company prefers Power BI.

In this case, learning Power BI does not add significant value to my dashboard-creating skills. I just become an employee that can perform the same task with two different tools.

So, my first concern is not being able to use the time efficiently and productively.


My other concern is that the number of tools you can use might be considered more important than your data science knowledge. This situation leads to data scientists being evaluated based on tool knowledge, not science.

If this happens, it will be a serious problem. Software tools are just there for turning ideas into action or value.

The ideas come from data scientists who blend analytical thinking, creativity, statistics, and theory. If data scientists are forced to learn as many tools as possible, they might miss the point.

They will get very quick at performing tasks thanks to the highly advanced tools. However, this is not enough for creating value out of data.

What leads to creating value is first to define a problem that can be solved with data. Once a problem is defined and a solution is designed, the tools are then needed to do the tasks.

I think you would agree that without a problem and solution, there is no use for advanced software tools.


To sum up, we definitely need software tools and packages to perform data science. They enable us to work with large amounts of data quickly and efficiently.

However, I feel like the number of tools will likely be more than necessary. The requirement to learn a wide range of tools puts extra pressure on data scientists.

Spending a tremendous amount of time learning new tools might cause data scientists not to give enough importance to what really matters.


Become a Medium member to access an unlimited number of articles on Medium.

Join Medium with my referral link – Soner Yıldırım


Thank you for reading. Please let me know if you have any feedback.


Related Articles