
There is no strict definition of data science. A simple, concise one would be "the study of data" or "applying science to data". So a data scientist can be described as a person who studies data or applies science to data.
But why would a person study data? The goal of data science explains its importance better than any definition does. The goal is to solve a problem, overcome an issue, answer a question, or make things better and easier by applying science to data. These goals can be achieved through a systematic and structured study with the following questions in mind:
- What is the problem and how can we solve it?
- What kind of data is needed and how can it be collected?
- How do we need to approach the data?
- What does the data tell us?
- What kind of model should we build?
- How do we evaluate the model?
- What are the conclusions?
- How do we tell what the results mean to others?
So data science is a process rather than a single event. Each step of this process needs to be handled well in order to achieve significant and satisfying results.
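To make these questions concrete, here is a minimal sketch of one pass through the process in plain Python. Everything in it is an assumption made for illustration: the question (does advertising spend predict units sold?), the toy numbers, and the choice of a straight-line fit, which simply stands in for whatever model a real problem would call for.

```python
from statistics import mean

# What is the problem? Hypothetical question: does advertising spend predict units sold?
# What data is needed? Invented toy pairs of (spend, units_sold), for illustration only.
data = [(1, 12), (2, 14), (3, 16), (4, 18), (5, 20), (6, 22), (7, 24), (8, 26)]

# How do we approach the data? Hold out the last two points for evaluation.
train, test = data[:6], data[6:]

# What does the data tell us? Start with simple summaries.
xs = [x for x, _ in train]
ys = [y for _, y in train]
x_bar, y_bar = mean(xs), mean(ys)

# What kind of model should we build? Here, a straight line fitted by least squares.
slope = sum((x - x_bar) * (y - y_bar) for x, y in train) / sum((x - x_bar) ** 2 for x in xs)
intercept = y_bar - slope * x_bar


def predict(x):
    """Predict units sold for a given spend using the fitted line."""
    return intercept + slope * x


# How do we evaluate the model? Mean absolute error on the held-out points.
mae = mean(abs(predict(x) - y) for x, y in test)

# What are the conclusions, and how do we tell others what they mean?
print(f"Each extra unit of spend adds about {slope:.1f} units sold (held-out error: {mae:.2f}).")
```

The code itself is the least interesting part. Each comment is one of the questions above, and the answers to those questions are what a reader of the final sentence actually cares about.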
Studying data is not always easy, especially when there is a lot of it. Just as surgeons, carpenters and mechanics use tools to do their jobs, data scientists need tools as well. Of course, the tools come in different forms. Data scientists use computers, which make complex computations fast and easy. There are many software packages and libraries, as well as readily available, tested algorithms, that expedite the data science process.
These tools are very handy and useful. However, they are just tools and should not be the sole focus of a data scientist. I feel that being a data scientist is too often associated with the ability to use these tools. The defining feature of a data scientist should not be how good he or she is at using TensorFlow or PyTorch. The primary target is to answer a question. Tools are there to make the job easier and more seamless.
The focus of a data scientist should be finding ways to answer questions with data.
Tools help data scientists do their job, just as they do in other professions. Consider surgeons. They use lancets in surgeries and they have to be very good at using them. However, healing a person takes much more than being good with a lancet. Surgeons have to know the organs, the veins and the very complex structure of the human body. Using a lancet very well does not make you a surgeon.
Data scientists need tools like Python, R, Pandas, Scikit-learn, Theano, TensorFlow and so on. But what really matters is not how good you are at using these tools. At the end of the day, if you cannot bring any value to the table, the tools mean nothing. By value, I mean improving a process, building an accurate predictive model, increasing the profit of a business, making life easier for some people, contributing to a domain using machine learning, and the list goes on.
We can teach a person how to use these tools but we cannot teach them how to become a data scientist. We can point them in the right direction, but being a data scientist requires more than using tools.

A data scientist is:
- Curious
- A critical thinker
- Argumentative
- Not afraid to criticize
- Looking for answers
- and more…
These are skills or characteristics that you cannot learn from a software package but that are very important for making use of data. We can master every deep learning framework and code with our eyes closed. However, if we do not produce anything of value, what is the point?
If I were a hiring manager interviewing candidates for a data scientist position, my questions would lean towards understanding the candidate's perspective, curiosity and critical thinking. I would be less interested in their mastery of software packages or frameworks.
We can and should learn how to use tools, but our thinking needs to reach beyond them. We should be able to look at the data from many different perspectives and interpret it in many ways. What I would suggest to aspiring data scientists is to define a problem and try to solve it. While solving it, you will need the tools and learn how to use them. You may even create your own tool because you will know exactly what you need.
Conclusion
The tools used in the data science domain are absolutely necessary, and many thanks go to the community that builds them and lets us use them. However, tools are not the answers. Data scientists are the ones who find answers. Tools just help them reach the target.
Thank you for reading. Please let me know if you have any feedback.