And why it’s a good idea to learn more than just Data Science

In a little over two years, I have gone from learning to program in Python to building dashboards and pipelines for customers on a slew of machine learning and deep learning problems. I started my career in a job that hired me on the strength of my education. My background is in Solid Mechanics, and for over five years, with an advanced degree to boot, I applied my foundational knowledge to solving real-world problems and, in part, to fundamental research.
Over time I was running more complex simulations, only to wait days for them to finish, and I was looking for better ways to make sense of the corpus of results I had. Another difficulty was finding a small, optimal set of material combinations so that experimentalists would need to test only a few of them. I then began reading articles that married fundamental engineering concepts with statistics and data science, with successful results. That is when I decided to make the jump, leveraging my domain knowledge and strengthening it with a data-driven approach.
The current scenario
Even if you aren’t a data scientist, it is not a bad idea to learn the basics of data analytics, and most certainly a bit of statistics. It helps you analyze your data far better even if your interest isn’t Data Science. On a manufacturing assembly line, for instance, how many samples would you pick for quality testing so that they accurately represent your entire production lot? Even if your interest lies outside engineering, statistics can be a powerful tool for analyzing the data presented to you in various scenarios.
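To make the assembly line question concrete, here is a minimal sketch of one common textbook approach: Cochran's sample size formula for estimating a proportion (say, the defect rate), with a finite population correction for the lot size. The function name, the default margin of error, the confidence level and the worst-case defect rate of 0.5 are all assumptions chosen for illustration; a real quality plan would typically follow an acceptance sampling standard instead.

```python
import math
from statistics import NormalDist


def sample_size(lot_size, margin=0.05, confidence=0.95, p=0.5):
    """Estimate how many units to inspect from a lot of `lot_size` units.

    Illustrative assumptions (not from the article):
    margin     -- acceptable error on the estimated defect rate
    confidence -- desired confidence level
    p          -- expected defect rate; 0.5 is the most conservative guess
    """
    # Two-sided z-score for the chosen confidence level
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Cochran's formula for an effectively infinite population
    n0 = (z ** 2) * p * (1 - p) / margin ** 2
    # Finite population correction for a lot of limited size
    n = n0 / (1 + (n0 - 1) / lot_size)
    return math.ceil(n)


if __name__ == "__main__":
    for lot in (100, 1_000, 10_000):
        print(lot, sample_size(lot))
```

The point worth noticing is that the required sample grows much more slowly than the lot: under these assumptions, a lot a hundred times larger needs only a few times as many inspected units.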

Access to good educational resources that are also easy on the pocket has never been better. Open source education and community sharing platforms mean that there are plenty of avenues to seek help from. This is great from the perspective of "democratization" of education and access to the latest releases of new technologies.
The flip side is that a lot of people out there can easily do some basic feature engineering, run prediction algorithms and report some accuracy scores. While this is a great asset, the demand for it will fall rapidly, with automation around the corner and no-code or low-code tools threatening some developer jobs. By some estimates, over 60% of developer jobs will be automated by 2025. This is why it becomes crucial to develop expertise in a domain rather than in the tools: in this disruptive space, such tools promise to deliver in a matter of minutes what data scientists spend months doing!
Specialized Domains and Applications
Data science is not a sequence of actions aimed at predicting a phenomenon in a robotic fashion. It is much more than that: it is a process of arriving at informed answers to a potentially large set of questions.
- The Right Question: To solve problems in any field, the first stage is to ask the right questions. What problem am I trying to solve? Is data science even needed when I have a plethora of simpler options in front of me? Which of the data given to me is relevant, and can I throw away the seemingly irrelevant data? Can I trust the data source? Are there automated tools or simpler engineering solutions already at my disposal that I can use to solve the problem at hand?
- Entrepreneurial spirit: In an environment that thrives on self-motivation, you are more likely to find the right problem areas to solve if you are familiar with the industry. It helps you move from a merely transactional role to being a more indispensable part of the industry: someone who can see the problems to be solved from a macroscopic viewpoint and who also understands the nitty-gritty of data wrangling.

- Focus on customer: Communicating effectively with the customer requires going one step beyond explaining the curves. Learning complex algorithms is fantastic, but converting what you learn into business-specific outcomes and communicating them effectively is a gap commonly observed in many organizations.
How to successfully manoeuvre in a dynamic field
New technologies will come and go. By the time you master one algorithm, there will be a new one out there. Keeping abreast of every new algorithm that emerges becomes an endless race. Besides, it is advisable to let these algorithms prove their stability over a period of time. Starting with simpler but effective architectures is a better approach, and such architectures have proven to be stable in production.
Time-tested technologies are sufficient to solve most problems that you will encounter in the real world. Instead of chasing every new method, spend your time understanding the data, the process of data collection and alternate sources of data, and spend more time with your customers or associates to get a bigger picture of the problems they intend to solve. At the same time, it is imperative to be aware of recent developments and to hone the skills that will bolster your area.
Getting a model that is actually used by people is far more important than squeezing out that extra 1% in model accuracy. In industries that are slow to adopt AI in their businesses, it is important to first demonstrate what this sub-science can do. This is where effective communication and the ability to ‘sell’ your skill come to the forefront. Once the anchor is firmly placed, you can dabble with more complex algorithms.
Data engineering will probably have more and more value as time passes. A majority of data science projects end with beautiful graphs shown in presentations. There is an utter lack of data scientists who can actually put projects into production and monitor them over time as end-to-end pipelines. Developing a well-rounded understanding of the problem at hand, quickly gauging the tools at your disposal and ‘getting things done’ without compromising on the authenticity of the work are of vital importance in business. My suggestion would therefore be to apply simple solutions and start small. Use complex algorithms only if there is a need.
Concluding thoughts
The field of data science is changing fast. The way we interact with data will undergo a drastic change. As more and more automated tools are put into the hands of laymen, the need for specialized skills that are gleaned only from significant time spent solving industrial problems will rise sharply.
Data science tools are just that: tools. They are not problem solvers. The problem to be solved is much bigger than running a notebook. Notebooks are essential, but they are only one of the many gears in the much bigger gearbox needed to run the engine. Also remember that you cannot be superhuman and learn every skill required to solve a problem. Most problems have been solved by minds working together.
As I made my shift from a core mechanical engineering stream to data science, I decided to stay rooted in the field I had studied through formal channels and to strengthen it with the added tool of data science. This has enabled me to solve problems in ways that a person unfamiliar with the industry would find hard to grasp.
Constructive feedback is welcome!