Be sure to SUBSCRIBE here to never miss another article on Data science guides, tricks and tips, life lessons, and more!
When I first started learning Data Science, there were so many topics and techniques to learn that it often felt overwhelming to decide what to learn and in what order to learn it.
Looking back, I can surely say that certain skills that I learned have been much more practical and useful than others. In this article, I want to share with you three skills that have ultimately accelerated my career and increased my salary by 60% in the past year.
I credit these three skills with most of that progress because they have allowed me to work completely autonomously, helped me discover insights and ideas with incredible business value, and given me the ability to ship results faster.
With that said, let’s dive into it!
1. Building Efficient Data Pipelines
What
Data pipeline development means building pipelines (or systems) that clean and transform raw data inputs into a desired output. I’m specifically referring to the transform step of the ELT or ETL process, where the data is transformed after the raw data has already been ingested.
The way this is done is different from company to company depending on their tech stack. At my company, we use Airflow to schedule queries and build BigQuery tables.
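To make the transform step concrete, here is a minimal sketch of the kind of cleanup a pipeline query might do: remove exact duplicate rows from an events table, then roll it up to one row per user. It uses Python's built-in sqlite3 module as a stand-in, not Airflow or BigQuery, and the table names, column names, and rows are all made up for illustration.

```python
import sqlite3

# Hypothetical raw events table; names and rows are invented for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE raw_events (user_id INTEGER, event_type TEXT, event_ts TEXT)"
)
conn.executemany(
    "INSERT INTO raw_events VALUES (?, ?, ?)",
    [
        (1, "login", "2023-01-01 09:00:00"),
        (1, "login", "2023-01-01 09:00:00"),  # exact duplicate row
        (1, "purchase", "2023-01-02 10:30:00"),
        (2, "login", "2023-01-03 08:15:00"),
    ],
)

# Transform step: drop exact duplicates, then aggregate to one row per user.
conn.execute(
    """
    CREATE TABLE user_summary AS
    SELECT
        user_id,
        COUNT(*) AS event_count,
        SUM(CASE WHEN event_type = 'purchase' THEN 1 ELSE 0 END) AS purchase_count,
        MAX(event_ts) AS last_event_ts
    FROM (
        SELECT DISTINCT user_id, event_type, event_ts
        FROM raw_events
    ) AS deduped
    GROUP BY user_id
    """
)

user_rows = conn.execute(
    "SELECT user_id, event_count, purchase_count, last_event_ts "
    "FROM user_summary ORDER BY user_id"
).fetchall()
print(user_rows)
```

In a scheduled setup, a query like the one inside `CREATE TABLE ... AS` would typically be the part that runs on a timer; the surrounding Python here only exists to make the example runnable on its own.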
Why
- The ability to build data pipelines can help you build models faster. At every company I have worked at, the data that I needed for my models was never perfect. Either the existing tables had bugs (e.g., duplicate rows), or they were not in the format that I needed (e.g., an events-based table that needed to be at the user level). By being able to build the tables that I needed without the help of others, I was able to build models months faster than if I had needed to wait for someone else. Building models faster meant crushing more projects and improving the bottom line even more.
- The ability to build data pipelines can help democratize data for business users. Building pipelines is useful not only for yourself but for others as well. In my career, I found opportunities to build tables that were useful for my models and also useful to particular teams and departments for analytics and reporting. This helped me build stronger relationships with other teams in the business and ultimately improved my rapport with them.
- The ability to build data pipelines allows you to engineer new features. By building data pipelines, you get a better understanding of the underlying data that you’re working with. This helped me think of creative and impactful features that I wouldn’t have thought of if I didn’t work so closely with the data. And because I was able to come up with such novel features, the performance of my models improved significantly. And this leads me to my next skill…
2. Feature Engineering
What
In simple terms, feature engineering refers to creating new features that are not explicitly already available. For example, if I had the birth date of my users, I could create a new feature called "age" using the users’ birth dates.
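The birth date example can be sketched in a few lines of Python. This is only an illustration: the `users` records, the `age_on` helper, and the reference date are hypothetical, and in practice you would likely compute age relative to the event or prediction date rather than a fixed day.

```python
from datetime import date


def age_on(birth_date: date, as_of: date) -> int:
    """Return the number of completed years between birth_date and as_of."""
    # Compare (month, day) to check whether this year's birthday has passed.
    had_birthday = (as_of.month, as_of.day) >= (birth_date.month, birth_date.day)
    return as_of.year - birth_date.year - (0 if had_birthday else 1)


# Hypothetical user records; only birth_date is available at first.
users = [
    {"user_id": 1, "birth_date": date(1990, 6, 15)},
    {"user_id": 2, "birth_date": date(2000, 12, 31)},
]

# Engineer the new "age" feature relative to a fixed reference date.
as_of = date(2023, 6, 14)
for user in users:
    user["age"] = age_on(user["birth_date"], as_of)

print([(user["user_id"], user["age"]) for user in users])
```

Subtracting the years alone would overstate the age of anyone whose birthday has not yet happened in the reference year, which is why the helper checks the month and day.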
Why
- Feature engineering allows you to significantly improve the performance of models. In my experience, the number one factor that determined the strength of my models was the ability to engineer features with high predictive power. This ultimately led to better forecasting and better targeting.
- Engineering strong features can give you a better understanding of your data and the business overall. By engineering relevant features with high predictive power, you learn how different variables are associated with one another, which in turn shows you what drives the business.
3. Deep Dives
What
A deep dive analysis is an extensive investigation or analysis of a question or topic of interest.
It can be investigative or explorative in nature. An investigative deep dive answers questions like "why did sales drop by 25% last month?" whereas an explorative deep dive might answer a question like "do higher levels of app engagement correlate with more profitable users?"
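As a rough illustration of how an explorative question like the engagement one might start, here is a minimal sketch that computes a Pearson correlation between two per-user metrics. The numbers are toy values invented for this example, not real data, and the variable names are hypothetical. A real deep dive would go much further (segmenting users, checking for confounders, and so on).

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Toy per-user values, made up purely for illustration.
weekly_sessions = [2, 5, 8, 12, 15]
profit_per_user = [1.0, 3.0, 3.5, 6.5, 7.0]

r = pearson(weekly_sessions, profit_per_user)
print(f"correlation between engagement and profit: {r:.2f}")
```

A value near 1 would suggest that more engaged users tend to be more profitable, but correlation alone does not show that engagement causes the profit.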
Why
- Deep dive analyses allow you to discover golden opportunities. By "golden opportunities" I mean things that aren’t immediately evident but have a significant impact on the business. These are the findings that execs notice and praise you for 😉
- Deep dive analyses give you a better understanding of the products and data that you’re working with. Since you work directly with the data, you often develop the highest level of expertise in the respective domain. This makes you more knowledgeable and more valuable.
- Deep dive analyses help you make fewer errors and make you more thorough in your work. Deep dives are not easy, especially when you don’t know what you don’t know. However, the ability to perform thorough deep dives will make you a better worker overall. It will sharpen your curiosity, make you more meticulous, and help you learn more.
Thanks for Reading!
– Terence Shin