Are you learning data science, or interested in learning it but unsure how to kick-start a career in the field? Getting your first job in data science can be tricky. Many people from diverse fields, along with students following my "Learn Data Science in 100 days" initiative, ask me about job opportunities, and some of them doubt their ability to find a job in data science.
In this article, I will lay out a step-by-step approach that can increase your chances of getting hired as a data scientist. It doesn’t matter whether you have a degree in data science or a programming background; if you are keen and ready to put in the hard work, nothing can stop you.
- Step 1 – Get to know about tools commonly used by data scientists
- Step 2 – Understand the basic concepts
- Step 3 – Build a portfolio
- Step 4 – Get connected to people in the field
- Step 5 – Look out for opportunities and references
By the end of this article, you should be in a position to make a clear plan for getting started with a career in data science. Though the steps mentioned above are the same for anyone getting started, the time spent on each step can vary widely from individual to individual, depending on their level of expertise in data science. If you prefer a video format, check here – https://www.youtube.com/watch?v=H91xvW7ub24
Step 1 – Get to know about tools commonly used by data scientists
You may ask how it is possible to find out which tools and techniques data scientists commonly use in their day-to-day job. It is very easy, and there is no need to go through the LinkedIn profiles of hundreds of data scientists. All the information you need has already been collected as part of the Kaggle Machine Learning and Data Science Survey. The 2020 survey received over 20K responses from data scientists around the world, and the entire survey data is accessible to the public. It has everything you would want to know about data scientists: the tools and technologies they use, their salaries, and the algorithms and platforms they work with. The observations below are based on the 2020 survey.

Since we are talking about getting started with a data science job, it is really important that you are familiar with one programming language (Python/R/C/C++/Java), SQL, and any one dashboarding tool to increase your chances of getting hired as a data scientist.
Being familiar with these tools is important, but it is even more important to make sure your resume reflects that familiarity, because most organizations use these tools as keywords to shortlist resumes. If your resume has not been shortlisted for data science jobs, check whether it contains these keywords.
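If you want to automate the keyword check, here is a minimal sketch in Python. The keyword list, the `missing_keywords` helper, and the sample resume text are all made up for illustration; in practice you would take the keywords from the job posting you are applying to.

```python
import re

# Hypothetical keyword list; replace it with terms from the actual job posting.
KEYWORDS = ["python", "sql", "tableau", "machine learning"]


def missing_keywords(resume_text, keywords):
    """Return the keywords that do not appear in the resume text (case-insensitive)."""
    text = resume_text.lower()
    missing = []
    for keyword in keywords:
        # Require non-word characters on both sides so that "r" does not match inside "career".
        pattern = r"(?<!\w)" + re.escape(keyword.lower()) + r"(?!\w)"
        if not re.search(pattern, text):
            missing.append(keyword)
    return missing


# Hypothetical resume text used only to demonstrate the helper.
sample_resume = "Analyst with three years of Python and SQL experience."
print(missing_keywords(sample_resume, KEYWORDS))
```

Any keyword the function returns is one you may want to add to your resume, provided you genuinely have that skill.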
Step 2 – Understand the basic concepts
Your knowledge of basic data science concepts is critical, as the initial test or technical interview will mostly cover the key concepts used in the day-to-day job. If you are interested in making a plan to learn these concepts, check this article, in which I have laid out a detailed plan for learning data science concepts in 100 days.
Below is the list of concepts you need to be familiar with.
Basic programming concepts
Your programming skills will come in handy when you work on data analysis and feature engineering. As an aspiring data scientist, you are not expected to be an expert programmer, but you need to know the programming basics, such as:
- Basic syntax
- Data types
- Ability to implement logical conditions
- Knowledge about loops and their implementation
- Creation of functions – key to ensuring reusability
In the 2020 Kaggle Machine Learning and Data Science Survey, over 70% of the respondents recommended that aspiring data scientists learn Python first, and about 80% of data scientists primarily use Python in their job. So if you are looking to pick a programming language, go for the most popular one. Most data science job openings also prefer applicants familiar with either Python or R. The main reason for listing a programming language as an eligibility criterion is that many organizations don’t have the capability to support developers with diverse programming backgrounds. Also, since there will be a lot of collaboration, it makes sense for the team to use the same programming language.
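To give a feel for the level of programming listed above, here is a small Python sketch that touches each item: data types, a logical condition, a loop, and reusable functions. The variable names and the scores are invented for illustration.

```python
# Data types: a string, an integer, a float, and a list of floats.
name = "sample dataset"
row_count = 5
threshold = 50.0
scores = [42.0, 55.5, 61.0, 38.5, 70.0]


def label_score(score, cutoff):
    """Logical condition: label a score relative to a cutoff."""
    if score >= cutoff:
        return "high"
    return "low"


def count_labels(values, cutoff):
    """Loop over the values and count how many fall under each label."""
    counts = {"high": 0, "low": 0}
    for value in values:
        counts[label_score(value, cutoff)] += 1
    return counts


print(name, row_count, count_labels(scores, threshold))
```

Because the logic lives in functions, the same labelling can be reused on any other list of numbers with a different cutoff.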
Libraries/tools/techniques for data analysis
You are also expected to know enough to perform exploratory data analysis. Make yourself familiar with the libraries and techniques commonly used in the data analysis phase of a project. Data analysis plays an important role: about 70% to 80% of the time in a data science project is typically spent on it, because the quality of the analysis directly impacts the quality of the project’s outcome. Suppose you are working on a data science project to make a prediction. The extent and quality of the data analysis determine the features you use in the predictive model, so the better the analysis, the better the features, and the better the outcome of the model.
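As a rough sketch of a first exploratory step, the snippet below summarizes each numeric column of a tiny dataset using only Python's standard library. The records and the `summarize` helper are hypothetical; on a real project you would more likely use a data analysis library such as pandas, but the idea is the same.

```python
import statistics

# Hypothetical records; in a real project these would be loaded from a file or database.
records = [
    {"area_sqft": 850, "price": 120000},
    {"area_sqft": 1200, "price": 175000},
    {"area_sqft": 1500, "price": 210000},
    {"area_sqft": 950, "price": 140000},
    {"area_sqft": 2000, "price": 300000},
]


def summarize(rows, column):
    """Return basic descriptive statistics for one numeric column."""
    values = [row[column] for row in rows]
    return {
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),
    }


for column in ("area_sqft", "price"):
    print(column, summarize(records, column))
```

Comparing the mean with the median, and the minimum with the maximum, is often enough to spot skewed columns or suspicious values worth a closer look.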
Knowledge about handling data issues like missing data and outliers
No data is perfect; almost any dataset you work with will have data issues. In a real-life project, it is absolutely normal to work on data with a lot of issues, but the key is to be familiar with the techniques that can be used to handle these data quality issues.
Working on data science competitions is a good way to learn how to handle such data issues. Kaggle is one of the best platforms for data science competitions, and its discussion forums are very insightful in this context.
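To make this concrete, here is a minimal sketch of two common techniques: filling missing values with the median, and flagging outliers with the interquartile range (IQR) rule. These are only two of many possible approaches, and the measurements and function names are made up for illustration.

```python
import statistics

# Hypothetical measurements; None marks a missing value.
raw_values = [12.0, 15.0, None, 14.0, 13.0, 95.0, None, 16.0]


def impute_with_median(values):
    """Replace missing entries (None) with the median of the observed values."""
    observed = [v for v in values if v is not None]
    median = statistics.median(observed)
    return [median if v is None else v for v in values]


def iqr_outliers(values, k=1.5):
    """Return the values lying more than k * IQR below Q1 or above Q3."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


filled = impute_with_median(raw_values)
print(filled)
print(iqr_outliers(filled))
```

The median is used for filling because, unlike the mean, it is barely affected by an extreme value such as the 95.0 in this sample. Whether a flagged outlier should be removed, capped, or kept depends on the project.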
Basic statistics concepts
Statistics concepts are also very commonly used in data science projects: selecting a random sample, exploring the dataset using descriptive statistics, using relationship measures such as correlation and causation to understand how different aspects of the data relate, and extracting insights based on inferential statistics.
Here is one simple example from a real-life scenario where statistics concepts are used. Let’s say you work for a product company and want to understand the impact of a newly designed feature. Such a feature is initially launched to a smaller audience and, if proven effective, launched to a larger audience incrementally. To show that the new feature is effective enough to be launched to the larger customer base, one needs to use statistical tests.
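As a small illustration of two of these ideas, the sketch below draws a random sample and measures the Pearson correlation between two variables. The population is synthetic (generated with a fixed seed so the run is reproducible), and `pearson_correlation` is a hand-written helper for illustration rather than a library function.

```python
import math
import random
import statistics


def pearson_correlation(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


rng = random.Random(42)  # fixed seed so the sample is reproducible

# Synthetic population: y grows roughly linearly with x, plus some noise.
population = [(x, 2.0 * x + rng.uniform(-10, 10)) for x in range(1, 101)]

# Select a random sample of 20 observations without replacement.
sample = rng.sample(population, 20)
xs = [x for x, _ in sample]
ys = [y for _, y in sample]

print("sample mean of y:", statistics.mean(ys))
print("correlation:", pearson_correlation(xs, ys))
```

A correlation close to 1 or -1 indicates a strong linear relationship, but on its own it says nothing about whether one variable causes the other.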
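One common choice for this kind of comparison is a two-proportion z-test on conversion rates. The sketch below is only an illustration under that assumption: the user counts are invented, the function name is mine, and a real experiment would also need a properly planned sample size.

```python
import math
from statistics import NormalDist


def two_proportion_z_test(conversions_a, users_a, conversions_b, users_b):
    """Two-sided z-test for the difference between two conversion rates."""
    rate_a = conversions_a / users_a
    rate_b = conversions_b / users_b
    # Pooled rate under the null hypothesis that both groups convert equally.
    pooled = (conversions_a + conversions_b) / (users_a + users_b)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / users_a + 1 / users_b))
    z = (rate_b - rate_a) / standard_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical counts: group A sees the old design, group B sees the new feature.
z, p_value = two_proportion_z_test(200, 5000, 260, 5000)
print(f"z = {z:.2f}, p-value = {p_value:.4f}")
```

A p-value below a threshold chosen in advance (0.05 is a common convention) suggests the difference between the two groups is unlikely to be due to chance alone, which supports rolling the feature out to a larger audience.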
Data visualization
Data visualization is important both for extracting patterns during data analysis and for communicating the results to business stakeholders. If you are interested in learning to build some amazing visualizations, check the article below.
Step 3 – Build a portfolio
This step can happen concurrently with learning the data science concepts. The objective is that, as you learn the concepts, you build a portfolio that is a testimony to your learning. It will also help you understand the concepts better.
Providing links to your portfolio, such as the Kaggle competitions you have participated in, the learning scripts and hobby projects you have uploaded to your git repository, and the blog posts you have written, will increase your chances of getting a data science job.
Simple Scenario to explain the importance of having a portfolio
Let’s say there are over 200 applications for a couple of vacant data science positions. It is not feasible for the recruiter to respond to every single application, so they first filter the resumes based on keywords. Let’s say that from over 200 they shortlist 50 resumes. That is still too many to interview, so they generally filter further based on the content of each resume. If you provide links to your portfolio, the probability of your resume getting shortlisted will be much higher than for someone with no reference to their work. Also, the better your portfolio, the better your chances of steering the interview conversation towards your areas of strength.
This is a common scenario at most companies, and I hope it conveys the importance of having a portfolio. Below are some ways to create your data science portfolio using the following platforms and tools.
Kaggle
Kaggle is one of the world’s largest communities of data scientists. A lot of data science competitions are hosted on this platform. Kaggle offers a lot of opportunities too, such as:
- Learning new concepts, especially from the discussion forums
- Meeting other like-minded people and getting the opportunity to work with them
- Gaining recognition across the industry
- Contributing towards building datasets and data science notebooks
Personally, Kaggle has played an important role in my career. I have learned a lot of concepts by participating in competitions on this platform. It also helped me gain self-confidence, brought me some good friends, and opened up some amazing opportunities which further enhanced my career. I will forever be grateful for this platform.
If you are interested in mastering your data science skills using Kaggle, check this article.
Blog
When you start putting your thoughts into writing, a lot of questions come up, and in the process of answering them your understanding of the concept takes better shape.
It might be slightly difficult to get started with blogging, especially if you dislike writing, but trust me: try to break your own entry barrier and start simple, for example by writing up your understanding of a concept or explaining your hobby project. Try to read more articles from others; this can improve your writing and give you more ideas as well.
Git
I strongly advise people to start using git as early as possible, not only to keep track of versions but also to keep track of all their work. Try to follow coding standards so that if a potential recruiter lands on your git repository, they can clearly understand your capability.
Again, you can start small and early: as you learn the concepts, keep pushing your learning scripts and the hobby projects you want to showcase. Also, when you work in a team, git helps with version control and better collaboration.
Step 4 – Get connected to people in the field
A good professional network is as important as your depth of knowledge. All the organizations I have worked with so far have had a preference for profiles referred by an internal employee. Also, a lot of people share job openings in their organization on their LinkedIn profile, so staying connected with a lot of people will give you better exposure.
It is OK to send connection requests on LinkedIn to people whom you haven’t met in person, but before sending out an invite, make sure that your profile is up to date and has your profile picture so that people know it is not a random spam account.
Meet-up
Look for popular data science meetups around you. A lot of start-ups sponsor venues for data science events. These are a great place to meet like-minded people in person. Usually there is time for networking before and after the main event, and you also get to see some amazing presentations.
Meeting these like-minded people and getting connected to them over LinkedIn can bring some interesting opportunities your way. I would like to share a personal story here. Through one of the meetup events I got acquainted with a few data science professionals, and later one of them introduced me to DataKind, an organization that uses data science to serve humanity. I got connected with one of their initiatives and joined them to work on it, and to this day it remains one of the best and most satisfying projects I have ever worked on. We built a recommendation system based on the nutrient content of the soil to suggest to small farmers the best crop, as well as the informational videos best suited to them, to improve their yields.
Step 5 – Look out for opportunities and references
Keep checking for opportunities, and if you are very keen on working for particular organizations, set a job alert on LinkedIn so that you hear about relevant opportunities from those organizations immediately.
You can also reach out to your connections and ask about current and upcoming data science opportunities in their organizations. When an employee refers your profile, your chances of getting shortlisted are better. Always remember to apply for any position with a cover letter and provide links to your LinkedIn profile, git repository, blog, and other relevant sources showcasing your work.
Don’t be afraid of reaching out to people at companies where you are looking for opportunities and asking them to refer your profile in their organization. There is nothing to lose: if there is a positive response, that’s great; otherwise, keep trying. Generally, when your LinkedIn profile is complete, you tend to get a positive response.
About me
I am a data science professional with over 10 years of experience, and I have authored 2 books on data science. I have a YouTube channel where I teach and talk about various data science concepts. If interested, subscribe to my channel below.