Applying for a data science position is no easy feat because the preparation is so involved: learning to program, understanding how to analyze data, reading statistics books, and building a data science project for your portfolio. There is so much to do.
The problem is that you still need to stand out from the other applicants. Many people have similar skill sets: programming, fundamental statistics, dashboards, and so on. That is why the best way to differentiate your application is to build a stand-out data science project.
In this article, I want to outline what you should include in your first Data Science Project from a data scientist’s perspective. Let’s get into it.
Data Science Project Definition
Before we talk about what you should include in your data science project, let’s define what we mean by "data science project." A data science project is not limited to machine learning or algorithms; it covers any solution that uses data to solve a business problem.
A data science project is thus a solution developed using various tools and based on historical data to address a particular business problem, ideally solving or eliminating a problem the company has.
Why define the data science project at the beginning of this article? In my experience, many new applicants in the field present data science projects that only showcase technical skills without trying to solve a real problem. It is excellent that applicants know machine learning in depth, but never forget what your data science project is for: solving a business problem.
Now that we have a definition of a data science project, let’s see what the project should have, especially if it is the first one you create.
1. Solve a Real Business Problem
First, the essential aspect of your data science project is that it solves a real business problem. It does not matter how great your model metrics are or how fancy your dashboard is. If the project does not show any possibility of solving a real business problem, it may not show how you would perform in a data scientist position.
When I talk about real business problems, I don’t mean something novel or the most complex problem ever. You could start with something simple, such as:
- Propensity to buy a new product
- Customer churn prediction
- Fraud detection
And many more. The point is that the business problems above are realistic and show up in many companies. You might think it is obvious that you should present a realistic project. However, you would be surprised by how many people include an absurd project.
Another question is: how can you stand out from the other applicants if everybody has the same ideas in mind? To stand out, try to imagine your dream company by answering the following questions:
- Where do you want to work?
- What kind of industry do you want to work in?
- What business problem in that industry and company do you want to solve?
- How would you present the solution to the management?
Why do I ask you to imagine your dream company? Because the best way to stand out from the others is to have a data science project that solves a business problem present in the company you want to apply to.
If you are applying for a data scientist position at an insurance company, the company might notice you if your project is about re-insured prediction. However, the re-insured project might not be suitable for a company with image-detection products. By imagining your dream company, you get a direction for your first data science project.
In short, try to develop a data science project that solves a business problem and is specific to the company you imagine.
2. Clear and Structured Project
From a data scientist’s perspective, the result is not everything. You might have a 99% accurate model, but what process did you follow to get there? This is what the data scientist wants to know. Data scientists know that metrics can be misleading because of bad flagging, information leakage, and many other reasons. That is why we want to see how you came up with the solution.
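To make the point about misleading metrics concrete, here is a minimal sketch in plain Python. It illustrates class imbalance, which is my own choice of example among the "other reasons" and is not spelled out above. The labels are made up: I assume a fraud-detection dataset where only 1% of cases are fraud, and a "model" that never flags anything.

```python
# Hypothetical fraud-detection labels: 1 = fraud, 0 = legitimate.
# Assumption for illustration only: 1% of the 1,000 transactions are fraud.
y_true = [1] * 10 + [0] * 990

# A "model" that never flags any transaction as fraud.
y_pred = [0] * len(y_true)


def accuracy(y_true, y_pred):
    """Share of predictions that match the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def recall(y_true, y_pred):
    """Share of actual positive cases that the model caught."""
    true_positives = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    actual_positives = sum(t == 1 for t in y_true)
    return true_positives / actual_positives if actual_positives else 0.0


print(f"accuracy: {accuracy(y_true, y_pred):.2f}")  # 0.99
print(f"recall:   {recall(y_true, y_pred):.2f}")    # 0.00
```

The accuracy looks excellent, yet the model catches none of the fraud cases. A reviewer who only sees the headline number cannot tell the difference, which is why the process behind the number matters.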
Having a structured and clear project means that your project states a clear business problem and presents the process in a structured way. Also, every step you take to develop the project needs a reason. For example: Why did you choose this data source? Why do you use these features? Why did you drop this data? Why did you select this model? The bottom line is that you need to justify every step in your project.
If you have a hard time thinking about your data science project structure, my other article could help with that. Using the CRISP-DM methodology, you can define your project’s structure and have a better way to present it.
3. Showing Creativity
As covered above, your data science project needs to solve a real business problem and have a clear structure. However, that doesn’t mean you cannot be creative in the process. After all, creativity in the process is one way to make your data science project stand out.
Creativity might come naturally or take a lot of learning, but it can also come from trial-and-error experiments. You might ask: what does it mean to be creative with your data science project? Well, it could mean a lot of things. For example:
- Business problem approach – does the business need a machine learning model to solve the problem, or can you find another way?
- Data cleaning – how do you clean the data, and how do you fill the missing data?
- Feature engineering – Are there any features you could create from other features?
- Feature selection – how do you select which features to use?
- Model implementation – what model did you decide to develop?
- Accessibility – how could you make the model and the model result accessible to everybody?
These are examples of where you could be creative; the answers to the questions above will depend on your creativity and experience. It might be hard at first, especially for your first data science project, but it gets easier with time.
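As a small illustration of two of the questions above (data cleaning and feature engineering), here is a hedged sketch in plain Python. The customer records, the field names, and the decision to fill missing values with the median are all hypothetical; they are one possible set of choices, not the right answer.

```python
from statistics import median

# Hypothetical customer records; None marks a missing value.
customers = [
    {"id": 1, "monthly_spend": 120.0, "tenure_months": 12},
    {"id": 2, "monthly_spend": None, "tenure_months": 3},
    {"id": 3, "monthly_spend": 80.0, "tenure_months": 24},
    {"id": 4, "monthly_spend": 100.0, "tenure_months": 6},
]

# Data cleaning choice (an assumption): fill missing spend with the
# median of the known values.
known_spend = [c["monthly_spend"] for c in customers if c["monthly_spend"] is not None]
fill_value = median(known_spend)

prepared = []
for c in customers:
    row = dict(c)  # copy so the raw data stays untouched
    # Keep a flag so the imputation stays visible to later steps.
    row["spend_was_missing"] = row["monthly_spend"] is None
    if row["spend_was_missing"]:
        row["monthly_spend"] = fill_value
    # Feature engineering choice (an assumption): derive an estimated
    # lifetime spend from two existing features.
    row["lifetime_spend"] = row["monthly_spend"] * row["tenure_months"]
    prepared.append(row)

for row in prepared:
    print(row)
```

Each choice here is something you would need to justify in your project: why the median rather than dropping the row, and why lifetime spend is a meaningful feature for the business problem.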
If you are still unsure how to become more creative, I suggest reading many articles and data science projects that other people have developed. You might find inspiration from others and adapt it for your own project.
Conclusion
Developing a first data science project that stands out from the other applicants is hard. That is why, in this article, I outlined some of the aspects you should include in your project from a data scientist’s perspective. They are:
- Solve a Real Business Problem
- Clear and Structured Project
- Showing Creativity
Also, remember that a data science project is defined here as a solution developed using various tools and based on historical data to address a particular business problem.
I hope it helps!