At the end of 2020, right around the time when everyone was making New Year’s resolutions, I decided that my goal for 2021 was to begin my journey into learning data science.
Part of the learning curriculum I developed focused on the importance of completing projects as a way to further my knowledge and to begin applying the skills I was learning.
The Step-by-Step Curriculum I’m Using to Teach Myself Data Science in 2021
That brings us to this article: mapping out the 7 data science projects I plan on completing this year, and how those projects will strengthen my skills in specific areas. I wanted to select projects that focus on subjects I’m interested in, and I chose ones that come with source code I can reference if I get stuck. I also wanted projects that are a bit more unusual than the ones that get shared all the time. So, without further ado, let’s jump right in!
Project #1: Visualizing a Spreading Virus

- Language: Python
- Difficulty: Beginner
- Data Set: Novel Corona Virus 2019 Dataset
- Source Code: How to Visualize the Coronavirus Pandemic with Choropleth Maps
Whether you chalk it up to morbid curiosity or a need to understand data visualization, this project seemed like a no-brainer. The pandemic has given us a plethora of data to analyze and visualize. I’ve also always been fascinated by the beautiful map visualizations that people have been able to build. So, it made sense to make this easy project one of my first forays into data science.
Skills used:
- Python libraries (NumPy, pandas, Plotly)
- data visualization using choropleth maps
Project #2: Cats vs Dogs Classification

- Language: Python
- Difficulty: Beginner
- Data Set: Cats and Dogs dataset
- Source Code: Cats vs Dogs Classification
Whether you’re a cat person or a dog person, this project has something for everyone. To be honest, I just wanted to do this project so I could look at cute animal pictures. And I suppose I also wanted to improve my neural network and deep learning skills.
Skills used:
- Python libraries (NumPy, pandas, Keras, scikit-learn, Matplotlib)
- neural networks
- deep learning
- data visualization
Project #3: Real-Time Face Mask Detection

- Language: Python
- Difficulty: Beginner
- Data Set: Face Mask dataset
- Source Code: Real-Time Face Mask Detection
With masks not going away any time soon, why not make the best of a bad situation and work on a machine learning project that detects whether or not you’re wearing a face mask? Facial recognition has always interested me, so this project idea was a no-brainer.
Skills used:
- Python libraries (Keras, OpenCV, NumPy, scikit-learn)
- machine learning
- neural networks
Project #4: Music Genre Classification

- Language: Python
- Difficulty: Intermediate
- Data Set: GTZAN dataset
- Source Code: Music Genre Classification
With all music produced in the last ten years sounding exactly the same, a program that classifies music by its genre would be incredibly useful to avoid confusion. Music isn’t really my thing, but this was such a unique type of project that I just had to add it to my list. Machine learning can classify the music files using features taken from their frequency and time domains. Once analyzed, each song is slotted into one of ten musical genres.
Skills used:
- Python libraries (NumPy)
- machine learning
- deep learning
- K-nearest neighbors
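To make the K-nearest neighbors idea concrete, here is a minimal sketch in plain Python. It is not the code from the linked project: the function name `knn_predict`, the two-number feature vectors, and the genre labels are all made up for illustration, and a real version would use features extracted from the audio files rather than hand-typed numbers.

```python
import math
from collections import Counter


def knn_predict(train_features, train_labels, query, k=3):
    """Return the majority label among the k training points closest to query."""
    # Euclidean distance from the query to every training example
    distances = [
        (math.dist(features, query), label)
        for features, label in zip(train_features, train_labels)
    ]
    # Keep the k nearest neighbors and take a majority vote on their labels
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical feature vectors (illustrative only, not real audio features)
train_features = [
    (0.10, 0.20), (0.15, 0.25), (0.20, 0.15),
    (0.80, 0.90), (0.85, 0.80), (0.90, 0.95),
]
train_labels = ["classical", "classical", "classical", "metal", "metal", "metal"]

prediction = knn_predict(train_features, train_labels, (0.82, 0.88), k=3)
print(prediction)
```

The new song simply takes the genre that most of its closest neighbors share, which is why the choice of features and of `k` matters so much in practice.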
Project #5: Road Lane Line Detection

- Language: Python
- Difficulty: Intermediate
- Source Code: Lane Line Detection
Self-driving cars have come a long way, and there’s no doubt that they’ll play an instrumental part in our future. If you think about it, it’s kind of spooky that a few lines of code are all it takes to keep a car from crossing a solid white line into oncoming traffic. Like many, I’m wary of the technology, so I thought it would be a good idea to get my hands dirty and see how it all actually works.
Skills used:
- Python libraries (NumPy, Matplotlib, OpenCV)
- deep learning
- machine learning
- computer vision
Project #6: Fake News Detection

- Language: Python
- Difficulty: Advanced
- Data Set: news.csv
- Source Code: Detecting Fake News with Python
This project would have been pretty useful anytime in the last four years. However, just because the wicked witch is dead doesn’t mean that fake news will be taking a break any time soon. Therefore, why not build something that will stretch my data science skills and will also be useful when it comes time to lay down some hard truths?
Skills used:
- Python libraries (NumPy, pandas, scikit-learn)
- machine learning
- model building and fitting
Project #7: Stock Market Price Prediction

- Language: Python
- Difficulty: Advanced
- Data Set: List of S&P 500 Companies
- Source Code: Stock Market Price Prediction
Often considered the holy grail of data science projects for beginners because of the uncertainty involved, stock market price prediction is going to be a great way to test how much my data science skills have grown and improved over the year. While this project may be a long shot, I’ve included it here anyway for a time when I’m feeling ambitious. Because this project is very intensive, I’ve included some additional resources that might come in handy if I get stuck.
Additional resources:
Predicting Stock Prices with Python
Simple Stock Price Prediction with ML in Python – Learner’s Guide to ML
Skills used:
- Python libraries (pandas, NumPy, scikit-learn)
- machine learning
- deep learning
- linear regression
- recurrent neural network architecture
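As a rough illustration of the linear regression skill listed above, here is a minimal sketch that fits a straight line to a handful of closing prices using ordinary least squares. The helper name `fit_line` and the prices are made up for this example; it is not taken from the linked source code, and a straight-line trend is nowhere near a realistic predictor of real stock prices.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both without the 1/n factor)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical closing prices for five consecutive trading days
days = [0, 1, 2, 3, 4]
prices = [100.0, 102.0, 104.0, 106.0, 108.0]

slope, intercept = fit_line(days, prices)
# Extrapolate the fitted line one day ahead
next_day_estimate = slope * 5 + intercept
print(slope, intercept, next_day_estimate)
```

In practice, scikit-learn provides a `LinearRegression` class that does this fitting for any number of input features, so the hand-rolled version is only there to show what happens under the hood.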
Final thoughts
Thanks to the vast amount of resources available on the internet, I was able to select a wide variety of projects that focus on different areas of data science that I’m looking to improve upon. Each of the projects will help me practice and develop the skills I learn as I go through each section of my curriculum (programming, mathematics and statistics, data analysis and visualization, and machine learning). The beauty of the projects I chose is that they are all expandable, so in the future, I can build similar projects to analyze different data sets or to complete different tasks.
With a new year bringing a fresh start, it’s the perfect time to jump in headfirst and try to accomplish a new goal. Thanks to the welcoming data science community, I look forward to getting constructive criticism on my projects once I share what I’ve managed to accomplish. Here’s to a new year and completing big goals!