How To Learn Data Science - My path

Tools and Resources I use

Senthil E
Towards Data Science
16 min read · Aug 17, 2019

Last week I published my third post in TDS. Before the next one, I wanted to publish this quick post. I hope it helps people who want to get into data science or who have just started learning it. In this post, I share the resources and tools I use: basically all the apps and links that are part of my day-to-day activities. I have used everything below or will be using it in the future. If you want me to add anything else, please post it in the comment section and I will include it.

My Data Science Curriculum

I created this chart. It should give a basic idea of the tools and skills needed to work in DS and ML.

Thanks to Andrew Chen for coming up with the roadmap. Please check the above link.

Another one I found online

Programming Languages:

I started off with R programming. Then I moved to Python and have pretty much stayed with Python and Jupyter Notebook. So for me, Python and SQL are a must.

1.Python or R

2.SQL

Python:

Important Libraries used:

  • NumPy
  • Pandas
  • Matplotlib
  • SciPy
  • Scikit-Learn
  • TensorFlow
  • Keras
  • Seaborn
  • NLTK
  • Gensim

Some useful resources to start

1.Basic python tips

2.Some example

3.Youtube Channels

SQL
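
Since Python and SQL go together in most day-to-day work, here is a minimal sketch of running SQL from Python with the standard-library `sqlite3` module. The table name, columns, and rows are made up purely for illustration and do not come from any real dataset.

```python
import sqlite3

# In-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")

# Hypothetical table and rows, invented for this example.
conn.execute("CREATE TABLE courses (name TEXT, platform TEXT, hours REAL)")
conn.executemany(
    "INSERT INTO courses VALUES (?, ?, ?)",
    [
        ("Intro to Python", "Udemy", 12.0),
        ("SQL Basics", "Coursera", 8.5),
        ("Spark Setup", "Udemy", 3.0),
    ],
)

# Aggregate with GROUP BY: total hours per platform, largest first.
rows = conn.execute(
    "SELECT platform, SUM(hours) AS total_hours "
    "FROM courses GROUP BY platform ORDER BY total_hours DESC"
).fetchall()

for platform, total_hours in rows:
    print(platform, total_hours)

conn.close()
```

The same `SELECT ... GROUP BY` pattern carries over to other databases; only the connection code changes.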

2.Data Visualization:

  • Matplotlib
  • Seaborn
  • Altair
  • Plotly
  • Tableau
  • D3.js
  • Bokeh

3.Web Scraping:

  • Beautiful Soup
  • Scrapy
  • Urllib
  • Selenium
  • APIs, such as the Twitter APIs
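
To give a feel for what scraping involves, here is a small sketch that pulls links out of a page using only the Python standard library. The HTML string is a made-up stand-in for a downloaded page, and `LinkCollector` is a hypothetical helper name. Libraries such as Beautiful Soup or Scrapy do the same job with far less code.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href attribute of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


# Hypothetical page content. A real scraper would download it first,
# for example with urllib.request.urlopen(url).read().decode().
SAMPLE_HTML = """
<html><body>
  <a href="https://example.com/python">Python</a>
  <a href="https://example.com/sql">SQL</a>
  <p>No link here</p>
</body></html>
"""

parser = LinkCollector()
parser.feed(SAMPLE_HTML)
print(parser.links)
```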

4.Process

There are 5 core activities of data analysis:

1. Stating and refining the question

2. Exploring the data

3. Building formal statistical models

4. Interpreting the results

5. Communicating the results
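
As a rough illustration, the five activities can be walked through on a toy problem. The data below is invented, and the model is a plain least-squares line computed by hand; a real project would use a much larger dataset and a library such as Scikit-Learn.

```python
from statistics import mean

# 1. State the question: does more study time go with higher quiz scores?
# Hypothetical data: (hours studied, quiz score).
data = [(1, 52), (2, 55), (3, 61), (4, 64), (5, 70)]

# 2. Explore the data.
hours = [h for h, _ in data]
scores = [s for _, s in data]
print("mean hours:", mean(hours), "mean score:", mean(scores))

# 3. Build a formal statistical model:
#    ordinary least squares for score = intercept + slope * hours.
h_bar, s_bar = mean(hours), mean(scores)
slope = sum((h - h_bar) * (s - s_bar) for h, s in data) / sum(
    (h - h_bar) ** 2 for h in hours
)
intercept = s_bar - slope * h_bar

# 4. Interpret the result: the slope is the change in score per extra hour.
# 5. Communicate it in plain language.
print(f"Each extra hour is associated with about {slope:.1f} more points.")
```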

5.Math in DS/ML

Resources:

http://www.cis.upenn.edu/~jean/gbooks/linalg.html

6.Machine Learning:

Resources:

7.Big Data

Nice one from 365 Data Science

4 V’s

Resources:

1.Data Engineering Guide.

2.Learn Data Engineering

8.My IDE

Jupyter Notebook

Pycharm

R Studio

IntelliJ IDEA

I use the following apps, and they are very helpful:

* Quora

* Medium

* Blind

* Reddit

* Linkedin

* Udemy

* Coursera

* Youtube

* Meetup

* Datacamp

1. Reddit:

I have subscribed to the following subreddits, and they are very helpful:

  • Dataengineering
  • Dataisbeautiful
  • Datasets
  • Learndatascience
  • Learnprogramming
  • Learnpython
  • Machinelearning
  • Learnmachinelearning
  • Python
  • Rstats
  • Computervision
  • Businessintelligence
  • programming
  • Rlanguage
  • Scala
  • AWS
  • bigdata
  • SQL

Create a personalized feed by subscribing to the above subreddits. In the app, you can sort posts by popularity for the day, week, month, year, or all time.

2.Blind:

This is a useful community. You can find out what's happening in DS, ML, and software engineering at other companies.

3.Udemy for Business:

I think most companies have tied up with Udemy, and I have Udemy for Business through my employer. It is very nice and super helpful, with around 3,000 courses. For example, I was trying to install Spark, Hadoop, and Scala. There was not much information available online, and I was struggling to install them on my PC. I went to Udemy, enrolled in a course that teaches Apache Spark, and had pretty much everything installed in under an hour or so. This was really helpful for me. Also, the prices are very reasonable, around $9.99, and there are good courses available.

4.LinkedIn Learning:

Again, I have access to all the LinkedIn Learning courses. Access is provided by my employer and is also free with a local county library membership. I found some great Python and R courses on LinkedIn Learning.

Also, check out these LinkedIn groups:

  • Data Mining, Statistics, Big Data, Data Visualization, and Data Science
  • Artificial Intelligence, Deep Learning, Machine Learning
  • Big Data, Analytics, Business Intelligence & Visualization Experts Community
  • KDnuggets Machine Learning, Data Science, Data Mining, Big Data, AI
  • Cloud Computing, SaaS & Virtualization
  • Data Warehouse — Big Data — Hadoop — Cloud — Data Science — ETL
  • Artificial Intelligence, Deep Learning and IoT
  • SQL Server Business Intelligence(BI)
  • Internet of Things
  • Bank and Finance Technology — FinTech Banking Systems Financial Executives
  • Cloud Computing
  • Python Community
  • Python Data Science and Machine Learning

5.Medium:

This is the best platform to learn and also publish your articles.

Some of the publications I follow are

  • Towards Data Science
  • The Startup
  • HackerNoon.com
  • freeCodeCamp.org

I publish in

· Towards Data Science

TDS is the fastest-growing publication in DS.

  • 218,052 subscribers
  • Growing by 436.8 followers per day

6.Youtube:

Obviously, there are a lot of channels available for DS, ML, and AI. I follow a few

  • Data School
  • Google Cloud Platform
  • Learn R
  • Datacamp
  • Simplilearn
  • Edureka
  • MarinStatsLectures
  • Sentdex
  • Siraj Raval

7.Udacity:

I haven’t taken any paid courses on Udacity. There are some free courses that you can check out.

8.Other useful sites

  • Kdnuggets.com
  • Datacamp.com
  • Khanacademy.org
  • geeksforgeeks.org

9.Kaggle:

I think everyone knows Kaggle, so it needs no introduction.

Also, check out their mini-courses which are helpful

10.Github:

Obviously, you can build your portfolio here. I have a few projects, such as web scraping, Twitter analysis, and data visualization using Python, and I plan to add more going forward. It is also a great place to search for similar projects, and you get a lot of help and ideas from other projects published on GitHub.

11.Meetup:

I have joined a few local San Jose and San Francisco meetups. Unfortunately, I don’t have time to attend often. Meetups are very helpful for learning and networking. I have seen a Python meetup in the LinkedIn building with a few hundred developers attending, so it is obviously a golden opportunity to meet people and learn from them. Some meetups even do hands-on work once or on alternate days during the week, and there are workshops that run for a whole weekend. Almost all workshops are free, and the rest charge very minimal fees.

12.Quora:

I subscribed to the following feeds

  • Algorithms
  • Competitive programming
  • Data Science
  • Machine Learning
  • Deep Learning
  • Big Data
  • Data Analysis
  • Data Visualization
  • Python
  • Hakon Hapnes Strand
  • Mike West
  • William Chen
  • Lili Jiang

13.Podcast

  • Data Skeptic (Spotify)
  • This Week in Machine Learning & AI (Spotify)
  • Superdatascience.com
  • https://talkpython.fm/

Also, check the below ones

Paysa.com & Levels.fyi

14.Stackoverflow

I think without StackOverflow you can’t code:-)

15.Problem Solving sites:

  1. HackerRank (http://hackerrank.com/)
  2. CodeChef (http://codechef.com/)
  3. HackerEarth (http://hackerearth.com/)
  4. LeetCode (http://leetcode.com/)
  5. Topcoder (http://topcoder.com/)
  6. Kaggle (http://kaggle.com/)
  7. ChallengePost (http://challengepost.com/)
  8. CodeForces (http://codeforces.com/)
  9. Brilliant (http://brilliant.org/)
  10. SPOJ (http://www.spoj.com/)
  11. Project Euler (https://projecteuler.net/)
  12. CodingBat (http://codingbat.com/)
  13. Codewars (http://www.codewars.com/)
  14. Codility (https://codility.com/)
  15. Codingame (https://www.codingame.com/)
  16. CoderByte (https://coderbyte.com/)
  17. CodeEval (https://www.codeeval.com/)
  18. UVA Online Judge (https://uva.onlinejudge.org/)
  19. CodeFights (https://codefights.com/)
  20. CheckiO (http://www.checkio.org/)
  21. Talentbuddy (http://talentbuddy.co/)
  22. PythonChallenge (http://pythonchallenge.com/)
  23. LintCode (http://www.lintcode.com/en/)
  24. Rosalind (http://rosalind.info/problems/locations/)
  25. CrowdANALYTIX (https://www.crowdanalytix.com/)
  26. SQL-EX.RU (http://sql-ex.ru/)
  27. Kattis (http://www.kattis.com/)
  28. CodeKata (http://codekata.com/)
  29. CodeAbbey (http://codeabbey.com/)
  30. FightCode (http://fightcodegame.com/)
  31. BeatMyCode (http://www.beatmycode.com/)
  32. TunedIT (http://tunedit.org/)
  33. MLComp (http://mlcomp.org/)
  34. HPC University (http://hpcuniversity.org/students/weeklyChallenge/)
  35. https://practiceit.cs.washington.edu/

16.Free Courses:

The University of Helsinki in Finland has launched a course on artificial intelligence (Elements of AI) that is completely free and open to everyone around the world. Unlike Carnegie Mellon’s new undergraduate degree in AI, which the institution created to train future experts in the field, Helsinki’s offering is more of a beginner course for those who want to know more about the subject.

Machine Learning

Artificial Intelligence

Machine Learning for Data Science and Analytics

Machine Learning

Machine Learning Unsupervised Learning

Data Science 21 courses

Certificates in Big Data, AI and ML

Free Google Machine Learning Courses

17. Books

Some important bookmarks I always refer to:

1.All DS cheatsheets in one place

2.DS resources in one place

3.DS Notes

4.DS Tutorial

5.ML Tutorial

6.DS Glossary

7.DS Toolbox

8.Plotly Tutorial

9.DS Github repository

10.Github

11.Industry Machine Learning

12.Data Engineering Guide.

13.Learn Data Engineering

14.Google Dataset Search

15.Data Engineering Study Guide:

16.Regex

https://regex101.com/
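
A pattern tested on a site like regex101 can then be used from Python's built-in `re` module. The sketch below is only an illustration: the sample text and the date pattern are made up for this example.

```python
import re

# Hypothetical text, invented for this example.
text = "Posted on 2019-08-17, updated on 2020-03-02."

# Named groups for an ISO-style date: four-digit year, two-digit month and day.
date_pattern = re.compile(r"(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})")

# finditer yields one match object per date found in the text.
dates = [m.groupdict() for m in date_pattern.finditer(text)]
print(dates)
```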

17.Google OpenRefine

OpenRefine (formerly Google Refine) is a powerful tool for working with messy data: cleaning it; transforming it from one format into another; and extending it with web services and external data.

18.Some more ML Youtube Channels:

19. Homemade Machine Learning

20. Machine Learning Terms:

21.Awesome Data Science

22.Data Science Blogs

23.Data Science Specialization Courses

24.100 Page Machine Learning Book:

25.Data Camp Info Sheets

Hope this post is helpful to you guys. If you want to add anything please post in the comment section. Thank you for reading my post:-)

Passed the TensorFlow Exam:

I just passed the TensorFlow Developer Certificate exam and thought I would share the sources that helped me.

It was definitely interesting and worth preparing for. I think experienced TensorFlow developers might find it simple and easy.

Sources I used:

The main one is TensorFlow in Practice on Coursera or Intro to TensorFlow on Udacity. The other sources helped me gain a broader understanding. From a certification point of view, either the Udacity course or TensorFlow in Practice on Coursera is enough.

Only for certification: either Coursera or Udacity

  • TensorFlow in Practice from Coursera
  • Intro to TensorFlow — Udacity

Coursera: $49 per month.

Udacity free course: similar to the Coursera one.

Udacity — Github

Book:

Additional Sources:

Youtube Channel

Pycharm

Google Colab

TensorFlow Basics

TensorFlow 2.0 Cheatsheet

Gradient Descent explained step by step: My fav one

AWS Machine Learning Specialty

If you have any questions, please email me at esenthil@hotmail.com

All the best, and again, the objective is to learn TensorFlow, not to earn the certification.

If you like the post, then hit the clap button, and not just once 😝

ML/DS - Certified GCP Professional Machine Learning Engineer, Certified AWS Professional Machine Learning Specialty, Certified GCP Professional Data Engineer.