How To Learn Data Science - My path
Tools and Resources I use
Last week I published my third post in TDS. Before the next one, I wanted to publish this quick post. I hope it helps people who want to get into data science or who have just started learning it. Here I share the resources and tools I use: basically all the apps and links I rely on in my day-to-day activities. Everything below is something I have used or plan to use in the future. If you want me to add anything else, please post it in the comment section and I will include it.
My Data Science Curriculum
I created this chart. It should give a basic idea of the tools and skills needed to work in DS and ML.
Thanks to Andrew Chen for coming up with the roadmap. Please check the above link.
Another one I found online
Programming Languages:
I started off with R. Then I moved to Python and have pretty much stayed with Python and Jupyter Notebook. So for me, Python and SQL are a must.
1.Python or R
2.SQL
Python:
Important Libraries used:
- NumPy
- Pandas
- Matplotlib
- SciPy
- Scikit-Learn
- TensorFlow
- Keras
- Seaborn
- NLTK
- Gensim
Some useful resources to start
1.Basic python tips
2.Some example
3.Youtube Channels
SQL
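As a small taste of how Python and SQL fit together, here is a minimal sketch using Python's built-in sqlite3 module. The table name, column names, and numbers are made up for illustration.

```python
import sqlite3

# In-memory database; nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")  # hypothetical table
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("west", 120.0), ("east", 80.0), ("west", 30.0)],
)

# Aggregate with plain SQL, then read the rows back into Python.
rows = conn.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
).fetchall()
conn.close()
print(rows)
```

The same query would work largely unchanged against most relational databases, which is one reason SQL is worth learning alongside Python.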
2.Data Visualization:
- Matplotlib
- Seaborn
- Altair
- Plotly
- Tableau
- D3.js
- Bokeh
3.Web Scraping:
- Beautiful Soup
- Scrapy
- Urllib
- Selenium
- APIs such as the Twitter API
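To show the basic idea behind these tools, here is a minimal sketch that pulls links out of a page. It uses only Python's standard library (html.parser) rather than Beautiful Soup or Scrapy, and the HTML string is a made-up stand-in for a downloaded page.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag it sees."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


# Hypothetical page content; in practice this would come from
# urllib.request.urlopen(url).read().decode() or a similar fetch.
html = '<p><a href="https://example.com/a">A</a> <a href="/b">B</a></p>'

parser = LinkCollector()
parser.feed(html)
print(parser.links)
```

The dedicated libraries listed above do the same job with far less code and cope better with messy real-world HTML.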
4.Process
There are 5 core activities of data analysis:
1. Stating and refining the question
2. Exploring the data
3. Building formal statistical models
4. Interpreting the results
5. Communicating the results
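The five activities above can be sketched on a toy problem. This is only an illustration: the data is made up, and a simple least-squares line stands in for the "formal statistical model".

```python
from statistics import mean

# 1. Question: do more study hours go with higher scores? (made-up data)
hours = [1, 2, 3, 4, 5]
scores = [52, 55, 61, 64, 68]

# 2. Explore: quick summary statistics.
print("mean hours:", mean(hours), "mean score:", mean(scores))

# 3. Model: ordinary least-squares line, score = intercept + slope * hours.
mx, my = mean(hours), mean(scores)
slope = sum((x - mx) * (y - my) for x, y in zip(hours, scores)) / sum(
    (x - mx) ** 2 for x in hours
)
intercept = my - slope * mx

# 4. Interpret: slope is the estimated change in score per extra hour.
# 5. Communicate: report the result in plain language.
print(f"Each extra hour is associated with about {slope:.1f} more points.")
```

In a real project, each step would be far larger, and you would usually loop back to refine the question after seeing the data.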
5.Math in DS/ML
Resources:
http://www.cis.upenn.edu/~jean/gbooks/linalg.html
6.Machine Learning:
Resources:
7.Big Data
Nice one from 365 Data Science
The 4 V’s of big data
Resources:
1.Data Engineering Guide.
2.Learn Data Engineering
8.My IDE
Jupyter Notebook
Pycharm
R Studio
IntelliJ IDEA
The following apps are very helpful, and I use all of them:
* Quora
* Medium
* Blind
* Udemy
* Coursera
* Youtube
* Meetup
* Datacamp
1. Reddit:
I have subscribed to the following subreddits, and they are very helpful:
- Dataengineering
- Dataisbeautiful
- Datasets
- Learndatascience
- Learnprogramming
- Learnpython
- Machinelearning
- Learnmachinelearning
- Python
- Rstats
- Computervision
- Businessintelligence
- programming
- Rlanguage
- Scala
- AWS
- bigdata
- SQL
Create a personalized feed by subscribing to the above subreddits. In the app, you can sort posts by popularity over the past day, week, month, year, or all time.
2.Blind:
This is a useful community. You can find out what's happening in DS, ML, and software engineering at other companies.
3.Udemy for Business:
I think most companies have partnered with Udemy, and I have Udemy for Business through my employer. It is very nice and super helpful, with around 3,000 courses. For example, I was trying to install Spark, Hadoop, and Scala. There was not much information available online, and I was struggling to install them on my PC. I went to Udemy, enrolled in a course that teaches Apache Spark, and had everything installed in about an hour. That was really helpful for me. The prices are also very reasonable, around $9.99, and there are good courses available.
4.LinkedIn Learning:
Again, I have access to all the LinkedIn Learning courses. Access is provided by my employer and is also free with a local county library membership. I found some great Python and R courses on LinkedIn Learning.
Also, check out these LinkedIn groups:
- Data Mining, Statistics, Big Data, Data Visualization, and Data Science
- Artificial Intelligence, Deep Learning, Machine Learning
- Big Data, Analytics, Business Intelligence & Visualization Experts Community
- KDnuggets Machine Learning, Data Science, Data Mining, Big Data, AI
- Cloud Computing, SaaS & Virtualization
- Data Warehouse — Big Data — Hadoop — Cloud — Data Science — ETL
- Artificial Intelligence, Deep Learning and IoT
- SQL Server Business Intelligence(BI)
- Internet of Things
- Bank and Finance Technology — FinTech Banking Systems Financial Executives
- Cloud Computing
- Python Community
- Python Data Science and Machine Learning
5.Medium:
This is the best platform for learning and for publishing your own articles.
Some of the publications I follow are:
- Towards Data Science
- The Startup
- HackerNoon.com
- freeCodeCamp.org
I publish in
- Towards Data Science
TDS is the fastest-growing publication in DS.
- 218,052 subscribers
- Growing by 436.8 followers per day
6.Youtube:
Obviously, there are a lot of channels available for DS, ML, and AI. I follow a few
- Data School
- Google Cloud Platform
- Learn R
- Datacamp
- Simplilearn
- Edureka
- MarinStatsLectures
- Sentdex
- Siraj Raval
7.Udacity:
I haven’t taken any paid courses on Udacity, but there are some free courses you can check out.
8.Other useful sites
- Kdnuggets.com
- Datacamp.com
- Khanacademy.org
- geeksforgeeks.org
9.Kaggle:
I think everyone knows Kaggle, so it needs no introduction.
Also, check out their mini-courses, which are helpful.
10.Github:
Obviously, you can build your portfolio here. I have a few projects covering web scraping, Twitter analysis, data visualization using Python, and so on, and I plan to add more going forward. It is also a great place to search for similar projects, and you can get a lot of help and ideas from other projects published on GitHub.
11.Meetup:
I have joined a few local San Jose and San Francisco meetups. Unfortunately, I don’t have time to attend often. Meetups are very helpful for learning and networking. I have seen a Python meetup at the LinkedIn building with a few hundred developers attending, so it is obviously a golden opportunity to meet people and learn from them. Some meetups even do hands-on work once or on alternate days during the week, and some run workshops for the whole weekend. Almost all workshops are free or charge a very minimal fee.
12.Quora:
I subscribed to the following feeds
- Algorithms
- Competitive programming
- Data Science
- Machine Learning
- Deep Learning
- Big Data
- Data Analysis
- Data Visualization
- Python
- Hakon Hapnes Strand
- Mike West
- William Chen
- Lili Jiang
13.Podcast
- Data Skeptic (Spotify)
- This Week in Machine Learning & AI (Spotify)
- Superdatascience.com
- https://talkpython.fm/
Also, check the below ones
Paysa.com & Levels.fyi
14.Stackoverflow
I think without Stack Overflow you can’t code :-)
15.Problem Solving sites:
- HackerRank (http://hackerrank.com/)
- CodeChef (http://codechef.com/)
- HackerEarth (http://hackerearth.com/)
- LeetCode (http://leetcode.com/)
- Topcoder (http://topcoder.com/)
- Kaggle (http://kaggle.com/)
- ChallengePost (http://challengepost.com/)
- CodeForces (http://codeforces.com/)
- Brilliant (http://brilliant.org/)
- SPOJ (http://www.spoj.com/)
- Project Euler (https://projecteuler.net/)
- CodingBat (http://codingbat.com/)
- Codewars (http://www.codewars.com/)
- Codility (https://codility.com/)
- Codingame (https://www.codingame.com/)
- CoderByte (https://coderbyte.com/)
- CodeEval (https://www.codeeval.com/)
- UVA Online Judge (https://uva.onlinejudge.org/)
- CodeFights (https://codefights.com/)
- CheckiO (http://www.checkio.org/)
- Talentbuddy (http://talentbuddy.co/)
- PythonChallenge (http://pythonchallenge.com/)
- LintCode (http://www.lintcode.com/en/)
- Rosalind (http://rosalind.info/problems/locations/)
- CrowdANALYTIX (https://www.crowdanalytix.com/)
- SQL-EX.RU (http://sql-ex.ru/)
- Kattis (http://www.kattis.com/)
- CodeKata (http://codekata.com/)
- CodeAbbey (http://codeabbey.com/)
- FightCode (http://fightcodegame.com/)
- BeatMyCode (http://www.beatmycode.com/)
- TunedIT (http://tunedit.org/)
- MLComp (http://mlcomp.org/)
- HPC University (http://hpcuniversity.org/students/weeklyChallenge/)
- https://practiceit.cs.washington.edu/
16.Free Courses:
The University of Helsinki in Finland has launched a course on artificial intelligence (Elements of AI) that is completely free and open to everyone around the world. Unlike Carnegie Mellon’s new undergraduate degree in AI, which the institution created to train future experts in the field, Helsinki’s offering is more of a beginner course for those who want to know more about the subject.
Machine Learning
Artificial Intelligence
Machine Learning for Data Science and Analytics
Machine Learning
Machine Learning Unsupervised Learning
Data Science 21 courses
Certificates in Big Data, AI and ML
Free Google Machine Learning Courses
17. Books
Some important bookmarks I always refer to
1.All DS cheatsheets in one place
2.DS resources in one place
3.DS Notes
4.DS Tutorial
5.ML Tutorial
6.DS Glossary
7.DS Toolbox
8.Plotly Tutorial
9.DS Github repository
10.Github
11.Industry Machine Learning
12.Data Engineering Guide.
13.Learn Data Engineering
14.Google Dataset Search
15.Data Engineering Study Guide:
16.Regex
https://regex101.com/
17.Google OpenRefine
OpenRefine (formerly Google Refine) is a powerful tool for working with messy data: cleaning it; transforming it from one format into another; and extending it with web services and external data.
18.Some more ML Youtube Channels:
19. Homemade Machine Learning
20. Machine Learning Terms:
21.Awesome Data Science
22.Data Science Blogs
23.Data Science Specialization Courses
24.100 Page Machine Learning Book:
25.Data Camp Info Sheets
Hope this post is helpful to you guys. If you want to add anything please post in the comment section. Thank you for reading my post:-)
Passed the TensorFlow Exam:
I just passed the TensorFlow Developer Certificate exam and wanted to share the sources that helped me.
It is definitely interesting and worth preparing for. I think for experienced TensorFlow developers it might be simple and easy.
Sources I used:
The main ones are TensorFlow in Practice on Coursera or Intro to TensorFlow on Udacity. The other sources helped me gain a broader understanding. From a certification point of view, either the Udacity course or TensorFlow in Practice on Coursera is enough.
Only for certification: either Coursera or Udacity
- TensorFlow in Practice from Coursera
- Intro to TensorFlow — Udacity
Coursera: $49 per month.
Udacity Free Course — Similar to Coursera one.
Udacity — Github
Book:
Additional Sources :
Youtube Channel
Pycharm
Google Colab
TensorFlow Basics
TensorFlow 2.0 Cheatsheet
Gradient Descent explained step by step: My fav one
AWS Machine Learning Specialty
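Since gradient descent comes up in the sources above, here is a minimal sketch of the idea in plain Python. The objective function, learning rate, and step count are arbitrary choices for illustration, not taken from any of the linked material.

```python
def gradient_descent(grad, x0, learning_rate=0.1, steps=100):
    """Repeatedly step against the gradient, starting from x0."""
    x = x0
    for _ in range(steps):
        x = x - learning_rate * grad(x)
    return x


# Toy objective f(x) = (x - 3)^2, whose gradient is 2 * (x - 3).
# The minimum is at x = 3.
x_min = gradient_descent(lambda x: 2 * (x - 3), x0=0.0)
print(round(x_min, 4))
```

TensorFlow optimizers apply the same kind of update to millions of parameters at once, with the gradients computed automatically.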
Any questions please email me at esenthil@hotmail.com
All the best. Again, the objective is to learn TensorFlow, not just to get the certification.
If you like the post, hit the clap button, and not just once 😝