
Data and Artificial Intelligence continue to steal the limelight in LinkedIn’s 2020 Emerging Jobs Report, which ranks Artificial Intelligence Specialist and Data Scientist as the No. 1 and No. 3 emerging jobs in the U.S., respectively.
With companies realizing the value of data beyond the hype, we can expect Data Science and AI job postings and salaries to keep rising in 2021.
Regardless of background or skill level, data science professionals and enthusiasts alike need to keep sharpening the saw. This post collates some of the most helpful books you can read to increase your data science proficiency.
Disclaimer: There are no affiliate links in this post. This post is for information purposes only.
Data Science Appreciation
These books are meant for readers with no background in Data Science. They are also well suited to Business Leaders and Managers looking to apply Data Science concepts in their workplace. They provide a high-level view of the Data Science process and some of its many applications in business.
1. The Art of Data Science – A Guide for Anyone Who Works With Data
By Roger D. Peng and Elizabeth Matsui
This book provides an excellent overview of the data analysis workflow. It also articulates well how, despite the many tools available, data analysis is fundamentally an art: an iterative process in which something is learned at every step.

2. Predictive Analytics – The Power to Predict Who Will Click, Buy, Lie, or Die
By Eric Siegel
This book offers a comprehensive yet accessible resource for anyone who wants to learn how predictive analytics works, dissecting real-life applications in mortgage risk, terrorism, crime prediction, and politics, to name a few.

3. Data Science for Business – What You Need to Know about Data Mining and Data-Analytic Thinking
By Foster Provost and Tom Fawcett
This is a must-read book for business people who want to have a better understanding of how data science can be used to achieve a competitive advantage.

4. Data Smart – Using Data Science to Transform Information into Insight
By John Foreman
What’s interesting about this book is that it teaches data science concepts using none other than Microsoft Excel. All in all, the book is a good illustration of how data science is inherently tool-agnostic.
Whatever language, platform, or software you do your data science on, the fundamentals and math behind the algorithms remain the same.

Math and Statistics
Who says grasping numerical concepts can’t be light and entertaining? Some of these math and statistics books are geared to give you a less intimidating introduction to many of the key concepts required to use data science in business.
5. Naked Statistics – Stripping the Dread from the Data
By Charles Wheelan
Statistics can sometimes be a daunting topic to dive into. Not only that, focusing on the details sometimes obscures the intuition behind the metrics we use at work. In this book, author Charles Wheelan clarifies key concepts like inference, correlation, and regression analysis in a fun and less dreadful way.

6. Practical Statistics for Data Scientists – 50+ Essential Concepts Using R and Python
By Peter Bruce, Andrew Bruce, and Peter Gedeck
This is a practical, high-level guide that familiarizes you with the statistical methods used by Data Scientists. While it does not explain the mathematical concepts in depth, it is an excellent reference from which you can continue learning statistics elsewhere.

7. The Art of Statistics – How to Learn from Data
By David Spiegelhalter
Written by the renowned statistician David Spiegelhalter, The Art of Statistics shows how we can derive insights from raw data and how we can approach a variety of problems using statistics.

Data Visualization and Storytelling
A key aspect of the data science process is data visualization. Many might settle for bland matplotlib and maybe fancy some seaborn plots every once in a while, but these books will tell you there’s indeed a proper way to do data visualization. Getting the execution scripts right is one thing, but designing charts and dashboards to get the right insights out is another thing.
8. Storytelling with Data – A Data Visualization Guide for Business Professionals
By Cole Nussbaumer Knaflic
This is a must-read book for anyone who wants to get better at presenting information in a clear, concise, and graphical way. This book teaches you the fundamentals of data visualization and how to effectively communicate with data, complete with numerous real-world examples.

9. Fundamentals of Data Visualization – A Primer on Making Informative and Compelling Figures
By Claus O. Wilke
This book presents the basic principles of data visualization alongside contrasting good and bad examples. It can help you understand the rationale behind an effective visualization and teach you to design more meaningful plots that get the right message across.

10. Good Charts – The HBR Guide to Making Smarter, More Persuasive Data Visualizations
By Scott Berinato
This book draws insights from research in visual perception and neuroscience and attempts to explore how people perceive good and bad charts differently. It teaches frameworks on how to make persuasive visualizations along with case studies to illustrate them.

11. #MakeoverMonday – Improving How We Visualize and Analyze Data, One Chart at a Time
By Andy Kriebel and Eva Murray
This book is an extension of the #MakeoverMonday project, where members of the data visualization community share their improved takes on existing charts and data. It emphasizes that while there’s variability in designing visualizations, there are key techniques you can follow to make sure your chart makes an impact.

Machine Learning
If you are ready to get your feet wet making predictions with data, these books will give you an in-depth exposition of Machine Learning concepts with practical applications and hands-on examples.
12. Introduction to Machine Learning with Python
By Andreas C. Müller and Sarah Guido
This book is an excellent resource that can get you up to speed with the basics of the most widely used machine learning algorithms, including techniques on how to process data, advanced methods for model evaluation and parameter tuning, and principles on creating your modeling workflow. It is beginner-friendly with no assumption that the reader has a heavy programming background. Not to mention, the accompanying GitHub repository is undeniably useful for learning.

13. The Hundred-Page Machine Learning Book
By Andriy Burkov
It is a condensed resource for machine learning concepts, perfect as a go-to handbook for managers or software developers looking to integrate ML pipelines into their projects.

14. Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow
By Aurélien Géron
Another one of those O’Reilly books that offer a practical guide to learning ML coupled with clear conceptual explanations and code implementations. It helps you build a solid understanding of machine learning through a variety of hands-on exercises implemented with Scikit-Learn and TensorFlow.

15. AI and Machine Learning for Coders — A Programmer’s Guide to Artificial Intelligence
By Laurence Moroney
A must-have book for programmers breaking into the Artificial Intelligence field or for anyone with a strong technical background who is looking to apply AI in their projects. Working primarily in TensorFlow, author Laurence Moroney walks you through common AI and ML concepts as applied in computer vision, natural language processing, and sequence modeling, to name a few.

16. The Elements of Statistical Learning – Data Mining, Inference, and Prediction
By Trevor Hastie, Robert Tibshirani, and Jerome Friedman
Probably one of the more academic-looking books on this list. However, we can’t deny the immense knowledge contained in this book. It is a valuable resource for statisticians or anyone interested in data mining.
It is sufficiently technical and can serve as a lasting reference that you should definitely keep on your shelf.

Deep Learning
Deep Learning is perhaps the hottest aspect of data science nowadays. This subset of machine learning is responsible for many of the high-profile applications we see today, from self-driving cars and deepfakes to image recognition. The following books are excellent resources to get you started on this topic.
17. Deep Learning with Python
By François Chollet
Written by the creator of Keras, Deep Learning with Python helps you to build an understanding of deep learning from scratch. It contains detailed examples with practical recommendations and high-level explanations to allow any beginner to start their deep learning project.

18. Foundations of Deep Reinforcement Learning – Theory and Practice in Python
By Laura Graesser and Wah Loon Keng
A rather advanced textbook that explores Deep Reinforcement Learning, where artificial agents learn to solve sequential decision-making problems. A well-written book for anyone who has a working knowledge of machine learning and wants to solve problems using Deep RL.

19. Deep Learning Illustrated – A Visual, Interactive Guide to Artificial Intelligence
By Jon Krohn, Grant Beyleveld, and Aglaé Bassens
This is a practical reference that can help you build your intuition for deep learning algorithms. In this visual, interactive guide, you will learn the theory together with examples you can run in the accompanying Jupyter notebooks.

Programming
This section is an exception to the title.
While these books originally came from the software engineering field and were written with examples from languages other than Python and R, concepts here are universal and can be used to level up your programming proficiency.
Many Data Scientists come from non-tech backgrounds. Hence, it is not uncommon to see messy code when reviewing ML notebooks. The remaining two books to complete this list are classic references used by many programmers to rethink and improve the way they code.
20. The Pragmatic Programmer — Your Journey To Mastery
By David Thomas and Andrew Hunt
This is a timeless book that "examines the very essence of software development, independent of any particular language, framework, or methodology". Not only does it discuss techniques to keep your code adaptable and easy to reuse, but it also explores topics on personal responsibility and career development.

21. Clean Code – A Handbook of Agile Software Craftsmanship
By Robert C. Martin
This book explains the principles and best practices of writing clean code, illustrated with several case studies. Writing clean code is especially important for data professionals working in a collaborative setting, and it is a skill that can prepare you and your team to produce better data products.

Other interesting data science books you might like:
- Everybody Lies – Big Data, New Data, and What the Internet Can Tell Us About Who We Really Are by Seth Stephens-Davidowitz
- Big Data – A Revolution That Will Transform How We Live, Work, and Think by Viktor Mayer-Schönberger and Kenneth Cukier
- How Charts Lie – Getting Smarter About Visual Information by Alberto Cairo
- Calling Bullshit – The Art of Skepticism in a Data-Driven World by Carl T. Bergstrom and Jevin D. West
- Stories That Stick – How Storytelling Can Captivate Customers, Influence Audiences, and Transform Your Business by Kindra Hall
- The Book of Why – The New Science of Cause and Effect by Judea Pearl and Dana Mackenzie
- Deep Learning by Ian Goodfellow, Yoshua Bengio, and Aaron Courville
- Design Patterns – Elements of Reusable Object-Oriented Software by Erich Gamma, Richard Helm, Ralph Johnson, and John Vlissides
Are there other books that you think should be on this list? 📚
If you liked this, say 👋 and follow me on Twitter!