We hope you had a great year. In the last Monthly Edition for 2018, we bring you Our Most Read Posts of the Year from Towards Data Science. Whether you have recently joined us or have been reading our articles for quite some time, we bring you our latest Editor’s Picks to:
- Help you de-mystify key concepts and learn more about data science
- Encourage you to publish with us or join our data science community
- Help you apply data science at your workplace, at your next hackathon or at university
James Le launches 2018 with an overview of the top 10 machine learning algorithms for beginners. In February, Niklas Donges takes us on a journey through random forests and decision trees, and Susan Li demonstrates how to use the Scikit-Learn library in Python for multi-class text classification. In March, Jonny Brooks-Bartlett discusses ‘Reality vs Expectation’ in reference to why so many data scientists are leaving their jobs.
Eugenio Culurciello brings us the rise and fall of Recurrent Neural Networks in April. In May, James Loy earned a retweet from Hilary Mason with an excellent introduction to building a neural network using Python, and Simon Greenman’s article compares the best AI chips and cloud platform solutions for AI. Lastly, in July, Michael Galarnyk’s article on how to build a great data science portfolio inspired me and many in the data science community.
With four weeks remaining in 2018, we hope these articles truly ‘pop’ like a bottle of champagne as you welcome the New Year, wherever you are in the world. We ask that you read these articles again or for the first time. Be sure to bookmark them for future reference and share them with friends and colleagues. Sharing is caring!
Wendy Wong, TDS Editor.
Here’s why so many data scientists are leaving their jobs
By Jonny Brooks-Bartlett – 8 min read
Yes, I am a data scientist and yes, you did read the title correctly, but someone had to say it. We read so many stories about data science being the sexiest job of the 21st century and the attractive sums of money that you can make as a data scientist that it can seem like the absolute dream job.
A Tour of The Top 10 Algorithms for Machine Learning Newbies
By James Le – 11 min read
In machine learning, there’s something called the "No Free Lunch" theorem. In a nutshell, it states that no one algorithm works best for every problem, and it’s especially relevant for supervised learning (i.e. predictive modeling).
The 5 Clustering Algorithms Data Scientists Need to Know
By George Seif – 11 min read
Clustering is a Machine Learning technique that involves the grouping of data points. Given a set of data points, we can use a clustering algorithm to classify each data point into a specific group.
How to build your own Neural Network from scratch in Python
By James Loy – 6 min read
As part of my personal journey to gain a better understanding of Deep Learning, I’ve decided to build a Neural Network from scratch without a deep learning library like TensorFlow. I believe that understanding the inner workings of a Neural Network is important to any aspiring Data Scientist.
The Random Forest Algorithm
By Niklas Donges – 8 min read
Random Forest is a flexible, easy-to-use machine learning algorithm that produces, even without hyper-parameter tuning, a great result most of the time. It is also one of the most used algorithms, because of its simplicity and the fact that it can be used for both classification and regression tasks.
The fall of RNN / LSTM
By Eugenio Culurciello – 9 min read
We fell for Recurrent Neural Networks (RNN), Long Short-Term Memory (LSTM), and all their variants. Now it is time to drop them!
How to Build a Data Science Portfolio
By Michael Galarnyk – 17 min read
How do you get a job in data science? Knowing enough statistics, machine learning, programming, etc to be able to get a job is difficult. One thing I have found lately is quite a few people may have the required skills to get a job, but no portfolio.
Multi-Class Text Classification with Scikit-Learn
By Susan Li – 11 min read
There are lots of applications of text classification in the commercial world. For example, news stories are typically organized by topics; content or products are often tagged by categories; users can be classified into cohorts based on how they talk about a product or brand online …
Hyperparameter Tuning the Random Forest in Python
By William Koehrsen – 12 min read
So we’ve built a random forest model to solve our machine learning problem (perhaps by following this end-to-end guide) but we’re not too impressed by the results. What are our options?
We also thank all the great new writers who joined us recently: Valentino Constantinou, Jason Shuo Zhang, Oscar Knagg, Elena Nisioti, John Braunlin, John Hartquist, Melanie Tsang, Adam Radziszewski, Aidan Morrison, Romain Beaumont, George Liu, Bassim Eledath, Claire Longo, Sarah Wolf, Pankaj Mathur, Sabyasachi Sahoo, Ruan van der Merwe, Ira Cohen, MJ Bahmani, Uri Merhav, Amit Rathi, James Liang, Robert Sandor, Chris Bow, and many others. We invite you to take a look at their profiles and check out their work.