How I’d Learn Machine Learning Again, After 6 Years

Some rough guidelines

As of this month, I have 6 years of experience with machine learning.

My first contact with machine learning was classic machine learning (think support vector machines, k-means). Back then, I found the topic rather boring: too much theory, and nothing I could build with it. That changed as I took more and more courses on machine learning.

Photo by Arian Darvishi on Unsplash

I remember sitting in a lecture on natural language processing (NLP) with neural networks. The great instructor showed us graphs of the parameter counts of the latest NLP models: they ranged into the billions!

He then said: Oh, the graph’s already outdated. Models are larger now.

Wow! What a time to witness such advancements; from millions to billions to … trillions of parameters. Somewhere around that time, I got hooked on all things deep learning, taking all available classes on AI, plus some online courses (Coursera, MIT Deep Learning lectures).

The more I learned about machine learning and deep learning, the more I realized: there will always be more to discover. 6 years later, I might know slightly more than my past machine-learning self, but I am still an absolute beginner.

Take the most recent example where I again realized this: Batch Normalization.

Batch normalization is a widely used technique to address covariate shift within a neural network. In other words, it adapts the inputs to a network layer via normalization.

Or so I thought all these years! It turns out I was completely wrong: in a NeurIPS 2018 paper, researchers showed that BN does not reduce covariate shift but rather stabilizes gradients during training – a completely different thing! And this is a paper from 2018! All those years, the information lay there, within easy reach. And only two weeks ago – years later! – did I stumble upon it. Amazing.
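For reference, the mechanics themselves are unaffected by that reinterpretation. Here is a minimal sketch of BN's training-mode forward pass in plain NumPy (the variable names are my own, not from the paper):

```python
import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    # Normalize each feature over the batch, then apply a learned
    # scale (gamma) and shift (beta). Training-mode forward pass only;
    # at inference time, running averages would replace the batch stats.
    mean = x.mean(axis=0)                      # per-feature batch mean
    var = x.var(axis=0)                        # per-feature batch variance
    x_hat = (x - mean) / np.sqrt(var + eps)    # normalized activations
    return gamma * x_hat + beta

# Toy example: a batch of 4 samples with 3 features each
x = np.random.randn(4, 3)
out = batch_norm(x, gamma=np.ones(3), beta=np.zeros(3))
```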

In a sense, one is always beginning, re-adjusting, correcting one's inductive biases.

And that's still true today! Even though interest in AI has skyrocketed (even traditional manufacturing companies like Bosch now have dedicated AI research institutions), one can absolutely learn ML today. If you are interested in learning machine learning, go pursue it. You can probably do it on the side, if necessary or desired. The internet has democratized the learning of any topic, and ML is no exception.

After getting that mental blocker out of the way, we can focus on learning machine learning. Thinking back over my past years with ML, I came up with a set of rough guidelines. If you want a more detailed list, head over to these two articles.

1. Pursue your own coding projects

Do you want to classify images? Do so. Generate music? Do so. Translate text? Do so. There are a great many courses, tutorials, and Jupyter/Colab notebooks to quickly get into a desired topic. If you are motivated by your own project, you'll learn much faster – and enjoy the process.

The beauty of starting today (as opposed to 6 years ago or earlier) is that there’s a wealth of resources available. Platforms like Kaggle, GitHub, and Medium offer datasets, code, and step-by-step tutorials that can help you learn and implement new projects in a hands-on fashion. On Kaggle and Colab, you even get free access to fast GPUs for your machine learning experiments. They make it easy to play with different techniques, offering interactive environments to quickly test and refine your code projects.
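To illustrate how low the barrier to entry is, here is a minimal image-classification starter, assuming TensorFlow/Keras is installed (it comes preinstalled on Colab). This is one of many possible entry points, not a prescription:

```python
import tensorflow as tf

# Load the MNIST digits and scale pixel values to [0, 1]
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

# A small fully connected classifier: enough to get a first result
model = tf.keras.Sequential([
    tf.keras.Input(shape=(28, 28)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(x_train, y_train, epochs=3)
model.evaluate(x_test, y_test)
```

From here, swapping in a convolutional architecture or a different dataset is a natural next experiment.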

2. Embrace deep work for tackling challenging problems

This is a concept that I came across relatively late in my journey. The term was coined by Cal Newport, a professor of computer science. He argues, and I am with him here, that to make progress on challenging things, you need uninterrupted periods of hard concentration.

This approach to mental challenges fits machine learning perfectly: reading papers is hard. Coding is hard. No room for interruptions, but much space to grow.

Three consecutive hours of uninterrupted concentration on a single topic per day seem like a good starting point.

3. Regularly read papers

There's controversy about how many papers a beginner should read, if any. I suggest: yes, read some.

Here’s why: A paper is a self-contained lecture on a very narrow topic. If it is from a top-tier ML conference, it has to be crisp and concise in writing, and it needs to explain the research very well. Will reading it be easy? No, definitely not, but the learning curve is steep.

For beginners, I recommend the following to start reading papers: First, select a research field you are personally interested in (see point 1). Then, read a single survey of this field. After this survey, you will have a very rough map of the field. Next, pick ten or so papers from the field (actual papers, not surveys) from top-tier ML conferences (ICLR, ICML, ECML, NeurIPS, CVPR, AISTATS, …); they need not be recent. To find papers that are freely available (i.e., not behind a paywall/subscription), you can use arXiv (type a title, topic, or search phrase into the search box) or a web search of your choice.
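If you prefer to script that search, here is a minimal sketch against arXiv's public Atom API, using only the Python standard library (the query string is just an example; substitute your own field):

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

query = "all:batch normalization"          # any title, topic, or phrase
url = ("http://export.arxiv.org/api/query?"
       f"search_query={urllib.parse.quote(query)}&start=0&max_results=10")

# The API returns an Atom XML feed; parse out title and link per entry
with urllib.request.urlopen(url) as response:
    feed = ET.fromstring(response.read())

ns = {"atom": "http://www.w3.org/2005/Atom"}
for entry in feed.findall("atom:entry", ns):
    title = entry.find("atom:title", ns).text.strip()
    link = entry.find("atom:id", ns).text.strip()
    print(title, "->", link)
```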

For the large conferences mentioned, papers are commonly handled through OpenReview or the proceedings. Thus, searching for, e.g., "NeurIPS proceedings 2017" will guide you to a webpage with all accepted papers of the 2017 conference edition. By repeating this search for other conferences, you can quickly build a list of publications. (For a more expanded discussion, read my follow-up article.)

After creating your list, read through it in any order.

Expect it to initially take a couple of hours per paper, but after ~20 publications from the same field (e.g., all NLP), your read-through times will shorten drastically. For a slightly expanded discussion of this topic, refer to this article.

4. Re-implement (parts of) papers

After dedicating time to your own projects, the next step often comes naturally: the desire to improve what you’ve built. This is where next-level learning truly begins! By now, your hands-on experience has given you a solid foundation. At the same time, reading papers introduced you to many different methods.

The next step is combining existing research with your own projects. Start by selecting a paper that directly relates to your coding project. If you're working on image classification, you can read papers on transfer learning, attention mechanisms, or the latest advancements in convolutional neural networks (CNNs). Similarly, if your project involves natural language processing, you could delve into works on transformers, fine-tuning pre-trained models, or multilingual embedding techniques.
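As a concrete example of re-implementing a part of a paper: the scaled dot-product attention at the heart of the Transformer paper ("Attention Is All You Need", Vaswani et al., 2017) fits in a few lines. A minimal NumPy sketch, without masking or multiple heads:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    # Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                  # query-key similarities
    scores -= scores.max(axis=-1, keepdims=True)     # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)   # row-wise softmax
    return weights @ V                               # weighted sum of values

# Toy example: 4 tokens with dimension 8
Q = K = V = np.random.randn(4, 8)
out = scaled_dot_product_attention(Q, K, V)          # shape (4, 8)
```

Checking your output against an established implementation (e.g., a deep learning framework's attention layer) is a good way to verify the re-implementation.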

5. Embrace generality over specialization

The beginner time is the best time to learn broadly: you have complete freedom to choose from any subfield. After you have completed a project from one field, switch to the next. Yes, you will be a beginner again, but that happens to anybody switching ML fields. By doing projects from various research fields, you build a broader, more stable foundation for your future research.

For your later progress, having experience with many ML fields will actually be beneficial: It allows you to pick methods from one field and apply them to problems from another field.

6. Briefly study classic machine learning

Remember how I wrote in my introduction that classic ML did not capture my interest? That has changed over the years: I have read research papers that combine classic approaches such as k-means clustering with the power of deep neural networks. Such hybrid approaches show that foundational methods are still important in today's deep learning era.
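As an illustrative sketch of such a hybrid (my own toy example, not taken from a specific paper), you can run classic k-means on top of deep embeddings, e.g., to obtain pseudo-labels:

```python
import numpy as np
from sklearn.cluster import KMeans

# Stand-in for deep features: in a real pipeline, these 128-d vectors
# would come from a pretrained network's penultimate layer.
embeddings = np.random.randn(1000, 128)

# Classic k-means on top of the (here: simulated) deep embeddings
kmeans = KMeans(n_clusters=10, n_init=10, random_state=0).fit(embeddings)
pseudo_labels = kmeans.labels_   # e.g., usable as self-supervised targets
print(pseudo_labels[:20])
```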

Briefly studying classic machine learning helps you understand the trajectory of the AI field, from its humble beginnings in statistics and basic algorithms to today's multi-modal foundation models trained in a distributed fashion across thousands of GPUs.

Be a cook and mix the fields yourself.

