Don’t Get Lost in the Deep

Machine learning is rarely about deep learning. And that’s okay.

Ravi Malde
Towards Data Science

--

Let’s start with why I’m writing this article. As someone who is early on in their data science journey, it’s easy to lose the forest for the trees. And no, that’s not a pun about decision trees and random forests. What I mean is that there is so much to learn and so much to be excited about, but what we should all really be focusing on is a detailed understanding of the basics. Deep learning is an incredible step forward in the machine learning domain. It’s definitely an area that piques many people’s interest, even bringing them into this field of study in the first place, but the truth is that few companies actually apply these techniques. More often than not, linear models do just fine at solving business problems. My point is that you are of more value to a company if you have a solid understanding of the fundamental machine learning algorithms than if you spread yourself too thin chasing the novel tools.

What is machine learning?

I thought I’d add this section here as there are so many definitions that it’s easy to become confused about what is and isn’t machine learning.

Machine learning is an application of artificial intelligence (AI) that provides systems the ability to automatically learn and improve from experience without being explicitly programmed. The primary aim is to allow the computers learn automatically without human intervention or assistance and adjust actions accordingly — Source

Many definitions can get rather wordy, but I think the quote above sums it up quite nicely. Machine learning is a subset of AI that aims to give a system the ability to learn from experience (i.e. learn from data) without being explicitly programmed to behave in a particular way.

Now this all sounds rather sexy, but a linear regression model does exactly this and therefore it belongs firmly in the machine learning domain. For many, this can be somewhat discouraging.

We’ve all seen Excel fit a line of best fit to some data, therefore do you mean to tell me Excel is applying machine learning techniques?

Well, yes, actually it is. But that’s fine. Linear regression has a reputation for being basic and bland, but it is an exceptionally powerful tool, and one that is severely undervalued, especially by people relatively new to the field (myself included, before giving this article some thought). I’m using linear regression as the example throughout this article, but many of the other fundamental algorithms carry a similar reputation.
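To see just how little machinery is involved, here is a minimal sketch that fits a line of best fit the same way Excel does, via ordinary least squares (the numbers are made up for illustration):

```python
import numpy as np

# Toy data: five observations of a roughly linear relationship
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 4.3, 5.9, 8.2, 9.8])

# Ordinary least squares: add an intercept column, then solve for the
# coefficients that minimise the squared residuals
X = np.column_stack([np.ones_like(x), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

intercept, slope = beta
print(f"line of best fit: y = {intercept:.2f} + {slope:.2f}x")
```

The model ‘learns’ its two parameters from the data, which is exactly the behaviour the definition above describes.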

A look at the trends…

Since this is a data science article, let’s take a look at some actual data. The plot below shows data taken from Google Trends: the relative interest (volume of Google searches) in machine learning, deep learning and linear regression over the last 15 years.
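If you want to pull the same data yourself, here is a rough sketch using the unofficial pytrends library (the library and query settings are my assumption; the article doesn’t say how the plot was made):

```python
import matplotlib.pyplot as plt
from pytrends.request import TrendReq

terms = ['machine learning', 'deep learning', 'linear regression']

pytrends = TrendReq(hl='en-US')
pytrends.build_payload(terms, timeframe='all')  # 'all' covers 2004 onwards

interest = pytrends.interest_over_time()        # DataFrame indexed by date
interest[terms].plot(title='Relative search interest (Google Trends)')
plt.ylabel('Interest (0 to 100)')
plt.show()
```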

Interest in linear regression has remained fairly consistent over this period, which makes sense given that the method dates back to the early 19th century, credited to either Carl Friedrich Gauss or Adrien-Marie Legendre (more can be read about its controversial discovery here). Deep learning, on the other hand, has only really been popularised over the last five years.

What’s given me food for thought is that despite deep learning being used far less in industry, it’s searched for online significantly more often. This is understandable given the tremendous amount of hype around the topic at the moment, but I’m going to break a data science rule here and suggest a causal explanation.

People are focusing on the wrong things. By all means, it’s okay to have an interest in the most exciting and advanced machine learning techniques. In fact, that’s a great thing as it means you’re broadening your knowledge and demonstrating a keen interest in topics at the forefront of your domain. But this is only okay as long as we are not overlooking the basic algorithms that are the foundations of the data science field.

So who is and who isn’t using deep learning?

A deep learning model is a large neural network with multiple hidden layers. The problem with deep neural nets is that they require huge amounts of data before they gain an advantage over traditional machine learning algorithms.

This plot presents the general performance trends we see when comparing machine learning algorithms on datasets of varying size. Traditional methods tend to outperform neural networks when the datasets are on the smaller side, and this is where the problem lies.
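The plot itself isn’t reproduced here, but the effect is easy to demonstrate. Below is a toy sketch (my own synthetic setup, not the data behind the plot) comparing a plain linear model against a small neural network as the training set grows:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor

rng = np.random.RandomState(0)

def make_data(n):
    # A noisy, mostly linear relationship: the kind of problem
    # a simple model handles comfortably
    X = rng.uniform(-3, 3, size=(n, 5))
    y = X @ np.array([1.5, -2.0, 0.5, 0.0, 3.0]) + rng.normal(0, 1.0, n)
    return train_test_split(X, y, test_size=0.3, random_state=0)

for n in [50, 500, 5000]:
    X_tr, X_te, y_tr, y_te = make_data(n)
    lin = LinearRegression().fit(X_tr, y_tr)
    net = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=2000,
                       random_state=0).fit(X_tr, y_tr)
    print(f"n={n:5d}  linear R^2={lin.score(X_te, y_te):.3f}  "
          f"neural net R^2={net.score(X_te, y_te):.3f}")
```

On the smallest sample the linear model typically wins comfortably; the gap narrows as the neural network gets more data to learn from.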

Few companies are sitting on enough data to actually reap the benefits of these brilliant algorithms. Those who are tend to be the tech behemoths — the likes of Amazon, Facebook and Google. The fact is that most companies are still in the early adoption phases of data science and are a long way off from having the resources (data, money and expertise) required for them to exploit the advantages of deep learning.

Moreover, businesses actually want the simplest solution to a complex problem. Even if a company had the resources for deep learning or any other convoluted algorithm, if the problem can be solved with a simpler model that has comparable performance, you bet they’ll go with that simpler model. Why is this?

Simple models are quicker to build, easier to implement, more interpretable and relatively painless to update. The importance of this cannot be overstated. The more straightforward an algorithm is, the more transparent it is. We should avoid implementing ‘black box’ models wherever possible, as opacity limits the scope of a model’s applications and ultimately reduces its value to us. By using uncomplicated models we can better investigate and comprehend the relationships that exist in the problem we are solving.
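As a concrete illustration of that transparency, here is a small sketch (the feature names and data are invented) showing how a fitted linear model can be read off directly, coefficient by coefficient:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical features for illustration only
features = ['ad_spend', 'store_visits', 'season_index']

rng = np.random.RandomState(1)
X = rng.normal(size=(200, 3))
y = 2.0 * X[:, 0] + 0.5 * X[:, 1] - 1.0 * X[:, 2] + rng.normal(0, 0.2, 200)

model = LinearRegression().fit(X, y)

# The entire fitted model is three coefficients and an intercept,
# each one a direct, inspectable statement about the relationship
for name, coef in zip(features, model.coef_):
    print(f"{name:>12}: {coef:+.2f} change in y per unit change")
```

Try extracting an equally plain explanation from a deep neural network and the appeal of the simpler model becomes obvious.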

I’ll finish off this section with a quote that summarises the essence of my point in a rather poetic way:

With sophisticated models there is an awful temptation to squeeze the lemon until it is dry and to present a picture of the future which through its very precision and verisimilitude carries conviction. Yet a man who uses an imaginary map, thinking it is a true one, is likely to be worse off than someone with no map at all; for he will fail to inquire whenever he can, to observe every detail on his way, and to search continuously with all his senses and all his intelligence for indications of where he should go — From Small is Beautiful by E. F. Schumacher

A final note…

Please don’t get me wrong, complex != bad. Sophisticated approaches definitely have their place and we are right to get excited about them. Let’s just make sure that the basics are covered first. Don’t dive into the deep end of the algorithmic swimming pool right away.

Furthermore, don’t be disheartened if you aren’t implementing the advanced techniques very often in the workplace. How many times on TV do we see lawyers arguing in front of a jury or doctors carrying out emergency life-saving surgery?

Very often. The issue is that these are dramatised representations of reality. Lawyers rarely do this in their day-to-day work, and doctors spend most of their time diagnosing the same routine medical conditions each and every day. The same applies to data scientists.

We read all the time about exciting developments in the field, which can make us think, ‘why does my work seem comparatively mundane?’ Most of our time is spent cleaning and preparing data, then using straightforward models to solve the problem at hand. It’s still an awesome job. Managing expectations is something we should practise in many aspects of our lives; otherwise we could be setting ourselves up for disappointment.

Thank you for reading this far! I’d love to hear your thoughts in the comments. If you’d like to reach out to me directly then feel free to leave me a message on LinkedIn.
