A Conversation about Deep Learning

Someone overheard two people talking about Deep Learning and let me know every little detail. One of them was totally against deep learning, and the other one was not only a defender of the field, but really knew the subject.

Favio Vázquez
Towards Data Science


For privacy I’m calling the first person Mr. A (“the hater”) and the second Ms. B (“the expert”).

This is Mr. A:

HI!

And this is Ms. B:

HI!

Mr. A: Hello B! How are you?

Ms. B: Just great, A! Great to see you. I wasn’t expecting to see you at a Deep Learning conference.

Mr. A: Yes, I came to see what these crazy people were talking about, and I was right. Deep Learning sucks. It’s just the same old Neural Nets I studied at university; now you add more layers and that’s it! Nothing interesting to see.

Ms. B: Wow, that’s a big statement. I’m sorry you see things like that. Actually, Deep Learning is just a brand for what’s going on in the AI world. You could call it “hype”, but there are very important developments there. Do you want to come to my computer? I’ll show you.

Mr. A: I don’t see the point, but well, I have no other place to be right now. Let’s go.

Ms. B: So, let’s start from the basics, what do you know about deep learning?

Mr. A: I’ve read that they are some algorithms that mimic the way our brain works; that’s where the name “neural nets” comes from, our brain. And they found a way of abstracting that into math, calculus and code. Then with it, the systems can learn like we do, or so they say, using these old school neural nets to solve the problems the system is presented with. Oh, and by the way, they have no clue how to interpret their models; it’s just another black box.

Ms. B: Ok. I think you may be reading some weird stuff about Deep Learning. I’ll start with the basics. In the context of Machine Learning, the word “learning” describes an automatic search process for better representations of the data you are analyzing and studying. Look at this example I found on the web:

Let’s say I ask you to draw a line that separates the blue circles from the green triangles in this plot:

What would you do?

Mr. A: What’s the definition of line again?

Ms. B: Let’s say it is a straight one-dimensional figure that has no thickness and extends infinitely in both directions.

Mr. A: Ok in that case this is an impossible task. I can’t put a line there.

Ms. B: And, what would you say if I present this picture now, and ask you to do the same:

Mr. A: Oh that’s very easy, just a line in the middle. But that is cheating, this is a different dataset. The line would go like this:

Ms. B: Great, but this is actually the same dataset I presented before.

Mr. A: What? Nope, the other one was like in a circular form, this is different.

Ms. B: I’ll explain it to you. In this case what I did was a coordinate transformation, so we can plot or represent this data in a way that lets us draw this line. The data is the same, but the representation is different. And it happens that in the new representation, drawing that line is a straightforward task.
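As an aside, here is a minimal sketch of the kind of coordinate change Ms. B describes. The original plots are not reproduced here, so the dataset below is an assumption: a hypothetical inner cluster and outer ring generated with NumPy, converted from Cartesian to polar coordinates. The names `make_rings`, `to_polar` and the threshold `1.5` are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_rings(n_per_class=100):
    """Hypothetical stand-in for the plot: inner points (class 0), outer ring (class 1)."""
    angles = rng.uniform(0.0, 2.0 * np.pi, size=2 * n_per_class)
    radii = np.concatenate([
        rng.uniform(0.0, 1.0, size=n_per_class),  # inner points
        rng.uniform(2.0, 3.0, size=n_per_class),  # outer ring
    ])
    points = np.column_stack([radii * np.cos(angles), radii * np.sin(angles)])
    labels = np.concatenate([
        np.zeros(n_per_class, dtype=int),
        np.ones(n_per_class, dtype=int),
    ])
    return points, labels

def to_polar(points):
    """Same data, new representation: (x, y) becomes (radius, angle)."""
    radius = np.hypot(points[:, 0], points[:, 1])
    angle = np.arctan2(points[:, 1], points[:, 0])
    return np.column_stack([radius, angle])

points, labels = make_rings()
polar = to_polar(points)

# In the (radius, angle) plane, the straight line "radius = 1.5" separates the classes.
predicted = (polar[:, 0] > 1.5).astype(int)
accuracy = (predicted == labels).mean()
print("accuracy with a straight line in polar coordinates:", accuracy)
```

No straight line separates the two classes in the (x, y) plane, but after the change of coordinates a single threshold on the radius does the job. Here the transformation was chosen by hand, which is exactly the step a learning system would have to discover on its own.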

Mr. A: Ok I see what you mean. But what does this have to do with Deep Learning? Or Machine Learning?

Ms. B: Deep Learning is a subset of Machine Learning, but a very specific one, inside Representation Learning. Deep Learning enables us to create a system, a program, that can search over different representations (in this case, a coordinate change) and measure how good each one is, for example by calculating the percentage of points being classified correctly. When we do that, we are doing Deep Learning and, of course, Machine Learning.

Mr. A: That sounds fancy. But what about the old fashioned neural nets we saw decades ago? Why are they so important now? And what is their role in Deep Learning?

Ms. B: That’s a very good question. We can define Deep Learning as representation learning using different kinds of neural networks, optimizing the parameters and hyperparameters of the net to get (learn) the best representation for our data. But why neural nets? They are very flexible, and they let us find useful representations of highly non-linear data, like the one I showed you before. The concept is not new, but the tools we use now to build them are very different from the past. They incorporate all the great advances in the AI and Machine Learning world.
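To make "learning a representation" a bit more concrete, here is a rough sketch under stated assumptions: a tiny one-hidden-layer network written in plain NumPy and trained with gradient descent on a hypothetical ring-shaped dataset. The layer size, learning rate and number of steps are arbitrary choices for illustration, not values from the conversation.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical toy data: class 0 near the origin, class 1 on a surrounding ring.
n = 200
is_outer = np.arange(n) >= n // 2
angles = rng.uniform(0.0, 2.0 * np.pi, size=n)
radii = np.where(is_outer,
                 rng.uniform(2.0, 3.0, size=n),
                 rng.uniform(0.0, 1.0, size=n))
X = np.column_stack([radii * np.cos(angles), radii * np.sin(angles)])
y = is_outer.astype(float).reshape(-1, 1)

# One hidden layer; its activations play the role of the learned representation.
hidden_units = 8
W1 = rng.normal(0.0, 0.5, size=(2, hidden_units))
b1 = np.zeros(hidden_units)
W2 = rng.normal(0.0, 0.5, size=(hidden_units, 1))
b2 = np.zeros(1)

def forward(inputs):
    hidden = np.tanh(inputs @ W1 + b1)           # new representation of the inputs
    probs = 1.0 / (1.0 + np.exp(-(hidden @ W2 + b2)))
    return hidden, probs

def cross_entropy(probs, targets):
    eps = 1e-12  # avoids log(0)
    return float(-np.mean(targets * np.log(probs + eps)
                          + (1.0 - targets) * np.log(1.0 - probs + eps)))

learning_rate = 0.1
losses = []
for step in range(2000):
    hidden, probs = forward(X)
    losses.append(cross_entropy(probs, y))

    # Backpropagation of the mean cross-entropy loss.
    d_logits = (probs - y) / n
    d_W2 = hidden.T @ d_logits
    d_b2 = d_logits.sum(axis=0)
    d_pre = (d_logits @ W2.T) * (1.0 - hidden ** 2)  # derivative of tanh
    d_W1 = X.T @ d_pre
    d_b1 = d_pre.sum(axis=0)

    # Gradient descent update.
    W1 -= learning_rate * d_W1
    b1 -= learning_rate * d_b1
    W2 -= learning_rate * d_W2
    b2 -= learning_rate * d_b2

_, probs = forward(X)
accuracy = float(((probs > 0.5) == (y > 0.5)).mean())
print("first loss:", losses[0], "last loss:", losses[-1], "accuracy:", accuracy)
```

Nobody tells the network which change of coordinates to use. The hidden layer adjusts its weights until the final layer can separate the classes with a simple linear rule on top of it.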

Mr. A: They are the same neural networks, but the only difference is the tools? Like programming languages?

Ms. B: In part that’s true. But there’s much more. Until the late 2000s, we were still missing a reliable way to train very deep neural networks. We had been working on the theory for years, but now, with the development of several important theoretical and algorithmic improvements, the advances in hardware, and the exponential generation and accumulation of data, Deep Learning came naturally to fill this gap and transform the way we do machine learning. Modern languages allowed us to program these theoretical parts, and with distributed computing and the power of big data, we managed to create this revolution.

Mr. A: That sounds very interesting. What are these theoretical advances and improvements you mention?

Ms. B: They come from years of investigation all around the world. Some from big Universities, or companies, but there are great contributions that came from the open source world too. Without all these fancy grants and stuff. I actually found a timeline that can help me explain this to you. Take a look:

You are right, this is an old theory. But that doesn’t mean it’s not relevant anymore. From there I can say the ideas of Back Propagation, better initialization of the parameters of the nets, better activation functions, the concept of Dropout, and some types of networks like Convolutional Neural Nets, Residual Nets, Region-Based CNNs, Recurrent Neural Networks and Generative Adversarial Networks, are some of the most important advances we have made in the Deep Learning world. I wish I had time to explain all of them to you, but sadly I have to run to the next conference!

Mr. A: Ok now I think I understand better what this “Deep Learning” thing is all about. You never told me the meaning of the words Deep Learning.

Ms. B: You are right! The Deep in Deep Learning isn’t a reference to any kind of deeper understanding achieved by the approach; rather, it stands for this idea of successive layers of representations. Like this:

And the learning is what I told you before: it’s an automatic search process for better representations of the data we have, with the goal of understanding patterns, making predictions and modeling the world around us.
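As an aside, the idea of "successive layers of representations" can be sketched in a few lines. This is only an illustration with untrained random weights; the layer sizes in `layer_sizes` and the choice of ReLU are assumptions, not something taken from the figure Ms. B shows.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

# Hypothetical architecture: 4 input features, then layers of 16, 8 and 2 units.
layer_sizes = [4, 16, 8, 2]
weights = [
    rng.normal(0.0, 0.1, size=(n_in, n_out))
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])
]

def representations(batch):
    """Return the input followed by the representation produced by each layer."""
    reps = [batch]
    for w in weights:
        reps.append(relu(reps[-1] @ w))  # each layer re-represents the previous output
    return reps

batch = rng.normal(size=(5, 4))  # 5 made-up samples with 4 features each
reps = representations(batch)
for depth, rep in enumerate(reps):
    print("depth", depth, "representation shape", rep.shape)
```

Each entry in `reps` is the same batch of samples described in a different way. The "deep" part is simply how many of these re-descriptions are stacked, and training is what makes the later ones useful.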

Mr. A: Thanks B! Before leaving me, can you please suggest some good reading about the subject?

Ms. B: Sure! I have a document with that, let me find it … here it is! Please write them down and lock the door after you exit. Great seeing you! Talk to you soon!


Data scientist, physicist and computer engineer. Love sharing ideas, thoughts and contributing to Open Source in Machine Learning and Deep Learning ;).