Deep Learning for the Masses (… and The Semantic Layer)

Deep learning is everywhere right now: in your watch, in your television, in your phone, and in some way in the platform you are using to read this article. Here I’ll talk about how you can start changing your business using Deep Learning in a very simple way. But first, you need to know about the Semantic Layer.

Favio Vázquez
Towards Data Science


Introduction to Deep Learning

The scope of this article is not to introduce Deep Learning; I’ve done that in other articles you can find here:

But if you want a taste of it, this is what I can say to you:

… deep learning is representation learning using different kinds of neural networks [deep neural networks] and optimizing the hyperparameters of the nets to get (learn) the best representation for our data.

If you want to know where that comes from read the articles above.

Deep Learning is not that hard

This is harder

Right now deep learning for organizations is not that hard. I’m not saying that Deep Learning as a whole is easy; research in the field requires a lot of knowledge of mathematics, calculus, statistics, machine learning, computing and more. You can see where Deep Learning comes from in this timeline I created a while ago:

From there I can say that the ideas of backpropagation, better initialization of the parameters of the nets, better activation functions, the concept of Dropout, and some types of networks like Convolutional Neural Nets, Residual Nets, Region-Based CNNs, Recurrent Neural Networks and Generative Adversarial Networks, are among the most important advances we have made in the Deep Learning world.

But how can you use Deep Learning right now?


Data comes first

Oh, you’d want to be them. Right? Or maybe not, IDK.

Well, do you want to know the secret? The secret sauce that the big technology companies use? It’s not only Deep Learning (maybe it’s not Deep Learning at all).

I’m not going to give a full speech here, but it all starts with the data. As you can imagine, data is an important asset (maybe the most important one) for companies right now. So before you can apply machine learning or deep learning at all, you need to have it, know what you have, understand it, govern it, clean it, analyze it, standardize it (maybe more), and only then can you think of using it.

Taken from Brian Godsey’s amazing article:

In whichever form, data is now ubiquitous, and rather than being merely a tool that analysts might use to draw conclusions, it has become a purpose of its own. Companies now seem to collect data as an end, not a means, though many of them claim to be planning to use the data in the future. Independent of other defining characteristics of the Information Age, data has gained its own role, its own organizations, and its own value.

So you can see that it’s not only about applying the latest algorithms to your data; it’s about being able to have it in a good format and understand it before using it.

The Semantic Layer

These layers mean something. Ok, but this is not the semantic layer. Keep reading.

This may be unusual to hear from me, but I’ve been investigating a lot, and working with several companies, and they all seem to have the same problem: their data.

Data availability, data quality, data ingestion, data integration and more are common problems that will affect not only the data science practice, but the organization as a whole.

There are ways to clean your data and prepare it for machine learning, and there are great tools and methodologies for that; you can read more here:

But that assumes that you have a process for ingesting and integrating your data. Right now there are great tools for AutoML, and I’ve talked about that before:

And other commercial tools like DataRobot:

But what about automatic ingestion and integration?

That’s one of the amazing benefits of the semantic layer. But what on earth is the semantic layer?

The word semantic itself implies meaning or understanding. As such, the semantic layer is concerned with the meaning of data, not its structure.

Understanding is a very important process that I’ve talked about before:

Here I mention (from Lex Fridman) that:

Understanding is the ability to turn complex information into simple, useful information.

When we are understanding, we are decoding the parts that form this complex thing, and transforming the raw data we got in the beginning into something useful and simple to see. We do this by modeling. And as you can imagine, we need such models to understand the meaning of data.

Linked Data and the Knowledge Graph

The first thing we need to do is Link Data. The goal of Linked Data is to publish structured data in such a way that it can be easily consumed and combined with other Linked Data.

Linked Data is the new de facto standard for data publication and interoperability on the Web and is moving into enterprises as well. Big players such as Google, Facebook, Amazon and Microsoft have already adopted some of the principles behind it.
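To make this concrete, here is a minimal, hypothetical sketch of what publishing a few linked statements can look like in Python using the rdflib library and the FOAF vocabulary. The namespace, names and the "a_colleague" resource are made up for illustration; they are not tied to any of the companies mentioned above.

```python
# A minimal sketch of Linked Data: facts as subject-predicate-object triples
# that anyone can consume and combine with other published triples.
from rdflib import Graph, Literal, Namespace, RDF
from rdflib.namespace import FOAF

EX = Namespace("http://example.org/people/")  # hypothetical namespace

g = Graph()
favio = EX["favio"]

g.add((favio, RDF.type, FOAF.Person))
g.add((favio, FOAF.name, Literal("Favio Vázquez")))
g.add((favio, FOAF.knows, EX["a_colleague"]))

# Serialize to Turtle, a common format for publishing Linked Data
print(g.serialize(format="turtle"))
```

Because every resource is identified by a URI, triples like these can be merged with triples published elsewhere without agreeing on a shared table schema first.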

The process of linking data is the beginning of something called the Knowledge Graph. A knowledge graph is an advanced way to map all knowledge on a particular topic, filling the gaps in how the data is related, or the wormholes inside of databases.

The knowledge graph consists of integrated collections of data and information that also contain huge numbers of links between different data.

The key here is that instead of looking for possible answers, under this new model we’re seeking an answer. We want the facts — where those facts come from is less important.

The data here can represent concepts, objects, things, people and actually whatever you have in mind. The graph fills in the relationships, the connections between the concepts.

Here’s an amazing introduction to the knowledge graph by Google from 6 years ago (yep 6 years):

What does the Knowledge Graph mean for you and your company?

In the old-fashioned way, the data model in the data warehouse, while an awesome achievement, cannot absorb the huge amount of data that is coming at us. The process of creating relational data models just can’t keep up. In addition, the extracts of data that are used to power data discovery are also too small.

Data Lakes, based on Hadoop or cloud storage, have therefore proliferated into data swamps, without the required management and governance capabilities.

https://timoelliott.com/blog/2014/12/from-data-lakes-to-data-swamps.html

Have you asked your data engineers and scientists if they understand all the data your organization has? Do it.

It’s also extremely hard to analyze all the data you have and understand the relationships behind it.

Because they are graphs, knowledge graphs are more intuitive. People don’t think in tables, but they do immediately understand graphs. When you draw the structure of a knowledge graph on a whiteboard, it is obvious what it means to most people.

Knowledge graphs also allow you to create structures for the relationships in the graph. You can tell a graph that parents have children, that parents can be children, that children can be brothers or sisters, and that all of these are people. Providing such descriptive information allows new information to be inferred from the graph, such as the fact that if two people have the same parents, they must be siblings.
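Here is a toy sketch of that inference idea in plain Python, not tied to any particular graph product; the family facts and predicate names are made up for illustration.

```python
# Inferring new "sibling_of" facts from "parent_of" facts in a tiny knowledge graph.
from itertools import combinations
from collections import defaultdict

# Edges of the graph as (subject, predicate, object) triples (illustrative data)
triples = [
    ("Ana", "parent_of", "Carlos"),
    ("Ana", "parent_of", "Lucía"),
    ("Pedro", "parent_of", "Carlos"),
    ("Pedro", "parent_of", "Lucía"),
]

# Group children by parent
children_of = defaultdict(set)
for subj, pred, obj in triples:
    if pred == "parent_of":
        children_of[subj].add(obj)

# Rule: if two people share a parent, they must be siblings
inferred = set()
for parent, kids in children_of.items():
    for a, b in combinations(sorted(kids), 2):
        inferred.add((a, "sibling_of", b))

print(inferred)  # {('Carlos', 'sibling_of', 'Lucía')}
```

Real semantic-layer tools express rules like this declaratively (for example with OWL or SHACL) instead of hand-written loops, but the principle is the same: descriptive structure lets the graph derive facts you never loaded.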

Scaling the Semantic Layer for your organization

When searching for something that can help you and me implement an end-to-end platform for delivering a true Semantic Layer at enterprise scale, I found a great platform: Anzo, created by a company called Cambridge Semantics.


You can build something called “The Enterprise Knowledge Graph” with Anzo.

The nodes and edges of the graph flexibly capture a high-resolution twin of every data source — structured or unstructured. The graph can help users answer any question quickly and interactively, allowing users to converse with the data to uncover insights.

In addition to making everyday big data analytics problems easy, the graph unlocks new possibilities where graphs are particularly well suited. The graph, based on open standards, is a platform for continuous improvement. Within the graph, sources are quickly linked and harmonized using business rules, text analytics and even machine learning (this is going to be important soon).

I also loved the idea of a Data Fabric. Then I realized that other people use the same concept. It reminded me of the space-time fabric. I went ahead and defined the Data Fabric (without knowing whether that’s what the authors mean, and without reading other definitions).

The concept of the Space-Time Fabric in Physics is a construct created to explain the continuum of space and time, and it’s made of four (or eleven, or twenty-six, depending on the theory you follow) dimensions. Inside this construct, gravity is a manifestation of the warping of the fabric of space-time.

From Ethan Siegel: You can talk about space as a fabric, but if you do, be aware that what you’re doing is implicitly reducing your perspective down to a two-dimensional analogy. Space in our Universe is three dimensional, and when you combine it with time, you get a four dimensional quantity.

So what would a Data Fabric be? If we think of the definition in Physics, we can say that for an organization:

The Data Fabric is the platform that supports all the data in the company: how it’s managed, described, combined and universally accessed. This platform is formed from an Enterprise Knowledge Graph to create a uniform and unified data environment.

And with Anzo this is possible. This is what a Data Fabric with Anzo can look like (it kinda looks like the space-time fabric, awesome!):

The things on top of the data fabric are data layers. These data layers can add stuff like data cleansing, transformation, linking and access control — dynamically enhancing the in-memory graph in an iterative manner.

Data Layers in this stacked fashion are very flexible, meaning that you can easily turn layers on or off, and remove, copy and create layers as needed.

With Anzo you have automatic query generation (yep, that’s a thing), and running those queries against the complex graph makes extracting features easy and eventually fully automated!
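As a hedged illustration of the kind of graph query such a platform could generate under the hood, here is a plain SPARQL query run with rdflib. It is not Anzo’s actual output; the schema, predicates and the "customers.ttl" file are hypothetical.

```python
# Sketch: extracting per-customer features (order count, total spent) from a graph.
from rdflib import Graph

g = Graph()
g.parse("customers.ttl", format="turtle")  # hypothetical file of customer/order triples

query = """
PREFIX ex: <http://example.org/schema/>
SELECT ?customer (COUNT(?order) AS ?numOrders) (SUM(?amount) AS ?totalSpent)
WHERE {
    ?order ex:placedBy ?customer .
    ?order ex:amount   ?amount .
}
GROUP BY ?customer
"""

for row in g.query(query):
    print(row.customer, row.numOrders, row.totalSpent)
```

The point is that tabular features for machine learning can be pulled straight out of the graph, without first flattening everything into a warehouse model.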

With the several components of Anzo, a user can truly have a conversation with their data, quickly and easily pivoting to take the analysis in new directions based on the answers to questions. Without specialized query knowledge, they can traverse even the most complicated multi-dimensional data on the way to building exploratory charts, filters, tables and even network views.

And with the connection of Open Source technologies like Spark, Featuretools and Optimus you can fully prepare your data and finally make it ready for machine and deep learning.
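For instance, here is a minimal sketch of automated feature engineering with Featuretools on two made-up tables. The column names and data are illustrative, and the exact API names can vary between Featuretools versions.

```python
# A small, hypothetical example of deep feature synthesis with Featuretools.
import pandas as pd
import featuretools as ft

customers = pd.DataFrame({
    "customer_id": [1, 2],
    "signup_date": pd.to_datetime(["2018-01-01", "2018-02-15"]),
})
transactions = pd.DataFrame({
    "transaction_id": [10, 11, 12],
    "customer_id": [1, 1, 2],
    "amount": [25.0, 40.0, 10.0],
    "timestamp": pd.to_datetime(["2018-03-01", "2018-03-05", "2018-03-07"]),
})

es = ft.EntitySet(id="retail")
es = es.add_dataframe(dataframe_name="customers", dataframe=customers,
                      index="customer_id")
es = es.add_dataframe(dataframe_name="transactions", dataframe=transactions,
                      index="transaction_id", time_index="timestamp")
es = es.add_relationship("customers", "customer_id", "transactions", "customer_id")

# Automatically build features such as SUM(transactions.amount) per customer
feature_matrix, feature_defs = ft.dfs(entityset=es,
                                      target_dataframe_name="customers",
                                      max_depth=2)
print(feature_matrix.head())
```

On larger data you would run the same idea on top of Spark, and use something like Optimus for the cleaning steps before the feature synthesis.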

I’ll write more about this in the future, but for now, let’s think we have our data fabric and everything on point, and we want to do machine and deep learning.

Deep Learning for You

Ok, Deep Learning. You want to use it. What are its main applications?

Here you can see some of them:

In the few years that deep learning has been the king of the AI world, it has achieved great things. François Chollet lists the following breakthroughs of Deep Learning:

  • Near-human level image classification.
  • Near-human level speech recognition.
  • Near-human level handwriting transcription.
  • Improved machine translation.
  • Improved text-to-speech conversion.
  • Digital assistants such as Google Now or Amazon Alexa.
  • Near-human level autonomous driving.
  • Improved ad targeting, as used by Google, Baidu, and Bing.
  • Improved search results on the web.
  • Answering natural language questions.

So there’s a lot of things you can do with it. Now, how can you do them?

Sadly, there is a (big) shortage of AI expertise that creates a significant barrier for organizations ready to adopt AI. Normally we do Deep Learning programming and learn new APIs, some harder than others; some are really easy and expressive, like Keras.
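To give a sense of that expressiveness, here is a minimal Keras sketch of a small image classifier. The dataset, layer sizes and hyperparameters are illustrative choices, not anything prescribed in this article.

```python
# A tiny Keras model: classify MNIST digits with a couple of dense layers.
from tensorflow import keras
from tensorflow.keras import layers

(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()
x_train = x_train.reshape(-1, 784).astype("float32") / 255.0
x_test = x_test.reshape(-1, 784).astype("float32") / 255.0

model = keras.Sequential([
    layers.Dense(128, activation="relu", input_shape=(784,)),
    layers.Dropout(0.2),                     # Dropout, one of the advances listed earlier
    layers.Dense(10, activation="softmax"),  # one output per digit class
])

model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(x_train, y_train, epochs=5, batch_size=128, validation_split=0.1)
print(model.evaluate(x_test, y_test))
```

A handful of lines, and most of the work is choosing the representation and the hyperparameters, which is exactly what visual tools try to make even easier.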

Right now you can use a more expressive way of creating deep learning models. And that’s using Deep Cognition. I’ve talked about it before:

Their platform, Deep Learning Studio, is available as a cloud solution, a Desktop Solution ( http://deepcognition.ai/desktop/ ) where the software runs on your machine, or an Enterprise Solution (Private Cloud or On-Premise).

You can use pre-trained models as well as built-in assistive features that simplify and accelerate the model development process. You can also import model code and edit the model with the visual interface.

The platform automatically saves each model version as you iterate and tune hyperparameters to improve performance. You can compare performance across versions to find your optimal design.

This system is built on the premise of making AI easy for everyone: you don’t have to be an expert when creating these complex models, but my recommendation is that you have an idea of what you are doing, read some of the TensorFlow or Keras documentation, watch some videos and be informed. If you are an expert in the subject, great! This will make your life much easier and you can still apply your expertise when building the models.

You can actually download the code that produced the predictions, and as you will see it is written in Keras. You can then upload the code and test it with the notebook that the system provides or use it in your laptop or other platforms.

The Semantic Layer and Deep Learning

So by connecting the semantic layer, with a platform like Anzo, to a Deep Learning system like Deep Learning Studio, you can accelerate the use of data and AI in your company. This is the path that I imagine can work for almost all organizations:

I went ahead and modified the original picture. I think this, with a touch of Python, Spark and stuff like that, can be the future of data science and data technologies.

I think that this together with a methodology like the Agile Business-Science Problem Framework (ABSPF) can really bring value to an organization from an end-to-end perspective. More on ABSPF:

I think we can change the world for the better, improving our lives and the way we work, think and solve problems; and if we channel all the resources we have right now to make these areas of knowledge work together for a greater good, we can make a tremendous positive impact on the world and our lives.

This is the beginning of a longer conversation I want to start with you. I hope it helped you get started in this amazing area, or maybe just discover something new.

If this article helped you, please share it with your friends!

If you have questions just follow me on Twitter:

and LinkedIn:

See you there :)
