Machine Learning (ML) is a subset of Artificial Intelligence (AI), and Deep Learning (DL) is in turn a subset of ML. ML differs from traditional programming in its ability to learn from data without being given explicit rules, which lets it identify hidden patterns in the data. General (non-deep) ML is handy for analyzing structured data. DL goes beyond general ML in model complexity: it is built on Artificial Neural Networks (ANNs) with many layers. DL is especially useful for analyzing unstructured data (images, text, voice, etc.), for which general ML is usually less suitable.
ML and DL have been around for decades. Their recent rise is due to the following factors:
- Large quantities of data
- Computational resources
- Algorithms
- Open-sourced frameworks
This post is all about these four factors that have contributed to the rise of ML and DL.
Large quantities of data
In this digital era, the usage of smart devices (phones, laptops, etc.) and web services has increased significantly. This generates a lot of data in various formats such as text, images, voice, numbers, etc. Most people use social media. When you like a post, that is also a kind of data. When you review a product you purchased from an online site, that is text data that can be used to get valuable insights about customer satisfaction.
Data is the most valuable asset in ML and DL. Today, ML and DL techniques are used in various ways to deal with data:
- Data extraction: Data doesn’t simply land in our hands. We have to extract it from various sources such as web pages, databases, images, etc. ML and DL techniques are used extensively to extract data from these sources.
- Data preprocessing: Most real-world data is messy, so we need to preprocess it before using it. Handling missing values and detecting outliers are among the most fundamental data preprocessing tasks. ML and DL techniques are used for this in addition to general approaches.
- Modeling: To analyze the data, we need to create models. Most of them are ML and DL models. These models can learn from data and make predictions on new unseen data.
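As a rough illustration of the two preprocessing tasks named above, here is a minimal sketch that uses only the Python standard library. The `ages` list is made-up sample data, and the helper names `impute_missing` and `iqr_outliers` are hypothetical. Median imputation and the 1.5 × IQR rule are just two common general approaches, not the only ones.

```python
from statistics import median, quantiles


def impute_missing(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


def iqr_outliers(values, k=1.5):
    """Return the values lying more than k * IQR outside the quartiles."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Made-up sample: two missing entries and one suspicious value.
ages = [23, 25, None, 24, 26, 22, 95, None, 27]

clean = impute_missing(ages)    # missing entries become the median
outliers = iqr_outliers(clean)  # values far outside the middle 50%

print(clean)
print(outliers)
```

In a real project you would more likely reach for a library such as pandas or Scikit-learn for these steps, but the idea stays the same.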
Computational resources
The availability of computational resources has also contributed to the rise of ML and DL. ML models with a large number of parameters, or with many hyperparameters to tune, need a lot of computing power. Deep learning models with large numbers of layers also need a lot of computing power. Graphics Processing Units (GPUs) deliver the computing power needed for these tasks. Nowadays, a computer with a good GPU can be bought at an affordable price.
Another factor is the existence of cloud computing. You don’t need powerful hardware on your local machine to run compute-intensive algorithms. With the help of cloud services, you can run them from anywhere, at any time. Many providers offer a free tier, but you need to pay for additional resources as you use them. AWS and Microsoft Azure are well-known cloud computing platforms available today.
Algorithms
Major improvements in algorithms have made ML and DL techniques more usable. Nowadays, algorithms are available for almost any kind of data. Linear Regression, Logistic Regression, Support Vector Machines (with a linear kernel), etc. are available for data with linear relationships. Tree-based algorithms such as Decision Trees, Random Forests and XGBoost are available for data with non-linear relationships. Random Forests and XGBoost are powerful ensemble methods that often produce highly accurate predictions.
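To make the linear versus non-linear distinction concrete, here is a toy sketch in plain Python. It fits a straight line and a one-split decision tree (a "stump") to made-up step-shaped data. The helper names `fit_line`, `fit_stump` and `mse` are hypothetical, and real projects would use a library implementation instead.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares fit of y = a * x + b."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    a = sxy / sxx
    b = my - a * mx
    return lambda x: a * x + b


def fit_stump(xs, ys):
    """Depth-1 regression tree: pick the threshold with the lowest squared error."""
    best = None
    for t in sorted(set(xs))[1:]:
        left = [y for x, y in zip(xs, ys) if x < t]
        right = [y for x, y in zip(xs, ys) if x >= t]
        ml, mr = mean(left), mean(right)
        err = sum((y - ml) ** 2 for y in left) + sum((y - mr) ** 2 for y in right)
        if best is None or err < best[0]:
            best = (err, t, ml, mr)
    _, t, ml, mr = best
    return lambda x: ml if x < t else mr


def mse(model, xs, ys):
    """Mean squared error of a model on the given points."""
    return mean([(model(x) - y) ** 2 for x, y in zip(xs, ys)])


# Made-up data with a sudden jump: a non-linear relationship.
xs = list(range(10))
ys = [0 if x < 5 else 10 for x in xs]

line_error = mse(fit_line(xs, ys), xs, ys)
stump_error = mse(fit_stump(xs, ys), xs, ys)

print("line:", line_error, "stump:", stump_error)
```

On this data the stump reproduces the jump exactly, while the straight line cannot, which is the kind of pattern tree-based algorithms are good at capturing.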
Convolutional Neural Networks (CNNs) are available for image data. Recurrent Neural Networks (RNNs) are available for text or sequential data.
To take advantage of neural networks, you need large quantities of data. The performance of general ML algorithms tends to level off as the amount of data grows, whereas neural networks usually keep improving. If you have large quantities of data, using neural networks can therefore increase the performance of your models.
Open-sourced frameworks
The above-mentioned algorithms and many others are freely available through open-sourced frameworks. Python has high-level frameworks for ML and DL tasks.
For general ML, Scikit-learn is one of the most widely used frameworks. Its syntax is very consistent. It includes many Machine Learning models that can be implemented in a few lines of code. TensorFlow, developed by Google, and Keras, which ships with TensorFlow as its high-level API, are among the most popular DL frameworks. PyTorch, originally developed at Facebook (now Meta), is another great option for a DL framework.
There are strong communities behind these frameworks, and they often add new features. High-quality documentation is also available, so anyone can refer to it to learn more about these frameworks.
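The consistent Scikit-learn syntax mentioned above can be sketched as follows. This assumes Scikit-learn is installed and uses its built-in Iris dataset. The choice of models and the `random_state` and `max_iter` values are only illustrative.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Load a small built-in dataset and hold out 25% of it for testing.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Two very different algorithms, trained and scored with the same calls.
scores = {}
for model in (LogisticRegression(max_iter=1000), RandomForestClassifier(random_state=0)):
    model.fit(X_train, y_train)
    scores[type(model).__name__] = model.score(X_test, y_test)

print(scores)
```

Swapping in another Scikit-learn classifier usually only means changing the model line, because the `fit` and `score` calls stay the same.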
Until next time, happy learning to everyone! Meanwhile, you can read my other posts at:
https://rukshanpramoditha.medium.com
Rukshan Pramoditha, 2021-07-10