
7 Best Research Papers To Read To Get Started With Deep Learning Projects

The seven best research papers that have stood the test of time and will help you to create amazing projects

Photo by UX Indonesia on Unsplash

Research papers are an integral part of learning about the many new methodologies introduced and discovered in the world of Artificial Intelligence (AI). Data scientists and researchers come together on a common platform to share their meticulous work and knowledge, helping the flourishing AI community grow and reach greater heights.

With tons of research and innovation brought forward by skilled individuals each day, staying updated with the latest technologies can be overwhelming. This is especially the case for a beginner just starting out in the world of deep learning. It can be hard to figure out which research papers are the best starting point for developing new projects and gaining an intuitive understanding of the subject.

In this article, we will look at seven of the best research papers that developers and data science enthusiasts must read. These research papers have stood the test of time and provide a baseline for many implementations that already exist or are yet to be built in the future.

For deep learning, it is always best to have your own device or system for computing complex problems. Before proceeding further into this article, I would suggest looking into some of the best PC builds for deep learning across various price ranges in the article linked below.

Best PC Builds For Deep Learning In Every Budget Ranges


Getting Started with Research Papers for Deep Learning:

Photo by UX Indonesia on Unsplash

The field of deep learning is enormous. There are several research papers to choose from, as each work introduces a new concept or methodology that is useful for the data science and artificial intelligence community. In this section of the article, we will explore seven of the most beneficial and intriguing research papers that have stood the test of time.

1. ResNet:

Research Paper: Deep Residual Learning for Image Recognition

Authors: Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun

Summary:

There are several transfer learning models that data scientists use to achieve optimal results on a particular task. The AlexNet model was the first deep convolutional network to win the ImageNet image recognition challenge in 2012, and since then, transfer learning models like VGG-16 have been among the most influential pieces of deep learning.

We will focus on the ResNet architecture in this article because the ResNet network achieves notable improvements over its counterparts. Another significant reason for considering ResNet is that it has many variations depending on the type and number of residual blocks you plan to include. Some of the ResNet structures are ResNet-18, ResNet-34, ResNet-50, ResNet-101, and so on.

The ResNet architecture makes use of residual blocks. This concept is significant because it addresses the degradation problem of very deep networks, which otherwise suffer from vanishing or exploding gradients. A residual block uses a skip connection to add the input of the block to the output of its stacked layers, so each block only needs to learn the residual between the two.
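To make the idea concrete, here is a minimal sketch of a residual block in PyTorch. The channel counts and layer choices are illustrative assumptions, not the exact configuration from the paper:

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """A minimal residual block: output = ReLU(F(x) + x)."""

    def __init__(self, channels: int):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        identity = x                              # skip connection keeps the input
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        out = out + identity                      # the residual sum described above
        return self.relu(out)
```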

Significance (Why read this paper?):

Transfer learning is a major part of deep learning. We can take the information learned by one model and build another custom model on top of it to perform numerous tasks. Even without building a custom architecture, we can use the original transfer learning model to perform a particular task. Using transfer learning models avoids the need to create and train your own model from scratch each time.
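For instance, here is a short sketch of how a pretrained ResNet-50 from torchvision (assuming torchvision >= 0.13) can be reused for a hypothetical 10-class task:

```python
import torch.nn as nn
from torchvision import models

# Load a ResNet-50 with pretrained ImageNet weights.
model = models.resnet50(weights="IMAGENET1K_V1")

# Freeze the pretrained backbone so only the new head is trained.
for param in model.parameters():
    param.requires_grad = False

# Replace the final fully connected layer for our hypothetical 10-class task.
model.fc = nn.Linear(model.fc.in_features, 10)
```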


2. YOLO:

Photo by Julia Zolotova on Unsplash

Research Paper: You Only Look Once: Unified, Real-Time Object Detection

Authors: Joseph Redmon, Santosh Divvala, Ross Girshick, Ali Farhadi

Summary:

Object detection (alongside face recognition) has always been a focal point for deep learning models. Ever since the introduction of the YOLO model, we have been able to solve the complex problem of object detection by drawing a bounding box around the specific objects of significance that the model is trying to detect. The YOLO network makes use of a series of convolutional neural networks to learn how to detect objects in real time.

The YOLO model has been improved and developed continuously since its original release in 2015. Each new version, such as YOLO-v2 and YOLO-v3, has brought substantial improvements over the previous methods. The most recent YOLO version, as of writing this article, is YOLO-v6. Each of these architectures has made additional refinements to improve efficiency on object detection tasks.
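To get a feel for the workflow, here is a minimal sketch that runs a pretrained model through PyTorch Hub. Note that this uses the community YOLOv5 implementation from Ultralytics, not the original paper's code, and the image URL is only an example:

```python
import torch

# Load a small pretrained YOLOv5 model from the Ultralytics repository.
model = torch.hub.load("ultralytics/yolov5", "yolov5s", pretrained=True)

# Run inference on an image; results hold bounding boxes, classes, and confidences.
results = model("https://ultralytics.com/images/zidane.jpg")
results.print()  # summary of the detections
results.show()   # display the image with bounding boxes drawn
```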

Significance (Why read this paper?):

Computer vision is one of the most popular branches of artificial intelligence. A deep learning model that can solve complex computer vision problems, such as real-time object detection and face recognition, is highly valuable. YOLO is one of the best methods for solving the object detection problem with high precision. If you are interested in mastering the basics of computer vision, I would recommend checking out the following guide provided below.

OpenCV: Complete Beginners Guide To Master the Basics Of Computer Vision With Code!


3. U-Net:

Research Paper: U-Net: Convolutional Networks for Biomedical Image Segmentation

Authors: Olaf Ronneberger, Philipp Fischer, and Thomas Brox

Summary:

The task of segmentation involves grouping similar parts of an image into clusters. All pixels belonging to the same class are classified and segmented into a particular entity. With the segmentation of images, much of the complexity of an image can be removed, allowing the user to make further computations for image processing and analysis.

Once the segmentation is performed on an image, it opens up numerous possibilities to interpret the data more effectively. One such model that performs this task effectively is the U-Net network. The U-Net model architecture, which comprises an encoder and decoder type network, accepts an input image that needs to be segmented.

Depending on the number of classes and the particular type of task, the image passed through the network goes through several stages of convolution, downsampling, and finally upsampling to produce the segmentation output. The network also makes use of skip connections to avoid degradation problems and carry the useful information from each downsampling stage to the corresponding upsampling stage.
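The following is a heavily simplified PyTorch sketch of the U-Net idea with a single downsampling and upsampling stage; the real architecture stacks several such stages, and the channel sizes here are illustrative assumptions:

```python
import torch
import torch.nn as nn

def double_conv(in_ch, out_ch):
    """Two 3x3 convolutions with ReLU, the basic U-Net building block."""
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(out_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True),
    )

class TinyUNet(nn.Module):
    """One encoder stage, a bottleneck, and one decoder stage with a skip connection."""

    def __init__(self, in_ch=1, num_classes=2):
        super().__init__()
        self.enc = double_conv(in_ch, 64)
        self.pool = nn.MaxPool2d(2)
        self.bottleneck = double_conv(64, 128)
        self.up = nn.ConvTranspose2d(128, 64, kernel_size=2, stride=2)
        self.dec = double_conv(128, 64)  # 128 = 64 (skip) + 64 (upsampled)
        self.head = nn.Conv2d(64, num_classes, kernel_size=1)

    def forward(self, x):
        skip = self.enc(x)                    # encoder features at full resolution
        x = self.bottleneck(self.pool(skip))  # downsample, then process
        x = self.up(x)                        # upsample back to full resolution
        x = torch.cat([skip, x], dim=1)       # skip connection: concatenate features
        return self.head(self.dec(x))
```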

Significance (Why read this paper?):

U-Net is a revolutionary paper for segmentation across different types of computer vision problems. A large number of tasks, especially in the field of medical image processing, make use of the U-Net architecture. Several variations derived from the U-Net network are also useful for segmentation projects. Once we have a U-Net model to segment a specific type of image, we can use the segmented output for further analysis and computation.


4. Batch Normalization:

Research Paper: Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift

Authors: Sergey Ioffe, Christian Szegedy

Summary:

In deep learning, it is often a good idea to normalize the data. Let us consider the MNIST dataset. Once we get the values of the grayscale images of the digits 0–9 in the form of NumPy arrays, the values range from 0–255. It is often a good idea to standardize and normalize these data elements to floating-point values between 0 and 1.

Batch Normalization layers perform a somewhat similar action, where a mini-batch mean and a mini-batch variance are computed to normalize the data accordingly. The Batch Normalization layer helps to speed up the training process and reduces the sensitivity to weight initialization. These layers also help to regularize the model during training and slightly reduce overfitting.
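The training-time computation is straightforward to express in NumPy. This sketch follows the normalization formula from the paper, with gamma and beta standing in for the learned scale and shift parameters:

```python
import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    """y = gamma * (x - mean) / sqrt(var + eps) + beta, computed per feature."""
    mu = x.mean(axis=0)                    # mini-batch mean
    var = x.var(axis=0)                    # mini-batch variance
    x_hat = (x - mu) / np.sqrt(var + eps)  # normalize
    return gamma * x_hat + beta            # learned scale and shift

# Example: a batch of 4 samples with 3 features each.
x = np.random.randn(4, 3) * 10 + 5
y = batch_norm(x, gamma=np.ones(3), beta=np.zeros(3))
print(y.mean(axis=0), y.std(axis=0))  # approximately 0 and 1
```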

Significance (Why read this paper?):

Batch Normalization layers are an integral part of most modern deep learning architectures. While constructing any type of complex neural network, a batch normalization layer can be considered a highly useful component. These layers take the output of one layer and normalize it before passing it to the next, accelerating computation by reducing the internal covariate shift. Batch Normalization layers are especially useful in convolutional neural networks, where they allow each layer to function more independently.


5. Transformers:

Photo by Arseny Togulev on Unsplash

Research Paper: Attention Is All You Need

Authors: Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Łukasz Kaiser, Illia Polosukhin

Summary:

Using simple recurrent neural networks to solve highly complex tasks often results in a few major issues. Two of the primary drawbacks are exploding and vanishing gradients, where essential information is lost over longer sequences of data. Long Short-Term Memory (LSTM) models were able to fix most of these elemental issues with RNNs. By utilizing these LSTM networks in sequence-to-sequence models, we were able to achieve highly successful results on a wide variety of natural language processing tasks.

The Transformer network combines an encoder-decoder architecture with an attention mechanism. The attention layer provides interconnectivity between the decoder and the encoder, allowing the decoder to access the encoder's hidden states. This process allows the model to assign higher weight to specific entities (such as keywords across sentences). There are different types of attention mechanisms, such as dot-product attention, self-attention, and multi-head attention, among others.
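The core of the paper is scaled dot-product attention, Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V. Here is a minimal NumPy sketch of that single equation (multi-head attention repeats it over several projected subspaces):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                          # query-key similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ V                                       # weighted sum of values

# Toy example: 3 tokens with 4-dimensional embeddings.
rng = np.random.default_rng(0)
Q, K, V = (rng.normal(size=(3, 4)) for _ in range(3))
print(scaled_dot_product_attention(Q, K, V).shape)  # (3, 4)
```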

Significance (Why read this paper?):

Transformers are some of the best deep learning tools and are extremely useful for solving a wide array of natural language processing tasks. These transformers have the capability to perform complex language tasks ranging from machine translation between languages to question-answering systems, chatbots, text classification problems, and so much more. The possibilities of transformers are limitless, and this research paper serves as a great conceptual foundation for every other research paper inspired by it, such as the Generative Pre-trained Transformer (GPT) and Bidirectional Encoder Representations from Transformers (BERT) models.


6. Generative Adversarial Networks (GANs):

Photo by ArtSpiley on Unsplash

Research Paper: Generative Adversarial Nets

Authors: Ian J. Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, Yoshua Bengio

Summary:

One of the most popular research papers, initially introduced in 2014 by Ian Goodfellow and his team, was Generative Adversarial Networks. These deep learning frameworks are extremely impactful in generating completely new data. These adversarial networks make use of a generator and a discriminator network, where the two architectures compete with each other to improve the overall results.

The generator tries to generate unique data that looks like real sample images. On the other hand, the discriminator tries to detect the generated samples and classify them as real or fake. Both networks are trained simultaneously in a continuous loop. Once the generator is able to bypass the discriminator's checks and generate realistic images, we have a fully trained Generative Adversarial Network that can generate unique data from scratch for a specific type of data.
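To make this adversarial loop concrete, here is a minimal sketch of a single training step in PyTorch. The generator, discriminator, optimizers, and latent dimension are all assumptions here, and the discriminator is assumed to end in a sigmoid so that its output is a probability:

```python
import torch
import torch.nn as nn

bce = nn.BCELoss()

def gan_train_step(generator, discriminator, opt_g, opt_d, real, latent_dim=100):
    batch = real.size(0)
    ones, zeros = torch.ones(batch, 1), torch.zeros(batch, 1)

    # 1) Train the discriminator: label real samples 1 and generated samples 0.
    z = torch.randn(batch, latent_dim)
    fake = generator(z).detach()  # detach so this step does not update the generator
    loss_d = bce(discriminator(real), ones) + bce(discriminator(fake), zeros)
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()

    # 2) Train the generator: try to make the discriminator label fakes as real.
    z = torch.randn(batch, latent_dim)
    loss_g = bce(discriminator(generator(z)), ones)
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()

    return loss_d.item(), loss_g.item()
```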

Significance (Why read this paper?):

There are multiple iterations and variations of Generative Adversarial Networks, such as DCGANs, Cycle GANs, SRGANs, W-GAN, and many more. These architectures are some of the most used elements for generating new data in the field of deep learning today. With generative networks gaining more popularity than ever before, the advancements that await this branch are enormous. It is highly recommended to start with this research paper to keep up with the continuous advancements of these generative architectures.


7. Autoencoders:

Research Paper: Autoencoders

Authors: Dor Bank, Noam Koenigstein, Raja Giryes

Summary:

Autoencoders are another type of generative network that is useful for numerous applications. Autoencoders make use of an encoder and decoder type network along with a latent dimensional space. The encoder stage takes in an input and compresses it into vectors in the latent dimensional space. These vectors contained in the latent space are compressed representations of the input.

Hence, autoencoders are useful for dimensionality reduction tasks, where the original image of a particular size is compressed into a latent dimensional space. From this compressed latent representation, the decoder can reconstruct the image. The reconstructed image is similar to the original but lossy, since it must be recovered from far fewer latent dimensions than the original contains.
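As an illustration, here is a minimal fully connected autoencoder in PyTorch for 28x28 images such as MNIST; the 32-dimensional latent size is an arbitrary assumption:

```python
import torch.nn as nn

class Autoencoder(nn.Module):
    """Compresses a flattened 28x28 image (784 values) to 32 latent dimensions."""

    def __init__(self, latent_dim=32):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(784, 128), nn.ReLU(),
            nn.Linear(128, latent_dim),          # compressed latent code
        )
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 128), nn.ReLU(),
            nn.Linear(128, 784), nn.Sigmoid(),   # back to pixel range [0, 1]
        )

    def forward(self, x):
        z = self.encoder(x)     # compress to the latent space
        return self.decoder(z)  # reconstruct the original dimensions
```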

Significance (Why read this paper?):

Autoencoders have numerous applications that are used by data scientists and deep learning researchers. Apart from dimensionality reduction applications, as we previously discussed, these autoencoders are also useful for tasks like image denoising, feature extraction, and anomaly detection. Apart from the mentioned applications, a variation of the autoencoders called variational autoencoders is useful for image generation, similar to GANs. Hence, it is safe to say that these autoencoders have enormous potential in the field of deep learning.


Conclusion:

Photo by Firmbee.com on Unsplash

Research is formalized curiosity. It is poking and prying with a purpose. — Zora Neale Hurston

Research and innovation are integral pillars of development and learning. The quality of modern research has risen to greater heights, and each paper contains a wealth of knowledge for an individual to learn from. This is especially true for deep learning, a field that involves tons of research and time investment.

In this article, we covered the basic aspects of seven of the best research papers that have stood the test of time, making them a valuable resource for all beginner data scientists to learn from and explore further. We also discussed the significance of these research papers and the specific concepts they cover. Before diving into the millions of research papers on specific topics, I recommend checking these out to gain further exposure to the subject of deep learning.

If you want to get notified about my articles as soon as they go up, check out the following link to subscribe for email recommendations. If you wish to support other authors and me, then subscribe to the below link.

Join Medium with my referral link – Bharath K

If you have any queries related to the various points stated in this article, then feel free to let me know in the comments below. I will try to get back to you with a response as soon as possible.

Check out some of my other articles in relation to the topic covered in this piece that you might also enjoy reading!

7 Python Programming Tips To Improve Your Productivity

Best Seaborn Visualizations for Data Science

Visualizing CPU, Memory, And GPU Utilities with Python

Thank you all for sticking on till the end. I hope all of you enjoyed reading the article. Wish you all a wonderful day!

