Free hands-on tutorials to get started in Natural Language Processing 📚

Getting started with NLP doesn't have to be hard

Photo by Jason Leung on Unsplash

Natural Language Processing (NLP) is a subfield of Artificial Intelligence that has progressed by leaps and bounds in recent years. This progress has been made possible by the collaborative efforts of many people across research, academia, and industry. Some of these people and institutions have been kind enough to make their material public so that learners worldwide can benefit from it. This article is a compilation of five such resources, which should prove useful both to people starting out in NLP and to those with some experience. Created by experts in their domains, these courses offer a mix of interactive exercises, crisp theory, and real-world case studies.


1. Fundamentals of NLP (Chapter 1) by dair.ai

Source: Github Page of dair.ai

dair.ai is a community effort to democratize artificial intelligence research, education, and technologies. All their projects are freely hosted on GitHub, and they disseminate knowledge through meetups, newsletters, blogs, and study materials. One such work-in-progress project is Fundamentals of NLP, which is planned as a series on NLP principles from scratch. Currently, only the first chapter is available, and if you are an absolute beginner, it should be a good starting point. The chapter is available both as a Colab notebook and as a web version. It covers the following NLP basics in lucid, easy-to-understand terms:

Image by Author

After going through the tutorial, you will have a sufficient understanding of the motivation behind some of the important NLP concepts. This will set the stage for the advanced stuff.


2. Scikit-learn for text data

Source: The scikit-learn developers

Scikit-learn, a popular Python library for machine learning, has a hands-on tutorial on working with text data. The tutorial walks you through some of the essential aspects of text analytics using the 20 Newsgroups data set: a collection of approximately 20,000 newsgroup documents, partitioned nearly evenly across 20 different newsgroups. By the end of the tutorial, you will have a fair idea of the following aspects of NLP:

Image by Author

The tutorial is also followed by a few exercises to practice and cement the concepts further.
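
To give a feel for this kind of workflow, here is a minimal sketch of a text classifier built with scikit-learn. It is not taken from the tutorial: the four training sentences and the two labels are invented for illustration and stand in for the much larger 20 Newsgroups data set, so nothing needs to be downloaded.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Tiny invented corpus (an assumption for illustration, not 20 Newsgroups).
train_texts = [
    "the graphics card renders 3d images quickly",
    "new gpu driver improves image rendering",
    "the team won the hockey game last night",
    "the goalie made a great save in the hockey match",
]
train_labels = ["graphics", "graphics", "hockey", "hockey"]

# Chain feature extraction (TF-IDF weighted bag of words) and a
# Naive Bayes classifier into a single estimator.
model = make_pipeline(TfidfVectorizer(), MultinomialNB())
model.fit(train_texts, train_labels)

# Classify two unseen sentences.
new_texts = ["the hockey game was great", "rendering 3d images on a gpu"]
predictions = model.predict(new_texts)

for text, label in zip(new_texts, predictions):
    print(f"{label:>8}: {text}")
```

Wrapping the vectorizer and the classifier in one pipeline means the same preprocessing is applied at training and prediction time, which is easy to get wrong when the steps are called by hand.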


3. NLP Course by Elena (Lena) Voita

source: https://lena-voita.github.io/nlp_course.html

Lena is a Ph.D. student at the University of Edinburgh. She recently published an NLP course titled NLP Course | For You. It is an extension of the (ML for) Natural Language Processing course that she has taught at the Yandex School of Data Analysis (YSDA) since fall 2018. The course stands out in a couple of ways:

  • Firstly, it is free and openly available.
  • Secondly, the course has been designed to be convenient, clear, and learner-friendly. This sets it apart from a lot of other NLP courses that I have encountered in the past. The effort Elena has put into personalizing the course is commendable. In her words:

I wanted to make these materials so that you (yes, you!) could study on your own, what you like, and at your pace. My main purpose is to help you enter your own very personal adventure. For you.

The course currently covers the following modules:

  • Word Embeddings
  • Text Classification
  • Language Modelling
  • Seq2seq and Attention

Every module combines carefully curated content with interactive parts, exercises, links to related research papers, and even some NLP games. The course is being actively developed, and a module on transfer learning is expected to be added in November. There is even a supplementary section on Convolutional Networks.


4. Advanced NLP with spaCy

source: https://course.spacy.io/en/

spaCy is a modern Python library for industrial-strength Natural Language Processing. Advanced NLP with spaCy is a free, interactive online course created by Ines Montani, one of the core developers of spaCy. In it, you’ll learn how to use spaCy to build advanced natural language understanding systems, using both rule-based and machine learning approaches. The course consists of four chapters, which are further broken down into small, bite-sized interactive modules.

By the end of this course, you’ll have sufficient experience to build your own little NLP projects using spaCy.
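
As a small taste of the rule-based side of spaCy, here is a minimal sketch using the library's `Matcher`. It is not an exercise from the course. It assumes spaCy v3 is installed and uses a blank English pipeline (tokenizer only), so no trained model has to be downloaded. The pattern label `NLP_PHRASE`, the helper `find_phrases`, and the sample text are made up for illustration.

```python
import spacy
from spacy.matcher import Matcher

# A blank English pipeline: tokenizer only, no trained model required.
nlp = spacy.blank("en")

# Each dict in the pattern describes one token; LOWER compares the
# lowercased token text, so the match is case-insensitive.
pattern = [{"LOWER": "natural"}, {"LOWER": "language"}, {"LOWER": "processing"}]

matcher = Matcher(nlp.vocab)
matcher.add("NLP_PHRASE", [pattern])  # "NLP_PHRASE" is a hypothetical label


def find_phrases(text):
    """Return the text of every span in `text` that matches the pattern."""
    doc = nlp(text)
    return [doc[start:end].text for _match_id, start, end in matcher(doc)]


matches = find_phrases(
    "spaCy makes natural language processing fun. "
    "Natural Language Processing is everywhere."
)
print(matches)
```

Token-based patterns like this operate on the tokenized document rather than on raw characters, which is the main difference from a plain regular expression.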


5. A Code-First Intro to Natural Language Processing by fast.ai

Source: Videos for the NLP course from fast.ai

A Code-First Introduction to Natural Language Processing is a course taught by Rachel Thomas that follows the fast.ai top-down teaching methodology. Here is an excerpt from the course’s official blog post:

The course teaches a blend of traditional NLP topics (including regex, SVD, naive bayes, tokenization) and recent neural network approaches (including RNNs, seq2seq, attention, and the transformer architecture), as well as addressing urgent ethical issues, such as bias and disinformation.
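
To illustrate two of the traditional topics named in the excerpt, regex and tokenization, here is a minimal sketch that uses only the Python standard library. It is not taken from the course notebooks. The token pattern is one simple choice among many, and the names `TOKEN_RE` and `tokenize` are hypothetical.

```python
import re
from collections import Counter

# One or more letters or digits, optionally followed by an apostrophe
# and more letters, so contractions such as "don't" stay in one piece.
TOKEN_RE = re.compile(r"[a-z0-9]+(?:'[a-z]+)?")


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return TOKEN_RE.findall(text.lower())


tokens = tokenize("Don't panic: NLP isn't magic, it's mostly counting!")
print(tokens)

# Token counts are the starting point for bag-of-words models.
counts = Counter(tokenize("the cat sat on the mat"))
print(counts.most_common(1))
```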

The course comes with Jupyter Notebooks and accompanying videos that touch on some of the important aspects of NLP.


Conclusion

In this article, we looked at five NLP resources that are useful both for people starting out and for those with some experience in language processing. This isn’t an exhaustive list, and there are other well-known materials that I might have missed. However, I have personally gone through all of the above resources and can recommend them without an ounce of doubt. You can take all the courses or pick just a few. Whichever path you choose, make sure you practice your skills, since theoretical knowledge alone will not take you far in the industry.

