With 2,000 attendees, 50+ distinguished fellows, and a 22% paper acceptance rate, the North American Chapter of the Association for Computational Linguistics (NAACL-HLT) is a scientific force to be reckoned with.

Since its inception in 1998, NAACL-HLT has been at the forefront of contemporary computational linguistics, facilitating research and information exchange between individual scientists and professional societies the world over. This year’s NAACL-HLT is scheduled to take place in Mexico City between June 6 and 11.

With six eagerly anticipated NLP (Natural Language Processing) tutorials on the agenda, let’s briefly look at what each one has to offer:
Pretrained Transformers for Text Ranking: BERT and Beyond
❓Content. This tutorial will offer an overview of text ranking with neural network architectures, most notably BERT (Bidirectional Encoder Representations from Transformers). The aim of ranking is to generate an ordered list of texts retrieved from a corpus, and BERT coupled with self-supervised pretraining has opened up numerous possibilities for NLP. Specifically, the speakers will look at both multi-stage and direct text ranking architectures, with a focus on techniques for handling long documents, as well as on effectiveness vs. efficiency trade-offs.
💡 Key insights. The ranking problem is fundamental to Information Retrieval and Question Answering. Since Transformer-based representations show impressive results on a wide variety of NLP tasks, it is natural to apply pretrained BERT to ranking as well, and to explore ways of reducing the number of model parameters.
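For a concrete flavor of what BERT-based reranking looks like in practice, here is a minimal cross-encoder sketch in Python. It is only an illustration of the general setup; the checkpoint name is an example, not something prescribed by the tutorial.

```python
# A minimal sketch of BERT-style reranking: score each query-passage pair
# with a cross-encoder and sort passages by predicted relevance.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL = "cross-encoder/ms-marco-MiniLM-L-6-v2"  # example relevance reranker
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForSequenceClassification.from_pretrained(MODEL).eval()

def rerank(query: str, passages: list[str]) -> list[tuple[str, float]]:
    """Return passages sorted by predicted relevance to the query."""
    inputs = tokenizer([query] * len(passages), passages,
                       padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        scores = model(**inputs).logits.squeeze(-1)  # one score per pair
    return sorted(zip(passages, scores.tolist()),
                  key=lambda pair: pair[1], reverse=True)

candidates = ["BERT is a pretrained Transformer encoder.",
              "Mexico City hosts NAACL-HLT this year."]
print(rerank("What is BERT?", candidates))
```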
Fine-grained Interpretation and Causation Analysis in Deep NLP Models
❓Content. Deep neural networks (DNNs) play a crucial role in many NLP tasks, among them MT (Machine Translation), question answering, and summarization. In this tutorial, the speakers will examine how to interpret components of a neural network model from two standpoints: 1. intrinsic analysis (with regard to a desired language task); and 2. causation analysis (with regard to decisions made by the model). Both the interpretability of model predictions and associated toolkits that can aid in fine-grained interpretation will be discussed.
💡 Key insights. DNNs learn implicit representations of natural language phenomena, and interpreting those representations properly during performance analysis takes significant effort. This area of study is also investigated at the BlackboxNLP workshop series. By analyzing individual neurons and redundancy in neural networks, it is possible to reveal how various linguistic properties are distributed across neurons.
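To make "analyzing individual neurons" a bit more tangible, here is a minimal probing sketch: a linear classifier trained on frozen BERT token representations to predict a simple surface property, with the largest probe weights hinting at which dimensions encode it. This is only an assumption-laden illustration, not the speakers' toolkit.

```python
# A minimal probing sketch: train a linear classifier on frozen BERT token
# representations to predict a simple surface property (capitalization).
import torch
from sklearn.linear_model import LogisticRegression
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")
model = AutoModel.from_pretrained("bert-base-cased").eval()

sentences = ["Maria flew to Mexico City in June.",
             "The tutorial covers interpretation of deep NLP models."]
features, labels = [], []
for sent in sentences:
    enc = tokenizer(sent, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**enc).last_hidden_state[0]   # (tokens, 768)
    tokens = tokenizer.convert_ids_to_tokens(enc["input_ids"][0])
    for tok, vec in zip(tokens, hidden):
        if tok in tokenizer.all_special_tokens:
            continue
        features.append(vec.numpy())
        labels.append(int(tok[0].isupper()))         # property being probed

probe = LogisticRegression(max_iter=1000).fit(features, labels)
# Dimensions with the largest absolute probe weights are candidate
# "capitalization neurons" in this toy setup.
top_dims = abs(probe.coef_[0]).argsort()[-10:][::-1]
print("Most predictive dimensions:", top_dims)
```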
Deep Learning and Graph Neural Networks in NLP
❓Content. Graph Neural Networks (GNNs) have offered a welcome solution to many NLP problems; nevertheless, transforming original text sequence data into graph-structured data still poses a series of obstacles. This tutorial will show you how to use Graph4NLP, a recently developed open-source library, to overcome these challenges with advanced GNN-based models. This approach can help drive advances in MT, NLG (Natural Language Generation), and Semantic Parsing, among other areas.
💡 Key insights. Graph Neural Networks are receiving significant attention from the Machine Learning community these days, while the NLP field remains focused on Transformer-based representations. In the spirit of the TextGraphs workshop series, this tutorial offers a different perspective: enhancing your models with the structured information contained in language graphs and networks, backed by a ready-to-go open-source library.
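As a rough idea of what a GNN layer does with a sentence graph, here is a minimal graph convolution written in plain PyTorch rather than the Graph4NLP API; the toy graph and dimensions are made up for illustration.

```python
# A minimal sketch of one graph convolution over a sentence-like graph.
# Node features could be word embeddings; edges encode links between tokens.
import torch
import torch.nn as nn

class GraphConv(nn.Module):
    """One GCN layer: H' = ReLU(D^-1/2 (A + I) D^-1/2 H W)."""
    def __init__(self, in_dim: int, out_dim: int):
        super().__init__()
        self.linear = nn.Linear(in_dim, out_dim)

    def forward(self, h: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        a_hat = adj + torch.eye(adj.size(0))        # add self-loops
        d_inv_sqrt = torch.diag(a_hat.sum(dim=1).pow(-0.5))
        norm_adj = d_inv_sqrt @ a_hat @ d_inv_sqrt  # symmetric normalization
        return torch.relu(self.linear(norm_adj @ h))

# Toy graph: 4 tokens connected in a chain (0-1, 1-2, 2-3).
adj = torch.zeros(4, 4)
for i, j in [(0, 1), (1, 2), (2, 3)]:
    adj[i, j] = adj[j, i] = 1.0
node_feats = torch.randn(4, 16)                     # e.g., word embeddings
layer = GraphConv(16, 8)
print(layer(node_feats, adj).shape)                 # torch.Size([4, 8])
```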
Automatic Evaluation Metrics in Natural Language Generation
❓Content. With DL (Deep Learning), research on NLG has accelerated greatly; however, AEMs (Automatic Evaluation Metrics) that facilitate improvement in DL research are in need of a tune-up. In this tutorial, the speakers will examine the evolution of AEMs along with the latest emerging trends and attempt to answer the most pressing questions, including: how can AEMs be organized into a coherent taxonomy; what are the shortcomings of the existing AEMs; and what possible new pathways to improvement could be taken?
💡 Key insights. Evaluating conversational agents in unstructured domains is non-trivial because automatic scores need to agree with human judgments. Fortunately, there are methods like ADEM and AQG for approaching this problem. If you are developing a dialogue system, this tutorial will provide actionable insights for improving the flow of conversations.
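To see why this matters, here is a tiny sketch using BLEU as a stand-in for surface-overlap metrics: two equally reasonable replies receive very different scores against a single reference, which is exactly the mismatch with human judgment that learned metrics such as ADEM try to close.

```python
# A minimal sketch of the weakness of n-gram-overlap metrics in dialogue:
# a valid reply with no word overlap scores near zero.
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

reference = "i am doing great thanks for asking".split()
reply_a = "i am doing great thanks".split()      # high n-gram overlap
reply_b = "pretty good how about you".split()    # valid reply, zero overlap

smooth = SmoothingFunction().method1
print("BLEU(reply_a):", sentence_bleu([reference], reply_a, smoothing_function=smooth))
print("BLEU(reply_b):", sentence_bleu([reference], reply_b, smoothing_function=smooth))
```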
Long-Sequence Processing in NLP
❓Content. The ability to process long documents is vital for many NLP tasks, among them document classification, summarization, question answering, and coreference resolution. At the same time, many Transformer (BERT-type) models are too expensive for this purpose. In this hands-on tutorial with coding exercises, the speakers will evaluate hierarchical, graph-based, and retrieval-based methods of long-sequence processing and document-level representation learning; offer an overview of different Transformer and memory-saving methods; and delve into emerging research in the field.
💡 Key insights. Transformer-based representations show state-of-the-art results on a huge variety of NLP benchmarks and downstream applications. However, both training and inference with these models are expensive, so it's worth considering additional powerful representations, such as SciBERT, SPECTER, and others.
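As a baseline for intuition, here is a minimal "chunk, encode, pool" sketch for long documents using a standard BERT encoder. The hierarchical, graph-based, and retrieval-based methods in the tutorial go well beyond this, and the window and stride values below are arbitrary.

```python
# A minimal sketch: split a long document into overlapping 512-token windows,
# encode each with a BERT-type model, and mean-pool into one document vector.
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased").eval()

def embed_long_document(text: str) -> torch.Tensor:
    enc = tokenizer(text, truncation=True, max_length=512, stride=128,
                    return_overflowing_tokens=True, padding=True,
                    return_tensors="pt")
    with torch.no_grad():
        hidden = model(input_ids=enc["input_ids"],
                       attention_mask=enc["attention_mask"]).last_hidden_state
    mask = enc["attention_mask"].unsqueeze(-1)           # (windows, seq, 1)
    window_vecs = (hidden * mask).sum(1) / mask.sum(1)   # mean-pool each window
    return window_vecs.mean(0)                           # average over windows

doc_vector = embed_long_document("A long document about NLP. " * 500)
print(doc_vector.shape)                                  # torch.Size([768])
```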
Crowdsourcing Natural Language Data at Scale
❓Content. In this tutorial, based on Toloka's six years of industry experience, our team will look at data labeling on public crowdsourcing marketplaces, with a focus on task design and decomposition, quality control techniques, and annotator selection. The relevant mathematical background, along with many useful tricks of the trade, will be covered. The session concludes with a hands-on production task, during which audience members will be able to launch their own label collection projects on one of the largest crowdsourcing platforms and share their annotation ideas with each other.
💡 Key insights. As machine learning methods require more and more labeled data, we need to use approaches that produce such data at scale. Popular evaluation datasets, such as SQuAD, MultiNLI, and others are already built using crowdsourcing – an option worth considering. Building crowdsourcing pipelines is a specific skill that requires practice with task decomposition and quality control. A hands-on exercise gives you the chance to polish this skill.
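As a taste of the quality-control side, here is a minimal majority-vote aggregation sketch over toy annotations. Real pipelines layer smarter aggregation (e.g., Dawid-Skene), golden tasks, and annotator skill tracking on top of this.

```python
# A minimal sketch of aggregating overlapping crowd annotations by majority vote.
import pandas as pd

# Each row: one worker's label for one task (toy data for illustration).
annotations = pd.DataFrame({
    "task":   ["t1", "t1", "t1", "t2", "t2", "t2"],
    "worker": ["w1", "w2", "w3", "w1", "w2", "w3"],
    "label":  ["cat", "cat", "dog", "dog", "dog", "dog"],
})

# Majority vote per task; ties would need an explicit tie-breaking rule.
aggregated = (annotations.groupby("task")["label"]
              .agg(lambda labels: labels.mode().iloc[0]))
print(aggregated)   # t1 -> cat, t2 -> dog
```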
What a great set of tutorials. And there’s still so much more to explore. What about you? What are you most looking forward to at the conference? Feel free to share your suggestions down in the comments👇
References
[1] Pretrained Transformers for Text Ranking: BERT and Beyond, Lin et al. 2020.
[2] PARADE: Passage Representation Aggregation for Document Reranking, Li et al. 2020.
[3] What Is One Grain of Sand in the Desert? Analyzing Individual Neurons in Deep NLP Models, Dalvi et al. 2019.
[4] Analyzing Redundancy in Pretrained Transformer Models, Dalvi et al. 2020.
[5] A Comprehensive Survey on Graph Neural Networks, Wu et al. 2020.
[6] Re-Evaluating ADEM: A Deeper Look at Scoring Dialogue Responses, Sai et al. 2019.
[7] Towards a Better Metric for Evaluating Question Generation Systems, Nema and Khapra 2018.
[8] SciBERT: A Pretrained Language Model for Scientific Text, Beltagy et al. 2019.
[9] SPECTER: Document-level Representation Learning using Citation-informed Transformers, Cohan et al. 2020.