Practical NLP: Summarising Short and Long Speeches With Hugging Face’s Pipeline

Compared to sentiment analysis or classification, text summarisation is a far less common NLP task, due to the time and resources needed to execute it well. Hugging Face’s transformers pipeline has changed that. Here’s a quick demo of how you can summarise short and long speeches easily.

Chua Chin Hon
Towards Data Science


Screen grabs from PAP.org.sg (left) and WP.sg (right).

Summarising a speech is more art than science, some might argue. But recent advances in NLP could well test the validity of that argument.

In particular, Hugging Face’s (HF) transformers summarisation pipeline has made the task easier, faster and more efficient to execute. Admittedly, there’s still a hit-and-miss quality to current results. But there are also flashes of brilliance that hint at the possibilities to come as language models become more sophisticated.

This post will demonstrate how you can easily use HF’s pipeline to summarise both short and long speeches. A minor workaround is needed for long speeches, due to the maximum sequence length of the models used in the pipeline. I’ve also included the code for a simple text summarisation web app that you can run on your local machine.

1. DATA, NOTEBOOK AND CAVEATS

Here’s the GitHub repo for this post. The toy dataset used in this post is based on a small collection of political speeches/statements made during Singapore’s General Election in July 2020. The speeches, by politicians from the ruling People’s Action Party (PAP) as well as the Opposition, range from 236 to 3,746 words.

The simple workflow outlined in my notebook should work for any other collection of speeches you care to put together in a CSV file. The HF summarisation pipeline doesn’t work for non-English speeches as far as I know.

I’m using the pipeline out of the box, meaning the results stem from the default bart-large-cnn model. Feel free to test with other models tuned for this task.
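
For reference, getting a default summary takes just a few lines of code. Here’s a minimal sketch (the placeholder text and the min_length/max_length values are my own choices):

```python
from transformers import pipeline

# Instantiating the pipeline with no model argument loads the default
# summarisation model, which at the time of writing is bart-large-cnn
summarizer = pipeline("summarization")

speech = "..."  # placeholder: paste the full text of a short speech here

# min_length and max_length are measured in tokens, not words
result = summarizer(speech, min_length=50, max_length=150)
print(result[0]["summary_text"])
```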

2. SUMMARISING SHORT SPEECHES

It is unlikely that anyone needs to batch-run summaries of dozens or hundreds of short speeches in one go. But on the off chance that you do, the HF pipeline’s ease of use is a lifesaver.

Here, I ran the pipeline on seven speeches under 800 words. Results and time taken may vary according to the computer and version of the transformers library you are using.
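
A batch run amounts to a simple loop over the pipeline. Here’s a sketch of the idea, assuming the speeches sit in a “Text” column of a CSV file (the file and column names are my assumptions; see the notebook for the actual workflow):

```python
import pandas as pd
from transformers import pipeline

summarizer = pipeline("summarization")

# Assumed layout: one speech per row, full text in a "Text" column
df = pd.read_csv("speeches.csv")

df["Summary"] = df["Text"].apply(
    lambda text: summarizer(text, min_length=50, max_length=150)[0]["summary_text"]
)

df.to_csv("speech_summaries.csv", index=False)
```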

This was run on a 2018 Mac Mini with 64GB of RAM, using version 2.11.0 of the transformers library. You can download the results here and compare the summaries with the original text.


Some of the summaries are so-so. There’s a clear tendency for the model to pick out the first and second sentences of a speech. This could work well for summaries of news stories, but less so for political or corporate speeches.

But the model turned up one brilliant summary: the fifth, in which Workers’ Party (WP) leader Pritam Singh laid out his party’s main pitch to voters. See the original statement here, and here’s a side-by-side comparison with the summary by the Bart model.

Image on left: Screen grab from WP.sg.

In just 59 words, the Bart model perfectly distilled the essence of the WP’s campaign message from a 627-word statement. Those who are familiar with Singapore politics would appreciate how sharp this summary is, particularly given that the model was not tuned on data involving Singapore news or politics.

It is not unreasonable to ask whether this summary is a fluke or happy coincidence. I ran the same set of speeches through the latest version of transformers (version 3.0.2) and got a less impressive summary: “The workers party offers singaporeans a choice and an alternative voice in parliament . The wp has contributed significantly to the democratic processes in our country . Former pap mps who wrote their memoirs after have helped to expose the limitations of a parliament dominated by the pap”.

Consistency of quality appears to be a problem at this point. A fuller test, unfortunately, is outside the scope of this post.

3. SUMMARISING A LONG SPEECH

This is likely the use case for most people: summarising a long speech running into thousands of words. To get around the sequence length limits of the models used in the pipeline, I used a function to break the text into chunks of a fixed character length.

To keep this post concise, I decided to run the HF pipeline on the longest speech — 3,746 words — in the toy dataset. I broke up the speech into 9 chunks of 2,500 characters each.

The function allows you to easily set the character length that you prefer. This approach is admittedly clumsy, but I find it useful for a closer inspection of the results. Details are in my notebook.
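
The basic idea looks something like this (a simplified sketch, not the exact function from the notebook; the 2,500-character default mirrors the chunk size used above):

```python
from transformers import pipeline

summarizer = pipeline("summarization")

def chunk_text(text, chunk_length=2500):
    """Split the text into chunks of at most chunk_length characters."""
    return [text[i : i + chunk_length] for i in range(0, len(text), chunk_length)]

def summarise_long_speech(text, chunk_length=2500):
    """Summarise each chunk separately, then join the mini-summaries."""
    chunks = chunk_text(text, chunk_length)
    mini_summaries = [
        summarizer(chunk, min_length=40, max_length=120)[0]["summary_text"]
        for chunk in chunks
    ]
    return " ".join(mini_summaries)
```

Note that slicing on raw character counts can cut a sentence in half at a chunk boundary; splitting on sentence boundaries instead would be a natural refinement.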

Here’s the link to the original speech, and a CSV file comparing the original text with the mini-summaries can be downloaded here. This is a side-by-side comparison of how the model summarised it:

Image on left: Screen grab from PAP.org.sg. Download the CSV results file of the long-speech summarisation here.

The 536-word “combined summary” is not as brilliant as the WP example I highlighted above, but it’s a pretty decent first draft (except for the section highlighted in red, which I’ll discuss in a bit). If I’m in a crunch, this is something I can quickly edit into a more usable form.

Singaporean users will easily pick out an obvious mistake in the summary of the sixth section: “Pap Lee Kuan Yew seeks not just your mandate but your strong mandate to lead singapore through this crisis.”

The original paragraph reads: “Investors will scrutinise the election results, and act on their conclusions. So will others, both friends and adversaries of Singapore. That is why in this election, the PAP seeks not just your mandate, but your strong mandate, to lead Singapore through this crisis.”

Lee Kuan Yew, Singapore’s founding prime minister, died in March 2015. However, those with more than a passing knowledge of Singapore politics might flippantly argue that the model was in fact correct, not erroneous, in bringing up Lee Kuan Yew in the summary above. But that is a discussion for another day ;p.

4. WEB APP

I tried to deploy a simple web app to let non-notebook users try the summarisation pipeline, but discovered that free accounts on Pythonanywhere.com don’t have enough storage space for the libraries required, particularly pytorch.

So I uploaded a version that runs on local machines instead, in case you are keen to let colleagues at the workplace try it out. The code for the web app was adapted from two sources: the main one from GitHub and a second one by Plotly that uses DistilBart.

If the app returns an error, it means the text you’ve copied has exceeded the maximum number of tokens. Try summarising in smaller chunks or use the function in the notebook to break up the text. I’ll try to deploy the app online in future if Pythonanywhere.com includes pytorch as part of their pre-installed packages.
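
For a rough idea of what such a local app involves, here’s a minimal sketch using Flask (my choice for illustration; the actual app in the repo, adapted from the sources above, may be structured differently):

```python
from flask import Flask, request, render_template_string
from transformers import pipeline

app = Flask(__name__)
summarizer = pipeline("summarization")

PAGE = """
<form method="post">
  <textarea name="text" rows="15" cols="80"></textarea><br>
  <input type="submit" value="Summarise">
</form>
<p>{{ summary }}</p>
"""

@app.route("/", methods=["GET", "POST"])
def index():
    summary = ""
    if request.method == "POST":
        # Text beyond the model's maximum sequence length will raise an error
        text = request.form["text"]
        summary = summarizer(text, min_length=50, max_length=150)[0]["summary_text"]
    return render_template_string(PAGE, summary=summary)

if __name__ == "__main__":
    app.run(debug=True)
```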

5. CONCLUSION

With more language models being added by the week, there’s ample scope for experimentation and more robust tests of the HF pipeline’s summarisation output. My own experience so far is that using a different model doesn’t necessarily translate into better results.

Towards the end of my notebook, I switched to the T5 model for a trial, but found that the summaries for three short speeches weren’t better than those churned out by the default Bart model.
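
Swapping in another model is a one-line change to the pipeline call. A sketch of the idea (the specific model name and generation parameters here are illustrative):

```python
from transformers import pipeline

# Name a specific model and tokenizer instead of relying on the default Bart
t5_summarizer = pipeline("summarization", model="t5-base", tokenizer="t5-base")

speech = "..."  # placeholder: full text of a short speech
result = t5_summarizer(speech, min_length=50, max_length=150)
print(result[0]["summary_text"])
```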

But we are certainly getting a good glimpse of the future possibilities as companies like Hugging Face and OpenAI continue to push the NLP envelope. I’m certainly curious to see how GPT-3 performs on this task, given the impressive results it has achieved on a diverse range of tasks so far.

Meanwhile, those who want to experiment further by fine-tuning a transformer model on their own dataset can check out these two notebooks here and here. Finding a good dataset for your use case, however, could be more challenging than you think.

If you spot errors in this post or the notebook, ping me at:

Twitter: Chua Chin Hon

LinkedIn: www.linkedin.com/in/chuachinhon
