Introduction
Text summarisation refers to a set of Natural Language Processing (NLP) techniques that shorten the original text in a way that preserves its key information.
Text summarisation is useful wherever large amounts of text need to be consumed more easily and quickly. It can also be applied to audio files. In that case, the first step is to perform Speech-to-Text, and the resulting transcript is then used as input to the service that performs the summarisation.
In today’s article we are going to explore how to summarise audio and video files using an intuitive and very easy to use API.
Summarising Audio Files with a Simple API
We will perform audio summarisation using the Auto Chapters feature of the AssemblyAI API, which provides a summary over time for audio files that have been transcribed with its Speech-to-Text API.
As our audio file for this tutorial, we will use Biden’s speech to the US Congress, given on the 28th of April, 2021.
The first thing you need (especially if you plan to follow along with this tutorial) is an API key, which you can get for free from the AssemblyAI website.
The second thing we need to do is upload our audio file to AssemblyAI’s hosting service, which in turn gives us back a link that we’ll use in the subsequent request to perform the actual transcription and summarisation.
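A minimal sketch of the upload step, using only the Python standard library. The endpoint URL, the `authorization` header and the `upload_url` response key are assumptions based on AssemblyAI’s public v2 documentation and may have changed, so check them against the current docs. The helper names `build_upload_request` and `upload_audio` are hypothetical and used only for illustration.

```python
import json
import urllib.request

# Assumed upload endpoint (from AssemblyAI's public v2 docs; verify before use).
UPLOAD_ENDPOINT = "https://api.assemblyai.com/v2/upload"


def build_upload_request(audio_bytes: bytes, api_key: str) -> urllib.request.Request:
    """Build the POST request that sends raw audio bytes to the upload endpoint."""
    return urllib.request.Request(
        UPLOAD_ENDPOINT,
        data=audio_bytes,
        headers={
            "authorization": api_key,  # assumed: API key is sent as-is in this header
            "content-type": "application/octet-stream",
        },
        method="POST",
    )


def upload_audio(path: str, api_key: str) -> str:
    """Upload a local audio file and return the hosted URL from the response."""
    # Reads the whole file into memory, which is fine for a sketch but not
    # ideal for very large recordings.
    with open(path, "rb") as audio_file:
        request = build_upload_request(audio_file.read(), api_key)
    with urllib.request.urlopen(request) as response:
        return json.load(response)["upload_url"]  # assumed response key
```

Calling `upload_audio("speech.mp3", "<YOUR-API-KEY>")` (both values are placeholders) would then return the link used in the next request.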
The above call returns the upload URL that hosts our uploaded audio file. Now that we’ve done that, we can go ahead and get the transcription of the audio file as well as the summarised chapters generated by the AssemblyAI API.
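The transcription request could be sketched roughly as follows. This assumes the v2 `transcript` endpoint accepts a JSON body with `audio_url` and `auto_chapters`, returns a job `id`, and reports a `status` of `completed` or `error` when polled. These details come from AssemblyAI’s public documentation and should be verified. The function names and the `call` parameter (which makes the HTTP layer swappable) are hypothetical.

```python
import json
import time
import urllib.request

# Assumed transcript endpoint (from AssemblyAI's public v2 docs; verify before use).
TRANSCRIPT_ENDPOINT = "https://api.assemblyai.com/v2/transcript"


def build_transcript_payload(upload_url: str) -> dict:
    """Request body: the hosted audio URL plus the flag that enables Auto Chapters."""
    return {"audio_url": upload_url, "auto_chapters": True}


def _call(url: str, api_key: str, payload=None) -> dict:
    """POST the payload as JSON if one is given, otherwise GET; return parsed JSON."""
    data = None if payload is None else json.dumps(payload).encode("utf-8")
    request = urllib.request.Request(
        url,
        data=data,
        headers={"authorization": api_key, "content-type": "application/json"},
        method="GET" if payload is None else "POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def transcribe_with_chapters(upload_url, api_key, poll_seconds=5.0, call=_call):
    """Submit a transcription job and poll until it completes or fails."""
    job = call(TRANSCRIPT_ENDPOINT, api_key, build_transcript_payload(upload_url))
    while job["status"] not in ("completed", "error"):  # assumed status values
        time.sleep(poll_seconds)
        job = call(f"{TRANSCRIPT_ENDPOINT}/{job['id']}", api_key)
    if job["status"] == "error":
        raise RuntimeError(job.get("error", "transcription failed"))
    return job
```

Transcription runs asynchronously, which is why the sketch polls the job rather than expecting the result in the first response.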
In the above call, note that we have to set auto_chapters to True in order to instruct the API to perform summarisation on the transcribed text.
Interpreting the response
The response from the previous API call would look like the one shown below:
In the returned output, the full text transcription can be found under the text key, while the summarised chapters generated by AssemblyAI are under the chapters key. For every extracted chapter, the response also includes the start and end timestamps, a summary consisting of a couple of sentences that summarise the audio for that particular timeframe, and a headline.
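As an illustration of how these keys might be read, here is a small sketch that turns the chapters into printable lines. It assumes each chapter is a dictionary with `start`, `end`, `headline` and `summary` keys and that the timestamps are expressed in milliseconds. Both are assumptions to check against an actual response. The helper names are hypothetical.

```python
from typing import List


def format_timestamp(milliseconds: int) -> str:
    """Convert a millisecond offset into an HH:MM:SS string."""
    total_seconds = milliseconds // 1000
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"


def describe_chapters(response: dict) -> List[str]:
    """Return one line per chapter: time span, headline and summary."""
    lines = []
    # "chapters" may be missing or None if Auto Chapters was not enabled.
    for chapter in response.get("chapters") or []:
        start = format_timestamp(chapter["start"])  # assumed to be milliseconds
        end = format_timestamp(chapter["end"])
        lines.append(f"[{start}-{end}] {chapter['headline']}: {chapter['summary']}")
    return lines
```

Printing each line of `describe_chapters(response)` gives a quick table of contents for the recording, while the full transcript remains available under `response["text"]`.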
Full Code
The full code used in this tutorial to upload an audio file to the AssemblyAI API and perform speech-to-text and summarisation can be found below.
Final Thoughts
In today’s short guide we discussed how to summarise audio or video files using the AssemblyAI API feature called Auto Chapters. This tutorial covered only a small subset of the features offered by the API, so make sure to check the official documentation if you want to see the full list.