No matter your political leaning, the first 2020 Presidential Election Debate was a dumpster fire. It was hard for a human listener to follow what the candidates were shouting over each other. How will Amazon Transcribe do? Let’s set it up and take a look.
The Data
A basic Google search found the audio clip. I had to try a couple before I landed on one that didn’t have news commentary.
The Setup
I already have an AWS account and subscribed to Amazon Transcribe a month ago. All I had to do was click Create Job. I chose the General model and asked for a multi-speaker breakdown with a total of three speakers.
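
I set the job up in the console, but the same settings can probably be expressed in code. Here is a minimal sketch using boto3's start_transcription_job call, not what I actually ran. The job name, S3 URI, region, language code, and media format are all placeholders I made up for illustration; the audio would need to be uploaded to your own bucket first.

```python
def build_job_request(job_name, media_uri, max_speakers=None):
    """Build keyword arguments for boto3's start_transcription_job.

    job_name and media_uri are hypothetical placeholders; the language code
    and media format below are assumptions about the audio file.
    """
    request = {
        "TranscriptionJobName": job_name,
        "LanguageCode": "en-US",   # assumed language of the audio
        "MediaFormat": "mp3",      # assumed file type
        "Media": {"MediaFileUri": media_uri},
    }
    if max_speakers is not None:
        # Speaker identification: ask for labels and cap the speaker count.
        request["Settings"] = {
            "ShowSpeakerLabels": True,
            "MaxSpeakerLabels": max_speakers,
        }
    return request


def submit(request, region="us-east-1"):
    """Send the request to Amazon Transcribe (needs boto3 and AWS credentials)."""
    import boto3  # imported here so the request builder works without boto3
    client = boto3.client("transcribe", region_name=region)
    return client.start_transcription_job(**request)


# Two requests mirroring the two jobs: one with three speakers, one plain.
with_speakers = build_job_request("debate-speakers", "s3://my-bucket/debate.mp3", 3)
plain = build_job_request("debate-plain", "s3://my-bucket/debate.mp3")
```

Calling submit(with_speakers) would start the job; the builder is kept separate so the settings can be inspected without touching AWS.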

The speaker-identification job ran longer than the regular job, which makes sense. Considering the file was 90 minutes long, a 20-minute run time is reasonable.

The Results
I ran two jobs, one with multi-speaker identification and one with no breakdown. You can get a transcript preview, and you can download the entire transcript.

The transcript arrives in a JSON file. I converted it to text and loaded both sets to Kaggle so that you can review them on your own.
https://www.kaggle.com/silverfoxdss/20-presidential-debate-sept-29th-audio-transcript
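
If you want to do the JSON-to-text conversion yourself, something like the sketch below should work. It assumes the downloaded file has the usual Amazon Transcribe layout, with the full text under results, then transcripts; the file names in the usage note are placeholders, and this is not necessarily the exact code I used.

```python
import json


def transcript_text(job_output):
    """Return the plain-text transcript from a parsed Amazon Transcribe result.

    Assumes the full text lives under results -> transcripts -> transcript.
    """
    transcripts = job_output["results"]["transcripts"]
    return "\n".join(entry["transcript"] for entry in transcripts)


def convert_file(json_path, text_path):
    """Read a Transcribe JSON file and write its transcript as plain text."""
    with open(json_path, encoding="utf-8") as source:
        job_output = json.load(source)
    with open(text_path, "w", encoding="utf-8") as target:
        target.write(transcript_text(job_output))


# Tiny hand-made sample in the assumed layout (not real job output).
sample = {
    "results": {
        "transcripts": [{"transcript": "We have ended the segment."}],
        "items": [],
    }
}
print(transcript_text(sample))
```

For a real file you would call convert_file("debate.json", "debate.txt"), with both names swapped for your own.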
The Transcripts
Well, Amazon Transcribe straight out of the box couldn’t figure that mess out either! I highlighted two spots in the debate where I remember overlapping conversations. Yes, I used the word ‘conversations’ loosely.
"Uh, Mr Vice President, if Senate Republicans, we were talking originally about the Supreme Court here if Senate Republicans go ahead and confirm Justice Barrett, Uh, there has been talk about ending the filibuster or even packing the court, adding to the nine justices there. You call this a distraction by the president. But in fact, it wasn’t brought up by the president. It was brought up by some of your Democratic colleagues in understanding the Congress. So my question to you is you have refused in the past to talk about it. Are you willing to tell the American people tonight whether or not you will support either ending the filibuster or packing the whatever position I taken that that will become the issue?
The issue is the American people should speak. You should go out and vote. You’re in voting now. Vote and let your senators know how strongly you feel. Let vote now, make sure you, in fact, let people know where.
Senators, I’m not going to answer the question, because the answer, because the question is the question is, is is radical left. Would you shut up, man? Who is on your list? Joe, this is on your gentlemen. I think this is unprintable. Pack the court. We have not given.
We have ended the segment. We’re gonna move on to the second segment.
That was really productive. Segment wasn’t Keep yapping, man. The people understand, Joe. 47 years, you’ve done nothing. They understand. All right.
The second subject is covert 19 which is an awfully serious subject, So let’s try to be serious about it.
Here is a small snippet of the JSON trying to split the conversation up. It appears to break the conversation up by voice but doesn’t seem to identify which speaker is which. The JSON needs additional code to make it readable and useful.
{"transcript":"gentleman. I think this is unprintable. Pack the court. We have not given. We have ended the segment. We're gonna move on to the second segment. That was really productive. Segment wasn't. Yeah.","items":[{"start_time":"1013.97","confidence":"0.368","end_time":"1014.8","type":"pronunciation","content":"gentleman"},{"confidence":"0.0","type":"punctuation","content":"."},{"start_time":"1014.81","confidence":"1.0","end_time":"1014.93","type":"pronunciation","content":"I"},{"start_time":"1014.93","confidence":"1.0","end_time":"1015.38","type":"pronunciation","content":"think"},{"start_time":"1015.39","confidence":"0.998","end_time":"1015.57","type":"pronunciation","content":"this"},{"start_time":"1015.57","confidence":"0.998","end_time":"1015.75","type":"pronunciation","content":"is"},{"start_time":"1015.76","confidence":"0.861","end_time":"1016.76","type":"pronunciation","content":"unprintable"},{"confidence":"0.0","type":"punctuation","content":"."},{"start_time":"1016.76","confidence":"0.5","end_time":"1017.01","type":"pronunciation","content":"Pack"},{"start_time":"1017.01","confidence":"0.782","end_time":"1017.11","type":"pronunciation","content":"the"},{"start_time":"1017.11","confidence":"0.907","end_time":"1017.5","type":"pronunciation","content":"court"},{"confidence":"0.0","type":"punctuation","content":"."},{"start_time":"1017.51","confidence":"0.586","end_time":"1017.64","type":"pronunciation","content":"We"},{"start_time":"1017.64","confidence":"0.647","end_time":"1017.87","type":"pronunciation","content":"have"},{"start_time":"1017.88","confidence":"0.976","end_time":"1018.27","type":"pronunciation","content":"not"},{"start_time":"1018.27","confidence":"0.188","end_time":"1018.73","type":"pronunciation","content":"given"},{"confidence":"0.0","type":"punctuation","content":"."},{"start_time":"1018.74","confidence":"0.456","end_time":"1018.93","type":"pronunciation","content":"We"},{"start_time":"1018.93","confidence":"0.978","end_time":"1019.09","type":"pronunciation","content":"have"},{"start_time":"1019.09","confidence":"1.0","end_time":"1019.56","type":"pronunciation","content":"ended"},{"start_time":"1019.57","confidence":"0.693","end_time":"1019.73","type":"pronunciation","content":"the"},{"start_time":"1019.73","confidence":"1.0","end_time":"1020.19","type":"pronunciation","content":"segment"},{"confidence":"0.0","type":"punctuation","content":"."},{"start_time":"1020.19","confidence":"0.981","end_time":"1020.28","type":"pronunciation","content":"We're"},{"start_time":"1020.28","confidence":"0.938","end_time":"1020.44","type":"pronunciation","content":"gonna"},{"start_time":"1020.44","confidence":"1.0","end_time":"1020.62","type":"pronunciation","content":"move"},{"start_time":"1020.62","confidence":"0.779","end_time":"1020.76","type":"pronunciation","content":"on"},{"start_time":"1020.76","confidence":"0.779","end_time":"1020.84","type":"pronunciation","content":"to"},{"start_time":"1020.84","confidence":"1.0","end_time":"1020.95","type":"pronunciation","content":"the"},{"start_time":"1020.95","confidence":"1.0","end_time":"1021.32","type":"pronunciation","content":"second"},{"start_time":"1021.32","confidence":"0.998","end_time":"1021.83","type":"pronunciation","content":"segment"},{"confidence":"0.0","type":"punctuation","content":"."},{"start_time":"1021.84","confidence":"0.908","end_time":"1021.97","type":"pronunciation","content":"That"},{"start_time":"1021.97","confidence":"1.0","end_time":"1022.1","type":"pronunciation","content":"was"},{"start_time":"1022.1","confidence":"0.998","end_time":"1022.4","type":"pronunciation","content":"really"},{"start_time":"1022.41","confidence":"1.0","end_time":"1023.15","type":"pronunciation","content":"productive"},{"confidence":"0.0","type":"punctuation","content":"."},{"start_time":"1023.15","confidence":"0.998","end_time":"1023.53","type":"pronunciation","content":"Segment"},{"start_time":"1023.53","confidence":"0.72","end_time":"1023.97","type":"pronunciation","content":"wasn't"},{"confidence":"0.0","type":"punctuation","content":"."},{"start_time":"1024.44","confidence":"0.024","end_time":"1024.5","type":"pronunciation","content":"Yeah"},{"confidence":"0.0","type":"punctuation","content":"."}]}]},{"start_time":"1021.97","end_time":"1023.97","alternatives":
Conclusion
Phew! The transcript and JSON are as confusing as the debate was on live TV. I leave you with a word cloud of the debate text. It will be interesting to compare word clouds across the debates. Does the tone get more optimistic and upbeat in a town hall setting?
