Can we generate Automatic Cricket Commentary using Neural Networks?

Urwa Muaz
Towards Data Science
4 min read · May 20, 2019

Like everything else, the world of cricket has gone through many technological transformations in recent years. The way cricket is played and how it is viewed around the world have both changed as a result. In this post, we discuss whether neural networks are capable of generating cricket commentary just by watching the game.

There has been some work on this in the literature (can be found here, here and here), but it does not use neural networks. Being a believer in end-to-end deep learning, I think neural networks will seal the deal on this task in the near future. This is a hard problem to tackle because, apart from visual feature extraction, it involves very complex temporal dynamics and the handling of long-term dependencies. Commentary is generally highly contextualized by the development of the current game, its significance in a broader perspective (friendly match vs. tournament), and the histories of the teams and players involved. A decontextualized explanation of what is happening appears to be an easier problem to solve, and I can think of an architecture that can be used for modelling it.

Drawing ideas from the recent emergence of spatio-temporal neural networks, I think a reasonable architecture should include a convolutional neural network to extract visual features from static frames, a recurrent neural network to model the complex non-linear temporal dynamics of these features, and an encoder-decoder architecture on top of them for end-to-end (video to commentary) learning. It seems manageable to build a decent amount of training data for this network, with cricket footage as the input and commentary as the supervision signal. I spy a very promising project idea here for anyone interested!
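To make the data flow of such an architecture concrete, here is a minimal sketch in plain Python. It is an illustration under loud assumptions, not a working commentator: the weights are random and untrained, the vocabulary, layer sizes, and function names (`cnn_features`, `rnn_step`, `generate_commentary`) are all hypothetical, a plain tanh RNN stands in for whatever recurrent cell one would really pick, and the "clip" is random noise standing in for footage. A real attempt would use a deep learning framework and train on paired footage and commentary.

```python
import math
import random

random.seed(0)  # fixed seed so the untrained sketch is reproducible

# Hypothetical toy vocabulary and layer sizes, chosen only for illustration.
VOCAB = ["<start>", "<end>", "good", "length", "driven", "through", "covers", "four"]
N_KERNELS, HIDDEN = 4, 6


def rand_matrix(rows, cols):
    """Random (untrained) weight matrix as a list of rows."""
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(m, v):
    """Matrix-vector product for list-of-rows matrices."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def cnn_features(frame, kernels):
    """Convolve one grayscale frame with each 3x3 kernel, apply ReLU,
    then global-average-pool to get one feature per kernel."""
    h, w = len(frame), len(frame[0])
    feats = []
    for k in kernels:
        total = 0.0
        for i in range(h - 2):
            for j in range(w - 2):
                act = sum(k[a][b] * frame[i + a][j + b]
                          for a in range(3) for b in range(3))
                total += max(act, 0.0)
        feats.append(total / ((h - 2) * (w - 2)))
    return feats


def rnn_step(w_in, w_hid, x, h):
    """One step of a plain tanh RNN: h' = tanh(W_in x + W_hid h)."""
    return [math.tanh(p + q) for p, q in zip(matvec(w_in, x), matvec(w_hid, h))]


# Randomly initialised parameters for the three components.
kernels = [rand_matrix(3, 3) for _ in range(N_KERNELS)]
enc_in, enc_hid = rand_matrix(HIDDEN, N_KERNELS), rand_matrix(HIDDEN, HIDDEN)
embed = rand_matrix(len(VOCAB), HIDDEN)  # one embedding row per token
dec_in, dec_hid = rand_matrix(HIDDEN, HIDDEN), rand_matrix(HIDDEN, HIDDEN)
out_proj = rand_matrix(len(VOCAB), HIDDEN)


def generate_commentary(frames, max_len=8):
    # Encoder: CNN features per frame, summarised over time by the RNN.
    h = [0.0] * HIDDEN
    for frame in frames:
        h = rnn_step(enc_in, enc_hid, cnn_features(frame, kernels), h)
    # Decoder: start from the encoder state and greedily emit tokens.
    token, words = VOCAB.index("<start>"), []
    for _ in range(max_len):
        h = rnn_step(dec_in, dec_hid, embed[token], h)
        scores = matvec(out_proj, h)
        token = max(range(len(VOCAB)), key=scores.__getitem__)
        if VOCAB[token] == "<end>":
            break
        words.append(VOCAB[token])
    return words


# A fake 5-frame clip of 8x8 grayscale frames stands in for real footage.
clip = [[[random.random() for _ in range(8)] for _ in range(8)] for _ in range(5)]
print(generate_commentary(clip))
```

Since nothing here is trained, the printed words are meaningless; the point is only the shape of the pipeline: frames go through the CNN one at a time, the encoder RNN folds the per-frame features into a single state, and the decoder unrolls that state into a token sequence.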

Cricket shot classification appears to be a vital component of such an automatic commentary generation system. A very interesting recent work focuses on this problem and uses a CNN- and LSTM-based architecture to classify video clips into the relevant shots; it shows promising results. Player localization and pose estimation are very important for accurate shot classification. In the following sections, we will do a rudimentary exploration of the efficacy of human pose estimation for identifying cricket shots from static images.
