Understanding the Backbone of Video Classification: The I3D Architecture
4 min read · Jun 7, 2020
What most distinguishes a video from a single image is the temporal dimension: a video carries information in how frames change over time, not only in what each frame contains. To capture this, deep learning architectures for video extend 2D image processing to 3D, so that convolutions operate across time as well as space. This article summarizes the architectural changes from image models to video models, using the I3D model as the example.
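To make the temporal element concrete, here is a minimal pure-Python sketch of a 3D convolution whose kernel spans two frames. It is only an illustration, not the I3D implementation: the helper `conv3d_valid`, the `temporal_diff` kernel, and the toy frames are all hypothetical names and data chosen for this example.

```python
def conv3d_valid(clip, kernel):
    """Naive 'valid' 3D cross-correlation on nested lists indexed [t][y][x]."""
    T, H, W = len(clip), len(clip[0]), len(clip[0][0])
    kt, kh, kw = len(kernel), len(kernel[0]), len(kernel[0][0])
    out = []
    for t in range(T - kt + 1):          # slide along time
        plane = []
        for y in range(H - kh + 1):      # slide along height
            row = []
            for x in range(W - kw + 1):  # slide along width
                acc = 0
                for dt in range(kt):
                    for dy in range(kh):
                        for dx in range(kw):
                            acc += clip[t + dt][y + dy][x + dx] * kernel[dt][dy][dx]
                row.append(acc)
            plane.append(row)
        out.append(plane)
    return out


# Hypothetical kernel of shape (2, 1, 1): next frame minus current frame.
temporal_diff = [[[-1]], [[1]]]

# Toy 2x3 frames: in frame_b the bright pixel has moved one step to the right.
frame_a = [[0, 1, 0], [0, 0, 0]]
frame_b = [[0, 0, 1], [0, 0, 0]]

static_clip = [frame_a, frame_a]  # nothing changes between frames
moving_clip = [frame_a, frame_b]  # the pixel moves

static_response = conv3d_valid(static_clip, temporal_diff)
moving_response = conv3d_valid(moving_clip, temporal_diff)

print(static_response)
print(moving_response)
```

The static clip produces an all-zero response, while the moving clip produces non-zero values where the pixel left and where it arrived. A 2D kernel applied to either frame alone could not tell these two clips apart, which is the motivation for adding a temporal axis to the kernel.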