Speeding up vision transformer prediction by 9 times with PyTorch, ONNX and TensorRT

How to use 16-bit floats, TensorRT, network rewriting, and multi-threading to dramatically speed up deep learning model prediction

Wei Yi
Towards Data Science
11 min read · Jun 4, 2023


Photo by Sanjeevan SatheesKumar on Unsplash

Vision transformers such as UNET and SwinUNETR are state-of-the-art in computer vision tasks such as semantic…


I am a principal data scientist at AstraZeneca. Previously I worked at SecondMind and Microsoft Research, and was CTO of the hedge fund EQB.