An empirical approach to speed up your BERT inference with ONNX/TorchScript

Maxence Alluin
Towards Data Science
8 min read · Feb 5, 2021

Photo by Denny Müller on Unsplash

In recent years, models based on the Transformer architecture have been the driving force behind NLP breakthroughs in both research and industry. BERT, XLNet, GPT, and XLM are among the models that advanced the state of the art and reached the top of popular benchmarks such as GLUE.
