An empirical approach to speed up your BERT inference with ONNX/TorchScript
Published Feb 5, 2021
In recent years, models based on the Transformer architecture have been the driving force behind NLP breakthroughs in research and industry. BERT, XLNet, GPT, and XLM are some of the models that advanced the state of the art and reached the top of popular benchmarks like GLUE.