GPTQ or bitsandbytes: Which Quantization Method to Use for LLMs — Examples with Llama 2

Large language model quantization for affordable fine-tuning and inference on your computer

Benjamin Marie
Towards Data Science
7 min read · Aug 25, 2023

Image by the author — Made with an illustration from Pixabay

As large language models (LLMs) have grown larger, with more and more parameters, new techniques have been proposed to reduce their memory usage.
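
To give a concrete sense of what this looks like in practice, here is a minimal sketch of loading a Llama 2 checkpoint in 4-bit with bitsandbytes through Hugging Face Transformers. The checkpoint name and prompt are illustrative examples, and the exact arguments depend on your versions of transformers, bitsandbytes, and accelerate.

```python
# Minimal sketch: 4-bit loading with bitsandbytes via Transformers.
# The model name and prompt below are illustrative examples.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_name = "meta-llama/Llama-2-7b-hf"  # example checkpoint (gated; requires access)

# NF4 quantization with float16 compute, a common setup for memory-efficient inference
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=bnb_config,  # weights are quantized on the fly at load time
    device_map="auto",
)

inputs = tokenizer("Quantization reduces memory usage by", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=30)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```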
