GPTQ or bitsandbytes: Which Quantization Method to Use for LLMs — Examples with Llama 2
Large language model quantization for affordable fine-tuning and inference on your computer
7 min read · Aug 25, 2023
As large language models (LLMs) have grown larger, with ever more parameters, new techniques to reduce their memory usage have been proposed.