Hosting Multiple LLMs on a Single Endpoint
Utilize SageMaker Inference Components to Host Flan & Falcon in a Cost & Performance Efficient Manner
10 min read · Jan 11, 2024
The past year has seen an explosion in the Large Language Model (LLM) space, with many new models arriving alongside technologies and tools to help train, host, and evaluate them. Specifically…