Author: Shirley Li
-
Key architecture innovation behind DeepSeek-V2 and DeepSeek-V3 for faster inference
9 min read -
Mastering the art of fine-tuning: Learnings for training your own LLMs.
22 min read -
Scaling from 117M to 175B: Insights into GPT-2 and GPT-3.
10 min read -
Understanding the Evolution of ChatGPT: Part 1-An In-Depth Look at GPT-1 and What Inspired It
Deep Learning
Tracing the roots of ChatGPT: GPT-1, the foundation of OpenAI’s LLMs
11 min read