Large Language Models
While building my own LLM-based application, I found many prompt engineering guides, but few equivalent…
8 min read -
Key architecture innovation behind DeepSeek-V2 and DeepSeek-V3 for faster inference
9 min read -
Exploring techniques to prompt VLMs
21 min read -
A deep dive into “Not All Tokens Are What You Need for Pretraining”
7 min read -
Mastering the art of fine-tuning: Learnings for training your own LLMs.
22 min read -
How to become an LLM Scientist or Engineer from scratch
21 min read -
A LangGraph-based advanced agentic RAG with standard business guides, AI-based web search, trusted sources, and…
35 min read -
Scaling from 117M to 175B: Insights into GPT-2 and GPT-3.
10 min read -
Speeding Up Llama: A Hybrid Approach to Attention Mechanisms
12 min read