Author: Alex Dremov
Find out how Flash Attention works. Afterward, we’ll refine our understanding by writing a GPU…
7 min read
We’ll begin with torch.compile, move on to writing a custom Triton kernel, and finally dive…
5 min read
If all machine learning engineers want one thing, it’s faster model training, maybe after good test…
12 min read