Author: Harys Dalvi
How “election pundit predictions” betray a misunderstanding of probability
10 min read
Looking into the math and the data reveals that transformers are both overused and underused.
15 min read
The traditional reasoning behind why we need nonlinear activation functions is only one dimension of…
9 min read
Data scientists use EDA for everything. Why not word embeddings?
16 min read