How Does the Segment-Anything Model’s (SAM’s) Encoder Work?

A deep dive into how image content embedding, sine and cosine positional embedding, guidance click embedding, and dense mask embedding are produced

Wei Yi
Towards Data Science
16 min readMay 14, 2024

--

Photo by benjamin lehman on Unsplash

This article explains how the Segment-Anything model’s (SAM) encoder works. SAM uses a ImageEncoderViT class to produce…

--

--

I am a principal data scientist at AstraZeneca. Previously I worked at SecondMind, Microsoft Research, and also was CTO of a hedge fund EQB.