Paper explained: Masked Autoencoders Are Scalable Vision Learners
How reconstructing masked image patches yields strong self-supervised representations
Dec 29, 2021
Autoencoding approaches have a history of success in Natural Language Processing. The BERT model masks words at random positions in a sentence and learns to reconstruct the full sentence by predicting the words that fill the blanks. Recent work has aimed to transfer this idea to the computer vision domain.
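As a minimal sketch of the masking idea behind BERT-style pretraining: a fraction of tokens is replaced by a special mask symbol, and the model is trained to predict the originals. The `mask_tokens` helper and its defaults here are illustrative, not BERT's actual implementation (BERT masked roughly 15% of tokens).

```python
import random

def mask_tokens(tokens, mask_ratio=0.15, mask_token="[MASK]"):
    # Illustrative helper: randomly replace a fraction of tokens
    # with a mask symbol, returning the masked sequence and the
    # positions whose originals the model would have to predict.
    masked = list(tokens)
    n_mask = max(1, int(len(tokens) * mask_ratio))
    positions = random.sample(range(len(tokens)), n_mask)
    for i in positions:
        masked[i] = mask_token
    return masked, positions

tokens = "the quick brown fox jumps over the lazy dog".split()
masked, positions = mask_tokens(tokens)
print(masked)  # e.g. ['the', 'quick', '[MASK]', 'fox', ...]
```

The training targets are simply the original tokens at the masked positions; everything else in the input stays visible to the model.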