Why you should NOT use MS MARCO to evaluate semantic search
And likely not many other widely used datasets either
7 min read · Mar 23, 2020
If we want to investigate the power and limitations of semantic vectors (pre-trained or not), we should ideally prioritize datasets that are less biased towards term-matching signals. This piece shows that the MS MARCO dataset is more biased towards those signals than we expected and that the same issues are likely present in many other datasets due to similar data collection designs.