Duplicate question detection using Word2Vec, XGBoost and Autoencoders

Akhilesh Khope
Towards Data Science
5 min readOct 9, 2018

--

Photo by Emily Morter on Unsplash

In this post, I tackle the problem of classifying questions pairs based on whether they are duplicate or not duplicate. This is important for companies like Quora, or Stack Overflow where multiple questions posted are duplicates of questions already answered. If a duplicate question is spotted by an algorithm, the user can be directed to it and reach the answer faster.

--

--