Classifying 4M Reddit Posts in 4k Subreddits: an End-to-end Machine Learning Pipeline

Apply multi-label text classification to build a data product with fastText and FastAPI.

Ari Bajo
Towards Data Science
9 min read · Apr 3, 2020

End-to-end ML pipeline with data artifacts, executions and API endpoints.

Finding the right subreddit for your post can be tricky, especially if you are new to Reddit. There are thousands of active subreddits with overlapping content. If it…

Freelance Data Engineer & Technical Writer. I digest 50+ tech blogs by French companies → guriosity.com