NLP Tools

Read a Multi-Column PDF Using PyMuPDF in Python

A step-by-step introduction to the wonderful world of OCR (with pictures)

Ednalyn C. De Dios
Towards Data Science
5 min read · Feb 22, 2022

Photo by Jaizer Capangpangan on Unsplash

OCR, or optical character recognition, is the technology used to automate text extraction from an image or a document. The text found in these images and documents can be anything typed, handwritten, displayed on…
