Extracting Text from a PDF Using Python
Jan. 6, 2019
Recently I needed to extract text from a PDF file using Python. Quick googling led me to the PyPDF2 package, but I wasn't able to extract any text from my test PDF with it. The test PDF was created with Google Docs (a very common scenario) and did not have any fancy formatting, so PyPDF2 was disqualified for my purposes. After further googling I found the pdfminer package and its Python 3 compatible version, pdfminer.six. (...)
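As a rough illustration of where this leads, here is a minimal sketch of text extraction with pdfminer.six. It assumes a reasonably recent pdfminer.six release that ships the `pdfminer.high_level.extract_text` helper; releases from around the time of this post may only offer the lower-level API. The names `tidy_extracted_text` and `pdf_to_text` are hypothetical helpers made up for this example and are not part of the library.

```python
import re


def tidy_extracted_text(raw: str) -> str:
    """Normalize whitespace that PDF text extraction tends to leave behind.

    Hypothetical helper for this sketch, not part of pdfminer.six.
    """
    # pdfminer's text output typically ends each page with a form feed;
    # treat it as a paragraph break.
    text = raw.replace("\x0c", "\n\n")
    # Drop trailing spaces and tabs at the end of each line.
    text = re.sub(r"[ \t]+\n", "\n", text)
    # Collapse runs of three or more newlines into a single blank line.
    text = re.sub(r"\n{3,}", "\n\n", text)
    return text.strip()


def pdf_to_text(path: str) -> str:
    """Extract and tidy the text of the PDF at `path`.

    Assumes pdfminer.six is installed (pip install pdfminer.six).
    """
    # Imported lazily so this module still loads without pdfminer.six.
    from pdfminer.high_level import extract_text

    return tidy_extracted_text(extract_text(path))
```

Calling `pdf_to_text("test.pdf")` (the file name is just a placeholder) should return the document's text as one string. How well the output matches the visual layout depends on the PDF itself, so it is worth checking the result against your own test file.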