
Is there a better way to get text from pdf files using Python?

Hello. Rank Python newbie here with a question. I have been working with text extracted from PDFs using Python's PyPDF2 library. The code itself works well and cycles through multiple PDFs without problems, EXCEPT that the extracted text is low quality. I've had to do a lot of manual cleanup on the text, and it's time-consuming. On a whim, I manually selected and copied the contents of a PDF and pasted them into Notepad. I had previously converted this same PDF to text with PyPDF2, and the difference in quality between the two text files was staggering. PyPDF2's text extraction just doesn't stand up to a manual copy and paste (C&P). If I had had C&P text files from the start, I could have saved myself a lot of time. I get a high number of new PDFs every day and do not have the time to C&P them manually. That said, here's my question:

Is there a way to use Python to select and copy the contents of a PDF file as if it were being done manually, and then paste them into a text file, rather than using the standard PyPDF2 text extraction method?
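
For reference, the current workflow looks roughly like the sketch below. This is not the exact script, just a minimal reconstruction: the function names and folder names are placeholders, and it assumes PyPDF2 3.x, where `PdfReader` and `page.extract_text()` are the entry points.

```python
from pathlib import Path


def extract_with_pypdf2(pdf_path):
    """Return the text of every page in one PDF, joined by newlines.

    Assumes PyPDF2 3.x is installed. The import is done lazily so the
    batch helper below can be used with a different extractor.
    """
    from PyPDF2 import PdfReader

    reader = PdfReader(str(pdf_path))
    # extract_text() can come back empty for image-only pages.
    return "\n".join((page.extract_text() or "") for page in reader.pages)


def convert_folder(src_dir, out_dir, extract=extract_with_pypdf2):
    """Write one .txt file per .pdf found in src_dir (placeholder layout)."""
    src_dir, out_dir = Path(src_dir), Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for pdf_path in sorted(src_dir.glob("*.pdf")):
        txt_path = out_dir / (pdf_path.stem + ".txt")
        txt_path.write_text(extract(pdf_path), encoding="utf-8")
        written.append(txt_path)
    return written


# Example call with placeholder folder names:
# convert_folder("incoming_pdfs", "extracted_text")
```

The `extract` argument is there so the looping part can stay the same if the extraction step gets swapped for something that produces cleaner text.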


Posted
1 year ago