Hello. Rank Python newbie here with a question. I have been working with text converted from PDFs using Python. The code itself works well and cycles through multiple PDFs without trouble; the problem is the low quality of the extracted text. I've had to do a lot of manual cleanup, and it's time-consuming. On a whim, I manually selected and copied the contents of a PDF and pasted them into a text file. I had previously converted the same PDF to text with Python (PyPDF2), and the difference in quality between the two text files was staggering. PyPDF2's text extraction just doesn't stand up to a manual copy and paste. If I had had copy-and-pasted text files from the start, I could have saved myself a lot of time. I receive a large number of new PDFs every day and don't have time to copy and paste them manually. That said, here's my question:
Is there a way to use Python to select and copy the contents of a PDF file as if it were being done manually, and then paste them into a text file, rather than using the standard PyPDF2 text extraction method?
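For context, the batch workflow described above (looping over a folder of PDFs and extracting text with PyPDF2) might look roughly like the sketch below. It is not the original code: the function names and the `extract` parameter are hypothetical, and it assumes a PyPDF2 version that provides `PdfReader` and `page.extract_text()` (2.x or later).

```python
from pathlib import Path


def extract_with_pypdf2(pdf_path):
    """Extract text from every page of one PDF using PyPDF2."""
    # Imported here so the rest of the sketch runs without PyPDF2 installed.
    from PyPDF2 import PdfReader

    reader = PdfReader(str(pdf_path))
    # Fall back to "" so a page that yields no text does not break the join.
    return "\n".join(page.extract_text() or "" for page in reader.pages)


def convert_folder(pdf_dir, txt_dir, extract=extract_with_pypdf2):
    """Write one .txt file per .pdf found in pdf_dir; return the paths written."""
    pdf_dir, txt_dir = Path(pdf_dir), Path(txt_dir)
    txt_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for pdf_path in sorted(pdf_dir.glob("*.pdf")):
        txt_path = txt_dir / (pdf_path.stem + ".txt")
        txt_path.write_text(extract(pdf_path), encoding="utf-8")
        written.append(txt_path)
    return written
```

A call such as `convert_folder("incoming_pdfs", "extracted_text")` (both folder names are placeholders) would process every PDF in the first folder. Keeping the extractor as a parameter means a different text-extraction function could be swapped in later without touching the loop.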