Let OpenAI's new GPT-4-Vision see PDFs, URLs, Images, and Video with one line of Python

This post has been de-listed

It is no longer included in search results and normal feeds (front page, hot posts, subreddit posts, etc). It remains visible only via the author's post history.

Post Flair (click to view more posts with a particular flair)

Resource

Post Body

What it is:

The Pipe is a tool for feeding visually complex data (pdf, urls, docx, pptx, csv, youtube videos, etc) into vision-language models such as GPT-4. It is open source and entirely written in Python, so hopefully I posted this in the right place lol. this

Why I made this:

I tried to make an application to chat with my documents and web pages. Traditional scrapers would just give me only text, and would fail to accurately scrape most of the text anyways. This tool allows you to give GPT high quality text AND visual data in an LLM-ready prompt format, so it can see the document just like you or me.

Cheers!

thepi.pe

Author

User Disabled

Account Strength

Disabled 5 months ago

Account Age

9 years

Verified Email

Yes

Verified Flair

Total Karma

9,205

Link Karma

7,283

Comment Karma

1,805

Profile updated: 4 days ago

Posts updated: 5 months ago

Emcf

Subreddit

r/Python

Post Details

We try to extract some basic information from the post title. This is not always successful or accurate, please use your best judgement and compare these values to the post title and body for confirmation.

Posted: 6 months ago
Reddit URL: View post on reddit.com
External URL: reddit.com/r/Python/comm...