I'm looking for a good pretrained image captioning framework that I can fine-tune. Ideally, I'd like a GitHub link to a state-of-the-art captioning codebase that can generate long-form (1-4 sentence) captions. Basically, something that uses some form of CNN to produce a latent embedding, which is then fed to a transformer-based language model or similar. I've built this from scratch for a class project in the past, but for a real-world project I'd prefer something more robust than my own solution.
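For concreteness, here is a minimal sketch of the kind of architecture described above, assuming PyTorch. The class name `CaptionModel`, the layer sizes, and the vocabulary size are illustrative placeholders and are not taken from any particular codebase; the toy CNN stands in for whatever pretrained backbone a real framework would use.

```python
import torch
import torch.nn as nn


class CaptionModel(nn.Module):
    """Hypothetical CNN encoder + transformer decoder captioner (toy sizes)."""

    def __init__(self, vocab_size=1000, d_model=128, max_len=64):
        super().__init__()
        # Toy CNN: maps an image to a 4x4 grid of d_model-dimensional features.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
            nn.Conv2d(32, d_model, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d((4, 4)),
        )
        self.token_emb = nn.Embedding(vocab_size, d_model)
        self.pos_emb = nn.Embedding(max_len, d_model)
        layer = nn.TransformerDecoderLayer(
            d_model=d_model, nhead=4, dim_feedforward=256, batch_first=True
        )
        self.decoder = nn.TransformerDecoder(layer, num_layers=2)
        self.lm_head = nn.Linear(d_model, vocab_size)

    def forward(self, images, tokens):
        # images: (B, 3, H, W); tokens: (B, T) integer token ids.
        feats = self.encoder(images)                # (B, d_model, 4, 4)
        memory = feats.flatten(2).transpose(1, 2)   # (B, 16, d_model)

        seq_len = tokens.size(1)
        positions = torch.arange(seq_len, device=tokens.device)
        x = self.token_emb(tokens) + self.pos_emb(positions)

        # Causal mask: position i may only attend to positions <= i.
        causal = torch.triu(
            torch.full((seq_len, seq_len), float("-inf"), device=tokens.device),
            diagonal=1,
        )
        out = self.decoder(x, memory, tgt_mask=causal)
        return self.lm_head(out)                    # (B, T, vocab_size)


if __name__ == "__main__":
    model = CaptionModel(vocab_size=50)
    images = torch.randn(2, 3, 32, 32)
    tokens = torch.randint(0, 50, (2, 7))
    print(model(images, tokens).shape)
```

The decoder cross-attends to the flattened CNN feature grid, so each generated token can look at every image region. For training, one would typically feed the caption shifted by one position and apply a cross-entropy loss on the logits.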
Any tips appreciated!
Posted 4 years ago