Hi. For some context, I'm not much into JS, so if this can be done with Python and Django alone, I'd prefer that.
I've explored the JSON2Video API, but I'm still working through the documentation to see how exactly to implement my idea for my use case.
I want to make a program where the user can upload an audio file, say a song or a podcast recording. The site then creates a transcript to apply as subtitles (I tried AssemblyAI). Since the AI makes mistakes, the user should be able to correct the transcribed words, and the corrected subtitles should still match the timing of the audio. The user should also be able to apply different styles to the transcript to create a video that plays the audio with the new subtitles. For example, they should be able to change the background color of one word to blue and another to green, and change fonts, bold, size, etc.
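One way to picture the data side of this, as a minimal sketch rather than a finished design: keep each word with its own start and end time, so correcting the text never touches the timing, and attach style settings per word. The names `Word`, `correct_word`, and `word_to_dialogue` are hypothetical, and the sketch assumes the transcription service returns per-word timestamps in milliseconds. It emits ASS subtitle `Dialogue` lines (one per word), which is only one possible target format. A full `.ass` file would also need its header sections, and per-word background color is not covered here.

```python
from dataclasses import dataclass, field


@dataclass
class Word:
    """One transcribed word with its timing (hypothetical structure)."""
    text: str
    start_ms: int  # assumed: milliseconds from the start of the audio
    end_ms: int
    style: dict = field(default_factory=dict)  # e.g. {"color": "0000FF", "bold": True}


def correct_word(words, index, new_text):
    """Replace only the text; start/end times are left untouched."""
    words[index].text = new_text


def ass_time(ms):
    """Format milliseconds as H:MM:SS.cc (ASS timestamps use centiseconds)."""
    cs = ms // 10
    h, rem = divmod(cs, 360000)
    m, rem = divmod(rem, 6000)
    s, cs = divmod(rem, 100)
    return f"{h}:{m:02d}:{s:02d}.{cs:02d}"


def ass_color(hex_rgb):
    """Convert 'RRGGBB' to the ASS colour form &HBBGGRR&."""
    r, g, b = hex_rgb[0:2], hex_rgb[2:4], hex_rgb[4:6]
    return f"&H{b}{g}{r}&".upper()


def word_to_dialogue(word):
    """Build one ASS Dialogue line for a word, with inline override tags."""
    tags = ""
    if word.style.get("bold"):
        tags += r"\b1"
    if "size" in word.style:
        tags += rf"\fs{word.style['size']}"
    if "color" in word.style:
        tags += rf"\c{ass_color(word.style['color'])}"
    override = "{" + tags + "}" if tags else ""
    return (
        f"Dialogue: 0,{ass_time(word.start_ms)},{ass_time(word.end_ms)},"
        f"Default,,0,0,0,,{override}{word.text}"
    )


# Example: the transcriber misheard the first word; the user fixes and styles it.
words = [Word("helo", 0, 480), Word("world", 520, 1010)]
correct_word(words, 0, "hello")
words[0].style = {"color": "0000FF", "bold": True}
lines = [word_to_dialogue(w) for w in words]
```

In a Django app, the `Word` records could be stored per upload and edited through a form, with the subtitle lines regenerated on each save. Turning the audio plus subtitle file into a video would be a separate step with a tool such as ffmpeg.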
I have the overall flow outlined but am still working out how exactly to implement it. I have tried the Web Speech API and the Web Audio API, but neither gives exactly what I've described, or I can't find it explicitly in the docs. I also can't find anything in the TikTok API, and I had thought I'd find it in the CapCut API.
If JS is necessary, I can use that too. If you need more context, my DMs and comments are open.