Idea for Flux inference using AWS


Post Flair: discussion

Post Body

My idea is to launch a GPU EC2 instance (NVIDIA) from a CUDA-enabled AMI, deploy Stable Diffusion with Flux Dev on it, and expose an API on that instance that is reachable only from inside the VPC. I would then create a Lambda function, fronted by a public API Gateway endpoint, that does the following:

Starts the EC2 instance (the one built from the CUDA-enabled AMI).
Waits for the instance to reach the running state using a waiter.
Calls the API endpoint on the EC2 instance over the VPC.
Stops the EC2 instance after receiving the API response.
Returns the instance's API response to the caller through the public API Gateway endpoint.
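
To make the steps above concrete, here is a minimal sketch of what the Lambda handler might look like with boto3. It is only an illustration under several assumptions that are not part of the plan itself: the instance ID comes from a hypothetical `GPU_INSTANCE_ID` environment variable, the instance serves a hypothetical JSON endpoint at port 8000 under `/generate`, and the Lambda function is attached to the same VPC. Because the "running" state does not guarantee that the inference server has finished loading, the sketch retries the HTTP call for a while.

```python
import json
import os
import time
import urllib.request


def call_inference_api(private_ip, payload, port=8000, path="/generate",
                       attempts=30, delay_s=10):
    """POST the payload to the instance's private API, retrying while it boots.

    The port and path are hypothetical; use whatever your server exposes.
    """
    url = f"http://{private_ip}:{port}{path}"
    body = json.dumps(payload).encode("utf-8")
    last_error = None
    for _ in range(attempts):
        try:
            req = urllib.request.Request(
                url, data=body, headers={"Content-Type": "application/json"})
            with urllib.request.urlopen(req, timeout=300) as resp:
                return json.loads(resp.read())
        except OSError as err:  # URLError and socket errors subclass OSError
            last_error = err
            time.sleep(delay_s)
    raise RuntimeError(f"inference API never became reachable: {last_error}")


def run_inference(ec2, instance_id, payload, call_api=call_inference_api):
    """Start the instance, wait, call its API, and always stop it afterwards."""
    ec2.start_instances(InstanceIds=[instance_id])
    try:
        # Blocks until EC2 reports the instance as "running".
        ec2.get_waiter("instance_running").wait(InstanceIds=[instance_id])
        desc = ec2.describe_instances(InstanceIds=[instance_id])
        private_ip = desc["Reservations"][0]["Instances"][0]["PrivateIpAddress"]
        return call_api(private_ip, payload)
    finally:
        # Stop the GPU instance even if the API call failed, to avoid idle cost.
        ec2.stop_instances(InstanceIds=[instance_id])


def lambda_handler(event, context):
    import boto3  # imported here so the helpers above can be tested without AWS

    ec2 = boto3.client("ec2")
    payload = json.loads(event.get("body") or "{}")
    # GPU_INSTANCE_ID is an assumed environment variable name.
    result = run_inference(ec2, os.environ["GPU_INSTANCE_ID"], payload)
    return {"statusCode": 200, "body": json.dumps(result)}
```

Two things to keep in mind with a design like this: a Lambda function attached to a VPC needs a route to the EC2 API (for example a NAT gateway or a VPC interface endpoint) to make the start and stop calls, and the whole start, wait, infer, stop cycle has to fit inside the timeouts of both Lambda and API Gateway, so an asynchronous variant may turn out to be necessary.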

I understand that each request will be slow because of the time it takes to start the instance, but the main goal is to cut costs by running the expensive GPU EC2 instance only when it is needed.
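
As a rough way to reason about that trade-off, the sketch below compares an always-on instance with a start-per-request setup. Every number passed in is a placeholder, not a real AWS price, and the model is deliberately simplified: it assumes per-second billing with a 60-second minimum per start and ignores storage, which is still billed while the instance is stopped.

```python
def monthly_cost_per_request_mode(requests_per_month, seconds_per_request,
                                  hourly_rate_usd, min_billed_seconds=60):
    """Compute cost when the instance only runs for the duration of each request.

    seconds_per_request should include boot time, model load time, and inference.
    """
    billed_seconds = max(seconds_per_request, min_billed_seconds)
    return requests_per_month * billed_seconds * hourly_rate_usd / 3600


def monthly_cost_always_on(hourly_rate_usd, hours_per_month=730):
    """Compute cost when the instance runs all month."""
    return hourly_rate_usd * hours_per_month


def break_even_requests(seconds_per_request, hourly_rate_usd):
    """Requests per month above which always-on becomes the cheaper option."""
    per_request = monthly_cost_per_request_mode(
        1, seconds_per_request, hourly_rate_usd)
    return monthly_cost_always_on(hourly_rate_usd) / per_request


if __name__ == "__main__":
    rate = 1.0  # placeholder hourly rate in USD, not an actual AWS price
    print(monthly_cost_per_request_mode(1000, 300, rate))
    print(monthly_cost_always_on(rate))
    print(break_even_requests(300, rate))
```

Plugging in the real hourly rate for the chosen instance type and a measured start-to-stop time per request gives a quick estimate of how many requests per month the on-demand approach can handle before an always-on instance is cheaper.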

Has anyone else tried this? Is it actually feasible?

Author


TheSoundOfMusak

Subreddit

r/aws

Post Details


Posted: 2 months ago
External URL: reddit.com/r/aws/comment...