Idea for Flux inference using AWS
My idea is to create a GPU EC2 instance (NVIDIA GPU, CUDA-enabled AMI), deploy Stable Diffusion with Flux Dev on it, and expose an API on the instance that is reachable only from inside the VPC. I would then create a Lambda function, fronted by a public API Gateway endpoint, that will:
- Start the EC2 instance.
- Wait for the instance to reach the running state using a waiter.
- Call the API endpoint on the EC2 instance over the VPC.
- Stop the EC2 instance after receiving the API response.
- Return the API response from the instance to the public API Gateway endpoint.
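To make the steps above concrete, here is a rough sketch of what the Lambda might look like with boto3. It is only an illustration under several assumptions: the `INSTANCE_ID` environment variable, port `8000`, the `/generate` path, and the retry counts are all hypothetical, and the Lambda is assumed to be attached to the same VPC with a route to the EC2 API (NAT or a VPC endpoint). The boto3 calls themselves (`start_instances`, the `instance_running` waiter, `describe_instances`, `stop_instances`) are standard EC2 client methods.

```python
import json
import os
import time
import urllib.request


def post_json(url, payload, timeout=30):
    """POST a JSON payload and return the decoded JSON response."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def run_inference(ec2, instance_id, payload, call_api=post_json,
                  port=8000, path="/generate", retries=30, delay=10,
                  sleep=time.sleep):
    """Start the instance, call its private API, and always stop it afterwards.

    port, path, retries and delay are hypothetical defaults for this sketch.
    """
    ec2.start_instances(InstanceIds=[instance_id])
    try:
        # "running" only means the VM is up, not that the model server is ready,
        # so the API call below is retried until it answers.
        ec2.get_waiter("instance_running").wait(InstanceIds=[instance_id])
        desc = ec2.describe_instances(InstanceIds=[instance_id])
        ip = desc["Reservations"][0]["Instances"][0]["PrivateIpAddress"]
        url = f"http://{ip}:{port}{path}"
        last_error = None
        for _ in range(retries):
            try:
                return call_api(url, payload)
            except OSError as exc:  # connection refused, timeouts, URLError
                last_error = exc
                sleep(delay)
        raise RuntimeError(f"API at {url} never became reachable") from last_error
    finally:
        # Stop the GPU instance even if the call failed, to avoid a runaway bill.
        ec2.stop_instances(InstanceIds=[instance_id])


def lambda_handler(event, context):
    import boto3  # imported lazily; available in the AWS Lambda Python runtime

    ec2 = boto3.client("ec2")
    payload = json.loads(event.get("body") or "{}")
    result = run_inference(ec2, os.environ["INSTANCE_ID"], payload)
    return {"statusCode": 200, "body": json.dumps(result)}
```

Two limits to keep in mind with a synchronous design like this: a Lambda invocation can run for at most 15 minutes, and API Gateway's default integration timeout is 29 seconds, which a cold GPU instance boot plus model load will likely exceed. An asynchronous variant (return a job ID, poll for the result) may fit better, but that goes beyond this sketch.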
I understand that the inference time will be long because of the time it takes to start the instance, but the main purpose is to cut down costs and only have the expensive GPU EC2 instance running when needed.
Has anyone else tried this? Is it actually feasible?
Posted: 2 months ago
External URL: reddit.com/r/aws/comment...