Necessary disclaimer: I am a novice in ML with probably just enough knowledge to be dangerous. I am starting a project that uses ML to identify the vehicle in sales images, with the ultimate goal of segmenting the foreground from the background.
Only a single instance should be segmented, even if other vehicles appear in the background.
My questions start with collecting images for training. I already found a dataset with a uniform input image size. I want to add more images to the training/testing data, but I have been told that I cannot use images of different sizes, or that images have to be resized for uniformity. That raises even more questions.
If images with different aspect ratios are resized to the standard input dimensions, they will be stretched or compressed by different amounts to fit. Will this make the model less accurate? Are there model architectures that allow a range of input dimensions, and others that do not?
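To make the stretching concern concrete, here is a minimal sketch in plain Python of how much a naive resize distorts an image. The function name `stretch_factor` and the example sizes (640x480 target, 1200x900 and 1500x1000 sources) are made up for illustration; they are not from any particular dataset or library.

```python
def stretch_factor(src_w, src_h, dst_w, dst_h):
    """Return the horizontal scale divided by the vertical scale.

    1.0 means the aspect ratio is preserved; values below 1.0 mean
    objects come out narrower relative to their height, values above
    1.0 mean they come out wider.
    """
    scale_x = dst_w / src_w
    scale_y = dst_h / src_h
    return scale_x / scale_y


# Hypothetical 4:3 source resized to a 4:3 target: no distortion.
print(stretch_factor(1200, 900, 640, 480))

# Hypothetical 3:2 source forced into the same 4:3 target:
# the vehicle would appear narrower than it really is.
print(stretch_factor(1500, 1000, 640, 480))
```

A factor close to 1.0 is probably harmless, while a factor far from 1.0 means the model sees vehicles with proportions it would not see in undistorted images.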
Suppose I train with a standardized image size at, say, a 4:3 aspect ratio, and then use the model to predict on a 3:2 image. Should I assume the result will be less accurate? Does it make sense to train a separate model for each of the most popular aspect ratios?
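One alternative to stretching that I have seen mentioned is "letterboxing": scale the image uniformly so it fits inside the target, then pad the leftover space. Below is a rough sketch of just the geometry, under my own assumptions; `letterbox_geometry` is a hypothetical helper, not a library function, and the 640x480 target is only an example.

```python
def letterbox_geometry(src_w, src_h, dst_w, dst_h):
    """Compute a uniform scale plus padding that fits src inside dst.

    Returns the scale, the resized (width, height), and the padding
    as (left, top, right, bottom) in pixels.
    """
    # Use the smaller scale so neither dimension overflows the target.
    scale = min(dst_w / src_w, dst_h / src_h)
    new_w = min(dst_w, round(src_w * scale))
    new_h = min(dst_h, round(src_h * scale))

    # Split the leftover space as evenly as possible on both sides.
    pad_x = dst_w - new_w
    pad_y = dst_h - new_h
    left, top = pad_x // 2, pad_y // 2
    return {
        "scale": scale,
        "size": (new_w, new_h),
        "padding": (left, top, pad_x - left, pad_y - top),
    }


# Hypothetical 3:2 photo fitted into a 4:3 model input.
geometry = letterbox_geometry(1500, 1000, 640, 480)
print(geometry["size"], geometry["padding"])
```

With this approach a single model input size could accept several aspect ratios without distorting the vehicle, at the cost of some padded pixels.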
I found that TensorFlow has a function that will crop or pad an image to the target dimensions. Is this a common practice that I should consider for my workflow? Will images submitted for prediction need to go through the same preprocessing to keep predictions accurate?
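To check my understanding of what crop-or-pad does, here is a plain-Python sketch of the general idea (centered crop when the image is too large, centered padding when it is too small). This is my own illustration on a list-of-rows image, not TensorFlow's implementation; `center_crop_or_pad` and the `fill` parameter are hypothetical names.

```python
def center_crop_or_pad(image, target_h, target_w, fill=0):
    """Crop and/or pad a 2D image (a list of rows) around its center."""
    src_h = len(image)
    src_w = len(image[0]) if src_h else 0

    # Crop step: drop equal amounts from both sides when too large.
    crop_top = max((src_h - target_h) // 2, 0)
    crop_left = max((src_w - target_w) // 2, 0)
    rows = [
        row[crop_left:crop_left + target_w]
        for row in image[crop_top:crop_top + target_h]
    ]

    # Pad step: add fill pixels on both sides when too small.
    cur_h = len(rows)
    cur_w = len(rows[0]) if rows else 0
    pad_top = (target_h - cur_h) // 2
    pad_left = (target_w - cur_w) // 2
    pad_right = target_w - cur_w - pad_left

    out = [[fill] * target_w for _ in range(pad_top)]
    out += [[fill] * pad_left + row + [fill] * pad_right for row in rows]
    out += [[fill] * target_w for _ in range(target_h - cur_h - pad_top)]
    return out


# A 2x4 image forced into 4x2: columns are cropped, rows are padded.
tiny = [[1, 2, 3, 4], [5, 6, 7, 8]]
for row in center_crop_or_pad(tiny, 4, 2):
    print(row)
```

Whatever preprocessing is chosen, my assumption is that the same function would be applied both to the training images and to images submitted for prediction, so the model always sees inputs prepared the same way.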
I am trying to gain enough of an understanding to start planning the model and the workflow of the application.
Am I asking the right questions?
Is there any other information that would be helpful?