Hi everyone, I don't know if this is the correct subreddit for this question; if it isn't, please let me know where I should post.
I am working on a regression task where I have to predict the price of diamonds from a tabular dataset. I know that highly correlated features need to be removed, and I noticed that four features, namely "carat", "x", "y" and "z", have correlations above 0.9. The problem is the performance of the model. If I keep all four variables, I get a mean squared error of 702188; if I keep only one of them, I get 3107246. I am using PoissonRegressor(alpha=0.001, max_iter=5000) because the target variable has a mean equal to its standard deviation. I really don't understand why my model behaves this way. Do you have any idea? Thanks for any help.
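A minimal sketch of this kind of comparison, assuming scikit-learn and NumPy. The data here is a synthetic stand-in, not the real diamonds dataset: the ranges, noise levels, and the price formula are made-up assumptions, and `heldout_mse` is a hypothetical helper. It fits the same PoissonRegressor once on all four features and once on "carat" alone, and scores both on the same held-out split so the two MSE values are comparable.

```python
import numpy as np
from sklearn.linear_model import PoissonRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 2000

# Synthetic stand-in (assumption): x, y, z are dimensions,
# carat is roughly proportional to their product.
x = rng.uniform(4.0, 8.0, n)
y = x * rng.normal(1.0, 0.02, n)
z = 0.62 * x * rng.normal(1.0, 0.03, n)
carat = 0.0061 * x * y * z * rng.normal(1.0, 0.02, n)

# Made-up price formula (assumption), with Poisson noise so the target is non-negative.
price = rng.poisson(3000.0 * carat ** 1.5).astype(float)

features = {"carat": carat, "x": x, "y": y, "z": z}

# Pairwise correlations between the four features.
corr = np.corrcoef(np.vstack(list(features.values())))


def heldout_mse(columns):
    """Fit on a train split using the given columns, return MSE on the test split."""
    X = np.column_stack([features[c] for c in columns])
    X_train, X_test, y_train, y_test = train_test_split(
        X, price, test_size=0.25, random_state=0
    )
    model = PoissonRegressor(alpha=0.001, max_iter=5000)
    model.fit(X_train, y_train)
    return mean_squared_error(y_test, model.predict(X_test))


mse_all = heldout_mse(["carat", "x", "y", "z"])
mse_carat = heldout_mse(["carat"])

print("min pairwise correlation:", corr.min())
print("MSE, all four features:", mse_all)
print("MSE, carat only:", mse_carat)
```

Fixing `random_state` in the split means both models are evaluated on exactly the same rows, so any gap between the two numbers comes from the feature set rather than from the split.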