Critic Network is failing to predict rewards
https://preview.redd.it/y7o4wybc5ezd1.png?width=2058&format=png&auto=webp&s=66421482f30fdad610078cf6e12499ced4e87810

I am training an actor-critic model, but it is not learning the task effectively. I noticed that the critic loss is not decreasing during training, so I plotted the true rewards against the critic's outputs to check the critic network's performance. As you can see in the plot, it is not learning anything at all. I tried a vanilla LSTM and also a model with a custom LSTM block with a residual connection and a feed-forward network, but both behave the same way.
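One thing that may be worth ruling out in a comparison like this: if the critic is trained as a state-value function, its target is usually the discounted return from each step rather than the per-step reward, so a plot against raw rewards can look like a mismatch even when the critic fits its target. Below is a minimal, dependency-free sketch of comparing critic outputs against returns. The helper names `discounted_returns` and `explained_variance`, the discount factor, and the toy numbers are assumptions for illustration, not taken from the code behind this post.

```python
from typing import List, Sequence


def discounted_returns(rewards: Sequence[float], gamma: float = 0.99) -> List[float]:
    """Compute G_t = r_t + gamma * G_{t+1} for one finished episode."""
    returns = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each step can reuse the return of the next step.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


def explained_variance(predictions: Sequence[float], targets: Sequence[float]) -> float:
    """Return 1 - Var(targets - predictions) / Var(targets).

    Values near 1 mean the predictions track the targets; values near 0
    or below mean they do no better than a constant output.
    """
    n = len(targets)
    mean_t = sum(targets) / n
    var_t = sum((y - mean_t) ** 2 for y in targets) / n
    if var_t == 0.0:
        return float("nan")  # undefined when the targets never vary
    residuals = [y - p for y, p in zip(targets, predictions)]
    mean_r = sum(residuals) / n
    var_r = sum((r - mean_r) ** 2 for r in residuals) / n
    return 1.0 - var_r / var_t


if __name__ == "__main__":
    rewards = [0.0, 0.0, 1.0]          # toy episode (assumed data)
    targets = discounted_returns(rewards, gamma=0.5)
    critic_outputs = [0.0, 0.0, 0.0]   # stand-in for a critic stuck at a constant
    print(targets)
    print(explained_variance(critic_outputs, targets))
```

An explained variance that stays near zero over training would support the reading that the critic is not fitting its targets, whereas a value that rises toward 1 would suggest the plot is comparing against the wrong quantity.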

I am using shared layers for both the actor and critic heads and a single optimizer for training. What could be the problem here?
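For reference, a setup of this kind can be sketched as follows. This is not the actual code from this post: it assumes PyTorch, a discrete action space, and a hypothetical class name `SharedActorCritic`; the `value_coef` and `entropy_coef` weights are commonly used defaults, not values from the post. With a shared trunk and one optimizer, the policy and value gradients both flow through the same LSTM, so the relative weighting of the two losses is one of the things worth inspecting.

```python
import torch
import torch.nn as nn


class SharedActorCritic(nn.Module):
    """Shared LSTM trunk feeding separate actor and critic heads (illustrative)."""

    def __init__(self, obs_dim: int, hidden_dim: int, n_actions: int):
        super().__init__()
        self.lstm = nn.LSTM(obs_dim, hidden_dim, batch_first=True)  # shared layers
        self.actor_head = nn.Linear(hidden_dim, n_actions)
        self.critic_head = nn.Linear(hidden_dim, 1)

    def forward(self, obs_seq: torch.Tensor):
        features, _ = self.lstm(obs_seq)                  # (batch, time, hidden)
        logits = self.actor_head(features)                # (batch, time, n_actions)
        values = self.critic_head(features).squeeze(-1)   # (batch, time)
        return logits, values


def actor_critic_loss(logits, values, actions, returns,
                      value_coef: float = 0.5, entropy_coef: float = 0.01):
    """Combine policy, value, and entropy terms into one scalar loss."""
    dist = torch.distributions.Categorical(logits=logits)
    # Detach so the policy term does not push gradients into the critic head.
    advantages = (returns - values).detach()
    policy_loss = -(dist.log_prob(actions) * advantages).mean()
    value_loss = (returns - values).pow(2).mean()
    entropy = dist.entropy().mean()
    total = policy_loss + value_coef * value_loss - entropy_coef * entropy
    return total, policy_loss, value_loss


if __name__ == "__main__":
    torch.manual_seed(0)
    model = SharedActorCritic(obs_dim=4, hidden_dim=16, n_actions=3)
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)  # single optimizer

    obs = torch.randn(2, 5, 4)               # random stand-in data
    actions = torch.randint(0, 3, (2, 5))
    returns = torch.randn(2, 5)

    logits, values = model(obs)
    total, policy_loss, value_loss = actor_critic_loss(logits, values, actions, returns)
    optimizer.zero_grad()
    total.backward()
    optimizer.step()
    # Logging the two terms separately shows whether one dominates the other.
    print(policy_loss.item(), value_loss.item())
```

Returning the policy and value terms separately, rather than only their sum, makes it easier to see whether the value loss is actually moving or is being swamped by the policy gradient in the shared layers.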

Posted
2 months ago