It seems wait is enough to let the model complete the task

I recently ran several RL experiments using this environment and found that my policy model tends to output the wait action repeatedly. In the current setup, a single wait action advances the environment by 10 steps, and once the agent performs more than 10 consecutive waits, the environment automatically treats the task as completed, even though the agent has done nothing else.
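To make the behavior concrete, here is a minimal sketch of a wait-only baseline check. `ToyWaitEnv` is a hypothetical stand-in that only mimics the rule described above (it is not the real environment), and the `(obs, reward, done, info)` return shape, the `"success"` key, and the reward of 1.0 on completion are assumptions for illustration.

```python
class ToyWaitEnv:
    """Hypothetical stand-in that mimics the behavior described above."""

    WAIT_STEPS = 10             # one wait advances the simulation by 10 steps
    MAX_CONSECUTIVE_WAITS = 10  # more than this many waits in a row ends the task

    def __init__(self):
        self.reset()

    def reset(self):
        self.sim_step = 0
        self.consecutive_waits = 0
        return {"sim_step": self.sim_step}

    def step(self, action):
        if action == "wait":
            self.sim_step += self.WAIT_STEPS
            self.consecutive_waits += 1
        else:
            self.sim_step += 1
            self.consecutive_waits = 0
        # Assumed rule: the task counts as completed after more than 10 waits in a row.
        done = self.consecutive_waits > self.MAX_CONSECUTIVE_WAITS
        reward = 1.0 if done else 0.0  # assumed reward, for illustration only
        return {"sim_step": self.sim_step}, reward, done, {"success": done}


def wait_only_rollout(env, max_actions=50):
    """Send only wait actions; return (number of actions sent, reported success)."""
    env.reset()
    for n in range(1, max_actions + 1):
        _, _, done, info = env.step("wait")
        if done:
            return n, bool(info.get("success", False))
    return max_actions, False
```

If a rollout like this reports success on the real environment, the policy is likely exploiting the completion rule rather than solving the task.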

I’m trying to understand how to address this issue. How can I prevent the model from overusing the wait action or adjust the environment so this doesn’t prematurely end the task?
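One possible direction, sketched under assumptions: wrap the environment so that each wait costs a small penalty and a long run of waits ends the episode as a failure before the built-in completion rule can fire. `WaitGuard`, `StubEnv`, the `"success"` and `"ended_by_wait_limit"` info keys, and the penalty value are all hypothetical names and numbers, not part of this repository.

```python
class StubEnv:
    """Hypothetical minimal env: reports completion after more than 10 consecutive waits."""

    def reset(self):
        self.waits = 0
        return None

    def step(self, action):
        self.waits = self.waits + 1 if action == "wait" else 0
        done = self.waits > 10
        return None, 0.0, done, {"success": done}


class WaitGuard:
    """Wrapper that penalizes waits and cuts off long wait streaks as failures."""

    def __init__(self, env, wait_action="wait", max_consecutive_waits=10, wait_penalty=0.05):
        self.env = env
        self.wait_action = wait_action
        self.max_consecutive_waits = max_consecutive_waits
        self.wait_penalty = wait_penalty
        self.consecutive_waits = 0

    def reset(self):
        self.consecutive_waits = 0
        return self.env.reset()

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        info = dict(info)  # copy so the wrapped env's dict is not mutated
        if action == self.wait_action:
            self.consecutive_waits += 1
            reward -= self.wait_penalty  # small cost for every wait
        else:
            self.consecutive_waits = 0
        if self.consecutive_waits >= self.max_consecutive_waits:
            # End the episode one wait before the env's own rule would mark it complete.
            done = True
            info["success"] = False
            info["ended_by_wait_limit"] = True
        return obs, reward, done, info
```

Usage would look like `env = WaitGuard(StubEnv())` followed by the usual `reset()` and `step()` loop. The penalty size and streak limit would need tuning, since some tasks may legitimately require waiting.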

It seems wait is enough to let the model complete the task #79
