Target and Local Brains Simultaneous Update #17

@davera-017

Description

In the Agent class, the local and target brains are updated at the same frequency. For DQN, for example, this is undesired behavior, since the local brain should be trained more often than the target brain is updated.

class Agent(BaseAgent):
    ...
    def update(self, experience):
        self.memory_buffer.add(experience)
        if self.update_step % self.update_every == 0:
            try:
                batch = self.memory_buffer.sample()
            except RuntimeError:
                # If the batch can't be obtained, skip the update proc
                return
            pred, target = self.estimator(batch)
            self.local_brain.update(batch, pred, target)
            self.target_brain.update_from(self.local_brain)
        self.update_step += 1
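One way to decouple the two frequencies would be to track them with separate moduli. Below is a minimal sketch, assuming a hypothetical target_update_every parameter (not in the current API) and using plain counters in place of the actual brain/buffer calls to illustrate the resulting ratio:

```python
class Agent:
    """Sketch: train the local brain every `update_every` steps, but
    sync the target brain only every `target_update_every` steps."""

    def __init__(self, update_every=4, target_update_every=100):
        self.update_every = update_every                # local-brain training frequency
        self.target_update_every = target_update_every  # target-brain sync frequency (hypothetical)
        self.update_step = 0
        self.local_updates = 0   # stands in for self.local_brain.update(...)
        self.target_updates = 0  # stands in for self.target_brain.update_from(...)

    def update(self, experience):
        self.update_step += 1
        if self.update_step % self.update_every == 0:
            # ... sample batch, compute pred/target, then train the local brain
            self.local_updates += 1
        if self.update_step % self.target_update_every == 0:
            # only now copy the local brain's weights into the target brain
            self.target_updates += 1


agent = Agent(update_every=4, target_update_every=100)
for _ in range(400):
    agent.update(None)
# 400 steps -> 100 local-brain updates, but only 4 target-brain syncs
```

With update_every=4 and target_update_every=100, the target brain lags the local brain by a 25:1 ratio, which is the kind of separation DQN expects.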

Also, in a similar direction, the Train class forcibly updates the brains via agent.update() at every step. I think it would be a good idea to add a train_freq parameter so that one gains control over the experiencing/training ratio.
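The proposed train_freq could look something like the sketch below. The DummyEnv/DummyAgent classes and the observe/update split are assumptions made for illustration, not the current Train API:

```python
class DummyEnv:
    """Placeholder environment that just emits a transition object."""
    def step(self):
        return object()


class DummyAgent:
    """Placeholder agent that counts stored experiences vs. training calls."""
    def __init__(self):
        self.stored = 0
        self.trained = 0

    def observe(self, experience):
        self.stored += 1   # experience is always added to the buffer

    def update(self):
        self.trained += 1  # training only happens when the loop allows it


def train(env, agent, num_steps, train_freq=1):
    """Collect experience every step, but call agent.update() only
    every `train_freq` steps (train_freq=1 reproduces current behavior)."""
    for step in range(1, num_steps + 1):
        agent.observe(env.step())
        if step % train_freq == 0:
            agent.update()


env, agent = DummyEnv(), DummyAgent()
train(env, agent, num_steps=100, train_freq=4)
# 100 experiences stored, 25 training calls
```

Keeping train_freq=1 as the default would preserve the current behavior while letting users tune the experiencing/training ratio.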
