RuntimeError: max(): Expected reduction dim to be specified for input.numel() == 0. Specify the reduction dim with the 'dim' argument. #446

@Fulin-Gao

Running the calc-x training example crashes with the error below: `torch.max(valid_adv)` in `compute_data_metrics` is called on an empty tensor. Do you have any suggestions on how to address this?

ERROR Algorithm bundle crashed; signaling stop event client_server.py:155
Traceback (most recent call last):
  File "/usr/local/lib/python3.12/dist-packages/agentlightning/execution/client_server.py", line 144, in _execute_algorithm
    await algorithm(wrapper_store, stop_evt)
  File "/usr/local/lib/python3.12/dist-packages/agentlightning/trainer/trainer.py", line 527, in _algorithm_bundle
    algorithm.run(
  File "/usr/local/lib/python3.12/dist-packages/agentlightning/algorithm/verl/interface.py", line 184, in run
    run_ppo(
  File "/usr/local/lib/python3.12/dist-packages/agentlightning/verl/entrypoint.py", line 78, in run_ppo
    ray.get(
  File "/usr/local/lib/python3.12/dist-packages/ray/_private/auto_init_hook.py", line 22, in auto_init_wrapper
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/ray/_private/client_mode_hook.py", line 104, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/ray/_private/worker.py", line 2967, in get
    values, debugger_breakpoint = worker.get_objects(
                                  ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/ray/_private/worker.py", line 1015, in get_objects
    raise value.as_instanceof_cause()
ray.exceptions.RayTaskError(RuntimeError): ray::TaskRunner.run() (pid=3508839, ip=10.155.71.188, actor_id=d76b31cf33b73c5bf275b36607000000, repr=<agentlightning.verl.entrypoint.TaskRunner object at 0x7f805a9d5b50>)
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/agentlightning/verl/entrypoint.py", line 244, in run
    trainer.fit()
  File "/usr/local/lib/python3.12/dist-packages/agentlightning/verl/trainer.py", line 507, in fit
    metrics = self._train_step(batch_dict)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/agentlightning/verl/trainer.py", line 369, in _train_step
    metrics.update(compute_data_metrics(batch=batch, use_critic=self.use_critic, suffix="_before_processing"))
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/agentlightning/verl/trainer.py", line 125, in compute_data_metrics
    "critic/advantages/max" + suffix: torch.max(valid_adv).detach().item(),
                                      ^^^^^^^^^^^^^^^^^^^^
RuntimeError: max(): Expected reduction dim to be specified for input.numel() == 0. Specify the reduction dim with the 'dim' argument.
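
For reference, the final frame can be reproduced in isolation: `torch.max` without a `dim` argument raises this `RuntimeError` whenever its input has zero elements. The sketch below is only an illustration of the symptom plus one possible guard. It does not explain why `valid_adv` ends up empty in the calc-x run; an all-false mask is just one way to get there. The helper `safe_max`, the `nan` default, and the sample tensors are hypothetical and are not part of agentlightning or verl.

```python
import torch


def safe_max(values: torch.Tensor, default: float = float("nan")) -> float:
    """Return the max of `values`, or `default` when the tensor is empty.

    Hypothetical helper for illustration; not an agentlightning API.
    """
    if values.numel() == 0:
        return default
    return torch.max(values).detach().item()


# Hypothetical stand-ins for the tensors behind `valid_adv`.
advantages = torch.tensor([[0.5, -1.0, 2.0]])
response_mask = torch.zeros_like(advantages, dtype=torch.bool)  # no valid tokens

# Selecting with an all-false mask yields a tensor with numel() == 0.
valid_adv = torch.masked_select(advantages, response_mask)

raised = False
try:
    torch.max(valid_adv)  # same call shape as the failing line in the traceback
except RuntimeError:
    raised = True

# The guarded version falls back to the default instead of crashing.
guarded = safe_max(valid_adv)
```

A guard like this only hides the crash in the metrics code. If the batch really contains no valid response tokens, the underlying question is why the rollouts produced an empty batch in the first place.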
