Description
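Has anyone run into a GPU out-of-memory problem when running eval.py?
My GPU is an RTX 4090. The memory usage I monitored while running eval.py is shown in the screenshot below.
The error output is: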
Error executing job with overrides: []
Traceback (most recent call last):
File "eval.py", line 276, in main
info.update(evaluate())
File "/home/itlab/isaacsim/isaac_sim-2022.2.0/extscache/omni.pip.torch-1_13_0-0.1.4+104.1.lx64/torch-1-13-0/torch/autograd/grad_mode.py", line 27, in decorate_context
return func(*args, **kwargs)
File "eval.py", line 235, in evaluate
return_contiguous=False
File "/home/itlab/ybl/SimpleFlight/third_party/tensordict/tensordict/tensordict.py", line 6677, in clone
*[td.clone() for td in self.tensordicts],
File "/home/itlab/ybl/SimpleFlight/third_party/tensordict/tensordict/tensordict.py", line 6677, in <listcomp>
*[td.clone() for td in self.tensordicts],
File "/home/itlab/ybl/SimpleFlight/third_party/tensordict/tensordict/tensordict.py", line 4448, in clone
source={key: _clone_value(value, recurse) for key, value in self.items()},
File "/home/itlab/ybl/SimpleFlight/third_party/tensordict/tensordict/tensordict.py", line 4448, in <dictcomp>
source={key: _clone_value(value, recurse) for key, value in self.items()},
File "/home/itlab/ybl/SimpleFlight/third_party/tensordict/tensordict/tensordict.py", line 8580, in _clone_value
return value.clone()
File "/home/itlab/ybl/SimpleFlight/third_party/tensordict/tensordict/tensordict.py", line 4448, in clone
source={key: _clone_value(value, recurse) for key, value in self.items()},
File "/home/itlab/ybl/SimpleFlight/third_party/tensordict/tensordict/tensordict.py", line 4448, in <dictcomp>
source={key: _clone_value(value, recurse) for key, value in self.items()},
File "/home/itlab/ybl/SimpleFlight/third_party/tensordict/tensordict/tensordict.py", line 8580, in _clone_value
return value.clone()
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 20.00 MiB (GPU 0; 23.52 GiB total capacity; 16.07 GiB already allocated; 19.62 MiB free; 16.08 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.
Exception ignored in: <function _make_registry.<locals>._Registry.__del__ at 0x7f552b0e8710>
Traceback (most recent call last):
File "/home/itlab/isaacsim/isaac_sim-2022.2.0/kit/extscore/omni.kit.viewport.registry/omni/kit/viewport/registry/registry.py", line 103, in __del__
File "/home/itlab/isaacsim/isaac_sim-2022.2.0/kit/extscore/omni.kit.viewport.registry/omni/kit/viewport/registry/registry.py", line 98, in destroy
TypeError: 'NoneType' object is not callable
Exception ignored in: <function _make_registry.<locals>._Registry.__del__ at 0x7f552b0e8710>
Traceback (most recent call last):
File "/home/itlab/isaacsim/isaac_sim-2022.2.0/kit/extscore/omni.kit.viewport.registry/omni/kit/viewport/registry/registry.py", line 103, in __del__
File "/home/itlab/isaacsim/isaac_sim-2022.2.0/kit/extscore/omni.kit.viewport.registry/omni/kit/viewport/registry/registry.py", line 98, in destroy
TypeError: 'NoneType' object is not callable
Exception ignored in: <function SettingChangeSubscription.__del__ at 0x7f587690f710>
Traceback (most recent call last):
File "/home/itlab/isaacsim/isaac_sim-2022.2.0/kit/kernel/py/omni/kit/app/_impl/__init__.py", line 114, in __del__
AttributeError: 'NoneType' object has no attribute 'get_settings'
Exception ignored in: <function RegisteredActions.__del__ at 0x7f48510eac20>
Traceback (most recent call last):
File "/home/itlab/isaacsim/isaac_sim-2022.2.0/extscache/omni.kit.viewport.menubar.lighting-104.0.8/omni/kit/viewport/menubar/lighting/actions.py", line 345, in __del__
File "/home/itlab/isaacsim/isaac_sim-2022.2.0/extscache/omni.kit.viewport.menubar.lighting-104.0.8/omni/kit/viewport/menubar/lighting/actions.py", line 350, in destroy
TypeError: 'NoneType' object is not callable
2025-11-20 08:04:40 [583,484ms] [Error] [omni.usd] TF_PYTHON_EXCEPTION: in TfPyConvertPythonExceptionToTfErrors at line 114 of /buildAgent/work/ca6c508eae419cf8/USD/pxr/base/tf/pyError.cpp -- Tf Python Exception
2025-11-20 08:04:40 [583,484ms] [Error] [omni.usd] TF_PYTHON_EXCEPTION: in TfPyConvertPythonExceptionToTfErrors at line 114 of /buildAgent/work/ca6c508eae419cf8/USD/pxr/base/tf/pyError.cpp -- Tf Python Exception
2025-11-20 08:04:40 [583,484ms] [Error] [omni.usd] TF_PYTHON_EXCEPTION: in TfPyConvertPythonExceptionToTfErrors at line 114 of /buildAgent/work/ca6c508eae419cf8/USD/pxr/base/tf/pyError.cpp -- Tf Python Exception
2025-11-20 08:04:40 [583,484ms] [Error] [omni.usd] TF_PYTHON_EXCEPTION: in TfPyConvertPythonExceptionToTfErrors at line 114 of /buildAgent/work/ca6c508eae419cf8/USD/pxr/base/tf/pyError.cpp -- Tf Python Exception
2025-11-20 08:04:40 [583,484ms] [Error] [omni.usd] TF_PYTHON_EXCEPTION: in TfPyConvertPythonExceptionToTfErrors at line 114 of /buildAgent/work/ca6c508eae419cf8/USD/pxr/base/tf/pyError.cpp -- Tf Python Exception
2025-11-20 08:04:40 [583,485ms] [Warning] [omni.usd] Warning: in operator() at line 95 of /buildAgent/work/ca6c508eae419cf8/USD/pxr/base/tf/pyFunction.h -- Tried to call a method on an expired python instance
2025-11-20 08:04:40 [583,486ms] [Error] [omni.usd] TF_PYTHON_EXCEPTION: in TfPyConvertPythonExceptionToTfErrors at line 114 of /buildAgent/work/ca6c508eae419cf8/USD/pxr/base/tf/pyError.cpp -- Tf Python Exception
2025-11-20 08:04:40 [583,486ms] [Error] [omni.usd] TF_PYTHON_EXCEPTION: in TfPyConvertPythonExceptionToTfErrors at line 114 of /buildAgent/work/ca6c508eae419cf8/USD/pxr/base/tf/pyError.cpp -- Tf Python Exception
2025-11-20 08:04:40 [583,486ms] [Error] [omni.usd] TF_PYTHON_EXCEPTION: in TfPyConvertPythonExceptionToTfErrors at line 114 of /buildAgent/work/ca6c508eae419cf8/USD/pxr/base/tf/pyError.cpp -- Tf Python Exception
2025-11-20 08:04:40 [583,486ms] [Error] [omni.usd] TF_PYTHON_EXCEPTION: in TfPyConvertPythonExceptionToTfErrors at line 114 of /buildAgent/work/ca6c508eae419cf8/USD/pxr/base/tf/pyError.cpp -- Tf Python Exception
2025-11-20 08:04:40 [583,486ms] [Error] [omni.usd] TF_PYTHON_EXCEPTION: in TfPyConvertPythonExceptionToTfErrors at line 114 of /buildAgent/work/ca6c508eae419cf8/USD/pxr/base/tf/pyError.cpp -- Tf Python Exception
2025-11-20 08:04:40 [583,486ms] [Warning] [omni.usd] Warning: in operator() at line 95 of /buildAgent/work/ca6c508eae419cf8/USD/pxr/base/tf/pyFunction.h -- Tried to call a method on an expired python instance
2025-11-20 08:04:40 [583,487ms] [Error] [omni.usd] TF_PYTHON_EXCEPTION: in TfPyConvertPythonExceptionToTfErrors at line 114 of /buildAgent/work/ca6c508eae419cf8/USD/pxr/base/tf/pyError.cpp -- Tf Python Exception
2025-11-20 08:04:40 [583,487ms] [Error] [omni.usd] TF_PYTHON_EXCEPTION: in TfPyConvertPythonExceptionToTfErrors at line 114 of /buildAgent/work/ca6c508eae419cf8/USD/pxr/base/tf/pyError.cpp -- Tf Python Exception
2025-11-20 08:04:40 [583,487ms] [Error] [omni.usd] TF_PYTHON_EXCEPTION: in TfPyConvertPythonExceptionToTfErrors at line 114 of /buildAgent/work/ca6c508eae419cf8/USD/pxr/base/tf/pyError.cpp -- Tf Python Exception
2025-11-20 08:04:40 [583,487ms] [Error] [omni.usd] TF_PYTHON_EXCEPTION: in TfPyConvertPythonExceptionToTfErrors at line 114 of /buildAgent/work/ca6c508eae419cf8/USD/pxr/base/tf/pyError.cpp -- Tf Python Exception
2025-11-20 08:04:40 [583,487ms] [Error] [omni.usd] TF_PYTHON_EXCEPTION: in TfPyConvertPythonExceptionToTfErrors at line 114 of /buildAgent/work/ca6c508eae419cf8/USD/pxr/base/tf/pyError.cpp -- Tf Python Exception
2025-11-20 08:04:40 [583,487ms] [Warning] [omni.usd] Warning: in operator() at line 95 of /buildAgent/work/ca6c508eae419cf8/USD/pxr/base/tf/pyFunction.h -- Tried to call a method on an expired python instance
2025-11-20 08:04:40 [583,487ms] [Error] [omni.usd] TF_PYTHON_EXCEPTION: in TfPyConvertPythonExceptionToTfErrors at line 114 of /buildAgent/work/ca6c508eae419cf8/USD/pxr/base/tf/pyError.cpp -- Tf Python Exception
2025-11-20 08:04:40 [583,487ms] [Error] [omni.usd] TF_PYTHON_EXCEPTION: in TfPyConvertPythonExceptionToTfErrors at line 114 of /buildAgent/work/ca6c508eae419cf8/USD/pxr/base/tf/pyError.cpp -- Tf Python Exception
2025-11-20 08:04:40 [583,487ms] [Error] [omni.usd] TF_PYTHON_EXCEPTION: in TfPyConvertPythonExceptionToTfErrors at line 114 of /buildAgent/work/ca6c508eae419cf8/USD/pxr/base/tf/pyError.cpp -- Tf Python Exception
2025-11-20 08:04:40 [583,487ms] [Error] [omni.usd] TF_PYTHON_EXCEPTION: in TfPyConvertPythonExceptionToTfErrors at line 114 of /buildAgent/work/ca6c508eae419cf8/USD/pxr/base/tf/pyError.cpp -- Tf Python Exception
2025-11-20 08:04:40 [583,487ms] [Error] [omni.usd] TF_PYTHON_EXCEPTION: in TfPyConvertPythonExceptionToTfErrors at line 114 of /buildAgent/work/ca6c508eae419cf8/USD/pxr/base/tf/pyError.cpp -- Tf Python Exception
2025-11-20 08:04:40 [583,487ms] [Warning] [omni.usd] Warning: in operator() at line 95 of /buildAgent/work/ca6c508eae419cf8/USD/pxr/base/tf/pyFunction.h -- Tried to call a method on an expired python instance
2025-11-20 08:04:40 [583,783ms] [Warning] [carb.audio.context] 1 contexts were leaked
Segmentation fault (core dumped)
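
As a rough reading of the numbers in the CUDA out-of-memory message above, the arithmetic can be sketched as follows. This is only an illustration: the regular expressions and the names `parse_oom`, `cached_unused`, and `outside_pytorch` are hypothetical helpers and are not part of eval.py or PyTorch.

```python
import re

# The OOM message from the traceback above, copied verbatim (trimmed to the figures).
msg = (
    "CUDA out of memory. Tried to allocate 20.00 MiB (GPU 0; 23.52 GiB total capacity; "
    "16.07 GiB already allocated; 19.62 MiB free; 16.08 GiB reserved in total by PyTorch)"
)

UNIT_TO_MIB = {"MiB": 1.0, "GiB": 1024.0}


def to_mib(value: str, unit: str) -> float:
    """Convert a '<number> <unit>' pair from the message into MiB."""
    return float(value) * UNIT_TO_MIB[unit]


def parse_oom(message: str) -> dict:
    """Extract the memory figures (in MiB) from a PyTorch 1.13-style OOM message."""
    patterns = {
        "requested": r"Tried to allocate ([\d.]+) (MiB|GiB)",
        "total": r"([\d.]+) (MiB|GiB) total capacity",
        "allocated": r"([\d.]+) (MiB|GiB) already allocated",
        "free": r"([\d.]+) (MiB|GiB) free",
        "reserved": r"([\d.]+) (MiB|GiB) reserved",
    }
    return {k: to_mib(*re.search(p, message).groups()) for k, p in patterns.items()}


stats = parse_oom(msg)

# Memory PyTorch has cached but is not using; a large value would hint at fragmentation.
cached_unused = stats["reserved"] - stats["allocated"]

# Memory on the card that is neither free nor reserved by this process's PyTorch allocator.
outside_pytorch = stats["total"] - stats["reserved"] - stats["free"]

print(f"cached but unused by PyTorch: {cached_unused:.0f} MiB")
print(f"held outside PyTorch's allocator: {outside_pytorch / 1024:.2f} GiB")
```

Read this way, reserved memory is only slightly above allocated memory, so the `max_split_size_mb` hint about fragmentation probably does not apply here. A sizeable share of the 23.52 GiB is also in use outside PyTorch's caching allocator. It is an assumption, not something the log confirms, that this share belongs to Isaac Sim itself or to other processes on the same GPU.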