Description
[2024-11-04 11:41:27,602] [INFO] [loss_scaler.py:190:update_scale] [deepspeed] OVERFLOW! Rank 0 Skipping step. Attempted loss scale: 8192, but hysteresis is 2. Reducing hysteresis to 1
[2024-11-04 11:41:27,623] [INFO] [loss_scaler.py:183:update_scale] [deepspeed] OVERFLOW! Rank 0 Skipping step. Attempted loss scale: 8192, reducing to 4096
[2024-11-04 11:43:36,061] [INFO] [loss_scaler.py:190:update_scale] [deepspeed] OVERFLOW! Rank 0 Skipping step. Attempted loss scale: 8192, but hysteresis is 2. Reducing hysteresis to 1
[2024-11-04 11:43:39,575] [INFO] [loss_scaler.py:183:update_scale] [deepspeed] OVERFLOW! Rank 0 Skipping step. Attempted loss scale: 8192, reducing to 4096
[2024-11-04 11:45:48,644] [INFO] [loss_scaler.py:190:update_scale] [deepspeed] OVERFLOW! Rank 0 Skipping step. Attempted loss scale: 8192, but hysteresis is 2. Reducing hysteresis to 1
[2024-11-04 11:45:51,164] [INFO] [loss_scaler.py:183:update_scale] [deepspeed] OVERFLOW! Rank 0 Skipping step. Attempted loss scale: 8192, reducing to 4096
[2024-11-04 11:48:02,638] [INFO] [loss_scaler.py:190:update_scale] [deepspeed] OVERFLOW! Rank 0 Skipping step. Attempted loss scale: 8192, but hysteresis is 2. Reducing hysteresis to 1
[2024-11-04 11:48:02,902] [INFO] [loss_scaler.py:183:update_scale] [deepspeed] OVERFLOW! Rank 0 Skipping step. Attempted loss scale: 8192, reducing to 4096
Why does this happen, and how can I fix it? My machine has 8x 3090 GPUs.
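For context, the repeating pattern in the log (hysteresis drops from 2 to 1, then the scale is halved from 8192 to 4096, and a couple of minutes later the same thing happens again at 8192) looks like ordinary fp16 dynamic loss scaling with hysteresis. The snippet below is a simplified sketch of that idea, not DeepSpeed's actual `loss_scaler.py` code. The class name `ToyLossScaler`, the helper `simulate`, and the parameter values (in particular `scale_window=4`) are assumptions chosen for illustration only.

```python
class ToyLossScaler:
    """Simplified dynamic loss scaler with hysteresis (illustrative only)."""

    def __init__(self, init_scale=8192.0, scale_factor=2.0,
                 scale_window=4, hysteresis=2, min_scale=1.0):
        self.scale = init_scale
        self.scale_factor = scale_factor
        self.scale_window = scale_window    # clean steps needed before growing the scale
        self.hysteresis = hysteresis        # overflows tolerated before shrinking the scale
        self.cur_hysteresis = hysteresis
        self.min_scale = min_scale
        self.good_steps = 0

    def update(self, overflow):
        """Update state after one training step; the step is skipped on overflow."""
        if overflow:
            self.good_steps = 0
            if self.cur_hysteresis > 1:
                # First overflow: only spend hysteresis, keep the scale.
                self.cur_hysteresis -= 1
            else:
                # Hysteresis exhausted: shrink the scale.
                self.scale = max(self.scale / self.scale_factor, self.min_scale)
            return
        self.good_steps += 1
        if self.good_steps % self.scale_window == 0:
            # Enough clean steps: grow the scale again and restore hysteresis.
            self.scale *= self.scale_factor
            self.cur_hysteresis = self.hysteresis


def simulate(events):
    """Return (scale, cur_hysteresis) after each event; True means overflow."""
    scaler = ToyLossScaler()
    trace = []
    for overflow in events:
        scaler.update(overflow)
        trace.append((scaler.scale, scaler.cur_hysteresis))
    return trace


if __name__ == "__main__":
    # Two overflows, four clean steps, then two overflows again.
    for state in simulate([True, True, False, False, False, False, True, True]):
        print(state)
```

In this toy model the scale settles at 4096, grows back to 8192 after a window of clean steps, and overflows again, which would produce the same pair of messages at regular intervals. If that is what is happening here, the occasional skipped step is the scaler probing for the largest usable scale rather than a failure in itself. A scale that keeps falling toward its minimum would be a different and more serious symptom.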