fix: use closed-form LR calculation to fix polynomial decay formula bug #67
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Summary
Problem
The
get_lr()method had a mathematically incorrect formula:Mathematical Proof
Let
d(l) = decay_factor(the correct ratio between consecutive steps)The code computes:
g(l) = (1 + d(l)) / 2For a decaying schedule where
0 < d(l) < 1:g(l) - d(l) = (1 - d(l)) / 2 > 0g(l) > d(l)alwaysResult: LR decays slower than the intended polynomial schedule.
Visual Impact
Solution
Use the existing
_get_closed_form_lr()which correctly computes:Additional Benefits
base_lrsinstead of accumulating incremental updatesFiles Changed
training/lr_scheduler.py- Replaceget_lr()implementation