Fix IndexError in scan_tag() and check_key() on empty input #913
Open
Calling peek(1) without checking buffer bounds causes IndexError when the buffer doesn't have enough characters available. This occurs with empty or very short input where peek(1) tries to access buffer[pointer+1] but only buffer[pointer] exists (the EOF marker '\0').

Fixed by adding bounds checking before peek() calls in:
- check_key(): Check if pointer+1 < len(buffer) before peek(1)
- check_value(): Check if pointer+1 < len(buffer) before peek(1)
- scan_tag(): Check buffer bounds before both peek(1) and peek() calls

When buffer bounds are insufficient, fall back to peek(), which returns the EOF marker '\0', ensuring proper handling of end-of-stream cases.

Fixes yaml#906
Author
Friendly ping - any chance someone could take a look at this when they get a chance? Happy to make any changes if needed.
Summary
Fixes #906 by adding buffer bounds checking before peek(1) calls in scanner methods.

When the input is an empty string or very short, calling peek(1) without validating buffer bounds causes IndexError because the buffer only contains the EOF marker (\0) at position 0, but peek(1) tries to access position 1.

Changes
Added bounds checking in three methods in lib/yaml/scanner.py:
- check_key(): Check if pointer + 1 < len(buffer) before peek(1)
- check_value(): Check if pointer + 1 < len(buffer) before peek(1)
- scan_tag(): Check buffer bounds before both peek(1) and peek() calls

When buffer bounds are insufficient, the code falls back to peek(), which safely returns the EOF marker \0.

Testing
Verified the fix with multiple test cases:
All existing tests pass:
- tests/test_dump_load.py: 3 tests passed
- tests/legacy_tests/test_errors.py: 189 tests passed

Manual Test Results
The fix ensures proper handling of edge cases while maintaining backward compatibility with all existing functionality.
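
For reviewers who want to see the guard in isolation, here is a minimal sketch of the pattern described under Changes. It is not the diff in this PR and not PyYAML's actual Reader: MiniReader and peek_next_or_eof are hypothetical names used only for illustration, and the toy buffer is assumed to hold the input followed by the EOF marker.

```python
class MiniReader:
    """Toy stand-in for a scanner's reader (hypothetical, not PyYAML's Reader)."""

    def __init__(self, text):
        # Assumption: the buffer is the input followed by the EOF marker '\0'.
        self.buffer = text + "\0"
        self.pointer = 0

    def peek(self, index=0):
        # Unchecked lookahead: raises IndexError past the end of the buffer.
        return self.buffer[self.pointer + index]

    def peek_next_or_eof(self):
        # Hypothetical helper mirroring the guard described in this PR:
        # only call peek(1) when pointer + 1 is inside the buffer,
        # otherwise fall back to peek(), which returns the EOF marker.
        if self.pointer + 1 < len(self.buffer):
            return self.peek(1)
        return self.peek()


if __name__ == "__main__":
    empty = MiniReader("")
    try:
        empty.peek(1)
    except IndexError:
        print("unchecked peek(1) raised IndexError on empty input")
    print(repr(empty.peek_next_or_eof()))
    print(repr(MiniReader("a:").peek_next_or_eof()))
```

On empty input the guarded helper returns the EOF marker instead of raising, while on longer input it behaves exactly like peek(1).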