version: v0.4.12
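According to the JSON specification (RFC 8259), a number may not end with a trailing dot: the fraction part is a decimal point followed by at least one digit, so `10.` is invalid while `10.0` is valid.
However, when I define a JSON grammar like the one below, the lexer behaves suspiciously.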
// RFC 8259 without complex STRING definition
?start: value
?value: object
      | array
      | STRING
      | NUMBER
      | "true" -> true
      | "false" -> false
      | "null" -> null
object: "{" [member ("," member)*] "}"
member: STRING ":" value
array : "[" [value ("," value)*] "]"
NUMBER: MINUS? INT FRAC? EXP?
MINUS: "-"
INT: "0" | ("1".."9") DIGIT*
DIGIT: "0".."9"
FRAC: "." DIGIT+
EXP: ("e"|"E") ["+"|"-"] DIGIT+
STRING: /\"[^"]*\"/
WS: /[ \t\f\r\n]/+
%ignore WS
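As a cross-check of what the `NUMBER` terminal is meant to accept, here is a minimal sketch that transcribes it by hand into a Python regular expression. The regex and the helper name `is_json_number` are my own assumptions for illustration and are not part of the library.

```python
import re

# Hand transcription of: NUMBER: MINUS? INT FRAC? EXP?
#   INT  = "0" | ("1".."9") DIGIT*
#   FRAC = "." DIGIT+
#   EXP  = ("e"|"E") ["+"|"-"] DIGIT+
JSON_NUMBER = re.compile(r"-?(?:0|[1-9][0-9]*)(?:\.[0-9]+)?(?:[eE][+-]?[0-9]+)?")


def is_json_number(text: str) -> bool:
    """Return True only if the whole string is a complete NUMBER token."""
    return JSON_NUMBER.fullmatch(text) is not None


print(is_json_number("10.0"))  # True
print(is_json_number("10."))   # False: FRAC needs at least one digit
```

So `10.` is never a complete `NUMBER`, but it is a valid prefix of one (for example of `10.0`).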
The observed behavior:
>>> grammar_engine._parse_partial_code(0, '{ "cap": 10.0', b'', accepted_generation=True)
(remainder : b'10.0', remainder_state: RemainderState.MAYBE_COMPLETE, accept_sequences: {accept_terminals: ['NUMBER', 'COMMA'], accept_terminals: ['NUMBER', 'WS', 'COMMA'], accept_terminals: ['LBRACE'], accept_terminals: ['WS'], accept_terminals: ['NULL'], accept_terminals: ['STRING'], accept_terminals: ['NUMBER', 'WS', 'RBRACE'], accept_terminals: ['NUMBER', 'RBRACE'], accept_terminals: ['TRUE'], accept_terminals: ['FALSE'], accept_terminals: ['LSQB']}, next_ac_indents: None, False)
# ↑ This looks correct.
>>> grammar_engine._parse_partial_code(0, '{ "cap": 10.', b'', accepted_generation=True)
(remainder : b'.', remainder_state: RemainderState.INCOMPLETE, accept_sequences: {accept_terminals: ['COMMA'], accept_terminals: ['WS'], accept_terminals: ['RBRACE']}, next_ac_indents: None, False)
# ↑ Shouldn't this remainder be b'10.'?
It seems that the lexer gets confused when its state moves from accepting (digits) to a live but non-accepting state (trailing dot) and then back to accepting (digits).
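To make the expected behavior concrete, here is a small sketch of how I would expect the unfinished tail of the input to be classified. It is a hand-written DFA for `NUMBER` only, not the library's implementation; the state names, the `classify_tail` helper, and the string labels (borrowed from the `RemainderState` names in the output above) are assumptions for illustration.

```python
DIGITS = "0123456789"
NONZERO = "123456789"

# States in which the text read so far is a complete NUMBER.
ACCEPTING = {"zero", "int", "frac", "exp"}


def step(state, ch):
    """Return the next DFA state for NUMBER, or None if no transition exists."""
    if state in ("start", "minus"):
        if state == "start" and ch == "-":
            return "minus"
        if ch == "0":
            return "zero"
        if ch in NONZERO:
            return "int"
    elif state in ("zero", "int"):
        if state == "int" and ch in DIGITS:
            return "int"
        if ch == ".":
            return "dot"   # live but non-accepting: FRAC needs a digit
        if ch in "eE":
            return "e"
    elif state == "dot":
        if ch in DIGITS:
            return "frac"
    elif state == "frac":
        if ch in DIGITS:
            return "frac"
        if ch in "eE":
            return "e"
    elif state == "e":
        if ch in "+-":
            return "esign"
        if ch in DIGITS:
            return "exp"
    elif state in ("esign", "exp"):
        if ch in DIGITS:
            return "exp"
    return None


def classify_tail(tail):
    """Classify an unfinished token; the remainder is always the whole tail."""
    state = "start"
    for ch in tail:
        state = step(state, ch)
        if state is None:
            return tail, "DEAD"
    return tail, ("MAYBE_COMPLETE" if state in ACCEPTING else "INCOMPLETE")


print(classify_tail("10.0"))  # ('10.0', 'MAYBE_COMPLETE')
print(classify_tail("10."))   # ('10.', 'INCOMPLETE')
```

The point of the sketch is that passing through a non-accepting live state should not split the token: the remainder for `10.` stays `10.` rather than shrinking to `.`.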