Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
15 changes: 9 additions & 6 deletions blark/iec.lark
Original file line number Diff line number Diff line change
Expand Up @@ -33,7 +33,9 @@ _library_element_declaration: data_type_declaration
| ";"

// B.1.1
IDENTIFIER: /[A-Za-z_][A-Za-z0-9_]*/i
// Identifiers need to ignore certain keywords, but NOT typenames, as those may be used
// 'as values' with SIZEOF for example
IDENTIFIER: /\b(?!(ABSTRACT|ACTION|AND|AND_THEN|AT|BY|CASE|CONTINUE|DO|ELSE|ELSIF|END_ACTION|END_CASE|END_FOR|END_FUNCTION|END_FUNCTIONBLOCK|END_FUNCTION_BLOCK|END_IF|END_INTERFACE|END_METHOD|END_PROGRAM|END_PROPERTY|END_REPEAT|END_STRUCT|END_TYPE|END_UNION|END_VAR|END_WHILE|EXIT|EXTENDS|FINAL|FOR|FUNCTION|FUNCTIONBLOCK|FUNCTION_BLOCK|IF|IMPLEMENTS|INTERFACE|INTERNAL|JMP|METHOD|MOD|NOT|OF|OR|OR_ELSE|PERSISTENT|PRIVATE|PROGRAM|PROPERTY|PROTECTED|PUBLIC|READ_ONLY|READ_WRITE|REFERENCE|REPEAT|RETURN|STRUCT|THEN|TO|TYPE|UNION|UNTIL|VAR|VAR_ACCESS|VAR_EXTERNAL|VAR_GLOBAL|VAR_INPUT|VAR_INST|VAR_IN_OUT|VAR_OUTPUT|VAR_STAT|VAR_TEMP|WHILE|XOR)\b)[A-Za-z_][A-Za-z0-9_]*\b/i
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made a note in #108 - I think there may be some overlap here. Should this perhaps also include TRUE/FALSE?

I'm sure there's some overhead for treating identifiers like this, but this may be the best way we have with lark.
It's possible adjusting priorities of those reserved words (as suggested in #108) could help address this in a different way, though it would certainly touch many more lines.

Copy link
Contributor Author

@DenSinH DenSinH Jul 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, type names and TRUE / FALSE should not be excluded, as they are in fact identifiers (and parsed as such in type declarations / expressions). The reason I added this is because (in combination with the access_specifier change for properties I believe) it would parse properties with PRIVATE, PROTECTED or PUBLIC as the name, which is of course wrong. Including all keywords made it so declarations were not parsed properly anymore, and neither were expressions like SIZEOF(UINT)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do believe that TRUE and FALSE should be included here. With the current code from this PR, blark accepts this as a valid input:

FUNCTION Test : BOOL
VAR_INPUT
    TRUE : BOOL;
END_VAR
END_FUNCTION

Given the same snippet, TwinCAT complains "Unexpected Token 'TRUE' found", which is more in line with what I expect.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And it did not accept that as valid input before? I think I only made the IDENTIFIER parsing more strict...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also: if TRUE / FALSE are excluded from identifiers, SIZEOF(TRUE) will not be parsed by blark (I have not tried this but I assume this will happen). Basically, anything that can be used as an argument to SIZEOF should be parsed as an identifier, or SIZEOF should be handled separately in the grammar.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@DenSinH No no, it did accept this as valid input before, sorry for being unclear. I'm just making an argument in favor of including TRUE and FALSE in the exclusion list for IDENTIFIER while you're at it.

If I add TRUE and FALSE to the exclusion list, then my example from above is rejected as invalid. However, SIZEOF(TRUE) still parses fine because the argument to SIZEOF is an expression, and TRUE or FALSE are valid expressions. So I don't see a problem with that.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you are correct in suggesting that SIZEOF should be handled separately in the grammar, though. It currently seems to be handled in a generic way – that is, the same as any other function call. Which means that if SIZEOF(BOOL) is currently accepted as valid code, the same would likely go for nonsense like INT_TO_UINT(BOOL). But if that's the case, it should probably be handled in a separate issue.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True, whether this kind of stuff should happen in a parser is up for debate though, at least to what extent. I guess to do it completely correctly, there would have to be a type_name rule, which parses (you guessed it) built-in type names, or an identifier. Then SIZEOF(...) would only accept type names. That is, currently declarations are parsed as:

simple_specification: elementary_type_name
                    | simple_type_name
                    | DOTTED_IDENTIFIER

this DOTTED_IDENTIFIER would have to be type_name, which in turn also accepts DOTTED_IDENTIFIER and so on.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good discussion. I think some more thought will be needed to get to a proper solution - a bit more than I have the time for at the moment.

In the meantime, would you mind submitting your priority-boosting PR @alajovic?


// B.1.2
constant: time_literal
Expand Down Expand Up @@ -346,7 +348,7 @@ _array_initial_element: expression
| enumerated_value
| array_initialization

structure_type_declaration: structure_type_name_declaration [ extends ] ":" [ indirection_type ] _STRUCT ( structure_element_declaration ";"+ )* _END_STRUCT
structure_type_declaration: structure_type_name_declaration [ extends ] ":" [ indirection_type ] _STRUCT ";"* ( structure_element_declaration ";"+ )* _END_STRUCT

initialized_structure: structure_type_name ":=" structure_initialization

Expand Down Expand Up @@ -446,7 +448,7 @@ fb_decl: fb_decl_name_list ":" function_block_type_name [ ":=" structure_initial

fb_decl_name_list: fb_name ( "," fb_name )*

var_body: ( var_init_decl ";"+ )*
var_body: ";"* ( var_init_decl ";"+ )*
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For this, then, the following would be valid?

VAR;
END_VAR

We'll need to add a simple test like that to the suite for future-proofing.

Copy link
Owner

@klauer klauer Jul 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you mind doing a similar thing as in the above comment, disabling this and finding your real-world source code example of where the current grammar didn't suffice?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, what you sent should be valid, it is accepted by TwinCAT but not by the old version of the grammar.


array_var_declaration: var1_list ":" array_specification

Expand Down Expand Up @@ -570,7 +572,7 @@ ACCESS_SPECIFIER: _ABSTRACT
| _INTERNAL
| _FINAL
access_specifier: ACCESS_SPECIFIER+
extends: _EXTENDS DOTTED_IDENTIFIER
extends: _EXTENDS DOTTED_IDENTIFIER ("," DOTTED_IDENTIFIER)*
implements: _IMPLEMENTS DOTTED_IDENTIFIER ("," DOTTED_IDENTIFIER)*

function_block_type_declaration: FUNCTION_BLOCK [ access_specifier ] derived_function_block_name [ extends ] [ implements ] fb_var_declaration* [ function_block_body ] END_FUNCTION_BLOCK ";"*
Expand All @@ -586,6 +588,7 @@ END_FUNCTION_BLOCK: _END_FUNCTION_BLOCK
| input_output_declarations
| external_var_declarations
| var_declarations
| var_inst_declaration
| temp_var_decls
| static_var_declarations
| incomplete_located_var_declarations
Expand All @@ -608,7 +611,7 @@ function_block_method_declaration: _METHOD [ access_specifier ] DOTTED_IDENTIFIE

?property_return_type: _located_var_spec_init

function_block_property_declaration: _PROPERTY [ access_specifier ] DOTTED_IDENTIFIER [ ":" property_return_type ] ";"* property_var_declaration* [ function_block_body ] _END_PROPERTY ";"*
function_block_property_declaration: _PROPERTY [ access_specifier ] DOTTED_IDENTIFIER [ ":" property_return_type ] ";"* [ access_specifier ] property_var_declaration* [ function_block_body ] _END_PROPERTY ";"*
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PROPERTY support in blark has always been a bit weak - thanks for this one. It would be great if there were Codesys/TwinCAT documentation references to back this up, but I assume you just found this out through trial and error?

We'll need to add a test case for it along the lines of this:

param("function_block_property_declaration", tf.multiline_code_block(
"""
PROPERTY PRIVATE PropertyName : RETURNTYPE
END_PROPERTY
"""
)),

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll be honest to say that I am no PLC engineer myself strictly, it is just something that is in our codebase (which compiles, so it should be valid). Sometimes with things like this it kind of seems like TwinCAT is just very lenient with it or something? Makes me wonder, but yeah it should be valid as it builds for us, a test like that would be good.

I just messed around a bit to undo this change and check where it went wrong, the source code for the failing propery is this:

{attribute 'monitoring' := 'call'}
PROPERTY p_fActValue : LREAL
PROTECTED 
VAR
END_VAR
END_PROPERTY

(i.e. the 'new' situation is with an access specifier after the type. I think perhaps I should have put it before the semicolons though...

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand - and can relate. "Compiles and is valid in TwinCAT" is exactly what blark aims to parse, so this is a good change.

I'll add your example as a test case, unless you wouldn't mind adding it here in this PR.
For a basic smoke test, it pretty much just needs to go in blark/tests/source/. The only catch is that it needs to be a full POU, so throwing some arbitrary FUNCTION_BLOCK around it should suffice.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like this behavior may be intended: the second access specifier is that of the setter or getter specifically. So the first access specifier (in the property) is the "default" access specifier for the getter and the setter. If the first access specifier is PRIVATE and the second is PUBLIC, then the resulting access specifier is PUBLIC. This can actually be done for the getter and the setter separately, and will raise no exception.


// B.1.5.3
?program_type_name: IDENTIFIER
Expand Down Expand Up @@ -646,7 +649,7 @@ program_access_decl: access_name ":" symbolic_variable ":" non_generic_type_name
| external_var_declarations
| var_declarations

interface_declaration: _INTERFACE IDENTIFIER [ extends ] interface_var_declaration* _END_INTERFACE ";"*
interface_declaration: _INTERFACE IDENTIFIER [ extends ] interface_var_declaration* _END_INTERFACE? ";"*

// B.2.1, B.3.1
LOGICAL_OR: _OR
Expand Down
4 changes: 2 additions & 2 deletions blark/plain.py
Original file line number Diff line number Diff line change
Expand Up @@ -8,7 +8,7 @@
from .input import (BlarkCompositeSourceItem, BlarkSourceItem,
register_input_handler)
from .output import OutputBlock, register_output_handler
from .util import AnyPath, SourceType, find_pou_type_and_identifier
from .util import AnyPath, SourceType, find_pou_type_and_identifier_plain


@dataclasses.dataclass
Expand Down Expand Up @@ -40,7 +40,7 @@ def load(
with open(filename, "rt") as fp:
contents = fp.read()

source_type, identifier = find_pou_type_and_identifier(contents)
source_type, identifier = find_pou_type_and_identifier_plain(contents)
# if source_type is None:
# return []
source_type = SourceType.general
Expand Down
18 changes: 10 additions & 8 deletions blark/solution.py
Original file line number Diff line number Diff line change
Expand Up @@ -351,9 +351,7 @@ def from_xml(
) -> Self:
declaration = get_child_located_text(xml, "Declaration", filename=filename)
if declaration is not None:
source_type, identifier = util.find_pou_type_and_identifier(
declaration.value
)
source_type, identifier = util.find_pou_type_and_identifier_xml(xml)
else:
source_type, identifier = None, None

Expand All @@ -366,7 +364,7 @@ def from_xml(
implementation=get_child_located_text(
xml, "Implementation/ST", filename=filename
),
metadata=xml.attrib,
metadata=dict(xml.attrib),
source_type=source_type,
filename=filename,
)
Expand Down Expand Up @@ -948,7 +946,11 @@ def to_blark(self) -> list[Union[BlarkCompositeSourceItem, BlarkSourceItem]]:
property_ident = Identifier.from_string(base_decl.identifier)

parts = []
for get_or_set, obj in (("get", self.get), ("set", self.set)):
get_set = [
("get", SourceType.property_get, self.get),
("set", SourceType.property_set, self.set),
]
for get_or_set, source_type, obj in get_set:
if obj is None:
continue

Expand All @@ -961,7 +963,7 @@ def to_blark(self) -> list[Union[BlarkCompositeSourceItem, BlarkSourceItem]]:
parts=[*property_ident.parts, get_or_set],
decl_impl="declaration",
).to_string(),
type=SourceType.property,
type=source_type,
lines=base_decl.lines + decl.lines,
grammar_rule=SourceType.property.get_grammar_rule(),
implicit_end="END_PROPERTY",
Expand All @@ -977,7 +979,7 @@ def to_blark(self) -> list[Union[BlarkCompositeSourceItem, BlarkSourceItem]]:
parts=[*property_ident.parts, get_or_set],
decl_impl="implementation",
).to_string(),
type=SourceType.property,
type=source_type,
lines=impl.lines,
grammar_rule=SourceType.statement_list.get_grammar_rule(),
implicit_end="",
Expand Down Expand Up @@ -1020,7 +1022,7 @@ def from_xml(
xml: lxml.etree.Element,
parent: TcSource,
) -> Self:
return cls(metadata=xml.attrib, xml=xml, parent=parent)
return cls(metadata=dict(xml.attrib), xml=xml, parent=parent)


@dataclasses.dataclass
Expand Down
Loading
Loading