Skip to content

Conversation

@DenSinH
Copy link
Contributor

@DenSinH DenSinH commented Jul 3, 2025

Hi there, I have been enthousiastically using your package to implement some static checks on our internal TwinCAT library. Some fixes include grammar fixes, like TwinCAT somehow allowing you to define interfaces without an END_INTERFACE statement, or properties as

PROPERTY myProperty : LREAL PUBLIC

(i.e. different positioning of the access specifier)

But also supporting multiple extensions, and some fixes for property handling. There may be more fixes coming as I find errors, but I figured I would open a PR for what we have found so far.

@klauer
Copy link
Owner

klauer commented Jul 3, 2025

Thanks so much for giving blark a try and contributing back your fixes!

It'll take me a few days to get around to reviewing and testing as blark is a bit of a fun side project for me, but I'll try to make some time soon.

Copy link
Owner

@klauer klauer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is looking great overall! A few test suite additions would be greatly appreciated if you have the time for them. If not, I can address them on my own.

Would you please merge with master? I've fixed some stale CI config and approved your workflow.

I'd also like to take a bit of time to run your grammar changes through some source code I have locally before going ahead with merging.

?property_return_type: _located_var_spec_init

function_block_property_declaration: _PROPERTY [ access_specifier ] DOTTED_IDENTIFIER [ ":" property_return_type ] ";"* property_var_declaration* [ function_block_body ] _END_PROPERTY ";"*
function_block_property_declaration: _PROPERTY [ access_specifier ] DOTTED_IDENTIFIER [ ":" property_return_type ] ";"* [ access_specifier ] property_var_declaration* [ function_block_body ] _END_PROPERTY ";"*
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PROPERTY support in blark has always been a bit weak - thanks for this one. It would be great if there were Codesys/TwinCAT documentation references to back this up, but I assume you just found this out through trial and error?

We'll need to add a test case for it along the lines of this:

param("function_block_property_declaration", tf.multiline_code_block(
"""
PROPERTY PRIVATE PropertyName : RETURNTYPE
END_PROPERTY
"""
)),

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll be honest to say that I am no PLC engineer myself strictly, it is just something that is in our codebase (which compiles, so it should be valid). Sometimes with things like this it kind of seems like TwinCAT is just very lenient with it or something? Makes me wonder, but yeah it should be valid as it builds for us, a test like that would be good.

I just messed around a bit to undo this change and check where it went wrong, the source code for the failing propery is this:

{attribute 'monitoring' := 'call'}
PROPERTY p_fActValue : LREAL
PROTECTED 
VAR
END_VAR
END_PROPERTY

(i.e. the 'new' situation is with an access specifier after the type. I think perhaps I should have put it before the semicolons though...

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand - and can relate. "Compiles and is valid in TwinCAT" is exactly what blark aims to parse, so this is a good change.

I'll add your example as a test case, unless you wouldn't mind adding it here in this PR.
For a basic smoke test, it pretty much just needs to go in blark/tests/source/. The only catch is that it needs to be a full POU, so throwing some arbitrary FUNCTION_BLOCK around it should suffice.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like this behavior may be intended: the second access specifier is that of the setter or getter specifically. So the first access specifier (in the property) is the "default" access specifier for the getter and the setter. If the first access specifier is PRIVATE and the second is PUBLIC, then the resulting access specifier is PUBLIC. This can actually be done for the getter and the setter separately, and will raise no exception.

fb_decl_name_list: fb_name ( "," fb_name )*

var_body: ( var_init_decl ";"+ )*
var_body: ";"* ( var_init_decl ";"+ )*
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For this, then, the following would be valid?

VAR;
END_VAR

We'll need to add a simple test like that to the suite for future-proofing.

Copy link
Owner

@klauer klauer Jul 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you mind doing a similar thing as in the above comment, disabling this and finding your real-world source code example of where the current grammar didn't suffice?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, what you sent should be valid, it is accepted by TwinCAT but not by the old version of the grammar.

IDENTIFIER: /[A-Za-z_][A-Za-z0-9_]*/i
// Identifiers need to ignore certain keywords, but NOT typenames, as those may be used
// 'as values' with SIZEOF for example
IDENTIFIER: /\b(?!(ABSTRACT|ACTION|AND|AND_THEN|AT|BY|CASE|CONTINUE|DO|ELSE|ELSIF|END_ACTION|END_CASE|END_FOR|END_FUNCTION|END_FUNCTIONBLOCK|END_FUNCTION_BLOCK|END_IF|END_INTERFACE|END_METHOD|END_PROGRAM|END_PROPERTY|END_REPEAT|END_STRUCT|END_TYPE|END_UNION|END_VAR|END_WHILE|EXIT|EXTENDS|FINAL|FOR|FUNCTION|FUNCTIONBLOCK|FUNCTION_BLOCK|IF|IMPLEMENTS|INTERFACE|INTERNAL|JMP|METHOD|MOD|NOT|OF|OR|OR_ELSE|PERSISTENT|PRIVATE|PROGRAM|PROPERTY|PROTECTED|PUBLIC|READ_ONLY|READ_WRITE|REFERENCE|REPEAT|RETURN|STRUCT|THEN|TO|TYPE|UNION|UNTIL|VAR|VAR_ACCESS|VAR_EXTERNAL|VAR_GLOBAL|VAR_INPUT|VAR_INST|VAR_IN_OUT|VAR_OUTPUT|VAR_STAT|VAR_TEMP|WHILE|XOR)\b)[A-Za-z_][A-Za-z0-9_]*\b/i
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made a note in #108 - I think there may be some overlap here. Should this perhaps also include TRUE/FALSE?

I'm sure there's some overhead for treating identifiers like this, but this may be the best way we have with lark.
It's possible adjusting priorities of those reserved words (as suggested in #108) could help address this in a different way, though it would certainly touch many more lines.

Copy link
Contributor Author

@DenSinH DenSinH Jul 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, type names and TRUE / FALSE should not be excluded, as they are in fact identifiers (and parsed as such in type declarations / expressions). The reason I added this is because (in combination with the access_specifier change for properties I believe) it would parse properties with PRIVATE, PROTECTED or PUBLIC as the name, which is of course wrong. Including all keywords made it so declarations were not parsed properly anymore, and neither were expressions like SIZEOF(UINT)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do believe that TRUE and FALSE should be included here. With the current code from this PR, blark accepts this as a valid input:

FUNCTION Test : BOOL
VAR_INPUT
    TRUE : BOOL;
END_VAR
END_FUNCTION

Given the same snippet, TwinCAT complains "Unexpected Token 'TRUE' found", which is more in line with what I expect.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And it did not accept that as valid input before? I think I only made the IDENTIFIER parsing more strict...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also: if TRUE / FALSE are excluded from identifiers, SIZEOF(TRUE) will not be parsed by blark (I have not tried this but I assume this will happen). Basically, anything that can be used as an argument to SIZEOF should be parsed as an identifier, or SIZEOF should be handled separately in the grammar.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@DenSinH No no, it did accept this as valid input before, sorry for being unclear. I'm just making an argument in favor of including TRUE and FALSE in the exclusion list for IDENTIFIER while you're at it.

If I add TRUE and FALSE to the exclusion list, then my example from above is rejected as invalid. However, SIZEOF(TRUE) still parses fine because the argument to SIZEOF is an expression, and TRUE or FALSE are valid expressions. So I don't see a problem with that.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you are correct in suggesting that SIZEOF should be handled separately in the grammar, though. It currently seems to be handled in a generic way – that is, the same as any other function call. Which means that if SIZEOF(BOOL) is currently accepted as valid code, the same would likely go for nonsense like INT_TO_UINT(BOOL). But if that's the case, it should probably be handled in a separate issue.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True, whether this kind of stuff should happen in a parser is up for debate though, at least to what extent. I guess to do it completely correctly, there would have to be a type_name rule, which parses (you guessed it) built-in type names, or an identifier. Then SIZEOF(...) would only accept type names. That is, currently declarations are parsed as:

simple_specification: elementary_type_name
                    | simple_type_name
                    | DOTTED_IDENTIFIER

this DOTTED_IDENTIFIER would have to be type_name, which in turn also accepts DOTTED_IDENTIFIER and so on.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good discussion. I think some more thought will be needed to get to a proper solution - a bit more than I have the time for at the moment.

In the meantime, would you mind submitting your priority-boosting PR @alajovic?

return Property(
name=name,
access=access,
access=access or access_,
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the latter access_ gets priority. Dumping the code again would result in access being dropped, if somehow both were specified. I think that's probably OK.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My guess is it would be invalid (and rejected) by twincat if you specify both, I could try

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I (somewhat embarrassingly) don't have access to TwinCAT right now to do the testing. If you'd be willing, I'd appreciate the additional testing to see what it accepts vs fails on. Like my comment above, if TwinCAT parses it, blark should as well.

Copy link
Contributor Author

@DenSinH DenSinH Jul 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The way I implemented it is backwards, actually the second access specifier should be dominant but isn't in the current implementation. Though of course, based my comment above, these should actually be specific to the getter and the setter

@klauer
Copy link
Owner

klauer commented Jul 11, 2025

I did a few fixes and merged with master in the pr111 branch; hopefully it should be passing tests now.

If you could fast forward to that before you do any additional changes that'd be great. 👍

@DenSinH
Copy link
Contributor Author

DenSinH commented Jul 11, 2025

Rebased onto that branch, so I should be up to date now

klauer added a commit that referenced this pull request Aug 22, 2025
@klauer klauer merged commit 1e11cc1 into klauer:master Aug 22, 2025
23 of 38 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants