Skip to content

Comments

Fix stuck light transitioning flag after task cancellation#672

Merged
TheJulianJES merged 5 commits intozigpy:devfrom
TheJulianJES:tjj/light_restart_mode_automation_cancelled_error
Feb 25, 2026
Merged

Fix stuck light transitioning flag after task cancellation#672
TheJulianJES merged 5 commits intozigpy:devfrom
TheJulianJES:tjj/light_restart_mode_automation_cancelled_error

Conversation

@TheJulianJES
Copy link
Contributor

@TheJulianJES TheJulianJES commented Feb 22, 2026

The Git diff is horrible to look at, but it's just try/finally being added in those parts and one small guard (59611bf).
EDIT: Git diff is slightly improved by introducing another method where the actual implementation is moved to, with async_turn_on now calling that with but wrapped with try / raise to release the flag at the end.

AI summary

When a mode: restart automation fires while a light turn-on or turn-off is in progress, Home Assistant cancels the running asyncio task. If that cancellation lands between async_transition_set_flag() and async_transition_start_timer(), the _transitioning_individual flag is left True with no timer to clear it. The light then ignores all incoming attribute reports indefinitely.

  • Wrap the body of async_turn_on and async_turn_off in try/finally blocks. The finally clause calls async_transition_complete() when the flag is stuck (set but no timer running), covering task cancellation and unexpected exceptions.
    • EDIT: With the last commit, async_turn_on was essentially renamed and the new async_turn_on now just calls that method with try / raise, so we don't have to indent the entire turn on method. This reduces diff a bit and might make the intention more clear of what we're doing. I haven't done that for async_turn_off because it's a much shorter method. I can obviously revert this to just indent everything, also do it for turn off, or just leave as is.
  • As a side effect, a pre-existing bug is also fixed: when the second move_to_level call in the new_color_provided_while_off path failed, the code returned without calling async_transition_complete() or starting the transition timer — leaving the flag stuck.
    • An explicit guard is now added there, consistent with all other early-return error paths (59611bf). We should consider if we even need these at all.

@codecov
Copy link

codecov bot commented Feb 22, 2026

Codecov Report

❌ Patch coverage is 89.28571% with 3 lines in your changes missing coverage. Please review.
✅ Project coverage is 97.46%. Comparing base (aa03919) to head (34e38c4).
⚠️ Report is 8 commits behind head on dev.

Files with missing lines Patch % Lines
zha/application/platforms/light/__init__.py 89.28% 3 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##              dev     #672      +/-   ##
==========================================
+ Coverage   97.44%   97.46%   +0.01%     
==========================================
  Files          62       62              
  Lines       10731    10887     +156     
==========================================
+ Hits        10457    10611     +154     
- Misses        274      276       +2     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Comment on lines 500 to 506
if result[1] is not Status.SUCCESS:
# 'On' call failed, but as brightness may still transition
# (for FORCE_ON lights), we start the timer to unset the flag after
# the transition_time if necessary.
self.async_transition_start_timer(transition_time)
self.debug("turned on: %s", t_log)
return
Copy link
Contributor Author

@TheJulianJES TheJulianJES Feb 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The missing coverage is because all of these lines were indented. They don't have any coverage.

I'm also not sure if they're really needed, since we should just raise anyway if the device doesn't respond, right? I forgot about what the zigpy request can return though. I think we also raise for unsupported commands..? If so, we should be able to remove all of these in a future PR.

@TheJulianJES
Copy link
Contributor Author

Diff for turn_on is somewhat reduced with 34e38c4 by moving the existing implementation to another method and just keeping the try / finally in async_turn_on.

For async_turn_off, the existing method is kept, just adding try / finally at start/end, indenting everything in between.

This is somewhat of a crucial fix, so it might be nice to get this in.

@@ -527,6 +563,11 @@ async def async_turn_on(
)
t_log["move_to_level_if_color"] = result
if result[1] is not Status.SUCCESS:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm thinking of having zigpy soon handle all ZCL default responses directly, throwing an error for any non-SUCCESS codes. Is it possible for a light to return a non-successful result?

Copy link
Contributor Author

@TheJulianJES TheJulianJES Feb 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that's the same question I asked in the comment above. I hoped you would have an answer 😄

I don't think we ever hit these paths..? Before I first touched the light class like 6(?) years ago, this code already existed. Maybe we can just remove soon?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, the existing comments were collapsed in the /changes view...

Yeah, let's remove them and see what breaks.

Copy link
Contributor

@puddly puddly Feb 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, the existing comments were collapsed in the /changes view...

I can't tell you how much I hate that new feature 😅

I'll put up another PR to remove them. We don't have to merge that for this release yet.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are quite a few of these

Oh, yeah... The isinstance(result, Exception) check can most definitely go as well. Returning an exception instead of raising seems interesting 😄

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Definitely. We got rid of most of them but the root cause seems to be asyncio.gather(..., return_exceptions=True): https://github.com/search?q=repo%3Azigpy%2Fzha%20return_exceptions%3DTrue&type=code

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, ok. That's at least always in places where it kind of makes sense.

Copy link
Contributor

@puddly puddly left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. Thanks!

@TheJulianJES TheJulianJES merged commit 1b6b19f into zigpy:dev Feb 25, 2026
8 of 9 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants