@knopers8 knopers8 commented Aug 6, 2025

This allows us to stop a run on EPNs at the GO_ERROR transition by adding a corresponding ODC.EnsureStop hook.
Because GO_ERROR can occur from any source state, the actual STOP call is made only if the ODC partition is in RUNNING.

At the same time, ODC partitions require us to call ODC.Stop if they voluntarily transition to ERROR.
In such a case, ODC.Stop allows the remaining healthy devices to finish processing.
By keeping the original ODC.Stop behaviour, we preserve this functionality.
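
To illustrate the guard described above, here is a minimal sketch, not the actual plugin code. The `Partition` interface, the `PartitionState` constants, and `fakePartition` are hypothetical names introduced only for this example; only the idea of an EnsureStop that calls STOP solely in RUNNING comes from this PR.

```go
package main

import "fmt"

// PartitionState is a hypothetical stand-in for the state an ODC partition reports.
type PartitionState string

const (
	StateReady   PartitionState = "READY"
	StateRunning PartitionState = "RUNNING"
	StateError   PartitionState = "ERROR"
)

// Partition is a hypothetical interface, not the real ODC plugin API.
type Partition interface {
	State() (PartitionState, error)
	Stop() error
}

// EnsureStop sketches the guarded hook: it issues STOP only when the
// partition is in RUNNING, so it is safe to attach to GO_ERROR, which can
// fire from any source state. It reports whether STOP was actually sent.
func EnsureStop(p Partition) (bool, error) {
	state, err := p.State()
	if err != nil {
		return false, fmt.Errorf("cannot query partition state: %w", err)
	}
	if state != StateRunning {
		// Nothing to stop; skipping is not an error.
		return false, nil
	}
	if err := p.Stop(); err != nil {
		return false, fmt.Errorf("STOP failed: %w", err)
	}
	return true, nil
}

// fakePartition is a test double that records how often Stop was called.
type fakePartition struct {
	state     PartitionState
	stopCalls int
}

func (f *fakePartition) State() (PartitionState, error) { return f.state, nil }

func (f *fakePartition) Stop() error {
	f.stopCalls++
	return nil
}

func main() {
	for _, s := range []PartitionState{StateReady, StateRunning, StateError} {
		p := &fakePartition{state: s}
		stopped, err := EnsureStop(p)
		fmt.Printf("source=%s stopped=%t err=%v\n", s, stopped, err)
	}
}
```

In this sketch the unconditional path is simply a direct `p.Stop()` call, which corresponds to keeping the original ODC.Stop behaviour for partitions that go to ERROR by themselves.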

Additionally, the commit includes minor corrections to a few related logs.

Fixes OCTRL-1036.

@knopers8 knopers8 requested a review from justonedev1 as a code owner August 6, 2025 07:32
justonedev1 previously approved these changes Aug 6, 2025
@justonedev1 justonedev1 left a comment

So apart from adding the stop call to hooks it was necessary to change the plugin code as well... well done!

@knopers8 knopers8 changed the title Call ODC STOP only when the partition is in RUNNING WIP Call ODC STOP only when the partition is in RUNNING Aug 6, 2025
knopers8 commented Aug 6, 2025

Putting on hold until we get confirmation whether we should or should not call ODC STOP when a partition goes to ERROR.

@knopers8 knopers8 changed the title WIP Call ODC STOP only when the partition is in RUNNING Call ODC STOP only when the partition is in RUNNING Aug 18, 2025
@knopers8 knopers8 requested a review from justonedev1 August 18, 2025 15:29
@knopers8

@justonedev1 Ready again. The change compared to the original version is that if an ODC partition goes to ERROR by itself, we should call STOP on it anyway. This will happen thanks to:

if evt.GetServiceName() == "ODC" {

@knopers8 knopers8 merged commit 5ce1357 into AliceO2Group:master Aug 19, 2025
3 checks passed
@knopers8 knopers8 deleted the fix-odc-stop branch August 19, 2025 07:34