Skip to content

Conversation

@justonedev1
Copy link
Collaborator

Added DCS results and durations per detector after given DCS call. Added states of all detectors to global (All) metric after DCS call.

metric := newMetric("EOR")
defer monitoring.TimerSendSingle(&metric, monitoring.Millisecond)()
eor := "EOR"
detectorDurations := map[dcspb.Detector]time.Duration{}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
detectorDurations := map[dcspb.Detector]time.Duration{}
durationsPerDetector map[dcspb.Detector]time.Duration{}

maybe?

detectorDurations := map[dcspb.Detector]time.Duration{}
start := time.Now()

wholeMetric := newMetric(eor)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
wholeMetric := newMetric(eor)
totalCallDurationMetric := newMetric(eor)

I am afraid the meaning of "wholeMetric" might be less clear outside of the context of this PR. How about this?

) (error, []byte) {
metric := newMetric("SOR")
defer monitoring.TimerSendSingle(&metric, monitoring.Millisecond)()
sor := "SOR"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
sor := "SOR"
callName := "SOR"

It's nitpicking, but perhaps this would be better?

After all, if I wanted to declare e.g. a constant defining a number of FLPs, we would not call it twoHundred but numberOfFLPs.

The same comment applies for other calls.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would agree generally with you, but I prefer calling it sor, eor, pfr in this case. It looks more readable when passing it around knowing it's value and inherently it's meaning. The reason is that wile coding you can see hints inside the editors , so I see something like method: sor making it better than method: callName

I also don't think that your example is quite applicable as twoHundred is really generic, but sor is quite specific thing in our tech stack.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like it's a matter of personal preference and I would not judge one as better than other in our case. It's OK.

}

detectorStatusMap[dcsEvent.GetDetector()] = dcsEvent.GetState()
detectorDurations[dcsEvent.GetDetector()] = time.Since(start)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hm, the duration will be incorrect if the detector times out or there is some weirder error, am I wrong? In such case, the loop will break before reaching this line.

Not sure what to propose for the timeout case... after all, some detectors might complete an operation and some not, so we would want to set the timeout value for those who did not complete, but we would have to do some guess work instead of relying on dcsEvent.GetDetector().

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you are right, I overlooked the break.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As we both agree that we are not sure how to fix this, I tried to resolve the problem by adding result tag to the totalCallDurationMetric. It won't fix this problem with incorrect duration with some detectors, but we might at least know that we cannot truest results in a case, when we break outside of the loop early.

Copy link
Collaborator

@knopers8 knopers8 Sep 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good. We can then observe what we see in prod and adapt it if needed.

results of dcs calls

renaming
@justonedev1 justonedev1 merged commit 3585cac into master Sep 9, 2025
4 checks passed
@justonedev1 justonedev1 deleted the OCTRL-1041 branch September 9, 2025 09:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Development

Successfully merging this pull request may close these issues.

3 participants