-
Notifications
You must be signed in to change notification settings - Fork 770
Implement partial success logging in OTLP exporters #4805
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Changes from all commits
1008906
72b2325
baefbc0
e008658
d40ab67
5725ccd
818456e
10aedae
b44507a
d84b22a
11d3e09
a00c8bd
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -11,6 +11,7 @@ | |
| # See the License for the specific language governing permissions and | ||
| # limitations under the License. | ||
|
|
||
|
|
||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. nit: can revert this |
||
| from os import environ | ||
| from typing import Dict, Literal, Optional, Sequence, Tuple, Union | ||
| from typing import Sequence as TypingSequence | ||
|
|
||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We add a I'm thinking we should update that to reference If you want to take a stab at that change here that'd be great, if not I think we can proceed without that and I can make it separately.. I don't think we should special case the logging exporter because we have this duplicate filter thing.
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Thanks, @DylanRussell ! Updated accordingly. Please take another look. Also, any idea what's going on with the failed tests? They look to be unrelated to this change:
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. That's some sort of flake we get all the time (can't find a commit hash), IDK why we get that. It's OK to ignore |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -374,6 +374,10 @@ def _translate_data( | |
| ) -> ExportServiceRequestT: | ||
| pass | ||
|
|
||
| def _process_response(self, response): | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. nit: can we inline this code instead ? IMO it's more readable |
||
| if response.HasField("partial_success"): | ||
| logger.debug("Partial success:\n%s", response.partial_success) | ||
|
|
||
| def _export( | ||
| self, | ||
| data: SDKDataT, | ||
|
|
@@ -388,11 +392,12 @@ def _export( | |
| deadline_sec = time() + self._timeout | ||
| for retry_num in range(_MAX_RETRYS): | ||
| try: | ||
| self._client.Export( | ||
| response = self._client.Export( | ||
| request=self._translate_data(data), | ||
| metadata=self._headers, | ||
| timeout=deadline_sec - time(), | ||
| ) | ||
| self._process_response(response) | ||
| return self._result.SUCCESS # type: ignore [reportReturnType] | ||
| except RpcError as error: | ||
| retry_info_bin = dict(error.trailing_metadata()).get( # type: ignore [reportAttributeAccessIssue] | ||
|
|
||
| Original file line number | Diff line number | Diff line change | ||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
|
|
@@ -46,17 +46,36 @@ class DuplicateFilter(logging.Filter): | |||||||||||||||||
| recursion depth exceeded issue in cases where logging itself results in an exception.""" | ||||||||||||||||||
|
|
||||||||||||||||||
| def filter(self, record): | ||||||||||||||||||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. IMO I would simplify all this and just have the logic as: Although I'm realizing now that we should store the previous 10 Also we have tests for this thing here: opentelemetry-python/opentelemetry-sdk/tests/shared_internal/test_batch_processor.py Lines 259 to 266 in 62da90e
|
||||||||||||||||||
| current_log = ( | ||||||||||||||||||
| # We need to pick a time longer than the OTLP LogExporter timeout | ||||||||||||||||||
| # which defaults to 10 seconds, but not pick something so long that | ||||||||||||||||||
| # it filters out useful logs. | ||||||||||||||||||
| time_boundary = time.time() // 20 | ||||||||||||||||||
| current_log_by_line = ( | ||||||||||||||||||
| record.module, | ||||||||||||||||||
| record.pathname, | ||||||||||||||||||
| record.lineno, | ||||||||||||||||||
| time_boundary, | ||||||||||||||||||
| ) | ||||||||||||||||||
| previous_log_by_line = getattr(self, "last_source", None) | ||||||||||||||||||
| previous_log_by_line_count = getattr(self, "last_source_count", 1) | ||||||||||||||||||
| if current_log_by_line == previous_log_by_line: | ||||||||||||||||||
| recurring_count = previous_log_by_line_count + 1 | ||||||||||||||||||
| self.last_source_count = recurring_count # pylint: disable=attribute-defined-outside-init | ||||||||||||||||||
| if recurring_count > 3: | ||||||||||||||||||
| # Don't allow further processing, since the same line is repeatedly emitting logs. | ||||||||||||||||||
| return False | ||||||||||||||||||
| else: | ||||||||||||||||||
| self.last_source = current_log_by_line # pylint: disable=attribute-defined-outside-init | ||||||||||||||||||
| self.last_source_count = 1 # pylint: disable=attribute-defined-outside-init | ||||||||||||||||||
|
|
||||||||||||||||||
| current_log_by_content = ( | ||||||||||||||||||
| record.module, | ||||||||||||||||||
| record.levelno, | ||||||||||||||||||
| record.msg, | ||||||||||||||||||
| # We need to pick a time longer than the OTLP LogExporter timeout | ||||||||||||||||||
| # which defaults to 10 seconds, but not pick something so long that | ||||||||||||||||||
| # it filters out useful logs. | ||||||||||||||||||
| time.time() // 20, | ||||||||||||||||||
| time_boundary, | ||||||||||||||||||
| ) | ||||||||||||||||||
| if current_log != getattr(self, "last_log", None): | ||||||||||||||||||
| self.last_log = current_log # pylint: disable=attribute-defined-outside-init | ||||||||||||||||||
| if current_log_by_content != getattr(self, "last_log", None): | ||||||||||||||||||
| self.last_log = current_log_by_content # pylint: disable=attribute-defined-outside-init | ||||||||||||||||||
| return True | ||||||||||||||||||
| # False means python's `logging` module will no longer process this log. | ||||||||||||||||||
| return False | ||||||||||||||||||
|
|
||||||||||||||||||

There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could also update the HTTP exporters the same way I think ? https://github.com/open-telemetry/opentelemetry-python/tree/main/exporter/opentelemetry-exporter-otlp-proto-http/src/opentelemetry/exporter/otlp/proto/http