
Conversation

@andreibratu commented Jan 6, 2025

  • Simplified the code that tracks the trace built by the decorators: instead of relying on the global TRACE_CONTEXT variable, we now only peek at the status of the ancestor one level above
  • Aligned the Python SDK with the TS SDK: we now properly support spans created by streaming LLM library calls. To do this, instrumentor spans carry a complete boolean attribute, which is false on span creation and turns true when streaming ends. The Prompt span is no longer exported immediately after the decorated function exits, but only after all instrumentor spans underneath it have been uploaded
  • The exporter now correctly completes flow traces: instead of waiting for program exit to upload the spans, I attach a new span attribute telling the exporter which spans' logs must be uploaded before the flow log is completed. To make this work, the processor keeps a running list of the spans belonging to a given trace as they start; the list is attached to the flow span on the onEnd signal. Spans created by the utilities no longer get a UUIDv4 name but are named after the log type they carry, e.g. humanloop.prompt or humanloop.flow, which is also more idiomatic with how OpenTelemetry does things
  • Renamed the decorators module to utilities to align with the name used in the TS SDK
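The complete flag described above can be sketched as follows. This is a minimal illustration; StreamingAwareSpan, mark_complete, and wait_until_complete are hypothetical names, not the SDK's actual API:

```python
import threading
from typing import Optional

class StreamingAwareSpan:
    """Toy span whose export is deferred until streaming finishes."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.complete = False  # False on creation; turns True when streaming ends
        self._done = threading.Event()

    def mark_complete(self) -> None:
        # Called when the last streamed chunk has been consumed.
        self.complete = True
        self._done.set()

    def wait_until_complete(self, timeout: Optional[float] = None) -> bool:
        # Processor side: block (with an optional timeout) until streaming ended.
        return self._done.wait(timeout)

span = StreamingAwareSpan("humanloop.prompt")
assert not span.complete        # streaming still in flight
span.mark_complete()            # the stream has been fully consumed
assert span.wait_until_complete(timeout=1.0)
```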

Things to look out for: I am introducing some async wait logic for the streaming case in the processor, and I use a thread pool to avoid blocking the main thread. Please double-check that it makes sense and that there are no gotchas.
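A hedged sketch of that thread-pool wait pattern: on_end hands the "wait for children, then export" work to a pool so the caller returns immediately. All names here are illustrative, not the PR's actual implementation:

```python
import threading
from concurrent.futures import ThreadPoolExecutor

class DeferredExportProcessor:
    """Toy processor: defer export to a worker thread until children finish."""

    def __init__(self) -> None:
        self._pool = ThreadPoolExecutor(max_workers=4)
        self._children_done = threading.Event()
        self.exported: list[str] = []

    def on_end(self, span_name: str) -> None:
        # Return immediately; a worker thread does the waiting.
        self._pool.submit(self._wait_and_export, span_name)

    def _wait_and_export(self, span_name: str) -> None:
        # One gotcha to watch for: without a timeout this could hang
        # forever if a child span never completes.
        self._children_done.wait(timeout=5.0)
        self.exported.append(span_name)

    def child_finished(self) -> None:
        self._children_done.set()

proc = DeferredExportProcessor()
proc.on_end("humanloop.prompt")   # export deferred, main thread not blocked
proc.child_finished()             # e.g. the OpenAI streaming span ended
proc._pool.shutdown(wait=True)    # drain the workers (for the demo only)
```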

# THEN the spans are returned bottom to top
assert read_from_opentelemetry_span(span=spans[1], key=HUMANLOOP_FILE_KEY)["tool"]
assert read_from_opentelemetry_span(span=spans[2], key=HUMANLOOP_FILE_KEY)["prompt"]
# assert read_from_opentelemetry_span(span=spans[3], key=HL_FILE_OT_KEY)["flow"]
Author:

No longer relying on TRACE_FLOW_CONTEXT here; we use the ancestor to determine that logs are uploaded correctly.
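A small sketch of that ancestor check. The Span/parent linkage and the parent_is_flow helper are illustrative, not the SDK's actual types:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Span:
    name: str
    parent: Optional["Span"] = None

def parent_is_flow(span: Span) -> bool:
    # Peek only one level up instead of consulting a global trace context.
    return span.parent is not None and span.parent.name == "humanloop.flow"

flow = Span("humanloop.flow")
prompt = Span("humanloop.prompt", parent=flow)
assert parent_is_flow(prompt)
assert not parent_is_flow(flow)
```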


# Wait for the Prompt span to be exported; it was asynchronously waiting
# on the OpenAI call span to finish first
time.sleep(1)
Author:

Added some time.sleep calls in the tests, since the processor now waits asynchronously before sending the Humanloop spans to the exporter.
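As an aside, a fixed time.sleep in tests can be flaky under load; one common alternative (a sketch, not the PR's code) is to poll for the expected condition with a deadline:

```python
import time
from typing import Callable

def wait_until(predicate: Callable[[], bool],
               timeout: float = 2.0,
               interval: float = 0.01) -> bool:
    """Poll `predicate` until it returns True or the deadline passes."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if predicate():
            return True
        time.sleep(interval)
    return False

# e.g. wait_until(lambda: mock_export_method.call_count >= 2) instead of sleep(1)
assert wait_until(lambda: True)
assert not wait_until(lambda: False, timeout=0.05)
```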

assert flow_span_flow_metadata["is_flow_log"]

# THEN one of the flows is nested inside the other
spans: list[ReadableSpan] = [mock_export_method.call_args_list[i][0][0][0] for i in range(1, 5)]
Author:

The order of spans is no longer guaranteed due to the async mechanism in the processor, so tests should no longer rely on or assert a particular order.
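In practice this means asserting on the set of exported spans rather than on positions. A minimal illustration (the span names follow the PR's naming scheme; the list contents are invented for the example):

```python
# Names collected from the mock exporter, in whatever order workers finished.
exported_names = ["humanloop.tool", "humanloop.prompt", "humanloop.flow"]

# Brittle under the async processor:
#   assert exported_names[0] == "humanloop.tool"

# Order-independent:
assert set(exported_names) == {"humanloop.tool", "humanloop.prompt", "humanloop.flow"}
```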

self._upload_queue.task_done()

def _mark_span_completed(self, span_id: int) -> None:
for flow_log_span_id, flow_children_span_ids in self._flow_log_prerequisites.items():
Contributor:

Potentially dumb question: right now we mark a flow as complete only if self._flow_log_prerequisites.items() is non-empty. Is that always the case? I.e., do we ever need to mark it as complete when self._flow_log_prerequisites.items() is empty?
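The behaviour in question can be sketched like this. Only the two names from the snippet above are real; the rest is an illustration, assuming prerequisites are registered before children end. If the dict is empty, the loop body never runs and no flow is marked complete:

```python
class ProcessorSketch:
    def __init__(self) -> None:
        # flow span id -> ids of child spans whose logs must upload first
        self._flow_log_prerequisites: dict[int, set[int]] = {}
        self.completed_flows: list[int] = []

    def _mark_span_completed(self, span_id: int) -> None:
        # If the dict is empty, this loop is a no-op: nothing completes.
        for flow_span_id, children in list(self._flow_log_prerequisites.items()):
            children.discard(span_id)
            if not children:
                self.completed_flows.append(flow_span_id)
                del self._flow_log_prerequisites[flow_span_id]

p = ProcessorSketch()
p._flow_log_prerequisites = {1: {2, 3}}
p._mark_span_completed(2)   # flow 1 still waiting on span 3
p._mark_span_completed(3)   # last prerequisite gone, flow 1 completes
assert p.completed_flows == [1]
```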

span.name,
parent_span_id,
)
time.sleep(0.1)
Contributor:

Discussed offline: if the time.sleep potentially increases latency, maybe remove it.

provider="openai",
max_tokens=-1,
temperature=0.7,
<<<<<<< HEAD
Contributor:

Oops, looks like a leftover merge conflict marker.

@@ -1,3 +1,6 @@
[project]
name = "humanloop"
Contributor:

Is it necessary? I.e., why was it added now?

@olehvell-h (Contributor) left a comment:

Makes sense to me!

I haven't properly understood all the logic, so don't count too much on my review. At the same time, I don't see a reason not to approve it. Maybe spend a bit of extra time QAing just to make sure.

@andreibratu andreibratu merged commit 9e7f556 into master Jan 9, 2025
11 checks passed