@andreibratu andreibratu commented Oct 25, 2024

Goal

  • Allow customers to declare Humanloop Files using code

  • Minimise how many lines need to be changed to integrate with Humanloop

How

  • Add decorators in the Python client that inspect the decorated function to create Files on the HL application.

  • Every call to a decorated function creates a Log against that File.

  • Every change to the decorated function results in a new version being created on HL

  • Added 3 decorators: Prompt, Tool, Flow

    • Prompt decorator expects the function to call an LLM provider. When the function is called, we spy on the API call and infer hyperparameters that define the Prompt Version
    • Tool decorator uses type hints and source code to version the associated Tool File
    • Flow decorator imposes a "Trace context": making calls to other decorated functions from a @flow decorated one adds the Logs to a Trace
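To make the versioning idea concrete, here is a minimal, self-contained sketch of a decorator that fingerprints a function and records one Log per call. It is not the SDK's implementation: `tool_sketch`, `_fingerprint`, and the in-memory `LOGS` list are hypothetical names used only for illustration, and the fallback to bytecode when source is unavailable is an assumption of this sketch.

```python
import functools
import hashlib
import inspect

LOGS = []  # hypothetical stand-in for Logs that would be sent to Humanloop


def _fingerprint(func):
    """Hash the signature (type hints) and the body of a function."""
    try:
        body = inspect.getsource(func)
    except (OSError, TypeError):
        # Source is unavailable (e.g. defined in a REPL); fall back to bytecode.
        code = func.__code__
        body = repr((code.co_code, code.co_consts))
    payload = f"{inspect.signature(func)}\n{body}"
    return hashlib.sha256(payload.encode()).hexdigest()[:12]


def tool_sketch(func):
    """Hypothetical decorator: one version per function body, one Log per call."""
    version = _fingerprint(func)
    signature = inspect.signature(func)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        output = func(*args, **kwargs)
        inputs = dict(signature.bind(*args, **kwargs).arguments)
        LOGS.append(
            {"file": func.__name__, "version": version, "inputs": inputs, "output": output}
        )
        return output

    wrapper.version = version
    return wrapper


@tool_sketch
def double(x: int) -> int:
    return x * 2


double(21)  # appends one entry to LOGS
```

Editing the body or the type hints of `double` changes the fingerprint, which is the behaviour the decorators rely on to create a new version.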

How to Test this PR

  1. Check out this branch locally e.g. to ~/Desktop/humanloop-python
  2. Check out the humanloop-cookbook repository and browse to the decorator example
  3. Install the modified humanloop library by running poetry add ~/Desktop/humanloop-python (or wherever you have cloned the humanloop-python repo). Make sure the current branch in humanloop-python is ENG-1176-sdk-annotations

Design

Primer on OpenTelemetry

  • OpenTelemetry is an open-source standard for instrumenting software. The new decorators rely on it for two reasons:

    1. We use Instrumentors to spy on LLM provider calls
    2. We can easily map from OTel's Trace-Span model to our Flow Trace model
  • Here is a quick 101 to OTel terminology and domain to help you navigate this PR:

    • TracerProvider holds the telemetry configuration for your OTel-powered application
    • A TracerProvider can be used to create Tracers, which act as session objects in a telemetry Trace
    • A Tracer creates Spans, which carry key-value pairs of telemetry information. A Span can be nested under another Span (which we leverage for creating Flow Traces)
    • The TracerProvider can have one or multiple SpanProcessors. Spans created by a Tracer spawned from a TracerProvider will be passed to each SpanProcessor registered with the TracerProvider. SpanProcessors form a pipeline, each being applied in order of being added. A Processor is responsible for filtering or modifying the Span and passing it along.
    • Each SpanProcessor takes one SpanExporter to which it passes the modified or filtered Spans. The SpanExporter can be a Connector that passes the Span to another SpanProcessor or can act as a final destination, uploading the Spans to a backend (e.g., Humanloop, Datadog, etc.).
    • Instrumentors are libraries that spy on third-party libraries e.g. Django, SQLAlchemy and add Spans to the TracerProvider they're added to automatically

Sketch of the Design

  1. Create a TracerProvider in the scope of the Humanloop client. Add Instrumentors to it for spying on LLM providers' libraries (e.g., OpenAI) and obtain a Tracer used as a backend for the decorators.
  2. Whenever a decorated function is called, create a new Span in the Trace owned by the Tracer. If the decorated function was called by another decorated function, the new Span becomes a child of the caller's Span
  3. Gather information from the decorated function or the arguments of the function call and add them as attributes to the Spans
  4. Implement a HumanloopSpanProcessor. Its goal is to identify the Instrumentors we've added to spy on libraries of interest and add information from those spans to Spans created by our decorators. Spans not created by the decorators are dropped.
  5. Implement a HumanloopSpanExporter and pair it with the HumanloopSpanProcessor. The Exporter will translate the spans into SDK calls.
  6. The @flow decorator acts as an entry point: when a Span is the child of another Span, and the call stack begins with a @flow decorated function, the relationship is translated into a Flow Trace on HL. Otherwise, the Span parent-child relationship is still useful in other aspects, such as automatic start_time and end_time for each Span (and the Log it will create)

How To Review this PR:

  1. Review src/humanloop/otel/helpers.py and associated tests. Review src/humanloop/otel/__init__.py and src/humanloop/client.py
  2. Review decorators from src/humanloop/decorators/*
  3. Review src/humanloop/otel/processor.py and src/humanloop/otel/exporter.py
  4. Review tests for src/humanloop/decorators/*

Considerations

  • otel/helpers.py contains:
    • Helpers for identifying Spans that belong to decorators
    • Helpers to read from and write to Spans. Values of OTel span attributes (logged telemetry information) must be primitives or lists. False values are dropped, and OTel will complain about trying to write one. Dictionary values must be linearised, e.g., writing {'a': 7, 'b': 5} on key foo translates into writing two attributes on the Span: foo.a: 7 and foo.b: 5. This leads to a lot of 'fun' and makes the utils necessary for cleaner code.
    • Due to the limitation mentioned above, a placeholder value named HL_OT_EMPTY_VALUE is sometimes used throughout the code when
  • The OTel concept of Baggage is used for rebuilding the structure of the Flow Trace. For this PR, imagine it as a global context accessed as a stack by a Span. The Span uses it to figure out which HL Log its own Log should be linked to in a Flow Trace. This circumvents a limitation of OTel: a Span is aware of the ID of its parent Span but cannot access more data than that.
  • The HLProcessor sends Spans to the HLExporter in a bottom-up manner. With our current backend API, a Flow Trace must be sent to HL top-to-bottom. To address this, the Exporter holds a work queue of unexported Spans and delays the upload until all ancestors in the Flow Trace have been uploaded. This is smelly; the backend should add an endpoint for uploading a Trace in one go.
  • Tests for the prompt decorator are integration tests run with live LLM provider clients. I had to add API keys as secrets to this repo and modify a GitHub Action. This is safe and has precedent in the hl-alpha repo, where API keys are added to support similar integration tests.
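The dictionary linearisation described for otel/helpers.py can be illustrated with a small round-trip. This is a sketch under assumptions: `flatten` and `unflatten` are hypothetical names rather than the helpers in this PR, and the sketch assumes dictionary keys contain no dots.

```python
def flatten(key, value):
    """Turn a (possibly nested) dict into dotted attribute names."""
    if isinstance(value, dict):
        attributes = {}
        for sub_key, sub_value in value.items():
            attributes.update(flatten(f"{key}.{sub_key}", sub_value))
        # Note: an empty dict produces no attributes in this sketch.
        return attributes
    return {key: value}


def unflatten(attributes, key):
    """Rebuild the dict stored under `key` from dotted attribute names."""
    prefix = key + "."
    result = {}
    for name, value in attributes.items():
        if not name.startswith(prefix):
            continue
        *parents, leaf = name[len(prefix):].split(".")
        node = result
        for parent in parents:
            node = node.setdefault(parent, {})
        node[leaf] = value
    return result


span_attributes = flatten("foo", {"a": 7, "b": 5})
# span_attributes == {"foo.a": 7, "foo.b": 5}
```

Reading a value back means scanning for every attribute that shares the prefix, which is why dedicated read/write utilities keep the calling code tidy.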

Open Questions

These points should be discussed before merging.

  • The private OTel backend we set up for decorators does not interact with the telemetry setup the customer might have in their app. However, they might want to integrate them, and I need to talk with them about it (this is one of the reasons I went with the Span read/write utils instead of just dumping a JSON in the root Span attributes: we want to play nicely with any OTel setup the HL SDK will be added to)
  • I had to bump the minimum Python version from 3.8 to 3.9 to support OpenTelemetry Instrumentors. I will talk with Fern to understand the implications of this. We should also talk with our big customers and check if this would be a breaking change
  • I have not added all the providers we support in the API to our OTel config. I will do this as the PR gets reviewed, as it's trivial.
  • I need to double-check that Instrumentors don't impose constraints on the versions of the LLM provider libraries
  • Should we add a backend endpoint for creating a Flow Trace at once? This would improve the performance of the decorators as right now we upload the Trace one Log at a time, top to bottom, which is slow
  • The Prompt template argument should support a list of messages instead of just a string
  • Should we add Instrumentor Spans to Flow Traces? This would give the customer perfect insight into what happened in the application. These Spans would not correspond to any Flow File in the HL domain, so we should implement no-File Logs as a prerequisite. Future scope?
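For context on the top-to-bottom upload constraint raised under Considerations and in the endpoint question above, here is a toy model of the work queue: spans arrive children-first and each is held back until its parent has been uploaded. `upload_order` and the dict-based spans are hypothetical; the real Exporter issues SDK calls rather than returning a list of IDs.

```python
def upload_order(spans):
    """Return span IDs in an order where every parent precedes its children.

    `spans` is given in arrival order (bottom-up); each span is a dict with
    "id" and "parent_id" keys, and "parent_id" is None for the root.
    """
    uploaded = set()
    pending = []
    order = []
    for span in spans:
        pending.append(span)
        # Drain the queue for as long as some pending span is unblocked.
        progressed = True
        while progressed:
            progressed = False
            for candidate in list(pending):
                parent_id = candidate["parent_id"]
                if parent_id is None or parent_id in uploaded:
                    order.append(candidate["id"])
                    uploaded.add(candidate["id"])
                    pending.remove(candidate)
                    progressed = True
    return order


arrival = [
    {"id": 3, "parent_id": 2},
    {"id": 2, "parent_id": 1},
    {"id": 1, "parent_id": None},
]
order = upload_order(arrival)
```

Nothing can be uploaded until the root span arrives, which is the latency cost a single bulk endpoint would remove.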

@andreibratu andreibratu force-pushed the ENG-1176-sdk-annotations branch 10 times, most recently from 3fa3b35 to 0359dac Compare October 26, 2024 20:57
@andreibratu andreibratu force-pushed the ENG-1176-sdk-annotations branch from f2f6a2c to 4d854ed Compare November 12, 2024 11:59
Comment on lines 490 to 494
# Check if the parameter can be None, either via Optional[T] or T | None type hint
origin = typing.get_origin(parameter.annotation)
# sub_types refers to T inside the annotation
sub_types = typing.get_args(parameter.annotation)
return origin is typing.Union and len(sub_types) > 0 and sub_types[-1] is type(None)
Contributor

does this work for None | T

testing locally, also running into problems with origin is typing.Union - origin is types.UnionType for me.
(python 3.11)

Author

I have added support for parsing Python 3.10-style union annotations, added tests for them, and created a new action that runs the test suite on both 3.9 and 3.12
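For reference, a check that covers Optional[T], Union[None, T], T | None and None | T could look roughly like the sketch below. It is an illustration of the issue discussed in this thread, not necessarily the fix that was pushed; `parameter_is_optional` is a hypothetical name, and string annotations (from `from __future__ import annotations`) are not resolved here.

```python
import inspect
import types
import typing

# types.UnionType only exists on Python 3.10+, so look it up defensively.
_UNION_ORIGINS = (typing.Union, getattr(types, "UnionType", typing.Union))


def parameter_is_optional(parameter: inspect.Parameter) -> bool:
    """Return True if the annotation is a union that includes None."""
    annotation = parameter.annotation
    # Optional[T] and Union[...] report typing.Union; on 3.10+, T | None
    # reports types.UnionType instead.
    origin = typing.get_origin(annotation)
    if origin not in _UNION_ORIGINS:
        return False
    # Check membership rather than position, so None | T is handled too.
    return type(None) in typing.get_args(annotation)
```

Testing membership instead of `sub_types[-1]` is what makes the position of None irrelevant.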

@harry-humanloop
Contributor

harry-humanloop commented Nov 12, 2024

Collating unresolved comments:

tested with the cookbook's main.py yesterday and it worked well. some concerns around how helpful we're trying to be, but think we're at a good level here (of trying to be helpful while not making that many assumptions/limitations)

that typing.Union/types.UnionType issue for _parameter_is_optional should be looked at.

@andreibratu andreibratu force-pushed the ENG-1176-sdk-annotations branch from ae815f3 to 03261c4 Compare November 12, 2024 15:05
@andreibratu andreibratu force-pushed the ENG-1176-sdk-annotations branch from f621b31 to 52c6c7b Compare November 12, 2024 15:44
@andreibratu andreibratu force-pushed the ENG-1176-sdk-annotations branch from c2e7f67 to 0ae1b89 Compare November 12, 2024 15:59
@andreibratu andreibratu force-pushed the ENG-1176-sdk-annotations branch from 0ae1b89 to b9668d2 Compare November 12, 2024 16:01
@andreibratu
Author

andreibratu commented Nov 12, 2024

@harry-humanloop ty for feedback:

  • I have relaxed dependency requirements as much as possible. There are 2 dependencies left that use <=: opentelemetry-sdk and opentelemetry-api. There are two versions above 1.27.0: 1.28.0 and 1.28.1, both released after 1 November 2024. Thus, we should be compatible with virtually all OTel configurations
  • There is extra context on the internal Linear ticket. The initial complaint was fixed by changing the decorator signature to use the KernelParam class, but per offline discussion I still think the Linear ticket stands: the provider shouldn't default to 'openai'
  • We are using copy_context

Contributor

@harry-humanloop harry-humanloop left a comment

lgtm. let's publish a beta version of the sdk.

suspect the dependency changes here need to be propagated to humanloop-docs.

🎉

@andreibratu
Author

andreibratu commented Nov 12, 2024

@harry-humanloop it's actually the other way around regarding dependencies; we need to merge this first: https://github.com/humanloop/humanloop-docs/pull/147

@andreibratu andreibratu merged commit 13733a5 into master Nov 12, 2024
4 checks passed