
Conversation

andreibratu commented on Feb 24, 2025

  • Use the /otel endpoint instead of processing spans locally
  • Helpful error messages on evaluations.run()
  • Throw errors for invalid decorator uses
  • Decorators no longer serialize all errors; critical ones are raised, so evaluations will not hang mysteriously
  • Move away from context variables in favor of OTEL's runtime context API (see the sketch below)
  • @flow will also pick up SDK logging, e.g. prompts.call(...) or tools.log(...)
  • Sync with backend API: Jinja release, log status usage
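
A minimal sketch of the OTEL runtime context API mentioned above, to show the shape of the change; this is generic OpenTelemetry usage, not the SDK's internals, and the "humanloop.flow_trace_id" key is illustrative only.

# Generic OpenTelemetry runtime-context usage (illustrative, not Humanloop internals).
from opentelemetry import context as otel_context

# Store decorator state on the current OTEL context instead of a contextvars.ContextVar.
token = otel_context.attach(otel_context.set_value("humanloop.flow_trace_id", "flow-123"))
try:
    # Anything running under this context can read the value back.
    print(otel_context.get_value("humanloop.flow_trace_id"))  # "flow-123"
finally:
    # Restore the previous context when the decorated function exits.
    otel_context.detach(token)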

andreibratu (Author) commented on Feb 24, 2025

Test script

from humanloop import Humanloop
from openai import OpenAI

hl_client = Humanloop(
    api_key="<HL_API_KEY>",
    base_url="http://localhost:80/v5",
)

TEMPLATE = "You are a useful chatbot that speaks like a {{personality}}"

client = OpenAI(api_key="<OPENAI_KEY>")
# NOTE: the populated template result is not captured; the system message below is hardcoded.
hl_client.prompts.populate_template(TEMPLATE, inputs={"personality": "pirate"})
messages = [
    {
        "role": "system",
        "content": "You are a useful chatbot that speaks like a ",
    }
]


@hl_client.tool(path="Andrei QA/Calculator")
def calculator(a: int, b: int):
    return a + b


@hl_client.flow(path="Andrei QA/Flow")
def fn(cool: str, beans: int):
    output = calculator(1, 2)
    hl_client.tools.log(
        path="Andrei QA/Log",
        tool={
            "function": {
                "name": "calculator",
                "description": "Adds two numbers",
                "parameters": {
                    "a": {
                        "type": "int",
                        "description": "First number",
                    },
                    "b": {
                        "type": "int",
                        "description": "Second number",
                    },
                },
            }
        },
        inputs={
            "a": 1,
            "b": 2,
        },
        output=output,
    )

    hl_client.prompts.call(
        path="Andrei QA/Call Prompt",
        prompt={
            "provider": "openai",
            "model": "gpt-4o",
            "temperature": 0.8,
            "frequency_penalty": 0.6,
            "presence_penalty": 0.6,
        },
        messages=[
            {
                "content": "Say something funny",
                "role": "system",
            }
        ],
    )

    with hl_client.prompt(path="Andrei QA/Prompt"):
        while True:
            user_input = input("> ")
            if user_input == "exit":
                break
            messages.append(
                {
                    "role": "user",
                    "content": user_input,
                }
            )
            response = (
                client.chat.completions.create(
                    model="gpt-4o",
                    messages=messages,
                    temperature=0.8,
                    frequency_penalty=0.6,
                    presence_penalty=0.6,
                    seed=42,
                )
                .choices[0]
                .message.content
            )
            print(response)
            messages.append(
                {
                    "role": "assistant",
                    "content": response,
                }
            )

    return messages[-1]


fn("cool", 1)

# @hl_client.flow(path="Andrei QA/Flow Evaluate")
# def fn_evaluate(a: int, b: int):
#     return calculator(a, b)


# hl_client.evaluations.run(
#     file={
#         "path": "Andrei QA/Flow Evaluate",
#         "callable": fn_evaluate,
#     },
#     name="Test",
#     dataset={
#         "path": "Andrei QA/Arithmetic",
#         "datapoints": [
#             {
#                 "inputs": {
#                     "a": 1,
#                     "b": 2,
#                 },
#                 "target": {
#                     "value": 3,
#                 },
#             },
#             {
#                 "inputs": {
#                     "a": 3,
#                     "b": 4,
#                 },
#                 "target": {
#                     "value": 7,
#                 },
#             },
#         ],
#     },
#     evaluators=[
#         {
#             "path": "Andrei QA/Equals",
#             "callable": lambda x, y: x["output"] == y["target"]["value"],
#             "return_type": "boolean",
#             "args_type": "target_required",
#         }
#     ],
# )

andreibratu changed the title from "Decorators fixes" to "Decorator fixes" on Feb 24, 2025
*,
path: Optional[str] = None,
**prompt_kernel: Unpack[DecoratorPromptKernelRequestParams], # type: ignore
path: str,
Contributor:

nit for all of these: we should support passing either path or id, no?

Author:

I'd keep it to path to reduce cognitive load

else:
self._opentelemetry_tracer = opentelemetry_tracer

@contextmanager
Contributor:

as discussed earlier, this is cool but probably unnecessary -- simpler to just keep it as a decorator on a callable

Author:

yep done

Comment on lines -131 to -143
"""Upload spans to Humanloop.
Ran by worker threads. The threads use the self._shutdown flag to wait
for Spans to arrive. Setting a timeout on self._upload_queue.get() risks
shutting down the thread early as no Spans are produced e.g. while waiting
for user input into the instrumented feature or application.
Each thread will upload a Span to Humanloop, provided the Span has all its
dependencies uploaded. The dependency happens in a Flow Trace context, where
the Trace parent must be uploaded first. The Span Processor will send in Spans
bottoms-up, while the upload of a Trace happens top-down. If a Span did not
have its span uploaded yet, it will be re-queued to be uploaded later.
"""
Contributor:

should still replace this with a correct docstring

Author:

outdated

peadaroh (Contributor) commented on Mar 6, 2025

Just to confirm: this refactors the 'polling behaviour' for flow logs so that it instead checks when the log has been updated?

jamesbaskerville (Contributor) left a comment

Generally looks great. Love the error/logging improvements and how much the processor/exporter logic has been simplified.

A few higher-level things:

  • Why are there more # type: ignore[arg-type] comments littered all over? What changed to cause that?
  • We need to handle the polling fix as a follow-on, though it's good we're sleeping in between polls now.

Comment on lines +105 to +108
The Humanloop SDK File decorators use OpenTelemetry internally.
You can provide a TracerProvider and a Tracer to integrate
with your existing telemetry system. If not provided,
an internal TracerProvider will be used.
Contributor:

Does our OTEL exporting still work OK if the user passes in a TracerProvider?
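
For reference, a hypothetical sketch of what passing in an existing tracer could look like from the caller's side. The opentelemetry_tracer parameter name appears elsewhere in this diff but should be treated as an assumption here; the provider/exporter wiring is standard opentelemetry-sdk setup.

# Hypothetical usage sketch; only the opentelemetry_tracer name is taken from the diff.
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor, ConsoleSpanExporter
from humanloop import Humanloop

provider = TracerProvider()
provider.add_span_processor(BatchSpanProcessor(ConsoleSpanExporter()))

hl_client = Humanloop(
    api_key="<HL_API_KEY>",
    opentelemetry_tracer=provider.get_tracer("my-app"),  # assumed constructor kwarg
)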

If a different model, endpoint, or hyperparameter is used, a new
Prompt version is created. For example:
```
@humanloop_client.prompt(path="My Prompt")
Contributor:

nit: inconsistent -- the example above just uses @prompt. We should standardize.

provided, the function name is used as the path and the File
is created in the root of your Humanloop organization workspace.
:param prompt_kernel: Attributes that define the Prompt. See `class:DecoratorPromptKernelRequestParams`
Contributor:

looks like we're getting rid of this?

Contributor:

are these tests all gone now? Or did they not work?

Comment on lines +52 to +54
context = get_decorator_context()
if context is None:
raise HumanloopRuntimeError("Internal error: trace_id context is set outside a decorator context.")
Contributor:

If I'm understanding the error right, this should also be a check in the line 59 block, i.e. it should apply even when it's not the flow client?

I do like all this validation/error checking though!

Comment on lines +838 to +839
(trace_info[-200:] + "..." if len(trace_info) > 200 else trace_info) if length_limit else trace_info
)
Contributor:

The "..." should be in front if we're keeping the last 200 chars, right?

Contributor:

Also feels like length_limit should just be an int instead of a boolean.
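
A small sketch of what both suggestions combined might look like (int limit, ellipsis in front when keeping the tail of the string); the helper name is illustrative, not the PR's code.

# Illustrative helper combining both review suggestions; not the SDK's actual code.
def truncate_tail(trace_info: str, length_limit: int | None = None) -> str:
    """Keep the last length_limit characters, prefixing "..." when truncating."""
    if length_limit is None or len(trace_info) <= length_limit:
        return trace_info
    return "..." + trace_info[-length_limit:]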

},
data=serialize_span(span_to_export),
)
print("RECV", span_to_export.attributes, response.json(), response.status_code)
Contributor:

fix: change this to a logger.debug call
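
For instance, the print above could become a debug log along these lines (sketch only; the logger name and message format are illustrative):

# Sketch of the suggested change, reusing the variables from the snippet above.
import logging

logger = logging.getLogger(__name__)
logger.debug(
    "Received response for span: attributes=%s body=%s status=%s",
    span_to_export.attributes,
    response.json(),
    response.status_code,
)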

pass
else:
if eval_context_callback:
print("HELLO")
Contributor:

remove

]
)

return MessageToJson(payload)
Contributor:

Hmm, not 100% sure this will work as expected -- OTLP JSON encoding differs a bit from the standard proto3 JSON mapping. Possibly we should just be working in proto anyway? That might make things simpler, since binary proto is more of the default than JSON.

https://opentelemetry.io/docs/specs/otlp/#json-protobuf-encoding

Contributor:

I guess it's fine if our IDs aren't "correct" as long as they're internally consistent?
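
To illustrate the divergence being discussed: protobuf's MessageToJson follows the standard proto3 JSON mapping, which encodes bytes fields such as trace_id as base64, whereas the OTLP/JSON spec expects hex-encoded IDs. A quick sketch (the trace ID value is arbitrary):

# Demonstrates the base64-vs-hex ID encoding difference; serializing to binary proto avoids it.
from google.protobuf.json_format import MessageToJson
from opentelemetry.proto.trace.v1.trace_pb2 import Span

span = Span(name="example", trace_id=bytes.fromhex("0af7651916cd43dd8448eb211c80319c"))

print(MessageToJson(span))           # traceId comes out base64-encoded, not the hex OTLP/JSON expects
print(span.SerializeToString()[:8])  # binary protobuf payload, no JSON-mapping concerns at all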

attributes=[
KeyValue(
key=key,
value=AnyValue(string_value=str(value)),
Contributor:

are they always strings? Aren't they sometimes ints or some other type?
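
If we wanted to preserve types, the OTLP common proto's AnyValue has dedicated fields for them; a sketch of a type-preserving mapping (illustrative, not the PR's code):

# Illustrative type-preserving conversion; the PR currently coerces everything to string_value.
from opentelemetry.proto.common.v1.common_pb2 import AnyValue, KeyValue

def to_any_value(value) -> AnyValue:
    if isinstance(value, bool):  # check bool before int, since bool is a subclass of int
        return AnyValue(bool_value=value)
    if isinstance(value, int):
        return AnyValue(int_value=value)
    if isinstance(value, float):
        return AnyValue(double_value=value)
    return AnyValue(string_value=str(value))

attributes = [KeyValue(key=k, value=to_any_value(v)) for k, v in {"retries": 3, "ok": True}.items()]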

Infinite polling of the export queue without a sleep was causing worker threads to take up a full core of CPU. A simple sleep fixes that.
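
For context, a generic sketch of the pattern (not the exporter's actual loop): either sleep between empty polls, as this commit does, or use a blocking get with a timeout; both stop the thread from pegging a core.

# Generic sketch of avoiding a busy-wait on a queue; not the exporter's actual code.
import queue
import threading
import time

def worker(upload_queue: queue.Queue, shutdown: threading.Event, upload) -> None:
    while not shutdown.is_set():
        try:
            span = upload_queue.get_nowait()
        except queue.Empty:
            time.sleep(0.05)  # yield the CPU instead of spinning at 100%
            continue
        upload(span)  # upload is a caller-supplied callable in this sketch
        upload_queue.task_done()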
andreibratu merged commit 1a709ee into master on Mar 7, 2025 (7 checks passed)