Enable ephemeral prompt caching with LangSmith metrics #28
Conversation
Correction: LangSmith is being used to provide the dashboard, not LangServe.
Force-pushed from c3cd673 to 353850c.
Rebased to include #29.
@dlqqq I don't see this issue when I switch to main, or when using Anthropic directly.
This seems related to the prompt-caching args that were added, which are either not being passed correctly or are missing some other configuration for Bedrock. Once I removed the prompt-caching block, things worked.
@dlqqq Thanks for catching this. Would it be sufficient to disable this feature if the model ID starts with
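The fix proposed above can be sketched as a prefix check that gates the prompt-caching feature by model ID. This is a minimal sketch: `supports_prompt_caching` is a hypothetical helper name, and the `"bedrock/"` prefix is an illustrative assumption (the exact prefix discussed in the thread is not shown here).

```python
def supports_prompt_caching(model_id: str) -> bool:
    """Return False for model IDs known to reject prompt-caching args.

    Hypothetical helper; the prefix tuple below is an illustrative
    assumption, not the project's actual list.
    """
    unsupported_prefixes = ("bedrock/",)  # assumed example prefix
    return not model_id.startswith(unsupported_prefixes)


# The provider would then only attach cache-control args when this
# returns True, leaving other model IDs untouched.
```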
3coins
left a comment
Looks good!
Description
Prompt-caching arguments are now passed to `litellm.acompletion()`. The `ChatLiteLLM` provider has been made significantly more type-safe, and the `_astream()` method now clarifies the type of every object created & used there.

Demo
(low-resolution video because of GitHub's 10MB file upload limit)
Screen.Recording.2025-12-03.at.5.53.59.PM-2.mov
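Ephemeral prompt caching of the kind this PR enables roughly corresponds to attaching Anthropic-style `cache_control` blocks to the messages passed to `litellm.acompletion()`. A minimal sketch, assuming Anthropic's `{"type": "ephemeral"}` content-block marker; `with_ephemeral_cache` is a hypothetical helper, not this PR's actual code:

```python
def with_ephemeral_cache(messages: list[dict]) -> list[dict]:
    """Mark the last message's content for ephemeral prompt caching.

    Sketch only: assumes Anthropic-style content blocks, where a
    `cache_control` entry of {"type": "ephemeral"} asks the provider
    to cache the prompt prefix up to that block.
    """
    marked = [dict(m) for m in messages]
    last = marked[-1]
    content = last["content"]
    if isinstance(content, str):
        # Promote plain-string content to a content-block list so the
        # cache_control annotation has somewhere to live.
        content = [{"type": "text", "text": content}]
    content = [dict(block) for block in content]
    content[-1]["cache_control"] = {"type": "ephemeral"}
    last["content"] = content
    return marked


# Usage (not executed here; requires a configured litellm client):
# response = await litellm.acompletion(
#     model="anthropic/claude-3-5-sonnet",  # assumed example model
#     messages=with_ephemeral_cache(messages),
# )
```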
Minor "breaking" changes to the `ChatLiteLLM` provider

- I have removed the `_stream()` method implementation to avoid code duplication. This can be easily re-implemented (without duplication) if needed in the future; the code comment there details how.
- I needed to change the API of the `_create_usage_metadata()` helper function to provide the cache metrics in LangSmith and to improve its type safety. This means that every other "invocation" method except `astream()` (e.g. `generate()`) is likely broken, since they eventually call this function. This should not have any impact on Jupyternaut since we always call `astream()` anyway.
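The cache metrics that LangSmith displays come from token-usage details attached to each response. A minimal sketch of the kind of mapping a helper like `_create_usage_metadata()` performs, using plain dicts; the input keys mirror what LiteLLM reports for Anthropic prompt caching and the output mirrors LangChain's `UsageMetadata`/`input_token_details` shape, but the exact fields in this PR's implementation are assumptions:

```python
def create_usage_metadata(usage: dict) -> dict:
    """Map a LiteLLM-style usage payload to LangChain-shaped usage metadata.

    Sketch only: field names are assumptions modeled on LiteLLM's and
    LangChain's public conventions, not this PR's actual helper.
    """
    input_tokens = usage.get("prompt_tokens", 0)
    output_tokens = usage.get("completion_tokens", 0)
    details = usage.get("prompt_tokens_details") or {}
    return {
        "input_tokens": input_tokens,
        "output_tokens": output_tokens,
        "total_tokens": input_tokens + output_tokens,
        "input_token_details": {
            # Tokens served from the prompt cache on this request.
            "cache_read": details.get("cached_tokens", 0),
            # Tokens written to the cache on this request.
            "cache_creation": usage.get("cache_creation_input_tokens", 0),
        },
    }
```

Because every invocation path eventually funnels usage through this one helper, changing its signature is what breaks the non-`astream()` methods mentioned above.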