-
Notifications
You must be signed in to change notification settings - Fork 2.5k
[DNM, WIP] feat(ReAct, History): Refactor (1) dspy.History to be flexible to more input types and (2) ReAct to use dspy.History #9113
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
393e4ba to
b24b448
Compare
|
@codex review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Codex Review
Here are some automated review suggestions for this pull request.
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| # Only infer if mode is the default "flat" and messages clearly match another mode | ||
| if self.mode == "flat": | ||
| inferred = self._infer_mode_from_msg(self.messages[0]) | ||
| if inferred in {"raw", "demo"}: | ||
| object.__setattr__(self, "mode", inferred) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Preserve signature histories instead of flattening
The validator only upgrades the default mode from flat to raw or demo, never to signature, so a typical signature-style history like {"question": ..., "answer": ...} remains in flat mode and _format_flat_history emits a single user message. Multi-turn prompts that previously produced user/assistant pairs (e.g., the default History(messages=[{"question": "...", "answer": "..."}]) used in tests/predict/test_predict.py) now collapse the assistant reply into the next user prompt, dropping prior assistant turns from the LM context. Consider keeping signature as the default or inferring signature mode when both input and output keys are present.
Useful? React with 👍 / 👎.
| return value | ||
| try: | ||
| return json.dumps(value) | ||
| except (TypeError, ValueError): | ||
| return str(value) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Keep tool observations’ custom payloads
Tool outputs are serialized with json.dumps and a str(value) fallback, so non‑JSON DSPy types (e.g., a tool returning dspy.Image objects or tuples) are recorded as plain strings like (Image(...)). Because split_message_content_for_custom_types never sees the custom-type markers, multimodal tool responses disappear from the trajectory (e.g., test_tool_observation_preserves_custom_type would no longer surface image_url parts to the LM). The serializer should preserve DSPy Type encodings instead of stringifying them.
Useful? React with 👍 / 👎.
Will be split into 2 PRs. Do not merge
Refactors both dspy.History and dspy.ReAct.
Goal
Multi turn chat has been a weaker part of the DSPy dx. It is not clear how you should maintain history, nor particularly easy to do so. These PRs should allow you to build multi turn, (optionally tool use) chatbots more easily.
This PR is the start of efforts to alleviate the concerns brought up in #8798
Overview
History
DSPy.History used to be set to only a single input type.
All inputs and outputs needed to match the current signature that is being used otherwise each step would be filtered to match those fields. This was done to guarantee that we could separate into proper user/assistant pairs. We relax that constraint and give the user more flexibility on how to collect and display messages.
History now has 4 possible modes that it can handle:
ReAct
dspy.ReAct uses two signatures internally. ReAct.react and ReAct.extract. ReAct does "{input_fields}, trajectory:str -> tool: ToolCall" in a loop, and then "{input_fields}, trajectory: str -> {output_fields}"
Testing
The primary testing has been done using the following script to ensure that the messages still work properly. Before merging, I will run some tool use benchmarks on both new and old versions across a suite of differently sized models and ensure parity or better for the new versions.