🚀 The feature, motivation and pitch
Improve the offline LLM.chat method to bring it closer to the functionality of the online serving interface: enhance the LLM.chat and LLM.generate interfaces to support streaming output, and support filtering messages with reasoning_parser and tool_parser. This way, libraries like Outlines can better support streaming output on top of the vLLM offline interface as well.
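For illustration, a minimal sketch of what the requested streaming API could look like; the stream=True flag and the shape of the yielded chunks are assumptions for this proposal, not existing vLLM interfaces:

```python
from vllm import LLM, SamplingParams

llm = LLM(model="Qwen/Qwen2.5-7B-Instruct")  # any chat-capable model
params = SamplingParams(max_tokens=256)

messages = [{"role": "user", "content": "Explain streaming output in one paragraph."}]

# Proposed (hypothetical): stream=True turns chat() into a generator that
# yields partial RequestOutput objects as tokens are produced, mirroring
# the online server's streaming responses.
for chunk in llm.chat(messages, sampling_params=params, stream=True):
    print(chunk.outputs[0].text, end="", flush=True)
```

This would let callers such as Outlines consume tokens incrementally instead of waiting for the full completion.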
Alternatives
Subclass the LLM class to implement a simplified version of this functionality (a rough sketch follows).
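A sketch of this workaround, assuming the current LLM.llm_engine attribute and the LLMEngine add_request/step/has_unfinished_requests methods; these exist today but are engine internals, which is why a first-class streaming API would be preferable:

```python
from vllm import LLM, SamplingParams

class StreamingLLM(LLM):
    """Workaround: drive the underlying engine's step loop to stream tokens."""

    def stream_generate(self, prompt: str, sampling_params: SamplingParams):
        request_id = "stream-0"  # illustrative fixed id
        self.llm_engine.add_request(request_id, prompt, sampling_params)
        # Step the engine manually and yield each partial RequestOutput.
        while self.llm_engine.has_unfinished_requests():
            for output in self.llm_engine.step():
                if output.request_id == request_id:
                    yield output
                    if output.finished:
                        return

llm = StreamingLLM(model="Qwen/Qwen2.5-7B-Instruct")
previous = ""
for output in llm.stream_generate("Hello!", SamplingParams(max_tokens=64)):
    text = output.outputs[0].text  # cumulative text so far
    print(text[len(previous):], end="", flush=True)  # print only the new suffix
    previous = text
```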
Additional context
No response
Before submitting a new issue...