v0.7.2

XkunW released this 04 Nov 23:24
· 41 commits to main since this release

What's Changed

  • Fix broken throughput computation
  • Fix multi-node server dying upon first request
  • Improve RDMA support
  • Add flash-infer as a backend; add sglang to the Docker image for future support
  • Add optional bind path rendering to the CLI launch response; make vllm arg and env var rendering optional
  • Merge the CLI arg bind path with the config bind path instead of overwriting
  • Add CPUs per task and memory per node options to the CLI launch command
  • Improve server status error handling
  • Minor code refactor

Full Changelog: v0.7.1...v0.7.2