[Deterministic Inference] Support Deterministic Inference #4651

@gongshaotian

Milestone 1: Support deterministic inference for dense models

Milestone 2: Support CUDAGraph, ChunkPrefill, PrefixCache, and MoE models

Milestone 3: Support SpeculativeDecoding, Parallelism, and Quantization

  • Support SpecDecoding
    • MTP
  • Parallelism
    • TP
    • EP
  • Quantization
    • BlockWise FP8

Milestone 4: Support RL training

  • RL
