
Commit 5ebd2c0

Authored by yejunjin
fix: update README since we support 32k context length (#12)
1 parent 82c0373 commit 5ebd2c0

File tree

2 files changed: 2 additions (+), 2 deletions (-)


README.md

Lines changed: 1 addition & 1 deletion
@@ -188,7 +188,7 @@ This subsection lists the third-party dependencies for the different stages of D
 # Future Plans
 
 - [ ] Accelerate attention with Flash-Attention
-- [ ] Expand context length to over 32k
+- [x] Expand context length to over 32k
 - [ ] Support 4-bit quantization
 - [ ] Support quantized models fine-tuned with GPTQ
 - [ ] Support MoE architecture

README_CN.md

Lines changed: 1 addition & 1 deletion
@@ -189,7 +189,7 @@ $$ x_{u8} = x_{fp32} / scale + zeropoint $$
 # Future Plans
 
 - [ ] First-token latency acceleration: add CPU-based attention acceleration techniques such as Flash-Attention;
-- [ ] Context Length: expand to over 32k;
+- [x] Context Length: expand to over 32k;
 - [ ] Low-bit quantization support: support 4-bit quantization;
 - [ ] QAT quantization support: support models fine-tuned with the GPTQ quantization algorithm;
 - [ ] MoE: support MoE models and architectures.
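The hunk context above carries the project's uint8 quantization formula, x_u8 = x_fp32 / scale + zeropoint. As an illustration only, the minimal NumPy sketch below applies that mapping; the min/max-based choice of scale and zeropoint is an assumption for the example, not necessarily the project's calibration method.

```python
import numpy as np

def quantize_u8(x_fp32: np.ndarray):
    """Map float32 values to uint8 via x_u8 = x_fp32 / scale + zeropoint."""
    x_min, x_max = float(x_fp32.min()), float(x_fp32.max())
    # Assumed calibration: stretch [x_min, x_max] over the uint8 range [0, 255].
    scale = (x_max - x_min) / 255.0 if x_max > x_min else 1.0
    zeropoint = int(round(-x_min / scale))
    x_u8 = np.clip(np.round(x_fp32 / scale + zeropoint), 0, 255).astype(np.uint8)
    return x_u8, scale, zeropoint

def dequantize_u8(x_u8: np.ndarray, scale: float, zeropoint: int) -> np.ndarray:
    """Approximate inverse: x_fp32 ~ (x_u8 - zeropoint) * scale."""
    return (x_u8.astype(np.float32) - zeropoint) * scale

# Example: round-trip a small tensor and inspect the quantization error.
x = np.random.randn(8).astype(np.float32)
q, s, z = quantize_u8(x)
print(np.abs(x - dequantize_u8(q, s, z)).max())
```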
