fix: update README since we support 32k context length (#12)

yejunjin · web-flow · commit 5ebd2c01a3fa · 2024-06-06T19:04:19.000+08:00
diff --git a/README.md b/README.md
@@ -188,7 +188,7 @@ This subsection lists the third-party dependencies for the different stages of D
 # Future Plans
 
 - [ ] Accelerate attention with Flash-Attention
-- [ ] Expand context length to over 32k
+- [x] Expand context length to over 32k
 - [ ] Support 4-bit quantization
 - [ ] Support quantized models fine-tuned with GPTQ
 - [ ] Support MoE architecture
diff --git a/README_CN.md b/README_CN.md
@@ -189,7 +189,7 @@ $$ x_{u8} = x_{fp32} / scale + zeropoint $$
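 # Future Plans
 
 - [ ] First-token acceleration: add attention acceleration techniques such as a CPU implementation of Flash-Attention;
-- [ ] Context Length: expand to over 32k;
+- [x] Context Length: expand to over 32k;
 - [ ] Low-bit quantization support: support 4-bit quantization;
 - [ ] QAT quantization support: support models quantized and fine-tuned with the GPTQ algorithm;
 - [ ] MoE: support MoE models and architectures.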