Commit 783ea0f
[rebase]Dev-ucm-v1 rebase to develop (#453)
* [opt] refactor uc connector (#364)
refactor ucm_connector
* [Feat] Implement kv cache broadcast in MLA (#367)
* [Feat] Implement kv cache broadcast in MLA in ucm_connector
* [Style] Change wait for broadcast into single task method
* [feature] add ucm mock connector (#375)
* add ucm mock connector
* fix chunk prefill bug
* [Feat] Support get launch config from yaml (#377)
* [Feat] Support launch from config file
* [Docs] Update documents for launch with yaml
* [Fix] Change load only on first rank into configuration
* [Feat] Add support for hit ratio in yaml
* [Fix] Fix load only first rank in non mla scene
* [fix] refuse monkey patch (#383)
refuse monkey patch
* [bugfix] fix gqa bug (#384)
fix gqa bug
* [bugfix] end == 0 bug (#385)
fix end == 0 bug
* [feature] optimize generate_tensor (#396)
optimize generate_tensor
* [Fix] fix mla bug when no broadcast in wait for save (#398)
* [feat]adapt GQA & modify config.yaml (#407)
* adapt GQA & modify config.yaml
* move process to UCMDirectConnector
* fix comment
* modify hash function
* fix style
* code style and modify hash
* init parent_block_hash_value
* [feat]Adapt vllm_ascend_0110 and Add configurable options (#415)
* Adapt vllm_ascend_0110 and Add configurable options
* avoid type conversion in init kvcache
* [patch]separate sparse patch (#417)
separate sparse patch
Co-authored-by: lijiachen19 <lijiachen19@huawei.com>
* [bugfix]Support tensor parallelism across servers (#420)
Support tensor parallelism across servers
* [Feat] UCM supports metrics display online via Grafana and Prometheus (#414)
* [Feat] Build metrics frame
* [Feat]add metrics(ucm_obser.py + metrics_configs.yaml)
* [Feat] Implementation of metrics logger on the C++ side for storing and retrieving stats
* [Fix] Provide simple grafana and fix bugs
* [feat] change the log position of UCM metrics
* [fix]modify grafana.json
* [Feat] UCM supports metrics display online via Grafana and Prometheus
* [Fix] Remove configs to examples and add license
---------
Co-authored-by: flesher0813 <1208954694@qq.com>
Co-authored-by: hero <tianxuehan@huawei.com>
* [feat]Merge develop to dev-ucm-v1 and fix code style (#428)
* [fix] fix sparse attention (#397)
fix ascend attention
Co-authored-by: lijiachen19 <lijiachen19@huawei.com>
* [opt] Share Infra implementation and unify status codes (#399)
share infra module
Co-authored-by: Fang Run <Fang_Run@126.com>
* [bugfix] Fix ESA to be compatible with the latest NFSStore. (#401)
fix esa to adapt latest NFSStore
* release v0.1.0rc4 (#402)
Co-authored-by: lijiachen19 <lijiachen19@huawei.com>
* [opt] Remove unused cc impl of dramstore (#406)
remove unused cc impl of dramstore
* [Fix]remove dram docs and modify quick-start doc (#411)
* [Fix]remove dram docs and modify quick-start doc
* modify index.md
---------
Co-authored-by: t00939662 <tianxuehan@huawei.com>
* [Feature] Added performance testing tool based on the PyTest testing framework (#295)
Performance testing tool based on the PyTest testing framework.
* [Misc] Add cpp-linter.yml (#422)
* [docs]add metrics doc (#416)
* [docs]add metrics doc
* modify metrics.md
* modify metrics.md
---------
Co-authored-by: t00939662 <tianxuehan@huawei.com>
* [perf] Modify CUDA SIMD and add Triton hash encoder (#408)
* fix cpp code style
---------
Co-authored-by: Lijiachen1018 <30387633+Lijiachen1018@users.noreply.github.com>
Co-authored-by: lijiachen19 <lijiachen19@huawei.com>
Co-authored-by: Mag1c.H <hemajun815@163.com>
Co-authored-by: Fang Run <Fang_Run@126.com>
Co-authored-by: MaxWang <wangwenxin21@huawei.com>
Co-authored-by: hero0307 <tianxuehan0307@163.com>
Co-authored-by: t00939662 <tianxuehan@huawei.com>
Co-authored-by: ML <85485147+Menglths@users.noreply.github.com>
Co-authored-by: ShiXiaolei <indirashi@163.com>
* add env variable ENABLE_SPARSE (#430)
Co-authored-by: lijiachen19 <lijiachen19@huawei.com>
* Fix(patch): fix patch for vllm-ascend (#433)
Fix(patch): fix patch for vllm-ascend volcengine/verl#2564
Co-authored-by: lijiachen19 <lijiachen19@huawei.com>
* [bugfix] fix accuracy problem when chunked prefill (#438)
* fix accuracy problem when chunked prefill
* [bugfix]fix num_schedule-tokens=1 (#442)
* fix num_schedule-tokens=1
* Simplify the code
* [fix]: Fix sparse patch (#444)
Fix sparse patch
Co-authored-by: lijiachen <lijiachen19@huawei.com>
* [bugfix] The Metrics module uses a non-existent variable self.rank (#445)
* [Feature]Add an access bandwidth test script for ucm_connector (#418)
* Add an access bandwidth test script for 'ucm_connector'
* [bugfix]adapt vllm0.9.1 (#446)
adapt vllm0.9.1
* [Fix]Set the multiprocessing start method of the test tool to 'spawn'. (#447)
Set the multiprocessing start method of the test tool to 'spawn' and add NPU cleanup
* [fix] Adapt all sparse-attention methods to the new connector. (#441)
* sparse to adapt new connector
* Adapt the YAML configuration
* [docs] renew docs for v1 (#448)
renew docs for v1
Co-authored-by: lijiachen19 <lijiachen19@huawei.com>
* set version to 0.1.0 (#450)
* [Feature] GSA adapt nfsStore (#451)
* adapt nfsstore
* fix codestyle
---------
Co-authored-by: ygwpz <543529648@qq.com>
Co-authored-by: harrisonyhq <harrisonyhq@gmail.com>
Co-authored-by: qyh111 <qiuyuhao1@huawei.com>
Co-authored-by: lijiachen19 <lijiachen19@huawei.com>
Co-authored-by: sumingZero <58885253+sumingZero@users.noreply.github.com>
Co-authored-by: flesher0813 <1208954694@qq.com>
Co-authored-by: Mag1c.H <hemajun815@163.com>
Co-authored-by: Fang Run <Fang_Run@126.com>
Co-authored-by: MaxWang <wangwenxin21@huawei.com>
Co-authored-by: hero0307 <tianxuehan0307@163.com>
Co-authored-by: t00939662 <tianxuehan@huawei.com>
Co-authored-by: ML <85485147+Menglths@users.noreply.github.com>
Co-authored-by: ShiXiaolei <indirashi@163.com>
Co-authored-by: zhou-haitao <74044944+zhou-haitao@users.noreply.github.com>
Co-authored-by: zbb200819 <1130072360@qq.com>
1 parent 52fe5a7 commit 783ea0f
File tree
60 files changed, +5226 -2598 lines changed
- docs/source
- getting-started
- user-guide
- prefix-cache
- sparse-attention
- examples
- metrics
- test
- ucm
- integration/vllm
- patch
- patch_funcs
- v091
- v092
- sparse
- esa
- retrieval/cpy
- gsa
- offload_ops/src
- prefetch
- include
- src
- kvcomp
- hash_retrieval/cpy
- kvstar
- retrieve/core/domain/retrieve_task
- store
- dramstore
- nfsstore
- cc/domain/hotness
- pcstore
- test/e2e