
Commit 37e0cc3

Authored by patrickhyw and Ubuntu

codellama-7b working on GPUs with 24GB memory (#9)

* now only calling initialize_past_key_values once in rest_test.py
* fixed readme typos

Co-authored-by: Ubuntu <ubuntu@ip-172-31-23-53.ec2.internal>

1 parent: a5b8fed

File tree

2 files changed: +19 −9 lines

README.md

Lines changed: 3 additions & 3 deletions

@@ -73,7 +73,7 @@ python3 get_datastore_chat.py --model-path lmsys/vicuna-7b-v1.5 # get datastore_
 Build a Python code generation datastore from [The Stack](https://huggingface.co/datasets/bigcode/the-stack) within 20 minutes (requires 924MB disk storage)
 ```bash
 cd datastore
-python3 get_datastore_code.py --model-path codellama/CodeLlama-7b-instruct-hf # get datastore_code_small.idx in this folder
+python3 get_datastore_code.py --model-path codellama/CodeLlama-7b-instruct-hf # get datastore_stack_small.idx in this folder
 ```

 ### Build a large one

@@ -85,7 +85,7 @@ python3 get_datastore_chat.py --model-path lmsys/vicuna-7b-v1.5 --large-datastor
 (optionally) Build a Python code generation datastore from [The Stack](https://huggingface.co/datasets/bigcode/the-stack) (requires 27GB disk storage)
 ```bash
 cd datastore
-python3 get_datastore_code.py --model-path codellama/CodeLlama-7b-instruct-hf --large-datastore True # get datastore_code_large.idx in this folder
+python3 get_datastore_code.py --model-path codellama/CodeLlama-7b-instruct-hf --large-datastore True # get datastore_stack_large.idx in this folder
 ```

 ## Inference

@@ -99,7 +99,7 @@ RAYON_NUM_THREADS=6 CUDA_VISIBLE_DEVICES=0 python3 gen_model_answer_rest.py --mo
 ### Inference on HumanEval
 ```bash
 cd human_eval
-RAYON_NUM_THREADS=6 CUDA_VISIBLE_DEVICES=0 python3 rest_test.py --model-path codellama/CodeLlama-7b-instruct-hf --datastore-path ../datastore/datastore_code_small.idx
+RAYON_NUM_THREADS=6 CUDA_VISIBLE_DEVICES=0 python3 rest_test.py --model-path codellama/CodeLlama-7b-instruct-hf --datastore-path ../datastore/datastore_stack_small.idx
 ```

 ### Free Chat
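Before launching inference, it can help to confirm that the renamed index file is actually present. A minimal shell sketch (the path assumes the repository layout shown in the README diff above; the stand-in file is created here for demonstration only):

```shell
# Check for the renamed small datastore index before running inference
# (file name taken from this commit's README changes; run from the repo root).
DATASTORE="datastore/datastore_stack_small.idx"
mkdir -p datastore
: > "$DATASTORE"   # demo only: create an empty stand-in index file
if [ -f "$DATASTORE" ]; then
    echo "found $DATASTORE"
else
    echo "missing $DATASTORE - run get_datastore_code.py first" >&2
fi
```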

human_eval/rest_test.py

Lines changed: 16 additions & 6 deletions

@@ -30,12 +30,22 @@ def run_eval(model, tokenizer, datastore, max_token_span, num_draft, temperature
     accept_lengths_tree = []
     with torch.inference_mode():

-        past_key_values, past_key_values_data, current_length_data = initialize_past_key_values(model.base_model)
-        model.past_key_values = past_key_values
-        model.past_key_values_data = past_key_values_data
-        model.current_length_data = current_length_data
-
-        model.current_length_data.zero_() # this is for rerun
+        # Initialize the past key and value states
+        if hasattr(model, "past_key_values"):
+            past_key_values = model.past_key_values
+            past_key_values_data = model.past_key_values_data
+            current_length_data = model.current_length_data
+            # Reset the past key and value states
+            current_length_data.zero_()
+        else:
+            (
+                past_key_values,
+                past_key_values_data,
+                current_length_data,
+            ) = initialize_past_key_values(model.base_model)
+            model.past_key_values = past_key_values
+            model.past_key_values_data = past_key_values_data
+            model.current_length_data = current_length_data


         new_token = 0
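The change above replaces a fresh key-value-cache allocation on every call with an allocate-once, reset-on-rerun pattern, which is what lets CodeLlama-7b fit on a 24GB GPU. A minimal, self-contained sketch of the same control flow (`DummyModel` and `initialize_past_key_values_stub` are hypothetical stand-ins for REST's model wrapper and its `initialize_past_key_values`; only the branching logic matches the commit):

```python
# Sketch of the commit's "allocate once, reset on rerun" caching pattern.
# DummyModel and initialize_past_key_values_stub are hypothetical stand-ins.

class DummyModel:
    base_model = None  # placeholder for the underlying transformer


def initialize_past_key_values_stub(base_model):
    # Stand-in allocator: backing storage plus per-layer length counters.
    past_key_values_data = [[0.0] * 8 for _ in range(2)]
    current_length_data = [0, 0]
    past_key_values = past_key_values_data  # views into the storage
    return past_key_values, past_key_values_data, current_length_data


def setup_cache(model):
    if hasattr(model, "past_key_values"):
        # Rerun: reuse the existing buffers, just reset the length counters.
        model.current_length_data[:] = [0] * len(model.current_length_data)
    else:
        # First call: allocate the buffers once and attach them to the model.
        (
            model.past_key_values,
            model.past_key_values_data,
            model.current_length_data,
        ) = initialize_past_key_values_stub(model.base_model)
    return model.past_key_values_data


model = DummyModel()
first = setup_cache(model)
model.current_length_data[0] = 5            # simulate tokens written in a run
second = setup_cache(model)
assert first is second                       # same storage, no re-allocation
assert model.current_length_data == [0, 0]   # counters reset for the rerun
```

The `hasattr` guard is what makes repeated calls to `run_eval` cheap: the large backing tensors survive across reruns and only the length counters are zeroed.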

0 commit comments