
Commit da1f8fe

jeejeelee authored and wenbinc-Bin committed
Modify the organization of GLM series (vllm-project#22171)
Cherry-pick: vllm-project@a7b8788

Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
1 parent 4496d68 commit da1f8fe

File tree

15 files changed: +32 additions, −32 deletions


docs/models/supported_models.md

Lines changed: 5 additions & 5 deletions
@@ -307,7 +307,7 @@ Specified using `--task generate`.
 | `BambaForCausalLM` | Bamba | `ibm-ai-platform/Bamba-9B-fp8`, `ibm-ai-platform/Bamba-9B` | ✅︎ | ✅︎ |
 | `BloomForCausalLM` | BLOOM, BLOOMZ, BLOOMChat | `bigscience/bloom`, `bigscience/bloomz`, etc. | | ✅︎ |
 | `BartForConditionalGeneration` | BART | `facebook/bart-base`, `facebook/bart-large-cnn`, etc. | | |
-| `ChatGLMModel`, `ChatGLMForConditionalGeneration` | ChatGLM | `THUDM/chatglm2-6b`, `THUDM/chatglm3-6b`, `ShieldLM-6B-chatglm3`, etc. | ✅︎ | ✅︎ |
+| `ChatGLMModel`, `ChatGLMForConditionalGeneration` | ChatGLM | `zai-org/chatglm2-6b`, `zai-org/chatglm3-6b`, `ShieldLM-6B-chatglm3`, etc. | ✅︎ | ✅︎ |
 | `CohereForCausalLM`, `Cohere2ForCausalLM` | Command-R | `CohereForAI/c4ai-command-r-v01`, `CohereForAI/c4ai-command-r7b-12-2024`, etc. | ✅︎ | ✅︎ |
 | `DbrxForCausalLM` | DBRX | `databricks/dbrx-base`, `databricks/dbrx-instruct`, etc. | | ✅︎ |
 | `DeciLMForCausalLM` | DeciLM | `nvidia/Llama-3_3-Nemotron-Super-49B-v1`, etc. | ✅︎ | ✅︎ |

@@ -321,8 +321,8 @@ Specified using `--task generate`.
 | `GemmaForCausalLM` | Gemma | `google/gemma-2b`, `google/gemma-1.1-2b-it`, etc. | ✅︎ | ✅︎ |
 | `Gemma2ForCausalLM` | Gemma 2 | `google/gemma-2-9b`, `google/gemma-2-27b`, etc. | ✅︎ | ✅︎ |
 | `Gemma3ForCausalLM` | Gemma 3 | `google/gemma-3-1b-it`, etc. | ✅︎ | ✅︎ |
-| `GlmForCausalLM` | GLM-4 | `THUDM/glm-4-9b-chat-hf`, etc. | ✅︎ | ✅︎ |
-| `Glm4ForCausalLM` | GLM-4-0414 | `THUDM/GLM-4-32B-0414`, etc. | ✅︎ | ✅︎ |
+| `GlmForCausalLM` | GLM-4 | `zai-org/glm-4-9b-chat-hf`, etc. | ✅︎ | ✅︎ |
+| `Glm4ForCausalLM` | GLM-4-0414 | `zai-org/GLM-4-32B-0414`, etc. | ✅︎ | ✅︎ |
 | `GPT2LMHeadModel` | GPT-2 | `gpt2`, `gpt2-xl`, etc. | | ✅︎ |
 | `GPTBigCodeForCausalLM` | StarCoder, SantaCoder, WizardCoder | `bigcode/starcoder`, `bigcode/gpt_bigcode-santacoder`, `WizardLM/WizardCoder-15B-V1.0`, etc. | ✅︎ | ✅︎ |
 | `GPTJForCausalLM` | GPT-J | `EleutherAI/gpt-j-6b`, `nomic-ai/gpt4all-j`, etc. | | ✅︎ |

@@ -521,8 +521,8 @@ Specified using `--task generate`.
 | `Florence2ForConditionalGeneration` | Florence-2 | T + I | `microsoft/Florence-2-base`, `microsoft/Florence-2-large` etc. | | | |
 | `FuyuForCausalLM` | Fuyu | T + I | `adept/fuyu-8b` etc. | | ✅︎ | ✅︎ |
 | `Gemma3ForConditionalGeneration` | Gemma 3 | T + I<sup>+</sup> | `google/gemma-3-4b-it`, `google/gemma-3-27b-it`, etc. | ✅︎ | ✅︎ | ⚠️ |
-| `GLM4VForCausalLM`<sup>^</sup> | GLM-4V | T + I | `THUDM/glm-4v-9b`, `THUDM/cogagent-9b-20241220` etc. | ✅︎ | ✅︎ | ✅︎ |
-| `Glm4vForConditionalGeneration` | GLM-4.1V-Thinking | T + I<sup>E+</sup> + V<sup>E+</sup> | `THUDM/GLM-4.1V-9B-Thinkg`, etc. | ✅︎ | ✅︎ | ✅︎ |
+| `GLM4VForCausalLM`<sup>^</sup> | GLM-4V | T + I | `zai-org/glm-4v-9b`, `zai-org/cogagent-9b-20241220` etc. | ✅︎ | ✅︎ | ✅︎ |
+| `Glm4vForConditionalGeneration` | GLM-4.1V-Thinking | T + I<sup>E+</sup> + V<sup>E+</sup> | `zai-org/GLM-4.1V-9B-Thinkg`, etc. | ✅︎ | ✅︎ | ✅︎ |
 | `Glm4MoeForCausalLM` | GLM-4.5 | T + I<sup>E+</sup> + V<sup>E+</sup> | `zai-org/GLM-4.5`, etc. | ✅︎ | ✅︎ | ✅︎ |
 | `Glm4v_moeForConditionalGeneration` | GLM-4.5V | T + I<sup>E+</sup> + V<sup>E+</sup> | `zai-org/GLM-4.5V-Air`, etc. | ✅︎ | ✅︎ | ✅︎ |
 | `GraniteSpeechForConditionalGeneration` | Granite Speech | T + A | `ibm-granite/granite-speech-3.3-8b` | ✅︎ | ✅︎ | ✅︎ |
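Only the Hugging Face organization prefix changes here (THUDM → zai-org); architectures and feature columns are untouched. As a hedged sanity check, a minimal sketch of loading one renamed entry with vLLM (`trust_remote_code=True` is an assumption for ChatGLM's custom modeling code, not stated in this diff):

```python
# Minimal sketch: generate with a GLM checkpoint under its new repo ID.
from vllm import LLM, SamplingParams

llm = LLM(model="zai-org/chatglm3-6b", trust_remote_code=True)
params = SamplingParams(temperature=0.8, max_tokens=64)
out = llm.generate(["What is the capital of France?"], params)
print(out[0].outputs[0].text)
```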

examples/offline_inference/vision_language.py

Lines changed: 2 additions & 2 deletions
@@ -221,7 +221,7 @@ def run_gemma3(questions: list[str], modality: str) -> ModelRequestData:
 # GLM-4v
 def run_glm4v(questions: list[str], modality: str) -> ModelRequestData:
     assert modality == "image"
-    model_name = "THUDM/glm-4v-9b"
+    model_name = "zai-org/glm-4v-9b"
 
     engine_args = EngineArgs(
         model=model_name,

@@ -250,7 +250,7 @@ def run_glm4v(questions: list[str], modality: str) -> ModelRequestData:
 
 # GLM-4.1V
 def run_glm4_1v(questions: list[str], modality: str) -> ModelRequestData:
-    model_name = "THUDM/GLM-4.1V-9B-Thinking"
+    model_name = "zai-org/GLM-4.1V-9B-Thinking"
 
     engine_args = EngineArgs(
         model=model_name,
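For context, a hedged sketch of how these helpers' `EngineArgs` are typically consumed in this example script (the `max_model_len` value and `trust_remote_code` flag are illustrative assumptions, not taken from the diff):

```python
# Illustrative sketch: turning EngineArgs into an LLM, as the example
# script does with each helper's ModelRequestData.
from dataclasses import asdict

from vllm import LLM, EngineArgs

engine_args = EngineArgs(
    model="zai-org/glm-4v-9b",  # renamed repo ID from this commit
    trust_remote_code=True,     # assumption: GLM-4V ships custom code
    max_model_len=2048,         # assumption: small demo context
)
llm = LLM(**asdict(engine_args))
```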

tests/distributed/test_pipeline_parallel.py

Lines changed: 2 additions & 2 deletions
@@ -153,7 +153,7 @@ def iter_params(self, model_id: str):
     "baichuan-inc/Baichuan-7B": PPTestSettings.fast(),
     "baichuan-inc/Baichuan2-13B-Chat": PPTestSettings.fast(),
     "bigscience/bloomz-1b1": PPTestSettings.fast(),
-    "THUDM/chatglm3-6b": PPTestSettings.fast(),
+    "zai-org/chatglm3-6b": PPTestSettings.fast(),
     "CohereForAI/c4ai-command-r-v01": PPTestSettings.fast(load_format="dummy"),
     "databricks/dbrx-instruct": PPTestSettings.fast(load_format="dummy"),
     "Deci/DeciLM-7B-instruct": PPTestSettings.fast(),

@@ -220,7 +220,7 @@ def iter_params(self, model_id: str):
     "Salesforce/blip2-opt-6.7b": PPTestSettings.fast(),
     "facebook/chameleon-7b": PPTestSettings.fast(),
     "adept/fuyu-8b": PPTestSettings.fast(),
-    "THUDM/glm-4v-9b": PPTestSettings.fast(),
+    "zai-org/glm-4v-9b": PPTestSettings.fast(),
     "OpenGVLab/InternVL2-1B": PPTestSettings.fast(),
     "llava-hf/llava-1.5-7b-hf": PPTestSettings.fast(),
     "llava-hf/llava-v1.6-mistral-7b-hf": PPTestSettings.fast(),

tests/lora/test_add_lora.py

Lines changed: 1 addition & 1 deletion
@@ -14,7 +14,7 @@
 from vllm.sampling_params import SamplingParams
 from vllm.utils import merge_async_iterators
 
-MODEL_PATH = "THUDM/chatglm3-6b"
+MODEL_PATH = "zai-org/chatglm3-6b"
 LORA_RANK = 64
 DEFAULT_MAX_LORAS = 4 * 3
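A hedged sketch of how constants like these are typically consumed when standing up a LoRA-enabled engine (the adapter name and path are hypothetical placeholders, not taken from the test):

```python
# Sketch under assumptions: serving zai-org/chatglm3-6b with LoRA enabled.
# The adapter name and path below are hypothetical placeholders.
from vllm import LLM, SamplingParams
from vllm.lora.request import LoRARequest

llm = LLM(
    model="zai-org/chatglm3-6b",
    trust_remote_code=True,
    enable_lora=True,
    max_lora_rank=64,  # matches LORA_RANK above
    max_loras=12,      # matches DEFAULT_MAX_LORAS = 4 * 3
)
outputs = llm.generate(
    ["SELECT count(*) FROM singer"],
    SamplingParams(max_tokens=32),
    lora_request=LoRARequest("sql_adapter", 1, "/path/to/chatglm3_adapter"),
)
```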

tests/lora/test_chatglm3_tp.py

Lines changed: 1 addition & 1 deletion
@@ -6,7 +6,7 @@
 
 from ..utils import create_new_process_for_each_test, multi_gpu_test
 
-MODEL_PATH = "THUDM/chatglm3-6b"
+MODEL_PATH = "zai-org/chatglm3-6b"
 
 PROMPT_TEMPLATE = """I want you to act as a SQL terminal in front of an example database, you need only to return the sql command to me.Below is an instruction that describes a task, Write a response that appropriately completes the request.\n"\n##Instruction:\nconcert_singer contains tables such as stadium, singer, concert, singer_in_concert. Table stadium has columns such as Stadium_ID, Location, Name, Capacity, Highest, Lowest, Average. Stadium_ID is the primary key.\nTable singer has columns such as Singer_ID, Name, Country, Song_Name, Song_release_year, Age, Is_male. Singer_ID is the primary key.\nTable concert has columns such as concert_ID, concert_Name, Theme, Stadium_ID, Year. concert_ID is the primary key.\nTable singer_in_concert has columns such as concert_ID, Singer_ID. concert_ID is the primary key.\nThe Stadium_ID of concert is the foreign key of Stadium_ID of stadium.\nThe Singer_ID of singer_in_concert is the foreign key of Singer_ID of singer.\nThe concert_ID of singer_in_concert is the foreign key of concert_ID of concert.\n\n###Input:\n{query}\n\n###Response:"""  # noqa: E501

tests/models/language/generation/test_common.py

Lines changed: 1 addition & 1 deletion
@@ -53,7 +53,7 @@
         marks=[pytest.mark.core_model, pytest.mark.cpu_model],
     ),
     pytest.param(
-        "THUDM/chatglm3-6b",  # chatglm (text-only)
+        "zai-org/chatglm3-6b",  # chatglm (text-only)
     ),
     pytest.param(
         "meta-llama/Llama-3.2-1B-Instruct",  # llama

tests/models/multimodal/generation/test_common.py

Lines changed: 3 additions & 3 deletions
@@ -290,7 +290,7 @@
         num_logprobs=10,
     ),
     "glm4v": VLMTestInfo(
-        models=["THUDM/glm-4v-9b"],
+        models=["zai-org/glm-4v-9b"],
         test_type=VLMTestType.IMAGE,
         prompt_formatter=lambda img_prompt: f"<|user|>\n{img_prompt}<|assistant|>",  # noqa: E501
         single_image_prompts=IMAGE_ASSETS.prompts({

@@ -309,7 +309,7 @@
         marks=[large_gpu_mark(min_gb=32)],
     ),
     "glm4_1v": VLMTestInfo(
-        models=["THUDM/GLM-4.1V-9B-Thinking"],
+        models=["zai-org/GLM-4.1V-9B-Thinking"],
         test_type=(VLMTestType.IMAGE, VLMTestType.MULTI_IMAGE),
         prompt_formatter=lambda img_prompt: f"<|user|>\n{img_prompt}<|assistant|>",  # noqa: E501
         img_idx_to_prompt=lambda idx: "<|begin_of_image|><|image|><|end_of_image|>",  # noqa: E501

@@ -322,7 +322,7 @@
         auto_cls=AutoModelForImageTextToText,
     ),
     "glm4_1v-video": VLMTestInfo(
-        models=["THUDM/GLM-4.1V-9B-Thinking"],
+        models=["zai-org/GLM-4.1V-9B-Thinking"],
         # GLM4.1V require include video metadata for input
         test_type=VLMTestType.CUSTOM_INPUTS,
         max_model_len=4096,

tests/models/multimodal/processing/test_common.py

Lines changed: 2 additions & 2 deletions
@@ -267,8 +267,8 @@ def _test_processing_correctness_one(
     "microsoft/Florence-2-base",
     "adept/fuyu-8b",
     "google/gemma-3-4b-it",
-    "THUDM/glm-4v-9b",
-    "THUDM/GLM-4.1V-9B-Thinking",
+    "zai-org/glm-4v-9b",
+    "zai-org/GLM-4.1V-9B-Thinking",
     "ibm-granite/granite-speech-3.3-2b",
     "h2oai/h2ovl-mississippi-800m",
     "OpenGVLab/InternVL2-1B",

tests/models/registry.py

Lines changed: 7 additions & 7 deletions
@@ -139,7 +139,7 @@ def check_available_online(
                                       extras={"tiny": "hmellor/tiny-random-BambaForCausalLM"}),  # noqa: E501
     "BloomForCausalLM": _HfExamplesInfo("bigscience/bloom-560m",
                                         {"1b": "bigscience/bloomz-1b1"}),
-    "ChatGLMModel": _HfExamplesInfo("THUDM/chatglm3-6b",
+    "ChatGLMModel": _HfExamplesInfo("zai-org/chatglm3-6b",
                                     trust_remote_code=True,
                                     max_transformers_version="4.48"),
     "ChatGLMForConditionalGeneration": _HfExamplesInfo("thu-coai/ShieldLM-6B-chatglm3",  # noqa: E501

@@ -164,8 +164,10 @@ def check_available_online(
     "GemmaForCausalLM": _HfExamplesInfo("google/gemma-1.1-2b-it"),
     "Gemma2ForCausalLM": _HfExamplesInfo("google/gemma-2-9b"),
     "Gemma3ForCausalLM": _HfExamplesInfo("google/gemma-3-1b-it"),
-    "GlmForCausalLM": _HfExamplesInfo("THUDM/glm-4-9b-chat-hf"),
-    "Glm4ForCausalLM": _HfExamplesInfo("THUDM/GLM-4-9B-0414"),
+    "GlmForCausalLM": _HfExamplesInfo("zai-org/glm-4-9b-chat-hf"),
+    "Glm4ForCausalLM": _HfExamplesInfo("zai-org/GLM-4-9B-0414"),
+    "Glm4MoeForCausalLM": _HfExamplesInfo("zai-org/GLM-4.5",
+                                          min_transformers_version="4.54"),  # noqa: E501
     "GPT2LMHeadModel": _HfExamplesInfo("openai-community/gpt2",
                                        {"alias": "gpt2"}),
     "GPTBigCodeForCausalLM": _HfExamplesInfo("bigcode/starcoder",

@@ -319,12 +321,10 @@ def check_available_online(
     "FuyuForCausalLM": _HfExamplesInfo("adept/fuyu-8b"),
     "Gemma3ForConditionalGeneration": _HfExamplesInfo("google/gemma-3-4b-it"),
     "GraniteSpeechForConditionalGeneration": _HfExamplesInfo("ibm-granite/granite-speech-3.3-2b"),  # noqa: E501
-    "GLM4VForCausalLM": _HfExamplesInfo("THUDM/glm-4v-9b",
+    "GLM4VForCausalLM": _HfExamplesInfo("zai-org/glm-4v-9b",
                                         trust_remote_code=True,
                                         hf_overrides={"architectures": ["GLM4VForCausalLM"]}),  # noqa: E501
-    "Glm4vForConditionalGeneration": _HfExamplesInfo("THUDM/GLM-4.1V-9B-Thinking"),  # noqa: E501
-    "Glm4MoeForCausalLM": _HfExamplesInfo("zai-org/GLM-4.5",
-                                          min_transformers_version="4.54"),  # noqa: E501
+    "Glm4vForConditionalGeneration": _HfExamplesInfo("zai-org/GLM-4.1V-9B-Thinking"),  # noqa: E501
     "Glm4v_moeForConditionalGeneration": _HfExamplesInfo("zai-org/GLM-4.5V-Air",
                                                          is_available_online=False),  # noqa: E501
     "H2OVLChatModel": _HfExamplesInfo("h2oai/h2ovl-mississippi-800m",

tests/tokenization/test_cached_tokenizer.py

Lines changed: 1 addition & 1 deletion
@@ -10,7 +10,7 @@
                                                  get_cached_tokenizer)
 
 
-@pytest.mark.parametrize("model_id", ["gpt2", "THUDM/chatglm3-6b"])
+@pytest.mark.parametrize("model_id", ["gpt2", "zai-org/chatglm3-6b"])
 def test_cached_tokenizer(model_id: str):
     reference_tokenizer = AutoTokenizer.from_pretrained(model_id,
                                                         trust_remote_code=True)
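For orientation, a hedged sketch of the wrapper this test exercises, pointed at the renamed repo (the vocab-size check stands in for the test's actual assertions):

```python
# Sketch under assumptions: wrap a HF tokenizer with vLLM's cached variant.
# zai-org/chatglm3-6b ships custom tokenizer code, hence trust_remote_code.
from transformers import AutoTokenizer
from vllm.transformers_utils.tokenizer import get_cached_tokenizer

reference = AutoTokenizer.from_pretrained("zai-org/chatglm3-6b",
                                          trust_remote_code=True)
cached = get_cached_tokenizer(reference)
# The cached wrapper should report the same vocabulary as the reference.
assert cached.vocab_size == reference.vocab_size
```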
