Commit e2f15ef
Add support for R-4B multimodal model
This commit adds support for the R-4B model (YannQi/R-4B), a multimodal
large language model with auto-thinking capabilities.
Changes:
- convert_hf_to_gguf.py: Added RVisionModel and RTextModel classes to handle
  the R model architecture (RForConditionalGeneration)
  - RVisionModel uses LFM2 projector type with scale_factor=1 (no patch merging)
  - RTextModel extends Qwen3Model for the language component
  - Proper tensor name mapping for the projector (pre_norm, linear_1, linear_2)
- tools/mtmd/clip.cpp: Modified build_patch_merge_permute() to support
  scale_factor=1, which skips patch merging for models that don't need it
  - R model uses SigLIP vision encoder with 729 tokens (27x27 patches)
  - Projector: LayerNorm → Linear → GELU → Linear (no patch downsampling)
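
To illustrate why scale_factor=1 amounts to "no patch merging", here is a minimal Python sketch of block-wise patch merging. It is not the code in clip.cpp: `merge_patches` and its arguments are hypothetical names, and the sketch assumes a row-major token grid and simply rejects grids that do not divide evenly.

```python
def merge_patches(tokens, grid_w, grid_h, scale_factor):
    """Concatenate each scale_factor x scale_factor block of patch embeddings.

    tokens: list of grid_h * grid_w feature vectors in row-major order.
    Hypothetical helper for illustration; not the clip.cpp implementation.
    """
    if scale_factor == 1:
        # Nothing to merge: every patch stays its own token.
        return [list(t) for t in tokens]
    if grid_w % scale_factor or grid_h % scale_factor:
        # Simplification: real implementations may pad instead of failing.
        raise ValueError("grid must be divisible by scale_factor")
    merged = []
    for by in range(0, grid_h, scale_factor):
        for bx in range(0, grid_w, scale_factor):
            block = []
            for dy in range(scale_factor):
                for dx in range(scale_factor):
                    block.extend(tokens[(by + dy) * grid_w + (bx + dx)])
            merged.append(block)
    return merged


# 27 x 27 grid of 4-dimensional dummy embeddings (real embeddings are wider).
grid = 27
tokens = [[float(i)] * 4 for i in range(grid * grid)]
kept = merge_patches(tokens, grid, grid, scale_factor=1)
print(len(kept), len(kept[0]))  # 729 4
```

With scale_factor=2 on a 4x4 grid the same function would return 4 tokens, each 4 times as wide; with scale_factor=1 both the token count and the feature width are unchanged.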
Architecture:
- Base text model: Qwen3-4B
- Vision encoder: SigLIP (384x384, patch size 14)
- Projector: 2-layer MLP with pre-normalization (no patch merging)
- Feature selection: full (keeps all 729 vision tokens)
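
The numbers above can be checked with a short Python sketch: 384 // 14 = 27 patches per side, so 27 * 27 = 729 tokens, each passed through LayerNorm → Linear → GELU → Linear. This is only an illustration under stated assumptions: the function names are hypothetical, the LayerNorm omits its learned scale and bias, the exact erf form of GELU is assumed, and the toy weights stand in for the real projector tensors.

```python
import math


def vision_token_count(image_size, patch_size):
    """Number of patch tokens for a square image; partial patches are dropped."""
    side = image_size // patch_size  # 384 // 14 = 27
    return side * side


def layer_norm(x, eps=1e-6):
    """Normalize one feature vector (learned scale and bias omitted)."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]


def linear(x, weight, bias):
    """Dense layer; weight holds one row per output feature."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weight, bias)]


def gelu(v):
    """Exact (erf-based) GELU; the variant used by the model is an assumption."""
    return 0.5 * v * (1.0 + math.erf(v / math.sqrt(2.0)))


def project(x, w1, b1, w2, b2):
    """Pre-norm 2-layer MLP applied to a single vision token."""
    hidden = [gelu(v) for v in linear(layer_norm(x), w1, b1)]
    return linear(hidden, w2, b2)


print(vision_token_count(384, 14))  # 729

# Toy sizes: 3 vision features -> 2 hidden -> 2 output features.
w1, b1 = [[1.0, 0.0, -1.0], [0.5, 0.5, 0.5]], [0.0, 0.0]
w2, b2 = [[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0]
out = project([1.0, 2.0, 3.0], w1, b1, w2, b2)
print(len(out))  # 2
```

Because the projector acts on each token independently and nothing is merged, the number of tokens entering and leaving it is the same.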
Tested with llama-mtmd-cli: the model generates English responses with
Chinese internal reasoning inside <think> tags.
Parent: 817d743
2 files changed: 71 additions, 1 deletion
[Diff content not preserved. File 1: line 4205 modified, new lines 4289-4353 added. File 2: new lines 2447-2451 added.]