Conversation

@juliendenize
Contributor

In PRs #14737 and #15420, we added support for converting models in our format (which introduced mistral-common as a conversion dependency) and made chat templates optional. We did this because we noticed that chat templates and custom tokenization generally do not match mistral-common, which can affect model performance.

The goal of these PRs was to give the community the freedom to get the best use out of our models with the llama.cpp engine.

However, part of this was not well received by the community, which particularly disliked having mistral-common as a hard dependency, as discussed in #16146. This PR removes the hard dependency and instead raises an error if mistral-common is not installed (see the sketch after the list below). This applies when converting Mistral models in the following cases:

  • the model conversion is done with our format
  • the model conversion is done with the transformers format, except for the tokenizers. This is what happens for our releases now, as we do not release a tokenizer config.
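
A minimal sketch of what this guard can look like (the names `_mistral_common_installed` and `require_mistral_common` are illustrative, not necessarily the identifiers used in the PR):

```python
# Sketch of the optional-dependency guard; names are illustrative.
try:
    from mistral_common.tokens.tokenizers.mistral import MistralTokenizer
    _mistral_common_installed = True
except ImportError:
    MistralTokenizer = None  # type: ignore[assignment, misc]
    _mistral_common_installed = False


def require_mistral_common() -> None:
    # Fail loudly only when a Mistral conversion path actually needs it.
    if not _mistral_common_installed:
        raise ImportError(
            "mistral-common is required to convert this model; "
            "install it with: pip install mistral-common"
        )
```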

Fix #16146

cc @CISC @ngxson, hope this matches what was discussed in #16146 and that it passes the CI, in particular the typing checks.

@juliendenize juliendenize requested a review from CISC as a code owner October 23, 2025 11:12
@github-actions github-actions bot added the python python script changes label Oct 23, 2025
@CISC
Collaborator

CISC commented Oct 23, 2025

Moving the imports won't fix the typing errors, btw. :)

You need to use `try/except ImportError` instead of importlib.

Edit: Actually, I see it's now complaining about vocab.py too, so you might have to silence the errors per line as well...
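
For context, the pattern being suggested looks roughly like this (a sketch; the exact `type: ignore` codes depend on the checker):

```python
# importlib keeps the module opaque to type checkers (everything is Any),
# and a missing package still raises at import time:
#
#   mistral_common = importlib.import_module("mistral_common")
#
# try/except ImportError keeps real types on the happy path and makes the
# missing-package case explicit:
try:
    from mistral_common.tokens.tokenizers.mistral import MistralTokenizer
except ImportError:
    MistralTokenizer = None  # type: ignore[assignment, misc]

# Lines that use the symbol unconditionally may still need a per-line
# silencer, e.g.:
# tokenizer = MistralTokenizer.v3()  # type: ignore[union-attr]
```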

@juliendenize juliendenize force-pushed the make_mistral_common_optional branch from cee35a4 to 1671309 Compare October 23, 2025 11:40
@juliendenize
Contributor Author

Yep, I tried moving things a bit and it didn't work, so I just added flags to ignore the relevant lines, if that's OK.

@CISC
Collaborator

CISC commented Oct 23, 2025

> Yep, I tried moving things a bit and it didn't work, so I just added flags to ignore the relevant lines, if that's OK.

Yep, though I think `try/except ImportError` is still preferable. Also, since we are no longer installing mistral-common, you now need to silence errors in `gguf-py/gguf/vocab.py` as well.

@LostRuins
Collaborator

Thank you for taking our concerns into account.

Collaborator

@CISC CISC left a comment


LGTM, thank you! However, I'm currently away travelling, so it would be great if @ngxson could take a final look and potentially merge.

@CISC CISC requested a review from ngxson October 23, 2025 13:46
Collaborator

@ngxson ngxson left a comment


Nice! Thanks for addressing the issue

@ngxson ngxson merged commit dd62dcf into ggml-org:master Oct 23, 2025
7 checks passed
pwilkin pushed a commit to pwilkin/llama.cpp that referenced this pull request Oct 23, 2025
* Make mistral-common dependency optional

* Fix typing
@arch-btw
Contributor

@juliendenize this broke Mistral-Small-3.2-24B-Instruct-2506:

```
INFO:gguf.vocab:Loading Mistral tokenizer from /home/arch/Mistral-Small-3.2-24B-Instruct-2506
INFO:mistral_common.tokens.tokenizers.tekken:Vocab size: 150000
INFO:mistral_common.tokens.tokenizers.tekken:Cutting vocab to first 130072 tokens.
INFO:hf-to-gguf:Converting tokenizer MistralTokenizerType.tekken of size 131072.
INFO:hf-to-gguf:Setting bos, eos, unk and pad token IDs to 1, 2, 0, 11.
WARNING:gguf.gguf_writer:Duplicated key name 'llama.vocab_size', overwriting it with new value 131072 of type UINT32
Traceback (most recent call last):
  File "/home/arch/llama.cpp/convert_hf_to_gguf.py", line 2356, in set_vocab
    self._set_vocab_sentencepiece()
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^
  File "/home/arch/llama.cpp/convert_hf_to_gguf.py", line 1229, in _set_vocab_sentencepiece
    tokens, scores, toktypes = self._create_vocab_sentencepiece()
                               ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^
  File "/home/arch/llama.cpp/convert_hf_to_gguf.py", line 1246, in _create_vocab_sentencepiece
    raise FileNotFoundError(f"File not found: {tokenizer_path}")
FileNotFoundError: File not found: /home/arch/Mistral-Small-3.2-24B-Instruct-2506/tokenizer.model

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/arch/llama.cpp/convert_hf_to_gguf.py", line 2359, in set_vocab
    self._set_vocab_llama_hf()
    ~~~~~~~~~~~~~~~~~~~~~~~~^^
  File "/home/arch/llama.cpp/convert_hf_to_gguf.py", line 1331, in _set_vocab_llama_hf
    vocab = gguf.LlamaHfVocab(self.dir_model)
  File "/home/arch/llama.cpp/gguf-py/gguf/vocab.py", line 505, in __init__
    with open(fname_tokenizer, encoding='utf-8') as f:
         ~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
FileNotFoundError: [Errno 2] No such file or directory: '/home/arch/Mistral-Small-3.2-24B-Instruct-2506/tokenizer.json'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/arch/llama.cpp/convert_hf_to_gguf.py", line 10432, in <module>
    main()
    ~~~~^^
  File "/home/arch/llama.cpp/convert_hf_to_gguf.py", line 10426, in main
    model_instance.write()
    ~~~~~~~~~~~~~~~~~~~~^^
  File "/home/arch/llama.cpp/convert_hf_to_gguf.py", line 660, in write
    self.prepare_metadata(vocab_only=False)
    ~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^
  File "/home/arch/llama.cpp/convert_hf_to_gguf.py", line 781, in prepare_metadata
    self.set_vocab()
    ~~~~~~~~~~~~~~^^
  File "/home/arch/llama.cpp/convert_hf_to_gguf.py", line 2362, in set_vocab
    self._set_vocab_gpt2()
    ~~~~~~~~~~~~~~~~~~~~^^
  File "/home/arch/llama.cpp/convert_hf_to_gguf.py", line 1165, in _set_vocab_gpt2
    tokens, toktypes, tokpre = self.get_vocab_base()
                               ~~~~~~~~~~~~~~~~~~~^^
  File "/home/arch/llama.cpp/convert_hf_to_gguf.py", line 873, in get_vocab_base
    tokenizer = AutoTokenizer.from_pretrained(self.dir_model)
  File "/home/arch/llama.cpp/venv/lib/python3.13/site-packages/transformers/models/auto/tokenization_auto.py", line 1156, in from_pretrained
    tokenizer_class_py, tokenizer_class_fast = TOKENIZER_MAPPING[type(config)]
                                               ~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^
  File "/home/arch/llama.cpp/venv/lib/python3.13/site-packages/transformers/models/auto/auto_factory.py", line 815, in __getitem__
    raise KeyError(key)
KeyError: <class 'transformers.models.mistral3.configuration_mistral3.Mistral3Config'>
```

@CISC
Collaborator

CISC commented Nov 17, 2025

@arch-btw Did you remember to use `--mistral-format`?
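
I.e. something like this (using the model path from the log above):

```
python convert_hf_to_gguf.py /home/arch/Mistral-Small-3.2-24B-Instruct-2506 --mistral-format
```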
