A lightweight, single-header C++11 Jinja2 template engine designed for LLM chat templates (HuggingFace style).
It focuses on supporting the subset of Jinja2 used by modern Large Language Models (LLMs) like Llama 3, Qwen 2.5/3, DeepSeek, and others, enabling seamless inference integration in C++ environments.
- C++11 Compatible: Ensures maximum compatibility across older compiler versions and embedded systems.
- Flexible JSON Backend: Supports both `nlohmann/json` (default) and `RapidJSON` via a unified `ujson` bridge.
- Lightweight: Minimal dependencies, with all required headers included in `third_party/`.
- LLM Focused: Native support for `messages`, `tools`, `add_generation_prompt`, and special tokens.
- Unified Context: Uses `jinja::json` (an alias to `ujson::json`) for seamless context management; see the sketch after this list.
- Custom Function Interop: Easily inject C++ functions (e.g., `strftime_now`) into templates.
- Robust: Validated against official Python `transformers` outputs using fuzzy-matching tests on 390+ cases.
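For example, context values are built through one API no matter which backend is compiled in. A minimal sketch, assuming the `ujson` bridge exposes the nlohmann-style `operator[]` and `array()` helpers used elsewhere in this README:

```cpp
#include "jinja.hpp"

// jinja::json is ujson::json underneath; whether nlohmann/json or
// RapidJSON does the actual parsing is decided at build time.
jinja::json ctx;
ctx["user"] = "Ada";
ctx["tags"] = jinja::json::array({"math", "code"});
```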
The library consists of two main headers:
- `jinja.hpp`: Core template engine.
- `third_party/ujson.hpp`: Unified JSON bridge.
Just copy `jinja.hpp` and the `third_party/` directory into your project.
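For example, an illustrative compile command assuming GCC, with both the header and the directory placed next to `main.cpp` (adjust the include flags to wherever you vendored the headers):

```bash
g++ -std=c++11 -I. -Ithird_party main.cpp -o app
```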
You can check the library version using standard macros:
#include "jinja.hpp"
#if JINJA_VERSION_MAJOR >= 0
// Use jinja.cpp features
#endifTested and verified with templates from:
- Qwen 2.5 / 3 (Coder, Math, VL, Omni, Instruct, Thinking, QwQ)
- DeepSeek (V3, R1)
- Llama 3 / 3.1 / 3.2 (Instruct & Vision)
- Mistral
- Gemma
- SmolLM
- Phi
- And more...
To build the project and its tests, you will need:

- CMake 3.10+
- C++11 compatible compiler (GCC, Clang, MSVC)
```bash
mkdir build
cd build
cmake ..
make
```

To use RapidJSON instead of nlohmann/json for better performance:

```bash
cmake .. -DUJSON_USE_RAPIDJSON=ON
```

Note: Ensure `third_party/rapidjson` is available.
The project includes a comprehensive test suite based on real-world model templates.
```bash
./test_main
```

A minimal rendering example:

```cpp
#include "jinja.hpp"
#include <iostream>

int main() {
    std::string template_str = "Hello {{ name }}!";
    jinja::Template tpl(template_str);

    jinja::json context;
    context["name"] = "World";

    std::string result = tpl.render(context);
    std::cout << result << std::endl; // Output: Hello World!
    return 0;
}
```
To render an LLM conversation, load the model's chat template and call `apply_chat_template`:

```cpp
#include "jinja.hpp"

// Load your tokenizer_config.json's "chat_template"
std::string chat_template_str = "...";
jinja::Template tpl(chat_template_str);

jinja::json messages = jinja::json::array({
    {{"role", "user"}, {"content", "Hello!"}}
});

// Apply template
std::string prompt = tpl.apply_chat_template(
    messages,
    true,                 // add_generation_prompt
    jinja::json::array()  // tools
);
```
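The `tools` argument takes an array of tool definitions. A sketch of a non-empty array, assuming the HuggingFace `transformers` JSON-schema layout and the nlohmann-style nested initializer lists shown above (which keys actually get rendered depends on the model's template):

```cpp
// Hypothetical tool definition; the chat template decides how these
// fields are serialized into the prompt.
jinja::json tools = jinja::json::array({
    {{"type", "function"},
     {"function", {
         {"name", "get_weather"},
         {"description", "Get the current weather for a city"},
         {"parameters", {
             {"type", "object"},
             {"properties", {{"city", {{"type", "string"}}}}},
             {"required", jinja::json::array({"city"})}
         }}
     }}}
});

std::string prompt_with_tools = tpl.apply_chat_template(messages, true, tools);
```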
You can register custom C++ functions to be called from within the template.

```cpp
tpl.add_function("strftime_now", [](const std::vector<jinja::json>& args) {
    // Return a current-time string (hardcoded placeholder)
    return "2025-12-16";
});
```
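In practice the callback would compute a real timestamp. A minimal sketch using only standard `<ctime>` facilities; the fixed `"%Y-%m-%d"` format is an assumption (a template may instead pass the format through `args`), and the returned `std::string` is assumed to convert to the engine's value type, as in the snippet above:

```cpp
#include <ctime>
#include <string>

tpl.add_function("strftime_now", [](const std::vector<jinja::json>& args) {
    // Format the current local time; "%Y-%m-%d" is hardcoded here
    // for illustration rather than read from args.
    std::time_t now = std::time(nullptr);
    char buf[32];
    std::strftime(buf, sizeof(buf), "%Y-%m-%d", std::localtime(&now));
    return std::string(buf);
});
```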
For implementation details, see `doc/implementation_details.md`.

Apache License 2.0. See the `LICENSE` file for details.