
Commit 95c9b04: Merge branch 'main' into integrations/wan2.2-s2v
2 parents: cfddf35 + a1f36ee

154 files changed (+7404, -3363 lines)


docs/source/en/_toctree.yml

Lines changed: 6 additions & 0 deletions
@@ -375,6 +375,8 @@
   title: MochiTransformer3DModel
 - local: api/models/omnigen_transformer
   title: OmniGenTransformer2DModel
+- local: api/models/ovisimage_transformer2d
+  title: OvisImageTransformer2DModel
 - local: api/models/pixart_transformer2d
   title: PixArtTransformer2DModel
 - local: api/models/prior_transformer
@@ -567,6 +569,8 @@
   title: MultiDiffusion
 - local: api/pipelines/omnigen
   title: OmniGen
+- local: api/pipelines/ovis_image
+  title: Ovis-Image
 - local: api/pipelines/pag
   title: PAG
 - local: api/pipelines/paint_by_example
@@ -660,6 +664,8 @@
   title: HunyuanVideo1.5
 - local: api/pipelines/i2vgenxl
   title: I2VGen-XL
+- local: api/pipelines/kandinsky5_image
+  title: Kandinsky 5.0 Image
 - local: api/pipelines/kandinsky5_video
   title: Kandinsky 5.0 Video
 - local: api/pipelines/latte

docs/source/en/api/loaders/lora.md

Lines changed: 5 additions & 0 deletions
@@ -31,6 +31,7 @@ LoRA is a fast and lightweight training method that inserts and trains a signifi
 - [`AmusedLoraLoaderMixin`] is for the [`AmusedPipeline`].
 - [`HiDreamImageLoraLoaderMixin`] provides similar functions for [HiDream Image](https://huggingface.co/docs/diffusers/main/en/api/pipelines/hidream)
 - [`QwenImageLoraLoaderMixin`] provides similar functions for [Qwen Image](https://huggingface.co/docs/diffusers/main/en/api/pipelines/qwen).
+- [`ZImageLoraLoaderMixin`] provides similar functions for [Z-Image](https://huggingface.co/docs/diffusers/main/en/api/pipelines/zimage).
 - [`Flux2LoraLoaderMixin`] provides similar functions for [Flux2](https://huggingface.co/docs/diffusers/main/en/api/pipelines/flux2).
 - [`LoraBaseMixin`] provides a base class with several utility methods to fuse, unfuse, unload, LoRAs and more.

@@ -112,6 +113,10 @@ LoRA is a fast and lightweight training method that inserts and trains a signifi

 [[autodoc]] loaders.lora_pipeline.QwenImageLoraLoaderMixin

+## ZImageLoraLoaderMixin
+
+[[autodoc]] loaders.lora_pipeline.ZImageLoraLoaderMixin
+
 ## KandinskyLoraLoaderMixin
 [[autodoc]] loaders.lora_pipeline.KandinskyLoraLoaderMixin

docs/source/en/api/models/ovisimage_transformer2d.md (new file)

Lines changed: 24 additions & 0 deletions

<!-- Copyright 2025 The HuggingFace Team. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License. -->

# OvisImageTransformer2DModel

The model can be loaded with the following code snippet.

```python
import torch

from diffusers import OvisImageTransformer2DModel

transformer = OvisImageTransformer2DModel.from_pretrained("AIDC-AI/Ovis-Image-7B", subfolder="transformer", torch_dtype=torch.bfloat16)
```

## OvisImageTransformer2DModel

[[autodoc]] OvisImageTransformer2DModel

docs/source/en/api/pipelines/bria_fibo.md

Lines changed: 6 additions & 6 deletions
@@ -21,9 +21,10 @@ With only 8 billion parameters, FIBO provides a new level of image quality, prom
 FIBO is trained exclusively on a structured prompt and will not work with freeform text prompts.
 you can use the [FIBO-VLM-prompt-to-JSON](https://huggingface.co/briaai/FIBO-VLM-prompt-to-JSON) model or the [FIBO-gemini-prompt-to-JSON](https://huggingface.co/briaai/FIBO-gemini-prompt-to-JSON) to convert your freeform text prompt to a structured JSON prompt.

-its not recommended to use freeform text prompts directly with FIBO, as it will not produce the best results.
+> [!NOTE]
+> Avoid using freeform text prompts directly with FIBO because it does not produce the best results.

-you can learn more about FIBO in [Bria Fibo Hugging Face page](https://huggingface.co/briaai/FIBO).
+Refer to the Bria Fibo Hugging Face [page](https://huggingface.co/briaai/FIBO) to learn more.


 ## Usage
@@ -37,9 +38,8 @@ hf auth login
 ```


-## BriaPipeline
+## BriaFiboPipeline

-[[autodoc]] BriaPipeline
+[[autodoc]] BriaFiboPipeline
 - all
-- __call__
-
+- __call__

docs/source/en/api/pipelines/flux2.md

Lines changed: 6 additions & 0 deletions
@@ -26,6 +26,12 @@ Original model checkpoints for Flux can be found [here](https://huggingface.co/b
 >
 > [Caching](../../optimization/cache) may also speed up inference by storing and reusing intermediate outputs.

+## Caption upsampling
+
+Flux.2 can potentially generate better outputs from better prompts. We can "upsample"
+an input prompt by setting the `caption_upsample_temperature` argument in the pipeline call arguments.
+The [official implementation](https://github.com/black-forest-labs/flux2/blob/5a5d316b1b42f6b59a8c9194b77c8256be848432/src/flux2/text_encoder.py#L140) recommends setting this value to 0.15.
+
 ## Flux2Pipeline

 [[autodoc]] Flux2Pipeline
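Since `caption_upsample_temperature` is a sampling temperature for decoding the expanded prompt, a low value like the recommended 0.15 keeps the upsampled caption close to the model's most likely wording. A standalone sketch of what a sampling temperature does (an illustration only, not the Flux.2 or diffusers implementation):

```python
import math
import random

def sample_with_temperature(logits, temperature, rng):
    # Scale logits by 1/temperature and sample from the resulting softmax.
    # A low temperature sharpens the distribution toward the top logit.
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    r = rng.random() * sum(exps)
    acc = 0.0
    for i, e in enumerate(exps):
        acc += e
        if r <= acc:
            return i
    return len(exps) - 1

rng = random.Random(0)
logits = [2.0, 1.0, 0.1]  # toy token scores

# At the recommended 0.15, sampling is nearly deterministic.
cold = [sample_with_temperature(logits, 0.15, rng) for _ in range(100)]
# At a high temperature, the choice spreads across tokens.
hot = [sample_with_temperature(logits, 2.0, rng) for _ in range(100)]
print(cold.count(0), hot.count(0))
```

In the pipeline itself this reduces to passing the argument in the call, per the paragraph above: `pipe(prompt, caption_upsample_temperature=0.15, ...)`.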

docs/source/en/api/pipelines/hunyuan_video15.md

Lines changed: 2 additions & 2 deletions
@@ -56,8 +56,8 @@ export_to_video(video, "output.mp4", fps=15)

 - HunyuanVideo1.5 uses attention masks with variable-length sequences. For best performance, we recommend using an attention backend that handles padding efficiently.

-  - **H100/H800:** `_flash_3_hub` or `_flash_varlen_3`
-  - **A100/A800/RTX 4090:** `flash_hub` or `flash_varlen`
+  - **H100/H800:** `_flash_3_hub` or `_flash_3_varlen_hub`
+  - **A100/A800/RTX 4090:** `flash_hub` or `flash_varlen_hub`
   - **Other GPUs:** `sage_hub`

 Refer to the [Attention backends](../../optimization/attention_backends) guide for more details about using a different backend.
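The corrected recommendations above can be folded into a small lookup helper for picking a default backend per GPU. A hedged sketch: the backend strings come from the list above, but the helper name and the substring matching are illustrative assumptions, not diffusers API:

```python
def recommended_attention_backends(gpu_name: str) -> list[str]:
    # Map a GPU model name to the backends suggested in the docs above.
    # Hypothetical helper; matching on substrings of torch.cuda.get_device_name().
    name = gpu_name.upper()
    if any(chip in name for chip in ("H100", "H800")):
        return ["_flash_3_hub", "_flash_3_varlen_hub"]
    if any(chip in name for chip in ("A100", "A800", "4090")):
        return ["flash_hub", "flash_varlen_hub"]
    return ["sage_hub"]

print(recommended_attention_backends("NVIDIA H100 80GB HBM3"))
```

The chosen string would then be passed to the attention-backend mechanism described in the guide linked above.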
docs/source/en/api/pipelines/kandinsky5_image.md (new file)

Lines changed: 112 additions & 0 deletions

<!--Copyright 2025 The HuggingFace Team and Kandinsky Lab Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->

# Kandinsky 5.0 Image

[Kandinsky 5.0](https://arxiv.org/abs/2511.14993) is a family of diffusion models for Video & Image generation.

Kandinsky 5.0 Image Lite is a lightweight image generation model (6B parameters).

The model introduces several key innovations:
- **Latent diffusion pipeline** with **Flow Matching** for improved training stability
- **Diffusion Transformer (DiT)** as the main generative backbone with cross-attention to text embeddings
- Dual text encoding using **Qwen2.5-VL** and **CLIP** for comprehensive text understanding
- **Flux VAE** for efficient image encoding and decoding

The original codebase can be found at [kandinskylab/Kandinsky-5](https://github.com/kandinskylab/Kandinsky-5).

## Available Models

Kandinsky 5.0 Image Lite:

| model_id | Description | Use Cases |
|----------|-------------|-----------|
| [**kandinskylab/Kandinsky-5.0-T2I-Lite-sft-Diffusers**](https://huggingface.co/kandinskylab/Kandinsky-5.0-T2I-Lite-sft-Diffusers) | 6B image Supervised Fine-Tuned model | Highest generation quality |
| [**kandinskylab/Kandinsky-5.0-I2I-Lite-sft-Diffusers**](https://huggingface.co/kandinskylab/Kandinsky-5.0-I2I-Lite-sft-Diffusers) | 6B image editing Supervised Fine-Tuned model | Highest generation quality |
| [**kandinskylab/Kandinsky-5.0-T2I-Lite-pretrain-Diffusers**](https://huggingface.co/kandinskylab/Kandinsky-5.0-T2I-Lite-pretrain-Diffusers) | 6B image Base pretrained model | Research and fine-tuning |
| [**kandinskylab/Kandinsky-5.0-I2I-Lite-pretrain-Diffusers**](https://huggingface.co/kandinskylab/Kandinsky-5.0-I2I-Lite-pretrain-Diffusers) | 6B image editing Base pretrained model | Research and fine-tuning |

## Usage Examples

### Basic Text-to-Image Generation

```python
import torch

from diffusers import Kandinsky5T2IPipeline

# Load the pipeline
model_id = "kandinskylab/Kandinsky-5.0-T2I-Lite-sft-Diffusers"
pipe = Kandinsky5T2IPipeline.from_pretrained(model_id)
pipe.to(device="cuda", dtype=torch.bfloat16)

# Generate an image
prompt = "A fluffy, expressive cat wearing a bright red hat with a soft, slightly textured fabric. The hat should look cozy and well-fitted on the cat’s head. On the front of the hat, add clean, bold white text that reads “SWEET”, clearly visible and neatly centered. Ensure the overall lighting highlights the hat’s color and the cat’s fur details."

output = pipe(
    prompt=prompt,
    negative_prompt="",
    height=1024,
    width=1024,
    num_inference_steps=50,
    guidance_scale=3.5,
).image[0]
```

### Basic Image-to-Image Generation

```python
import torch

from diffusers import Kandinsky5I2IPipeline
from diffusers.utils import load_image

# Load the pipeline
model_id = "kandinskylab/Kandinsky-5.0-I2I-Lite-sft-Diffusers"
pipe = Kandinsky5I2IPipeline.from_pretrained(model_id)
pipe.to(device="cuda", dtype=torch.bfloat16)
pipe.enable_model_cpu_offload()  # enable CPU offloading for single-GPU inference

# Edit the input image
image = load_image(
    "https://huggingface.co/kandinsky-community/kandinsky-3/resolve/main/assets/title.jpg?download=true"
)

prompt = "Change the background from a winter night scene to a bright summer day. Place the character on a sandy beach with clear blue sky, soft sunlight, and gentle waves in the distance. Replace the winter clothing with a light short-sleeved T-shirt (in soft pastel colors) and casual shorts. Ensure the character’s fur reflects warm daylight instead of cold winter tones. Add small beach details such as seashells, footprints in the sand, and a few scattered beach toys nearby. Keep the oranges in the scene, but place them naturally on the sand."
negative_prompt = ""

output = pipe(
    image=image,
    prompt=prompt,
    negative_prompt=negative_prompt,
    guidance_scale=3.5,
).image[0]
```

## Kandinsky5T2IPipeline

[[autodoc]] Kandinsky5T2IPipeline
- all
- __call__

## Kandinsky5I2IPipeline

[[autodoc]] Kandinsky5I2IPipeline
- all
- __call__

## Citation

```bibtex
@misc{kandinsky2025,
  author = {Alexander Belykh and Alexander Varlamov and Alexey Letunovskiy and Anastasia Aliaskina and Anastasia Maltseva and Anastasiia Kargapoltseva and Andrey Shutkin and Anna Averchenkova and Anna Dmitrienko and Bulat Akhmatov and Denis Dimitrov and Denis Koposov and Denis Parkhomenko and Dmitrii and Ilya Vasiliev and Ivan Kirillov and Julia Agafonova and Kirill Chernyshev and Kormilitsyn Semen and Lev Novitskiy and Maria Kovaleva and Mikhail Mamaev and Mikhailov and Nikita Kiselev and Nikita Osterov and Nikolai Gerasimenko and Nikolai Vaulin and Olga Kim and Olga Vdovchenko and Polina Gavrilova and Polina Mikhailova and Tatiana Nikulina and Viacheslav Vasilev and Vladimir Arkhipkin and Vladimir Korviakov and Vladimir Polovnikov and Yury Kolabushin},
  title = {Kandinsky 5.0: A family of diffusion models for Video & Image generation},
  howpublished = {\url{https://github.com/kandinskylab/Kandinsky-5}},
  year = 2025
}
```
