
Conversation

@linoytsaban
Collaborator

@linoytsaban linoytsaban commented Aug 5, 2025

Wan 2.2 has two transformers. The community has found it beneficial to load Wan LoRAs into both transformers, occasionally at different scales as well (this also applies to Wan 2.1 LoRAs, loaded into both transformer and transformer_2).
Recently, a new lightning LoRA was released for Wan2.2 T2V, with separate weights for transformer (high-noise stage) and transformer_2 (low-noise stage).

This PR adds support for LoRA loading into transformer_2 and adds support for the lightning LoRA (which has alpha keys).
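For context, a minimal sketch of how alpha keys factor into a LoRA update in general terms (illustrative values; this is not the PR's actual conversion code):

import torch

# with alpha keys, the effective update is scaled by alpha / rank instead of assuming alpha == rank
rank, alpha = 64, 8.0                # illustrative values, not taken from the lightning checkpoint
lora_A = torch.randn(rank, 5120)     # down-projection
lora_B = torch.randn(5120, rank)     # up-projection
delta_W = (alpha / rank) * (lora_B @ lora_A)   # added on top of the base layer's weight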

T2V example:

import torch
from diffusers import WanPipeline, AutoencoderKLWan
from diffusers.utils import export_to_video

dtype = torch.bfloat16
device = "cuda"
vae = AutoencoderKLWan.from_pretrained("Wan-AI/Wan2.2-T2V-A14B-Diffusers", subfolder="vae", torch_dtype=torch.float32)
pipe = WanPipeline.from_pretrained("Wan-AI/Wan2.2-T2V-A14B-Diffusers", vae=vae, torch_dtype=dtype)
pipe.to(device)

pipe.load_lora_weights(
    "Kijai/WanVideo_comfy",
    weight_name="Wan22-Lightning/Wan2.2-Lightning_T2V-A14B-4steps-lora_HIGH_fp16.safetensors",
    adapter_name="lightning"
)
# load the LOW-noise weights into transformer_2 under a separate adapter name
pipe.load_lora_weights(
    "Kijai/WanVideo_comfy",
    weight_name="Wan22-Lightning/Wan2.2-Lightning_T2V-A14B-4steps-lora_LOW_fp16.safetensors",
    adapter_name="lightning_2",
    load_into_transformer_2=True,
)
pipe.set_adapters(["lightning", "lightning_2"], adapter_weights=[1., 1.])

height = 480
width = 832

prompt = "Two anthropomorphic cats in comfy boxing gear and bright gloves fight intensely on a spotlighted stage."
negative_prompt = "色调艳丽,过曝,静态,细节模糊不清,字幕,风格,作品,画作,画面,静止,整体发灰,最差质量,低质量,JPEG压缩残留,丑陋的,残缺的,多余的手指,画得不好的手部,画得不好的脸部,畸形的,毁容的,形态畸形的肢体,手指融合,静止不动的画面,杂乱的背景,三条腿,背景人很多,倒着走"
output = pipe(
    prompt=prompt,
    negative_prompt=negative_prompt,
    height=height,
    width=width,
    num_frames=81,
    guidance_scale=1.0,
    guidance_scale_2=1.0,
    num_inference_steps=4,
    generator=torch.manual_seed(0),
).frames[0]
export_to_video(output, "t2v_out.mp4", fps=16)
t2v_out-5.mp4

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@luke14free

Curious to see an example, @linoytsaban, would love to try this out.

Member

@sayakpaul sayakpaul left a comment


Thanks for working on this. Left some comments.

"""

-_lora_loadable_modules = ["transformer"]
+_lora_loadable_modules = ["transformer", "transformer_2"]
Member


Just to note that this loader is shared amongst Wan 2.1 and 2.2 as the pipelines are also one and the same. For Wan 2.1, we won't have any transformer_2.
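
For illustration, a minimal sketch (with a placeholder LoRA repo id) of how a user could guard this on Wan 2.1 pipelines, where transformer_2 is not available:

import torch
from diffusers import WanPipeline

pipe = WanPipeline.from_pretrained("Wan-AI/Wan2.2-T2V-A14B-Diffusers", torch_dtype=torch.bfloat16)

# only request loading into transformer_2 when the pipeline actually has one (Wan 2.1 doesn't)
pipe.load_lora_weights("user/some-wan-lora", adapter_name="my_lora")  # placeholder repo id
if getattr(pipe, "transformer_2", None) is not None:
    pipe.load_lora_weights(
        "user/some-wan-lora",  # placeholder repo id
        adapter_name="my_lora_2",
        load_into_transformer_2=True,
    )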

Comment on lines 5283 to 5293
else:
    self.load_lora_into_transformer(
        state_dict,
        transformer=getattr(self, self.transformer_name) if not hasattr(self, "transformer") else self.transformer,
        adapter_name=adapter_name,
        metadata=metadata,
        _pipeline=self,
        low_cpu_mem_usage=low_cpu_mem_usage,
        hotswap=hotswap,
    )
Member


Why put it under else?

Collaborator Author


My thought process was that, unlike LoRAs with weights for, say, both the transformer and the text encoder, which we load in a single load_lora_weights op, here we can have different weights for each transformer while the state_dict keys are identical. Also, this way we can load the LoRA into each transformer separately with different adapter names, making it easy to use a different scale for each transformer's LoRA (which was seen to be beneficial for quality). I'm happy to improve this logic, but these are the considerations to keep in mind.

Member


Yeah. So, in case users want to load into both transformers, won't it just load into one if load_into_transformer_2=True?

Collaborator Author


Yep, it would; they would need to load into each one separately.

Member


Can you show some pseudo-code of what's expected from users? This is another way of loading another adapter into transformer_2:
#12040 (comment)


Collaborator Author


I don't feel strongly about it staying exactly that way, but I do think it should remain possible to load different LoRA weights into the two transformers and at different scales.
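
Rough sketch of the expected usage from the user side (repo, file, and adapter names below are placeholders):

# load one set of weights into the high-noise transformer and another into transformer_2
pipe.load_lora_weights("repo-id", weight_name="high_noise_lora.safetensors", adapter_name="high")  # placeholder
pipe.load_lora_weights(
    "repo-id", weight_name="low_noise_lora.safetensors",  # placeholder
    adapter_name="low", load_into_transformer_2=True,
)
# because the adapters live in different transformers, each can get its own scale
pipe.set_adapters(["high", "low"], adapter_weights=[1.0, 0.7])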

Member


Makes sense. Let's go with this but with a note in the docstrings saying it's experimental in nature.

@linoytsaban
Collaborator Author

I2V example: using Wan2.2 with Wan2.1 lightning LoRA

import torch
import numpy as np
from diffusers import WanImageToVideoPipeline
from diffusers.utils import export_to_video, load_image

model_id = "Wan-AI/Wan2.2-I2V-A14B-Diffusers"
dtype = torch.bfloat16
device = "cuda"

pipe = WanImageToVideoPipeline.from_pretrained(model_id, torch_dtype=dtype)
pipe.to(device)


pipe.load_lora_weights(
    "Kijai/WanVideo_comfy",
    weight_name="Lightx2v/lightx2v_I2V_14B_480p_cfg_step_distill_rank128_bf16.safetensors",
    adapter_name="lightning"
)
# load the same Wan2.1 LoRA into transformer_2 (low-noise stage) under a separate adapter name
pipe.load_lora_weights(
    "Kijai/WanVideo_comfy",
    weight_name="Lightx2v/lightx2v_I2V_14B_480p_cfg_step_distill_rank128_bf16.safetensors",
    adapter_name="lightning_2",
    load_into_transformer_2=True,
)
pipe.set_adapters(["lightning", "lightning_2"], adapter_weights=[1., 1.])
# fuse each adapter into its transformer with a different scale, then unload the LoRA weights
pipe.fuse_lora(adapter_names=["lightning"], lora_scale=3., components=["transformer"])
pipe.fuse_lora(adapter_names=["lightning_2"], lora_scale=1., components=["transformer_2"])
pipe.unload_lora_weights()

image = load_image(
    "https://huggingface.co/datasets/YiYiXu/testing-images/resolve/main/wan_i2v_input.JPG"
)
max_area = 480 * 832
aspect_ratio = image.height / image.width
mod_value = pipe.vae_scale_factor_spatial * pipe.transformer.config.patch_size[1]
height = round(np.sqrt(max_area * aspect_ratio)) // mod_value * mod_value
width = round(np.sqrt(max_area / aspect_ratio)) // mod_value * mod_value
image = image.resize((width, height))
prompt = "POV selfie video, white cat with sunglasses standing on surfboard, relaxed smile, tropical beach behind (clear water, green hills, blue sky with clouds). Surfboard tips, cat falls into ocean, camera plunges underwater with bubbles and sunlight beams. Brief underwater view of cat’s face, then cat resurfaces, still filming selfie, playful summer vacation mood."

negative_prompt = "色调艳丽,过曝,静态,细节模糊不清,字幕,风格,作品,画作,画面,静止,整体发灰,最差质量,低质量,JPEG压缩残留,丑陋的,残缺的,多余的手指,画得不好的手部,画得不好的脸部,畸形的,毁容的,形态畸形的肢体,手指融合,静止不动的画面,杂乱的背景,三条腿,背景人很多,倒着走"
generator = torch.Generator(device=device).manual_seed(42)
output = pipe(
    image=image,
    prompt=prompt,
    negative_prompt=negative_prompt,
    height=height,
    width=width,
    num_frames=81,
    guidance_scale=1,
    num_inference_steps=4,
    generator=generator,
).frames[0]
export_to_video(output, "i2v_output.mp4", fps=16)
i2v_output-84.mp4

@luke14free

Thanks a lot for the amazing work @linoytsaban. Just FYI, issue #12047 also applies to this PR: I tried it and I get the mismatch error with GGUF models. Reporting since they are the most popular way to run Wan on consumer hardware.

@mayankagrawal10198

mayankagrawal10198 commented Aug 6, 2025

@linoytsaban are we sure that if we don't pass the boundary_ratio arg to our generation pipe, it would still choose transformer_2 for the low-noise stage? Because I can see the first PR on Wan2.2, #12004 by @yiyixuxu, has these lines:

if self.config.boundary_ratio is not None:
    boundary_timestep = self.config.boundary_ratio * self.scheduler.config.num_train_timesteps
else:
    boundary_timestep = None

with self.progress_bar(total=num_inference_steps) as progress_bar:
    for i, t in enumerate(timesteps):
        if self.interrupt:
            continue

        self._current_timestep = t

        if boundary_timestep is None or t >= boundary_timestep:
            # wan2.1 or high-noise stage in wan2.2
            current_model = self.transformer
            current_guidance_scale = guidance_scale
        else:
            # low-noise stage in wan2.2
            current_model = self.transformer_2
            current_guidance_scale = guidance_scale_2

@linoytsaban
Collaborator Author

@linoytsaban are we sure that if we don't pass the boundary_ratio arg to our generation pipe, it would still choose transformer_2 for the low-noise stage? (code quoted above)

Yes @mayankagrawal10198, it should still use transformer_2 for the low-noise stage, since the default config sets the boundary ratio to 0.9 for I2V and 0.875 for T2V (https://huggingface.co/Wan-AI/Wan2.2-I2V-A14B-Diffusers/blob/main/model_index.json, https://huggingface.co/Wan-AI/Wan2.2-T2V-A14B-Diffusers/blob/main/model_index.json). You can pass it explicitly to the pipeline if you wish to experiment with different values.
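
To make the switch concrete, a small sketch of the arithmetic, mirroring the loop quoted above with the default T2V values (the timesteps below are illustrative, not an actual schedule):

num_train_timesteps = 1000                 # scheduler default
boundary_ratio = 0.875                     # Wan2.2 T2V default from model_index.json
boundary_timestep = boundary_ratio * num_train_timesteps   # 875.0

for t in [1000, 900, 800, 500]:            # illustrative timestep values
    model = "transformer" if t >= boundary_timestep else "transformer_2"
    print(t, "->", model)                  # 1000, 900 -> transformer; 800, 500 -> transformer_2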

@linoytsaban
Collaborator Author

@bot /style

@github-actions
Contributor

github-actions bot commented Aug 7, 2025

Style fix is beginning .... View the workflow run here.

@linoytsaban
Collaborator Author

@bot /style

@github-actions
Contributor

github-actions bot commented Aug 11, 2025

Style bot fixed some files and pushed the changes.

@linoytsaban linoytsaban requested a review from sayakpaul August 13, 2025 13:47
@innokria

Hey guys, this is amazing work. There is now a new concept to do this in 3 stages.

3-stage approach: the first stage uses the original Wan2.2 model, without the Lightx2v LoRA. This allows faster motions to be generated. The 2nd and 3rd stages use the high and low Lightx2v LoRAs as normal.

I will do some experiments on this :)

Member

@sayakpaul sayakpaul left a comment


Left some comments.

Let's also make a note about it in the docs?

@asomoza
Member

asomoza commented Aug 18, 2025

Hi @mayankagrawal10198, you can refer to this PR for the boundary explanation; it has some nice images explaining it. Also, the steps can't be floats, so they get rounded. This is the same for every other start and end we have in the library.

If you want more details, maybe you can open a question in the Discussions tab so we can keep this PR clean, with comments about it rather than more general, unrelated questions.

Member

@sayakpaul sayakpaul left a comment


Let's go!

@sayakpaul sayakpaul merged commit 8d1de40 into huggingface:main Aug 19, 2025
31 checks passed
@linoytsaban
Collaborator Author

Thanks a lot @sayakpaul!

@s3nh

s3nh commented Aug 19, 2025

I assume it's not compatible with the .gguf version right now? Awesome work, thanks.

@BXset

BXset commented Sep 12, 2025

We merged the Wan2.2 diffusers base model and the Wan2.2-Lightning model into a new model: https://huggingface.co/FastDM/Wan2.2-T2V-A14B-Merge-Lightning-V1.0-Diffusers
You're welcome to use our inference project to get more of a speed-up.

@Passenger12138

Thank you very much for your work. When I was using Wan2.2 with the Wan2.1 lightx2v LoRA and the code below to load the LoRA, I encountered this error:

Loading adapter weights from state_dict led to unexpected keys found in the model: condition_embedder.image_embedder.ff.net.0.proj.lora_A.lightx2v_t1.weight, condition_embedder.image_embedder.ff.net.0.proj.lora_B.lightx2v_t1.weight, condition_embedder.image_embedder.ff.net.0.proj.lora_B.lightx2v_t1.bias, condition_embedder.image_embedder.ff.net.2.lora_A.lightx2v_t1.weight, condition_embedder.image_embedder.ff.net.2.lora_B.lightx2v_t1.weight, condition_embedder.image_embedder.ff.net.2.lora_B.lightx2v_t1.bias, blocks.0.attn2.add_k_proj.lora_A.lightx2v_t1.weight, blocks.0.attn2.add_k_proj.lora_B.lightx2v_t1.weight, blocks.0.attn2.add_k_proj.lora_B.lightx2v_t1.bias, blocks.0.attn2.add_v_proj.lora_A.lightx2v_t1.weight, blocks.0.attn2.add_v_proj.lora_B.lightx2v_t1.weight, blocks.0.attn2.add_v_proj.lora_B.lightx2v_t1.bias, blocks.1.attn2.add_k_proj.lora_A.lightx2v_t1.weight, blocks.1.attn2.add_k_proj.lora_B.lightx2v_t1.weight, blocks.1.attn2.add_k_proj.lora_B.lightx2v_t1.bias, blocks.1.attn2.add_v_proj.lora_A.lightx2v_t1.weight, blocks.1.attn2.add_v_proj.lora_B.lightx2v_t1.weight, blocks.1.attn2.add_v_proj.lora_B.lightx2v_t1.bias, blocks.2.attn2.add_k_proj.lora_A.lightx2v_t1.weight, blocks.2.attn2.add_k_proj.lora_B.lightx2v_t1.weight, blocks.2.attn2.add_k_proj.lora_B.lightx2v_t1.bias, blocks.2.attn2.add_v_proj.lora_A.lightx2v_t1.weight, blocks.2.attn2.add_v_proj.lora_B.lightx2v_t1.weight, blocks.2.attn2.add_v_proj.lora_B.lightx2v_t1.bias, blocks.3.attn2.add_k_proj.lora_A.lightx2v_t1.weight, blocks.3.attn2.add_k_proj.lora_B.lightx2v_t1.weight, blocks.3.attn2.add_k_proj.lora_B.lightx2v_t1.bias, blocks.3.attn2.add_v_proj.lora_A.lightx2v_t1.weight, blocks.3.attn2.add_v_proj.lora_B.lightx2v_t1.weight, blocks.3.attn2.add_v_proj.lora_B.lightx2v_t1.bias, blocks.4.attn2.add_k_proj.lora_A.lightx2v_t1.weight, blocks.4.attn2.add_k_proj.lora_B.lightx2v_t1.weight, blocks.4.attn2.add_k_proj.lora_B.lightx2v_t1.bias, blocks.4.attn2.add_v_proj.lora_A.lightx2v_t1.weight, blocks.4.attn2.add_v_proj.lora_B.lightx2v_t1.weight, blocks.4.attn2.add_v_proj.lora_B.lightx2v_t1.bias, blocks.5.attn2.add_k_proj.lora_A.lightx2v_t1.weight, blocks.5.attn2.add_k_proj.lora_B.lightx2v_t1.weight, blocks.5.attn2.add_k_proj.lora_B.lightx2v_t1.bias, blocks.5.attn2.add_v_proj.lora_A.lightx2v_t1.weight, blocks.5.attn2.add_v_proj.lora_B.lightx2v_t1.weight, blocks.5.attn2.add_v_proj.lora_B.lightx2v_t1.bias, blocks.6.attn2.add_k_proj.lora_A.lightx2v_t1.weight, blocks.6.attn2.add_k_proj.lora_B.lightx2v_t1.weight, blocks.6.attn2.add_k_proj.lora_B.lightx2v_t1.bias, blocks.6.attn2.add_v_proj.lora_A.lightx2v_t1.weight, blocks.6.attn2.add_v_proj.lora_B.lightx2v_t1.weight, blocks.6.attn2.add_v_proj.lora_B.lightx2v_t1.bias, blocks.7.attn2.add_k_proj.lora_A.lightx2v_t1.weight, blocks.7.attn2.add_k_proj.lora_B.lightx2v_t1.weight, blocks.7.attn2.add_k_proj.lora_B.lightx2v_t1.bias, blocks.7.attn2.add_v_proj.lora_A.lightx2v_t1.weight, blocks.7.attn2.add_v_proj.lora_B.lightx2v_t1.weight, blocks.7.attn2.add_v_proj.lora_B.lightx2v_t1.bias, blocks.8.attn2.add_k_proj.lora_A.lightx2v_t1.weight, blocks.8.attn2.add_k_proj.lora_B.lightx2v_t1.weight, blocks.8.attn2.add_k_proj.lora_B.lightx2v_t1.bias, blocks.8.attn2.add_v_proj.lora_A.lightx2v_t1.weight, blocks.8.attn2.add_v_proj.lora_B.lightx2v_t1.weight, blocks.8.attn2.add_v_proj.lora_B.lightx2v_t1.bias, blocks.9.attn2.add_k_proj.lora_A.lightx2v_t1.weight, blocks.9.attn2.add_k_proj.lora_B.lightx2v_t1.weight, blocks.9.attn2.add_k_proj.lora_B.lightx2v_t1.bias, blocks.9.attn2.add_v_proj.lora_A.lightx2v_t1.weight, 
blocks.9.attn2.add_v_proj.lora_B.lightx2v_t1.weight, blocks.9.attn2.add_v_proj.lora_B.lightx2v_t1.bias, blocks.10.attn2.add_k_proj.lora_A.lightx2v_t1.weight, blocks.10.attn2.add_k_proj.lora_B.lightx2v_t1.weight, blocks.10.attn2.add_k_proj.lora_B.lightx2v_t1.bias, blocks.10.attn2.add_v_proj.lora_A.lightx2v_t1.weight, blocks.10.attn2.add_v_proj.lora_B.lightx2v_t1.weight, blocks.10.attn2.add_v_proj.lora_B.lightx2v_t1.bias, blocks.11.attn2.add_k_proj.lora_A.lightx2v_t1.weight, blocks.11.attn2.add_k_proj.lora_B.lightx2v_t1.weight, blocks.11.attn2.add_k_proj.lora_B.lightx2v_t1.bias, blocks.11.attn2.add_v_proj.lora_A.lightx2v_t1.weight, blocks.11.attn2.add_v_proj.lora_B.lightx2v_t1.weight, blocks.11.attn2.add_v_proj.lora_B.lightx2v_t1.bias, blocks.12.attn2.add_k_proj.lora_A.lightx2v_t1.weight, blocks.12.attn2.add_k_proj.lora_B.lightx2v_t1.weight, blocks.12.attn2.add_k_proj.lora_B.lightx2v_t1.bias, blocks.12.attn2.add_v_proj.lora_A.lightx2v_t1.weight, blocks.12.attn2.add_v_proj.lora_B.lightx2v_t1.weight, blocks.12.attn2.add_v_proj.lora_B.lightx2v_t1.bias, blocks.13.attn2.add_k_proj.lora_A.lightx2v_t1.weight, blocks.13.attn2.add_k_proj.lora_B.lightx2v_t1.weight, blocks.13.attn2.add_k_proj.lora_B.lightx2v_t1.bias, blocks.13.attn2.add_v_proj.lora_A.lightx2v_t1.weight, blocks.13.attn2.add_v_proj.lora_B.lightx2v_t1.weight, blocks.13.attn2.add_v_proj.lora_B.lightx2v_t1.bias, blocks.14.attn2.add_k_proj.lora_A.lightx2v_t1.weight, blocks.14.attn2.add_k_proj.lora_B.lightx2v_t1.weight, blocks.14.attn2.add_k_proj.lora_B.lightx2v_t1.bias, blocks.14.attn2.add_v_proj.lora_A.lightx2v_t1.weight, blocks.14.attn2.add_v_proj.lora_B.lightx2v_t1.weight, blocks.14.attn2.add_v_proj.lora_B.lightx2v_t1.bias, blocks.15.attn2.add_k_proj.lora_A.lightx2v_t1.weight, blocks.15.attn2.add_k_proj.lora_B.lightx2v_t1.weight, blocks.15.attn2.add_k_proj.lora_B.lightx2v_t1.bias, blocks.15.attn2.add_v_proj.lora_A.lightx2v_t1.weight, blocks.15.attn2.add_v_proj.lora_B.lightx2v_t1.weight, blocks.15.attn2.add_v_proj.lora_B.lightx2v_t1.bias, blocks.16.attn2.add_k_proj.lora_A.lightx2v_t1.weight, blocks.16.attn2.add_k_proj.lora_B.lightx2v_t1.weight, blocks.16.attn2.add_k_proj.lora_B.lightx2v_t1.bias, blocks.16.attn2.add_v_proj.lora_A.lightx2v_t1.weight, blocks.16.attn2.add_v_proj.lora_B.lightx2v_t1.weight, blocks.16.attn2.add_v_proj.lora_B.lightx2v_t1.bias, blocks.17.attn2.add_k_proj.lora_A.lightx2v_t1.weight, blocks.17.attn2.add_k_proj.lora_B.lightx2v_t1.weight, blocks.17.attn2.add_k_proj.lora_B.lightx2v_t1.bias, blocks.17.attn2.add_v_proj.lora_A.lightx2v_t1.weight, blocks.17.attn2.add_v_proj.lora_B.lightx2v_t1.weight, blocks.17.attn2.add_v_proj.lora_B.lightx2v_t1.bias, blocks.18.attn2.add_k_proj.lora_A.lightx2v_t1.weight, blocks.18.attn2.add_k_proj.lora_B.lightx2v_t1.weight, blocks.18.attn2.add_k_proj.lora_B.lightx2v_t1.bias, blocks.18.attn2.add_v_proj.lora_A.lightx2v_t1.weight, blocks.18.attn2.add_v_proj.lora_B.lightx2v_t1.weight, blocks.18.attn2.add_v_proj.lora_B.lightx2v_t1.bias, blocks.19.attn2.add_k_proj.lora_A.lightx2v_t1.weight, blocks.19.attn2.add_k_proj.lora_B.lightx2v_t1.weight, blocks.19.attn2.add_k_proj.lora_B.lightx2v_t1.bias, blocks.19.attn2.add_v_proj.lora_A.lightx2v_t1.weight, blocks.19.attn2.add_v_proj.lora_B.lightx2v_t1.weight, blocks.19.attn2.add_v_proj.lora_B.lightx2v_t1.bias, blocks.20.attn2.add_k_proj.lora_A.lightx2v_t1.weight, blocks.20.attn2.add_k_proj.lora_B.lightx2v_t1.weight, blocks.20.attn2.add_k_proj.lora_B.lightx2v_t1.bias, blocks.20.attn2.add_v_proj.lora_A.lightx2v_t1.weight, 
blocks.20.attn2.add_v_proj.lora_B.lightx2v_t1.weight, blocks.20.attn2.add_v_proj.lora_B.lightx2v_t1.bias, blocks.21.attn2.add_k_proj.lora_A.lightx2v_t1.weight, blocks.21.attn2.add_k_proj.lora_B.lightx2v_t1.weight, blocks.21.attn2.add_k_proj.lora_B.lightx2v_t1.bias, blocks.21.attn2.add_v_proj.lora_A.lightx2v_t1.weight, blocks.21.attn2.add_v_proj.lora_B.lightx2v_t1.weight, blocks.21.attn2.add_v_proj.lora_B.lightx2v_t1.bias, blocks.22.attn2.add_k_proj.lora_A.lightx2v_t1.weight, blocks.22.attn2.add_k_proj.lora_B.lightx2v_t1.weight, blocks.22.attn2.add_k_proj.lora_B.lightx2v_t1.bias, blocks.22.attn2.add_v_proj.lora_A.lightx2v_t1.weight, blocks.22.attn2.add_v_proj.lora_B.lightx2v_t1.weight, blocks.22.attn2.add_v_proj.lora_B.lightx2v_t1.bias, blocks.23.attn2.add_k_proj.lora_A.lightx2v_t1.weight, blocks.23.attn2.add_k_proj.lora_B.lightx2v_t1.weight, blocks.23.attn2.add_k_proj.lora_B.lightx2v_t1.bias, blocks.23.attn2.add_v_proj.lora_A.lightx2v_t1.weight, blocks.23.attn2.add_v_proj.lora_B.lightx2v_t1.weight, blocks.23.attn2.add_v_proj.lora_B.lightx2v_t1.bias, blocks.24.attn2.add_k_proj.lora_A.lightx2v_t1.weight, blocks.24.attn2.add_k_proj.lora_B.lightx2v_t1.weight, blocks.24.attn2.add_k_proj.lora_B.lightx2v_t1.bias, blocks.24.attn2.add_v_proj.lora_A.lightx2v_t1.weight, blocks.24.attn2.add_v_proj.lora_B.lightx2v_t1.weight, blocks.24.attn2.add_v_proj.lora_B.lightx2v_t1.bias, blocks.25.attn2.add_k_proj.lora_A.lightx2v_t1.weight, blocks.25.attn2.add_k_proj.lora_B.lightx2v_t1.weight, blocks.25.attn2.add_k_proj.lora_B.lightx2v_t1.bias, blocks.25.attn2.add_v_proj.lora_A.lightx2v_t1.weight, blocks.25.attn2.add_v_proj.lora_B.lightx2v_t1.weight, blocks.25.attn2.add_v_proj.lora_B.lightx2v_t1.bias, blocks.26.attn2.add_k_proj.lora_A.lightx2v_t1.weight, blocks.26.attn2.add_k_proj.lora_B.lightx2v_t1.weight, blocks.26.attn2.add_k_proj.lora_B.lightx2v_t1.bias, blocks.26.attn2.add_v_proj.lora_A.lightx2v_t1.weight, blocks.26.attn2.add_v_proj.lora_B.lightx2v_t1.weight, blocks.26.attn2.add_v_proj.lora_B.lightx2v_t1.bias, blocks.27.attn2.add_k_proj.lora_A.lightx2v_t1.weight, blocks.27.attn2.add_k_proj.lora_B.lightx2v_t1.weight, blocks.27.attn2.add_k_proj.lora_B.lightx2v_t1.bias, blocks.27.attn2.add_v_proj.lora_A.lightx2v_t1.weight, blocks.27.attn2.add_v_proj.lora_B.lightx2v_t1.weight, blocks.27.attn2.add_v_proj.lora_B.lightx2v_t1.bias, blocks.28.attn2.add_k_proj.lora_A.lightx2v_t1.weight, blocks.28.attn2.add_k_proj.lora_B.lightx2v_t1.weight, blocks.28.attn2.add_k_proj.lora_B.lightx2v_t1.bias, blocks.28.attn2.add_v_proj.lora_A.lightx2v_t1.weight, blocks.28.attn2.add_v_proj.lora_B.lightx2v_t1.weight, blocks.28.attn2.add_v_proj.lora_B.lightx2v_t1.bias, blocks.29.attn2.add_k_proj.lora_A.lightx2v_t1.weight, blocks.29.attn2.add_k_proj.lora_B.lightx2v_t1.weight, blocks.29.attn2.add_k_proj.lora_B.lightx2v_t1.bias, blocks.29.attn2.add_v_proj.lora_A.lightx2v_t1.weight, blocks.29.attn2.add_v_proj.lora_B.lightx2v_t1.weight, blocks.29.attn2.add_v_proj.lora_B.lightx2v_t1.bias, blocks.30.attn2.add_k_proj.lora_A.lightx2v_t1.weight, blocks.30.attn2.add_k_proj.lora_B.lightx2v_t1.weight, blocks.30.attn2.add_k_proj.lora_B.lightx2v_t1.bias, blocks.30.attn2.add_v_proj.lora_A.lightx2v_t1.weight, blocks.30.attn2.add_v_proj.lora_B.lightx2v_t1.weight, blocks.30.attn2.add_v_proj.lora_B.lightx2v_t1.bias, blocks.31.attn2.add_k_proj.lora_A.lightx2v_t1.weight, blocks.31.attn2.add_k_proj.lora_B.lightx2v_t1.weight, blocks.31.attn2.add_k_proj.lora_B.lightx2v_t1.bias, blocks.31.attn2.add_v_proj.lora_A.lightx2v_t1.weight, 
blocks.31.attn2.add_v_proj.lora_B.lightx2v_t1.weight, blocks.31.attn2.add_v_proj.lora_B.lightx2v_t1.bias, blocks.32.attn2.add_k_proj.lora_A.lightx2v_t1.weight, blocks.32.attn2.add_k_proj.lora_B.lightx2v_t1.weight, blocks.32.attn2.add_k_proj.lora_B.lightx2v_t1.bias, blocks.32.attn2.add_v_proj.lora_A.lightx2v_t1.weight, blocks.32.attn2.add_v_proj.lora_B.lightx2v_t1.weight, blocks.32.attn2.add_v_proj.lora_B.lightx2v_t1.bias, blocks.33.attn2.add_k_proj.lora_A.lightx2v_t1.weight, blocks.33.attn2.add_k_proj.lora_B.lightx2v_t1.weight, blocks.33.attn2.add_k_proj.lora_B.lightx2v_t1.bias, blocks.33.attn2.add_v_proj.lora_A.lightx2v_t1.weight, blocks.33.attn2.add_v_proj.lora_B.lightx2v_t1.weight, blocks.33.attn2.add_v_proj.lora_B.lightx2v_t1.bias, blocks.34.attn2.add_k_proj.lora_A.lightx2v_t1.weight, blocks.34.attn2.add_k_proj.lora_B.lightx2v_t1.weight, blocks.34.attn2.add_k_proj.lora_B.lightx2v_t1.bias, blocks.34.attn2.add_v_proj.lora_A.lightx2v_t1.weight, blocks.34.attn2.add_v_proj.lora_B.lightx2v_t1.weight, blocks.34.attn2.add_v_proj.lora_B.lightx2v_t1.bias, blocks.35.attn2.add_k_proj.lora_A.lightx2v_t1.weight, blocks.35.attn2.add_k_proj.lora_B.lightx2v_t1.weight, blocks.35.attn2.add_k_proj.lora_B.lightx2v_t1.bias, blocks.35.attn2.add_v_proj.lora_A.lightx2v_t1.weight, blocks.35.attn2.add_v_proj.lora_B.lightx2v_t1.weight, blocks.35.attn2.add_v_proj.lora_B.lightx2v_t1.bias, blocks.36.attn2.add_k_proj.lora_A.lightx2v_t1.weight, blocks.36.attn2.add_k_proj.lora_B.lightx2v_t1.weight, blocks.36.attn2.add_k_proj.lora_B.lightx2v_t1.bias, blocks.36.attn2.add_v_proj.lora_A.lightx2v_t1.weight, blocks.36.attn2.add_v_proj.lora_B.lightx2v_t1.weight, blocks.36.attn2.add_v_proj.lora_B.lightx2v_t1.bias, blocks.37.attn2.add_k_proj.lora_A.lightx2v_t1.weight, blocks.37.attn2.add_k_proj.lora_B.lightx2v_t1.weight, blocks.37.attn2.add_k_proj.lora_B.lightx2v_t1.bias, blocks.37.attn2.add_v_proj.lora_A.lightx2v_t1.weight, blocks.37.attn2.add_v_proj.lora_B.lightx2v_t1.weight, blocks.37.attn2.add_v_proj.lora_B.lightx2v_t1.bias, blocks.38.attn2.add_k_proj.lora_A.lightx2v_t1.weight, blocks.38.attn2.add_k_proj.lora_B.lightx2v_t1.weight, blocks.38.attn2.add_k_proj.lora_B.lightx2v_t1.bias, blocks.38.attn2.add_v_proj.lora_A.lightx2v_t1.weight, blocks.38.attn2.add_v_proj.lora_B.lightx2v_t1.weight, blocks.38.attn2.add_v_proj.lora_B.lightx2v_t1.bias, blocks.39.attn2.add_k_proj.lora_A.lightx2v_t1.weight, blocks.39.attn2.add_k_proj.lora_B.lightx2v_t1.weight, blocks.39.attn2.add_k_proj.lora_B.lightx2v_t1.bias, blocks.39.attn2.add_v_proj.lora_A.lightx2v_t1.weight, blocks.39.attn2.add_v_proj.lora_B.lightx2v_t1.weight, blocks.39.attn2.add_v_proj.lora_B.lightx2v_t1.bias.
Loading adapter weights from state_dict led to unexpected keys found in the model: condition_embedder.image_embedder.ff.net.0.proj.lora_A.lightx2v_2.weight, condition_embedder.image_embedder.ff.net.0.proj.lora_B.lightx2v_2.weight, condition_embedder.image_embedder.ff.net.0.proj.lora_B.lightx2v_2.bias, condition_embedder.image_embedder.ff.net.2.lora_A.lightx2v_2.weight, condition_embedder.image_embedder.ff.net.2.lora_B.lightx2v_2.weight, condition_embedder.image_embedder.ff.net.2.lora_B.lightx2v_2.bias, blocks.0.attn2.add_k_proj.lora_A.lightx2v_2.weight, blocks.0.attn2.add_k_proj.lora_B.lightx2v_2.weight, blocks.0.attn2.add_k_proj.lora_B.lightx2v_2.bias, blocks.0.attn2.add_v_proj.lora_A.lightx2v_2.weight, blocks.0.attn2.add_v_proj.lora_B.lightx2v_2.weight, blocks.0.attn2.add_v_proj.lora_B.lightx2v_2.bias, blocks.1.attn2.add_k_proj.lora_A.lightx2v_2.weight, blocks.1.attn2.add_k_proj.lora_B.lightx2v_2.weight, blocks.1.attn2.add_k_proj.lora_B.lightx2v_2.bias, blocks.1.attn2.add_v_proj.lora_A.lightx2v_2.weight, blocks.1.attn2.add_v_proj.lora_B.lightx2v_2.weight, blocks.1.attn2.add_v_proj.lora_B.lightx2v_2.bias, blocks.2.attn2.add_k_proj.lora_A.lightx2v_2.weight, blocks.2.attn2.add_k_proj.lora_B.lightx2v_2.weight, blocks.2.attn2.add_k_proj.lora_B.lightx2v_2.bias, blocks.2.attn2.add_v_proj.lora_A.lightx2v_2.weight, blocks.2.attn2.add_v_proj.lora_B.lightx2v_2.weight, blocks.2.attn2.add_v_proj.lora_B.lightx2v_2.bias, blocks.3.attn2.add_k_proj.lora_A.lightx2v_2.weight, blocks.3.attn2.add_k_proj.lora_B.lightx2v_2.weight, blocks.3.attn2.add_k_proj.lora_B.lightx2v_2.bias, blocks.3.attn2.add_v_proj.lora_A.lightx2v_2.weight, blocks.3.attn2.add_v_proj.lora_B.lightx2v_2.weight, blocks.3.attn2.add_v_proj.lora_B.lightx2v_2.bias, blocks.4.attn2.add_k_proj.lora_A.lightx2v_2.weight, blocks.4.attn2.add_k_proj.lora_B.lightx2v_2.weight, blocks.4.attn2.add_k_proj.lora_B.lightx2v_2.bias, blocks.4.attn2.add_v_proj.lora_A.lightx2v_2.weight, blocks.4.attn2.add_v_proj.lora_B.lightx2v_2.weight, blocks.4.attn2.add_v_proj.lora_B.lightx2v_2.bias, blocks.5.attn2.add_k_proj.lora_A.lightx2v_2.weight, blocks.5.attn2.add_k_proj.lora_B.lightx2v_2.weight, blocks.5.attn2.add_k_proj.lora_B.lightx2v_2.bias, blocks.5.attn2.add_v_proj.lora_A.lightx2v_2.weight, blocks.5.attn2.add_v_proj.lora_B.lightx2v_2.weight, blocks.5.attn2.add_v_proj.lora_B.lightx2v_2.bias, blocks.6.attn2.add_k_proj.lora_A.lightx2v_2.weight, blocks.6.attn2.add_k_proj.lora_B.lightx2v_2.weight, blocks.6.attn2.add_k_proj.lora_B.lightx2v_2.bias, blocks.6.attn2.add_v_proj.lora_A.lightx2v_2.weight, blocks.6.attn2.add_v_proj.lora_B.lightx2v_2.weight, blocks.6.attn2.add_v_proj.lora_B.lightx2v_2.bias, blocks.7.attn2.add_k_proj.lora_A.lightx2v_2.weight, blocks.7.attn2.add_k_proj.lora_B.lightx2v_2.weight, blocks.7.attn2.add_k_proj.lora_B.lightx2v_2.bias, blocks.7.attn2.add_v_proj.lora_A.lightx2v_2.weight, blocks.7.attn2.add_v_proj.lora_B.lightx2v_2.weight, blocks.7.attn2.add_v_proj.lora_B.lightx2v_2.bias, blocks.8.attn2.add_k_proj.lora_A.lightx2v_2.weight, blocks.8.attn2.add_k_proj.lora_B.lightx2v_2.weight, blocks.8.attn2.add_k_proj.lora_B.lightx2v_2.bias, blocks.8.attn2.add_v_proj.lora_A.lightx2v_2.weight, blocks.8.attn2.add_v_proj.lora_B.lightx2v_2.weight, blocks.8.attn2.add_v_proj.lora_B.lightx2v_2.bias, blocks.9.attn2.add_k_proj.lora_A.lightx2v_2.weight, blocks.9.attn2.add_k_proj.lora_B.lightx2v_2.weight, blocks.9.attn2.add_k_proj.lora_B.lightx2v_2.bias, blocks.9.attn2.add_v_proj.lora_A.lightx2v_2.weight, blocks.9.attn2.add_v_proj.lora_B.lightx2v_2.weight, 
blocks.9.attn2.add_v_proj.lora_B.lightx2v_2.bias, blocks.10.attn2.add_k_proj.lora_A.lightx2v_2.weight, blocks.10.attn2.add_k_proj.lora_B.lightx2v_2.weight, blocks.10.attn2.add_k_proj.lora_B.lightx2v_2.bias, blocks.10.attn2.add_v_proj.lora_A.lightx2v_2.weight, blocks.10.attn2.add_v_proj.lora_B.lightx2v_2.weight, blocks.10.attn2.add_v_proj.lora_B.lightx2v_2.bias, blocks.11.attn2.add_k_proj.lora_A.lightx2v_2.weight, blocks.11.attn2.add_k_proj.lora_B.lightx2v_2.weight, blocks.11.attn2.add_k_proj.lora_B.lightx2v_2.bias, blocks.11.attn2.add_v_proj.lora_A.lightx2v_2.weight, blocks.11.attn2.add_v_proj.lora_B.lightx2v_2.weight, blocks.11.attn2.add_v_proj.lora_B.lightx2v_2.bias, blocks.12.attn2.add_k_proj.lora_A.lightx2v_2.weight, blocks.12.attn2.add_k_proj.lora_B.lightx2v_2.weight, blocks.12.attn2.add_k_proj.lora_B.lightx2v_2.bias, blocks.12.attn2.add_v_proj.lora_A.lightx2v_2.weight, blocks.12.attn2.add_v_proj.lora_B.lightx2v_2.weight, blocks.12.attn2.add_v_proj.lora_B.lightx2v_2.bias, blocks.13.attn2.add_k_proj.lora_A.lightx2v_2.weight, blocks.13.attn2.add_k_proj.lora_B.lightx2v_2.weight, blocks.13.attn2.add_k_proj.lora_B.lightx2v_2.bias, blocks.13.attn2.add_v_proj.lora_A.lightx2v_2.weight, blocks.13.attn2.add_v_proj.lora_B.lightx2v_2.weight, blocks.13.attn2.add_v_proj.lora_B.lightx2v_2.bias, blocks.14.attn2.add_k_proj.lora_A.lightx2v_2.weight, blocks.14.attn2.add_k_proj.lora_B.lightx2v_2.weight, blocks.14.attn2.add_k_proj.lora_B.lightx2v_2.bias, blocks.14.attn2.add_v_proj.lora_A.lightx2v_2.weight, blocks.14.attn2.add_v_proj.lora_B.lightx2v_2.weight, blocks.14.attn2.add_v_proj.lora_B.lightx2v_2.bias, blocks.15.attn2.add_k_proj.lora_A.lightx2v_2.weight, blocks.15.attn2.add_k_proj.lora_B.lightx2v_2.weight, blocks.15.attn2.add_k_proj.lora_B.lightx2v_2.bias, blocks.15.attn2.add_v_proj.lora_A.lightx2v_2.weight, blocks.15.attn2.add_v_proj.lora_B.lightx2v_2.weight, blocks.15.attn2.add_v_proj.lora_B.lightx2v_2.bias, blocks.16.attn2.add_k_proj.lora_A.lightx2v_2.weight, blocks.16.attn2.add_k_proj.lora_B.lightx2v_2.weight, blocks.16.attn2.add_k_proj.lora_B.lightx2v_2.bias, blocks.16.attn2.add_v_proj.lora_A.lightx2v_2.weight, blocks.16.attn2.add_v_proj.lora_B.lightx2v_2.weight, blocks.16.attn2.add_v_proj.lora_B.lightx2v_2.bias, blocks.17.attn2.add_k_proj.lora_A.lightx2v_2.weight, blocks.17.attn2.add_k_proj.lora_B.lightx2v_2.weight, blocks.17.attn2.add_k_proj.lora_B.lightx2v_2.bias, blocks.17.attn2.add_v_proj.lora_A.lightx2v_2.weight, blocks.17.attn2.add_v_proj.lora_B.lightx2v_2.weight, blocks.17.attn2.add_v_proj.lora_B.lightx2v_2.bias, blocks.18.attn2.add_k_proj.lora_A.lightx2v_2.weight, blocks.18.attn2.add_k_proj.lora_B.lightx2v_2.weight, blocks.18.attn2.add_k_proj.lora_B.lightx2v_2.bias, blocks.18.attn2.add_v_proj.lora_A.lightx2v_2.weight, blocks.18.attn2.add_v_proj.lora_B.lightx2v_2.weight, blocks.18.attn2.add_v_proj.lora_B.lightx2v_2.bias, blocks.19.attn2.add_k_proj.lora_A.lightx2v_2.weight, blocks.19.attn2.add_k_proj.lora_B.lightx2v_2.weight, blocks.19.attn2.add_k_proj.lora_B.lightx2v_2.bias, blocks.19.attn2.add_v_proj.lora_A.lightx2v_2.weight, blocks.19.attn2.add_v_proj.lora_B.lightx2v_2.weight, blocks.19.attn2.add_v_proj.lora_B.lightx2v_2.bias, blocks.20.attn2.add_k_proj.lora_A.lightx2v_2.weight, blocks.20.attn2.add_k_proj.lora_B.lightx2v_2.weight, blocks.20.attn2.add_k_proj.lora_B.lightx2v_2.bias, blocks.20.attn2.add_v_proj.lora_A.lightx2v_2.weight, blocks.20.attn2.add_v_proj.lora_B.lightx2v_2.weight, blocks.20.attn2.add_v_proj.lora_B.lightx2v_2.bias, 
blocks.21.attn2.add_k_proj.lora_A.lightx2v_2.weight, blocks.21.attn2.add_k_proj.lora_B.lightx2v_2.weight, blocks.21.attn2.add_k_proj.lora_B.lightx2v_2.bias, blocks.21.attn2.add_v_proj.lora_A.lightx2v_2.weight, blocks.21.attn2.add_v_proj.lora_B.lightx2v_2.weight, blocks.21.attn2.add_v_proj.lora_B.lightx2v_2.bias, blocks.22.attn2.add_k_proj.lora_A.lightx2v_2.weight, blocks.22.attn2.add_k_proj.lora_B.lightx2v_2.weight, blocks.22.attn2.add_k_proj.lora_B.lightx2v_2.bias, blocks.22.attn2.add_v_proj.lora_A.lightx2v_2.weight, blocks.22.attn2.add_v_proj.lora_B.lightx2v_2.weight, blocks.22.attn2.add_v_proj.lora_B.lightx2v_2.bias, blocks.23.attn2.add_k_proj.lora_A.lightx2v_2.weight, blocks.23.attn2.add_k_proj.lora_B.lightx2v_2.weight, blocks.23.attn2.add_k_proj.lora_B.lightx2v_2.bias, blocks.23.attn2.add_v_proj.lora_A.lightx2v_2.weight, blocks.23.attn2.add_v_proj.lora_B.lightx2v_2.weight, blocks.23.attn2.add_v_proj.lora_B.lightx2v_2.bias, blocks.24.attn2.add_k_proj.lora_A.lightx2v_2.weight, blocks.24.attn2.add_k_proj.lora_B.lightx2v_2.weight, blocks.24.attn2.add_k_proj.lora_B.lightx2v_2.bias, blocks.24.attn2.add_v_proj.lora_A.lightx2v_2.weight, blocks.24.attn2.add_v_proj.lora_B.lightx2v_2.weight, blocks.24.attn2.add_v_proj.lora_B.lightx2v_2.bias, blocks.25.attn2.add_k_proj.lora_A.lightx2v_2.weight, blocks.25.attn2.add_k_proj.lora_B.lightx2v_2.weight, blocks.25.attn2.add_k_proj.lora_B.lightx2v_2.bias, blocks.25.attn2.add_v_proj.lora_A.lightx2v_2.weight, blocks.25.attn2.add_v_proj.lora_B.lightx2v_2.weight, blocks.25.attn2.add_v_proj.lora_B.lightx2v_2.bias, blocks.26.attn2.add_k_proj.lora_A.lightx2v_2.weight, blocks.26.attn2.add_k_proj.lora_B.lightx2v_2.weight, blocks.26.attn2.add_k_proj.lora_B.lightx2v_2.bias, blocks.26.attn2.add_v_proj.lora_A.lightx2v_2.weight, blocks.26.attn2.add_v_proj.lora_B.lightx2v_2.weight, blocks.26.attn2.add_v_proj.lora_B.lightx2v_2.bias, blocks.27.attn2.add_k_proj.lora_A.lightx2v_2.weight, blocks.27.attn2.add_k_proj.lora_B.lightx2v_2.weight, blocks.27.attn2.add_k_proj.lora_B.lightx2v_2.bias, blocks.27.attn2.add_v_proj.lora_A.lightx2v_2.weight, blocks.27.attn2.add_v_proj.lora_B.lightx2v_2.weight, blocks.27.attn2.add_v_proj.lora_B.lightx2v_2.bias, blocks.28.attn2.add_k_proj.lora_A.lightx2v_2.weight, blocks.28.attn2.add_k_proj.lora_B.lightx2v_2.weight, blocks.28.attn2.add_k_proj.lora_B.lightx2v_2.bias, blocks.28.attn2.add_v_proj.lora_A.lightx2v_2.weight, blocks.28.attn2.add_v_proj.lora_B.lightx2v_2.weight, blocks.28.attn2.add_v_proj.lora_B.lightx2v_2.bias, blocks.29.attn2.add_k_proj.lora_A.lightx2v_2.weight, blocks.29.attn2.add_k_proj.lora_B.lightx2v_2.weight, blocks.29.attn2.add_k_proj.lora_B.lightx2v_2.bias, blocks.29.attn2.add_v_proj.lora_A.lightx2v_2.weight, blocks.29.attn2.add_v_proj.lora_B.lightx2v_2.weight, blocks.29.attn2.add_v_proj.lora_B.lightx2v_2.bias, blocks.30.attn2.add_k_proj.lora_A.lightx2v_2.weight, blocks.30.attn2.add_k_proj.lora_B.lightx2v_2.weight, blocks.30.attn2.add_k_proj.lora_B.lightx2v_2.bias, blocks.30.attn2.add_v_proj.lora_A.lightx2v_2.weight, blocks.30.attn2.add_v_proj.lora_B.lightx2v_2.weight, blocks.30.attn2.add_v_proj.lora_B.lightx2v_2.bias, blocks.31.attn2.add_k_proj.lora_A.lightx2v_2.weight, blocks.31.attn2.add_k_proj.lora_B.lightx2v_2.weight, blocks.31.attn2.add_k_proj.lora_B.lightx2v_2.bias, blocks.31.attn2.add_v_proj.lora_A.lightx2v_2.weight, blocks.31.attn2.add_v_proj.lora_B.lightx2v_2.weight, blocks.31.attn2.add_v_proj.lora_B.lightx2v_2.bias, blocks.32.attn2.add_k_proj.lora_A.lightx2v_2.weight, 
blocks.32.attn2.add_k_proj.lora_B.lightx2v_2.weight, blocks.32.attn2.add_k_proj.lora_B.lightx2v_2.bias, blocks.32.attn2.add_v_proj.lora_A.lightx2v_2.weight, blocks.32.attn2.add_v_proj.lora_B.lightx2v_2.weight, blocks.32.attn2.add_v_proj.lora_B.lightx2v_2.bias, blocks.33.attn2.add_k_proj.lora_A.lightx2v_2.weight, blocks.33.attn2.add_k_proj.lora_B.lightx2v_2.weight, blocks.33.attn2.add_k_proj.lora_B.lightx2v_2.bias, blocks.33.attn2.add_v_proj.lora_A.lightx2v_2.weight, blocks.33.attn2.add_v_proj.lora_B.lightx2v_2.weight, blocks.33.attn2.add_v_proj.lora_B.lightx2v_2.bias, blocks.34.attn2.add_k_proj.lora_A.lightx2v_2.weight, blocks.34.attn2.add_k_proj.lora_B.lightx2v_2.weight, blocks.34.attn2.add_k_proj.lora_B.lightx2v_2.bias, blocks.34.attn2.add_v_proj.lora_A.lightx2v_2.weight, blocks.34.attn2.add_v_proj.lora_B.lightx2v_2.weight, blocks.34.attn2.add_v_proj.lora_B.lightx2v_2.bias, blocks.35.attn2.add_k_proj.lora_A.lightx2v_2.weight, blocks.35.attn2.add_k_proj.lora_B.lightx2v_2.weight, blocks.35.attn2.add_k_proj.lora_B.lightx2v_2.bias, blocks.35.attn2.add_v_proj.lora_A.lightx2v_2.weight, blocks.35.attn2.add_v_proj.lora_B.lightx2v_2.weight, blocks.35.attn2.add_v_proj.lora_B.lightx2v_2.bias, blocks.36.attn2.add_k_proj.lora_A.lightx2v_2.weight, blocks.36.attn2.add_k_proj.lora_B.lightx2v_2.weight, blocks.36.attn2.add_k_proj.lora_B.lightx2v_2.bias, blocks.36.attn2.add_v_proj.lora_A.lightx2v_2.weight, blocks.36.attn2.add_v_proj.lora_B.lightx2v_2.weight, blocks.36.attn2.add_v_proj.lora_B.lightx2v_2.bias, blocks.37.attn2.add_k_proj.lora_A.lightx2v_2.weight, blocks.37.attn2.add_k_proj.lora_B.lightx2v_2.weight, blocks.37.attn2.add_k_proj.lora_B.lightx2v_2.bias, blocks.37.attn2.add_v_proj.lora_A.lightx2v_2.weight, blocks.37.attn2.add_v_proj.lora_B.lightx2v_2.weight, blocks.37.attn2.add_v_proj.lora_B.lightx2v_2.bias, blocks.38.attn2.add_k_proj.lora_A.lightx2v_2.weight, blocks.38.attn2.add_k_proj.lora_B.lightx2v_2.weight, blocks.38.attn2.add_k_proj.lora_B.lightx2v_2.bias, blocks.38.attn2.add_v_proj.lora_A.lightx2v_2.weight, blocks.38.attn2.add_v_proj.lora_B.lightx2v_2.weight, blocks.38.attn2.add_v_proj.lora_B.lightx2v_2.bias, blocks.39.attn2.add_k_proj.lora_A.lightx2v_2.weight, blocks.39.attn2.add_k_proj.lora_B.lightx2v_2.weight, blocks.39.attn2.add_k_proj.lora_B.lightx2v_2.bias, blocks.39.attn2.add_v_proj.lora_A.lightx2v_2.weight, blocks.39.attn2.add_v_proj.lora_B.lightx2v_2.weight, blocks.39.attn2.add_v_proj.lora_B.lightx2v_2.bias.

code

self.pipe.load_lora_weights(self.lora_path, adapter_name='lightx2v_t1')
self.pipe.set_adapters(["lightx2v_t1"], adapter_weights=[3.0])
if hasattr(self.pipe, "transformer_2") and self.pipe.transformer_2 is not None:
    org_state_dict = safetensors.torch.load_file(self.lora_path)
    converted_state_dict = _convert_non_diffusers_wan_lora_to_diffusers(org_state_dict)
    self.pipe.transformer_2.load_lora_adapter(converted_state_dict, adapter_name="lightx2v_2")
    self.pipe.transformer_2.set_adapters(["lightx2v_t2"], weights=[1.5])

This result indicates that I encountered keys different from the ones in the original transformer and transformer_2 when loading the LoRA, but when I print the structure of the model, the LoRA seems to have been loaded successfully. Is this warning just noise, or is there a problem with my usage?
If you could give me some advice, I would greatly appreciate it.

`Loading adapter weights from state_dict led to unexpected keys found in the model: (same list of lightx2v_2 keys as above)
> /data/code/haobang.geng/code/online_storymv_generate/wan22_diffusers.py(113)__init__()
-> self.pipe.transformer_2.set_adapters(["lightx2v_t2"], weights=[1.5])
(Pdb) self.pipe.transformer_2
WanTransformer3DModel(
  (rope): WanRotaryPosEmbed()
  (patch_embedding): Conv3d(36, 5120, kernel_size=(1, 2, 2), stride=(1, 2, 2))
  (condition_embedder): WanTimeTextImageEmbedding(
    (timesteps_proj): Timesteps()
    (time_embedder): TimestepEmbedding(
      (linear_1): lora.Linear(
        (base_layer): Linear(in_features=256, out_features=5120, bias=True)
        (lora_dropout): ModuleDict(
          (lightx2v_2): Identity()
        )
        (lora_A): ModuleDict(
          (lightx2v_2): Linear(in_features=256, out_features=64, bias=False)
        )
        (lora_B): ModuleDict(
          (lightx2v_2): Linear(in_features=64, out_features=5120, bias=True)
        )
        (lora_embedding_A): ParameterDict()
        (lora_embedding_B): ParameterDict()
        (lora_magnitude_vector): ModuleDict()
      )
      (act): SiLU()
      (linear_2): lora.Linear(
        (base_layer): Linear(in_features=5120, out_features=5120, bias=True)
        (lora_dropout): ModuleDict(
          (lightx2v_2): Identity()
        )
        (lora_A): ModuleDict(
          (lightx2v_2): Linear(in_features=5120, out_features=64, bias=False)
        )
        (lora_B): ModuleDict(
          (lightx2v_2): Linear(in_features=64, out_features=5120, bias=True)
        )
        (lora_embedding_A): ParameterDict()
        (lora_embedding_B): ParameterDict()
        (lora_magnitude_vector): ModuleDict()
      )
    )
    (act_fn): SiLU()
    (time_proj): lora.Linear(
      (base_layer): Linear(in_features=5120, out_features=30720, bias=True)
      (lora_dropout): ModuleDict(
        (lightx2v_2): Identity()
      )
      (lora_A): ModuleDict(
        (lightx2v_2): Linear(in_features=5120, out_features=64, bias=False)
      )
      (lora_B): ModuleDict(
        (lightx2v_2): Linear(in_features=64, out_features=30720, bias=True)
      )
      (lora_embedding_A): ParameterDict()
      (lora_embedding_B): ParameterDict()
      (lora_magnitude_vector): ModuleDict()
    )
    (text_embedder): PixArtAlphaTextProjection(
      (linear_1): lora.Linear(
        (base_layer): Linear(in_features=4096, out_features=5120, bias=True)
        (lora_dropout): ModuleDict(
          (lightx2v_2): Identity()
        )
        (lora_A): ModuleDict(
          (lightx2v_2): Linear(in_features=4096, out_features=64, bias=False)
        )
        (lora_B): ModuleDict(
          (lightx2v_2): Linear(in_features=64, out_features=5120, bias=True)
        )
        (lora_embedding_A): ParameterDict()
        (lora_embedding_B): ParameterDict()
        (lora_magnitude_vector): ModuleDict()
      )
      (act_1): GELU(approximate='tanh')
      (linear_2): lora.Linear(
        (base_layer): Linear(in_features=5120, out_features=5120, bias=True)
        (lora_dropout): ModuleDict(
          (lightx2v_2): Identity()
        )
        (lora_A): ModuleDict(
          (lightx2v_2): Linear(in_features=5120, out_features=64, bias=False)
        )
        (lora_B): ModuleDict(
          (lightx2v_2): Linear(in_features=64, out_features=5120, bias=True)
        )
        (lora_embedding_A): ParameterDict()
        (lora_embedding_B): ParameterDict()
        (lora_magnitude_vector): ModuleDict()
      )
    )
  )
  (blocks): ModuleList(
    (0-39): 40 x WanTransformerBlock(
      (norm1): FP32LayerNorm((5120,), eps=1e-06, elementwise_affine=False)
      (attn1): WanAttention(
        (to_q): lora.Linear(
          (base_layer): Linear(in_features=5120, out_features=5120, bias=True)
          (lora_dropout): ModuleDict(
            (lightx2v_2): Identity()
          )
          (lora_A): ModuleDict(
            (lightx2v_2): Linear(in_features=5120, out_features=64, bias=False)
          )
          (lora_B): ModuleDict(
            (lightx2v_2): Linear(in_features=64, out_features=5120, bias=True)
          )
          (lora_embedding_A): ParameterDict()
          (lora_embedding_B): ParameterDict()
          (lora_magnitude_vector): ModuleDict()
        )
        (to_k): lora.Linear(
          (base_layer): Linear(in_features=5120, out_features=5120, bias=True)
          (lora_dropout): ModuleDict(
            (lightx2v_2): Identity()
          )
          (lora_A): ModuleDict(
            (lightx2v_2): Linear(in_features=5120, out_features=64, bias=False)
          )
          (lora_B): ModuleDict(
            (lightx2v_2): Linear(in_features=64, out_features=5120, bias=True)
          )
          (lora_embedding_A): ParameterDict()
          (lora_embedding_B): ParameterDict()
          (lora_magnitude_vector): ModuleDict()
        )
        (to_v): lora.Linear(
          (base_layer): Linear(in_features=5120, out_features=5120, bias=True)
          (lora_dropout): ModuleDict(
            (lightx2v_2): Identity()
          )
          (lora_A): ModuleDict(
            (lightx2v_2): Linear(in_features=5120, out_features=64, bias=False)
          )
          (lora_B): ModuleDict(
            (lightx2v_2): Linear(in_features=64, out_features=5120, bias=True)
          )
          (lora_embedding_A): ParameterDict()
          (lora_embedding_B): ParameterDict()
          (lora_magnitude_vector): ModuleDict()
        )
        (to_out): ModuleList(
          (0): lora.Linear(
            (base_layer): Linear(in_features=5120, out_features=5120, bias=True)
            (lora_dropout): ModuleDict(
              (lightx2v_2): Identity()
            )
            (lora_A): ModuleDict(
              (lightx2v_2): Linear(in_features=5120, out_features=64, bias=False)
            )
            (lora_B): ModuleDict(
              (lightx2v_2): Linear(in_features=64, out_features=5120, bias=True)
            )
            (lora_embedding_A): ParameterDict()
            (lora_embedding_B): ParameterDict()
            (lora_magnitude_vector): ModuleDict()
          )
          (1): Dropout(p=0.0, inplace=False)
        )
        (norm_q): RMSNorm((5120,), eps=1e-06, elementwise_affine=True)
        (norm_k): RMSNorm((5120,), eps=1e-06, elementwise_affine=True)
      )
      (attn2): WanAttention(
        (to_q): lora.Linear(
          (base_layer): Linear(in_features=5120, out_features=5120, bias=True)
          (lora_dropout): ModuleDict(
            (lightx2v_2): Identity()
          )
          (lora_A): ModuleDict(
            (lightx2v_2): Linear(in_features=5120, out_features=64, bias=False)
          )
          (lora_B): ModuleDict(
            (lightx2v_2): Linear(in_features=64, out_features=5120, bias=True)
          )
          (lora_embedding_A): ParameterDict()
          (lora_embedding_B): ParameterDict()
          (lora_magnitude_vector): ModuleDict()
        )
        (to_k): lora.Linear(
          (base_layer): Linear(in_features=5120, out_features=5120, bias=True)
          (lora_dropout): ModuleDict(
            (lightx2v_2): Identity()
          )
          (lora_A): ModuleDict(
            (lightx2v_2): Linear(in_features=5120, out_features=64, bias=False)
          )
          (lora_B): ModuleDict(
            (lightx2v_2): Linear(in_features=64, out_features=5120, bias=True)
          )
          (lora_embedding_A): ParameterDict()
          (lora_embedding_B): ParameterDict()
          (lora_magnitude_vector): ModuleDict()
        )
        (to_v): lora.Linear(
          (base_layer): Linear(in_features=5120, out_features=5120, bias=True)
          (lora_dropout): ModuleDict(
            (lightx2v_2): Identity()
          )
          (lora_A): ModuleDict(
            (lightx2v_2): Linear(in_features=5120, out_features=64, bias=False)
          )
          (lora_B): ModuleDict(
            (lightx2v_2): Linear(in_features=64, out_features=5120, bias=True)
          )
          (lora_embedding_A): ParameterDict()
          (lora_embedding_B): ParameterDict()
          (lora_magnitude_vector): ModuleDict()
        )
        (to_out): ModuleList(
          (0): lora.Linear(
            (base_layer): Linear(in_features=5120, out_features=5120, bias=True)
            (lora_dropout): ModuleDict(
              (lightx2v_2): Identity()
            )
            (lora_A): ModuleDict(
              (lightx2v_2): Linear(in_features=5120, out_features=64, bias=False)
            )
            (lora_B): ModuleDict(
              (lightx2v_2): Linear(in_features=64, out_features=5120, bias=True)
            )
            (lora_embedding_A): ParameterDict()
            (lora_embedding_B): ParameterDict()
            (lora_magnitude_vector): ModuleDict()
          )
          (1): Dropout(p=0.0, inplace=False)
        )
        (norm_q): RMSNorm((5120,), eps=1e-06, elementwise_affine=True)
        (norm_k): RMSNorm((5120,), eps=1e-06, elementwise_affine=True)
      )
      (norm2): FP32LayerNorm((5120,), eps=1e-06, elementwise_affine=True)
      (ffn): FeedForward(
        (net): ModuleList(
          (0): GELU(
            (proj): lora.Linear(
              (base_layer): Linear(in_features=5120, out_features=13824, bias=True)
              (lora_dropout): ModuleDict(
                (lightx2v_2): Identity()
              )
              (lora_A): ModuleDict(
                (lightx2v_2): Linear(in_features=5120, out_features=64, bias=False)
              )
              (lora_B): ModuleDict(
                (lightx2v_2): Linear(in_features=64, out_features=13824, bias=True)
              )
              (lora_embedding_A): ParameterDict()
              (lora_embedding_B): ParameterDict()
              (lora_magnitude_vector): ModuleDict()
            )
          )
          (1): Dropout(p=0.0, inplace=False)
          (2): lora.Linear(
            (base_layer): Linear(in_features=13824, out_features=5120, bias=True)
            (lora_dropout): ModuleDict(
              (lightx2v_2): Identity()
            )
            (lora_A): ModuleDict(
              (lightx2v_2): Linear(in_features=13824, out_features=64, bias=False)
            )
            (lora_B): ModuleDict(
              (lightx2v_2): Linear(in_features=64, out_features=5120, bias=True)
            )
            (lora_embedding_A): ParameterDict()
            (lora_embedding_B): ParameterDict()
            (lora_magnitude_vector): ModuleDict()
          )
        )
      )
      (norm3): FP32LayerNorm((5120,), eps=1e-06, elementwise_affine=False)
    )
  )
  (norm_out): FP32LayerNorm((5120,), eps=1e-06, elementwise_affine=False)
  (proj_out): lora.Linear(
    (base_layer): Linear(in_features=5120, out_features=64, bias=True)
    (lora_dropout): ModuleDict(
      (lightx2v_2): Identity()
    )
    (lora_A): ModuleDict(
      (lightx2v_2): Linear(in_features=5120, out_features=64, bias=False)
    )
    (lora_B): ModuleDict(
      (lightx2v_2): Linear(in_features=64, out_features=64, bias=True)
    )
    (lora_embedding_A): ParameterDict()
    (lora_embedding_B): ParameterDict()
    (lora_magnitude_vector): ModuleDict()
  )
)
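
A minimal sketch for double-checking which adapter names are actually registered before calling set_adapters (assuming pipe is the Wan pipeline from this report; get_list_adapters is the diffusers pipeline helper, and the adapter name below is taken from the module printout above, which registers the LoRA under "lightx2v_2"):

print(pipe.get_list_adapters())
# expected to list the adapters per component, e.g. "lightx2v_2" under "transformer_2"
pipe.transformer_2.set_adapters(["lightx2v_2"], weights=[1.5])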

@sayakpaul
Copy link
Member

Please provide a complete yet minimal snippet for debugging.

@Passenger12138
Copy link

torch.compile + channels_last for Wan 2.2 (T2V / I2V) fails with a RuntimeError and a Dynamo Unsupported error

Issue Description

Hi, I am trying to optimize Wan 2.2 T2V / I2V inference speed on a single RTX 4090, using:

1. Wan2.2 (Diffusers)
2. LightX2V LoRA
3. flash attention
4. group offload (Diffusers 0.30+)
5. torch.compile(mode="max-autotune", fullgraph=True) / torch.channels_last (as recommended in the docs)

My goal is to achieve maximum throughput on a single 4090 GPU. However, when following the official efficiency docs (https://huggingface.co/docs/diffusers/api/pipelines/wan#t2v-inference-speed), I hit two different failures:

1. RuntimeError when calling .to(memory_format=torch.channels_last)

According to the docs:

pipeline.transformer.to(memory_format=torch.channels_last)
pipeline.transformer = torch.compile(
    pipeline.transformer, mode="max-autotune", fullgraph=True
)

I got the following error:

Traceback (most recent call last):
  File "/data/code/haobang.geng/code/online_storymv_generate/workers/wan.py", line 42, in <module>
    pipe.transformer.to(memory_format=torch.channels_last)
  File "/data/conda_envs/haobang.geng/lib/python3.10/site-packages/diffusers/models/modeling_utils.py", line 1424, in to
    return super().to(*args, **kwargs)
  File "/data/conda_envs/haobang.geng/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1343, in to
    return self._apply(convert)
  File "/data/conda_envs/haobang.geng/lib/python3.10/site-packages/torch/nn/modules/module.py", line 903, in _apply
    module._apply(fn)
  File "/data/conda_envs/haobang.geng/lib/python3.10/site-packages/torch/nn/modules/module.py", line 930, in _apply
    param_applied = fn(param)
  File "/data/conda_envs/haobang.geng/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1323, in convert
    return t.to(
RuntimeError: required rank 4 tensor to use channels_last format

2. When skipping channels_last and compiling directly, torch.compile fails at runtime

I attempted the following (with channels_last skipped):

pipe.transformer = torch.compile(
    pipe.transformer, mode="max-autotune", fullgraph=True
)
pipe.transformer_2 = torch.compile(
    pipe.transformer_2, mode="max-autotune", fullgraph=True
)

I got the following error:

 0%|                                                      | 0/6 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/data/code/haobang.geng/code/online_storymv_generate/workers/wan.py", line 89, in <module>
    frames = pipe(input_image, "animate", num_inference_steps=6, guidance_scale=1.0).frames[0]
  File "/data/conda_envs/haobang.geng/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
  File "/data/conda_envs/haobang.geng/lib/python3.10/site-packages/diffusers/pipelines/wan/pipeline_wan_i2v.py", line 756, in __call__
    noise_pred = current_model(
  File "/data/conda_envs/haobang.geng/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/data/conda_envs/haobang.geng/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
    return forward_call(*args, **kwargs)
  File "/data/conda_envs/haobang.geng/lib/python3.10/site-packages/torch/_dynamo/eval_frame.py", line 574, in _fn
    return fn(*args, **kwargs)
  File "/data/conda_envs/haobang.geng/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/data/conda_envs/haobang.geng/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
    return forward_call(*args, **kwargs)
  File "/data/conda_envs/haobang.geng/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 1380, in __call__
    return self._torchdynamo_orig_callable(
  File "/data/conda_envs/haobang.geng/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 547, in __call__
    return _compile(
  File "/data/conda_envs/haobang.geng/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 986, in _compile
    guarded_code = compile_inner(code, one_graph, hooks, transform)
  File "/data/conda_envs/haobang.geng/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 715, in compile_inner
    return _compile_inner(code, one_graph, hooks, transform)
  File "/data/conda_envs/haobang.geng/lib/python3.10/site-packages/torch/_utils_internal.py", line 95, in wrapper_function
    return function(*args, **kwargs)
  File "/data/conda_envs/haobang.geng/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 750, in _compile_inner
    out_code = transform_code_object(code, transform)
  File "/data/conda_envs/haobang.geng/lib/python3.10/site-packages/torch/_dynamo/bytecode_transformation.py", line 1361, in transform_code_object
    transformations(instructions, code_options)
  File "/data/conda_envs/haobang.geng/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 231, in _fn
    return fn(*args, **kwargs)
  File "/data/conda_envs/haobang.geng/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 662, in transform
    tracer.run()
  File "/data/conda_envs/haobang.geng/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 2868, in run
    super().run()
  File "/data/conda_envs/haobang.geng/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 1052, in run
    while self.step():
  File "/data/conda_envs/haobang.geng/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 962, in step
    self.dispatch_table[inst.opcode](self, inst)
  File "/data/conda_envs/haobang.geng/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 659, in wrapper
    return inner_fn(self, inst)
  File "/data/conda_envs/haobang.geng/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 1736, in CALL_FUNCTION_EX
    self.call_function(fn, argsvars.items, kwargsvars)
  File "/data/conda_envs/haobang.geng/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 897, in call_function
    self.push(fn.call_function(self, args, kwargs))  # type: ignore[arg-type]
  File "/data/conda_envs/haobang.geng/lib/python3.10/site-packages/torch/_dynamo/variables/lazy.py", line 170, in realize_and_forward
    return getattr(self.realize(), name)(*args, **kwargs)
  File "/data/conda_envs/haobang.geng/lib/python3.10/site-packages/torch/_dynamo/variables/functions.py", line 858, in call_function
    return self.func.call_function(tx, merged_args, merged_kwargs)
  File "/data/conda_envs/haobang.geng/lib/python3.10/site-packages/torch/_dynamo/variables/functions.py", line 317, in call_function
    return super().call_function(tx, args, kwargs)
  File "/data/conda_envs/haobang.geng/lib/python3.10/site-packages/torch/_dynamo/variables/functions.py", line 118, in call_function
    return tx.inline_user_function_return(self, [*self.self_args(), *args], kwargs)
  File "/data/conda_envs/haobang.geng/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 903, in inline_user_function_return
    return InliningInstructionTranslator.inline_call(self, fn, args, kwargs)
  File "/data/conda_envs/haobang.geng/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 3072, in inline_call
    return cls.inline_call_(parent, func, args, kwargs)
  File "/data/conda_envs/haobang.geng/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 3198, in inline_call_
    tracer.run()
  File "/data/conda_envs/haobang.geng/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 1052, in run
    while self.step():
  File "/data/conda_envs/haobang.geng/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 962, in step
    self.dispatch_table[inst.opcode](self, inst)
  File "/data/conda_envs/haobang.geng/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 659, in wrapper
    return inner_fn(self, inst)
  File "/data/conda_envs/haobang.geng/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 1736, in CALL_FUNCTION_EX
    self.call_function(fn, argsvars.items, kwargsvars)
  File "/data/conda_envs/haobang.geng/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 897, in call_function
    self.push(fn.call_function(self, args, kwargs))  # type: ignore[arg-type]
  File "/data/conda_envs/haobang.geng/lib/python3.10/site-packages/torch/_dynamo/variables/lazy.py", line 170, in realize_and_forward
    return getattr(self.realize(), name)(*args, **kwargs)
  File "/data/conda_envs/haobang.geng/lib/python3.10/site-packages/torch/_dynamo/variables/user_defined.py", line 914, in call_function
    return func_var.call_function(tx, [obj_var] + args, kwargs)
  File "/data/conda_envs/haobang.geng/lib/python3.10/site-packages/torch/_dynamo/variables/functions.py", line 317, in call_function
    return super().call_function(tx, args, kwargs)
  File "/data/conda_envs/haobang.geng/lib/python3.10/site-packages/torch/_dynamo/variables/functions.py", line 118, in call_function
    return tx.inline_user_function_return(self, [*self.self_args(), *args], kwargs)
  File "/data/conda_envs/haobang.geng/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 903, in inline_user_function_return
    return InliningInstructionTranslator.inline_call(self, fn, args, kwargs)
  File "/data/conda_envs/haobang.geng/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 3072, in inline_call
    return cls.inline_call_(parent, func, args, kwargs)
  File "/data/conda_envs/haobang.geng/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 3198, in inline_call_
    tracer.run()
  File "/data/conda_envs/haobang.geng/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 1052, in run
    while self.step():
  File "/data/conda_envs/haobang.geng/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 962, in step
    self.dispatch_table[inst.opcode](self, inst)
  File "/data/conda_envs/haobang.geng/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 659, in wrapper
    return inner_fn(self, inst)
  File "/data/conda_envs/haobang.geng/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 1658, in CALL_FUNCTION
    self.call_function(fn, args, {})
  File "/data/conda_envs/haobang.geng/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 897, in call_function
    self.push(fn.call_function(self, args, kwargs))  # type: ignore[arg-type]
  File "/data/conda_envs/haobang.geng/lib/python3.10/site-packages/torch/_dynamo/variables/functions.py", line 378, in call_function
    return super().call_function(tx, args, kwargs)
  File "/data/conda_envs/haobang.geng/lib/python3.10/site-packages/torch/_dynamo/variables/functions.py", line 317, in call_function
    return super().call_function(tx, args, kwargs)
  File "/data/conda_envs/haobang.geng/lib/python3.10/site-packages/torch/_dynamo/variables/functions.py", line 118, in call_function
    return tx.inline_user_function_return(self, [*self.self_args(), *args], kwargs)
  File "/data/conda_envs/haobang.geng/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 903, in inline_user_function_return
    return InliningInstructionTranslator.inline_call(self, fn, args, kwargs)
  File "/data/conda_envs/haobang.geng/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 3072, in inline_call
    return cls.inline_call_(parent, func, args, kwargs)
  File "/data/conda_envs/haobang.geng/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 3116, in inline_call_
    result = InliningInstructionTranslator.check_inlineable(func)
  File "/data/conda_envs/haobang.geng/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 3093, in check_inlineable
    unimplemented(
  File "/data/conda_envs/haobang.geng/lib/python3.10/site-packages/torch/_dynamo/exc.py", line 317, in unimplemented
    raise Unsupported(msg, case_name=case_name)
torch._dynamo.exc.Unsupported: 'inline in skipfiles: ModuleGroup.onload_ | _fn /data/conda_envs/haobang.geng/lib/python3.10/site-packages/torch/_dynamo/eval_frame.py, skipped according trace_rules.lookup SKIP_DIRS'

from user code:
   File "/data/conda_envs/haobang.geng/lib/python3.10/site-packages/diffusers/hooks/hooks.py", line 189, in new_forward
    output = function_reference.forward(*args, **kwargs)
  File "/data/conda_envs/haobang.geng/lib/python3.10/site-packages/diffusers/hooks/hooks.py", line 188, in new_forward
    args, kwargs = function_reference.pre_forward(module, *args, **kwargs)
  File "/data/conda_envs/haobang.geng/lib/python3.10/site-packages/diffusers/hooks/group_offloading.py", line 304, in pre_forward
    self.group.onload_()

Set TORCH_LOGS="+dynamo" and TORCHDYNAMO_VERBOSE=1 for more information


You can suppress this exception and fall back to eager by setting:
    import torch._dynamo
    torch._dynamo.config.suppress_errors = True
    

full code

import torch
from diffusers import WanImageToVideoPipeline, DiffusionPipeline, LCMScheduler, UniPCMultistepScheduler
from huggingface_hub import hf_hub_download
import requests
from PIL import Image
from diffusers.loaders.lora_conversion_utils import _convert_non_diffusers_wan_lora_to_diffusers
from io import BytesIO
from diffusers.utils import export_to_video
import safetensors.torch
from diffusers.hooks import apply_group_offloading
import time
# Load image
# image_url = "https://cloud.inference.sh/u/4mg21r6ta37mpaz6ktzwtt8krr/01k1g7k73eebnrmzmc6h0bghq6.png"
# response = requests.get(image_url)
# input_image = Image.open(BytesIO(response.content)).convert("RGB")
input_image = Image.open("/data/code/haobang.geng/code/online_storymv_generate/temp/temp_input/1.jpg").convert("RGB")
warmup_steps = 3

# load pipeline 
pipe = WanImageToVideoPipeline.from_pretrained(
    "/data/code/haobang.geng/models/Wan2.2-I2V-A14B-Diffusers",
    torch_dtype=torch.bfloat16
)
pipe.scheduler = UniPCMultistepScheduler.from_config(pipe.scheduler.config, flow_shift=8.0)

# load and fuse lora
high_lora_path = "/data/code/haobang.geng/models/WanVideo_comfy/LoRAs/Wan22_Lightx2v/Wan_2_2_I2V_A14B_HIGH_lightx2v_4step_lora_v1030_rank_64_bf16.safetensors"
low_lora_path = "/data/code/haobang.geng/ComfyUI/models/loras/Wan_2_1_lightx2v_I2V_14B_480p_cfg_step_distill_rank64_bf16.safetensors"
pipe.load_lora_weights(high_lora_path, adapter_name='lightx2v_t1')
pipe.set_adapters(["lightx2v_t1"], adapter_weights=[1.0])
pipe.fuse_lora(adapter_names=["lightx2v_t1"], lora_scale=1, components=["transformer"])
if hasattr(pipe, "transformer_2") and pipe.transformer_2 is not None:
    org_state_dict = safetensors.torch.load_file(low_lora_path)
    converted_state_dict = _convert_non_diffusers_wan_lora_to_diffusers(org_state_dict)
    pipe.transformer_2.load_lora_adapter(converted_state_dict, adapter_name="lightx2v_t2")
    pipe.transformer_2.set_adapters(["lightx2v_t2"], weights=[1.0])
    pipe.fuse_lora(adapter_names=["lightx2v_t2"], lora_scale=1., components=["transformer_2"])

pipe.unload_lora_weights()

# torch.compile
# pipe.transformer.to(memory_format=torch.channels_last)
pipe.transformer = torch.compile(
    pipe.transformer, mode="max-autotune", fullgraph=True
)
# pipe.transformer_2.to(memory_format=torch.channels_last)
pipe.transformer_2 = torch.compile(
    pipe.transformer_2, mode="max-autotune", fullgraph=True
)

# group offload
apply_group_offloading(
    pipe.transformer,
    offload_type="leaf_level",
    offload_device=torch.device("cpu"),
    onload_device=torch.device("cuda"),
    use_stream=True,
)
apply_group_offloading(
    pipe.transformer_2,
    offload_type="leaf_level",
    offload_device=torch.device("cpu"),
    onload_device=torch.device("cuda"),
    use_stream=True,
)
apply_group_offloading(
    pipe.text_encoder,
    offload_device=torch.device("cpu"),
    onload_device=torch.device("cuda"),
    offload_type="leaf_level",
    use_stream=True,
)
apply_group_offloading(
    pipe.vae,
    offload_device=torch.device("cpu"),
    onload_device=torch.device("cuda"),
    offload_type="leaf_level",
    use_stream=True,
)

# set efficient attention backend
pipe.transformer.set_attention_backend("flash")


# for i in range(warmup_steps):
#     frames = pipe(input_image, "animate", num_inference_steps=6, guidance_scale=1.0).frames[0]

start_time = time.time()
frames = pipe(input_image, "animate", num_inference_steps=6, guidance_scale=1.0).frames[0]
end_time = time.time()
print(f"Time taken: {end_time - start_time} seconds")
export_to_video(frames, "/data/code/haobang.geng/code/online_storymv_generate/temp/temp_output/output.mp4",fps=15)

Request

1. Can the Wan2.2 transformer support channels_last? (It currently fails because channels_last rejects tensors whose rank is not 4.)
2. Can the team patch torch.compile compatibility for the Wan2.2 T2V/I2V transformers?
3. Are there recommended compiler flags (e.g., dynamic=True, fullgraph=False) that work reliably for Wan2.2?

@sayakpaul
Copy link
Member

The full code is not reproducible because it contains paths local to your system.

1. Can the Wan2.2 transformer support channels_last?

No, because of the following (you should open an issue on the PyTorch repository, since we don't control this behaviour):

RuntimeError: required rank 4 tensor to use channels_last format
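
For reference, a minimal sketch of the constraint behind this error (the rank-5 tensor below stands in for the Conv3d patch_embedding weight; torch.channels_last only applies to rank-4 tensors, and the rank-5 counterpart is torch.channels_last_3d):

import torch

w = torch.randn(2, 3, 1, 2, 2)  # rank 5, like the Conv3d patch_embedding weight
try:
    w.to(memory_format=torch.channels_last)     # raises: required rank 4 tensor to use channels_last format
except RuntimeError as e:
    print(e)
w = w.to(memory_format=torch.channels_last_3d)  # the rank-5 variant works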


2. Can the team patch torch.compile compatibility?

We support it. From your error trace, you seem to have enabled group offloading, which is not compatible with fullgraph=True, so you should not set it to True. Also, enable compilation after applying group offloading (see the sketch after the quoted trace below).

from user code:
   File "/data/conda_envs/haobang.geng/lib/python3.10/site-packages/diffusers/hooks/hooks.py", line 189, in new_forward
    output = function_reference.forward(*args, **kwargs)
  File "/data/conda_envs/haobang.geng/lib/python3.10/site-packages/diffusers/hooks/hooks.py", line 188, in new_forward
    args, kwargs = function_reference.pre_forward(module, *args, **kwargs)
  File "/data/conda_envs/haobang.geng/lib/python3.10/site-packages/diffusers/hooks/group_offloading.py", line 304, in pre_forward
    self.group.onload_()
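
A minimal sketch of the suggested ordering, reusing the names from the snippet above (pipe is assumed to be the already-loaded Wan pipeline; group offloading is applied first, then compilation without fullgraph=True; mode="max-autotune" is kept only because the original snippet used it):

import torch
from diffusers.hooks import apply_group_offloading

# 1) apply group offloading first
apply_group_offloading(
    pipe.transformer,
    offload_type="leaf_level",
    offload_device=torch.device("cpu"),
    onload_device=torch.device("cuda"),
    use_stream=True,
)
apply_group_offloading(
    pipe.transformer_2,
    offload_type="leaf_level",
    offload_device=torch.device("cpu"),
    onload_device=torch.device("cuda"),
    use_stream=True,
)

# 2) then compile; fullgraph=True is not compatible with the offloading hooks, so leave it off
pipe.transformer = torch.compile(pipe.transformer, mode="max-autotune")
pipe.transformer_2 = torch.compile(pipe.transformer_2, mode="max-autotune")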
