Conversation

@sayakpaul sayakpaul commented Nov 28, 2025

What does this PR do?

  • Introduces a dedicated test suite for the Z-Image DiT.
  • Adds the `is_flaky` decorator to `test_inference()` in the Z-Image pipeline test suite.
  • Adds a `return_dict` argument to the `forward()` of the Z-Image DiT, following other models in the library.
    • As a consequence, the model now follows the standard return pattern: a `Transformer2DModelOutput` when `return_dict=True`, otherwise a tuple like `(out,)`.
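The return pattern described above can be sketched as follows. This is a minimal stand-in, not the actual Z-Image `forward()`; the real `Transformer2DModelOutput` lives in diffusers and carries a tensor, but it is mocked here with a plain dataclass so the sketch is self-contained:

```python
# Hedged sketch of the diffusers return_dict convention; the real
# output class is diffusers' Transformer2DModelOutput, mocked here.
from dataclasses import dataclass
from typing import Any


@dataclass
class Transformer2DModelOutput:
    sample: Any


def forward(x, return_dict: bool = True):
    out = x  # stand-in for the actual DiT computation
    if not return_dict:
        # Tuple output for callers that opted out of the dataclass.
        return (out,)
    return Transformer2DModelOutput(sample=out)
```

Callers that pass `return_dict=False` unpack a tuple; everyone else accesses `.sample`, matching the convention used across the library's models.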

Notes

  • The model accepts the hidden states as a `list[torch.Tensor]`, which differs from other models, and the output follows the same type. This is why I had to modify a couple of tests (where it was reasonably easy) to allow this. Tests where that was not straightforward were skipped (such as `test_training`, `test_ema_training`, etc.).
  • The repeated block in this model is `ZImageTransformerBlock`, which is used for `noise_refiner`, `context_refiner`, and `layers`. As a consequence, the inputs recorded for the block vary during compilation, so full compilation with `fullgraph=True` triggers recompilation at least three times.
  • Some of the group offloading tests were skipped because of state that interfered between the tests (as also noted here).
  • The `x_pad_token` and `cap_pad_token` params within the DiT are initialized with `torch.empty()`, possibly for memory efficiency, but they interfere with tests in very weird ways because `torch.empty()` returns uninitialized memory that can contain NaNs. To prevent this from creeping into the tests, I tried adding `is_flaky()` to some of the affected tests, but that didn't help (see this). @JerryWu-code, would it be safe to initialize `x_pad_token` and `cap_pad_token` deterministically, maybe with something like `torch.ones()`? Or do you think it would have memory implications?
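The `torch.empty()` issue in the last note can be illustrated with a toy module. This is a hypothetical illustration, not the actual Z-Image DiT; the parameter names mirror the ones discussed above, but the shapes are assumed:

```python
# Hypothetical sketch: torch.empty() returns uninitialized memory, which
# may contain NaN/inf, while torch.ones() is deterministic and finite.
import torch
import torch.nn as nn


class PadTokenDemo(nn.Module):
    def __init__(self, dim: int, deterministic: bool = True):
        super().__init__()
        init = torch.ones if deterministic else torch.empty
        # Names follow the params discussed in the PR; shapes are assumed.
        self.x_pad_token = nn.Parameter(init(1, dim))
        self.cap_pad_token = nn.Parameter(init(1, dim))


m = PadTokenDemo(dim=8, deterministic=True)
assert torch.isfinite(m.x_pad_token).all()  # guaranteed with torch.ones
```

With `deterministic=False` the same assertion can fail nondeterministically, which is exactly the kind of flakiness described above.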

Minor nits

  • We usually avoid raw `assert` statements inside the model implementations in favor of properly raising errors. Should we follow something similar here, too?
  • There is a `self.scheduler.sigma_min = 0.0` assignment inside the Z-Image pipeline. Maybe I am missing something, but mutating the scheduler from the pipeline seems like an antipattern to me.
  • The signature of `forward()` of the DiT uses shorthand variable names (`x`, `t`, `cap_feats`) instead of the usual `hidden_states`, `timestep`, and `encoder_hidden_states`.
  • Should `_cfg_normalization` and `_cfg_truncation` inside the pipeline be turned into properties like `guidance_scale`?

Maybe we could consider revisiting them (but not a priority perhaps).
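For the first nit, the usual convention can be sketched like this. `check_inputs` is a hypothetical helper, not code from this PR; the point is that a raised error survives `python -O` (which strips `assert`) and gives the user an actionable message:

```python
# Hedged sketch of raising an informative error instead of a bare
# assert. `check_inputs` is a hypothetical validation helper.
def check_inputs(hidden_states):
    # Z-Image's DiT expects a list of tensors, per the notes above.
    if not isinstance(hidden_states, list):
        raise ValueError(
            f"`hidden_states` must be a `list` of tensors, got {type(hidden_states)}."
        )
```

Compare with `assert isinstance(hidden_states, list)`, which is silently removed under `python -O` and raises a bare `AssertionError` otherwise.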

Cc: @JerryWu-code

@sayakpaul sayakpaul requested review from dg845 and yiyixuxu November 28, 2025 13:59
Comment on lines -661 to +636
return x, {}
if not return_dict:
return (x,)

return Transformer2DModelOutput(sample=x)
Should be a very safe change?

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

sayakpaul commented Nov 28, 2025

The tests failing in "Fast tests for PRs / Fast PyTorch Models & Schedulers CPU tests (pull_request)" pass locally, even when run with `CUDA_VISIBLE_DEVICES="" pytest tests/models/transformers/test_models_transformer_z_image.py`.

Edit: it likely fails when `CUDA_VISIBLE_DEVICES="" pytest tests/models/` is run.
