
Conversation

@AlanPonnachan
Contributor

What does this PR do?

This PR adds support for MagCache, a training-free inference acceleration method for diffusion models, specifically targeting Transformer-based architectures such as Flux.

This implementation follows the ModelHook pattern (similar to FirstBlockCache) to integrate seamlessly into Diffusers.

Key features:

  • MagCacheConfig: Configuration class to control threshold, retention ratio, and skipping limits.
  • apply_mag_cache: Helper function to attach the hooks to a model.
  • Default Ratios: Includes pre-computed magnitude decay ratios for Flux models (Dev/Schnell) as per the official implementation.
  • Mechanism: The hook calculates the accumulated error of the residual magnitude at each step. If the error is below the defined threshold, it skips the computation of the transformer blocks and approximates the output using the residual from the previous step.
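The skipping decision described in the Mechanism bullet can be sketched roughly as follows. This is an illustrative toy, not the PR's actual implementation: the function name `plan_skips`, its parameter names, and the exact decision rule are assumptions for exposition, though they mirror the threshold / retention-ratio / skip-limit knobs that `MagCacheConfig` exposes.

```python
# Hypothetical sketch of MagCache-style step skipping (NOT the PR's actual code).
# Given pre-computed magnitude ratios (one per denoising step), accumulate an
# error estimate; while it stays under the threshold and the consecutive-skip
# budget is not exhausted, skip the transformer blocks and reuse the cached
# residual from the last computed step.

def plan_skips(mag_ratios, threshold=0.06, max_consecutive_skips=2, retention_ratio=0.2):
    """Return a per-step plan: True = compute the blocks, False = skip them."""
    num_steps = len(mag_ratios)
    retained = int(num_steps * retention_ratio)  # always compute the earliest steps
    plan = []
    accumulated_error = 0.0
    consecutive_skips = 0
    for step, ratio in enumerate(mag_ratios):
        # Deviation of the residual magnitude from the previous step's.
        accumulated_error += abs(1.0 - ratio)
        if (
            step >= retained
            and accumulated_error <= threshold
            and consecutive_skips < max_consecutive_skips
        ):
            plan.append(False)  # skip: approximate output via the cached residual
            consecutive_skips += 1
        else:
            plan.append(True)   # compute: run the transformer blocks, reset state
            accumulated_error = 0.0
            consecutive_skips = 0
    return plan
```

With perfectly stable magnitudes (`mag_ratios` all 1.0), this plan computes the retained early steps, then alternates between skipping up to the budget and recomputing to refresh the cached residual.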

Fixes #12697


Who can review?

@sayakpaul

@sayakpaul sayakpaul requested a review from DN6 November 29, 2025 06:19
@sayakpaul
Member

@leffff could you review as well if possible?

