### Running FlashVSR on lower VRAM without any artifacts.
### 08.12.2025
Optimization: Ported all VRAM optimizations and Tiled VAE support to tiny-long mode, ensuring feature parity across all modes.
Performance: Optimized for Windows environments (speed improvements).
Update: General codebase cleanup and synchronization.
### 07.12.2025
VRAM Optimization: Implemented auto-fallback for process_chunk. If an out-of-memory (OOM) error occurs, processing is automatically retried with tiled_vae=True and then with tiled_dit=True, preventing crashes.
Critical Fix: Fixed a bug in the non-tiled processing path where output was undefined.
Optimization: Deferred VAE loading in full mode until strictly necessary, significantly reducing peak VRAM usage.
Optimization: Added a proactive warning when VRAM usage reaches 90%.
Refactor: Rewrote progress bar to use single-line in-place updates (\r) for cleaner console output.
Defaults: Updated default settings for FlashVSR Ultra-Fast node to be safer for 16GB cards (unload_dit=True, tiled options enabled).
Bug Fix: Fixed AttributeError in full mode by adding a fallback mechanism to manually load the VAE model if the model manager fails.
Bug Fix: Fixed the progress bar to correctly display status in ComfyUI using the cqdm wrapper. Added text-based progress bar to logs.
Sync: Enabled VAE spatial tiling for tiny mode, bringing VRAM savings from tiny-long to the standard fast pipeline.
Documentation: Expanded tooltips for all node parameters and added detailed usage instructions to README.
New Feature: Added frame_chunk_size option to split large videos into chunks, enabling processing of large files on limited VRAM by offloading to CPU.
Enhancement: Improved logging to show detailed resource usage (RAM, Peak VRAM, per-step timing) and model configuration details.
Optimization: Added torch.cuda.ipc_collect() for better memory cleanup.
New Feature: Added attention_mode selection with support for flash_attention_2, sdpa, sparse_sage, and block_sparse backends.
Refactor: Cleaned up code and improved error handling for imports.
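
The chunking and auto-fallback entries above can be illustrated together. The following is a minimal sketch in plain Python, not the node's actual code: `split_into_chunks`, `process_with_fallback`, and `process_video` are hypothetical helper names, `process_chunk` is assumed to accept `tiled_vae` and `tiled_dit` keyword arguments, and the assumption that the last retry keeps `tiled_vae=True` is a guess at the order described in the changelog. `MemoryError` stands in for the CUDA out-of-memory exception so the sketch runs without a GPU.

```python
from typing import Any, Callable, Iterator, List, Sequence, Type


def split_into_chunks(num_frames: int, frame_chunk_size: int) -> Iterator[range]:
    """Yield consecutive frame-index ranges of at most frame_chunk_size frames."""
    if frame_chunk_size <= 0:
        # Assumption: a non-positive size means "do not split".
        yield range(0, num_frames)
        return
    for start in range(0, num_frames, frame_chunk_size):
        yield range(start, min(start + frame_chunk_size, num_frames))


def process_with_fallback(
    process_chunk: Callable[..., List[Any]],
    frames: Sequence[Any],
    oom_error: Type[BaseException] = MemoryError,
) -> List[Any]:
    """Call process_chunk, retrying with more aggressive tiling after an OOM."""
    # Assumed retry order: no tiling, then tiled VAE, then tiled VAE + tiled DiT.
    attempts = [
        {"tiled_vae": False, "tiled_dit": False},
        {"tiled_vae": True, "tiled_dit": False},
        {"tiled_vae": True, "tiled_dit": True},
    ]
    last_error = None
    for options in attempts:
        try:
            return process_chunk(frames, **options)
        except oom_error as exc:
            # A real pipeline would also release cached GPU memory here
            # before retrying; that step is omitted in this sketch.
            last_error = exc
    # Every configuration ran out of memory: surface the last error.
    raise last_error


def process_video(
    process_chunk: Callable[..., List[Any]],
    frames: Sequence[Any],
    frame_chunk_size: int,
    oom_error: Type[BaseException] = MemoryError,
) -> List[Any]:
    """Process frames chunk by chunk and concatenate the results in order."""
    output: List[Any] = []
    for indices in split_into_chunks(len(frames), frame_chunk_size):
        chunk = frames[indices.start:indices.stop]
        output.extend(process_with_fallback(process_chunk, chunk, oom_error))
    return output
```

With PyTorch, `oom_error` would typically be `torch.cuda.OutOfMemoryError`. Keeping the retry ladder as a plain list of option dictionaries makes the order easy to change if a different fallback sequence is wanted.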
### 06.12.2025
Bug Fix: Fixed a shape mismatch error for small input frames by implementing correct padding logic.
Optimization: VRAM is now immediately freed at the start of processing to prevent OOM errors.
New Feature: Added enable_debug option for extensive logging.
New Feature: Added keep_models_on_cpu option to keep models in RAM (CPU) instead of VRAM.
Enhancement: Added accurate FPS calculation and peak VRAM reporting.
Optimization: Replaced einops operations with native PyTorch ops.
Optimization: Added "Conv3d memory workaround".
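
The changelog does not say how the padding fix for small input frames works. As an illustration only, one common approach is to round each spatial dimension up to the next multiple of the size the model expects. In the sketch below, `pad_amounts` and its `multiple` parameter are hypothetical names, and the value 128 used in the usage note is an arbitrary example rather than a value taken from FlashVSR.

```python
from typing import Tuple


def pad_amounts(height: int, width: int, multiple: int) -> Tuple[int, int]:
    """Return (pad_bottom, pad_right) that round both dimensions up to a multiple."""
    if multiple <= 0:
        raise ValueError("multiple must be positive")
    # (-n) % m is the distance from n up to the next multiple of m (0 if aligned).
    pad_bottom = (-height) % multiple
    pad_right = (-width) % multiple
    return pad_bottom, pad_right
```

For example, `pad_amounts(100, 130, 128)` gives `(28, 126)`, so a 100x130 frame would be padded to 128x256. In PyTorch the padding itself could be applied with `torch.nn.functional.pad` and cropped off again after upscaling.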