Conversation

@akaitsuki-ii (Contributor)

No description provided.

@akaitsuki-ii akaitsuki-ii requested a review from a team December 2, 2025 06:50
@gemini-code-assist (Contributor)

Summary of Changes

Hello @akaitsuki-ii, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers get up to speed quickly.

This pull request significantly refactors the FP8 linear implementation in diffsynth_engine. The changes improve modularity and control over FP8 inference by introducing dedicated FP8 linear modules and shifting responsibility for enabling FP8 from global context managers to model-specific methods. This allows FP8 quantization to be applied precisely to the relevant linear layers, improving flexibility and maintainability.

Highlights

  • Refactored FP8 Linear Implementation: The core FP8 linear logic has been moved from a global context manager and utility functions to dedicated FP8Linear and LoRAFP8Linear modules, offering a more structured and explicit approach to FP8 operations.
  • Model-Specific FP8 Activation: Models now expose an enable_fp8_linear method, allowing them to explicitly convert their internal nn.Linear layers to FP8-compatible versions (specifically LoRAFP8Linear) when FP8 inference is desired, providing granular control.
  • Centralized Autocasting Utilities: The enable_fp8_autocast functionality has been relocated to a new diffsynth_engine/utils/autocast.py file, separating general autocasting from specific FP8 linear module implementations for better organization.

@gemini-code-assist (bot) left a comment


Code Review

This pull request refactors the FP8 linear layer implementation by replacing the previous monkey-patching approach with a more explicit and modular design. This is a significant improvement for code clarity and maintainability. The new implementation uses a dedicated fp8_linear function and LoRAFP8Linear module, which are consistently applied across various models. I've identified one critical issue that will cause a runtime error and a few medium-severity suggestions to improve the robustness of the module selection logic.

    )
    if config.use_fp8_linear:
-       enable_fp8_linear(dit)
+       dit.enable_fp8_linear()

critical

This call to dit.enable_fp8_linear() will raise an AttributeError at runtime. The dit object, an instance of WanS2VDiT, does not have the enable_fp8_linear method defined.

To fix this, you should implement the enable_fp8_linear method in the WanS2VDiT class in diffsynth_engine/models/wan/wan_s2v_dit.py. It seems this was overlooked during the refactoring.

You can add a method similar to the one in WanDiT:

from diffsynth_engine.models.basic.lora import LoRAFP8Linear

...

class WanS2VDiT(WanDiT):
    ...
    def enable_fp8_linear(self):
        target_names = ["blocks"]
        for name, module in self.named_modules():
            if any(name.startswith(f"{t}.") for t in target_names) and isinstance(module, nn.Linear):
                self.set_submodule(name, LoRAFP8Linear.from_linear(module))

Note that you'll also need to import LoRAFP8Linear in diffsynth_engine/models/wan/wan_s2v_dit.py.

def enable_fp8_linear(self):
    target_names = ["blocks", "single_blocks"]
    for name, module in self.named_modules():
        if any([t in name for t in target_names]) and isinstance(module, nn.Linear):

medium

The condition any([t in name for t in target_names]) to identify target modules is a bit loose and could lead to incorrect module replacement if a module name incidentally contains one of the target names (e.g., a new module named archived_blocks). Using startswith would be more robust and ensure that only modules within the specified parent modules are matched.

Suggested change
-        if any([t in name for t in target_names]) and isinstance(module, nn.Linear):
+        if any(name.startswith(f"{t}.") for t in target_names) and isinstance(module, nn.Linear):
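As a quick illustration of the over-matching described above (the module names here are hypothetical, chosen only to show the difference between the two predicates):

```python
# Substring matching catches any module name that merely contains "blocks",
# including a hypothetical "archived_blocks" module; prefix matching only
# catches submodules of the module actually named "blocks".
target_names = ["blocks"]

def substring_match(name: str) -> bool:
    return any(t in name for t in target_names)

def prefix_match(name: str) -> bool:
    return any(name.startswith(f"{t}.") for t in target_names)

print(substring_match("blocks.0.attn.to_q"))      # True  (intended match)
print(prefix_match("blocks.0.attn.to_q"))         # True  (intended match)
print(substring_match("archived_blocks.0.proj"))  # True  (false positive)
print(prefix_match("archived_blocks.0.proj"))     # False (correctly excluded)
```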

def enable_fp8_linear(self):
    target_names = ["transformer_blocks"]
    for name, module in self.named_modules():
        if any([t in name for t in target_names]) and isinstance(module, nn.Linear):

medium

The condition any([t in name for t in target_names]) to identify target modules is a bit loose and could lead to incorrect module replacement if a module name incidentally contains one of the target names. Using startswith would be more robust and ensure that only modules within the specified parent modules are matched.

Suggested change
-        if any([t in name for t in target_names]) and isinstance(module, nn.Linear):
+        if any(name.startswith(f"{t}.") for t in target_names) and isinstance(module, nn.Linear):

def enable_fp8_linear(self):
    target_names = ["blocks"]
    for name, module in self.named_modules():
        if any([t in name for t in target_names]) and isinstance(module, nn.Linear):

medium

The condition any([t in name for t in target_names]) to identify target modules is a bit loose and could lead to incorrect module replacement if a module name incidentally contains one of the target names. Using startswith would be more robust and ensure that only modules within the specified parent modules are matched.

Suggested change
-        if any([t in name for t in target_names]) and isinstance(module, nn.Linear):
+        if any(name.startswith(f"{t}.") for t in target_names) and isinstance(module, nn.Linear):
