Conversation

@akaitsuki-ii (Contributor)

No description provided.

@akaitsuki-ii akaitsuki-ii requested a review from a team December 2, 2025 06:50
@gemini-code-assist (Contributor)

Summary of Changes

Hello @akaitsuki-ii, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers get up to speed quickly.

This pull request significantly refactors the FP8 linear implementation in diffsynth_engine. The changes improve modularity and control over FP8 inference by introducing dedicated FP8 linear modules and shifting responsibility for enabling FP8 from global context managers to model-specific methods. This allows FP8 quantization to be applied precisely to the relevant linear layers, improving flexibility and maintainability.

Highlights

  • Refactored FP8 Linear Implementation: The core FP8 linear logic has been moved from a global context manager and utility functions to dedicated FP8Linear and LoRAFP8Linear modules, offering a more structured and explicit approach to FP8 operations.
  • Model-Specific FP8 Activation: Models now expose an enable_fp8_linear method, allowing them to explicitly convert their internal nn.Linear layers to FP8-compatible versions (specifically LoRAFP8Linear) when FP8 inference is desired, providing granular control.
  • Centralized Autocasting Utilities: The enable_fp8_autocast functionality has been relocated to a new diffsynth_engine/utils/autocast.py file, separating general autocasting from specific FP8 linear module implementations for better organization.

@gemini-code-assist (bot) left a comment


Code Review

This pull request refactors the FP8 linear layer implementation by replacing the previous monkey-patching approach with a more explicit and modular design. This is a significant improvement for code clarity and maintainability. The new implementation uses a dedicated fp8_linear function and LoRAFP8Linear module, which are consistently applied across various models. I've identified one critical issue that will cause a runtime error and a few medium-severity suggestions to improve the robustness of the module selection logic.

    )
    if config.use_fp8_linear:
-       enable_fp8_linear(dit)
+       dit.enable_fp8_linear()

critical

This call to dit.enable_fp8_linear() will raise an AttributeError at runtime. The dit object, an instance of WanS2VDiT, does not have the enable_fp8_linear method defined.

To fix this, you should implement the enable_fp8_linear method in the WanS2VDiT class in diffsynth_engine/models/wan/wan_s2v_dit.py. It seems this was overlooked during the refactoring.

You can add a method similar to the one in WanDiT:

from diffsynth_engine.models.basic.lora import LoRAFP8Linear

...

class WanS2VDiT(WanDiT):
    ...
    def enable_fp8_linear(self):
        target_names = ["blocks"]
        for name, module in self.named_modules():
            if any(name.startswith(f"{t}.") for t in target_names) and isinstance(module, nn.Linear):
                self.set_submodule(name, LoRAFP8Linear.from_linear(module))

Note that you'll also need to import LoRAFP8Linear in diffsynth_engine/models/wan/wan_s2v_dit.py.

def enable_fp8_linear(self):
    target_names = ["blocks", "single_blocks"]
    for name, module in self.named_modules():
        if any([t in name for t in target_names]) and isinstance(module, nn.Linear):

medium

The condition any([t in name for t in target_names]) to identify target modules is a bit loose and could lead to incorrect module replacement if a module name incidentally contains one of the target names (e.g., a new module named archived_blocks). Using startswith would be more robust and ensure that only modules within the specified parent modules are matched.

Suggested change
-        if any([t in name for t in target_names]) and isinstance(module, nn.Linear):
+        if any(name.startswith(f"{t}.") for t in target_names) and isinstance(module, nn.Linear):
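As a quick illustration of the over-matching described above (the module names here are hypothetical, chosen only to show the difference between the two predicates):

```python
# Substring matching catches any module name that merely contains "blocks",
# including a hypothetical "archived_blocks" module; prefix matching only
# catches submodules of the module actually named "blocks".
target_names = ["blocks"]

def substring_match(name: str) -> bool:
    return any(t in name for t in target_names)

def prefix_match(name: str) -> bool:
    return any(name.startswith(f"{t}.") for t in target_names)

print(substring_match("blocks.0.attn.to_q"))      # True  (intended match)
print(prefix_match("blocks.0.attn.to_q"))         # True  (intended match)
print(substring_match("archived_blocks.0.proj"))  # True  (false positive)
print(prefix_match("archived_blocks.0.proj"))     # False (correctly excluded)
```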

def enable_fp8_linear(self):
    target_names = ["transformer_blocks"]
    for name, module in self.named_modules():
        if any([t in name for t in target_names]) and isinstance(module, nn.Linear):

medium

The condition any([t in name for t in target_names]) to identify target modules is a bit loose and could lead to incorrect module replacement if a module name incidentally contains one of the target names. Using startswith would be more robust and ensure that only modules within the specified parent modules are matched.

Suggested change
-        if any([t in name for t in target_names]) and isinstance(module, nn.Linear):
+        if any(name.startswith(f"{t}.") for t in target_names) and isinstance(module, nn.Linear):

def enable_fp8_linear(self):
    target_names = ["blocks"]
    for name, module in self.named_modules():
        if any([t in name for t in target_names]) and isinstance(module, nn.Linear):

medium

The condition any([t in name for t in target_names]) to identify target modules is a bit loose and could lead to incorrect module replacement if a module name incidentally contains one of the target names. Using startswith would be more robust and ensure that only modules within the specified parent modules are matched.

Suggested change
-        if any([t in name for t in target_names]) and isinstance(module, nn.Linear):
+        if any(name.startswith(f"{t}.") for t in target_names) and isinstance(module, nn.Linear):
