
feat(usb): add fast fail attach#2067

Open
danilrwx wants to merge 13 commits into main from
fix/usb/add-fast-fail-attach

Conversation

@danilrwx
Contributor

@danilrwx danilrwx commented Mar 5, 2026

Description

This PR now contains two groups of changes:

  1. USB attach fast-fail logic for USBIP port exhaustion:
  • add shared speed-aware USB helpers (pkg/common/usb/speed.go + tests);
  • update the VM USB validator to check available USBIP ports per speed class (HS/SS) for new cross-node attachments;
  • update the VM USB attach handler to stop futile attach attempts and requeue when the target node has no compatible free USBIP ports;
  • update the USBDevice lifecycle status with an explicit NoFreeUSBIPPort reason.
  2. Existing branch fixes included in this PR scope:
  • e2e stabilization (better node diagnostics dump, clearer VM agent readiness error, timeout tuning);
  • switch some e2e image sources from Alpine BIOS image to Ubuntu image;
  • werf bundle import path fix in release-channel-version image stage.
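
To make the per-speed-class check above concrete, here is a minimal sketch of how a validator might count the ports that new cross-node devices need and compare them with what the target node has free. It is not the code from pkg/common/usb/speed.go: the names SpeedClass, NodePorts, ClassifySpeed, ValidateNewAttachments, and ErrNoFreePort are hypothetical, the 5000 Mbps threshold for the SS class is an assumption, and the port counts in main are illustrative.

```go
package main

import (
	"errors"
	"fmt"
)

// SpeedClass is a hypothetical name for the two USBIP hub classes
// mentioned in the description (HS and SS).
type SpeedClass string

const (
	HighSpeed  SpeedClass = "HS" // assumed: devices below 5000 Mbps
	SuperSpeed SpeedClass = "SS" // assumed: devices at 5000 Mbps and above
)

// ErrNoFreePort is an illustrative error value, not the PR's actual one.
var ErrNoFreePort = errors.New("no free USBIP port")

// ClassifySpeed maps a device speed in Mbps to a hub class.
// The 5000 Mbps boundary is an assumption of this sketch.
func ClassifySpeed(mbps int) SpeedClass {
	if mbps >= 5000 {
		return SuperSpeed
	}
	return HighSpeed
}

// NodePorts holds total and used USBIP ports per class for one node.
type NodePorts struct {
	Total map[SpeedClass]int
	Used  map[SpeedClass]int
}

// Free returns the number of unused ports of the given class, never negative.
func (n NodePorts) Free(c SpeedClass) int {
	free := n.Total[c] - n.Used[c]
	if free < 0 {
		return 0
	}
	return free
}

// ValidateNewAttachments counts how many ports of each class the new
// devices need and fails as soon as one class is over capacity.
func ValidateNewAttachments(n NodePorts, newDeviceSpeeds []int) error {
	need := map[SpeedClass]int{}
	for _, mbps := range newDeviceSpeeds {
		need[ClassifySpeed(mbps)]++
	}
	for _, c := range []SpeedClass{HighSpeed, SuperSpeed} {
		if need[c] > n.Free(c) {
			return fmt.Errorf("%w: class %s needs %d, free %d",
				ErrNoFreePort, c, need[c], n.Free(c))
		}
	}
	return nil
}

func main() {
	// Illustrative numbers: all HS ports are taken, some SS ports are free.
	node := NodePorts{
		Total: map[SpeedClass]int{HighSpeed: 8, SuperSpeed: 8},
		Used:  map[SpeedClass]int{HighSpeed: 8, SuperSpeed: 3},
	}
	fmt.Println(ValidateNewAttachments(node, []int{5000})) // an SS device still fits
	fmt.Println(ValidateNewAttachments(node, []int{480}))  // an HS device is rejected
}
```

Counting per class rather than per node matters because a node can have free SS ports while its HS ports are exhausted, or the other way around.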

Why do we need it, and what problem does it solve?

For USB flows, attach operations could repeatedly retry even when the target node had no free USBIP ports for the required speed class. This created noisy retries and delayed feedback.

With these changes, validation and reconciliation fail earlier and report a clear reason, so behavior is predictable and troubleshooting is easier.

Additional e2e/werf changes improve test stability and packaging correctness for the current branch content.

What is the expected result?

  1. Run VM with USB devices requiring cross-node USBIP forwarding.
  2. Exhaust HS or SS hub ports on the target node.
  3. Request one more USB device of the exhausted speed class.
  4. Verify:
    • validation rejects over-capacity updates;
    • attach handler does not loop blindly and requeues;
    • USBDevice condition reflects NoFreeUSBIPPort when applicable.
  5. Run affected e2e suites and confirm improved diagnostics and stable image usage.
  6. Verify bundle stage still includes module.yaml correctly after werf import path fix.
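
The handler behaviour that step 4 verifies can be sketched roughly as follows. This is a simplified model, not the actual attach handler: AttachDecision and DecideAttach are hypothetical names, and the retry delay is an assumed value. Only the NoFreeUSBIPPort reason string comes from the description above.

```go
package main

import (
	"fmt"
	"time"
)

// ReasonNoFreeUSBIPPort mirrors the reason named in the PR description.
const ReasonNoFreeUSBIPPort = "NoFreeUSBIPPort"

// AttachDecision is a hypothetical, simplified stand-in for what a
// reconcile step might return: attach now, or report a reason and retry later.
type AttachDecision struct {
	Attach       bool
	Reason       string
	RequeueAfter time.Duration
}

// DecideAttach fails fast when the target node has no compatible free port.
// Instead of issuing an attach that cannot succeed, it records the reason
// and asks to be requeued after retryDelay.
func DecideAttach(freePorts int, retryDelay time.Duration) AttachDecision {
	if freePorts <= 0 {
		return AttachDecision{
			Reason:       ReasonNoFreeUSBIPPort,
			RequeueAfter: retryDelay,
		}
	}
	return AttachDecision{Attach: true}
}

func main() {
	delay := 30 * time.Second // assumed delay, chosen only for illustration
	fmt.Printf("%+v\n", DecideAttach(0, delay))
	fmt.Printf("%+v\n", DecideAttach(2, delay))
}
```

Requeueing with a delay keeps the device eligible for attachment once a port is released, while the explicit reason gives immediate feedback on why nothing happened.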

Checklist

  • The code is covered by unit tests.
  • e2e tests passed.
  • Documentation updated according to the changes.
  • Changes were tested in the Kubernetes cluster manually.

Changelog entries

section: core
type: fix
summary: "Add fast-fail USB attach checks and speed-aware USBIP port validation for HS/SS hubs."
impact_level: low

@danilrwx danilrwx changed the base branch from main to feat/improve-scheduling-for-hs-ss-hubs March 5, 2026 09:12
@yaroslavborbat yaroslavborbat force-pushed the feat/improve-scheduling-for-hs-ss-hubs branch 5 times, most recently from 769ea8e to bfdeea0 Compare March 10, 2026 08:45
@danilrwx danilrwx force-pushed the fix/usb/add-fast-fail-attach branch from e431939 to efcb2f6 Compare March 12, 2026 10:50
Base automatically changed from feat/improve-scheduling-for-hs-ss-hubs to main March 17, 2026 14:05
@danilrwx danilrwx force-pushed the fix/usb/add-fast-fail-attach branch from efcb2f6 to 0091b74 Compare March 18, 2026 17:16
@danilrwx danilrwx marked this pull request as ready for review March 18, 2026 17:16
@danilrwx danilrwx added this to the v1.7.0 milestone Mar 18, 2026
yaroslavborbat previously approved these changes Mar 24, 2026
- Remove call to undefined validateAvailableUSBIPPortsDefault
- Replace getNodeTotalPorts with direct client.Get for Node
- Use usb.CheckFreePortForRequest from common/usb package
- Remove unused imports

Signed-off-by: Daniil Antoshin <daniil.antoshin@flant.com>
Check free USBIP ports only for new attachments, not for devices that are
already attached in KVVMI. This prevents resetting attached: true to
false when ports are exhausted but the device is already working.

Fixes the issue where attached USB devices were marked as detached
due to 'no free USBIP ports available' logs.

Signed-off-by: Daniil Antoshin <daniil.antoshin@flant.com>
Signed-off-by: Daniil Antoshin <daniil.antoshin@flant.com>
When a USB device attach request is already in flight (existingStatus exists
but not yet attached), skip the USBIP port availability check. The port
was available when the request was made, and re-checking would cause devices
to get stuck if ports become exhausted mid-flight while DRA/KubeVirt
is still processing the request.

Signed-off-by: Daniil Antoshin <daniil.antoshin@flant.com>
Use Ready=true && Attached=false instead of just Attached=false to determine
if a USB device attach request is already in flight. This is more reliable
because:
- existingStatus may exist from the start when a VM is created with a USB device
- Ready=true means the USBDevice is ready and ResourceClaimTemplate exists
- Attached=false means KVVMI has not yet reported the device as attached

This ensures we only skip port checks when the request has actually been sent,
not just when any status exists.

Signed-off-by: Daniil Antoshin <daniil.antoshin@flant.com>
Check USBIP port availability only when device is added for the first time
(existingStatus == nil). Once a device has a status, the attach request was
already sent and ports were available at that time. This prevents devices
from getting stuck if ports are exhausted mid-flight while DRA processes the
request.

Signed-off-by: Daniil Antoshin <daniil.antoshin@flant.com>
Add helper to check if a device exists in KVVMI regardless of its phase.
Used by usb_device_attach_handler to skip port checks for devices that
are already in flight (exist in KVVMI, even if not yet ready).

Signed-off-by: Daniil Antoshin <daniil.antoshin@flant.com>
Skip free-port validation when the device already appears in KVVMI
host device statuses (attach in progress), matching hostDeviceExistsByName
helper added earlier.

Signed-off-by: Daniil Antoshin <daniil.antoshin@flant.com>
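
The skip rule described in the last two commit messages might look roughly like the sketch below. The helper name hostDeviceExistsByName comes from the commit message, but its signature here is an assumption, and HostDeviceStatus and needsPortCheck are hypothetical simplifications of the real KubeVirt status types and handler code.

```go
package main

import "fmt"

// HostDeviceStatus is a simplified, hypothetical stand-in for the
// per-device entries reported in the KVVMI status.
type HostDeviceStatus struct {
	Name  string
	Phase string
}

// hostDeviceExistsByName reports whether a device is listed at all,
// regardless of its phase. The signature is assumed for this sketch.
func hostDeviceExistsByName(statuses []HostDeviceStatus, name string) bool {
	for _, s := range statuses {
		if s.Name == name {
			return true
		}
	}
	return false
}

// needsPortCheck returns true only for devices that KVVMI does not know
// about yet. A device that already appears there is attached or in flight,
// so it is not re-validated against free ports.
func needsPortCheck(statuses []HostDeviceStatus, name string) bool {
	return !hostDeviceExistsByName(statuses, name)
}

func main() {
	statuses := []HostDeviceStatus{
		{Name: "usb-cam", Phase: "Pending"}, // in flight, not ready yet
		{Name: "usb-key", Phase: "Ready"},   // already attached
	}
	for _, name := range []string{"usb-cam", "usb-key", "usb-new"} {
		fmt.Printf("%s: check ports = %v\n", name, needsPortCheck(statuses, name))
	}
}
```

Keying the decision on presence in KVVMI rather than on phase is what keeps an in-flight device from getting stuck if ports run out while the request is still being processed.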
@danilrwx danilrwx force-pushed the fix/usb/add-fast-fail-attach branch from bd2a95c to 3b7852d Compare March 25, 2026 18:26
…r device sets

Signed-off-by: Daniil Antoshin <daniil.antoshin@flant.com>
Signed-off-by: Vladislav Panfilov <97229646+prismagod@users.noreply.github.com>


3 participants