bump patchset to v52 by phip1611 · Pull Request #153 · cyberus-technology/cloud-hypervisor

phip1611 · 2026-04-30T09:16:47Z

This series bumps the gardenlinux Cloud Hypervisor patchset onto the current
base (soon to be released as v52).

You can find an overview of the difficulties during the rebase in this outline document (trivial patches, hard to rebase patches, patches that are now upstream...).

From the 248 commits in the current gardenlinux branch, we are now down to ~158 (once TLS is merged upstream). I expect the v52 release to happen very soon.

Changes & Hints for Reviewers

- The commits that are still here exist with the same name in the old gardenlinux branch.
- I reordered the patchset quite significantly: small standalone commits are mostly moved to the beginning where it makes sense, followed by larger series.
- All commits of a series were consolidated, moved together, and sometimes even squashed (init A -> ... -> fix A commits were squashed).
  - For example, the whole CPU Profiles effort is now a single commit series at the end of our patchset.
- This was by far the toughest patchset rebase we have had so far.
- Beware: I am unfortunately pretty sure that I've missed minor changes of our gardenlinux branch in that rebase process, for example some error message improvement, but nothing major. This comes from the nature of the complex operation I had to do here.
- Changes I had to make against upstream to work with our stack:
  - rename pci_device_id from upstream back to device_id to be compatible with us
  - remove mutual TLS (mTLS) (use normal TLS)
- libvirt pipeline run: https://gitlab.cyberus-technology.de/cyberus/cloud/libvirt/-/merge_requests/194/pipelines

The result is a shorter and more reviewable branch than
cyberus-github/gardenlinux while preserving the relevant Gardenlinux behavior
on top of the current Cloud Hypervisor base.

Ticket: https://github.com/cobaltcore-dev/cobaltcore/issues/503#issuecomment-4311454443

phip1611 · 2026-04-30T09:45:54Z

@olivereanderson please take a brief look. I grouped all your commits and brought them into consecutive order. Once cloud-hypervisor#8029 is merged - what are the implications for our fork? What is your recommendation to keep the patchset working and maintainable? What are your thoughts and ideas?

olivereanderson · 2026-04-30T09:49:50Z

> @olivereanderson please take a brief look. I grouped all your commits and brought them into consecutive order. Once cloud-hypervisor#8029 is merged - what are the implications for our fork? What is your recommendation to keep the patchset working and maintainable? What are your thoughts and ideas?

I plan to backport cloud-hypervisor#8029 as soon as it is merged because the code is simply better.

phip1611 · 2026-04-30T10:05:43Z

If possible, I'd prefer to not merge (or backport) anything into gardenlinux before we finish this. But we can plan this together next week as well!

olivereanderson · 2026-04-30T10:27:57Z

> If possible, I'd prefer to not merge (or backport) anything into gardenlinux before we finish this. But we can plan this together next week as well!

We can definitely merge this PR (v52) first. Let's discuss further next week 🙂

This extends migration to also support paused VMs, preserving the paused state on the destination.

Changes:
- Add CompletePaused protocol command that finalizes migration without resuming the VM on the destination
- Skip the pause step during migration if the VM is already paused
- On migration failure, only restore the running state if the VM was originally running (not paused)

Signed-off-by: Nguyen Dinh Phi <[email protected]>
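The three changes in this commit message can be read as one decision taken from the VM's state at the start of migration. The following is a minimal sketch of that decision only; `VmState`, `FinalCommand`, `MigrationPlan`, and `plan_migration` are hypothetical names for illustration and are not taken from the Cloud Hypervisor code.

```rust
/// Hypothetical final protocol command sent by the migration source.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum FinalCommand {
    /// Finalize the migration and resume the VM on the destination.
    Complete,
    /// Finalize the migration but leave the VM paused on the destination.
    CompletePaused,
}

/// Hypothetical VM state as seen by the migration source.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum VmState {
    Running,
    Paused,
}

/// Hypothetical summary of how the source should behave.
#[derive(Debug, PartialEq, Eq)]
struct MigrationPlan {
    /// Whether the source must pause the VM before the final transfer.
    pause_before_send: bool,
    /// Which command finalizes the migration.
    final_command: FinalCommand,
    /// State the source VM should be in again if the migration fails.
    state_after_failure: VmState,
}

/// Derive the plan from the state the VM was in when migration started.
fn plan_migration(initial: VmState) -> MigrationPlan {
    match initial {
        VmState::Running => MigrationPlan {
            pause_before_send: true,
            final_command: FinalCommand::Complete,
            state_after_failure: VmState::Running,
        },
        VmState::Paused => MigrationPlan {
            // Already paused: skip the pause step and keep it paused everywhere.
            pause_before_send: false,
            final_command: FinalCommand::CompletePaused,
            state_after_failure: VmState::Paused,
        },
    }
}
```

For example, `plan_migration(VmState::Paused)` yields a plan that skips the pause step and finishes with `CompletePaused`.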

Adding a paused flag to live_migration() tests; when this flag is set, the VM will be paused before migration is performed. Signed-off-by: Nguyen Dinh Phi <[email protected]>

A malicious or buggy guest can issue an MSI-X table write with an unexpected size (not 4 or 8 bytes), triggering an assert!() that crashes the VMM process. Replace the assertion with an error log and early return to maintain VMM stability under adversarial guest behavior. Signed-off-by: Anatol Belski <[email protected]>
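As an illustration of the pattern described here (log and return early instead of asserting), a minimal sketch could look like the following. `MsixTable`, its byte-array layout, and the boolean return value are assumptions made for this example; the actual Cloud Hypervisor types differ.

```rust
/// Hypothetical, simplified MSI-X table backed by raw bytes (16 bytes per entry).
struct MsixTable {
    bytes: Vec<u8>,
}

impl MsixTable {
    fn new(entries: usize) -> Self {
        Self { bytes: vec![0; entries * 16] }
    }

    /// Apply a guest write. Returns `true` if the write was applied.
    /// Unexpected sizes or offsets are logged and ignored instead of panicking.
    fn write(&mut self, offset: usize, data: &[u8]) -> bool {
        // Only 4-byte and 8-byte accesses are expected; anything else is
        // guest-controlled input and must not crash the VMM.
        if data.len() != 4 && data.len() != 8 {
            eprintln!("invalid MSI-X table write size {}, ignoring", data.len());
            return false;
        }
        let end = match offset.checked_add(data.len()) {
            Some(end) if end <= self.bytes.len() => end,
            _ => {
                eprintln!("MSI-X table write at offset {offset} is out of range, ignoring");
                return false;
            }
        };
        self.bytes[offset..end].copy_from_slice(data);
        true
    }
}
```

The design point is that the size comes from the guest, so it is treated as untrusted input rather than as an invariant.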

Rate limiting is implemented in the virtio device layer and does not apply to vhost-user devices which delegate I/O handling to an external process. Add validation to reject configurations where vhost_user is enabled along with rate limiting options (bw_size, ops_size, or rate_limit_group) for both disk and network devices. This prevents users from mistakenly configuring rate limiting that would be silently ignored when using vhost-user backends. Signed-off-by: Rob Bradford <[email protected]>
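The validation rule can be sketched as a single predicate over the device configuration. The struct below is a hypothetical reduction that only reuses the option names mentioned in the commit message (`vhost_user`, `bw_size`, `ops_size`, `rate_limit_group`); the real configuration types and error variants are not shown in this thread.

```rust
/// Hypothetical, reduced device configuration for illustration only.
#[derive(Debug, Default)]
struct DeviceConfig {
    vhost_user: bool,
    bw_size: Option<u64>,
    ops_size: Option<u64>,
    rate_limit_group: Option<String>,
}

/// Hypothetical validation error.
#[derive(Debug, PartialEq, Eq)]
enum ValidationError {
    RateLimitWithVhostUser,
}

/// Reject configurations that combine vhost-user with any rate limiting
/// option, because the limit would be silently ignored by the backend.
fn validate(cfg: &DeviceConfig) -> Result<(), ValidationError> {
    let has_rate_limit =
        cfg.bw_size.is_some() || cfg.ops_size.is_some() || cfg.rate_limit_group.is_some();
    if cfg.vhost_user && has_rate_limit {
        Err(ValidationError::RateLimitWithVhostUser)
    } else {
        Ok(())
    }
}
```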

If this test flakes it can then cause subsequent invocations to fail as the test has left its special test interfaces alive. Signed-off-by: Rob Bradford <[email protected]>

Booting the VM on these tests takes longer so allow longer before timing out the boot. Signed-off-by: Rob Bradford <[email protected]>

Follow the same pattern as other virtio devices using a bool to check if it needs notification and propagating its own Error enum. Sadly this does still use `anyhow!()` but this does match with the behaviour of the other devices in their implementations. As a side effect we can now remove two errors from the top-level Error enum in virtio-devices as these were only used by this module and those errors had mangled descriptions. Signed-off-by: Rob Bradford <[email protected]>

Use into_iter() for test_list when building tests_to_run. This keeps the collected type as Vec<&PerformanceTest>. Signed-off-by: Muminul Islam <[email protected]>

Request::execute and Request::execute_async checked each data descriptor against `disk_nsectors` using the request's fixed start sector. With sector = disk_nsectors-1 and N descriptors of 512 bytes each, every descriptor passed (top = disk_nsectors) but the vectored I/O collectively read/wrote N*512 bytes starting at the last sector — N-1 sectors past EOF. For the io_uring/aio raw backends this lets the guest extend the host disk image beyond its provisioned size, exhausting the host filesystem. For fixed-VHD images (footer at end of file) the same chain overwrites the footer with guest-controlled bytes, corrupting the disk image. Replace the per-descriptor check with a chain-wide check_data_bounds(). Pre-validating the entire request before beginning the operation avoids having to unroll a partial submit. Signed-off-by: Dylan Reid <[email protected]>
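To make the difference between the per-descriptor check and the chain-wide check concrete, here is a minimal sketch. Only the name `check_data_bounds` and the 512-byte sector size come from the commit message; the signature, the error type, and the rounding of partial sectors are assumptions for illustration.

```rust
const SECTOR_SIZE: u64 = 512;

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq, Eq)]
enum BoundsError {
    Overflow,
    PastEndOfDisk,
}

/// Validate the whole descriptor chain at once: the request starts at
/// `start_sector` and transfers the SUM of all descriptor lengths.
fn check_data_bounds(
    start_sector: u64,
    descriptor_lens: &[u32],
    disk_nsectors: u64,
) -> Result<(), BoundsError> {
    let mut total_bytes: u64 = 0;
    for &len in descriptor_lens {
        total_bytes = total_bytes
            .checked_add(u64::from(len))
            .ok_or(BoundsError::Overflow)?;
    }
    // Round a partial trailing sector up so it still counts as touched.
    let sectors = total_bytes / SECTOR_SIZE + u64::from(total_bytes % SECTOR_SIZE != 0);
    let end = start_sector
        .checked_add(sectors)
        .ok_or(BoundsError::Overflow)?;
    if end > disk_nsectors {
        Err(BoundsError::PastEndOfDisk)
    } else {
        Ok(())
    }
}
```

With `disk_nsectors = 100` and a start sector of 99, one 512-byte descriptor is accepted, while two 512-byte descriptors are rejected even though each one would have passed the old per-descriptor check on its own.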

A malicious or buggy guest can write an out-of-bounds value to queue_msix_vector or msix_config. When the device later triggers an interrupt, it indexes into table_entries with the unchecked vector, causing a panic. Validate the vector against the MSI-X table size in both trigger() and notifier() paths, logging a warning and returning early when the vector exceeds the table bounds. Signed-off-by: Anatol Belski <[email protected]>
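A minimal sketch of the bounds check in both paths, assuming a simplified interrupt object. `MsixInterrupt`, its fields, and the `Option<u16>` notifier return type are hypothetical stand-ins; only the `trigger()`/`notifier()` names and `VIRTQ_MSI_NO_VECTOR` are taken from the commit messages in this series, and the value 0xffff is assumed to match the "no vector" value from the virtio specification.

```rust
/// "No vector" marker, assumed to be 0xffff as in the virtio specification.
const VIRTQ_MSI_NO_VECTOR: u16 = 0xffff;

/// Hypothetical, simplified MSI-X interrupt state.
struct MsixInterrupt {
    /// Number of entries in the MSI-X table.
    table_entries: usize,
    /// Vectors that were actually fired (stands in for the interrupt source group).
    fired: Vec<u16>,
}

impl MsixInterrupt {
    /// Fire the interrupt for `vector`, ignoring vectors the guest set out of bounds.
    fn trigger(&mut self, vector: u16) -> Result<(), String> {
        if vector == VIRTQ_MSI_NO_VECTOR {
            return Ok(());
        }
        if usize::from(vector) >= self.table_entries {
            // Guest-controlled value: warn and return instead of indexing.
            eprintln!("MSI-X vector {vector} exceeds table size {}", self.table_entries);
            return Ok(());
        }
        self.fired.push(vector);
        Ok(())
    }

    /// Return a notifier handle (here just the vector) only for valid vectors.
    fn notifier(&self, vector: u16) -> Option<u16> {
        if vector == VIRTQ_MSI_NO_VECTOR || usize::from(vector) >= self.table_entries {
            None
        } else {
            Some(vector)
        }
    }
}
```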

Verify that firing an interrupt with a queue vector beyond the MSI-X table size returns Ok without panicking. Signed-off-by: Anatol Belski <[email protected]>

Verify that triggering an interrupt when the vector is set to VIRTQ_MSI_NO_VECTOR short-circuits and returns Ok. Signed-off-by: Anatol Belski <[email protected]>

Verify that requesting a notifier with an out-of-bounds MSI-X vector returns None instead of panicking. Signed-off-by: Anatol Belski <[email protected]>

Verify that a valid in bounds vector with MSI-X enabled successfully triggers the interrupt source group. Signed-off-by: Anatol Belski <[email protected]>

Verify that firing a config change interrupt with msix_config vector beyond the table size returns Ok without panicking. Signed-off-by: Anatol Belski <[email protected]>

phip1611 · 2026-05-05T08:38:38Z

Normal libvirt-tests (default suite) are already passing.

The unit tests added in bf3279f built sparse files by writing only at one offset and assuming the surrounding pages stayed unallocated. That breaks on shmem/tmpfs with huge=within_size: kernel 6.10+ added large-folio support to shmem, and on first write the kernel allocates one folio whose order is the largest power-of-two number of pages that fits inside the file size (capped at PMD-size). For a 64 KiB test file the very first pwrite anywhere allocates a 64 KiB folio covering the whole file, so SEEK_HOLE never reports a hole and written_pages_show_as_data_extents, sparse_file_yields_extents_at_written_positions, and single_extent_at_zero_offset all fail. memfd_create lives on shmem too and inherits the same THP policy from /sys/kernel/mm/transparent_hugepage/shmem_enabled, so the problem is not /tmp-specific.

Fix the fixtures, not the production code: build each test file via a new sparse_layout() helper that writes the requested data extents and then fallocate(FALLOC_FL_PUNCH_HOLE)s every gap. PUNCH_HOLE is the explicit "deallocate these pages" syscall and is honored by every Linux filesystem we run tests on (tmpfs, ext4, xfs, btrfs); the kernel splits any large folio overlapping the punched range. The resulting SEEK_DATA/SEEK_HOLE map matches the spec exactly regardless of folio/THP policy.

For single_extent_at_zero_offset the dst side still loses to the folio allocator -- writing 8 KiB into a 64 KiB tmpfs file allocates a 64 KiB folio whether we want it or not -- so the previous meta.blocks()-based sparseness assertion (which tested the filesystem, not our code) is replaced with a sentinel pre-fill: dst starts filled with 0xFE and the post-condition is that bytes outside the source-data extent are still 0xFE. That directly verifies write_region_sparse only touched the data extent without depending on dst-side hole reporting.

Side effect: extent_at_non_zero_src_offset, two_regions_in_same_destination_file_at_dst_offset, and round_trip_sparse_write_then_read previously passed by accident on hosts with mTHP-on-shmem -- their src memfds reported the whole file as data so write_region_sparse silently fell into a dense copy of zeros + data. With sparse_layout() the sources are genuinely sparse and those tests now exercise the sparse path on every host.

Tested on tmpfs (huge=within_size) and ext4 (TMPDIR=/var/tmp); all 9 tests pass on both with no skips.

Assisted-by: Claude:Opus-4.7
Signed-off-by: Dylan Reid <[email protected]>
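The sentinel pre-fill idea can be shown without touching the filesystem at all. The sketch below works on in-memory buffers; `copy_extents` is a hypothetical stand-in for the function under test (it is not `write_region_sparse`), and `untouched_outside` is a hypothetical helper expressing the post-condition.

```rust
const SENTINEL: u8 = 0xFE;

/// Hypothetical stand-in for the code under test: copy only the given
/// (offset, length) data extents from `src` to `dst`.
fn copy_extents(src: &[u8], dst: &mut [u8], extents: &[(usize, usize)]) {
    for &(off, len) in extents {
        dst[off..off + len].copy_from_slice(&src[off..off + len]);
    }
}

/// Post-condition: every byte outside the data extents still holds the sentinel.
fn untouched_outside(dst: &[u8], extents: &[(usize, usize)]) -> bool {
    dst.iter().enumerate().all(|(i, &byte)| {
        let inside = extents.iter().any(|&(off, len)| i >= off && i < off + len);
        inside || byte == SENTINEL
    })
}
```

A test would pre-fill the destination with `SENTINEL`, run the copy, and then assert `untouched_outside(...)`, which checks the behavior of the code rather than the allocation policy of the filesystem.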

Regenerate CPU profiles in order to enable machine check architecture (MCA) for non-host CPU profiles which is required to boot Windows server. Signed-off-by: Oliver Anderson <[email protected]> On-behalf-of: SAP [email protected]

These are already displayed as not available to guests via CPUID for non-host CPU profiles, but we forgot to forbid the corresponding MSRs. The profiles we have generated are OK with respect to this oversight because KVM_GET_MSR_INDEX_LIST did not report those MSRs at the time they were generated, but it does now. Signed-off-by: Oliver Anderson <[email protected]> On-behalf-of: SAP [email protected]

Hardware duty cycling (HDC) does not make sense in the virtualization setting and should thus not be displayed as available to guests. We have already disabled certain HDC aspects via CPUID 0x6 ECX[13], but we forgot to disable the state components which is what we do in this commit. Signed-off-by: Oliver Anderson <[email protected]> On-behalf-of: SAP [email protected]

We have already disabled architectural LBR (last branch record) for CPU profiles, but we forgot to disable the corresponding state components. Signed-off-by: Oliver Anderson <[email protected]> On-behalf-of: SAP [email protected]

Hardware P-states (HWP) is already disabled for non-host CPU profiles, but we forgot to also disable the associated state components. Signed-off-by: Oliver Anderson <[email protected]> On-behalf-of: SAP [email protected]

We already disabled Processor Trace (PT) for CPU profiles, but forgot to disable the associated state components. Signed-off-by: Oliver Anderson <[email protected]> On-behalf-of: SAP [email protected]

We have already forbidden IA32_PASID, an MSR related to process address space identifiers (PASID), but we forgot to disable the associated state components. Signed-off-by: Oliver Anderson <[email protected]> On-behalf-of: SAP [email protected]

Bit 56 of VM_ENTRY_HARDWARE_EXCEPTIONS in IA32_VMX_BASIC is only set on rather recent KVM versions. Thus whenever a CPU profile is generated on a machine with a recent Linux kernel, the current inherit policy will lead to the CPU profile being incompatible on deployments with older Linux kernels. This may not be the intention of the person generating the CPU profile, thus we change the policy to `Static(0)` for the time being. Signed-off-by: Oliver Anderson <[email protected]> On-behalf-of: SAP [email protected]
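The difference between an inherit policy and `Static(0)` for a single MSR bit can be sketched as below. `BitPolicy` and `apply_policy` are hypothetical names chosen for this example; the actual policy types used by the CPU profile generation tool are not shown in this thread and may be shaped differently.

```rust
/// Hypothetical per-bit policy for a profile-controlled MSR.
#[derive(Debug, Clone, Copy)]
enum BitPolicy {
    /// Take the bit from the host the profile was generated on.
    Inherit,
    /// Pin the bit to a fixed value (0 or 1), independent of the host.
    Static(u8),
}

/// Compute the MSR value stored in the profile for one bit under `policy`.
fn apply_policy(host_value: u64, bit: u32, policy: BitPolicy) -> u64 {
    let mask = 1u64 << bit;
    match policy {
        BitPolicy::Inherit => host_value,
        BitPolicy::Static(0) => host_value & !mask,
        BitPolicy::Static(_) => host_value | mask,
    }
}
```

With `Static(0)` on bit 56, a profile generated on a recent kernel and one generated on an older kernel end up with the same value for that bit, which is the compatibility property the commit is after.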

IA32_XSS (Extended Supervisor State Mask) is only reported via KVM_GET_MSR_INDEX_LIST on rather recent kernels. As a result, CPU profiles generated on a machine with the latest Linux kernel may not work on deployments whose hosts run somewhat older kernels, which may be unintentional. We thus decide to forbid this MSR for now, even though CPUID 0xd.0x1.EAX[3] can inform the guest that the MSR is available. We do not want to force the aforementioned feature bit to 0 because it is also used to report support for XSAVES/XRSTORS. Although not ideal, we consider denying access to IA32_XSS to be acceptable because the 0xd CPUID leaves report all IA32_XSS related state components to be unsupported. There is thus no reason for the guest to be interested in using this MSR. Signed-off-by: Oliver Anderson <[email protected]> On-behalf-of: SAP [email protected]

We have disabled LBR for non-host CPU profiles, but forgot to also do so in the VM-Exit and VM-Entry control MSRs. Signed-off-by: Oliver Anderson <[email protected]> On-behalf-of: SAP [email protected]

We add developer documentation on how to use the CPU profile generation tool. Signed-off-by: Oliver Anderson <[email protected]> On-behalf-of: SAP [email protected]

We will later use flate2 in arch/build.rs to compress CPU profile JSON files at compile time and also later to decompress them at runtime. Signed-off-by: Oliver Anderson <[email protected]> On-behalf-of: SAP [email protected]

We introduce a build.rs build script in the arch crate which automatically constructs the x86_64 CpuProfile enum with one variant per pre-generated CPU profile. In order to keep the binary size in check we also take the opportunity to compress the CPU profile JSON files into the binary which then get decompressed at runtime. We will adapt cpu_profile.rs in the next commit to use the output of build.rs Signed-off-by: Oliver Anderson <[email protected]> On-behalf-of: SAP [email protected]

Signed-off-by: Oliver Anderson <[email protected]> On-behalf-of: SAP [email protected]

When we introduced our build script we forgot to tell `serde` to (de-) serialize the `CpuProfile` enum in kebab-case which is a breaking change. Signed-off-by: Oliver Anderson <[email protected]> On-behalf-of: SAP [email protected]

This is needed as at our customer we deployed everything without mTLS. Management software can provide the necessary cert files if it knows that both CH hosts support mTLS already, so we can eventually upgrade the fleet to mTLS and get rid of this commit. On-behalf-of: SAP [email protected] Signed-off-by: Philipp Schuster <[email protected]>

This is a temporary measure as upstream decided on a different name than we did in our fork. On-behalf-of: SAP [email protected] Signed-off-by: Philipp Schuster <[email protected]>

…rk)" This reverts commit 3134e961444cd76ca3afc8abf55a8479f86c1e1c.

This is needed as at our customer we deployed everything without mTLS. We need to find a migration path soon, tho. On-behalf-of: SAP [email protected] Signed-off-by: Philipp Schuster <[email protected]>

This was missing in [0] but is required for proper explicit PCI BDF management, e.g., when a VM is created via libvirt and each device has an explicit BDF. [0] cloud-hypervisor#7965 On-behalf-of: SAP [email protected] Signed-off-by: Philipp Schuster <[email protected]>

Restructure CtrlQueue::process() so each command parses its own descriptor layout and returns the used length alongside the status descriptor. This is a behavior-neutral cleanup that prepares follow-up control queue features. On-behalf-of: SAP [email protected] Signed-off-by: Sebastian Eydam <[email protected]>

Extract virtio-net constructor bookkeeping into a small helper struct and dedicated restore/fresh initialization helpers. This keeps new_with_tap() focused on assembly and makes follow-up feature changes easier to review. On-behalf-of: SAP [email protected] Signed-off-by: Sebastian Eydam <[email protected]>

On-behalf-of: SAP [email protected] Signed-off-by: Sebastian Eydam <[email protected]>

In addition to the RARP announcement, advertise VIRTIO_NET_F_GUEST_ANNOUNCE on virtio-net devices and request a guest announcement after migration by setting the announce status bit and raising a config interrupt. Handle the guest announce ACK on the control queue. On-behalf-of: SAP [email protected] Signed-off-by: Sebastian Eydam <[email protected]>
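The announce flow described here can be sketched as a small state holder. The numeric constants follow the virtio specification as far as I know it (feature bit 21, status bit value 2, control class 3, ACK command 0); `NetDevice`, its fields, and the counter standing in for the config interrupt are hypothetical and only illustrate the sequence.

```rust
// Values assumed from the virtio-net specification.
const VIRTIO_NET_F_GUEST_ANNOUNCE: u32 = 21;
const VIRTIO_NET_S_ANNOUNCE: u16 = 2;
const VIRTIO_NET_CTRL_ANNOUNCE: u8 = 3;
const VIRTIO_NET_CTRL_ANNOUNCE_ACK: u8 = 0;
const VIRTIO_NET_OK: u8 = 0;
const VIRTIO_NET_ERR: u8 = 1;

/// Hypothetical, reduced virtio-net device state.
struct NetDevice {
    acked_features: u64,
    /// Config-space status field.
    status: u16,
    /// Counts config interrupts raised (stands in for the real interrupt).
    config_interrupts: u32,
}

impl NetDevice {
    /// After migration: ask the guest to announce itself, if it negotiated the feature.
    /// Returns `false` when the caller has to rely on the host-side RARP fallback.
    fn request_guest_announce(&mut self) -> bool {
        if (self.acked_features & (1u64 << VIRTIO_NET_F_GUEST_ANNOUNCE)) == 0 {
            return false;
        }
        self.status |= VIRTIO_NET_S_ANNOUNCE;
        self.config_interrupts += 1;
        true
    }

    /// Control queue handler for the announce class only.
    fn handle_ctrl(&mut self, class: u8, cmd: u8) -> u8 {
        if class == VIRTIO_NET_CTRL_ANNOUNCE && cmd == VIRTIO_NET_CTRL_ANNOUNCE_ACK {
            // The guest acknowledged the request: clear the announce status bit.
            self.status &= !VIRTIO_NET_S_ANNOUNCE;
            VIRTIO_NET_OK
        } else {
            VIRTIO_NET_ERR
        }
    }
}
```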

Add unit tests for the new guest-announce flow in the control queue, virtio-net, and vhost-user-net. The tests cover setting and clearing the announce state, triggering the config interrupt, and disabling the host-side RARP fallback when the guest negotiated VIRTIO_NET_F_GUEST_ANNOUNCE. On-behalf-of: SAP [email protected] Signed-off-by: Sebastian Eydam <[email protected]>

Preserve migration compatibility with older snapshots by defaulting a missing announce_pending field to false during deserialization, and cover both cases with regression tests. On-behalf-of: SAP [email protected] Signed-off-by: Sebastian Eydam <[email protected]>

Re-trigger config interrupts for restored pending guest announce requests once the net device is activated. Cover both virtio-net and vhost-user-net with regression tests. On-behalf-of: SAP [email protected] Signed-off-by: Sebastian Eydam <[email protected]>

Track a runtime announce generation for virtio-net and vhost-user-net so post-migration retry announcers stop after reset or device teardown. This keeps repeated announce rounds within one migration session, while preventing stale retry threads from re-arming VIRTIO_NET_S_ANNOUNCE after the guest already reset, rebooted, or the device was dropped. On-behalf-of: SAP [email protected] Signed-off-by: Sebastian Eydam <[email protected]>
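One way to picture the generation mechanism is a shared counter that a retry announcer snapshots when it starts and re-checks before every round. This is a sketch under that assumption; `AnnounceGeneration`, `AnnounceTicket`, and `run_retries` are hypothetical names, and the real implementation may track the generation differently.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;

/// Hypothetical shared generation counter owned by the net device.
struct AnnounceGeneration(Arc<AtomicU64>);

/// Hypothetical snapshot handed to a retry announcer thread.
struct AnnounceTicket {
    shared: Arc<AtomicU64>,
    seen: u64,
}

impl AnnounceGeneration {
    fn new() -> Self {
        Self(Arc::new(AtomicU64::new(0)))
    }

    /// Called on reset or device teardown: all outstanding tickets become stale.
    fn invalidate(&self) {
        self.0.fetch_add(1, Ordering::SeqCst);
    }

    /// Snapshot the current generation for a new retry announcer.
    fn ticket(&self) -> AnnounceTicket {
        AnnounceTicket {
            shared: Arc::clone(&self.0),
            seen: self.0.load(Ordering::SeqCst),
        }
    }
}

impl AnnounceTicket {
    fn is_current(&self) -> bool {
        self.shared.load(Ordering::SeqCst) == self.seen
    }
}

/// Run up to `rounds` announce rounds, stopping as soon as the ticket is stale.
/// Returns how many rounds were actually performed.
fn run_retries(ticket: &AnnounceTicket, rounds: u32, mut announce: impl FnMut()) -> u32 {
    let mut done = 0;
    for _ in 0..rounds {
        if !ticket.is_current() {
            break;
        }
        announce();
        done += 1;
    }
    done
}
```

Repeated rounds within one migration session share a ticket, while a reset in between bumps the generation so that a stale announcer stops before re-arming the announce bit.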

phip1611 self-assigned this Apr 30, 2026

phip1611 force-pushed the gardenlinux-next-v52 branch from bc2452a to 1a41fef Compare April 30, 2026 09:21

phi-nguyendp added 2 commits May 4, 2026 09:30

tests: Adding integration tests for migration of paused VM

c595125

Adding a paused flag to live_migration() tests; when this flag is set, the VM will be paused before migration is performed. Signed-off-by: Nguyen Dinh Phi <[email protected]>

phip1611 force-pushed the gardenlinux-next-v52 branch 4 times, most recently from 1d4fdc8 to 5c611f4 Compare May 4, 2026 14:30

weltling and others added 2 commits May 4, 2026 16:08

phip1611 force-pushed the gardenlinux-next-v52 branch 2 times, most recently from 554b8fc to c11d72b Compare May 4, 2026 18:25

rbradford and others added 4 commits May 4, 2026 21:28

tests: Cleanup interfaces in test_vfio

a93dbe7

If this test flakes it can then cause subsequent invocations to fail as the test has left its special test interfaces alive. Signed-off-by: Rob Bradford <[email protected]>

tests: Allow more time for firmware & O_DIRECT tests

7d8986a

Booting the VM on these tests takes longer so allow longer before timing out the boot. Signed-off-by: Rob Bradford <[email protected]>

performance-metrics: avoid double ref in test selection

3df0579

Use into_iter() for test_list when building tests_to_run. This keeps the collected type as Vec<&PerformanceTest>. Signed-off-by: Muminul Islam <[email protected]>

phip1611 force-pushed the gardenlinux-next-v52 branch from 768a632 to 7ddbe2c Compare May 5, 2026 06:17

dgreid and others added 7 commits May 5, 2026 08:21

virtio-devices: Test trigger with OOB MSI-X vector does not panic

442e7f8

Verify that firing an interrupt with a queue vector beyond the MSI-X table size returns Ok without panicking. Signed-off-by: Anatol Belski <[email protected]>

virtio-devices: Test trigger with NO_VECTOR returns Ok

088b136

Verify that triggering an interrupt when the vector is set to VIRTQ_MSI_NO_VECTOR short-circuits and returns Ok. Signed-off-by: Anatol Belski <[email protected]>

virtio-devices: Test notifier with OOB vector returns None

a682112

Verify that requesting a notifier with an out-of-bounds MSI-X vector returns None instead of panicking. Signed-off-by: Anatol Belski <[email protected]>

virtio-devices: Test trigger with valid vector fires interrupt

6be080c

Verify that a valid in bounds vector with MSI-X enabled successfully triggers the interrupt source group. Signed-off-by: Anatol Belski <[email protected]>

virtio-devices: Test config vector OOB does not panic

6daa9e1

Verify that firing a config change interrupt with msix_config vector beyond the table size returns Ok without panicking. Signed-off-by: Anatol Belski <[email protected]>

olivereanderson and others added 29 commits May 7, 2026 19:32

arch: Regenerate CPU profiles

8f7294d

Regenerate CPU profiles in order to enable machine check architecture (MCA) for non-host CPU profiles which is required to boot Windows server. Signed-off-by: Oliver Anderson <[email protected]> On-behalf-of: SAP [email protected]

arch: Disable LBR state components

beda265

We have already disabled architectural LBR (last branch record) for CPU profiles, but we forgot to disable the corresponding state components. Signed-off-by: Oliver Anderson <[email protected]> On-behalf-of: SAP [email protected]

arch: Disable HWP state components

cbdc289

Hardware P-states (HWP) is already disabled for non-host CPU profiles, but we forgot to also disable the associated state components. Signed-off-by: Oliver Anderson <[email protected]> On-behalf-of: SAP [email protected]

arch: Disable PT state components

f5f2225

We already disabled Processor Trace (PT) for CPU profiles, but forgot to disable the associated state components. Signed-off-by: Oliver Anderson <[email protected]> On-behalf-of: SAP [email protected]

arch: Disable PASID state components

6d764e0

We have already forbidden IA32_PASID, an MSR related to process address space identifiers (PASID), but we forgot to disable the associated state components. Signed-off-by: Oliver Anderson <[email protected]> On-behalf-of: SAP [email protected]

arch: Clear LBR related bits in the VM-Exit and VM-Entry CTL MSRs

8f831c1

We have disabled LBR for non-host CPU profiles, but forgot to also do so in the VM-Exit and VM-Entry control MSRs. Signed-off-by: Oliver Anderson <[email protected]> On-behalf-of: SAP [email protected]

docs: CPU Profile generation

e2dde73

We add developer documentation on how to use the CPU profile generation tool. Signed-off-by: Oliver Anderson <[email protected]> On-behalf-of: SAP [email protected]

build: flate2 Workspace dependency

d94135d

We will later use flate2 in arch/build.rs to compress CPU profile JSON files at compile time and also later to decompress them at runtime. Signed-off-by: Oliver Anderson <[email protected]> On-behalf-of: SAP [email protected]

arch: Update cpu_profile.rs to include code generation from build.rs

2a76306

Signed-off-by: Oliver Anderson <[email protected]> On-behalf-of: SAP [email protected]

docs: Update CPU profile generation developer documentation

cd8d36a

Signed-off-by: Oliver Anderson <[email protected]> On-behalf-of: SAP [email protected]

arch: Deserialize CPU profiles in kebab-case

6c3e326

When we introduced our build script we forgot to tell `serde` to (de-) serialize the `CpuProfile` enum in kebab-case which is a breaking change. Signed-off-by: Oliver Anderson <[email protected]> On-behalf-of: SAP [email protected]

vmm: pci: rename pci_device_id -> device_id

a97911e

This is a temporary measure as upstream decided on a different name than we did in our fork. On-behalf-of: SAP [email protected] Signed-off-by: Philipp Schuster <[email protected]>

Revert "vm-migration: make mTLS optional (make compatible with our fo…

2c268c0

…rk)" This reverts commit 3134e961444cd76ca3afc8abf55a8479f86c1e1c.

vm-migration: mTLS -> TLS (make upstream compatible with our fork)

d4d9a69

This is needed as at our customer we deployed everything without mTLS. We need to find a migration path soon, tho. On-behalf-of: SAP [email protected] Signed-off-by: Philipp Schuster <[email protected]>

virtio-devices: net: report link up in config status

4b842e5

On-behalf-of: SAP [email protected] Signed-off-by: Sebastian Eydam <[email protected]>

phip1611 force-pushed the gardenlinux-next-v52 branch from d45b566 to 67574bb Compare May 7, 2026 17:32

phip1611 wants to merge 304 commits into cyberus-technology:gardenlinux-next-v52-base from phip1611:gardenlinux-next-v52

15 participants