Skip to content

[backport] Backport sweep for 9.1#3987

Open
valkeyrie-ops[bot] wants to merge 15 commits into
9.1from
agent/backport/sweep/9.1
Open

[backport] Backport sweep for 9.1#3987
valkeyrie-ops[bot] wants to merge 15 commits into
9.1from
agent/backport/sweep/9.1

Conversation

@valkeyrie-ops

@valkeyrie-ops valkeyrie-ops Bot commented Jun 15, 2026

Copy link
Copy Markdown

Backport sweep for 9.1

Automated cherry-picks from PRs marked "To be backported".

Applied

Source PR Title Detail
#3938 Fix IO-Threads redesign cleanup perf regression from #3544
#3964 Omit alldbs rule in ACL SAVE/LIST and CONFIG REWRITE for compatibility conflicts resolved by Claude Code
#3920 Reject integer overflow of length fields in zipmapValidateIntegrity
#3921 Reject NAN scores in listpack/ziplist-encoded sorted sets on RDB load
#3959 Stabilize CLUSTERSCAN unassigned-slot test by retrying DELSLOTS
#3939 Fix RESP3 type violation in addReplyCommandSubCommands
#3811 Fix off_t to int truncation in bio repl transfer size reporting
#3843 Increase max proctitle length from 255 to 1024
#3980 Reduce io-threads modifiability test iterations under Valgrind
#4049 Fix HGETDEL FIELDS token validation
#4016 Fix build warnings with OpenSSL 4.0 conflicts resolved by Claude Code
#4047 Fix HRANDFIELD CASE 4 infinite loop when valid fields fewer than count
#4024 Restore client's selected DB after module keyspace notification
#4073 fix: Reject corrupt stream RDB with shared NACK across consumers
#4068 Fix sentinel failover coordinated segfault when old leader's client is disconnected

AI resolution details are posted as comments on this PR when available.


Generated by valkey-ci-agent using Claude Code.

roshkhatri and others added 2 commits June 15, 2026 16:09
This regression is still present in 9.1 GA as the cherrypick of revert
commit was missed during release.

Re-applies #3544 (reverted in #3756 due to ~20% SET regression) with the
performance fix from #3760.

**Root Cause:** The original #3544 changed
`tryOffloadFreeObjToIOThreads` to only offload the SDS buffer free to IO
threads, freeing the `robj` shell on the main thread. I carried out
profiling for the change and it showed that freeing the `robj` shell on
the main thread became the prime main-thread hotspot (~10% CPU), while
IO threads shifted from doing real `jemalloc` work to spinning idle on
`spmcDequeue`.

**Fix**: Keep `tryOffloadFreeObjToIOThreads` offloading the entire robj
(`decrRefCount`) to the IO thread. Cross-thread `zfree` is safe with
`jemalloc`.

This PR includes all cleanup work from #3544 so - 
- `trySendWriteToIOThreads`: defer clearing `last_header` until after
successful enqueue
- `evictClients`: simplified bookkeeping
- Queue sizes as runtime parameters instead of compile-time macros
- IO ignition policy using `stat_active_time` instead of `getrusage`
- Function renames (`IOThreadFreeArgv` --> `ioThreadFreeArgv`, etc.) and
doc comments

**Benchmark** on (Graviton4 c8gb.metal-48xl):
Config: SET, 128B values, 9 IO threads, pipeline=10, 1600 clients - Same
as Valkey official method
| Version | Throughput |
|---------|-----------|
| Unstable + original #3544 | ~1,554K rps |
| Unstable + this PR | ~2,116K rps |

<details>
<summary>Diff vs original #3544 (perf fix)</summary>

```diff
diff --git a/src/io_threads.c b/src/io_threads.c
--- a/src/io_threads.c
+++ b/src/io_threads.c
@@ // IO thread handler
 case JOB_REQ_FREE_OBJ:
-    zfree(data);
+    decrRefCount(data);
     break;

@@ // tryOffloadFreeObjToIOThreads
-    /* We offload only the free of the ptr that may be allocated by the I/O thread.
-     * The object itself was allocated by the main thread and will be freed by the main thread. */
-    void *job = tagJob(sdsAllocPtr(objectGetVal(obj)), JOB_REQ_FREE_OBJ);
+    void *job = tagJob(obj, JOB_REQ_FREE_OBJ);
     if (unlikely(spmcEnqueue(&io_shared_inbox, job) == false)) return C_ERR;
-    objectSetVal(obj, NULL);
-    decrRefCount(obj);
     io_jobs_submitted++;
```
</details>

---------

Signed-off-by: Roshan Khatri <rvkhatri@amazon.com>
#3964)

Database-level ACL #2309 introduced `alldbs` rule that was explicit for
all users and because of that previous versions no longer had the
ability to parse ACL strings produced by later versions.
Omit `alldbs` in `ACLDescribeSelector()`, that is used in `ACL
SAVE/LOAD` and `CONFIG REWRITE` command paths so that downgrades would
be possible if new feature was not used (`db=` and `resetdbs` rules).
Keep `ACL GETUSER` command's output as is and return `alldbs` in
`databases` field because of command's field-value format.
Add test to check that `ACL LIST` omits implicit `alldbs` and add check
to existing `ACL SAVE` and `CONFIG REWRITE` tests.

Fixes #3915

Signed-off-by: Daniil Kashapov <daniil.kashapov.ykt@gmail.com>
@codecov

codecov Bot commented Jun 15, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 94.39252% with 6 lines in your changes missing coverage. Please review.
✅ Project coverage is 76.47%. Comparing base (71545e0) to head (cfc72d4).

Files with missing lines Patch % Lines
src/module.c 0.00% 3 Missing ⚠️
src/io_threads.c 94.44% 1 Missing ⚠️
src/sentinel.c 0.00% 1 Missing ⚠️
src/server.c 80.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##              9.1    #3987      +/-   ##
==========================================
- Coverage   76.73%   76.47%   -0.26%     
==========================================
  Files         163      163              
  Lines       81098    81129      +31     
==========================================
- Hits        62228    62046     -182     
- Misses      18870    19083     +213     
Files with missing lines Coverage Δ
src/acl.c 92.55% <ø> (-0.13%) ⬇️
src/networking.c 92.05% <100.00%> (-0.18%) ⬇️
src/queues.c 99.35% <100.00%> (+0.02%) ⬆️
src/rdb.c 77.01% <100.00%> (+0.08%) ⬆️
src/replication.c 86.15% <100.00%> (+0.22%) ⬆️
src/server.h 100.00% <ø> (ø)
src/setproctitle.c 76.19% <ø> (ø)
src/t_hash.c 95.28% <100.00%> (+0.01%) ⬆️
src/t_zset.c 96.86% <100.00%> (-0.09%) ⬇️
src/tls.c 17.64% <ø> (ø)
... and 6 more

... and 12 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

madolson and others added 9 commits June 16, 2026 10:36
…3920)

## Problem

A crafted zipmap entry can set the value length to a value near
`UINT32_MAX` so that the `l + e` sum (value length + one-byte free
space) wraps in `unsigned int` arithmetic. The wrapped sum advances the
validation cursor by a tiny amount, leaving `p` inside the buffer, so
the `OUT_OF_RANGE` check passes and `zipmapValidateIntegrity` wrongly
returns success. The field-length path has the same shape — advancing
`p` by a ~4GB length wraps the pointer on 32-bit builds.

`zipmapValidateIntegrity` is always called with `deep=1` from `rdb.c`
when loading `RDB_TYPE_HASH_ZIPMAP`, **including via `RESTORE`**, so any
client with `RESTORE` access can submit a payload that passes
validation. On 32-bit platforms this leads to out-of-bounds access
during the subsequent zipmap→listpack conversion. On 64-bit the
downstream `lpSafeToAdd` cap happens to reject it (the raw ~4GB length
exceeds `LISTPACK_MAX_SAFETY_SIZE`), but the validator should not accept
a malformed payload in the first place — this is the function whose sole
job is to reject it.

## Fix

Bounds-check the attacker-controlled length against the bytes remaining
in the zipmap, in 64-bit space, **before** any pointer arithmetic, for
both the field-length and value-length paths.

## Testing

- `tests/integration/corrupt-dump.tcl`: a `RESTORE`-path test exercising
the full attack surface; asserts rejection and that the server stays up.
- Verified the test **fails on the pre-fix code** (validator accepts the
value-length payload) and **passes after the fix**, confirmed by
stashing the fix during the integration run.
- Full `integration/corrupt-dump` suite: 76 passed, 0 failed.

> [!NOTE]
> Found via structure-aware fuzzing of the RESTORE path. This issue was
generated by AI but verified, with love, by a human.

Signed-off-by: Madelyn Olson <madelyneolson@gmail.com>
…#3921)

## Problem

A crafted `RESTORE` payload can store a `NAN` score in a
listpack-encoded sorted set. The integrity validation
(`lpValidateIntegrityAndDups`) only checks the listpack *structure* and
member uniqueness — it does not check score validity — so the payload is
accepted on load.

When the sorted set is later converted to a skiplist (e.g. when it grows
past `zset-max-listpack-entries`, or via any operation that triggers
conversion), `zslInsertNode()` asserts the score is not `NAN`
(`t_zset.c:260`) and the server aborts. **Any client with `RESTORE`
access can remotely crash the server.**

The skiplist RDB format (`RDB_TYPE_ZSET` / `RDB_TYPE_ZSET_2`) already
rejects `NAN` scores at load time (`rdb.c`, "Zset with NAN score
detected"). The listpack format (`RDB_TYPE_ZSET_LISTPACK`) had no
equivalent check.

## Reproduction

```
RESTORE k 0 "\x11\x19\x19\x00\x00\x00\x04\x00\x82m1\x03\x83nan\x04\x82m2\x03\x832.5\x04\xFF\x50\x00...."
# loads OK, then:
ZADD k 9 x      # forces listpack->skiplist conversion -> serverAssert(!isnan(node->score)) -> SIGABRT
```

## Fix

Add `zzlValidateScores()`, which scans the scores of a listpack zset
after structural validation and rejects the payload if any score is
`NAN`. Mirrors the existing skiplist-format check. `inf`/`-inf` and
large finite scores remain accepted (only `NAN` is rejected), matching
normal `ZADD` semantics.

## Testing

- `tests/integration/corrupt-dump.tcl`: a `RESTORE`-path test asserting
rejection and that the server stays up.
- Verified the test **fails on the pre-fix code** (server crashes on
conversion) and **passes after the fix**, by stashing the fix during the
run.
- Confirmed valid zsets, including `inf`/`-inf`/large finite scores,
still load and convert correctly.
- Full `integration/corrupt-dump` suite: 74 passed, 0 failed.

> [!NOTE]
> Found via structure-aware fuzzing of the RESTORE path. This issue was
generated by AI but verified, with love, by a human.

Signed-off-by: Madelyn Olson <madelyneolson@gmail.com>
The Case 3 portion of the test was flaky: after a single round of
`CLUSTER DELSLOTS 0` on R0/R1/R2, the cluster could stay in OK state
and `wait_for_cluster_state fail` would time out with
`Cluster node 1 cluster_state:ok`.

The race is between R0's local DELSLOTS and the gossip already in
flight from R0. After R1 locally clears slot 0, a stale pre-DELSLOTS
packet from R0 (whose myslots still claims slot 0) hits the
isSlotUnclaimed fast path in clusterUpdateSlotsConfigWith and rebinds
slot 0 back to R0 on R1. See:
```
    if (isSlotUnclaimed(j) ||
        server.cluster->slots[j]->configEpoch < senderConfigEpoch ||
        clusterSlotFailoverGranted(j)) {
        ...
        clusterDelSlot(j);
        clusterAddSlot(sender, j);
        ...
    }
```

R0's subsequent "no longer claiming" PINGs cannot undo this, because
that path only sets owner_not_claiming_slot and never clears slots[j]:
```
    if (server.cluster->slots[j] == sender) {
        /* The slot is currently bound to the sender but the sender is no longer
         * claiming it. We don't want to unbind the slot yet as it can cause the cluster
         * to move to FAIL state and also throw client error. Keeping the slot bound to
         * the previous owner will cause a few client side redirects, but won't throw
         * any errors. We will keep track of the uncertainty in ownership to avoid
         * propagating misinformation about this slot's ownership using UPDATE
         * messages. */
        bitmapSetBit(server.cluster->owner_not_claiming_slot, j);
    }
```

Combined with clusterUpdateState's full-coverage check looking only
at slots[j] == NULL, R1 stays at cluster OK forever.
```
    if (server.cluster->slots[j] == NULL || ...) {
        new_state = CLUSTER_FAIL;
        ...
    }
```

Rather than fighting the protocol's intentional asymmetry around
"soft delete" via gossip, just retry the DELSLOTS pass until all
three nodes converge to FAIL. This keeps the test focused on the
CLUSTERSCAN error semantics it actually wants to verify.

This closes #3891. The test was added in #3674.

Signed-off-by: Binbin <binloveplay1314@qq.com>
## Summary

`addReplyCommandSubCommands` unconditionally called `addReplySetLen(c, 0)`
when a command has no subcommands, emitting a RESP3 Set type prefix (`~0`)
regardless of the `use_map` parameter. The non-empty path (below it) already
branches correctly on `use_map` — the empty early-return was simply missing
the same logic.

In RESP3, `COMMAND INFO <cmd>` returns the subcommands field as a Set (`~0`)
instead of an Array (`*0`) for any command without subcommands (e.g. PING).
Strict RESP3 client libraries that dispatch on collection type will
misinterpret the response. Not visible in RESP2 since both Set and Array use
the `*` prefix there.

## Fix

Apply the same `use_map` branch to the empty case:
- `addReplyMapLen(c, 0)` when `use_map=1`
- `addReplyArrayLen(c, 0)` otherwise

## Test

Added a `readraw` integration test in `tests/unit/introspection-2.tcl` that
inspects the raw wire-level type byte for the subcommands field of `COMMAND
INFO ping` in RESP3 mode, asserting `*0` (Array) rather than `~0` (Set).

Signed-off-by: Rick Ramsay <49293857+rickrams@users.noreply.github.com>
Signed-off-by: rickrams <rickrams@amazon.com>
`off_t` (64-bit), but were read into `int` (32-bit) locals in
`genValkeyInfoString()` and `handleBioThreadFinishedRDBDownload()`.

This causes INFO replication to report negative
`master_sync_total_bytes` during bio disk-based sync when RDB exceeds
2GB.

Fix: change the local variable types from `int` to `off_t`.

Signed-off-by: chx9 <lovelypiska@outlook.com>
255 is too short if valkey-server is being run from a long path,
especially if many fields are included such as in the "Process title set
as expected" test in `unit/other.tcl`, where the max length for the
cmdline is 1024, so having the same number in both makes sense.

Closes #3832.

Signed-off-by: Petr Khartskhaev <pkhartsk@redhat.com>
The `test io-threads are runtime modifiable` test in
`tests/unit/other.tcl` times out on the dedicated Valgrind jobs of the
daily CI, failing the run. The failing test is introduced in #3938.

This PR reduces the loop to 10 iterations under Valgrind.

**Failure links:**
- https://github.com/valkey-io/valkey/actions/runs/27386948127 (Jun 12)
- https://github.com/valkey-io/valkey/actions/runs/27315974006 (Jun 11)
- https://github.com/valkey-io/valkey/actions/runs/27245311034 (Jun 10)

---------

Signed-off-by: Sarthak Aggarwal <sarthagg@amazon.com>
Add explicit validation for the FIELDS token before parsing numfields.

Fixes #4045.

Signed-off-by: lcxn123 <156999965+lcxn123@users.noreply.github.com>
Co-authored-by: Binbin <binloveplay1314@qq.com>
Some functions are newly deprecated in OpenSSL 4.0. This commit works
around those. More specifically:

- X509_cmp_current_time() in isCertValid() is no longer necessary as
X509_check_certificate_times() compares the notBefore and notAfter on
its own. Had to add a version check since this function is new in
OpenSSL 4.0.

- Replace deprecated X509_NAME_get_text_by_NID. Not a perfect fix,
because the new implementation still assumes that the name does not
contain embedded null characters which may not be true, e.g., if the
name is of type UniversalString or BMPString.

- Also fix constness of X509_get_subject_name return value.

The latter two fixes are taken from this Fedora patch:
https://src.fedoraproject.org/rpms/valkey/c/1a9c8847172ef3fb116a1e2fdb3871692378adae?branch=rawhide

Closes #4012

Signed-off-by: Petr Khartskhaev <pkhartsk@redhat.com>
@valkeyrie-ops

valkeyrie-ops Bot commented Jun 27, 2026

Copy link
Copy Markdown
Author

AI conflict resolution: source PR #4016

Fix build warnings with OpenSSL 4.0

1 conflicted file

Claude Summary

Resolution complete. Brace balance is even (250/250), confirming the file is structurally intact.

Summary of what I did:

The cherry-pick of PR #4016 ("Fix build warnings with OpenSSL 4.0") had a single conflict region in src/tls.c (between tlsPasswordCallback and createSSLContext):

  • HEAD (9.1) side: empty
  • Incoming side: three complete functions — isCertValid, loadCaCertDir, areAllCaCertsValid

The PR's actual diff only modifies isCertValid() (adding an OpenSSL 4.0 #if branch) and getCertSubjectFieldByName(). It does not create those three functions — they pre-existed on the source branch as part of a separate CA-cert-validation feature that does not exist on 9.1 (I verified all three are absent and referenced nowhere in the 9.1 tree). The git 3-way merge presented the whole functions as "incoming" only because the PR touched lines inside one of them.

So I took the empty HEAD side, dropping the entire incoming block and the conflict markers. Pulling in those ~94 lines would have added unrelated functionality the PR never introduced and that nothing on 9.1 calls.

The PR's other hunk — the getCertSubjectFieldByName() rework (the outlen == 0 guard, the const fix, and the replacement of deprecated X509_NAME_get_text_by_NID) — applied cleanly during the cherry-pick and is preserved. That is the one logical change from this PR that is applicable to 9.1, and git diff HEAD confirms it's the sole remaining change.

No conflict markers remain. I did not run git add or git commit, and did not touch any other files.

AI-resolved conflicted files

Full backport commit diff: commit d9c38d207f09.

Please review these AI resolutions for correctness before merging.

cjx-zar and others added 4 commits June 30, 2026 10:13
#4047)

### Problem

`HRANDFIELD` with a positive `count` can enter an infinite loop (CASE 4)
when the number of non-expired fields in a hash is less than `count`.

**Example:**

A hash has 100 fields total: 71 with short TTLs (already expired) and 29
persistent fields. A client runs:

```
HRANDFIELD myhash 30
```

Since `count * 3 = 90 < 100 = size`, CASE 4 is selected.

The loop calls `hashTypeRandomElement()`, which skips expired fields
(via `validateEntry`), so it always returns one of the 29 valid fields.
`hashTypeRandomElement()` retries up to 100 times internally to find a
non-expired field. However, as long as any valid fields exist, this
retry limit is rarely exhausted.

After collecting all 29 unique valid fields, every subsequent
`hashtableAdd()` returns false (duplicate), but `added = 29 < count =
30`, so the loop never breaks.

This is reproducible when active expiry hasn't cleaned up the expired
fields yet (e.g. on a replica, or during a burst of expirations).

### Fix

Add a `maxiter = count * 10` limit to the CASE 4 loop. When the
iteration budget is exhausted, return whatever unique fields have been
collected so far — returning fewer results is acceptable.

Signed-off-by: cjx-zar <jxchenczar@foxmail.com>
Signed-off-by: Binbin <binloveplay1314@qq.com>
Co-authored-by: Binbin <binloveplay1314@qq.com>
moduleNotifyKeyspaceEvent() runs subscriber callbacks on the executing
client and calls selectDb(dbid) on it, but never restores the DB. When
dbid differs from the client's current DB (e.g. MOVE/COPY notify on the
destination DB), the client is left on the wrong DB, breaking subsequent
commands.

This was introduced by #1819, which started reusing the executing client
instead of a temporary one. Save the executing client's DB before the
subscriber loop and restore it afterwards. Covers both MOVE and COPY.

Signed-off-by: Binbin <binloveplay1314@qq.com>
When loading a stream consumer group from RDB, the loader assigns each
NACK's consumer pointer without checking if it was already set. A
corrupt RDB payload can list the same message ID under multiple
consumers' PELs, resulting in a shared NACK. This leads to
use-after-free when one consumer is deleted while another still
references the same NACK.

Add a check that nack->consumer is NULL before assigning. If already
set, reject the RDB as corrupt.

Signed-off-by: Saurabh Kher <saurabh@amazon.com>
…s disconnected (#4068)

Prevent a segfault that can happen during a sentinel coordinated
failover (a new feature introduced in #1292) that can happen when
the coordinated failover attempting to clean up the former leader
races with that leader disconnecting.

Add a check for `link->cc` in `sentinelKillClients`. Otherwise the
sentinel node will segfault in this case.

Fixes #4066.

Signed-off-by: Luke Palmer <luke@lukepalmer.net>
Signed-off-by: Binbin <binloveplay1314@qq.com>
Co-authored-by: Binbin <binloveplay1314@qq.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.