Before Creating the Enhancement Request
Summary
Reduce allocation in the pull/dispatch path by replacing boxed collections with primitive arrays, reusing DispatchRequest via ThreadLocal, merging mapped file slices, and eliminating CompletableFuture callback lambdas.
Motivation
JFR profiling on the broker pull/dispatch path reveals several per-message allocation hotspots:
- GetMessageResult: stores message offsets as List<Long>, boxing every long into a Long object. Under high pull QPS, this creates thousands of short-lived Long objects per second and adds ArrayList resize overhead.
- DispatchRequest: a new DispatchRequest object is created for every message dispatched to ConsumeQueue/IndexService/TimerWheel. The instance could instead be reset and reused via ThreadLocal.
- DefaultMappedFile.selectMappedBuffer: creates two separate ByteBuffer slices for position+size, then wraps them. These can be merged into a single slice operation.
- DefaultMessageStore.putMessage/putMessages: wraps the asyncPutMessage result in a thenAccept lambda callback for stats logging. The lambda captures this and beginTime, creating a closure object per message.
Describe the Solution You'd Like
- GetMessageResult: replace List<Long> with long[] and add an addQueueOffset(long) method. Right-size the initial capacity with a constructor parameter.
- DispatchRequest: change final fields to mutable and add a reset() method for ThreadLocal reuse.
- DefaultMappedFile: merge the dual slice into a single selectMappedBuffer operation with a cached append slice.
- DefaultMessageStore: remove the thenAccept callback and inline stats logging into CommitLog or the caller.
- ConsumeQueue: make topicQueueKey a final field to avoid per-call computation.
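The GetMessageResult change could look roughly like the sketch below. This is a minimal illustration, not the actual patch: QueueOffsetBuffer, its constructor parameter expectedCount, and the accessors size() and get(int) are hypothetical names; only addQueueOffset(long) and the long[] backing store come from the proposal above.

```java
import java.util.Arrays;

// Hypothetical helper for illustration; not the real GetMessageResult class.
final class QueueOffsetBuffer {
    private long[] offsets;
    private int size;

    // expectedCount is an assumed parameter used to right-size the array up front.
    QueueOffsetBuffer(int expectedCount) {
        this.offsets = new long[Math.max(expectedCount, 1)];
    }

    // Appends a primitive offset without boxing; grows the array only when full.
    void addQueueOffset(long offset) {
        if (size == offsets.length) {
            offsets = Arrays.copyOf(offsets, offsets.length * 2);
        }
        offsets[size++] = offset;
    }

    int size() {
        return size;
    }

    // Offsets are returned in insertion order, preserving the ordering requirement.
    long get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index + ", size " + size);
        }
        return offsets[index];
    }
}
```

If the caller passes a capacity close to the expected batch size, the array should rarely need to grow, so the per-message cost reduces to a single array store.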
Describe Alternatives You've Considered
- Use LongAdder instead of long[] for offsets: not applicable, offsets need ordering.
- Keep thenAccept callback but use a static method reference: still captures this, doesn't eliminate allocation.
- Use object pool instead of ThreadLocal for DispatchRequest: ThreadLocal is simpler and sufficient for single-threaded dispatch.
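The ThreadLocal reuse preferred over an object pool might be sketched as follows. This is an assumption-laden illustration: ReusableDispatchRequest and DispatchRequestHolder are hypothetical stand-ins, the three fields are picked only for the example, and reset is given parameters here for brevity even though the proposal only names a reset() method.

```java
// Hypothetical stand-in for DispatchRequest with a few illustrative fields.
final class ReusableDispatchRequest {
    private String topic;
    private int queueId;
    private long commitLogOffset;

    // Overwrites every field so no state leaks from the previous message.
    ReusableDispatchRequest reset(String topic, int queueId, long commitLogOffset) {
        this.topic = topic;
        this.queueId = queueId;
        this.commitLogOffset = commitLogOffset;
        return this;
    }

    String getTopic() {
        return topic;
    }

    int getQueueId() {
        return queueId;
    }

    long getCommitLogOffset() {
        return commitLogOffset;
    }
}

// Hypothetical holder: one cached instance per dispatching thread.
final class DispatchRequestHolder {
    private static final ThreadLocal<ReusableDispatchRequest> CACHE =
        ThreadLocal.withInitial(ReusableDispatchRequest::new);

    private DispatchRequestHolder() {
    }

    // Returns the calling thread's instance, repopulated for the current message.
    static ReusableDispatchRequest acquire(String topic, int queueId, long commitLogOffset) {
        return CACHE.get().reset(topic, queueId, commitLogOffset);
    }
}
```

A sketch like this is only safe if no dispatcher keeps a reference to the request after the dispatch call returns, since the next message on the same thread overwrites the same instance.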
Additional Context
Part of a larger JFR-driven optimization effort. Related PRs: #10443, #10444, #10514, #10524.