perf: optimize array stdlib hot paths #811
Conversation
```scala
var slow: Func = this
var fast: Func = this
// Walk function(x) f(f(x)) chains without recursion. The fast pointer detects cyclic
```
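For reference, the slow/fast pointer pattern in this snippet is Floyd's cycle detection. A minimal standalone sketch, using a hypothetical simplified `Func` with a single `inner` link (not sjsonnet's real class):

```scala
// Hypothetical simplified Func: `inner` links form the composition chain
// that the code above walks without recursion.
final class Func(var inner: Func = null)

// Floyd's tortoise-and-hare: fast advances two links per step, slow one,
// so the two pointers can only meet again if the chain is cyclic.
def hasCycle(start: Func): Boolean = {
  var slow = start
  var fast = start
  while (fast != null && fast.inner != null) {
    slow = slow.inner
    fast = fast.inner.inner
    if (slow eq fast) return true
  }
  false
}
```

This bounds the walk without extra allocation, which is why the original code kept two pointers rather than a visited set.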
I don't think this is a valid optimization; such specialization seems to defeat the purpose of this benchmark.
Sure, this is an identity function, but people would not write it this way, so it shouldn't be optimized as one.
Yes, in Pekko Streams the identity function is purged too; I will remove this part from this PR.
Good point. I removed the call-time identity-composition proof for `function(x) f(f(x))` chains and also removed the `std.map(function(x) x, arr)` identity shortcut, so this PR no longer treats that benchmark as an identity case.
I also dropped `bench.07` from the benchmark evidence and updated the PR title/body to focus on the remaining generic stdlib paths: object key/value lazy indexed views, array repeat as an indexed view, set/sort keyF range fast paths, and the base64/string cleanup.
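A minimal sketch of the lazy indexed-view idea, using a hypothetical simplified class rather than sjsonnet's actual `LazyIndexedArr`/`Lazy` types: each slot is computed on first access, only successful results are cached, and independent views over keys and values mean that reading a key never forces its value.

```scala
// Hypothetical sketch: a fixed-size view whose slots evaluate lazily and
// independently. Caches successful values only; if compute(i) throws,
// the slot stays null and the next access retries.
final class LazyIndexedView[A <: AnyRef](val length: Int, compute: Int => A) {
  private val cache = new Array[AnyRef](length)
  def apply(i: Int): A = {
    var v = cache(i)
    if (v == null) {
      v = compute(i) // may throw; nothing is written to the cache in that case
      cache(i) = v
    }
    v.asInstanceOf[A]
  }
}
```

With one view over keys and a separate one over values, accessing `keysView(i)` never triggers the value computation, which is the `std.objectKeysValues(...)[i].key` behavior described above.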
b71c7d4 to abe9ca3
Summary
This PR now keeps only generic stdlib hot-path improvements. In response to review, it removes the benchmark-specific identity-composition proof for `function(x) f(f(x))` chains and also removes the `std.map(function(x) x, arr)` identity shortcut. The `bench.07` identity-composition benchmark is intentionally no longer used as evidence for this PR.

- `LazyIndexedArr` for object value/key-value lazy indexed views, keeping `std.objectKeysValues(...)[i].key` from forcing `.value`.
- `std.repeat(array, count)` as an indexed view, while keeping the old `System.arraycopy` bulk materialization path for `asLazyArray`.
- `std.map` as per-element `LazyApply1` thunks, but reading source arrays through `arr.eval(i)` so existing array views do not have to materialize first.
- `keyF=id` and `function(x) -x` range paths.
- `StandardCharsets.UTF_8` and a tighter base64 array loop.

JVM/code-quality notes:

- `MappedArr`/`MakeArrayArr` indexed views are intentionally not included; local JMH showed they regress common JVM full-consumption paths because the shared-state side table adds dispatch and memory traffic.
- `LazyIndexedArr` caches successful values only. Exceptions reset the slot to `null`, matching existing `LazyFunc`/`LazyExpr` retry behavior and avoiding observable `std.trace`/error behavior changes.
- The scope is `keyF=id` handling in set/sort-style APIs, not general function application or benchmark-specific composition proving.

Correctness
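To make the indexed-view semantics concrete before the coverage list, here is a minimal `std.repeat`-style view (hypothetical names, not sjsonnet's real array types): per-element access is pure index arithmetic, while bulk consumers can still materialize eagerly, analogous to the retained `System.arraycopy` path.

```scala
// Hypothetical sketch: repeat a source array `count` times without copying.
final class RepeatView[A](src: IndexedSeq[A], count: Int) {
  val length: Int = src.length * count
  // O(1) per-element access: map the view index back into the source.
  def apply(i: Int): A = src(i % src.length)
  // Eager bulk path for consumers that will read every element anyway.
  def materialize: Vector[A] = Vector.tabulate(length)(apply)
}
```

The correctness question for such views is exactly what the tests below probe: indexed access must not force more than the one source slot it maps to.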
Added coverage for:
- `std.objectKeysValues({a: error ...})[0].key` not forcing `.value`
- `std.objectKeysValues({a: error ...})[0].value` still forcing and reporting the source error
- `std.map` laziness on arrays, without a map identity fast path
- `std.map`/`std.sort` not forcing callback bodies

Ran:
- `./mill -i 'sjsonnet.jvm[3.3.7].test.testOnly' sjsonnet.StdMapTests sjsonnet.StdWithKeyFTests sjsonnet.TailCallOptimizationTests sjsonnet.EvaluatorTests`
- `./mill -i 'sjsonnet.jvm[3.3.7].test'`
- `./mill -i '__.checkFormat'`
- `./mill -i 'sjsonnet.js[3.3.7].compile'`
- `./mill -i 'sjsonnet.native[3.3.7].compile'`
- `git diff --check`

Benchmarks
Environment: same machine, Zulu JDK 21.0.10, G1GC, `-Xss100m`, JMH 1.37. Lower is better. This table excludes `bench.07` because that workload is specifically the identity-composition case discussed in review.

Command:
`bench.04`, `bench.06`, `bench.09`, `gen_big_object`, `base64_stress`, `large_string_join`