Skip to content

perf: optimize array stdlib hot paths#811

Merged
stephenamar-db merged 1 commit intodatabricks:masterfrom
He-Pin:perf/lazy-array-stack-safe
May 1, 2026
Merged

perf: optimize array stdlib hot paths#811
stephenamar-db merged 1 commit intodatabricks:masterfrom
He-Pin:perf/lazy-array-stack-safe

Conversation

@He-Pin
Copy link
Copy Markdown
Contributor

@He-Pin He-Pin commented Apr 30, 2026

Summary

This PR now keeps only generic stdlib hot-path improvements. In response to review, it removes the benchmark-specific identity-composition proof for function(x) f(f(x)) chains and also removes the std.map(function(x) x, arr) identity shortcut. The bench.07 identity-composition benchmark is intentionally no longer used as evidence for this PR.

  • Add LazyIndexedArr for object value/key-value lazy indexed views, keeping std.objectKeysValues(...)[i].key from forcing .value.
  • Make std.repeat(array, count) an indexed view, while keeping the old System.arraycopy bulk materialization path for asLazyArray.
  • Keep std.map as per-element LazyApply1 thunks, but read source arrays through arr.eval(i) so existing array views do not have to materialize first.
  • Optimize standard set/sort keyF=id and function(x) -x range paths.
  • Use StandardCharsets.UTF_8 and a tighter base64 array loop.

JVM/code-quality notes:

  • Broader MappedArr / MakeArrayArr indexed views are intentionally not included; local JMH showed they regress common JVM full-consumption paths because the shared state side table adds dispatch and memory traffic.
  • LazyIndexedArr caches successful values only. Exceptions reset the slot to null, matching existing LazyFunc / LazyExpr retry behavior and avoiding observable std.trace/error behavior changes.
  • The remaining function identity check is limited to structural keyF=id handling in set/sort-style APIs, not general function application or benchmark-specific composition proving.

Correctness

Added coverage for:

  • std.objectKeysValues({a: error ...})[0].key not forcing .value
  • std.objectKeysValues({a: error ...})[0].value still forcing and reporting the source error
  • std.map laziness on arrays, without a map identity fast path
  • empty std.map / std.sort not forcing callback bodies

Ran:

  • ./mill -i 'sjsonnet.jvm[3.3.7].test.testOnly' sjsonnet.StdMapTests sjsonnet.StdWithKeyFTests sjsonnet.TailCallOptimizationTests sjsonnet.EvaluatorTests
  • ./mill -i 'sjsonnet.jvm[3.3.7].test'
  • ./mill -i '__.checkFormat'
  • ./mill -i 'sjsonnet.js[3.3.7].compile'
  • ./mill -i 'sjsonnet.native[3.3.7].compile'
  • git diff --check

Benchmarks

Environment: same machine, Zulu JDK 21.0.10, G1GC, -Xss100m, JMH 1.37. Lower is better. This table excludes bench.07 because that workload is specifically the identity-composition case discussed in review.

Command:

./mill -i bench.runRegressions \
  bench/resources/cpp_suite/bench.04.jsonnet \
  bench/resources/cpp_suite/bench.06.jsonnet \
  bench/resources/cpp_suite/bench.09.jsonnet \
  bench/resources/cpp_suite/gen_big_object.jsonnet \
  bench/resources/go_suite/base64_stress.jsonnet \
  bench/resources/cpp_suite/large_string_join.jsonnet
benchmark master PR result
bench.04 0.105 ms/op 0.105 ms/op neutral
bench.06 0.209 ms/op 0.137 ms/op 1.53x faster
bench.09 0.041 ms/op 0.038 ms/op neutral
gen_big_object 0.859 ms/op 0.792 ms/op neutral
base64_stress 0.172 ms/op 0.173 ms/op neutral
large_string_join 0.538 ms/op 0.537 ms/op neutral

Comment thread sjsonnet/src/sjsonnet/Val.scala Outdated

var slow: Func = this
var fast: Func = this
// Walk function(x) f(f(x)) chains without recursion. The fast pointer detects cyclic
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is a valid optimization, such specialization seem to defeat the nature of this benchmark

Sure, this is an identity function, but people would not write it this way, so it shouldn't be optimized as one

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, in Pekko Stream, the identity function is been purged too, will remove this part out of this pr.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. I removed the call-time identity-composition proof for function(x) f(f(x)) and also removed the std.map(function(x) x, arr) identity shortcut, so this PR no longer treats that benchmark as an identity case.

I also dropped bench.07 from the benchmark evidence and updated the PR title/body to focus on the remaining generic stdlib paths: object key/value lazy indexed views, array repeat as an indexed view, set/sort keyF range fast paths, and the base64/string cleanup.

@He-Pin He-Pin force-pushed the perf/lazy-array-stack-safe branch from b71c7d4 to abe9ca3 Compare May 1, 2026 03:13
@He-Pin He-Pin changed the title perf: optimize lazy identity benchmark paths perf: optimize array stdlib hot paths May 1, 2026
@stephenamar-db stephenamar-db merged commit 76f5ad2 into databricks:master May 1, 2026
5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants