DX-105463: [C++][Gandiva] Add TimestampIR for unit-aware timestamp[us/ns] support#134
lriggs merged 6 commits into dremio:dremio_27.0_20
…/ns] support Co-Authored-By: Claude Sonnet 4.6 <[email protected]>
…tor, and test files
TimestampIR Generated Functions
All functions produced by

kFixedAdds: 5 functions × 4 arg patterns × 2 units = 40
Pure arithmetic:

kCalendarAdds: 3 functions × 4 arg patterns × 2 units = 24
Split/recombine around precompiled millis function.

kExtracts: 14 functions × 1 signature × 2 units = 28
Pattern:

kTruncs: 11 functions × 1 signature × 2 units = 22
Pattern:

kDiffs: 8 functions × 1 signature × 2 units = 16
Pattern:

kTwoTsScalars: 2 functions × 1 signature × 2 units = 4
Two-timestamp → scalar (float64 for months_between, int32 for datediff).

kCastsFromTs: 3 functions × 1 signature × 2 units = 6
Cast from timestamp to another type.

kDateArithEntries: 7 entries × 2 int-types × 2 units = 28
Fixed 1-day arithmetic.

Timezone: 2 functions × 2 units = 4
Split-recombine: the timezone offset is a whole number of seconds, so the sub-ms part survives.

castVARCHAR: 1 function × 2 units = 2
Appends sub-ms digits to the millis formatter output.
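The split/recombine pattern named above for the calendar-add and timezone wrappers can be sketched in plain C++ rather than LLVM IR. This is only an illustration under stated assumptions: ShiftMillis is a hypothetical stand-in for a precompiled millis-only function, ShiftWithSubMillis is an invented name, and the FloorDivRem shown here is a guess at the helper's behaviour, not the PR's actual code.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a precompiled function that only understands
// timestamp[ms]: shifts a millis value by a whole-second timezone offset.
inline int64_t ShiftMillis(int64_t millis, int64_t offset_seconds) {
  return millis + offset_seconds * 1000;
}

// Quotient and remainder with floor semantics. The remainder is always in
// [0, divisor), which is what keeps pre-epoch (negative) values correct.
struct DivRem {
  int64_t quot;
  int64_t rem;
};

inline DivRem FloorDivRem(int64_t value, int64_t divisor) {
  int64_t quot = value / divisor;  // C++ division truncates toward zero
  int64_t rem = value % divisor;
  if (rem < 0) {  // negative value off a boundary; divisor assumed > 0
    quot -= 1;
    rem += divisor;
  }
  return {quot, rem};
}

// Split/recombine wrapper: units_per_ms is 1000 for timestamp[us] and
// 1000000 for timestamp[ns].
inline int64_t ShiftWithSubMillis(int64_t ts, int64_t units_per_ms,
                                  int64_t offset_seconds) {
  DivRem parts = FloorDivRem(ts, units_per_ms);             // split
  int64_t shifted = ShiftMillis(parts.quot, offset_seconds);  // millis-only call
  return shifted * units_per_ms + parts.rem;                // recombine
}
```

Because the stand-in only moves the value by whole milliseconds, the sub-ms remainder can be added back unchanged. With truncating division instead of floor division, a timestamp[us] value of -1 would split into 0 ms and -1 us, and any millis function that inspects the calendar fields would see the wrong millisecond.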
TimestampIR Test Coverage Table
Source:
Helper functions used throughout (defined in the test file):
Test constants

Non-precision tests (ms only / date types)

Precision tests (ms / us / ns)

Extract functions
Coverage: all 14 kExtracts × 2 units covered.

Trunc functions
Coverage: all 11 kTruncs × 2 units covered.

Fixed-add functions (kFixedAdds)
Coverage: all 5 kFixedAdds × 4 arg patterns × 2 units covered.

Calendar-add functions (kCalendarAdds)
Coverage: all 3 kCalendarAdds × 4 arg patterns × 2 units covered.

Diff functions (kDiffs)
Coverage: all 8 kDiffs × 2 units covered.

Two-timestamp scalars (kTwoTsScalars)
Coverage: months_between_timestamp_timestamp_us/ns and datediff_timestamp_timestamp_us/ns covered.

Cast functions (kCastsFromTs)
Coverage: castDATE_timestamp_us/ns, castTIME_timestamp_us/ns, last_day_from_timestamp_us/ns all covered.

Date arithmetic (kDateArithEntries)
Coverage: all 7 kDateArithEntries × 2 int-types × 2 units covered.

Timezone functions
Coverage: to_utc_timezone_timestamp_us/ns and from_utc_timezone_timestamp_us/ns covered.

castVARCHAR
Coverage: castVARCHAR_timestamp_int64_us/ns covered.

Coverage Summary

Known Gaps / Thin Areas
akravchukdremio left a comment:
LGTM. I found that we could implement one more function, next_day, though I don't think it's a popular one. I've added it in a commit on top of this branch: d256409. It's also fine to merge as is without it.
Summary
Adds TimestampIR, an LLVM IR builder class modeled after DecimalIR, that generates unit-aware wrapper functions at module-load time, enabling Gandiva functions registered for timestamp[ms] to automatically handle timestamp[us] and timestamp[ns] inputs without explicit per-unit registry entries.

What changes are included in this PR?
timestamp_ir.h/cc (new): TimestampIR class with wrapper patterns: pure IR arithmetic, calendar split/recombine, extract, trunc, diff, cast, and timezone wrappers. Includes FloorDiv/FloorDivRem helpers for correct floor-toward-negative-infinity semantics on pre-epoch timestamps.
CMakeLists.txt + engine.cc: Wire TimestampIR::AddFunctions() at module load alongside DecimalIR.
function_signature.cc: DataTypeEquals for TIMESTAMP ignores TimeUnit; only timezone is significant for matching. This allows the existing timestamp[ms] registry entries to match calls with timestamp[us]/timestamp[ns] parameters.
llvm_generator.cc/h: BuildFunctionCall inspects the function descriptor for non-ms timestamp params and remaps to _us/_ns suffixed IR functions. Propagates Status errors from the visitor back to the caller (previously silently dropped).
precompiled/time.cc: Floor-division fix in DATE_TRUNC_FIXED_UNIT for pre-epoch (negative) timestamps; fix sub-second millis sign in castVARCHAR_timestamp_int64 for negative timestamps.
tests/date_time_test.cc: End-to-end C++ tests for timestamp[us] and timestamp[ns] through extract, trunc, arithmetic, and cast functions.

Are these changes tested?
Yes. tests/date_time_test.cc covers the new unit-aware paths. Also validated end-to-end through arrow-java ProjectorTest and Dremio integration tests (separate PRs).

Are there any user-facing changes?
No API changes. Gandiva functions that previously only accepted timestamp[ms] now transparently accept timestamp[us] and timestamp[ns].
Safe degradation: If a caller uses an old native lib (pre-TimestampIR), DataTypeEquals falls back to strict matching, function lookup fails at validation time, and the caller gets a clean error rather than silent wrong results.
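
The matching and remapping behaviour described for function_signature.cc and llvm_generator.cc can be pictured with a small standalone sketch. It does not use the Arrow or Gandiva headers: the TimeUnit enum, TimestampType struct, TimestampTypesMatch, and RemapForUnit below are hypothetical names introduced for illustration, and the real DataTypeEquals and BuildFunctionCall may differ in detail. Only the _us/_ns suffix convention and the "timezone matters, unit does not" rule come from the description above.

```cpp
#include <cassert>
#include <string>

// Hypothetical mirror of the three timestamp units discussed in this PR.
enum class TimeUnit { kMilli, kMicro, kNano };

// Hypothetical, simplified timestamp type: a unit plus a timezone string.
struct TimestampType {
  TimeUnit unit;
  std::string timezone;
};

// Unit-insensitive matching: only the timezone is compared, so a registry
// entry declared for timestamp[ms] also matches timestamp[us]/[ns] calls.
inline bool TimestampTypesMatch(const TimestampType& registered,
                                const TimestampType& requested) {
  return registered.timezone == requested.timezone;
}

// Maps the name of a millis-based function to the unit-aware wrapper name.
// Millisecond inputs keep the original precompiled function.
inline std::string RemapForUnit(const std::string& ms_name, TimeUnit unit) {
  switch (unit) {
    case TimeUnit::kMicro:
      return ms_name + "_us";
    case TimeUnit::kNano:
      return ms_name + "_ns";
    case TimeUnit::kMilli:
      return ms_name;
  }
  return ms_name;  // unreachable for valid enum values
}
```

In this sketch a call on a timestamp[us] column would first pass signature matching against the timestamp[ms] entry and then be redirected, for example from castDATE_timestamp to castDATE_timestamp_us. A library built without the relaxed matching would reject the call at the first step, which corresponds to the safe-degradation behaviour described above.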