[Fix][Transform-V2] Correct EXTRACT epoch and subsecond units#11142
[Fix][Transform-V2] Correct EXTRACT epoch and subsecond units#11142fallintoplace wants to merge 2 commits into
Conversation
15a9c24 to
6974978
Compare
6974978 to
e553300
Compare
DanielLeens
left a comment
There was a problem hiding this comment.
Thanks for the contribution. I reviewed the full current diff from the real Zeta SQL EXTRACT(...) path, not just the added assertions.
What this PR fixes
- Pain point:
EXTRACT(EPOCH ...)can overflow after 2038 because the result is forced throughint, andEXTRACT(MILLISECOND / MICROSECOND ...)only returns the fractional part instead of the total milliseconds / microseconds within the second. - Approach: change the runtime
extract(...)implementation to returnNumber, keepEPOCHaslong, and compute millisecond / microsecond values assecond * unit + fractional-part. At the same time, updateZetaSQLType.getExtractType(...)soEPOCHis typed asBIGINT. - Simple example:
EXTRACT(EPOCH FROM TIMESTAMP '2040-01-01 00:00:00')now stays a correct 64-bit value instead of overflowing throughint.
Full runtime path I reviewed
SQL EXTRACT(...)
-> parser builds ExtractExpression
-> ZetaSQLType.getExpressionType(...) [ZetaSQLType.java:166-168]
-> getExtractType(...) [ZetaSQLType.java:322-326]
-> EPOCH => BIGINT
-> others => INT
execution
-> ZetaSQLFunction.EXTRACT
-> DateTimeFunction.extract(...) [DateTimeFunction.java:346-572]
-> EPOCH uses toEpochSecond(...)
-> MILLISECOND / MICROSECOND combine second + fractional nanos
Findings
- The normal path definitely hits this change; this is not a dormant helper.
- The type system and runtime are now aligned for
EPOCH, which closes the 2038 overflow path cleanly. - The new
MILLISECOND/MICROSECONDbehavior matches the expected SQL semantics of total milliseconds / microseconds within the second. - I did not find a new correctness, compatibility, or test-stability blocker on the current head.
Test coverage / stability
ExtractFunctionTestnow covers the fractional-unit semantics and anEPOCHvalue after 2038.- These tests are fully local and deterministic; I would rate them as stable.
Merge conclusion
Conclusion: can merge
- Blockers
- None on the current head.
- Non-blocking suggestions
- None.
Overall, this is a focused fix on a real main-path SQL behavior mismatch, and the current version looks good to merge from my side.
|
Good job. Thanks a lot for your valuable contribution to Apache SeaTunnel! To pass the automated CI checks for this PR, could you please enable CI workflows in your forked repository first? You can refer to the check run link below as a reference for the required CI configuration:https://github.com/apache/seatunnel/pull/11142/checks?check_run_id=82484983181 If you run into any trouble enabling CI or encounter related errors during the process, feel free to leave a comment here. Contributors are always ready to offer help~
|
davidzollo
left a comment
There was a problem hiding this comment.
+1 if CI passes.
LGTM
|
Thanks. I rechecked the current state here and the code head has not changed since my approval. From my side there is nothing new to re-review on the source path itself, so this stays in the approved / waiting-for-CI state. Once the CI workflow is enabled and the checks rerun, happy to take another look if a new code diff appears. |

Purpose of this pull request
Fix Zeta SQL
EXTRACTbehavior to match the documented datetime fields.This patch:
EXTRACT(EPOCH ...)so timestamps after 2038 do not overflow;MILLISECOND(S)andMICROSECOND(S);Does this PR introduce any user-facing change?
Yes.
EXTRACT(EPOCH ...)now returns BIGINT instead of INT, andEXTRACT(MILLISECOND(S)|MICROSECOND(S) ...)now includes the whole seconds field as documented.How was this patch tested?
Added regression coverage for post-2038
EPOCH, plural/singular millisecond and microsecond fields, and output type inference forEPOCH../mvnw -pl seatunnel-shade/seatunnel-calcite -DskipTests -DskipITs -Dcheckstyle.skip -Dspotless.check.skip install./mvnw -pl seatunnel-shade/seatunnel-janino -DskipTests -DskipITs -Dcheckstyle.skip -Dspotless.check.skip install./mvnw -pl seatunnel-transforms-v2 -Dtest=org.apache.seatunnel.transform.sql.zeta.ExtractFunctionTest -DskipITs -Dcheckstyle.skip -Dspotless.check.skip test./mvnw -pl seatunnel-transforms-v2 -DskipITs -Dcheckstyle.skip -Dspotless.check.skip test./mvnw -pl seatunnel-transforms-v2 spotless:checkCheck list