[Improve][Transform-V2] Migrate transforms validation to declarative OptionRule#11095
[Improve][Transform-V2] Migrate transforms validation to declarative OptionRule#11095nzw921rx wants to merge 21 commits into
Conversation
…nsformFactoryTest Restore the accidentally deleted testFactoryIdentifierAndOptionRule() and testCreateTransformReturnsMultiCatalogTransform() tests. The new OptionRule validation tests are appended alongside them.
…tion Update FilterFieldTransformTest and ReplaceTransformTest to validate through Factory optionRule instead of expecting constructor-level validation that was migrated to declarative OptionRule.
DanielLeens
left a comment
There was a problem hiding this comment.
Thanks for the contribution. I reviewed the current diff against dev and checked the declarative validation path across the updated transform factories and the added factory-level tests.
What this PR solves
- User pain: many transform config errors are discovered too late or with weak messages.
- Fix approach: move more validation into declarative
OptionRulechecks and add focused factory tests around those contracts. - One-line summary: I did not find a new blocking regression in the current validation path.
Runtime path rechecked
job config
-> transform factory optionRule()
-> central config validator
-> transform instance creation only after contract checks pass
I spot-checked the updated factories that carry the highest contract risk here (for example DefineSinkType, Copy, JsonPath, RegexExtract, and TableFilter) and the current direction looks coherent: the new rules fail earlier and the added tests cover the new validator behavior directly.
CI
The PR-level Build check is red right now, so I would still wait for the workflow result before merge.
Conclusion: can merge after CI is green
- Blocking items
- No new source-level blocker from me on the current head.
- Please keep an eye on the current
Buildfailure before merge.
- Suggested follow-up
- If the CI failure turns out to come from one of the newly tightened validation contracts, I would mirror that exact case into a focused factory test rather than only relying on broader integration coverage.
Thanks for pushing the validation cleanup forward.
|
Thanks for the cleanup — moving the scattered imperative checks to declarative DynamicCompile lost its conditional-required validationIn // dev: source_code is REQUIRED when compile_pattern = SOURCE_CODE
.conditional(COMPILE_PATTERN, SOURCE_CODE, DynamicCompileTransformConfig.SOURCE_CODE)
// this PR: only a not-blank value constraint, no longer required
.conditional(COMPILE_PATTERN, SOURCE_CODE, Conditions.notBlank(SOURCE_CODE))These two overloads are not equivalent. The old
Since I verified this locally with a probe test ( Suggested fix — keep it required and add the not-blank check: .conditional(COMPILE_PATTERN, CompilePattern.SOURCE_CODE, SOURCE_CODE)
.conditional(COMPILE_PATTERN, CompilePattern.SOURCE_CODE, Conditions.notBlank(SOURCE_CODE))
.conditional(COMPILE_PATTERN, CompilePattern.ABSOLUTE_PATH, ABSOLUTE_PATH)
.conditional(COMPILE_PATTERN, CompilePattern.ABSOLUTE_PATH, Conditions.notBlank(ABSOLUTE_PATH))and please add negative tests for the missing-key case (and for "neither source_code nor absolute_path set"). The "errors are aggregated" claim doesn't hold for extension validatorsThe PR description says validation errors are now aggregated instead of failing on the first error. But the New rules / tightening beyond a pure migration — please document for backward-compatA few changes go beyond equivalent migration and can reject configs that worked before:
These are mostly reasonable "fail earlier", but per the project's backward-compatibility policy they should be called out in the PR description and recorded in Minor
Overall: good direction and good test effort — just please fix the DynamicCompile conditional-required regression (with missing-key tests) and document the tightened/added rules before merge. |
|
Thanks for adding this detailed follow-up. I checked the +1 to @zhangshenghang's point here: the contributor should address that conditional-required gap before this PR moves forward. Since there is no new commit yet, I am not starting another full review round on the unchanged head right now, but I am happy to re-review the latest code once an update is pushed. |
…ssion and ConditionExtension aggregation
DanielLeens
left a comment
There was a problem hiding this comment.
Thanks for the update. I re-reviewed the latest head from scratch.
+1 to @zhangshenghang's earlier DynamicCompile point: the conditional-required regression is fixed on the current head by DynamicCompileTransformFactory.java:42-57, and the new negative tests at DynamicCompileTransformFactoryTest.java:83-104 now cover the missing-key paths that were previously slipping through.
I still see one blocker before merge, plus one follow-up that is worth tightening up.
What this PR solves
- Pain point: several transform plugins were still discovering config errors too late, with validation split across constructors, runtime logic, and ad hoc checks.
- Approach: move more of that validation into declarative
OptionRulerules and add focused factory-level tests around the migrated contracts. - One-line summary: the direction is good, but the current head still leaves a compatibility-documentation gap, and the new "ConditionExtension aggregation" cleanup is only partial.
Full runtime path I rechecked
job config
-> transform factory optionRule()
-> ConfigValidator.validate(optionRule)
-> required / optional / conditional checks
-> Conditions.notBlank / notEmpty / greaterThan
-> Conditions.extension(..., ConditionExtension)
if submission-time validation passes
-> createTransform(context)
-> transform instance is created and enters the runtime path
current head specifics
-> DynamicCompileTransformFactory [DynamicCompileTransformFactory.java:38-60]
-> SOURCE_CODE / ABSOLUTE_PATH are again conditionally required
-> CopyFieldTransformFactory / TableFilterTransformFactory
-> now return false from the extension validator, which allows aggregation to continue
-> DefineSinkType / JsonPath / RegexExtract / DataValidator
-> still throw OptionValidationException directly from ConditionExtension
Findings
Issue 1: the newly tightened validation contracts are still undocumented as incompatible changes
- Location:
seatunnel-transforms-v2/src/main/java/org/apache/seatunnel/transform/adaptsink/DefineSinkTypeTransformFactory.java:46-55,79-96seatunnel-transforms-v2/src/main/java/org/apache/seatunnel/transform/encrypt/FieldEncryptTransformFactory.java:40-55seatunnel-transforms-v2/src/main/java/org/apache/seatunnel/transform/dynamiccompile/DynamicCompileTransformFactory.java:38-60docs/en/introduction/concepts/incompatible-changes.md:6-104docs/zh/introduction/concepts/incompatible-changes.md
- Why this is a problem:
- This PR is doing more than just moving existing failures earlier. It also tightens several submission-time contracts that can now reject configs that previously passed.
- Concrete examples on the current head include duplicate-column rejection in
DefineSinkType,max_field_length > 0inFieldEncrypt, and the restored conditional-required checks inDynamicCompile. - Per the project's compatibility policy, those user-visible rejections need to be recorded in the incompatible-changes docs with migration guidance. That documentation is still missing.
- Risk:
- Users upgrading into this behavior will hit new submission-time failures without release-note coverage or a clear migration path.
- It also makes it harder for maintainers to distinguish an intentional contract tightening from an accidental regression later on.
- Best fix:
- Add entries to both
docs/en/introduction/concepts/incompatible-changes.mdanddocs/zh/introduction/concepts/incompatible-changes.md. - Please spell out which transforms changed, which old configs are now rejected, and how users should update those configs.
- Add entries to both
- Severity: High
- Raised by others already: Yes. This is directly related to @zhangshenghang's earlier compatibility note; I am adding the code-level evidence for the current head.
Issue 2: the ConditionExtension aggregation cleanup is still only partial
- Location:
seatunnel-transforms-v2/src/main/java/org/apache/seatunnel/transform/adaptsink/DefineSinkTypeTransformFactory.java:67-99seatunnel-transforms-v2/src/main/java/org/apache/seatunnel/transform/jsonpath/JsonPathTransformFactory.java:65-102seatunnel-transforms-v2/src/main/java/org/apache/seatunnel/transform/regexextract/RegexExtractTransformFactory.java:67-87seatunnel-transforms-v2/src/main/java/org/apache/seatunnel/transform/validator/DataValidatorTransformFactory.java:67-99- contract reference:
docs/en/introduction/concepts/incompatible-changes.md:40-60
- Why this matters:
- The latest follow-up commit clearly tries to improve aggregation by converting
CopyFieldandTableFiltertodescription() + return false. - But the factories above still throw
OptionValidationExceptiondirectly fromConditionExtension, so validation still aborts on the first extension error there. - In other words, the current head still has mixed behavior: plain constraints aggregate, but several extension validators remain fail-fast.
- The latest follow-up commit clearly tries to improve aggregation by converting
- Risk:
- The migrated transforms still do not present a consistent validation UX.
- The commit message / PR scope now reads more complete than the actual behavior on the current head.
- Best fix:
- Option A: convert the remaining extension validators to boolean/description-style failures as well, so the aggregation behavior is actually consistent.
- Option B: if that is outside the intended scope for this PR, narrow the stated scope and description so it does not imply the aggregation problem is solved across the board.
- Severity: Medium
- Raised by others already: Yes. This is related to @zhangshenghang's earlier aggregation point; I am confirming that it is still not fully closed on the latest head.
Tests and CI
- The new
DynamicCompileTransformFactoryTest.java:83-104coverage is valuable and directly closes the missing-key regression proof I was looking for on that path. - The visible PR-level
Buildcheck is still shown as failed right now, but the check output points to the contributor fork runnzw921rx/seatunnel#27736094115, and that rerun is currently in progress. So I cannot call the current head green yet, but I also cannot attribute a fresh current-head failure to the transform code at this moment.
Merge conclusion
Conclusion: can merge after fixes
- Blockers
- Issue 1 above: please document the newly tightened transform validation behavior as incompatible changes, with migration notes, before merge.
- Non-blocking suggestions
- Issue 2 above: either finish the remaining aggregation cleanup in this PR, or narrow the claimed scope so it matches the actual implementation.
Overall, the DynamicCompile regression that @zhangshenghang flagged is fixed now, so this head is materially better than the one I reviewed before. The remaining blocker is no longer in the code path itself; it is the release / compatibility contract that still needs to be documented explicitly.
… document validation tightening as incompatible changes
@DanielLeens Thanks for the review, The |
DanielLeens
left a comment
There was a problem hiding this comment.
Thanks for the follow-up. I re-checked the latest head with the docs update and the remaining transform-factory validation path, and I do not see a blocker from the current source state.
What this PR fixes
- User pain: Transform validation was inconsistent and spread across different imperative paths, and the behavior change also needed explicit incompatible-change documentation.
- Fix approach: The PR migrates many transform factories to declarative validation and now also adds the missing incompatible-change docs in both languages.
- One-line summary: The missing documentation from the earlier round is now in place, and the current head looks mergeable from the source side.
Runtime path I traced
Job submission
-> transform-factory `optionRule()` paths
-> `RegexExtractTransformFactory.DefaultValuesLengthValidator` returns `false` on mismatch
-> `ConfigValidator` aggregates the failure during submission
Docs path
-> incompatible-change docs were added in both English and Chinese
Key findings
- The missing docs from the earlier round are now in place.
- The current validator style matches the shared aggregated-validation model.
- I do not see a new blocker on the current head.
Blocking items
None at the source-code level on the current head.
Other review notes
- No additional non-blocking source-level comments from this round.
Compatibility and side effects
- Compatibility: The behavior change is now documented, and I do not see a source-level blocker on the current head.
- Side effects: No meaningful resource-model concern from the latest head.
- Error handling / logging: The validation path still uses the shared aggregated-error flow.
- Tests: The PR has broad validation coverage already, and I did not see a new flaky pattern in the current head.
- Docs: The incompatible-change docs are now present in both English and Chinese.
- CI: Checks were still queued when I reviewed.
Merge verdict
- I do not see a blocker on the current head. The missing incompatible-change documentation from the earlier round is now in place, and the current validation changes look reasonable.
Purpose of this pull request
Related #11007
OptionRule + ConditionsTransforms migrated:
Does this PR introduce any user-facing change?
1. Validation errors behavior by transform:
All transforms now use declarative
OptionRulefor submission-time validation. Error reporting behavior depends on the validation type:columns[2]: 'type' must not be null)2. Tightened validation (configs that previously passed may now be rejected at submission time):
DefineSinkType: rejects duplicate column namesFieldEncrypt: requiresmax_field_length > 0FieldMapper,Metadata,JsonPath,Sql,DynamicCompile: core options now enforced as required3. Transform option rules are now fully declared and can be exported via REST API and CLI metadata.
How was this patch tested?
./mvnw test -pl seatunnel-transforms-v2