Skip to content

[Improve][Connector-V2] Migrate Elasticsearch validation to declarative OptionRule#11122

Open
nzw921rx wants to merge 3 commits into
apache:devfrom
nzw921rx:improve/elasticsearch-optionrule-migration
Open

[Improve][Connector-V2] Migrate Elasticsearch validation to declarative OptionRule#11122
nzw921rx wants to merge 3 commits into
apache:devfrom
nzw921rx:improve/elasticsearch-optionrule-migration

Conversation

@nzw921rx

Copy link
Copy Markdown
Collaborator

Purpose of this pull request

Related #11007

  1. Migrate Elasticsearch connector validation from imperative checks to declarative OptionRule + Conditions
  2. Remove redundant runtime validation in Source/Auth providers where checks are now covered by optionRule()
  3. Add factory-level unit tests for Source/Sink/Catalog to cover positive and negative validation paths

Scope migrated:

  • ElasticsearchSourceFactory
  • ElasticsearchSinkFactory
  • ElasticSearchCatalogFactory
  • ElasticsearchSource (remove constructor-side config checks now handled by optionRule)
  • BasicAuthProvider / ApiKeyAuthProvider / ApiKeyEncodedAuthProvider (remove redundant validate checks)
  • New shared validator utility: ElasticsearchValidators

Does this PR introduce any user-facing change?

Yes, minor validation behavior improvements:

  1. Earlier and clearer submission-time validation

    • Rejects invalid configs at submission time instead of failing later at runtime.
    • Examples:
      • Source requires at least one of index / index_list
      • search_type=SQL requires non-blank sql_query
      • username / password must appear together
      • auth_type=api_key requires non-blank auth.api_key_id and auth.api_key
      • auth_type=api_key_encoded requires non-blank and valid Base64 id:key format
  2. Aggregated validation errors

    • New ConditionExtension validators return false (instead of throwing), so multiple config issues can be returned together in one validation result.
  3. Catalog validation is now fully declared

    • ElasticSearchCatalogFactory now defines required/optional/auth-related rules in optionRule(), enabling consistent validation metadata export via REST API and CLI.

How was this patch tested?

./mvnw spotless:apply -pl seatunnel-connectors-v2/connector-elasticsearch
./mvnw test -pl seatunnel-connectors-v2/connector-elasticsearch \
  -Dtest="ElasticsearchSourceFactoryTest,ElasticsearchSinkFactoryTest,ElasticSearchCatalogFactoryTest,ElasticsearchFactoryTest" \
  -DfailIfNoTests=false

@DanielLeens DanielLeens left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for working on this. I went through the full diff and traced the actual source / sink / catalog creation paths on the current dev base.

The source-side and sink-side migration to declarative OptionRule + Conditions is moving in the right direction. In both paths the new rules are on the real hot path:

Source creation
  -> FactoryUtil.createAndPrepareSource(...) [seatunnel-api/.../FactoryUtil.java:184-193]
      -> ConfigValidator.validate(sourceFactory.optionRule()) [L191]
      -> ElasticsearchSourceFactory.createSource(...)
      -> new ElasticsearchSource(...)

Sink creation
  -> FactoryUtil.createAndPrepareSink(...) [seatunnel-api/.../FactoryUtil.java:210-260]
      -> ConfigValidator.validate(tableSinkFactory.optionRule()) [L252]
      -> ElasticsearchSinkFactory.createSink(...)

I did find one blocker before this can merge:

  1. ElasticSearchCatalogFactory.optionRule() is not on the real catalog creation path today.
    • Real path:
Catalog resolution
  -> CatalogTableUtil.getCatalogTables(...) [seatunnel-api/.../CatalogTableUtil.java:101-142]
      -> FactoryUtil.createOptionalCatalog(...) [seatunnel-api/.../FactoryUtil.java:287-295]
          -> optionalFactory.map(catalogFactory -> catalogFactory.createCatalog(...))
  • Unlike the source/sink paths, FactoryUtil.createOptionalCatalog(...) does not call ConfigValidator.of(options).validate(catalogFactory.optionRule()) before createCatalog(...).
  • Because of that, the new rules in connector-elasticsearch/.../ElasticSearchCatalogFactory.java:58-90 are effectively dead on the current dev base. They only run in the new unit test, where the test invokes ConfigValidator manually.
  • That means the PR description currently overstates the result for catalogs: invalid Elasticsearch catalog configs are still not rejected on the real submission path; they can still fail later in catalog.open() / catalog.getTables(...).

The factory tests themselves look deterministic and non-flaky, but they do not cover this real catalog runtime path, so they miss the gap above.

My merge conclusion for the current head is: fix before merge.

What I would suggest:

  • Option A: include the missing catalog validation in FactoryUtil.createOptionalCatalog(...), so the catalog path matches the source/sink validation model end-to-end.
  • Option B: if you want to keep this PR scoped to Elasticsearch only, then please drop the catalog migration/tests from this PR for now and land them after the shared catalog validation path is available.

Happy to re-review once you push an update.

@nzw921rx

Copy link
Copy Markdown
Collaborator Author

Thanks for working on this. I went through the full diff and traced the actual source / sink / catalog creation paths on the current dev base.

The source-side and sink-side migration to declarative OptionRule + Conditions is moving in the right direction. In both paths the new rules are on the real hot path:

Source creation
  -> FactoryUtil.createAndPrepareSource(...) [seatunnel-api/.../FactoryUtil.java:184-193]
      -> ConfigValidator.validate(sourceFactory.optionRule()) [L191]
      -> ElasticsearchSourceFactory.createSource(...)
      -> new ElasticsearchSource(...)

Sink creation
  -> FactoryUtil.createAndPrepareSink(...) [seatunnel-api/.../FactoryUtil.java:210-260]
      -> ConfigValidator.validate(tableSinkFactory.optionRule()) [L252]
      -> ElasticsearchSinkFactory.createSink(...)

I did find one blocker before this can merge:

  1. ElasticSearchCatalogFactory.optionRule() is not on the real catalog creation path today.
    • Real path:
Catalog resolution
  -> CatalogTableUtil.getCatalogTables(...) [seatunnel-api/.../CatalogTableUtil.java:101-142]
      -> FactoryUtil.createOptionalCatalog(...) [seatunnel-api/.../FactoryUtil.java:287-295]
          -> optionalFactory.map(catalogFactory -> catalogFactory.createCatalog(...))
  • Unlike the source/sink paths, FactoryUtil.createOptionalCatalog(...) does not call ConfigValidator.of(options).validate(catalogFactory.optionRule()) before createCatalog(...).
  • Because of that, the new rules in connector-elasticsearch/.../ElasticSearchCatalogFactory.java:58-90 are effectively dead on the current dev base. They only run in the new unit test, where the test invokes ConfigValidator manually.
  • That means the PR description currently overstates the result for catalogs: invalid Elasticsearch catalog configs are still not rejected on the real submission path; they can still fail later in catalog.open() / catalog.getTables(...).

The factory tests themselves look deterministic and non-flaky, but they do not cover this real catalog runtime path, so they miss the gap above.

My merge conclusion for the current head is: fix before merge.

What I would suggest:

  • Option A: include the missing catalog validation in FactoryUtil.createOptionalCatalog(...), so the catalog path matches the source/sink validation model end-to-end.
  • Option B: if you want to keep this PR scoped to Elasticsearch only, then please drop the catalog migration/tests from this PR for now and land them after the shared catalog validation path is available.

Happy to re-review once you push an update.

@DanielLeens thanks your review ,PR #11127 Fixed

@DanielLeens

DanielLeens commented Jun 21, 2026

Copy link
Copy Markdown
Contributor

Thanks for the update. The catalog-validation gap I called out is the shared FactoryUtil.createOptionalCatalog(...) path, so I agree that PR #11127 is the key dependency. Since this branch itself has no new commit yet, I am not starting a new full review round here. Once this PR either rebases onto that shared fix or brings the shared validation into this branch directly, happy to re-check it.

@nzw921rx nzw921rx force-pushed the improve/elasticsearch-optionrule-migration branch from 153ad14 to 9782157 Compare June 22, 2026 02:46

@DanielLeens DanielLeens left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the migration work here. I re-checked the current head from the source factory rule through the shared validator helpers and the added tests, and I do not see a source-level blocker.

What this PR fixes

  • User pain: Elasticsearch index/auth/sql-query validation was scattered and gave inconsistent submission-time feedback.
  • Fix approach: The PR moves the connector to declarative OptionRule validation and shared ElasticsearchValidators helpers across source/sink/catalog.
  • One-line summary: On the current head, the submission-time validation contract is much closer to the real Elasticsearch runtime expectations.

Runtime path I traced

Job submission
  -> `ElasticsearchSourceFactory.optionRule()` requires index/index_list, validates auth combinations, and requires `sql_query` for SQL search

Shared helper path
  -> `ElasticsearchValidators.ApiKeyEncodedFormatValidator` validates the encoded `id:key` form

Result
  -> invalid Elasticsearch configs fail earlier and more consistently

Key findings

  • The normal submission path hits these rules directly.
  • The shared encoded-key validator makes the auth contract more consistent across entry points.
  • I do not see a source-level blocker on the current head.

Blocking items
None at the source-code level on the current head.

Other review notes

  • No additional non-blocking source-level comments from this round.

Compatibility and side effects

  • Compatibility: Fully backward compatible. This mostly makes invalid configs fail earlier.
  • Side effects: No meaningful performance or resource-model downside from the new rules.
  • Error handling / logging: Submission-time validation feedback is more complete and consistent now.
  • Tests: The added tests cover the key index/auth/sql-query cases and do not show obvious flaky patterns.
  • Docs: No extra docs update is required for this validation migration.
  • CI: The red CI I traced is a Windows unit-test failure complaining that PulsarSinkFactory is missing MULTI_TABLE_SINK_REPLICA. That is unrelated to this Elasticsearch PR and should go away once the branch is rebased onto a dev that already contains #11160 (or the equivalent fix).

Merge verdict

  • I do not see a blocker on the current head. The declarative Elasticsearch validation path looks consistent with the expected runtime contract.

@nzw921rx nzw921rx force-pushed the improve/elasticsearch-optionrule-migration branch from 9782157 to dfc5560 Compare June 23, 2026 15:01

@DanielLeens DanielLeens left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the follow-up. I re-reviewed the latest head from the real Elasticsearch source / sink / catalog creation paths again, not only the newly added validator helpers.

I also rechecked the exact blocker from my earlier review. That blocker is now resolved on the current head because the shared catalog entry path does validate catalogFactory.optionRule() before createCatalog(...):

Catalog creation
  -> FactoryUtil.createOptionalCatalog(...) [FactoryUtil.java:287-299]
      -> ConfigValidator.validate(catalogFactory.optionRule())
      -> ElasticSearchCatalogFactory.createCatalog(...)

What this PR solves

  • User pain: Elasticsearch config validation used to be scattered across source / sink / auth-provider runtime checks, which led to inconsistent submission-time feedback.
  • Fix approach: the PR moves those checks into declarative OptionRule + Conditions rules, plus shared reusable validators.
  • One-line summary: on the current head, the Elasticsearch submission-time validation contract is now consistently enforced on the real source, sink, and catalog paths.

Runtime chain I checked

Source creation
  -> FactoryUtil.createAndPrepareSource(...)
      -> ConfigValidator.validate(ElasticsearchSourceFactory.optionRule())
      -> ElasticsearchSourceFactory.createSource(...)
      -> new ElasticsearchSource(...)

Sink creation
  -> FactoryUtil.createAndPrepareSink(...)
      -> ConfigValidator.validate(ElasticsearchSinkFactory.optionRule())
      -> ElasticsearchSinkFactory.createSink(...)

Catalog creation
  -> FactoryUtil.createOptionalCatalog(...)
      -> ConfigValidator.validate(ElasticSearchCatalogFactory.optionRule())
      -> ElasticSearchCatalogFactory.createCatalog(...)

Findings

  • The earlier catalog-path blocker is fixed on the current head.
  • The shared encoded API-key validator is now reused consistently across source / sink / catalog.
  • I do not see a reopened source-level blocker in the current diff.

Tests / stability

  • The new tests are still structurally stable: pure in-memory validation tests, no sleeps, no ports, no containers, and no async timing assumptions.
  • Coverage is materially better now for auth combinations, SQL search validation, and catalog validation.

CI

  • The latest Build is green on the current head.
  • I am not relying on CI alone here; the source-level blocker from the earlier review is also resolved.

Merge conclusion

Conclusion: can merge

  1. Blocking items
  • None from Daniel's side on the current head.
  1. Suggested but non-blocking follow-up
  • The note in ElasticsearchSource about per-entry index_list configs not being declaratively validated is reasonable to keep visible for future hardening, but I am not treating that as a blocker for this PR.

Overall this looks good to me on the latest revision.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants