[SPARK-57518][SQL] Make ThriftServer JDBC metadata operations DataSource V2 catalog-aware by yadavay-amzn · Pull Request #56627 · apache/spark

yadavay-amzn · 2026-06-20T01:46:17Z

What changes were proposed in this pull request?

Route ThriftServer JDBC metadata operations (getCatalogs, getSchemas, getTables, getColumns) through CatalogManager so they honor DataSource V2 catalogs and the default catalog. Populate TABLE_CAT with the real catalog name. Introduce a new conf spark.sql.thriftServer.catalogMetadata.enabled (default true) with legacy fallback when disabled.

Why are the changes needed?

With spark.sql.catalog.* and spark.sql.defaultCatalog set, JDBC/BI clients get inconsistent metadata because the metadata operations use the V1 SessionCatalog directly and ignore any configured DSv2 catalogs.

Design notes

(a) A null catalogName resolves to the CURRENT catalog (consistent with Spark's own unspecified-to-current resolution and with Trino/Snowflake behavior), not all catalogs.

(b) getCatalogs returns CatalogManager.listCatalogs() (including spark_catalog, sorted alphabetically).

(c) TABLE_CAT was previously empty or null; it is now populated with the actual catalog name. The conf defaults to true, and setting it to false is the escape hatch for clients that relied on parsing an empty TABLE_CAT.

(d) KNOWN LIMITATION: listCatalogs() returns only ALREADY-LOADED catalogs, so catalogs that are configured but never accessed will not be listed. This is documented; we do not eagerly load catalogs.

(e) V2-specific metadata authorization is deferred to a follow-up. Existing Hive auth hooks are unchanged and getCatalogs/getSchemas already pass null priv objects.
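
A rough sketch of how notes (a), (b) and (d) interact. This is a toy Python model, not Spark code: ToyCatalogManager, load, resolve and list_catalogs are hypothetical names invented for illustration, and the catalog names "iceberg" and "archive" are made up. It also assumes the default catalog counts as loaded at startup, which the PR text does not state.

```python
from typing import List, Optional

SESSION_CATALOG = "spark_catalog"


class ToyCatalogManager:
    """Hypothetical stand-in for Spark's CatalogManager (illustration only)."""

    def __init__(self, configured: List[str], default: str = SESSION_CATALOG) -> None:
        # Catalogs that have a spark.sql.catalog.<name> entry, plus the session catalog.
        self._configured = set(configured) | {SESSION_CATALOG}
        # Catalogs that have actually been loaded so far.
        self._loaded = {SESSION_CATALOG}
        # Assumption: the default catalog becomes the current one and is loaded eagerly.
        self.current = self.load(default)

    def load(self, name: str) -> str:
        """Lazily load a configured catalog on first access."""
        if name not in self._configured:
            raise KeyError(f"catalog not configured: {name}")
        self._loaded.add(name)
        return name

    def list_catalogs(self) -> List[str]:
        """Note (b) and (d): only already-loaded catalogs, sorted alphabetically."""
        return sorted(self._loaded)

    def resolve(self, catalog_name: Optional[str]) -> str:
        """Note (a): a null catalog name means the current catalog, not all catalogs."""
        if catalog_name is None:
            return self.current
        return self.load(catalog_name)


manager = ToyCatalogManager(configured=["iceberg", "archive"], default="iceberg")
print(manager.resolve(None))    # iceberg
print(manager.list_catalogs())  # ['iceberg', 'spark_catalog']
manager.resolve("archive")      # first access loads the catalog
print(manager.list_catalogs())  # ['archive', 'iceberg', 'spark_catalog']
```

In this model "archive" is configured from the start but only shows up in the listing after something touches it, which is the limitation described in (d).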

Does this PR introduce any user-facing change?

Yes. TABLE_CAT now reflects the catalog name (gated by the new conf). New conf: spark.sql.thriftServer.catalogMetadata.enabled.
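
To illustrate what a client would see, here is a minimal Python sketch of the conf gate. Only the conf key and its default come from this PR; the table_cat helper is hypothetical, the legacy value is modeled as an empty string (the description says it was "empty or null"), and the catalog name "iceberg" is made up.

```python
from typing import Dict

CONF_KEY = "spark.sql.thriftServer.catalogMetadata.enabled"


def table_cat(conf: Dict[str, str], catalog: str) -> str:
    """Return the TABLE_CAT value a metadata row would carry (toy model)."""
    # The conf defaults to true when it is not set.
    enabled = conf.get(CONF_KEY, "true").strip().lower() == "true"
    # Legacy path: modeled here as an empty string (assumption).
    return catalog if enabled else ""


print(repr(table_cat({}, "iceberg")))                   # 'iceberg'
print(repr(table_cat({CONF_KEY: "false"}, "iceberg")))  # ''
```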

How was this patch tested?

SparkMetadataOperationSuite covers: default spark_catalog path, configured in-memory DSv2 catalog path, and conf-disabled legacy path (getCatalogs, getSchemas, getTables, getColumns).

Was this patch authored or co-authored using generative AI tooling?

Authored with assistance from Claude Opus 4.8

…rce V2 catalog-aware Route getCatalogs/getSchemas/getTables/getColumns through CatalogManager to honor DSv2 catalogs and the default catalog. Populate TABLE_CAT with the real catalog name. New conf spark.sql.thriftServer.catalogMetadata.enabled (default true) with legacy fallback.

yadavay-amzn force-pushed the fix/SPARK-57518-thriftserver-dsv2-metadata branch from 6700cf8 to 0b42f09 on June 20, 2026 07:06

yadavay-amzn wants to merge 1 commit into apache:master from yadavay-amzn:fix/SPARK-57518-thriftserver-dsv2-metadata
