@geruh geruh commented Jan 11, 2026

Rationale for this change

This PR adds table UUID validation on refresh and commit to detect when a table has been replaced. For example, if a table is dropped and recreated with the same name, this prevents accidentally operating on a different table than expected.

Modeled after the Java implementation.

https://github.com/apache/iceberg/blob/main/core/src/main/java/org/apache/iceberg/BaseMetastoreTableOperations.java#L202-L209

Python was missing this check.
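A minimal sketch of the refresh-time check described above, not the PR's actual code. Only the `_check_uuid` name comes from the diff; `TableMetadata`, `Table`, `TableUUIDMismatch`, and the `refresh` signature are simplified, hypothetical stand-ins so the example runs on its own.

```python
from dataclasses import dataclass
from uuid import UUID, uuid4


@dataclass(frozen=True)
class TableMetadata:
    """Hypothetical stand-in for table metadata; only the UUID matters here."""

    table_uuid: UUID


class TableUUIDMismatch(Exception):
    """Hypothetical error type; the exception used by the PR is not shown here."""


class Table:
    """Toy table handle that caches the metadata it was loaded with."""

    def __init__(self, name: str, metadata: TableMetadata) -> None:
        self.name = name
        self.metadata = metadata

    def _check_uuid(self, new_metadata: TableMetadata) -> None:
        # A different UUID under the same name means the table was replaced.
        if self.metadata.table_uuid != new_metadata.table_uuid:
            raise TableUUIDMismatch(
                f"Table UUID does not match for {self.name}: "
                f"current={self.metadata.table_uuid}, "
                f"refreshed={new_metadata.table_uuid}"
            )

    def refresh(self, fresh: TableMetadata) -> "Table":
        # `fresh` stands in for whatever the catalog returns on reload.
        self._check_uuid(fresh)
        self.metadata = fresh
        return self
```

Refreshing with metadata that carries the same UUID succeeds, while metadata from a dropped-and-recreated table raises before the cached metadata is overwritten.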

Are these changes tested?

Added tests at the table and catalog levels.

Are there any user-facing changes?

No.

@kevinjqliu kevinjqliu left a comment

Thanks for adding this!

Comment on lines 1506 to 1509
# Only check UUID for existing tables, not new tables
if not isinstance(self, StagedTable):
    self._check_uuid(response.metadata)

Contributor

What is the scenario where the commit response has a different table UUID?
The commit request should include AssertTableUUID, so I would expect the catalog to verify that.

Contributor Author

Thanks for the review @Fokko and @kevinjqliu!

I was following the behavior of both implementations: REST is explicit about the check, while the BaseMetastore operations only say to check eventually. That being said, I can't think of a scenario where the commit check would catch something that the AssertTableUUID check wouldn't, so the commit check is purely defensive.

catalog._check_endpoint(Capability.V1_DELETE_VIEW)


def test_table_uuid_check_on_commit(rest_mock: Mocker, example_table_metadata_v2: dict[str, Any]) -> None:
Contributor

I'm not a big fan of mocking this out, since I think this should already work as @kevinjqliu pointed out. When performing the update, AssertTableUUID should ensure that no other process has dropped and recreated the table. The requirement will be asserted by the REST catalog on the server, or with {Hive,Sql,etc}Catalog it should be part of the code when we maintain a lock on the table.
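To illustrate the point above, here is a rough sketch of a catalog enforcing the requirement at commit time while holding a lock. Everything in it is an assumption made for illustration: `InMemoryCatalog`, `CommitFailed`, and this local `AssertTableUUID` dataclass are toy stand-ins, not the library classes or any real catalog implementation.

```python
import threading
from dataclasses import dataclass
from uuid import UUID, uuid4


class CommitFailed(Exception):
    """Hypothetical error raised when a commit requirement is not met."""


@dataclass(frozen=True)
class AssertTableUUID:
    """Toy stand-in for the requirement named in the review, not the library class."""

    uuid: UUID

    def validate(self, current_uuid: UUID) -> None:
        if current_uuid != self.uuid:
            raise CommitFailed(
                f"Table UUID does not match: {self.uuid} != {current_uuid}"
            )


class InMemoryCatalog:
    """Toy catalog that maps table names to table UUIDs."""

    def __init__(self) -> None:
        self._tables: dict[str, UUID] = {}
        self._lock = threading.Lock()

    def create_table(self, name: str) -> UUID:
        with self._lock:
            self._tables[name] = uuid4()
            return self._tables[name]

    def drop_table(self, name: str) -> None:
        with self._lock:
            del self._tables[name]

    def commit(self, name: str, requirement: AssertTableUUID) -> UUID:
        # Validate while holding the lock so that a drop and recreate
        # cannot interleave between the check and the commit.
        with self._lock:
            requirement.validate(self._tables[name])
            return self._tables[name]
```

In this sketch, a client that remembers the UUID returned by `create_table` and then calls `commit` after another process has dropped and recreated the table gets `CommitFailed` from the catalog itself, which is why a second client-side check on the commit response would be purely defensive.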
