FlatPostgresCollection Create Doc Impl #266

suddendust · 2026-01-12T19:39:48Z

Description

This PR implements Collection#create(Key key, Document document) for FlatPostgresCollection.

Testing

Added integration tests.

Checklist:

My changes generate no new warnings
I have added tests that prove my fix is effective or that my feature works
Any dependent changes have been merged and published in downstream modules

suresh-prakash · 2026-01-13T02:18:10Z

...e/src/main/java/org/hypertrace/core/documentstore/postgres/model/PostgresColumnMetadata.java

  private final DataType canonicalType;
  @Getter private final PostgresDataType postgresType;
  private final boolean nullable;
+  private final boolean array;


Nit: isArray, since array is slightly confusing without looking at this line.

codecov · 2026-01-14T06:43:22Z

Codecov Report

❌ Patch coverage is 84.90566% with 24 lines in your changes missing coverage. Please review.
✅ Project coverage is 80.60%. Comparing base (4853757) to head (c3024b0).

Files with missing lines	Patch %	Lines
...documentstore/postgres/FlatPostgresCollection.java	83.91%	16 Missing and 7 partials ⚠️
...rg/hypertrace/core/documentstore/CreateResult.java	85.71%	1 Missing ⚠️

Additional details and impacted files

@@             Coverage Diff              @@
##               main     #266      +/-   ##
============================================
+ Coverage     80.51%   80.60%   +0.09%     
- Complexity     1385     1390       +5     
============================================
  Files           234      235       +1     
  Lines          6194     6348     +154     
  Branches        554      579      +25     
============================================
+ Hits           4987     5117     +130     
- Misses          831      848      +17     
- Partials        376      383       +7

Flag	Coverage Δ
integration	`80.60% <84.90%> (+0.09%)`	⬆️
unit	`57.37% <5.66%> (-1.29%)`	⬇️

suddendust · 2026-01-14T06:44:42Z

@suresh-prakash Can you take a look at the following:

If the code isn't able to find the columnMetadata of a particular column even after retries, this column is skipped (rather than failing). Is this behaviour correct? I went with this design to write on a best-effort basis.
Similarly, if for some reason we're unable to parse the value of a column, we add it to skippedFields list rather than failing the query.
If any columns are skipped this way, we return the list to the client in the CreateResult. For this, CreateResult has been enhanced.
We retry once by default if the first attempt fails with one of the following SQL states: UNDEFINED_COLUMN("42703") or DATATYPE_MISMATCH("42804"). The reasoning is that the schema may have changed, so it is better to refresh it and retry.
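
The retry in the last point could look roughly like the sketch below. It is illustrative only: `SchemaRetrySketch`, `SqlWrite`, and `writeWithSingleRetry` are hypothetical names rather than code from this PR, and the schema refresh is modelled as a plain `Runnable`. Only the two SQL state codes come from the discussion.

```java
import java.sql.SQLException;
import java.util.Set;

/** Hypothetical sketch of "refresh the schema and retry once"; not the PR's actual code. */
final class SchemaRetrySketch {
  // SQL states named in the comment above.
  static final String UNDEFINED_COLUMN = "42703";
  static final String DATATYPE_MISMATCH = "42804";
  private static final Set<String> RETRYABLE_STATES =
      Set.of(UNDEFINED_COLUMN, DATATYPE_MISMATCH);

  /** A write attempt that may fail with a SQLException (assumed shape). */
  @FunctionalInterface
  interface SqlWrite<T> {
    T execute() throws SQLException;
  }

  /** True when the failure suggests that the cached schema may be stale. */
  static boolean isSchemaRelated(SQLException e) {
    String state = e.getSQLState();
    return state != null && RETRYABLE_STATES.contains(state);
  }

  /** Runs the write; on a schema-related failure, refreshes the schema and tries once more. */
  static <T> T writeWithSingleRetry(SqlWrite<T> write, Runnable refreshSchema)
      throws SQLException {
    try {
      return write.execute();
    } catch (SQLException first) {
      if (!isSchemaRelated(first)) {
        throw first; // unrelated failures are not retried
      }
      refreshSchema.run();
      return write.execute(); // a second failure propagates to the caller
    }
  }
}
```

Any other SQL state is rethrown immediately, so the retry stays limited to failures that a schema refresh could plausibly fix.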

suresh-prakash · 2026-01-14T07:16:47Z

If the code isn't able to find the columnMetadata of a particular column even after retries, this column is skipped (rather than failing). Is this behaviour correct? I went with this design to write on a best-effort basis.

Best-effort makes sense. But most of the time such messages are ignored, even though we may be setting them in the CreateResult. I'd rather we fail the write so that the client can take the necessary action (they may choose to retry with the column ignored, if we throw a custom exception with meaningful info).

Similarly, if for some reason we're unable to parse the value of a column, we add it to skippedFields list rather than failing the query.

This is fine, I guess, because that's how the Mongo impl. works today. If I give an invalid selection, it doesn't throw. Rather, it omits the field from the JSON, which in turn results in a null (or missing) value on the caller/client side.

We retry once by default if the first attempt fails with one of the following SQL states: UNDEFINED_COLUMN("42703") or DATATYPE_MISMATCH("42804"). The reasoning is that the schema may have changed, so it is better to refresh it and retry.

100% makes sense. 🙂

puneet-traceable · 2026-01-14T15:47:03Z

Why retry on DATATYPE_MISMATCH? This can be a bad situation to be in. If the client's data type doesn't match the DB column type, retries won't help. A DB column's data type should never change in Postgres.

suddendust · 2026-01-14T20:10:12Z

A DB column's data type should never change in Postgres.

Well, theoretically speaking, the rationale behind the retry is that the data type may have changed and the cached schema may be stale, so we refresh it and try again.

suddendust · 2026-01-15T07:28:36Z

@suresh-prakash I am not throwing an exception because Mongo doesn't throw one either. So, to keep the interface behaviour consistent, shall we stick to it?

suresh-prakash · 2026-01-15T08:08:43Z

True. But silently skipping fields seems dangerous. Most clients would be unaware of it. While it may give immediate compatibility, it could create far more issues later. Since this write path requires code changes in the clients anyway, I still think it would be better to let the clients handle it rather than have the library do it behind the scenes.

@kotharironak Thoughts here?

suddendust · 2026-01-16T06:08:26Z

@suresh-prakash How about a config to control this (in customParameters)? Maybe something like bestEffortWrites: true/false. If true, it would write on a best-effort basis. If false, it would do a strict match: all fields passed in the doc must be present in the schema, with values of the right type.

suddendust · 2026-01-16T07:16:29Z

@suresh-prakash I have added a bestEffortWrites custom param that controls the data flow as follows:

If true, then PG skips any fields passed in the document that are not present in the schema, or whose values' types don't conform to the schema.
If false, it does a strict match: all fields present in the doc must be present in the schema, and the types of the passed values must conform to the defined schema.

Wdyt?
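
As a rough illustration of the two modes, a minimal sketch follows. Everything in it is an assumption made for the example: the schema is modelled as a `Map` from column name to Java class, `BestEffortWriteSketch` and `selectWritableFields` are hypothetical names, and null values are treated as conforming. The PR's actual type checks against Postgres column metadata will differ.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of best-effort versus strict field selection. */
final class BestEffortWriteSketch {

  /** Fields that can be written, plus the names of fields that were skipped. */
  static final class Result {
    final Map<String, Object> writable = new LinkedHashMap<>();
    final List<String> skippedFields = new ArrayList<>();
  }

  static Result selectWritableFields(
      Map<String, Object> document, Map<String, Class<?>> schema, boolean bestEffortWrites) {
    Result result = new Result();
    for (Map.Entry<String, Object> field : document.entrySet()) {
      String name = field.getKey();
      Object value = field.getValue();
      Class<?> expectedType = schema.get(name); // null means the column is unknown

      // Assumption: a null value conforms to any known column.
      boolean conforms =
          expectedType != null && (value == null || expectedType.isInstance(value));
      if (conforms) {
        result.writable.put(name, value);
      } else if (bestEffortWrites) {
        result.skippedFields.add(name); // best effort: skip and report
      } else {
        throw new IllegalArgumentException( // strict: reject the whole write
            "Field '" + name + "' is missing from the schema or has a non-conforming type");
      }
    }
    return result;
  }
}
```

With `bestEffortWrites` set to true the caller gets back the skipped field names, which matches the idea of returning them in the CreateResult; with false the first offending field aborts the write.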

suresh-prakash · 2026-01-16T11:00:46Z

This is a nice middle ground. 🙂 Just a small suggestion, though. Instead of a boolean, can we make it an enum (say, MissingColumnStrategy), please? That way, if a third strategy is ever needed, it's straightforward to extend. E.g. values: SKIP, THROW, IGNORE_DOCUMENT, MENTION_IN_RESPONSE, etc. (Of these, we can implement just the necessary ones today.)
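
A minimal sketch of what such an enum might look like is below. Only the name MissingColumnStrategy and the example values come from the suggestion above; the `fromCustomParameter` helper, the string-based parsing, and the choice of SKIP as the default are assumptions for illustration.

```java
import java.util.Locale;

/** Hypothetical sketch of the suggested enum; only SKIP and THROW are included here. */
enum MissingColumnStrategy {
  SKIP, // leave the column out of the write
  THROW; // fail the write so the client can react
  // Further values such as IGNORE_DOCUMENT or MENTION_IN_RESPONSE could be added later.

  /** Parses a custom-parameter value; falling back to SKIP is an assumption. */
  static MissingColumnStrategy fromCustomParameter(String raw) {
    if (raw == null || raw.trim().isEmpty()) {
      return SKIP;
    }
    // valueOf throws IllegalArgumentException for unknown names.
    return valueOf(raw.trim().toUpperCase(Locale.ROOT));
  }
}
```

Compared with a boolean, adding a new strategy later only means adding a constant and handling it where the strategy is applied.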

suddendust added 21 commits December 28, 2025 23:17

Added PostgresSchemaRegistry.java

00e9a9c

Spotless

31846e9

WIP

2fdbf0e

Spotless

1727dd0

Remove unused method in SchemaRegistry

a62fbc2

Remove unused method in ColumnMetadata

6b7595b

WIP

7b4ef2a

WIP

598cb25

Configure cache expiry and cooldown

9c173b9

Added PostgresMetadataFetcherTest

7bf77c5

WIP

6d03cd5

Added docs on thread safety

c3f5f7e

Added PostgresSchemaRegistryIntegrationTest.java

827381f

WIP

602037b

WIP

c8a53eb

Refactor

31f16e2

Merge branch 'schema_cache' into pg_write_create

c139bee

Merge branch 'main' of github.com:hypertrace/document-store into sche…

75150e3

…ma_cache

Merge branch 'schema_cache' into pg_write_create

5412f9b

WIP

bf5ca5c

Implement create for flat collections

57b623a

suddendust requested review from avinashkolluru, kotharironak, skjindal93 and suresh-prakash as code owners January 12, 2026 19:39

Merge branch 'main' of github.com:hypertrace/document-store into pg_w…

233c9c4

…rite_create

suddendust changed the title ~~[Draft] Pg write create~~ FlatPostgresCollection Create Doc Impl Jan 12, 2026

Fix compilation issue

9f8811e

suresh-prakash previously approved these changes Jan 13, 2026

Refactor

bfd6651

suddendust dismissed suresh-prakash’s stale review via bfd6651 January 14, 2026 06:22

Enhanced CreateResult.java and others

70ec4b3

suddendust added 2 commits January 14, 2026 12:34

Added more test cases

6d1c277

Spotless

910ef8c

suddendust added 2 commits January 14, 2026 13:47

WIP

3e2c178

Add more test coverage

9061e24

Added bestEfforts configuration to PG custom parameters.

c3024b0
