epoch migration command #768

Merged
kortemik merged 51 commits into teragrep:main from elliVM:epoch-migration-step
Jun 9, 2026

elliVM commented Dec 15, 2025
Contributor

Description

Implements a command that updates missing epoch values of S3 object metadata in the archive SQL database. It works by running the archive datasource in an epoch migration mode, in which the epoch value is returned in the _time column of the resulting schema.

  • Updates pth_03 to version 9.4.0, which supports the teragrep exec migration epoch command
  • Adds the jOOQ library for dynamic SQL building
  • Updates the SQL logtime metadata of the S3 objects using the archiver's epoch migration mode
  • The epoch migration mode of PTH_06 provides a calculated epoch value of the first event of the S3 object in the _time column. The _raw column contains JSON-formatted metadata about the event, which can be used to see how the event was parsed and what the source of the _time column epoch was.
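
To illustrate what an epoch value in the _time column represents, here is a minimal sketch. It assumes the first event carries an ISO 8601 style timestamp with an offset and that second precision is enough; neither assumption is confirmed by this PR. The class EpochSketch and its method are hypothetical and are not part of pth_06 or pth_10, whose actual parsing may differ.

```java
import java.time.OffsetDateTime;

// Hypothetical helper, not taken from this PR: converts an ISO 8601
// timestamp with an offset into epoch seconds.
public final class EpochSketch {

    private EpochSketch() {
    }

    public static long epochSeconds(final String timestamp) {
        // OffsetDateTime.parse uses the ISO_OFFSET_DATE_TIME format by default
        return OffsetDateTime.parse(timestamp).toEpochSecond();
    }

    public static void main(final String[] args) {
        // Assumed example timestamp, not real archive data
        System.out.println(epochSeconds("2010-01-01T12:34:56+02:00"));
    }
}
```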

Testing

Includes unit and DPL tests.

General

  • I have checked that my test files and functions have meaningful names.
  • I have checked that each test tests only a single behavior.
  • I have done happy tests.
  • I have tested only my own code.
  • I have tested at least all public methods.

Assertions

  • I have checked that my tests use assertions and not runtime overhead.
  • I have checked that my tests end in assertions.
  • I have checked that there are no comparison statements in assertions.
  • I have checked that assertions are in tests and not in helper functions.
  • I have checked that assertions for iterables are outside of for loops and both sides of the iteration blocks.
  • I have checked that assertions are not tested inside consumers.

Testing Data

  • I have tested algorithms and anything else with the possibility of unbound growth.
  • I have checked that all testing data is local and fully replaceable or reproducible or both.
  • I have checked that all test files are standalone.
  • I have checked that all test-specific fake objects and classes are in the test directory.
  • I have checked that my tests do not contain anything related to customers, infrastructure or users.
  • I have checked that my tests do not contain non-generic information.
  • I have checked that my tests do not do external requests and are not privately or publicly routable.

Statements

  • I have checked that my tests do not use throws for exceptions.
  • I have checked that my tests do not use try-catch statements.
  • I have checked that my tests do not use if-else statements.

Java

  • I have checked that my tests for Java use the JUnit library.
  • I have checked that my tests for Java use JUnit utilities for parameters.

Other

  • I have only tested public behavior and not private implementation details.
  • I have checked that my tests are not (partially) commented out.
  • I have checked that hand-crafted variables in assertions are used accordingly.
  • I have tested Object Equality.
  • I have checked that I do not have any manual tests or I have a valid reason for them and I have explained it in the PR description.

Code Quality

  • I have checked that my code follows metrics set in Procedure: Class Metrics.
  • I have checked that my code follows metrics set in Procedure: Method Metrics.
  • I have checked that my code follows metrics set in Procedure: Object Quality.
  • I have checked that my code does not have any NULL values.
  • I have checked my code does not contain FIXME or TODO comments.

@elliVM elliVM self-assigned this Dec 15, 2025
@elliVM elliVM linked an issue Dec 15, 2025 that may be closed by this pull request
@elliVM elliVM changed the title from "epoch migration step" to "epoch migration command" Dec 15, 2025
@elliVM elliVM force-pushed the epoch-migration-step branch from 0434986 to 205dfd3 Compare January 28, 2026 09:18
@elliVM elliVM force-pushed the epoch-migration-step branch 2 times, most recently from 8b6df08 to ccbb108 Compare February 5, 2026 09:19
@elliVM elliVM marked this pull request as ready for review February 5, 2026 10:07
@elliVM elliVM requested a review from Tiihott February 5, 2026 10:38

@Tiihott Tiihott left a comment

Object equality tests should be added for some new classes.
Please rebase.

Comment thread src/test/java/com/teragrep/pth_10/EpochMigrationStepTest.java
@elliVM elliVM force-pushed the epoch-migration-step branch from bb9d02d to 5b57d2d Compare February 10, 2026 10:13
elliVM commented Feb 10, 2026
Contributor Author

rebased

@elliVM elliVM requested a review from Tiihott February 11, 2026 10:51

@Tiihott Tiihott left a comment

Tests fail when run via maven.

@elliVM elliVM force-pushed the epoch-migration-step branch from 17823f6 to 9b20d0b Compare February 12, 2026 11:14
elliVM commented Feb 12, 2026
Contributor Author

rebased

@elliVM elliVM marked this pull request as draft February 18, 2026 11:02
@elliVM elliVM force-pushed the epoch-migration-step branch from dbc66fb to 481d8b9 Compare February 24, 2026 13:03
elliVM commented Feb 24, 2026
Contributor Author

rebased and switched to use new connection pool objects

@elliVM elliVM marked this pull request as ready for review February 24, 2026 13:33
elliVM commented Feb 25, 2026
Contributor Author

tests keep failing in actions with connection pool initialization errors

elliVM commented Feb 26, 2026
Contributor Author

working build now

@elliVM elliVM requested a review from Tiihott February 26, 2026 08:38
Tiihott previously approved these changes Feb 27, 2026

@Tiihott Tiihott left a comment

All tests pass and changes look fine, LGTM.

elliVM commented Feb 27, 2026
Contributor Author

Doing testing in QA

elliVM commented Feb 27, 2026
Contributor Author

Issues in QA

  • Caused by: java.sql.SQLException: No suitable driver. The executor class path does not have a JDBC driver available. FIX: added hikariConfig.setDriverClassName("org.mariadb.jdbc.Driver") so that the driver may load in the executor. If that does not help, I think the executors need to be provided with the driver some other way.
  • The archive.url/bloomdb URL uses the bloomdb database. FIX: changed to use the archive db credentials.
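
One way to narrow down the first issue is to check whether the driver class is visible on the class path of the JVM that runs the code. The following is a minimal sketch using only the JDK; the class DriverCheckSketch and its method are hypothetical and are not part of this PR.

```java
import java.sql.Driver;

// Hypothetical diagnostic, not part of this PR: reports whether a JDBC
// driver class can be found on the class path of the current JVM.
public final class DriverCheckSketch {

    private DriverCheckSketch() {
    }

    public static boolean isDriverOnClassPath(final String className) {
        try {
            // initialize=false: only look the class up, do not run its static initializers
            final Class<?> candidate = Class
                    .forName(className, false, DriverCheckSketch.class.getClassLoader());
            // The class must also implement java.sql.Driver to be usable by JDBC
            return Driver.class.isAssignableFrom(candidate);
        }
        catch (final ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(final String[] args) {
        // Result depends on whether the MariaDB driver jar is on the class path
        System.out.println(isDriverOnClassPath("org.mariadb.jdbc.Driver"));
    }
}
```

Running such a check inside an executor task, rather than on the driver node, would show whether the executors themselves see the jar.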

Tiihott previously approved these changes Mar 2, 2026

@Tiihott Tiihott left a comment

Tests pass and new changes look ok. A new test run in QA should be done to check if the credential and JDBC driver fixes resolve the issues found in QA.

elliVM commented Mar 2, 2026
Contributor Author

I think the JDBC driver must be added to the spark executor jars

@elliVM elliVM force-pushed the epoch-migration-step branch from c8279bd to f843e22 Compare May 18, 2026 07:35
elliVM commented May 18, 2026
Contributor Author

rebased

@Tiihott Tiihott left a comment

There are some missing equals() and hashCode() overrides. Also the SyslogArchiveObjectMetadataFormat and UnknownArchiveObjectMetadataFormat classes should encapsulate something, for example the json String.
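
As an illustration of the request above, here is a minimal sketch of a class that encapsulates a JSON String and overrides equals() and hashCode(). The class JsonMetadataFormatSketch is a hypothetical stand-in and does not reproduce SyslogArchiveObjectMetadataFormat or UnknownArchiveObjectMetadataFormat from this PR.

```java
import java.util.Objects;

// Hypothetical stand-in, not a class from this PR: an immutable object
// that encapsulates a JSON String and defines value-based equality.
public final class JsonMetadataFormatSketch {

    private final String json;

    public JsonMetadataFormatSketch(final String json) {
        this.json = json;
    }

    public String json() {
        return json;
    }

    @Override
    public boolean equals(final Object o) {
        if (this == o) {
            return true;
        }
        if (o == null || getClass() != o.getClass()) {
            return false;
        }
        final JsonMetadataFormatSketch other = (JsonMetadataFormatSketch) o;
        return Objects.equals(json, other.json);
    }

    @Override
    public int hashCode() {
        // Must be derived from the same fields as equals()
        return Objects.hash(json);
    }
}
```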

…ssing equals and hashcode methods and update tests
@elliVM elliVM requested a review from Tiihott May 19, 2026 13:55

@Tiihott Tiihott left a comment

Tests pass and changes look ok, but a couple of EqualsVerifier tests are missing from the new changes.

@elliVM elliVM requested a review from Tiihott May 22, 2026 08:28
Tiihott previously approved these changes May 22, 2026

@Tiihott Tiihott left a comment

LGTM

@elliVM elliVM requested a review from kortemik May 22, 2026 12:14
Comment thread src/main/java/com/teragrep/pth_10/steps/teragrep/migrate/StubResolvedFormat.java Outdated
elliVM commented Jun 3, 2026
Contributor Author

Switched insertion into the temp table to use the jOOQ Loader API, with a decorator that converts Spark Rows into the Object[] expected by the loader.
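
The decorator idea can be sketched with the JDK alone. In the sketch below a List<Object> stands in for a Spark Row, which is an assumption made to keep the example self-contained; the class RowToArrayIteratorSketch is hypothetical and is not the decorator added in this PR.

```java
import java.util.Iterator;
import java.util.List;

// Hypothetical decorator, not the class from this PR: wraps an iterator of
// rows and exposes each row as an Object[]. A List<Object> stands in for a
// Spark Row here.
public final class RowToArrayIteratorSketch implements Iterator<Object[]> {

    private final Iterator<List<Object>> origin;

    public RowToArrayIteratorSketch(final Iterator<List<Object>> origin) {
        this.origin = origin;
    }

    @Override
    public boolean hasNext() {
        return origin.hasNext();
    }

    @Override
    public Object[] next() {
        // Conversion happens lazily, one row at a time
        return origin.next().toArray();
    }
}
```

Converting lazily means the rows do not all have to be copied into memory before they are handed to a loader.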

@elliVM elliVM requested a review from kortemik June 3, 2026 10:40
@kortemik kortemik requested a review from eemhu June 3, 2026 10:40

@kortemik kortemik left a comment

Member

epoch is not from resolved format?

eemhu commented Jun 9, 2026
Contributor

tests pass locally

@kortemik kortemik merged commit 5eb0094 into teragrep:main Jun 9, 2026
1 check passed

Development

Successfully merging this pull request may close these issues.

Implement a spark job to update missing logfile epoch metadata

4 participants