fix(collector): handle Aurora's unsupported pg_last_xact_replay_times…#1274
Open
dannotripp wants to merge 1 commit into
…tamp

Aurora PostgreSQL does not support pg_last_xact_replay_timestamp() and returns a feature_not_supported error (code 0A000) when the replication collector queries it. This causes the collector to crash on every scrape for Aurora instances.

When this error is detected, the collector now falls back to a simpler query that only reads pg_is_in_recovery(), so is_replica is still reported correctly. The time-based metrics (lag_seconds and last_replay_seconds) are emitted as NaN to signal that the values are unavailable, rather than crashing the collection cycle entirely.

The error is identified by checking for a *pq.Error with class "0A" (feature_not_supported) and a message that contains "Aurora", which avoids incorrectly suppressing the same error code on standard Postgres.

A new test TestPgReplicationCollectorAurora covers this fallback path.

Signed-off-by: Danno Tripp <danno.tripp@reddit.com>
df20274 to
b7c82e9
Compare
leonardobenedet
approved these changes
Apr 28, 2026
Pull request overview
This PR updates the pg_replication collector to gracefully handle Amazon Aurora PostgreSQL, which does not support pg_last_xact_replay_timestamp(), preventing the collector from failing on every scrape for Aurora instances.
Changes:
- Detect Aurora’s feature_not_supported error when the replication query uses pg_last_xact_replay_timestamp().
- Fall back to a simpler query that still reports pg_replication_is_replica via pg_is_in_recovery().
- Emit NaN for time-based replication metrics when the underlying function is unsupported, and add a unit test covering this fallback.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| collector/pg_replication.go | Adds Aurora-specific error detection and a fallback query; emits NaN for unsupported time-based metrics. |
| collector/pg_replication_test.go | Adds TestPgReplicationCollectorAurora to validate fallback behavior and NaN emission. |
row2 := db.QueryRowContext(ctx, pgReplicationIsReplicaQuery)
if err2 := row2.Scan(&isReplica); err2 != nil {
	isReplica = 0
sysadmind
reviewed
May 13, 2026
	}
}

lagValue := math.NaN()
Contributor
Why NaN over just not emitting the metric? The latter is common in other collectors when the database value is NULL.
megative
added a commit
to megative/postgres_exporter
that referenced
this pull request
May 17, 2026
Pulls pg_wal out of this PR's scope to keep it focused on the new aurora_* collectors. Will open the pg_wal fallback as a separate small bugfix PR (the wal-collector analog of prometheus-community#1274).

Removes:
- pg_wal.go/pg_wal_test.go Aurora-specific fallback (reverted to master)
- isAuroraUnsupportedFunction helper (had no remaining callers)
- Related imports + test

Cleans CHANGELOG / README accordingly.

Signed-off-by: Pavel K <megativ3@gmail.com>
This was referenced May 17, 2026
fix(collector): handle Aurora's unsupported pg_last_xact_replay_timestamp
Fixes #1273
Aurora PostgreSQL does not support pg_last_xact_replay_timestamp(), causing the replication collector to abort every scrape with a fatal error on Aurora instances.

This change detects the Aurora-specific feature_not_supported error (Postgres error class 0A) and falls back gracefully: pg_replication_is_replica is still reported via pg_is_in_recovery(), while the time-based metrics emit NaN to signal they are unavailable.

A new test TestPgReplicationCollectorAurora covers the fallback path.