Backport PG14_ARCHIVE in REL_2_STABLE stable #1819
Open
reshke wants to merge 128 commits into
Open
Conversation
For a selected list of PG system views (started with 'pg_'prefix ), we will create a corresponding 'gp_' view for each one in the list. Each 'gp_' view is basically a UNION ALL of the results of running the corresponding 'pg_' view on all segments (including the coordinator). Note that, these views do not aggregate the results. The aggregate version of the views will be named with a '_summary' appendix (such as 'gp_stat_all_tables_summary'). To add a new 'pg_' view to this list, simply put the name in file 'src/backend/catalog/system_views_gp.in'. This commit adds an initial list of views that we think make sense to have 'gp_' views. With this change, we also remove the existing definition of gp_stat_archiver view and let it be generated automatically. We also had gp_stat_replication but it carries additional column than pg_stat_replication so it cannot use the automatic way.
Some pg_ views have been modified by cbdb: the gp_segment_id colmun has been added to them. So they are failed to be transformed from the pg_ views to gp_ views (see commit 5028222620d410fe3d4c60f732a599e269006968) So just remove them from system_vies_gp.in. Maybe better to fix them later.
We used to not have a very clear naming guideline for the existing 'pg_%' system views and the MPP versions of them. As an example, we renamed PG's pg_stat_all_tables and pg_stat_all_indexes to have an '_internal' appendix, and used their original names to collect aggregated results from all segments (commit e6f9303). However, with the previous commit, we now let all existing PG system views to have their original names, while add corresponding 'gp_%' views for the non-aggregated results from all segments, and 'gp_%_summary' views for aggregated results from all segments. Therefore, we now revert pg_stat_all_tables and pg_stat_all_indexes back to their original definitions, which just collect stats from a single segment. Then, we add them to sytem_views_gp.in to produce gp_stat_all_tables and gp_stat_all_indexes which collect non-aggregated results from all segments. Finally, we rename the aggregate version of those views to be gp_stat_all_tables_summary and gp_stat_all_indexes_summary. Because views pg_stat_user_tables and pg_stat_user_indexes use the above sumary views, we have to add _summary views for these two views as well. We will add _summary for other system views later. Modify regress test accordingly.
Added the following views: gp_stat_progress_vacuum_summary gp_stat_progress_analyze_summary gp_stat_progress_cluster_summary gp_stat_progress_create_index_summary Also replaced pg_stat_progress_* views with gp_stat_progress_* views for existing tests.
These summary views offer basic aggregation of the gp_stat_* views across Greenplum coordinator and segments. Aggregation logic applied as follows: * Time related (last_%): use max() * Transaction related, not innately summable (number of commits/rollbacks) : use max() * Table specific: sum()/numsegments for replicated tables, sum() for distributed tables * Innately summable stats, if no particular table is involved: use sum() * pid: use coordinator's pid (not used here, but this is the convention in other gp_%_summary views)
setup_cdb_schema() checked errno after a readdir() loop without resetting it beforehand. In some environments (e.g., Ubuntu 24.04), a stale errno value from operations inside the loop (such as pg_realloc or pg_strdup) could persist, causing readdir's normal termination to be misinterpreted as a failure (e.g., "Function not implemented"). This commit fixes the issue by adopting the standard PostgreSQL idiom: - Use "while (errno = 0, (file = readdir(dir)) != NULL)" to ensure errno is cleared strictly before each readdir() call. - Move closedir() after the errno check to prevent it from overwriting the error code from readdir(). - Add defensive error checking for the closedir() call itself. This ensures robust directory scanning and reliable error reporting during cluster initialization.
Fix SyntaxWarning caused by invalid escape sequences in mainUtils.py and logfilter.py. These warnings appear on Python 3.12+ (e.g., Ubuntu 24.04) and will become SyntaxError in Python 3.14. Changes: - mainUtils.py: Use raw strings for shell commands containing '\$' - logfilter.py: Use raw docstrings for functions containing regex examples See: #1587
Correct a typo in LICENSE that referenced the ISC license file with a duplicated directory name. No functional change.
* remove unused build_xerces.py
When CREATE TABLESPACE ... LOCATION '' is dispatched from QD to QE, the serialization converts the empty string to NULL. On the segment, pstrdup(stmt->location) crashes with SIGSEGV because stmt->location is NULL. Add a NULL guard to treat NULL the same as empty string, preserving in-place tablespace semantics. Fixes #1627
movedb() acquires a session-level AccessExclusiveLock on the database
via MoveDbSessionLockAcquire(). The release in CommitTransaction()
only checked for GP_ROLE_DISPATCH and IS_SINGLENODE(), missing
GP_ROLE_UTILITY. This caused the lock to leak in standalone backends
(e.g. TAP tests), triggering a proc.c assertion failure at exit:
FailedAssertion("SHMQueueEmpty(&(MyProc->myProcLocks[i]))")
Add GP_ROLE_UTILITY to the release condition.
Also fix a spurious "could not read symbolic link" log message when
dropping in-place tablespaces: readlink() on a directory returns
EINVAL, which is expected and can be safely skipped.
Fixes #1626
Two fixes:
- Use TestLib::slurp_file instead of PostgreSQL::Test::Utils::slurp_file
in adjust_conf(). PostgresNode.pm only imports TestLib, not
PostgreSQL::Test::Utils, so the latter is undefined and causes
t/101_restore_point_and_startup_pause to fail.
- Replace cp with install -m 644 in enable_archiving() archive_command.
coreutils 8.32 (Rocky 8, Ubuntu 22.04) uses copy_file_range() in cp
which crashes on Docker overlayfs. install does not use
copy_file_range(), avoiding the crash.
Also add ic-recovery test suite to rocky8 and deb CI pipelines.
Guard pfree/list_free calls with pointer-equality checks to avoid freeing live nodes when flatten_join_alias_vars returns the same pointer unchanged (e.g., outer-reference Vars with varlevelsup > 0). The unconditional pfree(havingQual) freed the Var node, whose memory was later reused by palloc for a T_List. copyObjectImpl then copied the wrong node type into havingQual, causing ORCA to encounter an unexpected RangeTblEntry and fall back to the Postgres planner. Applies the same guard pattern to all six fields: targetList, returningList, havingQual, scatterClause, limitOffset, limitCount. Reported-in: #1618
Force correlated execution (SubPlan) for scalar subqueries with GROUP BY () and a correlated HAVING clause. Previously ORCA decorrelated such subqueries into Left Outer Join + COALESCE(count,0), which incorrectly returned 0 instead of NULL when the HAVING condition was false. Add FHasCorrelatedSelectAboveGbAgg() to detect the pattern where NormalizeHaving() has converted the HAVING clause into a CLogicalSelect with outer refs above a CLogicalGbAgg with empty grouping columns. When detected, set m_fCorrelatedExecution = true in Psd() to bypass the COALESCE decorrelation path. Update groupingsets_optimizer.out expected output to reflect the new ORCA SubPlan format instead of Postgres planner fallback. Reported-in: #1618
Add a new AQUMV code path that rewrites multi-table JOIN queries to
scan materialized views when the query exactly matches the MV
definition. This compares the saved raw parse tree against stored
viewQuery from gp_matview_aux, bypassing the single-table AQUMV
logic entirely.
This enables significant query acceleration for common analytical
patterns: instead of repeatedly computing expensive multi-table joins
at query time, the planner can directly read pre-computed results
from the materialized view, turning O(N*M) join operations into a
simple sequential scan.
For example, given:
CREATE MATERIALIZED VIEW mv AS
SELECT t1.a, t2.b FROM t1 JOIN t2 ON t1.a = t2.a;
-- Before (GUC off): original join plan
Gather Motion 3:1
-> Hash Join
Hash Cond: (t1.a = t2.a)
-> Seq Scan on t1
-> Hash
-> Seq Scan on t2
-- After (GUC on): rewritten to MV scan
Gather Motion 3:1
-> Seq Scan on mv
Fix three issues in aqumv_query_is_exact_match(): - Add groupDistinct comparison (GROUP BY vs GROUP BY DISTINCT) - Add limitOption comparison (LIMIT vs FETCH FIRST WITH TIES) - Clear qp_extra in-place via aqumv_context->qp_extra instead of allocating a local char array; move standard_qp_extra typedef to planner.h so aqumv.c can reference the proper struct type Add test cases 26-28 to verify the new comparisons: - LIMIT vs FETCH FIRST WITH TIES non-match and exact match - GROUP BY DISTINCT vs GROUP BY non-match
The rewrite cleared sortClause, so grouping_planner() skipped adding a Sort node — queries with ORDER BY returned unsorted results from the MV scan. Fix: preserve sortClause and copy ressortgroupref to rewritten target entries so the upper planner generates Sort correctly. Before: Limit -> Gather -> Limit -> Seq Scan on mv After: Limit -> Gather -> Limit -> Sort -> Seq Scan on mv
When the relkind of a relache entry changes, because a table is converted into a view, pgstats can get confused in 15+, leading to crashes or assertion failures. For HEAD, Tom fixed this in b23cd18, by removing support for converting a table to a view, removing the source of the inconsistency. This commit just adds an assertion that a relcache entry's relkind does not change, just in case we end up with another case of that in the future. As there's no cases of changing relkind anymore, we can't add a test that that's handled correctly. For 15, fix the problem by not maintaining the association with the old pgstat entry when the relkind changes during a relcache invalidation processing. In that case the pgstat entry needs to be unlinked first, to avoid PgStat_TableStatus->relation getting out of sync. Also add a test reproducing the issues. No known problem exists in 11-14, so just add the test there. Reported-by: vignesh C <vignesh21@gmail.com> Author: Andres Freund <andres@anarazel.de> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/CALDaNm2yXz+zOtv7y5zBd5WKT8O0Ld3YxikuU3dcyCvxF7gypA@mail.gmail.com Discussion: https://postgr.es/m/CALDaNm3oZA-8Wbps2Jd1g5_Gjrr-x3YWrJPek-mF5Asrrvz2Dg@mail.gmail.com Backpatch: 15-
When pg_dump retrieves the list of database objects and performs the data dump, there was possibility that objects are replaced with others of the same name, such as views, and access them. This vulnerability could result in code execution with superuser privileges during the pg_dump process. This issue can arise when dumping data of sequences, foreign tables (only 13 or later), or tables registered with a WHERE clause in the extension configuration table. To address this, pg_dump now utilizes the newly introduced restrict_nonsystem_relation_kind GUC parameter to restrict the accesses to non-system views and foreign tables during the dump process. This new GUC parameter is added to back branches too, but these changes do not require cluster recreation. Back-patch to all supported branches. Reviewed-by: Noah Misch Security: CVE-2024-7348 Backpatch-through: 12
This is a further update based on the PR: - #1625
Remove generation and installation of diskquota-build-info from diskquota to avoid writing extra files into the install prefix root (e.g. $GPHOME or /usr/local/cloudberry-db). Drop the unused CMake helper cmake/BuildInfo.cmake and remove cmake/Git.cmake and its invocation now that no build-info is produced.
It was always false in single-user mode, in autovacuum workers, and in background workers. This had no specifically-identified security consequences, but non-core code or future work might make it security-relevant. Back-patch to v11 (all supported versions). Jelte Fennema-Nio. Reported by Jelte Fennema-Nio.
This commit changes libpq so that errors reported by the backend during the protocol negotiation for SSL and GSS are discarded by the client, as these may include bytes that could be consumed by the client and write arbitrary bytes to a client's terminal. A failure with the SSL negotiation now leads to an error immediately reported, without a retry on any other methods allowed, like a fallback to a plaintext connection. A failure with GSS discards the error message received, and we allow a fallback as it may be possible that the error is caused by a connection attempt with a pre-11 server, GSS encryption having been introduced in v12. This was a problem only with v17 and newer versions; older versions discard the error message already in this case, assuming a failure caused by a lack of support for GSS encryption. Author: Jacob Champion Reviewed-by: Peter Eisentraut, Heikki Linnakangas, Michael Paquier Security: CVE-2024-10977 Backpatch-through: 12 Back-ported-by: reshke <reshke@double.cloud> ====== CBDB source commit is https://git.postgresql.org/cgit/postgresql.git/commit/?h=e6c9454764d880ee30735aa8c1e05d3674722ff9
Full and right outer joins were not supported in the initial implementation of Parallel Hash Join because of deadlock hazards (see discussion). Therefore FULL JOIN inhibited parallelism, as the other join strategies can't do that in parallel either. Add a new PHJ phase PHJ_BATCH_SCAN that scans for unmatched tuples on the inner side of one batch's hash table. For now, sidestep the deadlock problem by terminating parallelism there. The last process to arrive at that phase emits the unmatched tuples, while others detach and are free to go and work on other batches, if there are any, but otherwise they finish the join early. That unfairness is considered acceptable for now, because it's better than no parallelism at all. The build and probe phases are run in parallel, and the new scan-for-unmatched phase, while serial, is usually applied to the smaller of the two relations and is either limited by some multiple of work_mem, or it's too big and is partitioned into batches and then the situation is improved by batch-level parallelism. Author: Melanie Plageman <melanieplageman@gmail.com> Author: Thomas Munro <thomas.munro@gmail.com> Reviewed-by: Thomas Munro <thomas.munro@gmail.com> Discussion: https://postgr.es/m/CA%2BhUKG%2BA6ftXPz4oe92%2Bx8Er%2BxpGZqto70-Q_ERwRaSyA%3DafNg%40mail.gmail.com
Hash join tuples reuse the HOT status bit to indicate match status during hash join execution. Correct reuse requires clearing the bit in all tuples. Serial hash join and parallel multi-batch hash join do so upon inserting the tuple into the hashtable. Single batch parallel hash join and batch 0 of unexpected multi-batch hash joins forgot to do this. It hadn't come up before because hashtable tuple match bits are only used for right and full outer joins and parallel ROJ and FOJ were unsupported. 11c2d6f introduced support for parallel ROJ/FOJ but neglected to ensure the match bits were reset. Author: Melanie Plageman <melanieplageman@gmail.com> Reported-by: Richard Guo <guofenglinux@gmail.com> Discussion: https://postgr.es/m/flat/CAMbWs48Nde1Mv%3DBJv6_vXmRKHMuHZm2Q_g4F6Z3_pn%2B3EV6BGQ%40mail.gmail.com
As reported by buildfarm member conchuela, one of the regression tests added by 558c9d7 is having some ordering issues. This commit adds an ORDER BY clause to make the output more stable for the problematic query. Fix suggested by Tom Lane. The plan of the query updated still uses a parallel hash full join. Author: Melanie Plageman Discussion: https://postgr.es/m/623596.1684541098@sss.pgh.pa.us
PostgreSQL originally excluded FULL and RIGHT outer joins from parallel
hash join because of deadlock hazards in the per-batch barrier protocol.
PG 14 resolved this by introducing a dedicated PHJ_BATCH_SCAN phase: one
elected worker emits unmatched inner-side rows after probing, while the
others detach and move on.
In CBDB, distributed execution adds a second dimension: after a full
outer join the unmatched NULL-filled rows may come from any segment, so
the result carries a HashedOJ locus rather than a plain Hashed locus.
This change teaches the parallel planner about that:
- FULL JOIN and RIGHT JOIN are now valid parallel join types in the
distributed planner. Previously they were unconditionally rejected,
forcing serial execution across all segments.
- The HashedOJ locus produced by a parallel full join now carries
parallel_workers, so operators above the join (aggregates, further
joins) can remain parallel.
- A crash that could occur when a parallel LASJ_NOTIN (NOT IN) join
encountered NULL inner keys is fixed. The worker would exit early
but the batch barrier, which was never attached to, would be touched
on shutdown causing an assertion failure.
Example plans (3 segments, parallel_workers=2):
-- FULL JOIN: result locus is HashedOJ with Parallel Workers: 2
EXPLAIN(costs off, locus)
SELECT count(*) FROM t1 FULL JOIN t2 USING (id);
Finalize Aggregate
Locus: Entry
-> Gather Motion 6:1 (slice1; segments: 6)
-> Partial Aggregate
Locus: HashedOJ
Parallel Workers: 2
-> Parallel Hash Full Join
Locus: HashedOJ
Parallel Workers: 2
Hash Cond: (t1.id = t2.id)
-> Parallel Seq Scan on t1
Locus: HashedWorkers
-> Parallel Hash
-> Parallel Seq Scan on t2
Locus: HashedWorkers
-- RIGHT JOIN: when t1 is larger the planner hashes the smaller t2
-- and probes with t1; result locus HashedWorkers
EXPLAIN(costs off, locus)
SELECT count(*) FROM t1 RIGHT JOIN t2 USING (id);
Finalize Aggregate
Locus: Entry
-> Gather Motion 6:1 (slice1; segments: 6)
-> Partial Aggregate
Locus: HashedWorkers
Parallel Workers: 2
-> Parallel Hash Right Join
Locus: HashedWorkers
Parallel Workers: 2
Hash Cond: (t1.id = t2.id)
-> Parallel Seq Scan on t1
Locus: HashedWorkers
-> Parallel Hash
-> Parallel Seq Scan on t2
Locus: HashedWorkers
Performance (3 segments x 2 parallel workers, 6M rows each, 50% overlap):
FULL JOIN parallel: 4040 ms serial: 6347 ms speedup: 1.57x
RIGHT JOIN parallel: 3039 ms serial: 5568 ms speedup: 1.83x
Update Go installation in all Docker build containers to use the latest Go 1.24.13 release instead of 1.23.4, with corresponding SHA256 checksums for both amd64 and arm64 architectures. Affected files: - devops/deploy/docker/build/rocky8/Dockerfile - devops/deploy/docker/build/rocky9/Dockerfile - devops/deploy/docker/build/ubuntu22.04/Dockerfile - devops/deploy/docker/build/ubuntu24.04/Dockerfile Updated SHA256 checksums: - linux-amd64: 1fc94b57134d51669c72173ad5d49fd62afb0f1db9bf3f798fd98ee423f8d730 - linux-arm64: 74d97be1cc3a474129590c67ebf748a96e72d9f3a2b6fef3ed3275de591d49b3
Main changes: - Update default CODEBASE_VERSION from 2.0.0 to 2.1.0 in .env - Update documentation examples to use version 2.1.0 in README.md - Update help message example version in run.sh - Switch to Apache mirror system for downloading release tarball using closer.lua for better download reliability and speed - Replace wget with curl for source download in Dockerfile This change ensures the sandbox environment defaults to the latest Apache Cloudberry 2.1.0 release and uses the recommended Apache mirror download method.
Those fields are missed by orca which are needed by the pg_stat_statements to identify the query. Without initialization of those fields, pg_stat_statements won't track those queries.
This is the copy or Rocky9 containers. Changes comparing to Rocky9: - Get rid here of packages not available right now under Rocky10 repositories (for example rocky-release-hpc) - Move to the Java 21, it's default to Rocky Linux 10 Co-authored-by: Leonid Borchuk <xifos@qavm-f9b691f5.qemu>
libxml2 changed the required signature of error handler callbacks to make the passed xmlError struct "const". This is causing build failures on buildfarm member caiman, and no doubt will start showing up in the field quite soon. Add a version check to adjust the declaration of xml_errorHandler() according to LIBXML_VERSION. 2.12.x also produces deprecation warnings for contrib/xml2/xpath.c's assignment to xmlLoadExtDtdDefaultValue. I see no good reason for that to still be there, seeing that we disabled external DTDs (at a lower level) years ago for security reasons. Let's just remove it. Back-patch to all supported branches, since they might all get built with newer libxml2 once it gets a bit more popular. (The back branches produce another deprecation warning about xpath.c's use of xmlSubstituteEntitiesDefault(). We ought to consider whether to back-patch all or part of commit 65c5864d7 to silence that. It's less urgent though, since it won't break the buildfarm.) Discussion: https://postgr.es/m/1389505.1706382262@sss.pgh.pa.us
If we are building with openssl but USE_SSL_ENGINE didn't get set, initialize_SSL's variable "pkey" is declared but used nowhere. Apparently this combination hasn't been exercised in the buildfarm before now, because I've not seen this warning before, even though the code has been like this a long time. Move the declaration to silence the warning (and remove its useless initialization). Per buildfarm member sawshark. Back-patch to all supported branches.
We have two sections in a Makefile - one for CPP_OBJS and one for OBJS. CPP_OBJS use wildcards and src/protos includes bot in CPP_OBJS and in OBJS. So generated gcc string includes multiple items of proto *.o files. That leads to multiple definitions errors in linking time. Do not include proto files in CPP_OBJS macros and use it in OBJS macros.
Some functions are used in the tree and are currently marked as deprecated by upstream. This commit refreshes the code to use the recommended functions, leading to the following changes: - xmlSubstituteEntitiesDefault() is gone, and needs to be replaced with XML_PARSE_NOENT for the paths doing the parsing. - xmlParseMemory() -> xmlReadMemory(). These functions, as well as more functions setting global states, have been officially marked as deprecated by upstream in August 2022. Their replacements exist since the 2001-ish area, as far as I have checked, so that should be safe. Author: Dmitry Koval Discussion: https://postgr.es/m/18274-98d16bc03520665f@postgresql.org
ORCA's window frame translation always emits a BETWEEN frame (start + end bound), so include FRAMEOPTION_BETWEEN alongside FRAMEOPTION_NONDEFAULT to match the executor's expectations.
…ated host (#1702) * Fix null dereference on dedicated hot standby coordinator getCdbComponentInfo() populates hostPrimaryCountHash with primary hosts only. When IS_HOT_STANDBY_QD() is true, mirror and standby hosts are also looked up in the hash but return NULL on dedicated standby nodes that host no primary segments. Replace Assert(found) with a null-safe check to prevent SIGSEGV.
…nsumer (#1719) * orca: fallback to Postgres optimizer on cross-slice replicated CTE Consumer. Inspired by greengage 51fe92e: before Expr->DXL translation, walk the physical tree and track which slice each CTE Producer and Consumer lives on. If a Consumer is on a different slice than its Producer and the Producer's distribution is replicated, force a fallback to the Postgres optimizer. The replicated filter is essential: ordinary cross-slice CTE plans (non-replicated Producer with Gather/Redistribute Consumer) are a normal ORCA pattern and must not trigger fallback. 51fe92e doesn't trigger when a CTE over a replicated table is referenced from a scalar subquery, so the query hangs. This commit replaces the single-point check with a whole-tree walker that catches both cases. Tests: shared_scan adds a scalar-subquery reproducer guarded by statement_timeout. qp_orca_fallback adds two cases over a replicated CTE: a scalar-subquery form that triggers the walker (the hang case 51fe92e missed -- fallback to Postgres), and the original 51fe92e JOIN form where ORCA emits a safe plan with a One-Time Filter (gp_execution_segment() = N) and the walker correctly stays silent (guards against false positives). (cherry picked from commit open-gpdb/gpdb@3a9aebf)
Update Go version to 1.25.10 across all development Docker images for Rocky Linux 8/9/10 and Ubuntu 22.04/24.04. Changes: - Go version: go1.24.13 -> go1.25.10 - Updated SHA256 checksums for linux-amd64 and linux-arm64 archives See: apache/cloudberry-go-libs#19 (comment)
A scalar (plain) aggregate with no grouping columns always emits exactly one row regardless of input cardinality. Predicates above it (from a HAVING clause) filter that output row, so they cannot be moved onto the aggregate's input without changing semantics: SELECT count(*) FROM t HAVING false -- 0 rows SELECT count(*) FROM t WHERE false -- 1 row (count=0) CNormalizer::FPushable previously only blocked pushing volatile predicates below a GbAgg. Any other predicate -- including a constant false -- was considered pushable because its used-column set was trivially contained in the aggregate's output columns. The normalizer then routed the Select's predicate through the GbAgg and down into its logical child, dropping HAVING semantics for scalar aggregates.
Add comprehensive parallel table scan capability to GPORCA optimizer, enabling worker-level parallelism within segments for improved query performance on large table scans. Key components: - New CPhysicalParallelTableScan operator and CDistributionSpecWorkerRandom distribution specification for worker-level data distribution - CXformGet2ParallelTableScan transformation with parallel safety checks (excludes CTEs, dynamic scans, foreign tables, replicated tables, etc.) - Cost model integration with parallel_setup_cost and efficiency degradation scaling (logarithmic based on worker count) - DXL serialization/deserialization for CDXLPhysicalParallelTableScan - Plan translation to PostgreSQL SeqScan nodes with parallel_aware=true - Rewindability constraints (parallel scans are non-rewindable) - GUC integration: max_parallel_workers_per_gather controls worker count
In Cloudberry's MPP architecture, segment stats are delivered asynchronously to the coordinator. The seq_scan counter can be registered before seq_tup_read arrives from segments, causing wait_for_stats() to exit prematurely and the subsequent assertion to fail intermittently in the pax-ic-good-opt-off CI job. Add an explicit wait condition (updated6) for seq_tup_read reaching the expected value, and update the comment to reflect Cloudberry's segment-level async stats delivery rather than parallel workers.
Add libicu-devel package to Rocky Linux 8, 9, and 10 Dockerfiles to provide ICU (International Components for Unicode) library support required for PostgreSQL 16 kernel compilation. This dependency is already present in Ubuntu 22.04 and Ubuntu 24.04 development images, ensuring consistency across all supported build platforms for PostgreSQL 16 compilation requirements.
The command `gppkg --clean` fails with the following error: "'SyncPackages' object has no attribute 'ret'". This occurs because `operations` was being passed positionally during the OperationWorkerPool initialization, which incorrectly bound it to the `should_stop` argument instead of `items` in the base WorkerPool class. The solution is to pass `operations` as a keyword argument..
* Fix: FDW OPTIONS encoding accepts symbolic names (issue #1726) Both the FDW catalog reader (src/backend/access/external/external.c) and the gp_exttable_fdw option validator (gpcontrib/gp_exttable_fdw/option.c) parsed the "encoding" OPTIONS value with atoi(). atoi("UTF8") returns 0 (PG_SQL_ASCII) and PG_VALID_ENCODING(0) is true, so symbolic names like 'UTF8', 'utf-8', 'GBK' silently fell through validation and were stored as SQL_ASCII at read time. By contrast, the legacy CREATE EXTERNAL TABLE ... ENCODING ... path resolves names via pg_char_to_encoding() and persists a numeric form into OPTIONS — only the FDW OPTIONS entry point bypassed that translation. Add a small shared helper parse_fdw_encoding_option(const char *) in src/backend/access/external/external.c (declared in src/include/access/external.h): - first try pg_char_to_encoding(name) — same logic as the legacy path; - otherwise try a strict numeric form via strtol() with end-of-string and PG_VALID_ENCODING() checks (atoi is intentionally avoided, since atoi("UTF8")==0 is the bug being fixed); - otherwise ereport(ERROR). Both the validator and GetExtFromForeignTableOptions() call this helper. On-disk values in pg_foreign_table.ftoptions are stored verbatim as the user wrote them; correctness is established at read time. This avoids a ProcessUtility_hook approach, which is unworkable here because the extension's _PG_init runs lazily on the first dlopen, after the current statement's hook check has already passed. Affected scope: gp_exttable_fdw (used by gp_exttable_server). The standalone pxf_fdw is unaffected — its validator already routes encoding through ProcessCopyOptions, which is name-aware. Behavior change on upgrade: existing rows whose ftoptions literally contain encoding=<name> have, until now, been silently interpreted as SQL_ASCII. After this fix they are interpreted as the named encoding. This will be called out in the release notes; a detection query is provided in the PR description for operators who wish to pin specific tables to numeric form before upgrade. Tests added in gpcontrib/gp_exttable_fdw/{input,output}/gp_exttable_fdw.source cover encoding '6' / 'UTF8' / 'utf-8' / 'GBK' / 'bogus' and an ALTER FOREIGN TABLE ... OPTIONS (SET encoding 'UTF8') path. The pre-existing encoding '-1' error case has its expected error message updated to match the new helper's wording. * test: pad expected output headers to match psql separator widths The new tests added in the previous commit had column header lines without the trailing-space padding that psql's aligned output emits to match the separator. The pre-existing ext_special_uri header (' a | b') was also unintentionally stripped of its trailing space during the same edit. Pure whitespace fix. No behavior change. * test: drop trailing blank line in gp_exttable_fdw expected output pg_regress diffs the expected and actual .out files strictly, including the final newline count. The new encoding test block ended with a stray empty line (";\n\n") while psql produces ";\n", causing a 1-line diff at end-of-file. Pure whitespace fix. * test: reject mixed numeric+letters in FDW encoding option Add a regression case for `encoding '6abc'`. atoi("6abc") would have silently returned 6 (= UTF8), which is the class of bug that motivated moving the FDW encoding option parser off atoi() and onto a strict strtol() form in parse_fdw_encoding_option(). Without this test, the strictness of the numeric path was not directly exercised — only the "unknown name" path ('bogus') was. Pure test addition; no code change. Lands the third of the reviewer's suggestions on issue #1726 (the first two — strict strtol parsing and a single shared helper between the validator and the read path — were already in place in the original fix commit). * ci: retrigger to clear flaky alter_distribution_policy --------- Co-authored-by: chenqiang <chenqiang@hashdata.cn>
ClearAOCSFileSegInfo/ClearFileSegInfo (called from ao_vacuum_rel_recycle_dead_segments) updates pg_aoseg rows via simple_heap_update, which assigns the current CommandId to the new tuple. AppendOptimizedTruncateToEOF then opens a catalog snapshot via GetCatalogSnapshot, which also uses GetCurrentCommandId. Because both operations share the same CommandId, the just-zeroed rows are invisible to the snapshot (cid >= snapshot->curcid), while the old rows with their original non-zero EOF values remain visible. TruncateAOSegmentFile then sees a 0-byte physical file but a non-zero logical EOF and raises: "file size smaller than logical eof" Advancing the command counter before AppendOptimizedTruncateToEOF ensures the zeroed rows are visible to its catalog snapshot (their cid is now strictly less than the new curcid). Fixes: #1746
ReleaseSysCache(htup) was called before NameStr(staForm->stxname) was read, returning a pointer into the already-released tuple buffer. Copy the name with pstrdup() first, then release the cache entry.
This PR fixes the recovery flow when the internal WAL replication slot does not already exist on the source segment. Before this change, both gpsegrecovery and gpconfigurenewsegment would start pg_basebackup first and only retry with slot creation after the backup failed. In practice, that meant a full base backup could run for a long time and then fail at the end because the slot was missing. This change fixes that at the root: adds a shared helper to check whether the replication slot already exists creates the slot up front when needed, before pg_basebackup starts removes the fallback second pg_basebackup attempt from both recovery paths updates unit tests to cover the new behavior and the new failure mode --------- Co-authored-by: Leonid <63977577+leborchuk@users.noreply.github.com>
Contributor
Author
|
Looks like this is just too much. |
Member
Hi @reshke, AFAIS, these updates should already be in the main branch, right? |
Contributor
Author
my bad, this intended to be for REL_2_STABLE |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Commits than changes ABI are excluded: