Merge branch 'apache:main' into MDB_STABLE#12

Merged
ostinru merged 18 commits into MDB_STABLE from bump-82dd3ca4dd1bdd560c2b10a8f36c6a20
Apr 20, 2026

@ostinru ostinru commented Apr 20, 2026

No description provided.

tuhaihe and others added 18 commits April 1, 2026 10:12
Update release script to append `-incubating` suffix to the
extracted directory name, ensuring consistency with Apache
release naming conventions.

Example output:
- Tarball: apache-cloudberry-pxf-2.1.0-incubating-rc2-src.tar.gz
- Extracted: apache-cloudberry-pxf-2.1.0-incubating/
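
The naming rule in this commit can be sketched as a small mapping from tarball name to extracted directory name. This is only an illustration of the convention shown in the example output; `extracted_dir_name` and the regular expression are hypothetical and are not taken from the release script.

```python
import re

# Hypothetical pattern: "<base>-incubating-rc<N>-src.tar.gz", as in the example output.
_RC_TARBALL = re.compile(r"^(?P<base>.+-incubating)-rc\d+-src\.tar\.gz$")


def extracted_dir_name(tarball: str) -> str:
    """Return the directory name the tarball is expected to extract to."""
    match = _RC_TARBALL.match(tarball)
    if match is None:
        raise ValueError(f"unexpected tarball name: {tarball}")
    # Keep the "-incubating" suffix, drop the RC marker and the "-src.tar.gz" tail.
    return match.group("base") + "/"
```
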
Add NOTICE, DISCLAIMER, and LICENSE files to the stage target
in Makefile to ensure they are included in all package formats
(tar, rpm, and deb). This ensures proper license attribution and
compliance with Apache Software Foundation requirements for all
distribution methods.

Previously, these files were not copied during the packaging
process, which could lead to incomplete license information in
distributed packages.
* Run JDBC tests in Testcontainers
* Cover both Ubuntu 22.04 and Rocky Linux 9
The pxf script uses the 'ps' command in the isRunning() function
to check if the PXF process is running. However, minimal container
images like Rocky Linux 9 do not include the 'ps' command by default.

This causes the 'pxf stop' command to fail with:
  /usr/local/cloudberry-pxf/bin/pxf: line 162: ps: command not found

Add explicit package dependencies to ensure the 'ps' command is
available when PXF is installed:
- RPM packages: procps-ng
- DEB packages: procps

The 'ps' command is used in:
- server/pxf-service/src/scripts/pxf:162 (isRunning function)
**Cache CPU-heavy CI steps:**
* DEB build
* RPM build

This should speed up our CI pipelines and reduce GitHub Actions quota usage.

**Implementation details:**
* The GitHub cache has a 7-day TTL, and every build resets the expiration timeout. To force CI to rebuild Cloudberry from time to time, I explicitly include the current month in the cache key.
* Keep both `actions/cache` and `actions/upload-artifact`. If the cache is evicted during a build (e.g. after hitting the 10 GB limit), in-progress builds will keep working.
* The cache is shared from `main` to PRs, but not between PRs, so it is populated only by builds on `main`.
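
As a rough sketch of the month-scoped key idea: putting the current year and month into the key means the first build of a new month misses the cache and rebuilds. The function name, the key layout, and the `cloudberry-deb` prefix below are assumptions for illustration, not the actual workflow configuration.

```python
from datetime import date


def monthly_cache_key(prefix: str, os_name: str, today: date) -> str:
    """Build a cache key that changes once per calendar month (hypothetical layout)."""
    # "%Y-%m" yields e.g. "2026-04", so a new month produces a new key.
    return f"{prefix}-{os_name}-{today:%Y-%m}"
```

Two builds in the same month share a key, while a build in the next month gets a fresh one and therefore rebuilds.
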
### Add ClickHouse JDBC tests

Add new tests to cover the Cloudberry -> PXF -> JDBC path. These tests verify that the main Cloudberry and ClickHouse data types can be converted back and forth.

The tests cover ClickHouse 24.x and 26.x. Open-source ClickHouse releases have a short support lifetime.

I have tested the following clickhouse-jdbc drivers:
* 0.6.x (`jdbc-v1`) - works well with old ClickHouse versions. ClickHouse 25.10 broke `jdbc-v1` (ClickHouse/clickhouse-java#2636), and all queries raised a "Magic is not correct" error.
* 0.9.4 - fixes the "Magic" issue, but has a problem with String <-> bytea conversion: `ERROR: PXF server error : Method: getBytes("bin") encountered an exception.`
* 0.9.7+ - works well

### Add jdbc-pxf-drivers project

Add a new `jdbc-pxf-drivers` project to the server. It is excluded from the default DEB package build, so installing the JDBC drivers requires explicit action.

### Side quest

Cloudberry FDW serializes rows as Greenplum CSV => PXF `TextRecordReader` parses the CSV stream via univocity CSV parser => for `BYTEA` columns it calls `pgUtilities.parseByteaLiteral()` => which returns a `ByteBuffer`.
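
To make the last step of that chain concrete, here is a minimal sketch of decoding a PostgreSQL `bytea` text literal in its two standard output formats (hex and escape). It is not PXF's `pgUtilities.parseByteaLiteral()`, which is Java and returns a `ByteBuffer`; the function name `parse_bytea_literal` is hypothetical and the escape branch assumes ASCII input.

```python
def parse_bytea_literal(text: str) -> bytes:
    """Decode a PostgreSQL bytea text literal (hex or escape format)."""
    # Hex format: "\x" followed by two hex digits per byte.
    if text.startswith("\\x"):
        return bytes.fromhex(text[2:])

    # Escape format: "\\" is a backslash, "\ooo" is an octal byte,
    # everything else is a literal (assumed ASCII) character.
    out = bytearray()
    i = 0
    while i < len(text):
        ch = text[i]
        if ch != "\\":
            out.append(ord(ch))
            i += 1
        elif text[i + 1 : i + 2] == "\\":
            out.append(0x5C)
            i += 2
        else:
            out.append(int(text[i + 1 : i + 4], 8))
            i += 4
    return bytes(out)
```

Malformed input, such as a trailing backslash or an odd number of hex digits, raises `ValueError` in this sketch.
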
Speed up our CI pipeline:
* cache testcontainer images
* do not rebuild singlecluster image on every test run
* allow Gradle parallel builds (they were disabled for Gradle 4.x in apache@124115d)
- CI workflow: GO_VERSION 1.21 -> 1.24
- Dockerfile: GO_VERSION 1.21.13 -> 1.24.0
- go.mod: go 1.21.3 -> go 1.24
- Documentation: Update minimum Go version requirement
* Refactor ci/singlecluster/Dockerfile to use GO_VERSION variable

See: apache#96
Update cloudberry-pxf-release.sh to generate artifact filenames
without RC suffix (e.g., apache-cloudberry-pxf-2.1.0-incubating-src.tar.gz)
while keeping RC identifier in directory names only.

Benefits:
- Simplifies promotion process after vote passes
- No need to rename files for checksums/signatures
- Consistent with best practices from other ASF Incubator projects
- Files are ready for final release from RC stage, can use `svn mv` to
  promote the RC artifacts to the final version.

Example structure:
  dev/incubator/cloudberry/2.1.0-incubating-rc1/
    apache-cloudberry-pxf-2.1.0-incubating-src.tar.gz
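
The layout above can be sketched as follows: the RC identifier appears only in the directory name, so the artifact filename needs no renaming at promotion time. The helper `rc_staging_layout` is hypothetical and only mirrors the example structure; it is not code from `cloudberry-pxf-release.sh`.

```python
from typing import Tuple


def rc_staging_layout(version: str, rc: int) -> Tuple[str, str]:
    """Return (staging directory, artifact filename) for a release candidate."""
    # The RC number is part of the directory name only.
    directory = f"dev/incubator/cloudberry/{version}-rc{rc}/"
    # The artifact already carries its final, RC-free name.
    artifact = f"apache-cloudberry-pxf-{version}-src.tar.gz"
    return directory, artifact
```
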
Use pinned commit hash (4d9f0ba0025fe599b4ebab900eb7f3a1d93ef4c2) instead
of version tag @v5 to comply with Apache GitHub Actions policy.

Fixes apache#101
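
A check along these lines could flag workflow references that are not pinned to a full commit hash. This is a sketch only: `is_pinned` is a hypothetical helper, and `some-org/some-action` is a placeholder action name, not the action changed in this commit.

```python
import re

# A full Git commit hash is 40 lowercase hexadecimal characters.
_FULL_SHA = re.compile(r"[0-9a-f]{40}")


def is_pinned(uses: str) -> bool:
    """Return True if a `uses:` value references a full commit hash."""
    _, sep, ref = uses.partition("@")
    return bool(sep) and _FULL_SHA.fullmatch(ref) is not None
```

For example, `some-org/some-action@v5` is rejected, while the same action followed by `@4d9f0ba0025fe599b4ebab900eb7f3a1d93ef4c2` is accepted.
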
Add new tests to cover the Cloudberry -> PXF -> JDBC path. These tests verify that the main Cloudberry and Oracle data types can be converted back and forth.

The tests cover Oracle 23 (the only version supported by Testcontainers).
… in /server (apache#105)

* Bump org.apache.tomcat.embed:tomcat-embed-core in /server

Bumps org.apache.tomcat.embed:tomcat-embed-core from 9.0.72 to 9.0.117.

---
updated-dependencies:
- dependency-name: org.apache.tomcat.embed:tomcat-embed-core
  dependency-version: 9.0.117
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>

* Update server/build.gradle

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Dianjin Wang <wangdianjin@gmail.com>
Start HBase only for test groups that require it (hbase, proxy) instead of launching it as part of the default Hadoop stack. This frees RAM for all other test groups.

Additionally, run HBase in standalone mode:
 * No separate RegionServer process
 * No separate ZooKeeper process
 * No dependency on HDFS
* Docs: update JDBC-related docs

* review
### Add Microsoft SQL JDBC tests

Add new tests to cover the Cloudberry -> PXF -> JDBC path. These tests verify that the main Cloudberry and MS SQL data types can be converted back and forth.

The tests cover MS SQL 2019 and 2022.
When a numeric column is compared with an integer constant (e.g.,
  biz_dt = 20260331), PostgreSQL wraps the constant in an implicit
  int4 -> numeric type cast FuncExpr. The filter extraction code in
  both the FDW path (fdw/pxf_filter.c) and the external table path
  (external-table/src/pxffilters.c) only recognized plain Const
  nodes, so these predicates were silently dropped and the query
  fell back to a full scan on the remote source.

  Before matching the Var+Const pattern, run eval_const_expressions()
  on any operand that is neither a Var nor a Const. This folds the
  implicit cast into a plain Const, after which the existing filter
  serialization path can push the predicate down unchanged.

  Fix is applied symmetrically in OpExprToPxfFilter (FDW) and
  opexpr_to_pxffilter (external table). Neither function mutates
  the original plan tree; only the local leftop/rightop pointers
  are redirected at the simplified Const.

  Test coverage:
  * Extend FilterPushDownTest and FDW_FilterPushDownTest regression
    suites with c3 = 5, c3 < 5, c3 > 1, c3 <= 2, c3 >= 5, c3 <> 5,
    and c3 BETWEEN 1 AND 5, covering the full set of supported
    scalar operators when the constant type differs from the column
    type.
  * Synchronize the int-const block across both test segments of
    FilterPushDownTest.sql and its expected output.
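
The idea behind the fix can be illustrated with a toy model in Python. The real change is in C and uses PostgreSQL's `eval_const_expressions()`; everything below (`Var`, `Const`, `CastExpr`, `fold_constants`, `extract_filter`) is a hypothetical simplification that models only an int-to-numeric cast and only the Var-on-the-left pattern.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Optional, Tuple, Union


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Const:
    value: object


@dataclass(frozen=True)
class CastExpr:
    arg: "Node"
    target: str  # only "numeric" is modeled in this sketch


Node = Union[Var, Const, CastExpr]


def fold_constants(node: Node) -> Node:
    """Toy stand-in for constant folding: collapse a cast over a Const."""
    if isinstance(node, CastExpr):
        inner = fold_constants(node.arg)
        if isinstance(inner, Const) and node.target == "numeric":
            return Const(Decimal(inner.value))
    return node


def extract_filter(op: str, left: Node, right: Node) -> Optional[Tuple[str, str, object]]:
    """Return (column, operator, constant) if the predicate can be pushed down."""
    # Simplify only operands that are neither a Var nor a Const; the input
    # nodes are immutable, so the original tree is never modified.
    left, right = (
        n if isinstance(n, (Var, Const)) else fold_constants(n) for n in (left, right)
    )
    if isinstance(left, Var) and isinstance(right, Const):
        return (left.name, op, right.value)
    return None  # not a recognized pattern: no pushdown
```

Without the folding step, the `CastExpr` operand would fail the `Const` check and the predicate would be dropped, which corresponds to the silent full-scan fallback described above.
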
This PR addresses the following issues:
* Docker image size reduced from 4.52 GB to 3.34 GB (GitHub provides 10 GB of cache)
* Runs MS SQL tests with the FDW plugin
* Always specify username/password for PostgreSQL JDBC tests (required for hardened PXF versions)
* Minor cleanups
@ostinru ostinru marked this pull request as ready for review April 20, 2026 11:33
@ostinru ostinru merged commit e3930e9 into MDB_STABLE Apr 20, 2026
93 of 109 checks passed
@ostinru ostinru deleted the bump-82dd3ca4dd1bdd560c2b10a8f36c6a20 branch April 20, 2026 11:34
4 participants