Skip to content

Setup C++ Infra in GiGL#558

Draft
mkolodner-sc wants to merge 60 commits intomainfrom
mkolodner-sc/cpp-infrastructure
Draft

Setup C++ Infra in GiGL#558
mkolodner-sc wants to merge 60 commits intomainfrom
mkolodner-sc/cpp-infrastructure

Conversation

@mkolodner-sc
Copy link
Copy Markdown
Collaborator

@mkolodner-sc mkolodner-sc commented Mar 25, 2026

Scope of work done

C++ Infrastructure for GiGL

This PR adds the foundational C++ infrastructure to GiGL — everything needed to write, build, lint, and test C++ code alongside the existing Python codebase.

What's included

  • Build system — Switches the build backend from setuptools to scikit-build-core + CMake. CMakeLists.txt at the repo root auto-discovers and builds all python_*.cpp (and python_*.cu) files under gigl/csrc/ as pybind11 extension modules. The build is triggered automatically by make build_cpp_extensions (uv pip install -e . --no-build-isolation) and is wired into make unit_test_py, make integration_test, make install_dev_deps, and pipeline compilation targets so it runs without a separate manual step.

  • Release pipelinerelease.yml is updated to build and publish two wheels — one against CPU torch headers and one against CUDA 12.8 torch headers — using a matrix build. Both wheels coexist in the registry index, differentiated by a build tag (1cpu / 1cu128).

  • Formatting and linting.clang-format and .clang-tidy configs establish C++ style conventions. make format_cpp formats in-place. make check_lint_cpp runs fast static analysis using clangd --check (roughly 10× faster than clang-tidy due to preamble caching). make fix_lint_cpp applies auto-fixable violations via clang-tidy --fix and is intentionally separate from make format since it rewrites logic, not just style. check_lint_cpp is included in lint_test; it is not part of type_check (Python/mypy only).

  • Dependency installationrequirements/install_cpp_deps.sh installs clang-format-15, clang-tidy-15, clangd-15, clang++-15, libstdc++-12-dev, and cmake on Linux. clang++-15 and libstdc++-12-dev are required so that generate_compile_commands.py can rewrite compile_commands.json to use clang natively and clangd can resolve standard headers without a --query-driver workaround.

  • Compile commands generationscripts/generate_compile_commands.py generates .cache/compile_commands.json for clangd by running CMake in a dedicated lint build directory (.cache/cmake_build_lint, separate from scikit-build-core's .cache/cmake_build to avoid generator conflicts). It post-processes the output to replace the compiler with clang++-15. Called automatically by run_cpp_lint.py before each lint run, and directly via make generate_compile_commands for IDE/clangd setup.

  • Linting scriptscripts/run_cpp_lint.py wraps clangd --check, runs checks in parallel, and produces clean output: a green pass message or a per-file list of failures. All clangd noise is filtered.

  • Documentationdocs/cpp_style_guide.md documents the formatting and linting rules, how to run them, and rationale for disabled checks.

  • Unit test scaffoldingtests/unit/cpp/ with a standalone CMakeLists and a smoke test to verify the infrastructure works end to end. Run via make unit_test_cpp; included in make unit_test.

Where is the documentation for this feature?: N/A

Did you add automated tests or write a test plan?

Updated Changelog.md? NO

Ready for code review?: NO

@mkolodner-sc mkolodner-sc changed the title [WIP] Setup C++ Infra Setup C++ Infra in GiGL Apr 1, 2026
Comment thread gigl/scripts/post_install.py
@github-actions
Copy link
Copy Markdown
Contributor

GiGL Automation

@ 20:29:04UTC : Starting to build base images for CUDA and CPU.
This may take a while, please be patient.
The images will be pushed to the GCP Artifact Registry.
Once done, the workflow will update the gigl/dep_vars.env file with the new image names.

@github-actions
Copy link
Copy Markdown
Contributor

GiGL Automation

@ 20:42:54UTC : Starting to build base images for CUDA and CPU.
This may take a while, please be patient.
The images will be pushed to the GCP Artifact Registry.
Once done, the workflow will update the gigl/dep_vars.env file with the new image names.

@github-actions
Copy link
Copy Markdown
Contributor

GiGL Automation

@ 20:48:49UTC : Starting to build base images for CUDA and CPU.
This may take a while, please be patient.
The images will be pushed to the GCP Artifact Registry.
Once done, the workflow will update the gigl/dep_vars.env file with the new image names.

@github-actions
Copy link
Copy Markdown
Contributor

GiGL Automation

@ 20:58:39UTC : Built and pushed new images:

  • CUDA base image: us-central1-docker.pkg.dev/external-snap-ci-github-gigl/public-gigl/gigl-cuda-base:30e4e309ebff6ac167c09efa0ae70cc80b2c22c9.96.1
  • CPU base image: us-central1-docker.pkg.dev/external-snap-ci-github-gigl/public-gigl/gigl-cpu-base:30e4e309ebff6ac167c09efa0ae70cc80b2c22c9.96.1
  • Dataflow base image: us-central1-docker.pkg.dev/external-snap-ci-github-gigl/public-gigl/gigl-dataflow-base:30e4e309ebff6ac167c09efa0ae70cc80b2c22c9.96.1
  • Builder image: us-central1-docker.pkg.dev/external-snap-ci-github-gigl/gigl-base-images/gigl-builder:30e4e309ebff6ac167c09efa0ae70cc80b2c22c9.96.1

Updated gigl/dep_vars.env with new image names.
Updated .github/cloud_builder/run_command_on_pr_cloud_build.yaml with new builder image.

@mkolodner-sc
Copy link
Copy Markdown
Collaborator Author

/unit_test

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Apr 20, 2026

GiGL Automation

@ 21:04:18UTC : 🔄 Scala Unit Test started.

@ 21:14:07UTC : ✅ Workflow completed successfully.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Apr 20, 2026

GiGL Automation

@ 21:04:20UTC : 🔄 Python Unit Test started.

@ 22:01:34UTC : ✅ Workflow completed successfully.

@mkolodner-sc
Copy link
Copy Markdown
Collaborator Author

/unit_test

@mkolodner-sc
Copy link
Copy Markdown
Collaborator Author

/e2e_test

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Apr 20, 2026

GiGL Automation

@ 21:11:42UTC : 🔄 Python Unit Test started.

@ 22:19:40UTC : ✅ Workflow completed successfully.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Apr 20, 2026

GiGL Automation

@ 21:11:44UTC : 🔄 Scala Unit Test started.

@ 21:19:59UTC : ✅ Workflow completed successfully.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Apr 20, 2026

GiGL Automation

@ 21:11:48UTC : 🔄 E2E Test started.

@ 21:21:24UTC : ❌ Workflow failed.
Please check the logs for more details.

@mkolodner-sc
Copy link
Copy Markdown
Collaborator Author

/e2e_test

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Apr 20, 2026

GiGL Automation

@ 21:43:04UTC : 🔄 E2E Test started.

@ 23:16:09UTC : ✅ Workflow completed successfully.

Comment thread .github/cloud_builder/run_command_on_active_checkout.yaml Outdated
Copy link
Copy Markdown
Collaborator

@kmontemayor2-sc kmontemayor2-sc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for all the excellent work Matt!

I think the only thing left blocking on my side is adding the cpp test to our merge suite https://github.com/Snapchat/GiGL/pull/558/changes#r3118534359


COPY CMakeLists.txt CMakeLists.txt
COPY MANIFEST.in MANIFEST.in
COPY README.md README.md
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

QQ any reason we copy this over? If we need it, can we add a comment saying why?

Comment thread docs/cpp_style_guide.md
| Priority | Pattern | Group |
| -------- | ------------------------------------ | ------------------------------------- |
| 1 | `.*` | Local project headers (first) |
| 2 | `^"(llvm\|llvm-c\|clang\|clang-c)/"` | LLVM/Clang internal headers |
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmmm, this seems very specific to projects relying on LLVM / clang internals, I don't think we're really going to use those internals? I know we'll use the tools, but do we forsee needing their imports?

Comment thread docs/cpp_style_guide.md
Comment on lines +137 to +140
| Type template parameters | `CamelCase` | `NodeType` |
| Functions, methods | `camelBack` | `sampleNeighbors` |
| Variables, parameters, members | `camelBack` | `numNodes` |
| Private/protected members | `camelBack` with `_` prefix | `_nodeFeatures` |
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm, I feel like being able to more easily differentiate between MyClass and myFunction my be useful (and is more inline with our Python code base). But I wont push back here anymore,

Comment thread docs/cpp_style_guide.md

| Identifier kind | Convention | Example |
| --------------------------------------------------------- | --------------------------- | ----------------- |
| Classes, enums, unions | `CamelCase` | `DistDataset` |
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You know I was always under the impression first cap is PascalCase, and camelCase doesn't have an initialCapital but then wiki says it doesn't really matter. https://en.wikipedia.org/wiki/Camel_case#Description

Though since we use both (for different things) maybe we should differentiate?

Comment thread docs/cpp_style_guide.md
Comment on lines +19 to +153
## Formatting (`.clang-format`)

The style is based on LLVM with the following notable deviations:

### Line length

```
ColumnLimit: 120
```

120 columns rather than the LLVM default of 80. ML and graph code tends to have longer identifiers and nested template
types; 120 gives enough room without forcing awkward wraps.

### Indentation and braces

```
IndentWidth: 4
BreakBeforeBraces: Attach # K&R / "same-line" style
UseTab: Never
IndentCaseLabels: true # case labels indented inside switch
NamespaceIndentation: None # namespace bodies not indented
```

### Pointer and reference alignment

```
PointerAlignment: Left
```

Pointers bind to the type, not the name: `int* x`, not `int *x`.

### Parameter and argument wrapping

```
BinPackArguments: false
BinPackParameters: false
```

When a function call or declaration doesn't fit on one line, every argument/parameter gets its own line. Mixed
"bin-packing" (some on one line, some wrapped) is not allowed.

### Templates

```
AlwaysBreakTemplateDeclarations: true
```

`template <...>` always appears on its own line, keeping the declaration signature visually separate from the template
header.

### Include ordering

Includes are sorted and split into three priority groups:

| Priority | Pattern | Group |
| -------- | ------------------------------------ | ------------------------------------- |
| 1 | `.*` | Local project headers (first) |
| 2 | `^"(llvm\|llvm-c\|clang\|clang-c)/"` | LLVM/Clang internal headers |
| 3 | `^(<\|"(gtest\|isl\|json)/)` | System and third-party headers (last) |

### Raw string formatting

Raw string literals with the `pb` delimiter (e.g. `R"pb(...)pb"`) are formatted as TextProto using Google style,
matching the protobuf idiom used throughout the codebase.

______________________________________________________________________

## Static Analysis (`.clang-tidy`)

### Check philosophy

A broad set of check families is enabled to catch bugs, enforce modern C++ idioms, and maintain readability. All
warnings are errors — there is no "warning-only" category.

Enabled families:

| Family | What it covers |
| --------------------------- | -------------------------------------------------------------------------------------------------------------------------------------- |
| `boost-use-to-string` | Prefer `std::to_string` over `boost::lexical_cast` for numeric conversions |
| `bugprone-*` | Common programming mistakes: dangling handles, suspicious string construction, assert side effects, etc. |
| `cert-*` | CERT secure coding rules for error handling (`err34-c`), floating-point loops (`flp30-c`), and RNG seeding (`msc32-c`, `msc50/51-cpp`) |
| `clang-diagnostic-*` | Compiler diagnostic warnings surfaced as lint checks (e.g. `-Wall`, `-Wextra` violations) |
| `cppcoreguidelines-*` | C++ Core Guidelines: no raw `malloc`, no union member access, no object slicing, safe downcasts |
| `google-*` | Google C++ style: explicit constructors, no global names in headers, safe `memset` usage |
| `hicpp-exception-baseclass` | All thrown exceptions must derive from `std::exception` |
| `misc-*` | Miscellaneous: header-only definitions, suspicious enum usage, throw-by-value/catch-by-reference, etc. |
| `modernize-*` | Modernize to C++11/14/17: `nullptr`, range-based for, `make_unique`, `using` aliases, etc. |
| `performance-*` | Unnecessary copies, inefficient string ops, missed `emplace`, type promotions in math functions |
| `readability-*` | Naming conventions, braces around statements, boolean simplification, function size limits |

### Disabled checks

Some checks in the above families are disabled where they produce excessive noise or conflict with common patterns in
this codebase:

| Disabled check | Reason |
| ----------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| `bugprone-easily-swappable-parameters` | Tensor and sampler APIs legitimately have many adjacent same-typed parameters |
| `bugprone-implicit-widening-of-multiplication-result` | Crashes clang-tidy 15 on a construct in `ATen/core/dynamic_type.h` (upstream LLVM bug). Re-enable when upgrading past clang-tidy 15. |
| `bugprone-narrowing-conversions` | Too noisy in ML code mixing `int`/`int64_t`/`size_t` for tensor dimensions |
| `misc-confusable-identifiers` | Performs an O(n²) comparison of all identifiers in scope to detect Unicode homoglyphs. PyTorch headers introduce thousands of identifiers, making this check account for ~70% of total lint time. All identifiers in this codebase are standard ASCII. |
| `misc-const-correctness` | Produces false positives with pybind11 types whose mutation happens through `operator[]` (which is non-const). The check incorrectly suggests `const` on variables that are mutated. |
| `misc-no-recursion` | Recursive graph algorithms are intentional |
| `modernize-avoid-c-arrays` | C arrays are needed for pybind11 and C-interop code |
| `modernize-use-trailing-return-type` | Trailing return types (`auto f() -> T`) are only useful when the return type depends on template params. Requiring them everywhere is non-standard and reduces readability. |
| `readability-avoid-const-params-in-decls` | Incorrectly fires on `const T&` parameters in multi-line declarations (clang-tidy 15 bug). The check is meant for top-level const on by-value params, which is a separate, valid concern. |
| `readability-container-contains` | `.contains()` requires C++20; the codebase builds with C++17 |
| `readability-identifier-length` | Short loop variables (`i`, `j`, `k`) are idiomatic |
| `readability-function-cognitive-complexity` | Algorithmic code often requires nesting that is inherent to the problem structure. Enforcing an arbitrary complexity ceiling discourages clarity and encourages artificial decomposition. |
| `readability-magic-numbers` | Literal constants are common in ML code (e.g. feature dimensions) |

### Naming conventions

Enforced via `readability-identifier-naming`:

| Identifier kind | Convention | Example |
| --------------------------------------------------------- | --------------------------- | ----------------- |
| Classes, enums, unions | `CamelCase` | `DistDataset` |
| Type template parameters | `CamelCase` | `NodeType` |
| Functions, methods | `camelBack` | `sampleNeighbors` |
| Variables, parameters, members | `camelBack` | `numNodes` |
| Private/protected members | `camelBack` with `_` prefix | `_nodeFeatures` |
| Constants (`constexpr`, `const` globals, class constants) | `CamelCase` with `k` prefix | `kMaxBatchSize` |

### Key option tuning

| Option | Value | Effect |
| ---------------------------------------------------------- | ----------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `WarningsAsErrors` | `*` | Every check failure is a hard error in CI |
| `HeaderFilterRegex` | `.*/gigl/csrc/.*` | Scopes checks to our own headers. Using `.*` causes clang-tidy to report warnings from every PyTorch/pybind11 header it parses, flooding output with thousands of third-party issues. |
| `FormatStyle` | `none` | clang-tidy does not auto-reformat; use clang-format separately |
| `bugprone-string-constructor.LargeLengthThreshold` | `8388608` (8 MB) | Strings larger than 8 MB from a length argument are flagged |
| `modernize-loop-convert.NamingStyle` | `CamelCase` | Auto-generated loop variable names use CamelCase |
| `readability-function-size.LineThreshold` | `1000` | Functions over 1000 lines are flagged |
| `readability-braces-around-statements.ShortStatementLines` | `0` | Braces required for all control-flow bodies, even single-line |
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No strong preference, this is fine :)

Comment thread Makefile

# Not part of `make format`: clang-tidy --fix rewrites logic (renames identifiers,
# changes expressions, adds/removes keywords), not just style. Run manually and
# review the diff before committing.
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this can't fix everything right? Should we note that this is not guaranteed to fix everything?

Comment thread Makefile
generate_compile_commands:
uv run python -m scripts.generate_compile_commands

check_lint_cpp:
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I more ment like, "Typing make format_check does not also check the cpp linter" but that's a small nit.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

jobs:
build:
name: Build and release pip whl
name: Build and release pip whl (${{ matrix.torch-variant }})
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to update our install guidance for the correct name now?


_REPO_ROOT: Path = Path(__file__).resolve().parent.parent
_CMAKE_BUILD_DIR: Path = _REPO_ROOT / ".cache" / "cmake_build_lint"
_COMPILE_COMMANDS: Path = _REPO_ROOT / ".cache" / "compile_commands.json"
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this be PUBLIC if we import it later?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants