First attempt to the ACS onboarding #100

Open
p-rog wants to merge 55 commits into validatedpatterns:main from p-rog:acs-onboarding
Conversation

p-rog (Collaborator) commented Feb 23, 2026

Red Hat Advanced Cluster Security (RHACS/StackRox) consists of two main deployment types:

Central Services (Hub Cluster)

Central:

  • Management console and API server
  • Policy engine and enforcement
  • Centralized data aggregation
  • Vulnerability database management

Scanner:

  • Vulnerability scanning for container images
  • Pulls image layers from registries
  • Identifies installed packages
  • Compares against CVE databases

Secured Cluster Services (Per Cluster)

Sensor:

  • Monitors cluster activity
  • Listens to Kubernetes API events
  • Collects data from Collectors
  • Reports cluster state to Central

Admission Controller:

  • Policy enforcement at deployment time
  • Validates resources before admission
  • Prevents policy violations
  • Configurable bypass options

Collector:

  • Per-node DaemonSet deployment
  • Runtime monitoring and network activity
  • Container activity analysis
  • Sends data to Sensor

p-rog marked this pull request as draft February 23, 2026 16:27

p-rog (Collaborator, Author) commented Feb 23, 2026

I have to fix the ACS init secret issue:

  1. Init bundle can ONLY be generated AFTER ACS Central is deployed and running
  2. The Validated Patterns framework processes ALL secrets BEFORE deploying applications
  3. With onMissingValue: error, installation fails if the secret doesn't exist in Vault

Przemyslaw Roguski and others added 2 commits February 23, 2026 20:17
- Fix indentation in values-hub.yaml (stackrox namespace)
- Comment out acs-init-bundle secret (not needed for same-cluster deployment)
- RHACS operator auto-generates auth for co-located Central + SecuredCluster

Fixes vault namespace deployment issue.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

p-rog (Collaborator, Author) commented Feb 23, 2026

The secret issue is fixed.
I'm now working on the Vault service creation issue.

Przemyslaw Roguski and others added 24 commits February 24, 2026 13:34
This commit resolves two critical issues preventing ACS Central and
SecuredCluster Custom Resources from being deployed:

1. Uncommented extraValueFiles for acs-central and acs-secured-cluster
   applications in values-hub.yaml. This enables helm charts to receive
   global configuration values (localClusterDomain, secretStore, etc.)
   required for proper template rendering.

2. Added ExternalSecret template for central-htpasswd admin password.
   This syncs the admin password from Vault (hub/infra/acs) to the
   Kubernetes secret expected by the Central CR.

With these fixes, ArgoCD will successfully render and deploy:
- Central CR (Wave 10) with PostgreSQL DB and Scanner components
- Init bundle job (Wave 12) to generate TLS secrets
- OAuth integration job (Wave 13) for OpenShift authentication
- SecuredCluster CR (Wave 15) with Sensor, Collector, and Admission Controller

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
… the central-cr.yaml and secured-cluster-cr.yaml, removing the perNode duplication, adding explicit scannerV4 configuration to central-cr.yaml
The cluster only has ACM release-2.15 channel available.
Changed from release-2.14 to release-2.15 to fix subscription failure.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Two critical fixes to resolve ArgoCD manifest generation errors:

1. Fixed acs-central chart: Removed Helm template syntax from comment
   in create-cluster-init-bundle.yaml line 4. Helm parses template
   syntax even in comments, causing 'invalid value; expected string'
   error at column 98.

2. Fixed acs-secured-cluster chart: Removed quotes from clusterName
   override value in values-hub.yaml. The quoted template syntax
   caused 'key } has no value' error because ArgoCD was passing
   literal curly braces to helm --set command.

These fixes allow both ACS applications to render manifests correctly.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Fixed nil pointer error in ExternalSecret template by adding default
secretStore configuration to values.yaml.

Error: 'nil pointer evaluating interface {}.name'
Root cause: global.secretStore.name and global.secretStore.kind were
undefined, causing ExternalSecret template to fail.

Solution: Added default values matching validated patterns convention:
- secretStore.name: vault-backend
- secretStore.kind: ClusterSecretStore

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Changed Vault secret path from 'hub/infra/acs' to 'hub/infra/acs/acs-central'
to match the actual location where validated patterns framework stores
the secret.

Root cause: Framework creates secrets at {vaultPrefixes}/{name} which
results in hub/infra/acs/acs-central, not hub/infra/acs.

This fixes the error: 'Secret does not exist at hub/infra/acs'

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
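
The path rule described in the commit above ({vaultPrefixes}/{name}) can be illustrated with a minimal Python sketch. The function name `vault_secret_path` is hypothetical and is not part of the Validated Patterns framework; it only shows how the prefix and the secret name combine into the path the ExternalSecret must reference.

```python
def vault_secret_path(vault_prefix: str, secret_name: str) -> str:
    """Join a Vault prefix and a secret name as {vaultPrefixes}/{name}.

    Hypothetical helper for illustration only; the framework's real
    implementation may differ.
    """
    # Strip stray slashes so "hub/infra/acs/" and "/acs-central" still join cleanly.
    return f"{vault_prefix.strip('/')}/{secret_name.strip('/')}"


if __name__ == "__main__":
    # The prefix alone ("hub/infra/acs") is not where the secret lives.
    print(vault_secret_path("hub/infra/acs", "acs-central"))
```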
Disabled both scanner V3 and V4 to reduce resource requirements.
This allows Central to deploy on resource-constrained clusters.
Scanners can be re-enabled later when more resources are available.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
…mplate labels

- Changed adminPasswordSecretRef to adminPasswordSecret (correct API field)
- Added labels to create-cluster-init-bundle Job template (required by Kubernetes)
- Fixes authentication error preventing init-bundle generation
- Add create-htpasswd-field job to automatically generate bcrypt htpasswd entry
  from the plain password in central-htpasswd secret (sync-wave 6)
- Modify create-cluster-init-bundle job to:
  * Check for existing init bundles with the same cluster name
  * Delete existing bundle before creating new one
  * Validate API response contains kubectlBundle before attempting to apply
- Fixes authentication issues and init bundle conflicts
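
The init bundle handling listed above (find bundles with the same cluster name, then validate the creation response before applying it) might look roughly like the following Python sketch. It is not the job's actual script. The response shapes are assumptions: a listing with an `items` array of objects carrying `id` and `name`, and a creation response whose `kubectlBundle` field is base64-encoded. The function names are hypothetical.

```python
import base64
from typing import Any


def bundle_ids_named(listing: dict[str, Any], cluster_name: str) -> list[str]:
    """Return ids of existing init bundles whose name equals cluster_name.

    Assumes the listing looks like {"items": [{"id": ..., "name": ...}]}.
    """
    return [
        item["id"]
        for item in listing.get("items", [])
        if item.get("name") == cluster_name
    ]


def decode_kubectl_bundle(response: dict[str, Any]) -> str:
    """Return the decoded manifest, or raise if kubectlBundle is missing.

    Assumes kubectlBundle is a base64-encoded YAML document.
    """
    encoded = response.get("kubectlBundle")
    if not encoded:
        # Fail loudly instead of piping an error payload into `oc apply`.
        raise ValueError(f"no kubectlBundle in response: {response!r}")
    return base64.b64decode(encoded).decode("utf-8")
```

Checking for the field first means an authentication error or a name conflict surfaces as a clear failure instead of an attempt to apply an empty manifest.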
- Replace heredoc with printf for Python script (heredoc inside YAML literal block causes parse errors)
- Fix quote escaping in Python one-liners (use single quotes for outer, double for inner)
- Ensures YAML parses correctly in ArgoCD
- Remove output redirection to /dev/null to make errors visible
- Add progress messages to help debug installation issues
- Change image registry from registry.redhat.io to registry.access.redhat.com
- Remove Sync hook annotation to prevent blocking ArgoCD sync
- httpd-tools package is available in ubi-9-appstream-rpms repository
- Bcrypt generates different hashes each time due to random salt
- Change logic to check if valid bcrypt htpasswd entry exists (starts with admin:$2[aby]$)
- This makes the job idempotent - exits successfully if valid entry already exists
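
Because bcrypt output changes on every run, the idempotency check described above has to look at the shape of the entry, not compare hashes. A minimal Python sketch of that check, assuming the htpasswd content is available as a string (the function name is hypothetical):

```python
import re

# Matches entries that start with admin:$2a$, admin:$2b$ or admin:$2y$.
BCRYPT_ADMIN_ENTRY = re.compile(r"^admin:\$2[aby]\$")


def has_valid_htpasswd_entry(htpasswd: str) -> bool:
    """Return True if any line looks like a bcrypt entry for the admin user."""
    return any(BCRYPT_ADMIN_ENTRY.match(line) for line in htpasswd.splitlines())
```

This only verifies the prefix, as the commit describes; it does not check that the hash corresponds to the current password.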
Root cause analysis revealed three critical issues:
1. UBI9 base image lacks kubectl binary
2. Container runs as non-root (UID 1000810000) due to OpenShift SCC
3. Cannot install httpd-tools with dnf (requires root privileges)

Solution:
- Use OpenShift CLI image (has oc/kubectl and python3)
- Replace htpasswd command with Python's crypt module
- Python crypt.METHOD_BLOWFISH generates valid bcrypt hashes
- Change kubectl to oc (both work, oc is native to image)
- Set imagePullPolicy to Always for internal registry

Tested successfully:
- Python crypt generates valid bcrypt: admin:$2b$12$...
- OpenShift CLI image runs without privilege issues
- Job is now idempotent and works in restricted SCC

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Przemyslaw Roguski and others added 10 commits March 3, 2026 14:36
The openid scope is mandatory for OIDC authentication. Added scope definition
and included it in realm default scopes and ACS client configuration.
Also moved offline_access to optional scopes for ACS client.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
- Changed OIDC mode from "auto" to "query" to use standard authorization code flow
- Added offline_access role to admin user to allow offline token requests
- Prevents "code already used" and "offline tokens not allowed" errors

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Add retry loop to wait for Keycloak OIDC discovery endpoint to be available
before attempting to create the auth provider. This prevents 404 errors when
ACS tries to validate the OIDC configuration during provider creation.

Fixes timing issue where create-auth-provider job runs before Keycloak
realm is fully imported and ready.
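
A retry loop of the kind described above can be sketched in Python as follows. This is not the job's actual script: the function name, the attempt count, and the delay are hypothetical, and the readiness probe is passed in as a callable so the waiting logic stays independent of how the endpoint is queried.

```python
import time
from typing import Callable


def wait_until_ready(
    probe: Callable[[], bool],
    attempts: int = 30,
    delay_seconds: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call probe() until it returns True or the attempts are exhausted."""
    for attempt in range(1, attempts + 1):
        if probe():
            return True
        # Do not sleep after the final failed attempt.
        if attempt < attempts:
            sleep(delay_seconds)
    return False
```

In this setting the probe would typically issue an HTTP GET against the realm's `/.well-known/openid-configuration` document and return True on a 200 response; the auth provider is created only after `wait_until_ready` returns True.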

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
- Show Keycloak discovery endpoint response before creating provider
- Capture and display HTTP status codes for all API calls
- Show full response bodies for debugging
- Better error messages with HTTP codes

This will help diagnose issues with auth provider creation and role mapping.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Add "roles": "roles" to claimMappings so ACS knows to look for the roles
claim in the OIDC token. Without this, ACS cannot map Keycloak roles to
ACS roles, resulting in "no valid role" error.

This is the critical fix for role-based authorization with Keycloak OIDC.
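
To make the claim mapping concrete, here is a hedged Python sketch of how an auth provider payload with the `"roles": "roles"` mapping could be assembled. Only `claimMappings`, the `roles` entry, and the `query` mode come from the commits in this PR. The other keys (`type`, `enabled`, `config.issuer`, `config.client_id`), the function name, and the example URL are assumptions for illustration and should be checked against the ACS API before use.

```python
from typing import Optional


def build_auth_provider(
    name: str,
    issuer: str,
    client_id: str,
    claim_mappings: Optional[dict[str, str]] = None,
    mode: str = "query",
) -> dict:
    """Assemble an OIDC auth provider payload (field layout is assumed)."""
    # Always include the roles mapping; callers may add or override mappings.
    mappings = {"roles": "roles"}
    mappings.update(claim_mappings or {})
    return {
        "name": name,
        "type": "oidc",  # assumed field
        "enabled": True,  # assumed field
        "config": {"issuer": issuer, "client_id": client_id, "mode": mode},
        "claimMappings": mappings,
    }


if __name__ == "__main__":
    import json

    payload = build_auth_provider(
        "OIDC", "https://keycloak.example.com/realms/demo", "acs"
    )
    print(json.dumps(payload, indent=2))
```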

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

p-rog (Collaborator, Author) commented Mar 4, 2026

An update:

I fixed all ACS-Keycloak OIDC Integration issues.
In short:
Changes:

  • Added openid client scope definition (mandatory OIDC scope)
  • Added openid to realm defaultDefaultClientScopes
  • Added openid to ACS client defaultClientScopes
  • Moved offline_access from ACS client default scopes to optional scopes
  • Added offline_access role to admin user's realmRoles

Why:

  • The openid scope is required by ACS, which follows the OIDC specification, so the scope must be present in the realm
  • offline_access allows ACS to request refresh tokens
  • Admin user needs offline_access role to prevent "Offline tokens not allowed" error

ACS can now be deployed automatically as part of the layered-zero-trust pattern, and by default it uses Keycloak OIDC authentication. Let me know if I should extend the ACS deployment documentation with a description of how the Keycloak integration works.

p-rog requested review from mlorenzofr and sabre1041 and removed request for sabre1041 March 5, 2026 10:16
values-hub.yaml Outdated
path: charts/acs-central
overrides:
  - name: central.persistence.storageClass
    value: gp3-csi
Collaborator

Let's comment this out so that users don't have to modify the value for their environment and can rely on the default StorageClass. Commenting it out keeps the option visible and shows users which configuration they may need to apply.

Collaborator Author

Fixed

values-hub.yaml Outdated
namespace: openshift-operators
channel: stable
source: redhat-operators
# csv: rhacs-operator.v4.9.0
Collaborator

Should the csv be removed?

Przemyslaw Roguski and others added 4 commits March 10, 2026 13:41
  - Commented out hardcoded gp3-csi storageClass in values-hub.yaml to use cluster default

  2. Job Container Image - Made Configurable

  - Added jobImage configuration in values.yaml (registry, repository, tag, pullPolicy)
  - Changed default imagePullPolicy from Always to IfNotPresent
  - Applied to all 3 jobs: create-auth-provider, create-cluster-init-bundle, create-htpasswd-field

  3. Service Account - Made Configurable

  - Added serviceAccountName to values.yaml (default: create-cluster-init)
  - Updated all 3 job templates to use the variable
  - Updated all 4 RBAC templates (ServiceAccount, Role, RoleBinding, ClusterRoleBinding)

  4. OIDC/Keycloak Integration Improvements

  - Removed unused KEYCLOAK_REALM environment variable
  - Made OIDC claim mappings configurable (name, email, groups, roles) for multi-provider support
  - Changed auth provider name from Keycloak OIDC to generic OIDC

  5. Central CR Template Fix

  - Fixed exposure configuration to use values directly instead of conditional rendering with hardcoded enabled: true

  Files Modified: 10

  - values-hub.yaml
  - charts/acs-central/values.yaml
  - charts/acs-central/templates/central-cr.yaml
  - charts/acs-central/templates/jobs/* (3 files)
  - charts/acs-central/templates/rbac/* (4 files)

p-rog requested a review from sabre1041 March 10, 2026 14:30

p-rog (Collaborator, Author) left a comment

@sabre1041 I addressed all your concerns and suggestions.

3 participants