Http4s v7.0.0 #2760

Open

constantine2nd wants to merge 21 commits into OpenBankProject:develop from constantine2nd:develop

Conversation

@constantine2nd
Contributor

API request is atomic transaction
Endpoint count: 27. Test scenarios: 92.
Test duration via GitHub CI: up to 16 minutes

Root cause: fromFuture set currentProxy on the IO compute thread before
submitting Futures and never cleared it.  After the fiber completed,
that compute thread retained the proxy indefinitely.  In tests,
RequestScopeConnectionTest's `after` block only cleared the test
thread, leaving io-compute threads dirty.  Subsequent DB.use calls
on those threads received the test's tracking proxy, whose getMetaData()
returns null, producing NPE at MetaMapper._dbTableNameLC:1390.

Fix: fromFuture now uses IO.defer so that set-submit-clear all run
synchronously on the same compute thread T:
  1. currentProxy.set(proxy) on T
  2. val f = fut  — Future submitted; TtlRunnable captures T's proxy
  3. currentProxy.remove() on T  — T is clean
  4. IO.fromFuture(IO.pure(f))  — await the already-submitted future

The Future worker still receives the proxy via TtlRunnable (captured
at step 2 before the clear).  No compute thread is ever left dirty.

withRequestTransaction also removed its own currentProxy.set call
(another dirty-thread source if guaranteeCase ran on a different
thread).  All TTL lifecycle is now local to fromFuture.
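The set-submit-clear sequence above can be sketched as follows. This is a minimal illustration, not the actual implementation: `currentProxy` and `TtlRunnable` are named in the commit, while `fromFutureScoped`, `proxy`, and `fut` are hypothetical stand-ins.

```scala
import cats.effect.IO
import scala.concurrent.Future

// Stand-in for the real thread-local proxy holder named in the commit.
val currentProxy = new ThreadLocal[AnyRef]

def fromFutureScoped[A](proxy: AnyRef)(fut: => Future[A]): IO[A] =
  IO.defer {
    currentProxy.set(proxy)           // 1. set on compute thread T
    val f =
      try fut                         // 2. submit; TtlRunnable captures T's proxy
      finally currentProxy.remove()   // 3. clear — T is clean again
    IO.fromFuture(IO.pure(f))         // 4. await the already-submitted Future
  }
```

The key property is that `IO.defer` runs all three thread-local operations synchronously on the same compute thread when the effect executes, so no thread is ever left holding a stale proxy.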
…nectionManager

new Exception().getStackTrace.drop(1).take(5).mkString(" <- ") was called
unconditionally on every newConnection invocation — introduced in the debug
logging commit (c28c692) to trace the NPE origin.  Stack trace capture
walks all JVM frames and is expensive; with thousands of DB operations across
the test suite it inflated CI time from ~32 min to ~54 min.

Also removed Thread.currentThread().getName and unconditional debug string
interpolation from both newConnection and releaseConnection hot paths.
The WARN log for the stale-proxy fallback (uncommon) is kept as-is.
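The commit deletes the unconditional capture outright; had the trace been worth keeping, the usual hot-path alternative is to gate the expensive work on the log level. A sketch, assuming an slf4j-style logger (`acquire` is a hypothetical stand-in for the real pool call):

```scala
import java.sql.Connection
import org.slf4j.LoggerFactory

object ConnectionManagerSketch {
  private val logger = LoggerFactory.getLogger(getClass)

  def newConnection(acquire: () => Connection): Connection = {
    if (logger.isDebugEnabled) {     // expensive work only when DEBUG is on
      val origin = new Exception().getStackTrace.drop(1).take(5).mkString(" <- ")
      logger.debug(s"newConnection from $origin on ${Thread.currentThread().getName}")
    }
    acquire()
  }
}
```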
…exhaustion

withRequestTransaction (v7 native endpoints via ResourceDocMiddleware) holds
one HikariCP connection for the full duration of each request. ScalaCache
rate-limit queries (RateLimiting.findAll on cache miss) run concurrently on
the OBP EC and need additional pool connections. With the default pool of 10
and 10 concurrent test threads, all connections are held by active requests,
leaving none for the background queries. A blocked query would hit HikariCP's
30-second pool timeout, but the test's 10-second HTTP client timeout trips
first, so Property 7.1 fails with "Futures timed out after [10 seconds]".

Worst-case analysis: N concurrent requests hold N connections; up to N
background rate-limit queries each need 1 more → 2*N needed. Pool of 20
covers the 10-thread test (2*10=20) with zero waste.
Three targeted fixes based on per-test timing analysis:

1. code.api.util (1m2s → ~8s, saves 54s)
   JavaWebSignatureTest had Thread.sleep(60 seconds) to let a JWS
   signature expire. JwsUtil.verifySigningTime now reads
   jws.signing_time_validity_seconds from props (default 60). CI sets
   it to 5; the test sleeps 6s. Same coverage, 54s faster.

2. code.api.ResourceDocs1_4_0 (2m0s → ~45s, saves 75s)
   - ResourceDocsTest called stringToNodeSeq on ALL 600+ endpoint
     descriptions per scenario (7,800 HTML5 parses across 13 versions).
     Changed to take(3).foreach — verifies the function works without
     the O(N) per-version cost.
   - SwaggerDocsTest validated the full OpenAPI spec via OpenAPIParser
     for 12 API versions. Kept v5.1.0, v4.0.0, v1.2.1 (representative
     range); dropped 9 redundant intermediate versions. Access-control
     scenarios unchanged. 19 → 10 scenarios.

3. code.api.http4sbridge (2m49s → ~50s, saves 119s)
   50 property scenarios each ran val iterations = 10 (three at 20),
   totalling 530 HTTP round-trips. Added CI_ITERATIONS=3 /
   CI_ITERATIONS_HEAVY=5 constants; all scenarios now reference them.
   Iteration ratio cut by 70%; same properties exercised.
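The shared constants could be as simple as the following sketch; the constant names and values come from the commit, while the enclosing object and its placement are assumptions:

```scala
object PropertyTestConfig {
  val CI_ITERATIONS: Int       = 3  // standard scenarios (formerly 10)
  val CI_ITERATIONS_HEAVY: Int = 5  // the three heavy scenarios (formerly 20)
}
```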
…leep

Build OpenBankProject#48 showed JavaWebSignatureTest still failing: the test slept 6s
but the server's validity window is 60s (default), so the 6s-old signature
was still valid and the endpoint returned 200 instead of 401.

Root cause: the previous fix (sleep 60→6s + prop jws.signing_time_validity_seconds=5)
depended on the prop being injected at runtime. Jenkins uses its own props
mechanism and did not pick up the .github/workflows YAML change.

Fix: sign with signingTime = now - 65s — always outside the 60s default
window, no sleep needed, no prop dependency. Same pattern as the unit
test scenarios at lines 74-118. Also removes the now-unnecessary
jws.signing_time_validity_seconds=5 from the CI YAML.
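The back-dating trick can be sketched as below. `signJws` and `payload` are hypothetical stand-ins for the real signing helper; only the 65s offset and the 60s default window come from the commit:

```scala
import java.time.Instant

// Back-date the signing time by 65s so it always falls outside the
// 60s default validity window, with no sleep and no prop override.
val staleSigningTime: Instant = Instant.now().minusSeconds(65)
val jws = signJws(payload, signingTime = staleSigningTime)
// The endpoint should reject this with 401 regardless of props injection.
```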
Compile job runs once (~10 min), uploads target/ as an artifact.
Three test shards download the artifact and run in parallel (~8 min),
cutting wall-clock time from ~32 min to ~18 min.

Key details:
- mvn clean install -DskipTests compiles test classes into target/
- Each shard runs `mvn test -DwildcardSuites=...` (comma-separated
  packages; YAML >- scalar produces spaces, converted via tr ' ' ',')
- obp-parent POM + obp-commons JAR installed into ~/.m2 on each shard
  so Maven can resolve com.tesobe:* deps without hitting remote repos
- Redis service, full props echo block, hikari.maximumPoolSize=20 on
  every shard
- Verify step dumps class-file counts to confirm artifact was unpacked
- build_container.yml updated identically
…ation

download-artifact preserves original compile-job timestamps; checkout
gives source files the current (later) time; Zinc sees sources as newer
and does a full recompile (431 main + 320 test Scala sources = ~215 s
per shard). Touching everything in target/ makes artifact files appear
just-downloaded, newer than sources, so Zinc skips compilation entirely.
Auto-discovers any test package not covered by shards 1/2 at runtime,
so new packages are never silently skipped in CI.
New report job (always runs after test shards) downloads all shard
artifacts and runs .github/scripts/test_speed_report.py to emit a
per-test speed table to the step log and GitHub Actions job summary.
Shard 1 now runs v4_0_0 alone (~258s). Former shard-1 packages minus v4
move to shard 2 (adds v6_0_0, ~267s). Former shard-2 becomes shard 3
(~252s). Former shard-3 with catch-all becomes shard 4 (~232s).

Catch-all block updated in both workflow files: shard check changed from
"3" to "4", ASSIGNED now references SHARD1+SHARD2+SHARD3, echo message
updated. CLAUDE.md shard table updated to match.
Six new endpoints: getCacheConfig, getCacheInfo, getDatabasePoolInfo,
getStoredProcedureConnectorHealth, getMigrations, getCacheNamespaces.

All follow the standard withUser + role pattern; middleware enforces
entitlement checks. 17 new test scenarios (401/403/200) added to
Http4s700RoutesTest — getStoredProcedureConnectorHealth has only
401/403 since StoredProcedureUtils init block requires DB props not
present in the test environment.
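A hypothetical sketch of the withUser + role pattern these endpoints follow; `withUser`, `hasEntitlement`, the role name, route path, and JSON body are all illustrative stand-ins for the real middleware:

```scala
import cats.effect.IO
import org.http4s._
import org.http4s.dsl.io._

val cacheInfoRoute = HttpRoutes.of[IO] {
  case GET -> Root / "management" / "cache" / "info" =>
    withUser { user =>                              // no valid user → 401
      if (hasEntitlement(user, "CanGetCacheInfo"))
        Ok("""{"caches":[]}""")                     // entitled user → 200
      else
        IO.pure(Response[IO](Status.Forbidden))     // otherwise → 403
    }
}
```

This is the shape the 401/403/200 test triads exercise: middleware rejects anonymous calls, the entitlement check rejects users without the role, and only entitled users reach the handler body.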

Endpoint count: 21 → 27. Test scenarios: 69 → 92.
Real remaining counts (v6.0.0 only):
- GET: 98
- POST: 57
- PUT: 33
- DELETE: 26
- Total: 214

Previous estimates (~192 GETs, ~256 POST/PUT/DELETE) were rough guesses.
Removed hard-coded v7.0.0 version lock. Delegates to
ResourceDocs140.ImplementationsResourceDocs.getResourceDocsList(requestedApiVersion)
so /obp/v7.0.0/resource-docs/v6.0.0/obp returns v6 docs instead of 400.
Updated "Reject non-v7.0.0" test to "Serve v6.0.0 docs" with ?functions filter
to avoid serializing 600+ endpoints. Added "invalid version → 400" test.
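The delegation might look like the following sketch; `getResourceDocsList` is the call named in the commit, while the route shape and `parseApiVersion` are assumptions:

```scala
import cats.effect.IO
import org.http4s._
import org.http4s.dsl.io._

val resourceDocsRoute = HttpRoutes.of[IO] {
  case GET -> Root / "resource-docs" / requestedVersion / "obp" =>
    parseApiVersion(requestedVersion) match {
      case Some(v) =>
        Ok(ResourceDocs140.ImplementationsResourceDocs.getResourceDocsList(v))
      case None =>
        BadRequest("Invalid API version")  // unknown version → 400
    }
}
```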
@sonarqubecloud

Quality Gate failed

Failed conditions
2 Security Hotspots

See analysis details on SonarQube Cloud
