[Test][E2E] Stabilize DB2 upsert timestamp ordering #11161
Closed
davidzollo wants to merge 2 commits into
This PR introduces a comprehensive data tracing system for SeaTunnel Engine
that tracks sampled records through the entire pipeline with minimal overhead.
## Core Features
### Trace Infrastructure
- StainTraceEvent: Event system for trace points
- StainTracePayload: Compact binary payload (STTR protocol v1)
- StainTraceSampler: Deterministic sampling (seq % rate == 0)
- StainTraceBudget: Per-worker per-second budget control
- TaskMappingBuilder: Maps task IDs to human-readable names
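The sampling rule (`seq % rate == 0`) and the per-worker per-second budget can be sketched roughly as follows. This is a minimal illustration, not the PR's actual `StainTraceSampler` or `StainTraceBudget`: the class names, method names, and the fixed one-second window are assumptions made for the example.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch; names and structure are assumptions, not the PR's classes. */
final class StainTraceSamplingSketch {

    /** Deterministic sampler: a record is sampled when seq % rate == 0. */
    static final class Sampler {
        private final long rate;
        private final AtomicLong seq = new AtomicLong();

        Sampler(long rate) {
            if (rate <= 0) {
                throw new IllegalArgumentException("rate must be positive");
            }
            this.rate = rate;
        }

        /** Returns true for records 0, rate, 2 * rate, ... in arrival order. */
        boolean shouldSample() {
            return seq.getAndIncrement() % rate == 0;
        }
    }

    /** Budget: grants at most maxPerSecond traces within each wall-clock second. */
    static final class Budget {
        private final int maxPerSecond;
        private long windowSecond = Long.MIN_VALUE;
        private int usedInWindow;

        Budget(int maxPerSecond) {
            this.maxPerSecond = maxPerSecond;
        }

        /** The caller passes the clock value so the logic stays easy to test. */
        synchronized boolean tryAcquire(long nowMillis) {
            long second = Math.floorDiv(nowMillis, 1000L);
            if (second != windowSecond) {
                // A new second started: reset the counter.
                windowSecond = second;
                usedInWindow = 0;
            }
            if (usedInWindow >= maxPerSecond) {
                return false;
            }
            usedInWindow++;
            return true;
        }
    }
}
```

In this sketch a record would be traced only when both `shouldSample()` and `tryAcquire(...)` return true, which is one way to get the strict upper bound on event volume that the description mentions.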
### Trace Stages (6 stages)
1. SOURCE_EMIT: Source emits record
2. QUEUE_IN: Queue enqueue complete
3. QUEUE_OUT: Queue dequeue start
4. TRANSFORM_IN: Transform receives record
5. TRANSFORM_OUT: Transform outputs record
6. SINK_WRITE_DONE: Sink write complete
### Framework Integration (Zero connector changes)
- RecordSerializer: Extended for payload transmission (backward compatible)
- SeaTunnelSourceCollector: Create payload, append SOURCE_EMIT
- IntermediateQueue: Append QUEUE_IN/QUEUE_OUT
- TransformFlowLifeCycle: Append TRANSFORM_IN/OUT (handles 1-to-N)
- SinkFlowLifeCycle: Append SINK_WRITE_DONE, report event
### Trace Collector Service
Standalone HTTP service for collecting and querying traces:
- Multi-database support: PostgreSQL, MySQL, ClickHouse
- REST APIs: /ingest, /traces, /trace/{id}, /health, /metrics
- Task mapping cache: Auto-fetch task names from Engine
- Payload decoder: Parse binary payload to timing entries
- Built-in metrics: Ingestion rate, errors, query latency
### Web UI Integration
- New trace visualization page
- Query by trace_id or job_id
- Display timing breakdown per stage
- Visualize bottlenecks
### Configuration
```yaml
seatunnel:
  engine:
    stain-trace-enabled: true
    stain-trace-sample-rate: 100000
    stain-trace-max-traces-per-second-per-worker: 50
    stain-trace-max-entries-per-trace: 32
```
### Production Safety
- Zero overhead when disabled (single boolean check)
- ~0.1-1% overhead with 1% sampling when enabled
- Strict event volume upper bound (per-worker budget)
- One event per sampled record (no event storm)
- Backward compatible serialization (new reads old)
## Implementation Details
### Binary Payload Protocol
```
MAGIC(4) = 0x53545452 // 'STTR'
VER(2) = 1
TRACE_ID(8)
START_TS_MS(8)
COUNT(2)
ENTRIES:
STAGE(1), TASK_ID(8), TS_MS(8)
```
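The layout above could be encoded and decoded along these lines. This is a sketch under stated assumptions: the description does not specify byte order, so big-endian (the `ByteBuffer` default) is assumed, and the class `SttrPayloadSketch`, its `Entry` type, and the stage byte values are hypothetical rather than taken from `StainTracePayload`.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical codec for the STTR v1 layout; big-endian byte order is an assumption. */
final class SttrPayloadSketch {
    static final int MAGIC = 0x53545452;               // 'STTR'
    static final short VERSION = 1;
    static final int HEADER_BYTES = 4 + 2 + 8 + 8 + 2; // MAGIC, VER, TRACE_ID, START_TS_MS, COUNT
    static final int ENTRY_BYTES = 1 + 8 + 8;          // STAGE, TASK_ID, TS_MS

    static final class Entry {
        final byte stage;
        final long taskId;
        final long tsMs;

        Entry(byte stage, long taskId, long tsMs) {
            this.stage = stage;
            this.taskId = taskId;
            this.tsMs = tsMs;
        }
    }

    static byte[] encode(long traceId, long startTsMs, List<Entry> entries) {
        if (entries.size() > 0xFFFF) {
            throw new IllegalArgumentException("COUNT is a 2-byte field");
        }
        ByteBuffer buf = ByteBuffer.allocate(HEADER_BYTES + ENTRY_BYTES * entries.size());
        buf.putInt(MAGIC).putShort(VERSION).putLong(traceId).putLong(startTsMs);
        buf.putShort((short) entries.size());
        for (Entry e : entries) {
            buf.put(e.stage).putLong(e.taskId).putLong(e.tsMs);
        }
        return buf.array();
    }

    static List<Entry> decodeEntries(byte[] payload) {
        ByteBuffer buf = ByteBuffer.wrap(payload);
        if (buf.remaining() < HEADER_BYTES || buf.getInt() != MAGIC) {
            throw new IllegalArgumentException("not an STTR payload");
        }
        if (buf.getShort() != VERSION) {
            throw new IllegalArgumentException("unsupported STTR version");
        }
        buf.getLong();                        // TRACE_ID, skipped in this sketch
        buf.getLong();                        // START_TS_MS, skipped in this sketch
        int count = buf.getShort() & 0xFFFF;  // read COUNT as unsigned
        if (buf.remaining() < count * ENTRY_BYTES) {
            throw new IllegalArgumentException("truncated STTR payload");
        }
        List<Entry> entries = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            entries.add(new Entry(buf.get(), buf.getLong(), buf.getLong()));
        }
        return entries;
    }
}
```

With this layout the header takes 24 bytes and each entry 17 bytes, so a trace capped at 32 entries would occupy at most 24 + 32 × 17 = 568 bytes.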
### Performance Analysis
From trace entries, calculate:
- End-to-end latency: SINK_WRITE_DONE.ts - SOURCE_EMIT.ts
- Queue wait time: QUEUE_OUT.ts - QUEUE_IN.ts
- Transform processing: TRANSFORM_OUT.ts - TRANSFORM_IN.ts
- Sink write time: SINK_WRITE_DONE.ts - TRANSFORM_OUT.ts
Aggregate P95/P99 metrics to identify bottlenecks.
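The stage differences and percentile aggregation listed above might be computed as in the following sketch. The `Stage` enum mirrors the six stage names from the description, but `TraceLatencySketch`, `between`, and the nearest-rank `percentile` helper are illustrative assumptions; the collector service may aggregate differently (for example, inside the database).

```java
import java.util.Arrays;
import java.util.Map;

/** Hypothetical helper for deriving per-stage latencies from one trace's timestamps. */
final class TraceLatencySketch {

    enum Stage { SOURCE_EMIT, QUEUE_IN, QUEUE_OUT, TRANSFORM_IN, TRANSFORM_OUT, SINK_WRITE_DONE }

    /** Milliseconds elapsed between two stages of the same trace. */
    static long between(Map<Stage, Long> tsMs, Stage from, Stage to) {
        return tsMs.get(to) - tsMs.get(from);
    }

    /** Nearest-rank percentile over a non-empty array; percent is an integer in 1..100. */
    static long percentile(long[] values, int percent) {
        if (values.length == 0 || percent < 1 || percent > 100) {
            throw new IllegalArgumentException("need values and percent in 1..100");
        }
        long[] sorted = values.clone();
        Arrays.sort(sorted);
        // rank = ceil(percent / 100 * n), computed with integer arithmetic
        int rank = (int) (((long) percent * sorted.length + 99) / 100);
        return sorted[rank - 1];
    }
}
```

For one trace, `between(ts, Stage.QUEUE_IN, Stage.QUEUE_OUT)` gives the queue wait time; collecting that value across many traces and passing the array to `percentile(values, 95)` or `percentile(values, 99)` yields the P95/P99 figures used to spot the slowest stage.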
## Testing
- Unit tests: StainTracePayloadTest, StainTraceSamplerTest, RecordSerializerTest
- Integration tests: StainTraceFlowIT, TransformFlowLifeCycleStainTraceTest
- Backward compatibility: Old format → new reader
- Edge cases: 1-to-N (FlatMap), 1-to-0 (Filter), checkpoint recovery
## Documentation
- Quick start guide: seatunnel-trace/STAIN_TRACE_QUICKSTART.md
- Database init scripts: init-mysql.sql, etc.
- Config reference and deployment guide
## Files Changed
82 files changed, 6612 insertions(+), 42 deletions(-)
Closing this accidental fork-base PR. The DB2 ordering fix is already present in current apache/dev, and PR #11137 has been updated to latest dev so its rerun can pick up that baseline fix.
## Why
- `JdbcDb2UpsertIT.testDb2UpsertE2e` reads `C_UPDATED_AT` without a deterministic row order, so the expected timestamp sequence can be compared against the wrong row order
## What
- Add `ORDER BY C_INT` to the DB2 verification queries in `JdbcDb2UpsertIT`
## Validation
- `./mvnw spotless:apply`
- `./mvnw -pl seatunnel-e2e/seatunnel-connector-v2-e2e/connector-jdbc-e2e/connector-jdbc-e2e-common,seatunnel-e2e/seatunnel-connector-v2-e2e/connector-jdbc-e2e/connector-jdbc-e2e-part-1 -DskipTests test-compile`
- `./mvnw -pl seatunnel-e2e/seatunnel-connector-v2-e2e/connector-jdbc-e2e/connector-jdbc-e2e-common,seatunnel-e2e/seatunnel-connector-v2-e2e/connector-jdbc-e2e/connector-jdbc-e2e-part-1 -Dtest=JdbcDb2UpsertIT -Dsurefire.failIfNoSpecifiedTests=false test`
## Validation notes
- `seatunnel-engine-ui` was not built because this change only touches JDBC E2E tests