feat: OpenTelemetry integration for distributed tracing and metrics #217
Open
Labels
enhancement, rust
Description
Summary
Add native OpenTelemetry (OTel) support to Synapse for distributed tracing, metrics, and log correlation. While Prometheus metrics are already available, OTel would provide standardized distributed tracing across the proxy layer, giving operators full request lifecycle visibility.
Motivation
- Operators using modern observability stacks (Jaeger, Tempo, Datadog, etc.) expect OTel-native instrumentation
- Distributed tracing across Synapse → upstream backends enables end-to-end latency debugging
- OTel is becoming the industry standard, replacing vendor-specific integrations
- Correlating WAF decisions, fingerprint lookups, and upstream routing with trace spans would significantly improve incident investigation
Proposed Features
Tracing
- Generate trace spans for key operations:
  - Incoming request handling (root span)
  - TLS handshake + JA4+ fingerprint computation
  - WAF rule evaluation (wirefilter expression matching)
  - GeoIP / threat intelligence lookups
  - Rate limiter checks
  - Upstream selection and load balancing
  - Upstream request + response
  - CAPTCHA verification (when triggered)
  - Content scanning / ClamAV (when triggered)
- Propagate traceparent/tracestate headers (W3C Trace Context) to upstream backends
- Support configurable sampling rates
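To make the propagation and sampling items concrete, here is a minimal std-only sketch of handling a version `00` `traceparent` header and making a ratio-based sampling decision. The names `TraceParent`, `child`, and `should_sample` are hypothetical and not part of Synapse or the OpenTelemetry crates. In a real integration the SDK's propagator and sampler would do this work; the sampler below is an illustrative threshold check, not the SDK's algorithm.

```rust
/// Parsed W3C Trace Context `traceparent` header (version 00 only).
/// Hypothetical type for illustration.
#[derive(Debug, Clone, PartialEq, Eq)]
struct TraceParent {
    trace_id: u128,
    parent_id: u64,
    flags: u8,
}

impl TraceParent {
    /// Parse `00-<32 hex>-<16 hex>-<2 hex>`; returns `None` on any violation.
    fn parse(header: &str) -> Option<Self> {
        let parts: Vec<&str> = header.trim().split('-').collect();
        if parts.len() != 4 || parts[0] != "00" {
            return None;
        }
        if parts[1].len() != 32 || parts[2].len() != 16 || parts[3].len() != 2 {
            return None;
        }
        // The header format only allows lowercase hex digits.
        let is_lower_hex =
            |s: &str| s.bytes().all(|b| matches!(b, b'0'..=b'9' | b'a'..=b'f'));
        if !parts[1..].iter().all(|p| is_lower_hex(p)) {
            return None;
        }
        let trace_id = u128::from_str_radix(parts[1], 16).ok()?;
        let parent_id = u64::from_str_radix(parts[2], 16).ok()?;
        let flags = u8::from_str_radix(parts[3], 16).ok()?;
        // All-zero trace and parent IDs are invalid.
        if trace_id == 0 || parent_id == 0 {
            return None;
        }
        Some(Self { trace_id, parent_id, flags })
    }

    /// Context to send upstream: same trace, the proxy's span as the new parent.
    fn child(&self, span_id: u64) -> Self {
        Self { trace_id: self.trace_id, parent_id: span_id, flags: self.flags }
    }

    /// Serialize back into the header value.
    fn to_header(&self) -> String {
        format!("00-{:032x}-{:016x}-{:02x}", self.trace_id, self.parent_id, self.flags)
    }
}

/// Deterministic ratio sampling on the low 64 bits of the trace ID, so every
/// hop that sees the same trace ID reaches the same decision.
fn should_sample(trace_id: u128, rate: f64) -> bool {
    if rate >= 1.0 {
        return true;
    }
    if rate <= 0.0 || rate.is_nan() {
        return false;
    }
    let threshold = (rate * (u64::MAX as f64)) as u64;
    (trace_id as u64) < threshold
}
```

A proxy would call `TraceParent::parse` on the incoming header, create its own span ID, and send `child(span_id).to_header()` to the upstream backend. The `tracestate` header is passed through unchanged in this sketch.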
Metrics (OTel native)
- Export existing Prometheus metrics via OTel metrics SDK as an alternative
- Add histogram metrics for per-stage latencies (fingerprinting, WAF eval, upstream RTT)
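As a rough illustration of the per-stage latency histograms, the sketch below records durations into fixed buckets using only the standard library. `LatencyHistogram` and its bucket bounds are hypothetical. An actual implementation would presumably use the OTel metrics SDK's histogram instrument instead of hand-rolled buckets.

```rust
use std::time::Duration;

/// Fixed-bucket latency histogram (hypothetical type for illustration).
/// Each bound is an inclusive upper limit in microseconds; values above the
/// largest bound land in a final overflow bucket.
struct LatencyHistogram {
    bounds_us: Vec<u64>,
    counts: Vec<u64>,
    sum_us: u64,
}

impl LatencyHistogram {
    fn new(mut bounds_us: Vec<u64>) -> Self {
        bounds_us.sort_unstable();
        bounds_us.dedup();
        let counts = vec![0; bounds_us.len() + 1];
        Self { bounds_us, counts, sum_us: 0 }
    }

    /// Record one observation, e.g. the elapsed time of a WAF evaluation.
    fn record(&mut self, elapsed: Duration) {
        let us = elapsed.as_micros().min(u64::MAX as u128) as u64;
        // Index of the first bound that is >= the observed value.
        let idx = self.bounds_us.partition_point(|&b| b < us);
        self.counts[idx] += 1;
        self.sum_us = self.sum_us.saturating_add(us);
    }

    /// Total number of observations across all buckets.
    fn count(&self) -> u64 {
        self.counts.iter().sum()
    }
}
```

One histogram per stage (fingerprinting, WAF eval, upstream RTT) could be fed by wrapping each stage in `Instant::now()` and passing `start.elapsed()` to `record`.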
Configuration
telemetry:
  opentelemetry:
    enabled: true
    exporter: "otlp"                      # otlp, jaeger, zipkin
    endpoint: "http://otel-collector:4317"
    protocol: "grpc"                      # grpc, http
    sampling_rate: 0.1                    # 10% sampling
    propagation: "w3c"                    # w3c, b3, jaeger
    resource_attributes:
      service.name: "synapse"
      deployment.environment: "production"
Environment Variables
- AX_OTEL_ENABLED: Enable/disable OTel
- AX_OTEL_ENDPOINT: Collector endpoint
- AX_OTEL_SAMPLING_RATE: Sampling rate (0.0–1.0)
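One way the environment variables could override file-based configuration is sketched below. `OtelConfig` and `apply_env_overrides` are hypothetical names, the accepted boolean spellings are an assumption, and the default of `enabled: false` is assumed here, not specified by this proposal. The lookup is passed in as a closure so the logic can be exercised without touching the process environment.

```rust
/// Subset of the proposed telemetry settings (hypothetical struct).
#[derive(Debug, Clone, PartialEq)]
struct OtelConfig {
    enabled: bool,
    endpoint: String,
    sampling_rate: f64,
}

impl Default for OtelConfig {
    fn default() -> Self {
        Self {
            enabled: false, // assumed default: opt-in
            endpoint: "http://otel-collector:4317".to_string(),
            sampling_rate: 0.1,
        }
    }
}

/// Apply AX_OTEL_* overrides on top of `cfg`, rejecting invalid values.
fn apply_env_overrides<F>(mut cfg: OtelConfig, lookup: F) -> Result<OtelConfig, String>
where
    F: Fn(&str) -> Option<String>,
{
    if let Some(raw) = lookup("AX_OTEL_ENABLED") {
        cfg.enabled = match raw.trim().to_ascii_lowercase().as_str() {
            "1" | "true" => true,
            "0" | "false" => false,
            other => return Err(format!("AX_OTEL_ENABLED: invalid boolean {other:?}")),
        };
    }
    if let Some(raw) = lookup("AX_OTEL_ENDPOINT") {
        let endpoint = raw.trim();
        if endpoint.is_empty() {
            return Err("AX_OTEL_ENDPOINT: must not be empty".to_string());
        }
        cfg.endpoint = endpoint.to_string();
    }
    if let Some(raw) = lookup("AX_OTEL_SAMPLING_RATE") {
        let rate: f64 = raw
            .trim()
            .parse()
            .map_err(|e| format!("AX_OTEL_SAMPLING_RATE: {e}"))?;
        // The range check also rejects NaN.
        if !(0.0..=1.0).contains(&rate) {
            return Err(format!("AX_OTEL_SAMPLING_RATE: {rate} is outside 0.0..=1.0"));
        }
        cfg.sampling_rate = rate;
    }
    Ok(cfg)
}
```

In the running binary the closure would be `|key| std::env::var(key).ok()`.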
Implementation Notes
- Use opentelemetry + opentelemetry-otlp + tracing-opentelemetry crates to bridge existing tracing instrumentation
- Since Synapse already uses tracing/tracing-subscriber, the integration should layer on top of existing spans
- Consider making this feature-gated (--features otel) to avoid adding dependencies for users who don't need it
- OTLP exporter should support both gRPC and HTTP protocols
- Graceful shutdown must flush pending spans before exit
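The flush-on-shutdown requirement can be illustrated with a std-only sketch of a batching exporter: a worker thread drains a channel, and shutdown closes the channel and then waits for the worker to write whatever is still queued. `SpanExporter` and the in-memory `sink` are hypothetical stand-ins. With the real crates this would map to the SDK's provider shutdown call, whose exact API depends on the crate version.

```rust
use std::sync::mpsc::{self, Sender};
use std::sync::{Arc, Mutex};
use std::thread::{self, JoinHandle};

/// Batching exporter stand-in (hypothetical type). Spans are plain strings
/// and the "collector" is a shared Vec so the behavior is easy to observe.
struct SpanExporter {
    tx: Option<Sender<String>>,
    worker: Option<JoinHandle<()>>,
}

impl SpanExporter {
    fn start(sink: Arc<Mutex<Vec<String>>>, batch_size: usize) -> Self {
        let (tx, rx) = mpsc::channel::<String>();
        let worker = thread::spawn(move || {
            let mut batch = Vec::with_capacity(batch_size);
            // recv() keeps yielding queued spans after the sender is dropped
            // and only errors once the queue is empty.
            while let Ok(span) = rx.recv() {
                batch.push(span);
                if batch.len() >= batch_size {
                    sink.lock().unwrap().append(&mut batch);
                }
            }
            // Final flush of the partial batch before the thread exits.
            sink.lock().unwrap().append(&mut batch);
        });
        Self { tx: Some(tx), worker: Some(worker) }
    }

    /// Queue a finished span; silently dropped after shutdown.
    fn record(&self, span: String) {
        if let Some(tx) = &self.tx {
            let _ = tx.send(span);
        }
    }

    /// Close the channel, then block until the worker has flushed everything.
    fn shutdown(&mut self) {
        drop(self.tx.take());
        if let Some(worker) = self.worker.take() {
            let _ = worker.join();
        }
    }
}

impl Drop for SpanExporter {
    fn drop(&mut self) {
        self.shutdown();
    }
}
```

Calling `shutdown` from the process's signal handling path, before exit, is what guarantees pending spans reach the collector. The `Drop` impl is only a safety net.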
Labels
enhancement, observability