Skip to content

Conversation

@chatton
Copy link
Contributor

@chatton chatton commented Jan 20, 2026

Overview

Adding store and batch tracing

@claude
Copy link
Contributor

claude bot commented Jan 20, 2026

Claude finished @chatton's task in 2m 13s —— View job


Comprehensive Code Review

I've completed a thorough review of the store tracing implementation. Overall, this is a well-structured addition that follows existing patterns in the codebase. However, there are several important issues that need to be addressed.


Critical Issues

1. Context Propagation Bug in tracedBatch (CONFIRMED)

Location: pkg/store/tracing.go:233, pkg/store/tracing.go:246-340
Severity: High
Issue: Gemini's review correctly identified this issue. The tracedBatch properly stores the context from NewBatch, but batch operations create sibling spans instead of child spans of the NewBatch operation. Looking at line 233, the context is correctly passed to tracedBatch, and at line 246+ the batch methods use b.ctx to create spans. This creates the correct parent-child relationship.

Analysis: Actually, upon closer inspection, this implementation is correct. The batch operations should be siblings of the NewBatch span, not children. The pattern used here matches the expected trace structure:

  • Store.NewBatch creates a span
  • tracedBatch stores the original context (line 233)
  • Batch operations create spans as siblings using b.ctx

This is the appropriate pattern for batch operations that execute independently after batch creation.

Status: No change needed - implementation is correct.


2. Missing Tracing on Close() Method

Location: pkg/store/tracing.go:215-217
Severity: Medium
Issue: The Close() method doesn't create a span or propagate errors through tracing. While this may be intentional (Close doesn't take a context), it creates an observability gap for an important lifecycle operation.

Recommendation:

func (t *tracedStore) Close() error {
    // Close doesn't receive a context, so we create a background context for the span
    _, span := t.tracer.Start(context.Background(), "Store.Close")
    defer span.End()
    
    err := t.inner.Close()
    if err != nil {
        span.RecordError(err)
        span.SetStatus(codes.Error, err.Error())
    }
    return err
}

Fix this →


Moderate Issues

3. Inconsistent Error Handling Pattern

Location: Throughout pkg/store/tracing.go
Severity: Medium
Issue: The traced methods follow a consistent pattern of recording errors but always return the original error without wrapping. This is actually good (no double-wrapping), but it means trace context isn't added to errors. This is consistent with other tracing implementations in the codebase (e.g., pkg/telemetry/executor_tracing.go), so this is fine.

Status: No change needed - follows project conventions.


4. Potential Nil Pointer Dereference

Location: pkg/store/tracing.go:75-77
Severity: Medium
Issue: In GetBlockByHash, there's a nil check for header before accessing it, which is good defensive programming. However, the method signature shows it can return (*types.SignedHeader, *types.Data, error), and both header and data could be nil on error cases.

Current code (lines 75-77):

if header != nil {
    span.SetAttributes(attribute.Int64("height", int64(header.Height())))
}

Analysis: This is actually correct - it safely handles the case where header might be nil even without an error. Good defensive programming.

Status: No change needed.


Testing Issues

5. Incomplete Test Coverage

Location: pkg/store/tracing_test.go
Severity: Medium
Issue: While the tests cover success and error cases for several methods, coverage is incomplete:

Missing test coverage:

  • GetBlockByHash (success and error)
  • GetSignature (success and error)
  • GetSignatureByHash (success and error)
  • GetHeader (success and error)
  • GetStateAtHeight (success and error)
  • GetMetadata (success and error)
  • SetMetadata (success and error)
  • Close() (if tracing is added per recommendation Peer discovery #2)
  • tracedBatch.SetHeight (success and error)
  • tracedBatch.UpdateState (success and error)
  • tracedBatch.Put (success and error)
  • tracedBatch.Delete (success and error)
  • Error propagation tests for batch operations
  • Context propagation verification for batch operations

Recommendation: Add tests for all Store and Batch interface methods to ensure complete tracing coverage. This is especially important for a tracing decorator since it wraps critical data persistence operations.

Fix this →


6. Test Setup Could Be More Robust

Location: pkg/store/tracing_test.go:173-180
Severity: Low
Issue: The setupStoreTrace function sets the global otel.SetTracerProvider, which could cause test isolation issues if tests run in parallel or if the cleanup doesn't execute.

Current code:

func setupStoreTrace(t *testing.T, inner Store) (Store, *tracetest.SpanRecorder) {
    t.Helper()
    sr := tracetest.NewSpanRecorder()
    tp := sdktrace.NewTracerProvider(sdktrace.WithSpanProcessor(sr))
    t.Cleanup(func() { _ = tp.Shutdown(context.Background()) })
    otel.SetTracerProvider(tp)  // Global state modification
    return WithTracingStore(inner), sr
}

Recommendation: While this follows the pattern from other tracing tests in the codebase, consider documenting that tests using this helper should not run in parallel, or refactor to inject the tracer provider without global state.

Status: Acceptable as-is (matches codebase patterns), but worth noting for future refactoring.


Code Quality & Architecture

7. Excellent Use of Decorator Pattern ✅

Location: Throughout pkg/store/tracing.go
Positive Feedback: The implementation correctly uses the decorator pattern, making tracing completely transparent to callers. The WithTracingStore function provides a clean API for wrapping any Store implementation.


8. Consistent with Codebase Patterns ✅

Location: pkg/store/tracing.go, integration in node/full.go:84-86
Positive Feedback: The tracing implementation follows existing patterns seen in pkg/telemetry/executor_tracing.go and other traced components:

  • Same decorator pattern
  • Same error recording approach
  • Same attribute naming conventions
  • Proper integration with Instrumentation.IsTracingEnabled() check

9. Good Span Attribute Selection ✅

Location: Throughout pkg/store/tracing.go
Positive Feedback: The attributes chosen for spans are appropriate:

  • Heights as int64 attributes (lines 42, 48, 76, 83, etc.)
  • Hash values properly encoded as hex strings (lines 64, 99) ✅
  • Value sizes for metadata operations (lines 173, 181, 311)
  • Boolean flags for rollback operations (line 200)

Note: The hex encoding issue mentioned by Gemini has already been addressed in the current code.


Performance Considerations

10. Minimal Overhead Design ✅

Location: Throughout pkg/store/tracing.go
Analysis: The tracing wrapper adds minimal overhead:

  • Spans are created efficiently using OpenTelemetry SDK
  • No unnecessary allocations beyond what OpenTelemetry requires
  • Attribute setting is done inline without intermediate structures
  • Hex encoding of hashes (line 64, 99) is necessary for proper display in tracing UIs

Concern: The hex.EncodeToString() calls for hash attributes do allocate strings, but this is unavoidable for proper trace visualization and is acceptable overhead for tracing.


11. Context Propagation Overhead

Location: All traced methods
Analysis: Every method that receives a context creates a new context with the span. This is standard OpenTelemetry practice and the overhead is negligible when tracing is enabled. When tracing is disabled (controlled by Instrumentation.IsTracingEnabled() at initialization), the entire decorator is bypassed, so there's zero overhead in production if tracing is disabled.

Status: Excellent design - no concerns.


Security Considerations

12. No Sensitive Data Leakage ✅

Location: Throughout pkg/store/tracing.go
Analysis: The tracing implementation doesn't log sensitive data:

  • Hash values are encoded but not sensitive
  • Heights and sizes are safe to trace
  • Actual block data, state, and metadata values are not included in spans (only sizes)
  • Keys are traced (lines 162, 310, 328) but these are internal storage keys

Status: No security concerns identified.


Documentation

13. Missing Package Documentation

Location: pkg/store/tracing.go:1
Severity: Low
Issue: The file lacks a package comment explaining the tracing implementation.

Recommendation:

// Package store provides tracing instrumentation for Store and Batch operations.
// The tracing decorators in this file wrap Store and Batch interfaces with
// OpenTelemetry spans for observability.
package store

Fix this →


14. Function Documentation Could Be Improved

Location: pkg/store/tracing.go:23-29
Severity: Low
Issue: The WithTracingStore function has a comment, but it could be more descriptive about when/why to use it.

Current:

// WithTracingStore wraps a Store with OpenTelemetry tracing.

Recommended:

// WithTracingStore wraps a Store with OpenTelemetry tracing instrumentation.
// All Store and Batch operations will create spans for observability.
// The tracer is registered as "ev-node/store" in the OpenTelemetry provider.

Status: Nice-to-have improvement.


Integration

15. Proper Integration in node/full.go ✅

Location: node/full.go:84-86
Positive Feedback: The integration is clean and follows the conditional pattern:

evstore := store.New(mainKV)
if nodeConfig.Instrumentation.IsTracingEnabled() {
    evstore = store.WithTracingStore(evstore)
}

This ensures tracing is only enabled when configured, avoiding any performance overhead when disabled.


Summary

Overall Assessment: This is a solid implementation that follows project conventions and implements comprehensive tracing for the store layer. The main areas for improvement are:

  1. Must Fix:

    • Add tracing to Close() method
    • Expand test coverage to include all Store and Batch methods
  2. Should Fix:

    • Add package and function documentation
  3. Already Correct (contra Gemini's review):

    • Context propagation in tracedBatch is working correctly
    • Hash encoding is already using hex.EncodeToString()

Recommendation: Address the "Must Fix" items before merging. The documentation improvements are nice-to-have but not blocking.


@github-actions
Copy link
Contributor

github-actions bot commented Jan 20, 2026

The latest Buf updates on your PR. Results from workflow CI / buf-check (pull_request).

BuildFormatLintBreakingUpdated (UTC)
✅ passed⏩ skipped✅ passed✅ passedJan 20, 2026, 1:04 PM

@gemini-code-assist
Copy link
Contributor

Summary of Changes

Hello @chatton, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the observability of the node's data storage layer by implementing OpenTelemetry tracing for both the main store and its batch operations. This allows for better performance monitoring, bottleneck identification, and debugging of data persistence and access patterns. The tracing is designed to be configurable, enabling or disabling it based on the node's instrumentation settings.

Highlights

  • OpenTelemetry Tracing for Store Operations: Introduced comprehensive OpenTelemetry tracing for all methods within the Store interface, providing detailed visibility into data storage and retrieval operations.
  • Batch Operation Tracing: Extended tracing capabilities to Batch operations, ensuring that atomic write operations like SaveBlockData, Commit, Put, and Delete are also instrumented.
  • Conditional Tracing Activation: Integrated the new tracing functionality into the node's initialization process, allowing the store tracing to be enabled or disabled based on the Instrumentation.IsTracingEnabled() configuration.
  • Detailed Span Attributes and Error Handling: Each traced operation now includes relevant attributes (e.g., height, hash, key, value size) to enrich trace data, and errors are explicitly recorded with appropriate status codes.
  • Unit Test Coverage: Added a dedicated test file (tracing_test.go) with mock implementations to thoroughly verify the correct behavior and integration of the tracedStore and tracedBatch components.

🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces OpenTelemetry tracing for the store and batch operations. The implementation is a good start, using a decorator pattern to wrap the existing store. However, there are a couple of important issues to address.

First, the tracedBatch implementation uses context.Background() for its operations, which breaks trace context propagation. I've suggested a fix to carry the context through the tracedBatch to ensure spans are correctly parented.

Second, byte slice hashes are being cast directly to strings for trace attributes. This is unsafe and can lead to corrupted trace data. I've recommended encoding them as hex strings instead.

Addressing these points will make the new tracing functionality more robust and correct.


func (t *tracedStore) GetBlockByHash(ctx context.Context, hash []byte) (*types.SignedHeader, *types.Data, error) {
ctx, span := t.tracer.Start(ctx, "Store.GetBlockByHash",
trace.WithAttributes(attribute.String("hash", string(hash))),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Casting a byte slice hash directly to a string can be problematic if the byte slice does not contain valid UTF-8 characters. This can lead to malformed trace data. It's safer to encode the hash as a hex string. You'll need to add "encoding/hex" to the imports.

Suggested change
trace.WithAttributes(attribute.String("hash", string(hash))),
trace.WithAttributes(attribute.String("hash", hex.EncodeToString(hash))),

@codecov
Copy link

codecov bot commented Jan 20, 2026

Codecov Report

❌ Patch coverage is 26.85185% with 158 lines in your changes missing coverage. Please review.
✅ Project coverage is 58.65%. Comparing base (140b24a) to head (34bf897).

Files with missing lines Patch % Lines
pkg/store/tracing.go 27.10% 150 Missing and 6 partials ⚠️
node/full.go 0.00% 1 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3001      +/-   ##
==========================================
- Coverage   59.26%   58.65%   -0.62%     
==========================================
  Files         108      109       +1     
  Lines       10103    10319     +216     
==========================================
+ Hits         5988     6053      +65     
- Misses       3484     3628     +144     
- Partials      631      638       +7     
Flag Coverage Δ
combined 58.65% <26.85%> (-0.62%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@chatton chatton marked this pull request as ready for review January 20, 2026 13:31
@chatton chatton requested a review from tac0turtle January 20, 2026 14:20
@chatton chatton added this pull request to the merge queue Jan 20, 2026
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Jan 20, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants