Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
37 commits
Select commit Hold shift + click to select a range
9b0052d
feat: add event system foundation for provider lifecycle
sameerank Dec 9, 2025
147a9c3
feat: add provider lifecycle interfaces and state management
sameerank Dec 10, 2025
2833a4a
fix: correct test expectation for init_with_timeout method signature
sameerank Dec 10, 2025
7f85a25
feat: implement async provider initialization with event-driven lifec…
sameerank Dec 10, 2025
45684bc
test: add OpenFeature specification requirement tests for provider li…
sameerank Dec 10, 2025
8c91379
style: remove excessive comments to match project conventions
sameerank Dec 11, 2025
821e71c
fix: address code review feedback
sameerank Dec 11, 2025
0ab5d30
refactor: remove ContextAwareStateHandler as it provides no additiona…
sameerank Dec 11, 2025
e4887f1
fix: address second round of Gemini code review feedback
sameerank Dec 11, 2025
6eefe78
fix: address third round of Gemini code review feedback
sameerank Dec 11, 2025
ce8d3f1
refactor: use dedicated ProviderEventDispatcher class
sameerank Dec 11, 2025
700ad79
refactor: remove EventAwareNoOpProvider and reorganize compatibility …
sameerank Dec 11, 2025
2fa7865
fix: address race conditions in provider lifecycle management
sameerank Dec 11, 2025
fc88010
refactor: improve API structure and fix nil provider handling
sameerank Dec 11, 2025
d22ca3c
refactor: improve error messages and init parameter handling
sameerank Dec 11, 2025
2611c40
refactor: improve parameter inspection and simplify event details
sameerank Dec 11, 2025
e6e176a
refactor: replace PROVIDER_FATAL string literals with constants
sameerank Dec 11, 2025
fb37e18
improve: enhance error handling by propagating original exception object
sameerank Dec 11, 2025
2d531a1
fix: prevent memory leaks in provider state registry
sameerank Dec 11, 2025
fc2efaf
fix: prevent race condition in async provider initialization
sameerank Dec 11, 2025
4cdc814
improve: enhance diagnostics and robustness of provider eventing system
sameerank Dec 11, 2025
f858bfb
fix: correct semantically incorrect original_error assignment
sameerank Dec 11, 2025
ca59659
fix: remove problematic provider reversion logic and dead code
sameerank Dec 11, 2025
e0e354c
fix: address code review feedback on exception handling and test stab…
sameerank Dec 11, 2025
8a13c74
fix: add thread-safety to set_provider_and_wait method
sameerank Dec 12, 2025
4f7c547
fix: resolve race condition in set_provider_and_wait
sameerank Dec 12, 2025
2be203e
perf: improve concurrency in set_provider_and_wait
sameerank Dec 12, 2025
ebe7621
refactor: improve error message clarity and code organization
sameerank Dec 12, 2025
bc37195
refactor: replace hash mapping with idiomatic case statement
sameerank Dec 12, 2025
3f0e659
test: achieve 100% test coverage with comprehensive edge case testing
sameerank Dec 12, 2025
feb988b
refactor: remove redundant comments and add OpenFeature spec reference
sameerank Dec 12, 2025
40ec89e
Merge remote-tracking branch 'origin/main' into feat/add-provider-eve…
sameerank Dec 12, 2025
b301cf9
style: apply Standard Ruby lint fixes for CI compliance
sameerank Dec 12, 2025
124b689
docs: update README with eventing documentation
sameerank Dec 12, 2025
19dd12f
refactor: improve API encapsulation by making testing utilities private
sameerank Dec 12, 2025
ba4a136
refactor: remove unnecessary StateHandler module to simplify PR
sameerank Dec 13, 2025
e5ba150
refactor: simplify provider initialization with explicit blocking con…
sameerank Dec 16, 2025
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions Gemfile.lock
Original file line number Diff line number Diff line change
Expand Up @@ -95,6 +95,7 @@ PLATFORMS
arm64-darwin-21
arm64-darwin-22
arm64-darwin-23
arm64-darwin-24
x64-mingw-ucrt
x64-mingw32
x86_64-darwin-19
Expand Down
49 changes: 41 additions & 8 deletions README.md
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please update the feature table with the new status and update the eventing section in the readme.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated!

I was initially going to use the ⚠️ symbol because PROVIDER_CONTEXT_CHANGED and PROVIDER_RECONCILING are not implemented as outlined in Req 5.3.5. However, I see that eventing is marked as ✅ in the Java or Go SDKs so it seems those are optional

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CONTEXT_CHANGED and RECONCILING only make sense in static-context (client) paradigm. Server SDKs don't need to support these.

Events | OpenFeature:

Providers built to conform to the static context paradigm feature two additional events: PROVIDER_RECONCILING and PROVIDER_CONTEXT_CHANGED.

And section 5.3.4 is conditioned on static-context paradigm.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct!

Original file line number Diff line number Diff line change
Expand Up @@ -104,7 +104,7 @@ object = client.fetch_object_value(flag_key: 'object_value', default_value: { na
| ⚠️ | [Hooks](#hooks) | Add functionality to various stages of the flag evaluation life-cycle. |
| ❌ | [Logging](#logging) | Integrate with popular logging packages. |
| ✅ | [Domains](#domains) | Logically bind clients with providers. |
| | [Eventing](#eventing) | React to state changes in the provider or flag management system. |
| | [Eventing](#eventing) | React to state changes in the provider or flag management system. |

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

major(spec): one thing that seems to be missing is client-level event handlers. The user shall be able to attach event handler to specific clients instead of globally.

Copy link
Member

@toddbaert toddbaert Dec 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct: https://openfeature.dev/specification/sections/events#requirement-521

A corollary to this is "domain scoping" - a client bound to domain x should not get events originating from a provider from domain y. Additionally, late binding to domains should be observed. For example, initially a client bound to domain x will use the default provider, but if a new provider is bound to domain x, that becomes the provider from which the client sources its events.

See 5.1.2 and 5.1.3

| ⚠️ | [Shutdown](#shutdown) | Gracefully clean up a provider during application shutdown. |
| ❌ | [Transaction Context Propagation](#transaction-context-propagation) | Set a specific [evaluation context](https://openfeature.dev/docs/reference/concepts/evaluation-context) for a transaction (e.g. an HTTP request or a thread) |
| ⚠️ | [Extending](#extending) | Extend OpenFeature with custom providers and hooks. |
Expand Down Expand Up @@ -142,7 +142,8 @@ begin
rescue OpenFeature::SDK::ProviderInitializationError => e
puts "Provider failed to initialize: #{e.message}"
puts "Error code: #{e.error_code}"
puts "Original error: #{e.original_error}"
# Note: original_error is only present for timeout errors, nil for provider event errors
puts "Original error: #{e.original_error}" if e.original_error
Comment on lines +145 to +146

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor: I think it makes sense to always set original_error for ergonomics and semantics (it's a bit surprising that it is not set when there is an underlying error). The fact that it's now nilable is arguably a breaking change as well.

end

# With custom timeout (default is 30 seconds)
Expand All @@ -164,7 +165,8 @@ end
The `set_provider_and_wait` method:
- Waits for the provider's `init` method to complete successfully
- Raises `ProviderInitializationError` with `PROVIDER_FATAL` error code if initialization fails or times out
- Provides access to the original error, provider instance, and error code for debugging
- Provides access to the provider instance and error code for debugging
- The `original_error` field only contains the underlying exception for timeout errors; it is `nil` for errors that occur through the provider event system
- Uses the same thread-safe provider switching as `set_provider`

In some situations, it may be beneficial to register multiple providers in the same application.
Expand Down Expand Up @@ -231,15 +233,46 @@ legacy_flag_client = OpenFeature::SDK.build_client(domain: "legacy_flags")

### Eventing

Coming Soon! [Issue available](https://github.com/open-feature/ruby-sdk/issues/51) to be worked on.

<!-- Events allow you to react to state changes in the provider or underlying flag management system, such as flag definition changes, provider readiness, or error conditions.
Events allow you to react to state changes in the provider or underlying flag management system, such as flag definition changes, provider readiness, or error conditions.
Initialization events (`PROVIDER_READY` on success, `PROVIDER_ERROR` on failure) are dispatched for every provider.
Some providers support additional events, such as `PROVIDER_CONFIGURATION_CHANGED`.

Please refer to the documentation of the provider you're using to see what events are supported. -->
Please refer to the documentation of the provider you're using to see what events are supported.

```ruby
# Register event handlers at the API (global) level
ready_handler = ->(event_details) do
puts "Provider #{event_details[:provider].metadata.name} is ready!"
end

OpenFeature::SDK.add_handler(OpenFeature::SDK::ProviderEvent::PROVIDER_READY, ready_handler)

# Providers can emit events using the EventHandler mixin
class MyEventAwareProvider
include OpenFeature::SDK::Provider::EventHandler

<!-- TODO: code example of a PROVIDER_CONFIGURATION_CHANGED event for the client and a PROVIDER_STALE event for the API -->
def init(evaluation_context)
# Start background process to monitor for configuration changes
# Note: SDK automatically emits PROVIDER_READY when init completes successfully

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor: this note seems inaccurate. PROVIDER_READY is only emitted automatically iff the provider does not include the EventHandler mixin. If the mixin is used, the provider must emit its own PROVIDER_READY event. (Which this example doesn't seem to do?)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whether or not the provider implements events, the SDK should automatically emit READY/ERROR events when the init function succeeds/fails (raises exception in this case). More here: https://github.com/open-feature/ruby-sdk/pull/207/changes#r2624429319

start_background_process
end

def start_background_process
Thread.new do
# Monitor for configuration changes and emit events when they occur
if configuration_changed?
emit_event(
OpenFeature::SDK::ProviderEvent::PROVIDER_CONFIGURATION_CHANGED,
message: "Flag configuration updated"
)
end
end
end
end

# Remove specific handlers when no longer needed
OpenFeature::SDK.remove_handler(OpenFeature::SDK::ProviderEvent::PROVIDER_READY, ready_handler)
```

### Shutdown

Expand Down
22 changes: 22 additions & 0 deletions lib/open_feature/sdk/api.rb
Original file line number Diff line number Diff line change
Expand Up @@ -51,6 +51,28 @@ def build_client(domain: nil, evaluation_context: nil)
rescue
Client.new(provider: Provider::NoOpProvider.new, evaluation_context:)
end

def add_handler(event_type, handler)
configuration.add_handler(event_type, handler)
end

def remove_handler(event_type, handler)
configuration.remove_handler(event_type, handler)
end

def logger
configuration.logger
end

def logger=(new_logger)
configuration.logger = new_logger
end

private

def clear_all_handlers
configuration.clear_all_handlers
end
end
end
end
162 changes: 115 additions & 47 deletions lib/open_feature/sdk/configuration.rb
Original file line number Diff line number Diff line change
Expand Up @@ -3,6 +3,10 @@
require "timeout"
require_relative "api"
require_relative "provider_initialization_error"
require_relative "event_emitter"
require_relative "provider_event"
require_relative "provider_state_registry"
require_relative "provider/event_handler"

module OpenFeature
module SDK
Expand All @@ -14,75 +18,139 @@ class Configuration
extend Forwardable

attr_accessor :evaluation_context, :hooks
attr_reader :logger

def initialize
@hooks = []
@providers = {}
@provider_mutex = Mutex.new
@logger = nil
@event_emitter = EventEmitter.new(@logger)
@provider_state_registry = ProviderStateRegistry.new
end

def provider(domain: nil)
@providers[domain] || @providers[nil]
end

# When switching providers, there are a few lifecycle methods that need to be taken care of.
# 1. If a provider is already set, we need to call `shutdown` on it.
# 2. On the new provider, call `init`.
# 3. Finally, set the internal provider to the new provider
def logger=(new_logger)
@logger = new_logger
@event_emitter.logger = new_logger if @event_emitter
end

def add_handler(event_type, handler)
@event_emitter.add_handler(event_type, handler)
end
Comment on lines +41 to +43

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

major(spec):

Events | OpenFeature:

Handlers attached after the provider is already in the associated state, MUST run immediately.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ya I noticed this as well due to a skipped test case: #207 (comment)


def remove_handler(event_type, handler)
@event_emitter.remove_handler(event_type, handler)
end

def clear_all_handlers
@event_emitter.clear_all_handlers
end

def set_provider(provider, domain: nil)
@provider_mutex.synchronize do
@providers[domain].shutdown if @providers[domain].respond_to?(:shutdown)
provider.init if provider.respond_to?(:init)
new_providers = @providers.dup
new_providers[domain] = provider
@providers = new_providers
set_provider_internal(provider, domain: domain, wait_for_init: false)
end
end

# Sets a provider and waits for the initialization to complete or fail.
# This method ensures the provider is ready (or in error state) before returning.
#
# @param provider [Object] the provider to set
# @param domain [String, nil] the domain for the provider (optional)
# @param timeout [Integer] maximum time to wait for initialization in seconds (default: 30)
# @raise [ProviderInitializationError] if the provider fails to initialize or times out
def set_provider_and_wait(provider, domain: nil, timeout: 30)
def set_provider_and_wait(provider, domain: nil)
Comment on lines -49 to +59

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

major: removing the timeout parameter is a breaking change. Is this intended?

Copy link
Member

@toddbaert toddbaert Dec 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very good observation! It is a breaking change, but it's a divergence from the spec. I would actually recommend removing it. We instead prefer to have providers handle this themselves, with some kind of timeout, ideally specified in their options/constructors.

Simply adding a ! to the title (feat!: ...) will mark this change as breaking, which is fine since this is a <1.0 SDK.

@provider_mutex.synchronize do
old_provider = @providers[domain]
set_provider_internal(provider, domain: domain, wait_for_init: true)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nitpick: given how set_provider_internal is used (always acquiring mutex for a single call), it almost looks like this method should acquire the mutex internally. This would also make it easier to read that method (don't need to look back at all call sites to confirm that modifying @providers is safe).

It might also use a shorter synchronize block (e.g. calling init_provider inside the synchronize block seems like it can be problematic if provider blocks).

end
end

private

def set_provider_internal(provider, domain:, wait_for_init:)
old_provider = @providers[domain]

begin
old_provider.shutdown if old_provider.respond_to?(:shutdown)
rescue => e
@logger&.warn("Error shutting down previous provider #{old_provider&.class&.name || "unknown"}: #{e.message}")
end
Comment on lines +70 to +74

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor: mentioned in the other comment, it seems like calling user-provided methods is best done outside of synchronize block.


# Remove old provider state to prevent memory leaks
@provider_state_registry.remove_provider(old_provider)

new_providers = @providers.dup
new_providers[domain] = provider
@providers = new_providers

@provider_state_registry.set_initial_state(provider)

provider.attach(ProviderEventDispatcher.new(self)) if provider.is_a?(Provider::EventHandler)

# Capture evaluation context to prevent race condition
context_for_init = @evaluation_context

# Shutdown old provider (ignore errors)
begin
old_provider.shutdown if old_provider.respond_to?(:shutdown)
rescue
# Ignore shutdown errors and continue with provider initialization
if wait_for_init
init_provider(provider, context_for_init, raise_on_error: true)
else
Thread.new do
init_provider(provider, context_for_init, raise_on_error: false)
end
end
Comment on lines +90 to +96

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor: does it make sense to always initialize provider in a new thread and for set_provider_and_wait to wait for PROVIDER_READY event? Otherwise, set_provider_and_wait does not wait for async-initializing providers (which is its major reasons to exist).

end

begin
# Initialize new provider with timeout
if provider.respond_to?(:init)
Timeout.timeout(timeout) do
provider.init
end
end

# Set the new provider
new_providers = @providers.dup
new_providers[domain] = provider
@providers = new_providers
rescue Timeout::Error => e
raise ProviderInitializationError.new(
"Provider initialization timed out after #{timeout} seconds",
provider:,
original_error: e
)
rescue => e
raise ProviderInitializationError.new(
"Provider initialization failed: #{e.message}",
provider:,
original_error: e
)
def init_provider(provider, context, raise_on_error: false)
if provider.respond_to?(:init)
init_method = provider.method(:init)
if init_method.parameters.empty?
provider.init
else
provider.init(context)
end
end

unless provider.is_a?(Provider::EventHandler)
dispatch_provider_event(provider, ProviderEvent::PROVIDER_READY)
end
rescue => e
dispatch_provider_event(provider, ProviderEvent::PROVIDER_ERROR,
error_code: Provider::ErrorCode::PROVIDER_FATAL,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

major: PROVIDER_FATAL seems too harsh as it signal irrecoverable error. The provider might still be trying to initialize and emit PROVIDER_READY later, so non-fatal error is a safer choice.

message: e.message)

Comment on lines +109 to +116
Copy link
Member

@toddbaert toddbaert Dec 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The emission of the READY/ERROR events should occur regardless of whether or not the provider is capable of emitting its own events.

The SDK should always emit either an init or error depending on the normal/abnormal termination of the provider's init method. If the provider has no init method, the SDK should just emit READY at the same point the method would have been called.

This is outlined here in section 5.3.

A bit more background:

The provider can "spontaneously" emit events (for instance, an error event if a persistent connection is lost) but the SDK emits events that correspond to the provider lifecycle and actions the application integrator has initiated (such as the READY event expected after a provider is set and initialized).

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How should this work with asynchronous provider initialization?

I'm not too familiar with Ruby concurrency model, so sorry if my question does not make sense. My understanding is that set_provider should not block the thread (even if initialization takes a while) but set_provider_and_wait should. In languages that have promises, the SDK would wait for the initialization promise to resolve before emitting event, which may be much later that initialization function return.

Ruby doesn't have promises, so the options I see are:

  1. wait for provider event before declaring it ready
  2. let the provider block in initialization method
  3. the current PR is a mix of the two, depending on whether provider is able to emit events

From what I gathered, the second option is preferred because we can't rule out a method blocking the thread and blocking seems very common in Ruby, so it's likely that many providers already implement it this way. I think it makes sense to ask initialize to always block before it's ready/error and always emit events after initialize returns. Now, if I'm correct that we want set_provider to be non-blocking, this means that we need to initialize provider in a new Thread, which seems suboptimal but it's not too bad.

Does this match your expectations?

if raise_on_error
# Re-raise as ProviderInitializationError for synchronous callers
raise ProviderInitializationError.new(
"Provider #{provider.class.name} initialization failed: #{e.message}",
provider:,
error_code: Provider::ErrorCode::PROVIDER_FATAL,
original_error: e
)
end
end

def dispatch_provider_event(provider, event_type, details = {})
@provider_state_registry.update_state_from_event(provider, event_type, details)

# Trigger event handlers
event_details = {
provider:,
provider_name: provider.class.name
}.merge(details)

@event_emitter.trigger_event(event_type, event_details)
end

def provider_state(provider)
@provider_state_registry.get_state(provider)
end

private

class ProviderEventDispatcher
def initialize(config)
@config = config
end

def dispatch_event(provider, event_type, details)
@config.send(:dispatch_provider_event, provider, event_type, details)
end
end
end
end
Expand Down
80 changes: 80 additions & 0 deletions lib/open_feature/sdk/event_emitter.rb
Original file line number Diff line number Diff line change
@@ -0,0 +1,80 @@
# frozen_string_literal: true

require_relative "provider_event"

module OpenFeature
module SDK
# Thread-safe pub-sub for provider events
class EventEmitter
attr_writer :logger

def initialize(logger = nil)
@handlers = {}
@mutex = Mutex.new
@logger = logger
ProviderEvent::ALL_EVENTS.each { |event| @handlers[event] = [] }
end

def add_handler(event_type, handler)
raise ArgumentError, "Invalid event type: #{event_type}" unless valid_event?(event_type)
raise ArgumentError, "Handler must respond to call" unless handler.respond_to?(:call)

@mutex.synchronize do
@handlers[event_type] << handler
end
end

def remove_handler(event_type, handler)
return unless valid_event?(event_type)

@mutex.synchronize do
@handlers[event_type].delete(handler)
end
end

def remove_all_handlers(event_type)
return unless valid_event?(event_type)

@mutex.synchronize do
@handlers[event_type].clear
end
end

def trigger_event(event_type, event_details = {})
return unless valid_event?(event_type)

handlers_to_call = nil
@mutex.synchronize do
handlers_to_call = @handlers[event_type].dup
end

# Call handlers outside of mutex to avoid deadlocks
handlers_to_call.each do |handler|
handler.call(event_details)
rescue => e
@logger&.warn "Event handler failed for #{event_type}: #{e.message}\n#{e.backtrace.join("\n")}"
end
end

def handler_count(event_type)
return 0 unless valid_event?(event_type)

@mutex.synchronize do
@handlers[event_type].size
end
end

def clear_all_handlers
@mutex.synchronize do
@handlers.each_value(&:clear)
end
end

private

def valid_event?(event_type)
ProviderEvent::ALL_EVENTS.include?(event_type)
end
end
end
end
Loading
Loading