-
Notifications
You must be signed in to change notification settings - Fork 0
fix: harden supervisor recovery and stuck scan #207
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: holyfuchs/supervisor-fix
Are you sure you want to change the base?
Changes from all commits
8d609a4
8447480
66ccb02
88ae670
9284c7e
eab7ad0
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -81,22 +81,45 @@ access(all) contract FlowYieldVaultsAutoBalancers { | |
| return nil | ||
| } | ||
|
|
||
| /// Checks if an AutoBalancer has at least one active (Scheduled) transaction. | ||
| /// Checks if an AutoBalancer has at least one active internally-managed transaction. | ||
| /// Used by Supervisor to detect stuck yield vaults that need recovery. | ||
| /// | ||
| /// A transaction is considered active when it is: | ||
| /// - still `Scheduled`, or | ||
| /// - already marked `Executed` by FlowTransactionScheduler and still within a bounded | ||
| /// grace period after its scheduled timestamp. | ||
| /// | ||
| /// The second case matters because FlowTransactionScheduler flips status to `Executed` | ||
| /// before the handler actually runs. Without treating that in-flight window as active, | ||
| /// the Supervisor can falsely classify healthy vaults as stuck and recover them twice. | ||
| /// But that window must be bounded: if the handler panics after the optimistic status | ||
| /// update, the vault must eventually become recoverable instead of remaining "active" | ||
| /// forever. | ||
| /// | ||
| /// @param id: The yield vault/AutoBalancer ID | ||
| /// @return Bool: true if there's at least one Scheduled transaction, false otherwise | ||
| /// @return Bool: true if there's at least one active internally-managed transaction, false otherwise | ||
| /// | ||
| access(all) fun hasActiveSchedule(id: UInt64): Bool { | ||
| let autoBalancer = self.borrowAutoBalancer(id: id) | ||
| if autoBalancer == nil { | ||
| return false | ||
| } | ||
|
|
||
| let currentTimestamp = getCurrentBlock().timestamp | ||
| let optimisticExecutionGracePeriod: UFix64 = 10.0 | ||
| let txnIDs = autoBalancer!.getScheduledTransactionIDs() | ||
| for txnID in txnIDs { | ||
| if autoBalancer!.borrowScheduledTransaction(id: txnID)?.status() == FlowTransactionScheduler.Status.Scheduled { | ||
| return true | ||
| if let scheduledTxn = autoBalancer!.borrowScheduledTransaction(id: txnID) { | ||
| if let status = scheduledTxn.status() { | ||
| if status == FlowTransactionScheduler.Status.Scheduled { | ||
| return true | ||
| } | ||
|
|
||
| if status == FlowTransactionScheduler.Status.Executed | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This is the same issue we faced in onflow/FlowYieldVaultsEVM#70 |
||
| && currentTimestamp <= scheduledTxn.timestamp + optimisticExecutionGracePeriod { | ||
| return true | ||
| } | ||
| } | ||
| } | ||
| } | ||
| return false | ||
|
|
@@ -123,7 +146,7 @@ access(all) contract FlowYieldVaultsAutoBalancers { | |
| return false // Not configured for recurring, can't be "stuck" | ||
| } | ||
|
|
||
| // Check if there's an active schedule | ||
| // Check if there's an active schedule or an in-flight due execution | ||
| if self.hasActiveSchedule(id: id) { | ||
| return false // Has active schedule, not stuck | ||
| } | ||
|
|
@@ -226,8 +249,14 @@ access(all) contract FlowYieldVaultsAutoBalancers { | |
| let scheduleCap = self.account.capabilities.storage | ||
| .issue<auth(DeFiActions.Schedule) &DeFiActions.AutoBalancer>(storagePath) | ||
|
|
||
| // Register yield vault in registry for global mapping of live yield vault IDs | ||
| FlowYieldVaultsSchedulerRegistry.register(yieldVaultID: uniqueID.id, handlerCap: handlerCap, scheduleCap: scheduleCap) | ||
| // Register the yield vault in the global scheduler registry. | ||
| // Only recurring vaults participate in the Supervisor's stuck-scan ordering. | ||
| FlowYieldVaultsSchedulerRegistry.register( | ||
| yieldVaultID: uniqueID.id, | ||
| handlerCap: handlerCap, | ||
| scheduleCap: scheduleCap, | ||
| participatesInStuckScan: recurringConfig != nil | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The contract mentions its: Would we ever have an instance where this is not the case?
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes, we could do that. It would make the semantics cleaner: I kept this PR narrower because that would be a broader refactor. Today registration follows the vault lifecycle, not the recurring-config lifecycle, so changing the global registry to recurring-only would mean adding/removing entries whenever recurring config is enabled/disabled, and updating the related admin/recovery flows to match. So I agree your approach is valid, but I’d treat it as a separate design change. In this PR I only made the stuck-scan ordering recurring-only. I’ll update the comments to make that distinction explicit.
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. check this commit : 9284c7e |
||
| ) | ||
|
|
||
| // Start the native AutoBalancer self-scheduling chain if recurringConfig was provided | ||
| // This schedules the first rebalance; subsequent ones are scheduled automatically | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,12 +1,12 @@ | ||
| import "FlowYieldVaultsAutoBalancers" | ||
|
|
||
| /// Returns true if the yield vault/AutoBalancer has at least one active (Scheduled) transaction. | ||
| /// Used to verify that healthy yield vaults maintain their scheduling chain. | ||
| /// Returns true if the yield vault/AutoBalancer has at least one active internally-managed | ||
| /// transaction. Active includes `Scheduled`, plus a recently `Executed` transaction still | ||
| /// within the optimistic-execution grace period. | ||
| /// | ||
| /// @param yieldVaultID: The YieldVault/AutoBalancer ID | ||
| /// @return Bool: true if there's at least one Scheduled transaction, false otherwise | ||
| /// @return Bool: true if there's at least one active internally-managed transaction, false otherwise | ||
| /// | ||
| access(all) fun main(yieldVaultID: UInt64): Bool { | ||
| return FlowYieldVaultsAutoBalancers.hasActiveSchedule(id: yieldVaultID) | ||
| } | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you explain this?
I am not sure I understand.
The status can be Executed even though it is not executed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes,
Executedcan be set before the handler actually runs. In FlowTransactionScheduler, the scheduler marks a tx as Executed optimistically before the handler logic has actually finished running. The contract says this directly here:https://github.com/onflow/flow-core-contracts/blob/27e0eb625ebe056c78cf42d6feaa6ce00a8e06c9/contracts/FlowTransactionScheduler.cdc#L1169-L1186
https://github.com/onflow/flow-core-contracts/blob/27e0eb625ebe056c78cf42d6feaa6ce00a8e06c9/contracts/FlowTransactionScheduler.cdc#L250-L264