Skip to content

Conversation

@MartinquaXD
Copy link
Contributor

@MartinquaXD MartinquaXD commented Jan 6, 2026

Description

Currently the bad token detection assumes that we are perfectly able to detect "broken" orders and only orders that trade specific tokens that a particular solver is not able to handle cause problems. However this assumption does not work well with the increasing complexity of new order types that can suddenly start failing for any number of reasons.
The most prominent recent example were flashloan orders where the EIP 1271 signature verified correctly but transferring the tokens into the settlement contract failed because the user's Aave debt position was not healthy enough.
Our current logic caused a lot of collateral damage because such orders could cause many reasonable tokens to be flagged as unsupported although the tokens themselves were perfectly fine and only that particular order was problematic.

Changes

To address this this PR change the metrics based detection mechanism to only flag on an order by order basis instead flagging all orders trading specific tokens. The change itself is relatively simple (collect metrics keyed by Uid instead of token`) but came with a few related changes:

  • the name bad_token_detection is now incorrect in most (but not all!) cases so many things were renamed
    • this includes a few config parameters so they must be updated in the infra repo as well!
  • caching uids has a lot more potential to bloat the cache so a cache eviction task was introduced, this required 2 new config parameters (max_age, gc_interval)

How to test

  • adjusted existing unit to make sure the metrics logic still works correctly with Uid
  • added a new unit test for the cache eviction

Related issues

Fixes #4019

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

github unfortunately marks this whole file as new when it actually was actually just moved and modified slightly. The new parts are:

  • added last_seen_at
  • added spawn_gc_task (and the associated unit test)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The refactoring/renaming should probably be separated from the logic changes to avoid this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This diff is much better https://www.diffchecker.com/iMNVcpf7/

@m-sz m-sz self-assigned this Jan 7, 2026
@m-sz
Copy link
Contributor

m-sz commented Jan 7, 2026

@m-sz m-sz marked this pull request as ready for review January 7, 2026 14:52
@m-sz m-sz requested a review from a team as a code owner January 7, 2026 14:52
Copy link
Contributor

@jmg-duarte jmg-duarte left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@m-sz
Copy link
Contributor

m-sz commented Jan 8, 2026

Moved bad_token and bad_order detection as submodules to common detector

The public facing api (Quality and the Detector) now reside inside of it. Then its submodules are bad_tokens::simulation and bad_orders::metrics which exactly describe how we rule out the specific offending trades. The rest of the crate uses only the main detector module and detector::Detector struct which makes it less confusing.

Also removed the hand rolled current_unix_timestamp() in favour of now_in_epoch_seconds()

@m-sz
Copy link
Contributor

m-sz commented Jan 8, 2026

I am now left wondering how should the configuration struct be called. It's currently named BadOrderDetectionConfig and configures both the bad token detection based on simulation and bad order detection based on metrics. Calling it simply DetectorConfig could be confusing.

@jmg-duarte
Copy link
Contributor

jmg-duarte commented Jan 8, 2026

I am now left wondering how should the configuration struct be called. It's currently named BadOrderDetectionConfig and configures both the bad token detection based on simulation and bad order detection based on metrics. Calling it simply DetectorConfig could be confusing.

Suggestions:

  1. FaultDetectorConfig
  2. Split the configs into token/order or metrics/simulation (or both) and then create a separate one called DetectionConfigurations — its no longer ambiguous because you can open it and see both explicit ones

@squadgazzz
Copy link
Contributor

I am now left wondering how should the configuration struct be called. It's currently named BadOrderDetectionConfig and configures both the bad token detection based on simulation and bad order detection based on metrics. Calling it simply DetectorConfig could be confusing.

We now have 2 detectors that live in different modules. Can we simply split their configs or do they share some config params?

@m-sz
Copy link
Contributor

m-sz commented Jan 9, 2026

I am now left wondering how should the configuration struct be called. It's currently named BadOrderDetectionConfig and configures both the bad token detection based on simulation and bad order detection based on metrics. Calling it simply DetectorConfig could be confusing.

We now have 2 detectors that live in different modules. Can we simply split their configs or do they share some config params?

They could be split into two if there was not the common hardcoded token statuses:

Here is the overall struct, I'll call it "DetectorConfig" to avoid confusion

pub struct DetectorConfig {
    pub token_supported: HashMap<eth::Address, bool>,
    pub enable_simulation_strategy: bool,
    pub enable_metrics_strategy: bool,
    pub metrics_strategy_failure_ratio: f64,
    pub metrics_strategy_required_measurements: u32,
    pub metrics_strategy_log_only: bool,
    pub metrics_strategy_freeze_time: Duration,
    pub metrics_strategy_gc_interval: Duration,
    pub metrics_strategy_gc_max_age: Duration,
}

It could be split into 2 configs

pub struct BadTokenDetectorConfig {
    pub enable_simulation_strategy: bool,
}
pub struct BardOrderDetectorConfig {
    pub enable_metrics_strategy: bool,
    pub metrics_strategy_failure_ratio: f64,
    pub metrics_strategy_required_measurements: u32,
    pub metrics_strategy_log_only: bool,
    pub metrics_strategy_freeze_time: Duration,
    pub metrics_strategy_gc_interval: Duration,
    pub metrics_strategy_gc_max_age: Duration,
}

with the remaining field: pub token_supported: HashMap<eth::Address, bool> which truly applies to neither, as the overall Detector that combines inside the BadToken and BadOrder short-circuits the check for token quality based on this field.

I am inclined to leave it as-is, keeping the name as BadOrderDetectionConfig since this is truly used at the order level of an auction.

The Competition strategy uses the detector on an per-order basis to ask if it is unsupported. The underlying mechanism makes a decision based either on the token itself or on the specific order, which might be considered an implementation detail. Thus the BadOrderDetector has inside of it an order-level detector based on metrics an a token-level one based on simulation that together provide us the answer if an order is unsupported or not.

Let's keep it as is and move forward with the PR.

@m-sz m-sz requested a review from squadgazzz January 12, 2026 10:54
@m-sz m-sz requested a review from squadgazzz January 12, 2026 12:10
account = "{account}"
merge-solutions = {merge_solutions}
quote-using-limit-orders = {quote_using_limit_orders}
enable-simulation-bad-token-detection = true
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this config still relevant?

Copy link
Contributor

@m-sz m-sz Jan 14, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@m-sz
Copy link
Contributor

m-sz commented Jan 15, 2026

Thanks for the reviews. I will wait for @MartinquaXD to provide his thoughts as he is the original author and merge only then.

Copy link
Contributor Author

@MartinquaXD MartinquaXD left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks alright to me. I like the discussion about proper naming and where to move stuff.
One more nit from my side: detector is a very generic name for the module. risk_detector would give the reader at least some information on what it's supposed to be.
BTW GH doesn't let me approve my own PR.

@m-sz
Copy link
Contributor

m-sz commented Jan 16, 2026

I'll change the detector naming to risk_detector and merge

@m-sz m-sz enabled auto-merge January 16, 2026 12:38
@m-sz m-sz disabled auto-merge January 16, 2026 12:48
@m-sz m-sz added this pull request to the merge queue Jan 16, 2026
Merged via the queue into main with commit 04f4e63 Jan 16, 2026
19 checks passed
@m-sz m-sz deleted the convert-bad-token-to-bad-order-detection branch January 16, 2026 13:25
@github-actions github-actions bot locked and limited conversation to collaborators Jan 16, 2026
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Change bad token detection to bad order detection

5 participants