Conversation

@TheBlueMatt (Collaborator) commented Oct 7, 2025

Reading `ChannelMonitor`s on startup is one of the slowest parts of
LDK initialization. Now that we have an async `KVStore`, there's no
need for that: we can simply parallelize their loading, which we do
here.

Sadly, because Rust futures are pretty unergonomic, we have to add
some `unsafe {}` here, but arguing that it's fine is relatively
straightforward.

Based on #4146 #4175, just because I don't want to resolve conflicts.
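
For illustration only, a minimal sketch of the idea under a hypothetical async store (the `Store` type, its `read` method, and the key handling below are stand-ins, not LDK's actual `KVStore` API): issue every monitor read up front and await them together, so slow or remote I/O overlaps instead of happening one read at a time.

// Hypothetical async store; stands in for an async `KVStore` backend.
use futures::future::join_all;

struct Store;

impl Store {
    async fn read(&self, key: &str) -> Result<Vec<u8>, std::io::Error> {
        // Pretend this round-trips to (possibly remote) storage.
        Ok(key.as_bytes().to_vec())
    }
}

async fn read_all_monitors(
    store: &Store, keys: &[String],
) -> Result<Vec<Vec<u8>>, std::io::Error> {
    // Start every read before awaiting any of them, then wait for the batch.
    let pending = keys.iter().map(|key| store.read(key));
    join_all(pending).await.into_iter().collect()
}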

@ldk-reviews-bot commented Oct 7, 2025

👋 Thanks for assigning @jkczyz as a reviewer!
I'll wait for their review and will help manage the review process.
Once they submit their review, I'll check if a second reviewer would be helpful.

codecov bot commented Oct 7, 2025

Codecov Report

❌ Patch coverage is 47.58065% with 65 lines in your changes missing coverage. Please review.
✅ Project coverage is 86.50%. Comparing base (765e43d) to head (986a3ca).
⚠️ Report is 35 commits behind head on main.

Files with missing lines             Patch %   Lines
lightning/src/util/persist.rs        35.82%    38 Missing and 5 partials ⚠️
lightning/src/util/native_async.rs   56.25%    14 Missing ⚠️
lightning-block-sync/src/gossip.rs    0.00%     5 Missing ⚠️
lightning/src/util/async_poll.rs     85.00%     3 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #4147      +/-   ##
==========================================
- Coverage   89.36%   86.50%   -2.87%     
==========================================
  Files         180      158      -22     
  Lines      139366   101937   -37429     
  Branches   139366   101937   -37429     
==========================================
- Hits       124547    88177   -36370     
+ Misses      12227    11336     -891     
+ Partials     2592     2424     -168     
Flag      Coverage Δ
fuzzing   36.15% <0.00%> (+0.97%) ⬆️
tests     85.80% <47.58%> (-2.91%) ⬇️

Flags with carried forward coverage won't be shown.

@ldk-reviews-bot ldk-reviews-bot requested a review from jkczyz October 7, 2025 23:28
@TheBlueMatt TheBlueMatt force-pushed the 2025-10-parallel-reads branch from 4a773c9 to 31fbd16 on October 8, 2025 10:54
@TheBlueMatt TheBlueMatt marked this pull request as draft October 8, 2025 11:40
@TheBlueMatt TheBlueMatt removed the request for review from jkczyz October 8, 2025 11:40
@TheBlueMatt (Collaborator, Author) commented

This doesn't actually work. Monitor loading is (at least for a semi-local disk) CPU-bound, so we really need to spawn each monitor load as its own task rather than running them all in one task.
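
In other words (an illustrative tokio sketch, not the PR's code; `read_bytes` and `deserialize` are stand-ins): a single task awaiting many read futures only overlaps the I/O, while spawning one task per monitor lets the CPU-heavy deserialization run on multiple worker threads.

// Stand-ins for the async read and the CPU-bound `ChannelMonitor` decode.
async fn read_bytes(_key: &str) -> Vec<u8> { Vec::new() }
fn deserialize(bytes: Vec<u8>) -> usize { bytes.len() }

async fn load_all(keys: Vec<String>) -> Vec<usize> {
    // One tokio task per monitor; on a multi-threaded runtime each decode can
    // run on its own worker thread rather than serially within a single task.
    let handles: Vec<_> = keys
        .into_iter()
        .map(|key| tokio::spawn(async move { deserialize(read_bytes(&key).await) }))
        .collect();
    let mut monitors = Vec::new();
    for handle in handles {
        monitors.push(handle.await.expect("spawned task panicked"));
    }
    monitors
}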

@TheBlueMatt TheBlueMatt force-pushed the 2025-10-parallel-reads branch 3 times, most recently from 64412a5 to 2c807b9 on October 9, 2025 17:58
@TheBlueMatt TheBlueMatt marked this pull request as ready for review October 9, 2025 19:54
@TheBlueMatt TheBlueMatt force-pushed the 2025-10-parallel-reads branch from 2c807b9 to 0a2547a on October 9, 2025 19:54
@TheBlueMatt (Collaborator, Author) commented

This requires GATs, which in turn requires an MSRV bump (to 1.64), but we're planning on doing that soon anyway.
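
For context, a minimal illustration of the kind of generic associated type involved (the trait below is made up for the example, not LDK's actual `KVStore`): an async-read trait whose returned future borrows from `self` needs a lifetime parameter on the associated type, which only GATs allow.

use core::future::Future;

// Illustrative only: the associated future type carries a lifetime so `read`
// can return a future that borrows `self` and `key` without boxing.
trait AsyncRead {
    type ReadFuture<'a>: Future<Output = Vec<u8>> + 'a
    where
        Self: 'a;

    fn read<'a>(&'a self, key: &'a str) -> Self::ReadFuture<'a>;
}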

@ldk-reviews-bot ldk-reviews-bot requested a review from jkczyz October 9, 2025 19:55
@TheBlueMatt TheBlueMatt force-pushed the 2025-10-parallel-reads branch from 0a2547a to a4a778c on October 12, 2025 23:02
@ldk-reviews-bot commented

🔔 1st Reminder

Hey @jkczyz! This PR has been waiting for your review.
Please take a look when you have a chance. If you're unable to review, please let us know so we can find another reviewer.

@TheBlueMatt TheBlueMatt force-pushed the 2025-10-parallel-reads branch from a4a778c to 3346cba on October 13, 2025 13:27
@TheBlueMatt TheBlueMatt force-pushed the 2025-10-parallel-reads branch from 15706ef to d607625 on December 11, 2025 15:45
@TheBlueMatt TheBlueMatt self-assigned this Dec 11, 2025

Comment on lines 1117 to 1127
log_error!(
self.logger,
"Monitor update failed. monitor: {} update: {} reason: {:?}",
monitor_key,
update_name.as_str(),
e
);
Contributor commented:

Indentation looks off.

@TheBlueMatt (Collaborator, Author) replied:

Take it up with rustfmt 🤷‍♂️

@ldk-reviews-bot commented

👋 The first review has been submitted!

Do you think this PR is ready for a second reviewer? If so, click here to assign a second reviewer.

@TheBlueMatt TheBlueMatt added this to the 0.3 milestone Jan 5, 2026
@TheBlueMatt TheBlueMatt force-pushed the 2025-10-parallel-reads branch from f2b14fb to fa48356 on January 5, 2026 18:53
@TheBlueMatt TheBlueMatt requested a review from jkczyz January 5, 2026 18:53
@TheBlueMatt (Collaborator, Author) commented

Tagging 0.3 since we've had complaints about startup time with remote persistence (which appears to be becoming more common).

@jkczyz (Contributor) left a comment

LGTM. Please squash.

@TheBlueMatt TheBlueMatt requested a review from jkczyz January 5, 2026 21:28
@TheBlueMatt TheBlueMatt force-pushed the 2025-10-parallel-reads branch 2 times, most recently from 82cbb76 to 8873d1c on January 5, 2026 21:30
@TheBlueMatt (Collaborator, Author) commented

Squashed and pushed, only reverting the change that made `FutureQueueCompletion` private, as it's actually exported as an associated type and so has to be pub(crate), like the utility that uses it.

@TheBlueMatt TheBlueMatt force-pushed the 2025-10-parallel-reads branch from 8873d1c to 5444e76 on January 5, 2026 21:32
@TheBlueMatt (Collaborator, Author) commented

Oops, no, it actually has to be pub due to a Rust bug :( It's test-only though.

@TheBlueMatt TheBlueMatt force-pushed the 2025-10-parallel-reads branch from 5444e76 to b47ea70 on January 5, 2026 21:41
@TheBlueMatt (Collaborator, Author) commented Jan 5, 2026

Also silenced clippy:

$ git diff-tree -U1 5444e7613 986a3caa8
diff --git a/lightning/src/util/native_async.rs b/lightning/src/util/native_async.rs
index 32c6049f3b..0c380f2b1d 100644
--- a/lightning/src/util/native_async.rs
+++ b/lightning/src/util/native_async.rs
@@ -82,3 +82,10 @@ impl<O> Clone for FutureQueueCompletion<O> {
 	fn clone(&self) -> Self {
-		Self(self.0.clone())
+		#[cfg(all(test, feature = "std"))]
+		{
+			Self(Arc::clone(&self.0))
+		}
+		#[cfg(all(test, not(feature = "std")))]
+		{
+			Self(Rc::clone(&self.0))
+		}
 	}

`tokio::spawn` can be used both to spawn a forever-running
background task or to spawn a task which gets `poll`ed
independently and eventually returns a result which the callsite
wants.

In LDK, we have only ever needed the first, and thus didn't bother
defining a return type for `FutureSpawner::spawn`. However, in the
next commit we'll start using `FutureSpawner` in a context where we
actually do want the spawned future's result. Thus, here, we add a
result output to `FutureSpawner::spawn`, mirroring the
`tokio::spawn` API.
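
Roughly, the shape of the change (a hedged sketch; the trait and type names below are illustrative rather than LDK's actual definitions): `spawn` now returns a handle that can be awaited for the spawned future's output, much like tokio's `JoinHandle`.

use core::future::Future;

// Sketch of a spawner whose `spawn` yields an awaitable handle for the result.
trait Spawner {
    type Handle<O: Send + 'static>: Future<Output = O> + Send;

    fn spawn<F, O>(&self, future: F) -> Self::Handle<O>
    where
        F: Future<Output = O> + Send + 'static,
        O: Send + 'static;
}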
`MonitorUpdatingPersister::read_all_channel_monitors_with_updates`
was made to do the IO operations in parallel in a previous commit;
however, in practice this doesn't provide material parallelism for
large routing nodes. Because deserializing `ChannelMonitor`s is the
bulk of the work (when IO operations are sufficiently fast), we end
up blocked in single-threaded work nearly the entire time.

Here, we add an alternative option - a new
`read_all_channel_monitors_with_updates_parallel` method which uses
the `FutureSpawner` to cause the deserialization operations to
proceed in parallel.
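
A hedged sketch of how such a method might drive the deserialization through a spawner like the one sketched earlier (again, `fetch_and_decode` and the method shape are illustrative, not the actual implementation):

// Uses the `Spawner` trait from the sketch above. `fetch_and_decode` stands in
// for reading one monitor (plus its updates) and deserializing it; the heavy
// decode happens inside the spawned task.
async fn fetch_and_decode(key: String) -> u64 {
    key.len() as u64
}

async fn read_all_parallel<S: Spawner>(spawner: &S, keys: Vec<String>) -> Vec<u64> {
    // Spawn one task per monitor so the decodes proceed in parallel...
    let handles: Vec<_> =
        keys.into_iter().map(|key| spawner.spawn(fetch_and_decode(key))).collect();
    // ...then await the handles to collect the results in order.
    let mut monitors = Vec::new();
    for handle in handles {
        monitors.push(handle.await);
    }
    monitors
}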
When reading `ChannelMonitor`s from a `MonitorUpdatingPersister` on
startup, we have to make sure to load any `ChannelMonitorUpdate`s
and re-apply them as well. For users of async persistence who don't
have any `ChannelMonitorUpdate`s (e.g. because they set
`maximum_pending_updates` to 0 or, in the future, we avoid
persisting updates for small `ChannelMonitor`s), this means two
round-trips to the storage backend, one to load the
`ChannelMonitor` and one to try to read the next
`ChannelMonitorUpdate` only to have it fail.

Instead, here, we use `KVStore::list` to fetch the list of stored
`ChannelMonitorUpdate`s, which for async `KVStore` users allows us
to parallelize the list of update fetching and the
`ChannelMonitor` loading itself. Then we know exactly when to stop
reading `ChannelMonitorUpdate`s, including reading none if there
are none to read. This also avoids relying on `KVStore::read`
correctly returning `NotFound` in order to correctly discover when
to stop reading `ChannelMonitorUpdate`s.
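
A rough sketch of the approach under a hypothetical store API (the `list` and `read` methods below are illustrative shapes, not LDK's exact signatures): list the update keys that actually exist under a monitor's namespace, then read exactly those, so there is no probing for a "next" update that may not exist.

// Hypothetical store; keys under `monitor_key/` are the stored
// `ChannelMonitorUpdate`s (e.g. "1", "2", "3").
struct Store;

impl Store {
    async fn list(&self, _namespace: &str) -> Vec<String> { vec!["1".into(), "2".into()] }
    async fn read(&self, key: &str) -> Vec<u8> { key.as_bytes().to_vec() }
}

async fn read_updates(store: &Store, monitor_key: &str) -> Vec<Vec<u8>> {
    // Listing tells us exactly which updates exist, including "none at all".
    let update_ids = store.list(monitor_key).await;
    let mut updates = Vec::new();
    for id in update_ids {
        updates.push(store.read(&format!("{monitor_key}/{id}")).await);
    }
    updates
}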
When reading `ChannelMonitor`s from a `MonitorUpdatingPersister` on
startup, we have to make sure to load any `ChannelMonitorUpdate`s
and re-apply them as well. Now that we know which
`ChannelMonitorUpdate`s to load from `list`ing the entries from the
`KVStore` we can parallelize the reads themselves, which we do
here.

Now, loading all `ChannelMonitor`s from an async `KVStore` requires
only three full RTTs - one to list the set of `ChannelMonitor`s,
one to both fetch the `ChannelMonitor` and list the set of
`ChannelMonitorUpdate`s, and one to fetch all the
`ChannelMonitorUpdate`s (with the last one skipped when there are
no `ChannelMonitorUpdate`s to read).
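
For the second round trip described above, a tiny illustrative sketch (stub functions, not the real API) of fetching the monitor bytes and listing its update keys concurrently:

// Stubs standing in for the monitor read and the update-key listing.
async fn read_monitor(_key: &str) -> Vec<u8> { Vec::new() }
async fn list_update_keys(_key: &str) -> Vec<String> { Vec::new() }

// Both requests go out together, so they cost one round trip instead of two.
async fn monitor_and_update_keys(key: &str) -> (Vec<u8>, Vec<String>) {
    futures::join!(read_monitor(key), list_update_keys(key))
}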
This is really dumb, `assert!(cfg!(fuzzing))` is a perfectly
reasonable thing to write!
@TheBlueMatt TheBlueMatt force-pushed the 2025-10-parallel-reads branch from b47ea70 to 986a3ca on January 5, 2026 21:42
@ldk-reviews-bot commented

✅ Added second reviewer: @wpaulino
