Skip to content

CodeCache: Accept multiple threads in SaveData#5500

Open
bylaws wants to merge 6 commits into
FEX-Emu:mainfrom
bylaws:multicomp
Open

CodeCache: Accept multiple threads in SaveData#5500
bylaws wants to merge 6 commits into
FEX-Emu:mainfrom
bylaws:multicomp

Conversation

@bylaws

@bylaws bylaws commented May 17, 2026

Copy link
Copy Markdown
Collaborator

Needs some consideration of if this is the cleanest way, but this is certainly the simplest.

neobrain and others added 6 commits May 17, 2026 22:36
This allows mapping the code directly into memory for execution.
…ation

This ensures that any code buffer memory owned by the MappedResource is
invalidated before being deallocated.
A significant proportion of time with the prior approach was spent in
the unbuffered write calls used once per code page/entrypoint; taking
upwards of 15 seconds to serialize a 700MB P3R cache. Switching to an
mmap approach reduces the CPU time there to less than a second, leaving
I/O to dominate.

A two-pass approach is used:
1. Calculate the required size for the cache file, issue a callback to
   frontend to map a cache file of that size.
2. Write cache data into the buffer provided by that callback.
Merges and sorts relocations from all provided threads before
serializing, enabling multi-threaded offline code cache generation.
@Sonicadvance1

Copy link
Copy Markdown
Member

Rebase necessary.

@neobrain neobrain left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you PR an actual implementation of multithreaded writing instead so that there's a meaningful base for discussion? The interface change to SaveData on its own isn't very useful.

Regarding mmap writing, I would prefer this to stay in a PR for now. What we have today is easier to iterate on, and I've got upcoming implementation changes in SaveData that would interfere unnecessarily. If serialization time is critical, simple buffered writing would work as a stop-gap.

@bylaws

bylaws commented May 21, 2026

Copy link
Copy Markdown
Collaborator Author

Regarding mmap writing, I would prefer this to stay in a PR for now. What we have today is easier to iterate on, and I've got upcoming implementation changes in SaveData that would interfere unnecessarily. If serialization time is critical, simple buffered writing would work as a stop-gap.

I think it should be fairly easy to iterate on, like I explicitly did it in a generic way that is trivial to extend without new address logic? I think unless the code quality is meaningfully bad it would be best to merge instead of asking to reinvent the wheel again, delaying it again just means more conflict and more effort for whoever ends up needing to clean it up. And like it makes things 10x faster or smth so may as well take the win.

Could you PR an actual implementation of multithreaded writing instead so that there's a meaningful base for discussion? The interface change to SaveData on its own isn't very useful.
This is dependent on all my existing PRs landing, I can do so but it would mean there's like 5 stacked PRs, if you feel that's best sure I can do

@neobrain

neobrain commented May 22, 2026

Copy link
Copy Markdown
Member

I think it should be fairly easy to iterate on, like I explicitly did it in a generic way that is trivial to extend without new address logic? I think unless the code quality is meaningfully bad it would be best to merge instead of asking to reinvent the wheel again, delaying it again just means more conflict and more effort for whoever ends up needing to clean it up. And like it makes things 10x faster or smth so may as well take the win.

I'm only measuring a 2x improvement (both for small and large caches), which can equally be achieved with simple buffering that doesn't require us to awkwardly split IO logic or to do a double-pass. (My guess is your 10x measurement was made before I addressed the trivial low-hanging fruit in SaveData?)

Could you PR an actual implementation of multithreaded writing instead so that there's a meaningful base for discussion? The interface change to SaveData on its own isn't very useful.
This is dependent on all my existing PRs landing, I can do so but it would mean there's like 5 stacked PRs, if you feel that's best sure I can do

How is it dependent? Surely this can already be illustrated Linux-side with everything being upstream there already?

@bylaws

bylaws commented May 22, 2026

Copy link
Copy Markdown
Collaborator Author

How is it dependent? Surely this can already be illustrated Linux-side with everything being upstream there already?

I have not implemented this for the Linux side and do not have time to. If you want me to pr windows side I can.

I'm only measuring a 2x improvement (both for small and large caches), which can equally be achieved with simple buffering that doesn't require us to awkwardly split IO logic or to do a double-pass. (My guess is your 10x measurement was made before I addressed the trivial low-hanging fruit in SaveData?)

I really don't think it's that awkward to do two passes, 2x improvement is pretty good and meaningful. Especially in the future mmap is much more flexible that just chained writes, and FEX is pretty well acquainted with mmap anyway :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants