Investigate sharing self-hosted compartment across content processes
Categories
(Core :: JavaScript Engine, enhancement, P1)
Tracking
()
Version | Tracking | Status |
---|---|---|
firefox90 | --- | fixed |
People
(Reporter: tcampbell, Assigned: nbp)
References
(Blocks 3 open bugs)
Details
(Whiteboard: [overhead:250k])
Attachments
(3 files)
As part of Fission efforts, we should consider sharing the self-hosted compartment across processes. We already share within the process so this is a good target to look at. We need to figure out what APIs Gecko has and which we should use.
Comment 1•6 years ago
According to my analyzer, memory associated with self-hosted scripts is something like 150kB (including some scripts in the message manager), and the self-hosting-global is about another 300kB. I'm not sure how much of that might be shareable.
Reporter
Comment 2•5 years ago
This has been stalled long enough. I'm taking this bug for ff69 train.
The most minimal approach here now would be to prime the SharedScriptData de-duplicated cache with shared data for the bytecode. This involves splitting SharedScriptData into SharedScriptData and RuntimeScriptData (and taking a word-per-script regression in the meantime).
The bytecode and associated data for the self-hosted code take 220kB. (A 1-word-per-script regression usually shows up as 20kB in our overhead numbers.)
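The split described above can be pictured with a small sketch. The two struct names mirror the ones in the comment, but their fields, the `ScriptDataCache` class, the `netSavingBytes` helper, and the 8-byte word size are assumptions made for illustration, not the actual SpiderMonkey layout. Immutable bytecode lives in a de-duplicated `SharedScriptData`, and each script keeps a `RuntimeScriptData` whose pointer to the shared part is the extra word per script.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical: immutable, shareable part of a script (bytecode, etc.).
struct SharedScriptData {
  std::vector<std::uint8_t> bytecode;
};

// Hypothetical: per-runtime part. The pointer is the extra word per script.
struct RuntimeScriptData {
  std::shared_ptr<const SharedScriptData> shared;
};

// Hypothetical de-duplicating cache keyed by bytecode content.
class ScriptDataCache {
 public:
  // Returns a handle to the single shared copy of `bytecode`,
  // creating that copy on first use.
  RuntimeScriptData intern(const std::vector<std::uint8_t>& bytecode) {
    auto it = cache_.find(bytecode);
    if (it == cache_.end()) {
      auto data = std::make_shared<SharedScriptData>(SharedScriptData{bytecode});
      it = cache_.emplace(bytecode, std::move(data)).first;
    }
    return RuntimeScriptData{it->second};
  }

  std::size_t uniqueEntries() const { return cache_.size(); }

 private:
  std::map<std::vector<std::uint8_t>, std::shared_ptr<const SharedScriptData>> cache_;
};

// Net saving in bytes: shared data removed from a process, minus the
// one-word-per-script cost of the split (word size is an assumption).
std::int64_t netSavingBytes(std::size_t sharedBytes, std::size_t scriptCount,
                            std::size_t wordBytes = 8) {
  return static_cast<std::int64_t>(sharedBytes) -
         static_cast<std::int64_t>(scriptCount * wordBytes);
}
```

Under the 8-byte-word assumption, a 20kB one-word regression corresponds to roughly 2,500 scripts, so the extra word is small next to the 220kB that could be shared.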
I'll also dig into the memory used by the global itself that mccr8 identifies and see if making it lazier makes sense. I know we've hit a few regressions in the last few months due to adding language/library features via self-hosting.
Assignee
Comment 5•3 years ago
An early conservative estimate shows that we can save ~16ms from the startup of each process on Android (Android 8.0, Pixel 2, AArch64) by using a cached selfhosted.xdr instead of parsing the self-hosted code.
This estimate was made with the Bug 1690570 patches by comparing a JS shell run that uses an external file as a cache for the selfhosted.xdr content against a run that parses the self-hosted code, then dividing the difference by the number of tests executed.
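The arithmetic behind that estimate can be written as a one-line helper. This is only a sketch: the function name and parameters are hypothetical, and it assumes each test starts the JS shell once, so the total difference spreads evenly over the tests.

```cpp
#include <cassert>

// Hypothetical helper: average startup time saved per shell start, in ms.
//   totalParseMs:  duration of the test run when self-hosted code is parsed.
//   totalCachedMs: duration of the same run when selfhosted.xdr comes from
//                  an external cache file.
//   testCount:     number of tests executed (assumed: one shell start each).
double savedMsPerStart(double totalParseMs, double totalCachedMs, int testCount) {
  assert(testCount > 0);
  return (totalParseMs - totalCachedMs) / testCount;
}
```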
Comment 6•3 years ago
There's a v8 blog post from 2018 on their approach to reducing memory usage for builtins: https://v8.dev/blog/embedded-builtins
Assignee
Comment 7•3 years ago
(In reply to Andrew McCreight [:mccr8] from comment #6)
> There's a v8 blog post from 2018 on their approach to reducing memory usage for builtins: https://v8.dev/blog/embedded-builtins
So far, our plan is to follow their previous approach: share memory across processes and have each instance instantiate the stencil.
This is not perfect, but it is easier than figuring out cross-compilation issues in our build system, and it keeps the current binary size.
Assignee
Comment 8•3 years ago
Early results show that during Firefox startup, the Parent process and each content process are parsing the self-hosted code 4 times, each parse producing its own Stencil.
With the patches that I am polishing now, the Parent process parses the self-hosted code once and creates a shared memory region into which the content is copied. That shared memory is then used by the Worker thread within the Parent process. It is also successfully transmitted to the Content process, which uses it to decode the self-hosted content twice: once for the main thread and once for a worker thread.
Thus, combined with the work from Arai to borrow content while decoding, this new patch should already be a memory saving with a single Content process, despite the overhead added by shared memory. It should also be a time saving, but I have not measured it yet.
Patches are coming…
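The flow described in this comment can be modelled in a few lines. This is a toy, single-process simulation, not Gecko code: every name is hypothetical, a `std::shared_ptr` to immutable bytes stands in for the shared memory region, and "encoding" is just a byte copy. It only counts how often the expensive parse and the cheaper decode would run.

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-ins; none of these names are Gecko or SpiderMonkey APIs.
using Bytes = std::vector<std::uint8_t>;
using SharedBuffer = std::shared_ptr<const Bytes>;  // models read-only shared memory

struct Counters {
  int parses = 0;
  int decodes = 0;
};

// Models the expensive path: parse the source and encode the stencil.
Bytes parseAndEncode(const std::string& source, Counters& c) {
  ++c.parses;
  return Bytes(source.begin(), source.end());
}

// Models the cheap path: a runtime decodes from the shared bytes.
std::size_t decode(const SharedBuffer& shared, Counters& c) {
  ++c.decodes;
  return shared->size();
}

// The parent main thread parses once and publishes the buffer. The parent
// worker thread and the two runtimes of each content process only decode.
Counters simulateStartup(int contentProcesses) {
  Counters c;
  SharedBuffer shared =
      std::make_shared<Bytes>(parseAndEncode("self-hosted code", c));
  decode(shared, c);  // parent worker thread
  for (int i = 0; i < contentProcesses; ++i) {
    decode(shared, c);  // content main thread
    decode(shared, c);  // content worker thread
  }
  return c;
}
```

In this model the number of parses stays at one however many content processes start, and only the decode count grows.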
Assignee
Comment 9•3 years ago
The JSRuntime already has an API to set the self-hosted content before
initialization. This modification adds a proper API so that the rest of Gecko,
which does not have access to the JSRuntime implementation, can use this
function as well.
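As a rough picture of what such a wrapper looks like, here is a sketch with invented names (`XDRView`, `ToyRuntime`, `SetSelfHostedContent`); it is not the API added by the patch. The point is the shape: an internal setter that is only valid before initialization, and a free function that callers can use without seeing the runtime's implementation.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical non-owning view of encoded self-hosted content.
struct XDRView {
  const std::uint8_t* data;
  std::size_t length;
};

// Hypothetical runtime; stands in for engine internals that the rest of
// the embedding cannot reach directly.
class ToyRuntime {
 public:
  // Accepts content only before initialization; returns false otherwise.
  bool setSelfHostedXDR(XDRView view) {
    if (initialized_ || view.data == nullptr || view.length == 0) {
      return false;
    }
    xdr_ = view;
    return true;
  }

  // Records whether encoded content was available; a real engine would
  // decode it here, or fall back to parsing when none was provided.
  void init() {
    usedCache_ = xdr_.data != nullptr;
    initialized_ = true;
  }

  bool usedCache() const { return usedCache_; }

 private:
  XDRView xdr_{nullptr, 0};
  bool initialized_ = false;
  bool usedCache_ = false;
};

// Hypothetical public wrapper: forwards to the internal setter so callers
// never touch the runtime's private interface.
bool SetSelfHostedContent(ToyRuntime& rt, const std::uint8_t* data,
                          std::size_t length) {
  return rt.setSelfHostedXDR(XDRView{data, length});
}
```

Rejecting calls made after `init()` keeps the ordering requirement ("before initialization") explicit instead of silently ignoring late content.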
Assignee
Comment 10•3 years ago
This modification relies on the shared memory implemented in Bug 1698045 and on
the ability to encode and decode self-hosted content from Bug 1668361 to
optimize the JS engine initialization: the parent process encodes the
self-hosted stencil, so that every other runtime initialization, including
those in content processes, only has to decode it.
Assignee
Comment 11•3 years ago
Comment 12•3 years ago
Pushed by npierron@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/d7bf37fd9355
part 0 - Simplify mozilla::GetBuildId to be safely called from any thread. r=tcampbell,mccr8
https://hg.mozilla.org/integration/autoland/rev/51ff7ca947e9
part 1 - Add an API to set self-hosted XDR content. r=tcampbell
https://hg.mozilla.org/integration/autoland/rev/20bc1a4de242
part 2 - Use shared memory to initialize the JS engine. r=smaug,tcampbell,necko-reviewers
Comment 13•3 years ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/d7bf37fd9355
https://hg.mozilla.org/mozilla-central/rev/51ff7ca947e9
https://hg.mozilla.org/mozilla-central/rev/20bc1a4de242