Closed
Bug 1224577
Opened 9 years ago
Closed 9 years ago
Leaking chrome windows with basic SDK addon
Categories
(Add-on SDK Graveyard :: General, defect)
Add-on SDK Graveyard
General
Tracking
(firefox42-, firefox43+ fixed, firefox44+ fixed, firefox45+ fixed)
VERIFIED
FIXED
mozilla45
People
(Reporter: bzbarsky, Assigned: mossop)
References
Details
(Whiteboard: [MemShrink])
Attachments
(2 files)
1.01 MB,
application/x-gzip
|
Details | |
779 bytes,
patch
|
bzbarsky
:
review+
lizzard
:
approval-mozilla-aurora+
lizzard
:
approval-mozilla-beta+
lizzard
:
approval-mozilla-release-
|
Details | Diff | Splinter Review |
[Tracking Requested - why for this release]: Because this makes the browser really suck to use. SYMPTOMS: I get 2s or so GC pauses after running the browser for a bit, which obviously makes it a huge pain to use. I haven't figured out yet what causes this. DETAILS: 1) E10S is disabled. Since that's what we're shipping anyway, I figure that's fair game. I don't know whether this happens with e10s enabled. 2) I don't know whether 44 is affected; this started happening in the last several weeks, I think, but I don't know exactly when. 3) GC log says: GC(T+685172.8) Summary - Max Pause: 2106.738ms; MMU 20ms: 0.0%; MMU 50ms: 0.0%; Total: 3434.846ms; Zones: 577 of 577; Compartments: 3322 of 3322; HeapSize: 1734.738 MiB; HeapChange (abs): +4 (4); That particular GC had 107 slices. The max pause is this slice: GC Slice 77 - Pause: 2076.859ms of 10ms budget (@ 2333.459ms); Reason: REFRESH_FRAME; Reset: no; Times: Sweep: 2055.293ms, Other: 0.966ms, Mark During Sweeping: 1965.940ms, Mark Weak: 1865.910ms, Mark Incoming Gray Pointers: 0.284ms, Mark Gray: 86.467ms, Mark Gray and Weak: 13.098ms, Finalize Start Callbacks: 5.840ms, Per-Slice Weak Callback: 2.607ms, Per-Compartment Weak Callback: 3.227ms, Sweep Compartments: 73.772ms, Sweep Discard Code: 44.667ms, Sweep Inner Views: 0.177ms, Sweep Cross Compartment Wrappers: 41.703ms, Sweep Base Shapes: 5.276ms, Sweep Initial Shapes: 20.571ms, Sweep Type Objects: 44.497ms, Sweep Breakpoints: 19.368ms, Sweep Regexps: 0.846ms, Sweep Miscellaneous: 21.691ms, Sweep type information: 0.204ms, Other: 0.133ms, Sweep Object: 8.434ms, Sweep Shape: 0.131ms, Minor GCs to Evict Nursery: 1.888ms, Other: 0.427ms, Mark Roots: 1.461ms, Other: 1.253ms, Mark Rooters: 0.208ms Note the "Mark weak" number. 4) Whenever this happens, about:memory says that I have a single (or at least not annotated as multiple in about:memory afaict) "window(chrome://browser/content/browser.xul)" under "explicit/window-objects/top(none)/detached" that takes up about a gig of memory. Right now it's taking up 1115MB. most of that (841MB) is a "js-compartment([System Principal], about:blank)" (why about:blank??) and most of the stuff in there is either "scripts" or "classes/class(Function)". Also a fair amount of Block, XBL prototype JSClass, Window (20MB??), CSS2Properties, Proxy, Call, With, nsXPCComponents_Interfaces, etc.
Reporter | ||
Comment 1•9 years ago
|
||
The fact that I'm claimed to have 6.19 MB of "classes/class(nsXPCComponents_Interfaces)" under my detached browser.xul makes me doubt that this is in fact a single window we're talking about here.... Especially since if I open things up and look at "classes/class(nsXPCComponents_Interfaces)/shapes/malloc-heap/tree-tables" (which is where most of the memory is) I see a "[323]" after it. Nicholas, is it possible that we only show the numbers in brackets on leaves in about:memory and I'm looking at multiple windows leaked here?
Flags: needinfo?(n.nethercote)
Reporter | ||
Comment 2•9 years ago
|
||
terrence, what does the "Mark Weak" bit in the GC log mean? Is this WeakMap or something else? We need to figure out which code is keeping all this junk alive, so I'm trying to figure out what code we might want to start auditing, and this seems like at least a lead.
Flags: needinfo?(terrence)
Reporter | ||
Comment 3•9 years ago
|
||
Oh, and I can provide GC/CC logs from about:memory on request. Olli and Steve already have them...
Reporter | ||
Comment 4•9 years ago
|
||
Oh, and currently I'm using a 2015-11-05 nightly.
Reporter | ||
Comment 5•9 years ago
|
||
One other note. Consider this bit from about:memory: 5,827.34 MB (100.0%) -- explicit ├──2,104.61 MB (36.12%) -- window-objects │ ├──1,141.97 MB (19.60%) -- top(none) │ │ ├──1,135.52 MB (19.49%) -- detached │ │ │ ├──1,122.66 MB (19.27%) -- window(chrome://browser/content/browser.xul) │ │ │ │ ├────847.20 MB (14.54%) -- js-compartment([System Principal], about:blank) │ │ │ │ │ ├──502.71 MB (08.63%) -- classes │ │ │ │ │ │ ├──218.02 MB (03.74%) -- class(Function) │ │ │ │ │ │ │ ├──204.94 MB (03.52%) -- objects │ │ │ │ │ │ │ │ ├──201.17 MB (03.45%) ── gc-heap [357] Note the [357]. If I open and close a browser window and click "Measure" it becomes [358] (not surprising). If I now click "Minimize memory usage" and then click "Measure" it's still [358]. This part _is_ surprising, and indicates that we're just leaking every single window I open and close. Which certainly explains the suck; I open/close windows a _lot_.
Comment 6•9 years ago
|
||
Could you attach an about:memory report, please? You can use an anonymized one if you prefer.
Whiteboard: [MemShrink]
Comment 7•9 years ago
|
||
bug 1223445 certainly can lead to massive runtime leaks when one has background tabs. One way to test if that is possibly the cause is to iterate through all the tabs by making each of them being foreground for a paint (refresh driver tick + paint is enough to release the leak on each tab). The leak is such that it doesn't show up in cc or gc logs, since cc doesn't end up seeing where the unknown edges are coming from. Though I think this is actually different issue, but I couldn't reproduce when I tried earlier. (I'll look at the gc log today.)
Comment 8•9 years ago
|
||
If you email me some GC logs I can try to analyze what exactly is in the heap, or I can get a copy from Olli later today. Looking through those can let you see exactly what scripts are alive. One big change to how we GC XPConnect stuff that landed recently is bug 1214961.
Reporter | ||
Comment 9•9 years ago
|
||
Reporter | ||
Comment 10•9 years ago
|
||
> bug 1223445 certainly can lead to massive runtime leak I don't see how bug 1223445 would cause me to leak a window that I just open and close, right? But that's what I see happening in comment 5. > One big change to how we GC XPConnect stuff that landed recently is bug 1214961. Hmm. That's plausibly the right timeframe...
Reporter | ||
Comment 11•9 years ago
|
||
> If you email me some GC logs
Very funny. They're 300MB _after_ compression. :) Ping me on IRC and I'll temporarily put them somewhere where you can download them.
Comment 12•9 years ago
|
||
FWIW, there is at least the leak where 0x1e41e4e00 [rc=2] FragmentOrElement (xhtml) input class='newtab-control newtab-control-block' (orphan) about:newtab 0x1e41e4c00 [rc=2] FragmentOrElement (xhtml) input class='newtab-control newtab-control-pin' (orphan) about:newtab tend to be the roots.
Comment 13•9 years ago
|
||
https://dxr.mozilla.org/mozilla-central/source/js/src/jsgc.cpp#4121 It could be weakmaps, watchpoints, debugger, or jitcode global tables (whatever those are). My money is on weakmap. If you could reproduce in rr, it shouldn't be hard to break here and figure out exactly which table it is, then narrow it down to the exact script allocation.
Flags: needinfo?(terrence)
Comment 14•9 years ago
|
||
(In reply to Andrew McCreight [:mccr8] from comment #8) > If you email me some GC logs I can try to analyze what exactly is in the > heap, or I can get a copy from Olli later today. Looking through those can > let you see exactly what scripts are alive. > > One big change to how we GC XPConnect stuff that landed recently is bug > 1214961. I'd expect to see a long FinalizeStartCallback phase in that case, rather than a long weak marking phase.
Reporter | ||
Comment 15•9 years ago
|
||
> If you could reproduce in rr,
Not trivial given that this is something that just randomly happens during my normal browsing and I'm on Mac. :(
Comment 16•9 years ago
|
||
(In reply to Boris Zbarsky [:bz] from comment #1) > is it possible that we only show the numbers in brackets on leaves in > about:memory and I'm looking at multiple windows leaked here? Yeah, it certainly sounds like we are leaking chrome windows.
Flags: needinfo?(n.nethercote)
Comment 17•9 years ago
|
||
Based on comment 12, some change to the new tab code may be involved. Bug 1204361 is the only change that I could see in the last month to the new tab code. The New Tab Override addon was also updated 10/28. I'll see if I can reproduce.
Comment 18•9 years ago
|
||
I can reproduce this locally. Install the New Tab Override addon. Disable e10s. Restart. Open and close some windows, maybe loading stuff maybe not. Now you have some top(none)/detached/window(chrome://browser/content/browser.xul).
Summary: Some sort of memory leak in nightly keeps leading to huge GC pauses → Leaking chrome windows with New Tab Override addon
Reporter | ||
Comment 19•9 years ago
|
||
The fun part is that afaict this addon just calls some addon sdk stuff and one of our JSMs in totally innocuous ways. So chances are, this is a leak in the addon sdk.
Comment 20•9 years ago
|
||
I backed out bug 1204361 locally and I can still reproduce a leak so that's not the issue.
Comment 21•9 years ago
|
||
Boris has reduced this to a very tiny Addon SDK addon.
Product: Core → Add-on SDK
Version: Trunk → unspecified
Comment 22•9 years ago
|
||
I'm going to try bisecting it to see if it is really a regression.
Reporter | ||
Comment 23•9 years ago
|
||
OK, so I tried changing the addon in question so index.js just contains this: exports.main = () => {}; exports.onUnload = () => {}; and I still see leaks. I didn't try changign its bootstrap.js because all that seems like total boilerplate. With the addon thus modified and installed, the steps to reproduce the leak are pretty simple: 1) Start browser. 2) Open about:memory. 3) Open a new window. 4) Close that new window. 5) Repeat steps 3 and 4 five more times. 6) Click "Minimize memory". 7) Click "Measure". Then expand "explicit", expand "window-objects", expand "top(none)/detached", expand "js-compartment([System Principal], about:blank)", expand "classes", expand "class(Function)", expand "objects". Note the [6] after both entries under "objects", indicating that there are 6 detached windows still alive. These are the windows we opened and closed. Dave, do you know who might be able to look at this on the SDK end?
Flags: needinfo?(dtownsend)
Summary: Leaking chrome windows with New Tab Override addon → Leaking chrome windows with basic SDK addon
Comment 24•9 years ago
|
||
I can still reproduce this same issue with a May 3 Nightly build of Firefox, using the New Tab Override addon. Maybe a change in the addon caused the problem?
Reporter | ||
Comment 25•9 years ago
|
||
It's also possible that the problem has been there all along and I just finally noticed it. I would not have started using this addon before July sometime (because the pref worked before then) and I was out for most of August...
Comment 26•9 years ago
|
||
> I open things up and look at
> "classes/class(nsXPCComponents_Interfaces)/shapes/malloc-heap/tree-tables"
> (which is where most of the memory is) I see a "[323]" after it. Nicholas,
> is it possible that we only show the numbers in brackets on leaves in
> about:memory and I'm looking at multiple windows leaked here?
Yes. The "[323]" annotations indicate duplicate reports and they only show up on leaves and you're almost certainly seeing evidence of multiple windows.
Assignee | ||
Comment 27•9 years ago
|
||
Well this was dumb...
Assignee: nobody → dtownsend
Flags: needinfo?(dtownsend)
Attachment #8688623 -
Flags: review?(bzbarsky)
Reporter | ||
Comment 28•9 years ago
|
||
Comment on attachment 8688623 [details] [diff] [review] patch r=me, though I'm hardly an expert on this code. The old thing is clearly wrong, though. ;) This code looks like it was part of the bug 1139189 uplift, right? So are we leaking like this in release builds right now? :(
Flags: needinfo?(dtownsend)
Attachment #8688623 -
Flags: review?(bzbarsky) → review+
Reporter | ||
Comment 29•9 years ago
|
||
Or put another way, which branches do we need to backport this to?
Reporter | ||
Comment 30•9 years ago
|
||
I did confirm this fixes the leak per the steps in comment 23.
Assignee | ||
Comment 32•9 years ago
|
||
(In reply to Boris Zbarsky [:bz] from comment #29) > Or put another way, which branches do we need to backport this to? We should uplift this to every branch we can :(
Flags: needinfo?(dtownsend)
Reporter | ||
Comment 33•9 years ago
|
||
Well, we don't need this on ESR38, right? [Tracking Requested - why for this release]: Leaks are bad. Very very bad.
Comment 34•9 years ago
|
||
Is this the same as bug 1142597?
Reporter | ||
Comment 35•9 years ago
|
||
Seems likely, yes.
Comment 36•9 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/924aca2113a4
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla45
Reporter | ||
Comment 38•9 years ago
|
||
Dave, are you planning to request branch approvals?
Flags: needinfo?(dtownsend)
Assignee | ||
Comment 39•9 years ago
|
||
Comment on attachment 8688623 [details] [diff] [review] patch Approval Request Comment [Feature/regressing bug #]: Bug 1130529 [User impact if declined]: Any users with SDK based add-ons will leak every opened browser window until shutdown or the add-ons get disabled. [Describe test coverage new/current, TreeHerder]: On m-c, we don't have any test coverage for this specific kind of leak though. [Risks and why]: Very low risk, this fixes what is essentially a typo. [String/UUID change made/needed]: None.
Flags: needinfo?(dtownsend)
Attachment #8688623 -
Flags: approval-mozilla-release?
Attachment #8688623 -
Flags: approval-mozilla-beta?
Attachment #8688623 -
Flags: approval-mozilla-aurora?
Comment 40•9 years ago
|
||
Comment on attachment 8688623 [details] [diff] [review] patch Fix for memory leak, please uplift to aurora and beta.
Attachment #8688623 -
Flags: approval-mozilla-beta?
Attachment #8688623 -
Flags: approval-mozilla-beta+
Attachment #8688623 -
Flags: approval-mozilla-aurora?
Attachment #8688623 -
Flags: approval-mozilla-aurora+
Comment 41•9 years ago
|
||
bugherder uplift |
https://hg.mozilla.org/releases/mozilla-aurora/rev/09bfa4e72dbf
status-firefox44:
--- → fixed
Tracked bcuz it's a memory leak issue, good to see that it's fixed already.
Bz, can you please verify that this issue is fixed as expected on a latest Nightly build? Thanks! :)
Flags: needinfo?(bzbarsky)
Reporter | ||
Comment 45•9 years ago
|
||
I'm not seeing the leaked detached windows, at least. We'll see how the gc pauses do.
Flags: needinfo?(bzbarsky)
Verified as per comment 45.
Status: RESOLVED → VERIFIED
Comment 48•9 years ago
|
||
Comment on attachment 8688623 [details] [diff] [review] patch This uplift to m-r was requested a couple of weeks ago, i.e. for 43, but m-r is now 43. It is too late to take this in 42.
Attachment #8688623 -
Flags: approval-mozilla-release? → approval-mozilla-release-
Updated•9 years ago
|
You need to log in
before you can comment on or make changes to this bug.
Description
•