Closed
Bug 1232549
Opened 9 years ago
Closed 8 years ago
Win7/Win8 mochitest-2 permaleak of AsyncTransactionTrackersHolder, CompositableChild, CondVar, Mutex, PCompositableChild, ... on mozilla-aurora 45
Categories
(Core :: Graphics, defect, P3)
Core
Graphics
Tracking
()
RESOLVED
DUPLICATE
of bug 1252677
Tracking | Status | |
---|---|---|
e10s | + | --- |
People
(Reporter: philor, Assigned: nical)
References
Details
(Keywords: memory-leak, Whiteboard: gfx-noted [MemShrink:P2][e10s-orangeblockers])
This is keeping mozilla-aurora closed after the merge. Win7: https://treeherder.mozilla.org/logviewer.html#?job_id=1620646&repo=mozilla-aurora Win8: https://treeherder.mozilla.org/logviewer.html#?job_id=1619392&repo=mozilla-aurora <mccr8> philor : and of that, the PImageContainerChild and PTextureChild are probably the most interesting ones Thus filing in gfx.
Comment 1•9 years ago
|
||
As philor pointed out, the most obvious change from Nightly in graphics code is the driver crash guard: http://mxr.mozilla.org/mozilla-central/source/gfx/src/DriverCrashGuard.cpp#59 I think the first thing to do would be to check if making that return in Initialize() happen unconditionally fixes the leak. Could you look into this, David? At least to establish that the crash guard isn't at fault. Thanks.
Flags: needinfo?(dvander)
Doubt it's this, but sure, it's worth a check. :roc may be on PTO, :sotaro is the other person that was involved in this code, but the fact that it fails only on aurora build is strange.
Flags: needinfo?(sotaro.ikeda.g)
Flags: needinfo?(roc)
Comment 3•9 years ago
|
||
I did a try push with the early return in the crash guard removed: https://treeherder.mozilla.org/#/jobs?repo=try&revision=b2e4a65add78 We'll see what happens there.
Comment 4•9 years ago
|
||
It looks like the leak is happening while running dom/media/tests/mochitest/ipc/test_ipc.html, so maybe there's something odd about how the child process is set up for that test.
Comment 5•9 years ago
|
||
M2 is green in my try run, so the leak does seem to be triggered by crash guard.
This code has been in at least one release (it landed in August) and has not changed since it first landed, other than to add that #ifdef in early September. So I'm not sure why it would suddenly be blamed for a leak. It doesn't really hold any meaningful references and it's a stack class. It can be attached to a ContentParent, but it's held in a UniquePtr and that shouldn't leak ContentChilds anyway?
Flags: needinfo?(dvander)
Reporter | ||
Comment 7•9 years ago
|
||
The other thing holding aurora closed is now fixed, so this is the only remaining blocker.
Comment 9•9 years ago
|
||
(In reply to David Anderson [:dvander] from comment #6) > This code has been in at least one release (it landed in August) and has not > changed since it first landed, other than to add that #ifdef in early > September. So I'm not sure why it would suddenly be blamed for a leak. I only added code that was able to detect smaller leaks like this in 44. As you can see from bug 1219916, this leak also showed up when 44 merged to Aurora. I added a suppression to 44 so we could reopen the tree, and then the bug just sat there for five weeks. So it is possible this leak was present and unnoticed since it initially landed. We can add a suppression again so the tree can be reopened, but it would be good if somebody actually looked into fixing the leak this time around. > It doesn't really hold any meaningful references and it's a stack class. It can > be attached to a ContentParent, but it's held in a UniquePtr and that > shouldn't leak ContentChilds anyway? The output is a little hard to read due to various other graphics leaks (bug 1215265), but what is leaking here is: 4 CompositableChild, 4 AsyncTransactionTrackersHolder, 4 CondVar, 8 Mutex, 4 PCompositableChild, 4 PImageContainerChild, 4 PTextureChild, 4 SharedMemory, 4 TextureChild, 4 TextureData, and 12 WeakReference<MessageListener>. Note that there is more leaking now than when I previously added an Aurora 44 suppression in bug 1219919. Not really knowing anything about this code, I'd guess that maybe this code causes some other graphics code to get run in the middle of shutdown or something so that other graphics stuff does not get shut down properly. This is only showing up in test_ipc.html, not actual e10s tests, and it shims up parent process stuff in some odd way so maybe something is wrong with that (like maybe the content process expects the parent to send some message before it does cleanup): http://mxr.mozilla.org/mozilla-central/source/dom/media/tests/mochitest/ipc/test_ipc.html?force=1
Comment 10•9 years ago
|
||
(In reply to Andrew McCreight [:mccr8] from comment #9) > This is only showing up in test_ipc.html, not actual e10s tests I just remembered that we unfortunately do not run Windows debug M-e10s tests, so it is possible that we're always leaking these texture things in Windows content processes.
Comment 11•9 years ago
|
||
I take a look.
Assignee: nobody → sotaro.ikeda.g
Flags: needinfo?(sotaro.ikeda.g)
Updated•9 years ago
|
Comment 12•8 years ago
|
||
https://treeherder.mozilla.org/#/jobs?repo=try&revision=b85fec81c026&selectedJob=14733290 From the try push, DriverCrashGuard seems not related to the problem.
Comment 13•8 years ago
|
||
(In reply to Sotaro Ikeda [:sotaro] from comment #12) > From the try push, DriverCrashGuard seems not related to the problem. Oh, wow, I totally misunderstood the results for my try push in comment 3. That was a Nightly push with Crashguard enabled, and it didn't leak. (And I guess your push is an Aurora push with crash guard disabled, and it still leaks.) Sorry for my confusion!
Comment hidden (Intermittent Failures Robot) |
Comment 15•8 years ago
|
||
(In reply to Andrew McCreight [:mccr8] from comment #13) > (And I guess your push is an Aurora push with crash guard disabled, and it still > leaks.) Sorry for my confusion! Yes. No problem.
Comment 16•8 years ago
|
||
Hmm, bug 1215265 seems to solve the root problem.
Updated•8 years ago
|
Assignee: sotaro.ikeda.g → nobody
:nical is going to check bug 1215265 (we're thin on the ground in media because of PTOs), but if it turns out that this is "just" a shutdown leak, I'd suggest we let aurora go out with it, and deal with this bug once everybody's back.
Flags: needinfo?(roc)
Comment 18•8 years ago
|
||
(In reply to Milan Sreckovic [:milan] from comment #17) > :nical is going to check bug 1215265 (we're thin on the ground in media > because of PTOs), but if it turns out that this is "just" a shutdown leak, > I'd suggest we let aurora go out with it, and deal with this bug once > everybody's back. talked to nical on irc: and--> it looks like a mix of media resources held alive past the gfx shutdown and some testing that was made with a different configuration on central 03:52 < nical> sotaro says one of roc's patches in another bug fixes it (which confirms that it's just shutdown stuff) but the patch in question is blocked on some non trivial IPDL stuff I haven't figured out and also nical still is working on fixing this leak, so i will reopen aurora for the approvals
Comment 19•8 years ago
|
||
I think this is an e10s-only leak, so it probably isn't a huge deal to just let it ride the trains. I'll add a leak suppression for Aurora so the tree will be green. We really need to enable e10s debug tests on Windows and fix these leaks before we ship e10s, though.
Updated•8 years ago
|
Whiteboard: gfx-noted
Landed the leak suppression patch from bug 1219919 on aurora. We'll see if that clears everything up.
Comment 21•8 years ago
|
||
This only shows up on Aurora because content process leak checking does not work properly in Windows Nightly (bug 1219369) due to a tighter sandbox.
Let's not lose track of this. Andrew, how do we simulate in local builds the setup that Aurora eventually hits? Wait for bug 1219369 to land?
Assignee: nobody → nical.bugzilla
Flags: needinfo?(continuation)
Reporter | ||
Comment 23•8 years ago
|
||
The full Aurora simulation is applying https://hg.mozilla.org/try/rev/0a92e967df4b but you can get 99% of the difference by just changing the a1 to a2 in browser/config/version.txt and config/milestone.txt - for debug builds the only other difference is branding, which won't matter here.
Comment 24•8 years ago
|
||
I think this is just a side effect of the issue where we never shut down CompositorChild or ImageBridgeChild in the child process, and we probably just hold onto a little more stuff on Windows for something. I think I successfully papered over the immediate issue of the tree turning orange by landing a bunch of extra suppressions in bug 1219919 for graphics-related leaks: https://hg.mozilla.org/mozilla-central/rev/3885c33f2013 In addition to the CompositorChild and ImageBridgeChild shutdown issues, it would be nice if somebody looked into those miscellaneous leaks that show up in only a few particular tests, like DXGID3D9TextureData or ITextureClientRecycleAllocator. Those might be an indication of an actual leak. (In reply to Milan Sreckovic [:milan] from comment #22) > Let's not lose track of this. Andrew, how do we simulate in local builds > the setup that Aurora eventually hits? Wait for bug 1219369 to land? Yes, to reproduce this locally you need to apply the patches in bug 1219369, and back out 3885c33f2013. The comments in the latter patch have a few non-e10s tests that create content processes and leak various things.
Updated•8 years ago
|
Status: NEW → RESOLVED
Closed: 8 years ago
Flags: needinfo?(continuation)
Resolution: --- → DUPLICATE
Let's leave this open, because the leak isn't actually gone, and it may show up when we make nightly do the same checks.
Status: RESOLVED → REOPENED
Resolution: DUPLICATE → ---
Updated•8 years ago
|
tracking-e10s:
--- → ?
Updated•8 years ago
|
Comment 27•8 years ago
|
||
I think the most interesting thing to fix here is the PTextureChild leak, which as seen in bug 1252677 seems to grow depending on how many tests we run. There's a similar larger leak in dom/html/test on Windows, so I wonder if we leak a texture for every canvas or something.
Whiteboard: gfx-noted → gfx-noted [MemShrink]
Updated•8 years ago
|
Whiteboard: gfx-noted [MemShrink] → gfx-noted [MemShrink][e10s-orangeblockers]
Updated•8 years ago
|
Priority: P1 → P3
Updated•8 years ago
|
Whiteboard: gfx-noted [MemShrink][e10s-orangeblockers] → gfx-noted [MemShrink:P2][e10s-orangeblockers]
Updated•8 years ago
|
Status: REOPENED → RESOLVED
Closed: 8 years ago → 8 years ago
Resolution: --- → DUPLICATE
You need to log in
before you can comment on or make changes to this bug.
Description
•