Closed Bug 1345260 Opened 7 years ago Closed 7 years ago

Ghost window with ABP on http://www.telegraph.co.uk/

Categories

(Core :: Networking: HTTP, defect)

defect
Not set
normal

Tracking

()

RESOLVED WORKSFORME

People

(Reporter: mccr8, Unassigned)

References

(Blocks 1 open bug)

Details

(Whiteboard: [MemShrink:P2][necko-active])

Attachments

(1 file)

I have reproduced a ghost window leak which may be one of the ones from bug 1241470.

Steps to reproduce:
1. For simplicity, set dom.ipc.processCount to 1 in about:config.
2. Install AdBlockPlus.
3. Open a page to keep the content process from going away (I use https://news.ycombinator.com/ ).
4. Open http://www.telegraph.co.uk/ and at the moment you can first see the page start to render, close the tab.
5. Go to about:memory, click minimize memory usage, then measure, and you'll see a detached window or a ghost window for the website:

├──14.24 MB (17.91%) -- window-objects
│  ├──12.24 MB (15.39%) -- top(none)
│  │  ├──12.24 MB (15.39%) -- ghost/window(http://www.telegraph.co.uk/)

Using a CC log, the thing rooting the window is an object element:

0x7fa36e888190 [FragmentOrElement (xhtml) object http://www.telegraph.co.uk/]
    --[Preserved wrapper]--> 0x7fa36e04c430 [JS Object (HTMLObjectElement)]
    --[group_global]--> 0x7fa371e53100 [JS Object (Window)]
    --[UnwrapDOMObject(obj)]--> 0x7fa38398d000 [nsGlobalWindow # 2147483655 inner http://www.telegraph.co.uk/]

    Root 0x7fa36e888190 is a ref counted object with 1 unknown edge(s).
    known edges:
       0x7fa36e04c430 [JS Object (HTMLObjectElement)] --[UnwrapDOMObject(obj)]--> 0x7fa36e888190
       0x7fa3732e4040 [nsBaseContentList] --[mElements[i]]--> 0x7fa36e888190
       0x7fa3839fe950 [FragmentOrElement (xhtml) head http://www.telegraph.co.uk/] --[mAttrsAndChildren[i]]--> 0x7fa36e888190
Blocks: GhostWindows
Is this a shutdown leak or runtime?
Attached file leaks
The leak persists through shutdown, and appears to involve both the parent and content processes. Here's the bloat log I get in shutdown after causing the leak.
I see some strong cycles between runnables and nsObjectLoadingContent, so perhaps one of those cycles doesn't get broken if you close the page at just the right time.
There are 5 runnables leaking, but I looked at their allocation stacks, and none of them seem like they could hold alive an element.

I also noticed a ObjectInterfaceRequestorShim, which in fact does hold alive an object element. It is held alive by a channel, which might be held alive somehow in the case I'm looking at.

If I change ObjectInterfaceRequestorShim::OnStopRequest to also clear mContent, then that seems to fix the leak. I don't know if that is the right fix.

I also noticed this in the console spew when running my test case:
[Child 29734] ###!!! ASSERTION: Did not receive all required callbacks!: 'NS_FAILED(mResult) || mExpectedCallbacks == 0', file /home/amccreight/mc/netwerk/base/nsAsyncRedirectVerifyHelper.cpp, line 61
I don't know if that's a cause or an effect of this leak.
Assignee: nobody → continuation
Looking at the log spew some more, I saw this right before the assertion above:

JavaScript error: resource://gre/modules/commonjs/toolkit/loader.js -> jar:file:///home/amccreight/.mozilla/firefox/nke9v10t.abp/extensions/%7Bd10d0bf8-f5b5-c8b4-a8b2-2b9879e08c5d%7D.xpi!/lib/child/requestNotifier.js, line 353: TypeError: topWnd is null

That XPI is ABP (in a method RequestNotifier.addNodeData), and "topWnd" is presumably some reference to a window, so maybe we run some ABP callback, but the window is already closed, so we get null, so the callback returns an error, which somehow leads to a leak.

And in fact, if I modify addNodeData to do a null check at the top like this then the leak stops:
  if (!topWnd)
      return;

But the question remains why throwing an error here causes Firefox to leak, because that should not happen. I guess I'll figure out where exactly that code is running.
Whiteboard: [MemShrink] → [MemShrink:P2]
If it helps, in relation to bug 1241470, I always wait for all windows to load fully (as far as I can tell) before closing them.  Nothing special that would be interpreted as "at just the right time".
As I said in comment 5, one error I see when trying out the STR is a null pointer error inside ABP (it turns out this is inside the JS implementation of asyncOnChannelRedirect).

However, sometimes it fails in a different way, maybe if I let the page load a little more. In that failure condition, we are in nsObjectLoadingContent::OnStopRequest, but (aRequest != mChannel) and we return early, leaving our reference to mChannel. In this case, what seems to be happening is that we start doing a redirect, via HttpChannelChild::Redirect1Begin, which calls nsObjectLoadingContent::AsyncOnChannelRedirect, so nsObjectLoadingContent changes mChannel from c1 to c2. Then, HttpChannelChild calls OnStopRequest, but with the old channel c1, so nsObjectLoadingContent gets confused and doesn't do anything. My guess is that the redirect hasn't been fully set up yet (it requires a few async calls between parent and child), so the OnStop goes to the old channel. I worked around that immediate problem by giving aRequest as QI'd mRedirectChannelChild if that is present, in OnStopRequest, but that is not enough to fix the leak. I could imagine other things going wrong if an OnStopRequest happens in the middle of a redirect. I'll have to confirm that's what is really happening.
Passing in mRedirectChannelChild to the OnStopRequest callback does not fix the leak entirely, but it at least makes it so that the leak is cleaned up during shutdown.
I think the problem here is that there is a redirect from channel A to channel B in progress, but for some reason by closing the tab while in the middle of the load, we never end up issuing an OnStopRequest for channel B, so the HttpChannelChild stays alive until shutdown, when we kill off all actors and it finally goes away. All that the presence of AdblockPlus does is create an additional cycle through an nsObjectLoadingContent (that should be broken by the OnStopRequest) so that everything stays alive past the shutdown of IPDL.

I tried to figure out why we fail to clean up the target of the redirect, but I have been unable to figure how this is supposed to work. Somebody who is more familiar with how HTTP redirects work will have to fix this.
Assignee: continuation → nobody
Component: Plug-ins → Networking: HTTP
Flags: needinfo?(honzab.moz)
jason can you please assign an owner?
Flags: needinfo?(jduell.mcbugs)
Whiteboard: [MemShrink:P2] → [MemShrink:P2][necko-active]
Andrew, if you are able to reproduce easily, could you please setup http logging and provide the log?  it might tell us the answer immediately.  thanks.  if you never done before, see [1].

[1] https://developer.mozilla.org/en-US/docs/Mozilla/Debugging/HTTP_logging
Flags: needinfo?(honzab.moz) → needinfo?(continuation)
So, I probably took this bug with the previous comment.. right?
Assignee: nobody → honzab.moz
Flags: needinfo?(jduell.mcbugs)
Hmm, well, I can't reproduce this any more with a build from today. I'll close this as WFM for now and if I have some time see if I can bisect to find a fix. The page or ABP may have just changed in the meanwhile. Thanks for taking a look.
Status: NEW → RESOLVED
Closed: 7 years ago
Flags: needinfo?(continuation)
Resolution: --- → WORKSFORME
Assignee: honzab.moz → nobody
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: