Closed Bug 1266915 Opened 8 years ago Closed 8 years ago

twitter leaks a lot of orphan nodes through window refresh

Categories

(Core :: DOM: Core & HTML, defect)

48 Branch
defect
Not set
normal

Tracking

()

RESOLVED DUPLICATE of bug 1267693
Tracking Status
platform-rel --- ?
firefox48 --- affected

People

(Reporter: bkelly, Unassigned)

References

Details

(Whiteboard: [MemShrink:P1] btpp-followup-2016-05-05 [platform-rel-Twitter])

Attachments

(1 file)

After running my twitter tab for a week or so I noticed it was using a lot of memory.  I tried refreshing it, but it still ended up using 200+ MB.  I looked in about:memory and saw this:

│  ├────267.04 MB (13.26%) -- top(https://twitter.com/, id=104)
│  │    ├──228.74 MB (11.35%) -- active
│  │    │  ├──226.92 MB (11.26%) -- window(https://twitter.com/)
│  │    │  │  ├──161.99 MB (08.04%) -- dom
│  │    │  │  │  ├──137.80 MB (06.84%) ── orphan-nodes [2]

I then completely closed the tab, opened a new tab, and loaded twitter.com there.  I then got:

│  ├─────42.39 MB (02.23%) -- top(https://twitter.com/, id=8261)
│  │     ├──34.63 MB (01.83%) -- active
│  │     │  ├──31.84 MB (01.68%) ++ window(https://twitter.com/)
│  │     │  └───2.79 MB (00.15%) ++ (3 tiny)
│  │     └───7.76 MB (00.41%) ++ js-zone(0x27eb57d6000)

It seems a bit wrong to me that nodes can be leaked across a refresh like this.

This is FF 45.0.2 on windows 10.
Note, I used the "Minimize memory usage" button before taking that first about:memory report.
I see a similar issue with irccloud when I refresh it:

│  ├────382.91 MB (19.27%) -- top(https://irccloud.mozilla.com/#!/ircs://irc1.dmz.scl3.mozilla.com:6697/%23content, id=29)
│  │    ├──256.75 MB (12.92%) -- active
│  │    │  ├──256.40 MB (12.91%) -- window(https://irccloud.mozilla.com/#!/ircs://irc1.dmz.scl3.mozilla.com:6697/%23content)
│  │    │  │  ├──176.67 MB (08.89%) ++ js-compartment(https://irccloud.mozilla.com/#!/ircs://irc1.dmz.scl3.mozilla.com:6697/%23content)
│  │    │  │  ├───75.38 MB (03.79%) -- dom
│  │    │  │  │   ├──36.82 MB (01.85%) ── element-nodes [3]
│  │    │  │  │   ├──28.37 MB (01.43%) ── orphan-nodes [3]

But after closing the window and re-opening in a new tab:

│  ├─────53.49 MB (02.70%) -- top(https://irccloud.mozilla.com/#!/ircs://irc1.dmz.scl3.mozilla.com:6697/%23content, id=633)
│  │     ├──35.14 MB (01.77%) -- active
│  │     │  ├──34.80 MB (01.76%) ++ window(https://irccloud.mozilla.com/#!/ircs://irc1.dmz.scl3.mozilla.com:6697/%23content)
│  │     │  └───0.34 MB (00.02%) ++ window(https://js.stripe.com/v2/channel.html?stripe_xdm_e=https%3A%2F%2Firccloud.mozilla.com&stripe_xdm_c=default147821&stripe_xdm_p=1#__stripe_transport__)
│  │     └──18.35 MB (00.93%) ++ js-zone(0x27eae110800)

And the old irccloud window leaked:

│  ├────359.26 MB (18.13%) -- top(none)/detached
│  │    ├──255.53 MB (12.89%) -- window(https://irccloud.mozilla.com/#!/ircs://irc1.dmz.scl3.mozilla.com:6697/%23content)
│  │    │  ├──176.03 MB (08.88%) ++ js-compartment(https://irccloud.mozilla.com/#!/ircs://irc1.dmz.scl3.mozilla.com:6697/%23content)
│  │    │  ├───75.99 MB (03.83%) ++ dom
│  │    │  └────3.51 MB (00.18%) ++ (3 tiny)
Maybe this is related to bug 1263824?
See Also: → 1263824
Is this reproduceable? If so, is this a regression?
I just reloaded twitter in my new firefox instance and it did not reproduce.  So I guess its a symptom of some leaked state.  I'll keep checking to see if I get into it again.

Note, I did capture CC logs and sent them to Andrew when I wrote comment 0.
However the irccloud leak happened again when I closed that tab.  I suppose I should write that as a separate bug.
Is this related to the add–on we discussed, Ben?
Flags: needinfo?(bkelly)
Whiteboard: [MemShrink] → [MemShrink] btpp-followup-2016-05-05
Not sure, but possible.  Most of my leaks have gone away since removing that add-on.  We really need to fix these add-on leaks.  That add-on didn't do much.
Flags: needinfo?(bkelly)
Ben, in comment 0, I don't see what the problem is. Twitter is using a lot of memory is bad, but when you reloaded the page you got a new window.
Flags: needinfo?(bkelly)
The 137MB of orphan nodes was after a refresh.  The second about memory snippet was after closing the tab and opening in a new tab.  Pretty sure we were keeping orphaned nodes alive via the outer window in some way.

I have not reproduced it since removing the mozilla-tree-status add-on, though.  It might have somehow been a dependent leak from the ActionButton leak in some way.
Flags: needinfo?(bkelly)
I'm going to close this for now Ben, but feel free to reopen if you see the behavior again without the add-on.
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → WORKSFORME
I just re-produced this today.

Before refresh of twitter tab:

│    ├──157.65 MB (11.13%) -- top(https://twitter.com/, id=113)
│    │  ├──117.65 MB (08.30%) -- active
│    │  │  ├──115.56 MB (08.16%) -- window(https://twitter.com/)
│    │  │  │  ├───59.71 MB (04.21%) -- dom
│    │  │  │  │   ├──44.60 MB (03.15%) ── orphan-nodes [2]
│    │  │  │  │   └──15.11 MB (01.07%) ++ (5 tiny)
│    │  │  │  ├───38.27 MB (02.70%) -- js-compartment(https://twitter.com/)
│    │  │  │  │   ├──35.45 MB (02.50%) -- classes
│    │  │  │  │   │  ├──18.73 MB (01.32%) ++ (12 tiny)
│    │  │  │  │   │  └──16.72 MB (01.18%) ++ class(Object)
│    │  │  │  │   └───2.83 MB (00.20%) ++ (8 tiny)
│    │  │  │  └───17.57 MB (01.24%) ++ (3 tiny)
│    │  │  └────2.10 MB (00.15%) ++ (2 tiny)
│    │  └───40.00 MB (02.82%) -- js-zone(0x172a5867000)
│    │      ├──20.63 MB (01.46%) ++ strings
│    │      └──19.37 MB (01.37%) ++ (8 tiny)

After refresh of twitter tab:

│    ├──157.92 MB (11.11%) -- top(https://twitter.com/, id=113)
│    │  ├──117.35 MB (08.25%) -- active
│    │  │  ├──115.27 MB (08.11%) -- window(https://twitter.com/)
│    │  │  │  ├───59.64 MB (04.19%) -- dom
│    │  │  │  │   ├──44.65 MB (03.14%) ── orphan-nodes [2]
│    │  │  │  │   └──14.99 MB (01.05%) ++ (5 tiny)
│    │  │  │  ├───38.12 MB (02.68%) -- js-compartment(https://twitter.com/)
│    │  │  │  │   ├──35.53 MB (02.50%) -- classes
│    │  │  │  │   │  ├──18.73 MB (01.32%) ++ (13 tiny)
│    │  │  │  │   │  └──16.80 MB (01.18%) ++ class(Object)
│    │  │  │  │   └───2.58 MB (00.18%) ++ (9 tiny)
│    │  │  │  └───17.51 MB (01.23%) ++ (3 tiny)
│    │  │  └────2.08 MB (00.15%) ++ (2 tiny)
│    │  └───40.57 MB (02.85%) -- js-zone(0x172a5867000)
│    │      ├──20.65 MB (01.45%) ++ strings
│    │      └──19.92 MB (01.40%) ++ (8 tiny)

After closing the twitter tab:

│    ├──111.64 MB (07.83%) -- top(none)/detached
│    │  ├──108.12 MB (07.58%) -- window(https://twitter.com/)
│    │  │  ├───59.60 MB (04.18%) -- dom
│    │  │  │   ├──44.60 MB (03.13%) ── orphan-nodes [2]
│    │  │  │   └──15.00 MB (01.05%) ++ (5 tiny)
│    │  │  ├───37.49 MB (02.63%) -- js-compartment(https://twitter.com/)
│    │  │  │   ├──35.51 MB (02.49%) -- classes
│    │  │  │   │  ├──18.72 MB (01.31%) ++ (12 tiny)
│    │  │  │   │  └──16.80 MB (01.18%) ++ class(Object)
│    │  │  │   └───1.98 MB (00.14%) ++ (6 tiny)
│    │  │  └───11.03 MB (00.77%) ++ (2 tiny)
│    │  └────3.52 MB (00.25%) ++ (5 tiny)
Status: RESOLVED → REOPENED
Resolution: WORKSFORME → ---
It turned into a ghost window by the time I saved it as a memory report.

I also have CC logs if someone wants to look at them.  They are 90MB, so not posting here.
The leaked blog.wanderview.com workers in that report are bug 1270161 and bug 1271493, btw.  Should not be related to the twitter issue.
I did a heapgraph find roots on the CC log:

$ /c/devel/heapgraph/find_roots.py cc-edges.6812.1463753899.log 00000172F2B64E60
Parsing cc-edges.6812.1463753899.log. Done loading graph.

00000172AA5D5A70 [DOMEventTargetHelper ]
    --[mAnonymousGlobalScopes[i]]--> 00000172A632DD90 [JS Object (Block)]
    --[enclosing_environment]--> 00000172A632DC10 [JS Object (NonSyntacticVariablesObject)]
    --[MessageQueue]--> 000001729A65D1A0 [JS Object (Object)]
    --[_data]--> 00000172A6396CD0 [JS Object (Map)]
    --[value]--> 00000172A11C8640 [JS Object (Function - lazy)]
    --[fun_environment]--> 00000172C1896220 [JS Object (Call)]
    --[cachedValue]--> 00000172F2B64D40 [JS Object (Proxy)]
    --[private]--> 00000172F2B64F00 [JS Object (Object)]
    --[https://twitter.com]--> 00000172F2B64E60 [JS Object (Object)]

    Root 00000172AA5D5A70 is a ref counted object with 1 unknown edge(s).
    known edges:
       00000172A35F12E0 [XPCWrappedNative (ContentFrameMessageManager)] --[mIdentity]--> 00000172AA5D5A70

The fact that it claims to be a DETH with an mAnonymousScopes suggests this is either a nsInProcessTabChildGlobal or a TabChildGlobal.  I guess its nsInProcessTabChildGlobal since I was running in non-e10s mode.
Since I don't have steps to reproduce its difficult to tell what is leaving a strong ref to the global.
Alternatively, maybe there is an issue in content-sessionStore.js:

https://dxr.mozilla.org/mozilla-release/source/browser/components/sessionstore/content/content-sessionStore.js

The object trace in comment 15 suggests that the lazy function stored in _data has been called (it has a cachedValue), but its still in _data.  Why hasn't a flush happened?
Mike, do you have any idea how the situation in comment 17 could occur?  Is there a race here where if the window is closed before it receives a SessionStore::Flush it can just leave stuff in _data forever?  I'm not sure I fully understand how all of this is supposed to work.
Flags: needinfo?(mconley)
(In reply to Ben Kelly [:bkelly] from comment #18)
> Mike, do you have any idea how the situation in comment 17 could occur?  Is
> there a race here where if the window is closed before it receives a
> SessionStore::Flush it can just leave stuff in _data forever?  I'm not sure
> I fully understand how all of this is supposed to work.

When a window is closed, the content-sessionStore.js should be unloaded and its scope destroyed - it shouldn't be able to hold onto anything.

Do you mean refreshed, since this bug seems to be about tab refresh?
Flags: needinfo?(mconley) → needinfo?(bkelly)
> When a window is closed, the content-sessionStore.js should be unloaded and
> its scope destroyed - it shouldn't be able to hold onto anything.
> 
> Do you mean refreshed, since this bug seems to be about tab refresh?

This particular cc log is after closing the tab.
Flags: needinfo?(bkelly)
Hm, in that case, I would expect the JS scope to be destroyed, which takes the _data Map along with it.

We could try clearing _data in MessageQueue uninit (which is called when the tab is unloaded), but I honestly don't see how the Map could be staying alive.

So this reminds me a lot of something that we ran into when landing bug 1195295. Landing that bug caused the intermittent bug  1220517 to go perma, and the root cause seemed to be greater time spent doing cycle collection because the graph was getting quite large.

Bug 1249795 ended up being the solution. Perhaps there are other cases here that we're not considering.

smaug, you happened to fix bug 1249795. Does it sound like we're hitting similar leaks here?

(Also roping in :ting)
Flags: needinfo?(bugs)
Nothing hints we're hitting similar leak here, but nothing hints against that too.
My _guess_ is that something ends up keeping the message manager global alive via some
nsCOMPtr<nsIGlobalObject> member variable.
Flags: needinfo?(bugs)
But the leak could be something totally different. 

Do we know whether this is a shutdown leak or runtime?
From bkelly's description, this sounds like a runtime leak.
Yea, the unknown edge on the global could easily be the problem.  Not sure how to track that down given how often nsIGlobalObject is used in the tree.

And yea, it's a runtime leak.
ok, so need to try to add a breakpoint to relevant DOMEventTargetHelper's ::Release and check who all release it during shutdown. But if this happens only rarely...
Whiteboard: [MemShrink] btpp-followup-2016-05-05 → [MemShrink] btpp-followup-2016-05-05 [platform-rel-Twitter]
platform-rel: --- → ?
This seems like a pretty bad leak and considering we're seeing this on a major site lets go with P1.

Olli were you going to follow up on comment 26 or were you hoping someone else could do that?
Flags: needinfo?(bugs)
Whiteboard: [MemShrink] btpp-followup-2016-05-05 [platform-rel-Twitter] → [MemShrink:P1] btpp-followup-2016-05-05 [platform-rel-Twitter]
Well, need to find some way to reproduce this.
I assume bkelly haven't figured out that yet.
Flags: needinfo?(bugs) → needinfo?(bkelly)
I haven't seen the "orphan nodes through refresh" problem since removing the mozilla-tree-status addon.  So I think this is probably caused by bug 1267693.  I guess I will dupe to there for now.
Status: REOPENED → RESOLVED
Closed: 8 years ago8 years ago
Flags: needinfo?(bkelly)
Resolution: --- → DUPLICATE
Component: DOM → DOM: Core & HTML
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: