Closed Bug 1330976 Opened 7 years ago Closed 7 years ago

high heap-unclassified in nightly 53

Categories

(Core :: General, defect)

Not set
normal

Tracking

()

RESOLVED WORKSFORME
Tracking Status
firefox53 - ---

People

(Reporter: bkelly, Unassigned)

References

Details

(Whiteboard: [MemShrink])

A few of us have noticed high heap-unclassified on nightly recently.  For example, on Jan 4 I wrote in #memshrink:

10:17 AM <bkelly> interesting... 1.2GB of heap-unclassified in the parent process this morning
10:17 AM <bkelly> and 1GB of heap-unclassified in the child process
10:17 AM <bkelly> 53.0a1 (2017-01-03) (64-bit)

I tried switching to a DMD-enabled build, but could not reproduce the problem before the added slowness made me switch back.  I tried for about a day.

Yesterday I saw that heap-unclassified was growing in my release nightly in both the parent and child process.  I took a gc/cc log, but again I don't have a DMD log.

Last night Boris also wrote in #content:

5:31 PM <bz> bkelly: I just restarted my browser because it was too laggy.  Heap-unclassified was at 10G

Seems like we have a pretty bad leak of untracked memory somewhere in FF53.  We need to find this before it hits release.
[Tracking Requested - why for this release]: Leaks that make the browser really laggy and unusable.
Tracking 53+ for this leak issue.
This could be bug 1329104, which is a media decoder leak.  Perhaps this bug is intermittent because it depends on whether you happen to hit an ad or social media video that uses that decoder.

I'm also still running a DMD build to try to catch this again.
See Also: → 1329104
This triggered for me again on nightly 53.  about:buildconfig says:

Built from https://hg.mozilla.org/mozilla-central/rev/96cb95af530477edb66ae48d98c18533476e57bb

I believe this build includes the fix for bug 1329104.  So I guess this is a different leak.

Also, note that this is a lot of heap-unclassified in both the parent and the child.  My parent shows:

2,473.11 MB (100.0%) -- explicit
├──2,214.23 MB (89.53%) ── heap-unclassified
├────101.10 MB (04.09%) ++ js-non-window
├─────77.68 MB (03.14%) ++ heap-overhead
├─────54.77 MB (02.21%) ++ (22 tiny)
└─────25.33 MB (01.02%) ++ window-objects

My child shows:

3,328.52 MB (100.0%) -- explicit
├──1,906.38 MB (57.27%) ── heap-unclassified
├────900.14 MB (27.04%) ++ window-objects
├────323.66 MB (09.72%) ++ heap-overhead
├────121.22 MB (03.64%) ++ js-non-window
├─────46.22 MB (01.39%) ++ images
└─────30.90 MB (00.93%) -- (16 tiny)
      ├──12.17 MB (00.37%) ++ workers
      ├───5.02 MB (00.15%) ++ atom-tables
      ├───4.19 MB (00.13%) ++ xpconnect
      ├───3.08 MB (00.09%) ++ gfx
      ├───1.48 MB (00.04%) ++ layout
      ├───1.06 MB (00.03%) ++ dom
      ├───0.85 MB (00.03%) ++ cycle-collector
      ├───0.84 MB (00.03%) ── xpti-working-set
      ├───0.67 MB (00.02%) ── icu
      ├───0.66 MB (00.02%) ── preferences
      ├───0.27 MB (00.01%) ++ xpcom
      ├───0.23 MB (00.01%) ── telemetry
      ├───0.19 MB (00.01%) ++ media
      ├───0.18 MB (00.01%) ── history-links-hashtable
      ├───0.01 MB (00.00%) ── script-namespace-manager
      └───0.00 MB (00.00%) ── network/effective-TLD-service

Of course, I don't have DMD running any more so I can't see what's going on here.
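For anyone comparing pasted about:memory trees like the ones above over time, here is a minimal Python sketch that pulls the sizes out of the text and computes the heap-unclassified share of explicit. The function names (`parse_report`, `unclassified_share`) and the regular expression are my own assumptions based only on the line format shown in this bug. This is not an official about:memory parser and may need adjusting for other report layouts.

```python
import re

# Matches tree lines such as "├──2,214.23 MB (89.53%) ── heap-unclassified".
# The pattern is an assumption derived from the reports pasted in this bug.
LINE_RE = re.compile(
    r"([\d,]+\.\d+) MB \((\d+\.\d+)%\) (?:──|\+\+|--) (\S.*)$"
)


def parse_report(text):
    """Return {name: (megabytes, percent)} for each recognised tree line."""
    entries = {}
    for line in text.splitlines():
        match = LINE_RE.search(line)
        if match:
            megabytes = float(match.group(1).replace(",", ""))
            percent = float(match.group(2))
            entries[match.group(3).strip()] = (megabytes, percent)
    return entries


def unclassified_share(text):
    """Fraction of 'explicit' that is reported as 'heap-unclassified'."""
    entries = parse_report(text)
    explicit_mb, _ = entries["explicit"]
    unclassified_mb, _ = entries["heap-unclassified"]
    return unclassified_mb / explicit_mb
```

Feeding the parent-process tree above to `unclassified_share` should give a value of about 0.895, in line with the 89.53% that about:memory printed.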
Depends on: 1334129
I'm also hitting this heavily on Aurora / Developer Edition 53, on two different profiles.

Profile 1, after closing many tabs:

Main Process
Explicit Allocations
1,524.45 MB (100.0%) -- explicit
├──1,208.23 MB (79.26%) ── heap-unclassified

Web Content (pid 4972)
Explicit Allocations
452.29 MB (100.0%) -- explicit
├──186.56 MB (41.25%) ── heap-unclassified
├──100.89 MB (22.31%) -- heap-overhead
│  ├───88.68 MB (19.61%) ── bin-unused
│  ├───11.30 MB (02.50%) ── bookkeeping
│  └────0.91 MB (00.20%) ── page-cache

Profile 2, many tabs open:

Main Process
Explicit Allocations
1,582.82 MB (100.0%) -- explicit
├──1,266.40 MB (80.01%) ── heap-unclassified

Web Content (pid 644)
Explicit Allocations
3,128.20 MB (100.0%) -- explicit
├──1,873.89 MB (59.90%) ── heap-unclassified
├────861.19 MB (27.53%) -- window-objects

This is in an x64 build on Windows.
I dumped the mapped memory of the Main processes above to see if anything stands out. It's mostly zeros (over 70% of all bytes), plus many blocks of contiguous 0xE5 bytes (~10% of all bytes). 0xFF is a distant third, and is not as clustered. It definitely doesn't look to me like most of the memory contains (or contained) useful data. Not sure if that is of any use to anybody, but there it is.
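A rough sketch of how the byte breakdown above could be reproduced on a raw dump, in Python. The helper names (`byte_histogram`, `longest_run`, `histogram_of_file`) and the chunked file reading are my own assumptions; the comment above does not say which tool was actually used. As far as I know, 0xE5 is the byte the allocator writes over freed memory, so long runs of it would suggest freed rather than live data, but treat that as an assumption too.

```python
from collections import Counter


def _fractions(counts, total):
    """Turn raw byte counts into fractions of the total byte count."""
    return {value: n / total for value, n in counts.items()} if total else {}


def byte_histogram(data):
    """Return {byte_value: fraction_of_all_bytes} for a bytes object."""
    return _fractions(Counter(data), len(data))


def longest_run(data, value):
    """Length of the longest contiguous run of `value` in `data`."""
    best = current = 0
    for byte in data:
        current = current + 1 if byte == value else 0
        best = max(best, current)
    return best


def histogram_of_file(path, chunk_size=1 << 20):
    """Same as byte_histogram, but streams a dump file in chunks."""
    counts = Counter()
    total = 0
    with open(path, "rb") as dump:
        while chunk := dump.read(chunk_size):
            counts.update(chunk)
            total += len(chunk)
    return _fractions(counts, total)
```

Sorting the histogram by fraction gives the "mostly zeros, then 0xE5, then 0xFF" ranking, and comparing `longest_run` for 0xE5 against 0xFF is one way to put a number on "clustered".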
Unfortunately I am still seeing this even after bug 1329104 and bug 1334129 were fixed.  Using nightly built from:

https://hg.mozilla.org/mozilla-central/rev/71224049c0b52ab190564d3ea0eab089a159a4cf
One thing I have noticed is that I always have an update waiting when I see this.  That could just be coincidence, though, since nightly pushes multiple updates a day.  Still, it would be nice to rule out.

Benjamin, do you know how we manage memory for our update system?  Where is the code for that?  I don't even know where to begin looking for it.  I guess I've always assumed the updater is external to gecko, but maybe that's not the case?
Flags: needinfo?(benjamin)
302 rstrong
Flags: needinfo?(benjamin) → needinfo?(robert.strong.bugs)
Thanks!

Maybe the easiest thing to do here would be to disable updates in my nightly profile and see if the heap-unclassified reproduces.
If it does go away with updates disabled, please then also try re-enabling updates and setting the app.update.staging.enabled preference to false, to see if it still reproduces. Bug 307181 might be the culprit.
Ok, I triggered the high heap-unclassified with updates disabled.  Guess I'll spin another dmd build...
Depends on: 1336448
Ben, could you please advise on who should own the next step for this issue and also what would be the best component for this bug?
Flags: needinfo?(bkelly)
I haven't seen this in a few weeks.  Seems fixed by a combination of bug 1329104, bug 1334129, and bug 1336448.  Going to mark this WFM.
Status: NEW → RESOLVED
Closed: 7 years ago
Flags: needinfo?(bkelly)
Resolution: --- → WORKSFORME
Moving from Core::Untriaged to Core::General https://bugzilla.mozilla.org/show_bug.cgi?id=1407598
Component: Untriaged → General