Closed Bug 1300724 Opened 8 years ago Closed 8 years ago

Slowly growing "system-heap-allocated" memory usage

Categories

(Core :: Graphics, defect)

51 Branch
x86_64
Windows 7
defect
Not set
normal

Tracking

()

RESOLVED WORKSFORME

People

(Reporter: ulli.luftpumpe, Unassigned)

References

Details

(Whiteboard: [MemShrink:P2])

Attachments

(5 files)

Attached file memory-report.json.gz
User Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:48.0) Gecko/20100101 Firefox/48.0
Build ID: 20160823121617

Steps to reproduce:

Start Firefox (pretty new profile, no add ons). Open some tabs with pages that use JavaScript. Let the "firefox.exe" process run for 3 days. Close the tabs so only a new one is open. Open "about:memory" and measure allocations. Compare amount with the amount reported by the OS.


Actual results:

Measurement says:
"Explicit Allocations: 151.41 MB (100.0%) -- explicit"
Task manager says: 3.1GB "commit size" -> real allocated memory


Expected results:

Less memory usage.
OS: Unspecified → Windows 7
Hardware: Unspecified → x86_64
... Expected results:
"Explicit Allocations" memory amount shouldn't be much different then the value reported by the OS.
Component: Untriaged → about:memory
Product: Firefox → Toolkit
(In reply to Mark from comment #1)
> ... Expected results:
> "Explicit Allocations" memory amount shouldn't be much different then the
> value reported by the OS.

about:memory is working as intended here. Although "explicit" is typically similar to the measurements from the OS, that's not always the case.

Here are some relevant measurements from the attached memory reports. (Thank you for attaching the file, BTW, that's a tremendous help with this.)

> 4,095.94 MB (100.0%) -- address-space
> ├──3,446.27 MB (84.14%) -- commit
> │  ├──3,075.24 MB (75.08%) -- private
> │  │  ├──3,046.77 MB (74.39%) ── readwrite(segments=2369)
> │  │  └─────28.47 MB (00.70%) ++ (6 tiny)
> │  ├────216.31 MB (05.28%) ++ mapped
> │  └────154.71 MB (03.78%) ++ image
> ├────375.86 MB (09.18%) ── free(segments=507)
> └────273.81 MB (06.68%) ++ reserved
> 
> 3,101.38 MB ── private
> 3,339.71 MB ── resident
> 3,305.81 MB ── resident-unique
> 2,109.07 MB ── system-heap-allocated
> 3,703.11 MB ── vsize

All these come from the OS. "private" matches with the 3.1GB "commit size" you saw in Task Manager.

Now, although about:memory is working as expected, the memory usage here is bad. In particular, having 2+ GiB of "system-heap-allocated" is bad. So I'm going to morph this bug to be about that.
Component: about:memory → General
Product: Toolkit → Core
Summary: about:memory shows too low memory allocation (Firefox Nightly) → Multi-GiB "system-heap-allocated" memory usage
Whiteboard: [MemShrink]
Ting is this something you might be interested in looking into?
As it is not counted in explicit, I assume it's something allocated by external code.

Mark, I have two questions:

1) Did you minimize memory usage before measurement? Probably you did, but I just want to make sure.
2) Could you specify the urls you opened? It could be some specific sites are causing the issue.
Flags: needinfo?(ulli.luftpumpe)
(In reply to Ting-Yu Chou [:ting] (PTO til 9-20) from comment #4)
> As it is not counted in explicit, I assume it's something allocated by
> external code.
> 
> Mark, I have two questions:
> 
> 1) Did you minimize memory usage before measurement? Probably you did, but I
> just want to make sure.
> 2) Could you specify the urls you opened? It could be some specific sites
> are causing the issue.

I saw similar behavior (high "system-heap-allocated" on Windows) in bug 1293937, comment 6. STR there was essentially: log into facebook, click a video in your feed, click "more videos" and scroll through a few.
I see, let me check it.
Flags: needinfo?(ulli.luftpumpe)
I tried facebook, the memory usage goes high very quickly if I keep clicking on ">" to move to the next clip, but it goes down if I leave the browser idle for a few minutes. Did you see system-heap-allocated went down or it kept steady high?
After I tried several times, facebook now only shows me 3 video clips, not able to scroll...
The stack I saw for RtlAllocateHeap were mostly from MFCreateAlignedMemoryBuffer() and unknown symbols in msmpeg2vdec.dll. However once I closed the tab of FB, heap size (Heap in VMMap) goes back to normal. This seems a bit different with the bug here, which he closed all the tabs. 

Mark, still need your answers for the questions in comment 4. Sorry for the NI spam.
Flags: needinfo?(ulli.luftpumpe)
> 1) Did you minimize memory usage before measurement? Probably you did, but I just want to make sure.
Yes. This freed around 20MB.

> 2) Could you specify the urls you opened? It could be some specific sites are causing the issue.
I mainly used it to test my own local JavaScript code. It redraws a canvas with 20 frames per second and I refresh the page often. I don't close the browser for days so that JS code keeps running all the time.
Flags: needinfo?(ulli.luftpumpe)
Attached file memory-report2.json.gz
Here is a new memory report, around 2.2GB real RAM usage for a single open tab. Firefox reports around 165MB Explicit Allocations.
The 2nd memory report (attachment 8789314 [details]) has numbers from the content process, and it is the parent process has system-heap-allocated high.

Is it possible to make us minimal reproduce code?
> The 2nd memory report (attachment 8789314 [details]) has numbers from the
> content process, and it is the parent process has system-heap-allocated high.
I just clicked "Measure and save" on the "about:memory" page. What shall I do else?

> Is it possible to make us minimal reproduce code?
I can upload my JS code. It is not minimal but a single file only.
Attached file memory-report3.json.gz
Another memory report. This time I only started Firefox [51.0a1 (2016-09-06)] with a single tab without doing anything. The memory usage still increases over time.


Explicit Allocations says: ca. 100MB
Task manager says: over 300MB (after 19 hours process runtime)
Attached file graph.html
Open in Firefox and let it run for many hours.
So just open graph.html and leave it there for a while? Do I need to click on any buttons?
Flags: needinfo?(ulli.luftpumpe)
No. Just load the page and wait. You may keep Firefox Nightly in the background while doing something else.
It redraws/clears the black canvas element with 20 frames per second. As long as the printed mouse coordinates change it is running.

You should already see memory increase after an hour.
Flags: needinfo?(ulli.luftpumpe)
I can't reproduce on my Windows 10. :(

Load the page and wait for more than an hour, but system-heap-allocated is just 5.22 MB in parent, and 1.53 MB in child process. I am using Nightly 52.0a1 20160920030429, with GPU Intel HD Graphics 4000 (driver version 10.18.10.4276).

Could you provide the information from about:support?
Note what I am trying to reproduce is "having 2+ GiB of system-heap-allocated".

I'll just leave the page open, and check again tomorrow.
I loaded the graph.html before off yesterday (6pm), and this morning (9am) the system-heap-allocated of parent process grows from 3.04 MB to 15.76 MB. However there's no differences in child process.
The diff of about:memory after running graphs.html for 3 hours:

16.17 MB (100.0%) -- explicit
├──12.69 MB (78.50%) ── heap-unclassified
├───0.80 MB (04.96%) -- js-non-window
│   ├──0.85 MB (05.23%) ++ zones/zone(0xNNN)
│   ├──-0.03 MB (-0.20%) -- runtime
│   │  ├──-0.25 MB (-1.55%) ++ code
│   │  ├───0.20 MB (01.21%) ++ (5 tiny)
│   │  └───0.02 MB (00.13%) -- script-sources
│   │      ├──0.18 MB (01.11%) ── source(scripts=2480, <non-notable files>)/misc
│   │      └──-0.16 MB (-0.98%) ++ (7 tiny)
│   └──-0.01 MB (-0.07%) ── gc-heap/unused-arenas
├───0.60 MB (03.73%) -- heap-overhead
│   ├──0.77 MB (04.77%) ── bookkeeping
│   ├──-0.51 MB (-3.14%) ── bin-unused
│   └──0.34 MB (02.10%) ── page-cache
├───0.73 MB (04.51%) -- window-objects
│   ├──0.31 MB (01.91%) -- top(resource://gre-resources/hiddenWindow.html, id=NNN)
│   │  ├──0.21 MB (01.31%) -- active
│   │  │  ├──0.21 MB (01.30%) ++ window(chrome://global/content/mozilla.xhtml)
│   │  │  └──0.00 MB (00.00%) ++ window(resource://gre-resources/hiddenWindow.html)
│   │  └──0.10 MB (00.60%) ++ js-zone(0xNNN)
│   ├──0.23 MB (01.40%) -- top(chrome://browser/content/browser.xul, id=NNN)
│   │  ├──0.19 MB (01.20%) ++ active/window(chrome://browser/content/browser.xul)
│   │  └──0.03 MB (00.20%) ++ js-zone(0xNNN)
│   ├──0.19 MB (01.20%) -- top(about:memory, id=NNN)
│   │  ├──0.18 MB (01.10%) -- active/window(about:memory)
│   │  │  ├──0.16 MB (01.01%) -- dom
│   │  │  │  ├──0.16 MB (01.01%) ── orphan-nodes
│   │  │  │  └──0.00 MB (00.00%) ── element-nodes
│   │  │  └──0.01 MB (00.08%) ++ (2 tiny)
│   │  └──0.02 MB (00.11%) ++ js-zone(0xNNN)
│   └──-0.00 MB (-0.00%) ++ top(about:newtab, id=NNN)
├───0.55 MB (03.43%) -- storage
│   ├──0.56 MB (03.45%) -- sqlite
│   │  ├──0.26 MB (01.61%) ++ places.sqlite
│   │  ├──0.20 MB (01.22%) ── other
│   │  └──0.10 MB (00.62%) ++ (3 tiny)
│   └──-0.00 MB (-0.02%) ++ prefix-set
├───0.39 MB (02.39%) -- images
│   ├──0.35 MB (02.14%) -- chrome
│   │  ├──0.24 MB (01.48%) -- raster/used
│   │  │  ├──0.22 MB (01.34%) -- image(60x60, chrome://browser/skin/tabbrowser/pendingpaint.png)
│   │  │  │  ├──0.20 MB (01.26%) -- locked
│   │  │  │  │  ├──0.19 MB (01.17%) -- surface(60x60 (animation))
│   │  │  │  │  │  ├──0.19 MB (01.16%) ── decoded-nonheap [+]
│   │  │  │  │  │  └──0.00 MB (00.01%) ── decoded-heap [+]
│   │  │  │  │  └──0.02 MB (00.10%) ++ surface(60x60, compositing frame)
│   │  │  │  └──0.01 MB (00.07%) ── source [+]
│   │  │  └──0.02 MB (00.14%) ++ (2 tiny)
│   │  └──0.11 MB (00.66%) ++ vector/used
│   └──0.04 MB (00.25%) ++ content/raster/used/<non-notable images>
├───0.23 MB (01.44%) ++ (8 tiny)
└───0.17 MB (01.04%) ── atom-tables/main

Other Measurements

0.00 MB (100.0%) -- address-space
├──-3,029.86 MB (100.0%) ── free(segments=344) [-]
├──2,750.57 MB (100.0%) ── free(segments=357) [+]
├──215.27 MB (100.0%) -- reserved
│  ├──479.86 MB (100.0%) ── private(segments=4142) [+]
│  ├──-264.45 MB (100.0%) ── private(segments=2108) [-]
│  └───-0.14 MB (100.0%) ── mapped(segments=9)
└──64.03 MB (100.0%) -- commit
   ├──49.70 MB (100.0%) -- private
   │  ├──414.57 MB (100.0%) ── readwrite(segments=4314) [+]
   │  ├──-365.29 MB (100.0%) ── readwrite(segments=2330) [-]
   │  ├───5.66 MB (100.0%) ── readwrite+writecombine(segments=33) [+]
   │  ├──-5.26 MB (100.0%) ── readwrite+writecombine(segments=15) [-]
   │  ├───3.19 MB (100.0%) ── readwrite+stack(segments=140) [+]
   │  ├──-3.01 MB (100.0%) ── readwrite+stack(segments=132) [-]
   │  ├───1.37 MB (100.0%) ── readwrite+guard(segments=140) [+]
   │  ├──-1.29 MB (100.0%) ── readwrite+guard(segments=132) [-]
   │  ├──-0.88 MB (100.0%) ── execute-read(segments=14) [-]
   │  ├───0.63 MB (100.0%) ── execute-read(segments=10) [+]
   │  └───0.01 MB (100.0%) ── noaccess(segments=2) [+]
   ├──10.79 MB (100.0%) -- mapped
   │  ├──211.88 MB (100.0%) ── readonly(segments=78) [+]
   │  ├──-201.40 MB (100.0%) ── readonly(segments=74) [-]
   │  ├───7.50 MB (100.0%) ── readwrite(segments=38) [+]
   │  └──-7.19 MB (100.0%) ── readwrite(segments=34) [-]
   └───3.53 MB (100.0%) -- image
       ├──63.59 MB (100.0%) ── readonly(segments=415) [+]
       ├──-63.36 MB (100.0%) ── readonly(segments=414) [-]
       ├──-5.75 MB (100.0%) ── writecopy(segments=97) [-]
       ├──5.73 MB (100.0%) ── writecopy(segments=93) [+]
       ├──3.25 MB (100.0%) ── execute-read(segments=190)
       ├──2.13 MB (100.0%) ── readwrite(segments=232) [+]
       └──-2.07 MB (100.0%) ── readwrite(segments=233) [-]

-0.57 MB (100.0%) -- decommitted
└──-0.57 MB (100.0%) ── js-non-window/gc-heap/decommitted-arenas

24 (100.0%) -- event-counts
└──24 (100.0%) -- window-objects
   ├──-198 (-825.00%) -- top(about:newtab, id=20)/active/window(about:newtab)/dom
   │  ├──-197 (-820.83%) ── event-listeners [-]
   │  └────-1 (-4.17%) ── event-targets [-]
   ├──198 (825.00%) -- top(about:newtab, id=35)/active/window(about:newtab)/dom
   │  ├──197 (820.83%) ── event-listeners [+]
   │  └────1 (04.17%) ── event-targets [+]
   ├──21 (87.50%) ── top(chrome://browser/content/browser.xul, id=3)/active/window(chrome://browser/content/browser.xul)/dom/event-targets
   └───3 (12.50%) -- top(resource://gre-resources/hiddenWindow.html, id=1)/active/window(chrome://global/content/mozilla.xhtml)/dom
       ├──2 (08.33%) ── event-listeners [+]
       └──1 (04.17%) ── event-targets [+]

0 (100.0%) -- file-blob-urls
└──0 (100.0%) -- owner unknown
   ├──-1 (100.0%) ── blob:null/02243ddb-faa1-42d7-b4a7-b9eb456d8714 [-]
   └──1 (100.0%) ── blob:null/e5cecab5-8255-4a2a-a292-d083e2a511a3 [+]

15.59 MB (100.0%) -- heap-committed
├──14.98 MB (96.13%) ── allocated
└───0.60 MB (03.87%) ── overhead

0.39 MB (100.0%) -- images
├──0.35 MB (89.48%) -- chrome
│  ├──0.24 MB (61.69%) -- raster/used
│  │  ├──0.22 MB (56.54%) ── decoded-nonheap [+]
│  │  ├──0.02 MB (05.09%) ── source
│  │  └──0.00 MB (00.07%) ── decoded-heap
│  └──0.11 MB (27.79%) -- vector/used
│     ├──0.11 MB (27.72%) ── source
│     └──0.00 MB (00.07%) ── decoded-heap
└──0.04 MB (10.52%) -- content/raster/used
   ├──0.04 MB (10.35%) ── source
   └──0.00 MB (00.17%) ── decoded-heap

1.18 MB (100.0%) -- js-main-runtime
├──0.81 MB (68.73%) -- compartments
│  ├──0.41 MB (34.69%) -- classes/objects
│  │  ├──0.34 MB (28.78%) ── gc-heap
│  │  └──0.07 MB (05.92%) -- malloc-heap
│  │     ├──0.07 MB (05.79%) ── slots
│  │     └──0.00 MB (00.13%) ++ (2 tiny)
│  ├──0.14 MB (12.03%) -- baseline
│  │  ├──0.11 MB (09.24%) ── fallback-stubs
│  │  └──0.03 MB (02.79%) ── data [+]
│  ├──0.10 MB (08.08%) -- scripts
│  │  ├──0.07 MB (05.67%) ── gc-heap
│  │  └──0.03 MB (02.41%) ── malloc-heap/data
│  ├──0.07 MB (05.79%) ── cross-compartment-wrapper-table
│  ├──0.04 MB (03.10%) ++ (7 tiny)
│  ├──0.03 MB (02.56%) ── saved-stacks-set
│  └──0.03 MB (02.47%) -- type-inference
│     ├──0.02 MB (01.76%) ── type-scripts
│     └──0.01 MB (00.71%) ++ (3 tiny)
├──0.41 MB (35.04%) -- zones
│  ├──0.23 MB (19.28%) -- shapes
│  │  ├──0.16 MB (13.46%) -- gc-heap
│  │  │  ├──0.11 MB (09.37%) ── tree
│  │  │  ├──0.04 MB (03.67%) ── dict
│  │  │  └──0.01 MB (00.42%) ── base
│  │  └──0.07 MB (05.82%) -- malloc-heap
│  │     ├──0.04 MB (03.42%) ── tree-tables
│  │     ├──0.02 MB (01.29%) ── tree-kids
│  │     └──0.01 MB (01.10%) ── dict-tables
│  ├──0.11 MB (09.17%) -- strings
│  │  ├──0.08 MB (06.85%) -- malloc-heap
│  │  │  ├──0.08 MB (06.72%) ── latin1
│  │  │  └──0.00 MB (00.12%) ── two-byte
│  │  └──0.03 MB (02.32%) -- gc-heap
│  │     ├──0.03 MB (02.19%) ── latin1
│  │     └──0.00 MB (00.13%) ── two-byte
│  ├──-0.06 MB (-5.46%) ── unused-gc-things
│  ├──0.05 MB (04.07%) -- scopes
│  │  ├──0.03 MB (02.69%) ── malloc-heap
│  │  └──0.02 MB (01.38%) ── gc-heap
│  ├──0.03 MB (02.21%) -- object-groups
│  │  ├──0.03 MB (02.21%) ── gc-heap
│  │  └──0.00 MB (00.00%) ── malloc-heap
│  ├──0.02 MB (01.98%) ── baseline/optimized-stubs
│  ├──0.02 MB (01.98%) ── type-pool
│  └──0.02 MB (01.81%) ++ (6 tiny)
├──-0.03 MB (-2.77%) ── runtime
└──-0.01 MB (-0.99%) ── gc-heap/unused-arenas

8 (100.0%) -- js-main-runtime-compartments
└──8 (100.0%) -- system
   ├──1 (12.50%) ── [System Principal], chrome://global/content/mozilla.xhtml [+]
   ├──1 (12.50%) ── [System Principal], resource:///modules/sessionstore/TabStateFlusher.jsm [+]
   ├──1 (12.50%) ── [System Principal], resource://gre/modules/BookmarkJSONUtils.jsm [+]
   ├──1 (12.50%) ── [System Principal], resource://gre/modules/Bookmarks.jsm [+]
   ├──1 (12.50%) ── [System Principal], resource://gre/modules/DownloadUtils.jsm [+]
   ├──1 (12.50%) ── [System Principal], resource://gre/modules/PlacesRemoteTabsAutocompleteProvider.jsm [+]
   ├──1 (12.50%) ── [System Principal], resource://gre/modules/PlacesSearchAutocompleteProvider.jsm [+]
   └──1 (12.50%) ── [System Principal], resource://services-sync/main.js [+]

0.57 MB (100.0%) -- js-main-runtime-gc-heap-committed
├──0.65 MB (113.29%) -- used
│  ├──0.64 MB (112.17%) -- gc-things
│  │  ├──0.34 MB (59.31%) ── objects
│  │  ├──0.15 MB (26.87%) ── shapes
│  │  ├──0.07 MB (11.69%) ── scripts
│  │  ├──0.03 MB (04.79%) ── strings
│  │  ├──0.03 MB (04.55%) ── object-groups
│  │  ├──0.02 MB (02.84%) ── scopes
│  │  ├──0.01 MB (01.24%) ── jitcode
│  │  └──0.01 MB (00.89%) ++ (2 tiny)
│  └──0.01 MB (01.13%) ── arena-admin
└──-0.08 MB (-13.29%) -- unused
   ├──-0.06 MB (-11.25%) -- gc-things
   │  ├──-0.05 MB (-7.95%) ── jitcode
   │  ├──-0.02 MB (-2.85%) ── objects
   │  ├──-0.01 MB (-2.53%) ── object-groups
   │  ├───0.01 MB (01.96%) ── strings
   │  └───0.00 MB (00.12%) ++ (5 tiny)
   └──-0.01 MB (-2.04%) ── arenas

6 (100.0%) -- message-manager
└──6 (100.0%) ── referent/global-manager/strong

83 (100.0%) -- observer-service
└──83 (100.0%) -- referent
   ├──72 (86.75%) ── strong
   └──11 (13.25%) ── weak/alive

34 (100.0%) -- observer-service-suspect
├──22 (64.71%) ── referent(topic=memory-pressure)
└──12 (35.29%) ── referent(topic=xpcom-shutdown)

2 (100.0%) -- preference-service
└──2 (100.0%) ── referent/strong

0.35 MB (100.0%) -- window-objects
├──0.18 MB (50.31%) -- dom
│  ├──0.16 MB (46.87%) ── orphan-nodes
│  ├──0.01 MB (01.80%) ++ (4 tiny)
│  └──0.01 MB (01.63%) ── other
├──0.17 MB (49.13%) -- layout
│  ├──0.06 MB (16.75%) ── style-structs
│  ├──0.05 MB (13.97%) ── frames
│  ├──0.03 MB (07.24%) ── pres-shell
│  ├──0.01 MB (03.45%) ── style-contexts
│  ├──0.01 MB (03.34%) ── pres-contexts
│  ├──0.01 MB (01.84%) ── line-boxes
│  ├──0.01 MB (01.46%) ++ (2 tiny)
│  └──0.00 MB (01.09%) ── rule-nodes
└──0.00 MB (00.57%) ++ (2 tiny)

  3.86 MB ── d3d11-shared-textures
  0.13 MB ── gfx-surface-win32
  3.34 MB ── gpu-committed
  6.65 MB ── gpu-shared
 14.98 MB ── heap-allocated
131.00 MB ── heap-mapped
  0.11 MB ── imagelib-surface-cache-estimated-locked
  0.11 MB ── imagelib-surface-cache-estimated-total
 51.23 MB ── private
 53.27 MB ── resident
 49.27 MB ── resident-unique
  2.35 MB ── system-heap-allocated
279.36 MB ── vsize
"./mach run --dmd" with my local dmd enabled build (on windows) does not bring up firefox's window.
(In reply to Ting-Yu Chou [:ting] from comment #22)
> "./mach run --dmd" with my local dmd enabled build (on windows) does not
> bring up firefox's window.

Try the first patch in bug 1241684.
Attached file memory-report4.json.gz
Another test by me.

Firefox version: 52.0a1 (2016-09-22) (32-bit), Windows 7 x64

Memory commit size at start as shown in Task Manager, graph.html opened:
 child process: 60 MB
 parent process: 175 MB

Memory commit size after 5.5h:
 child process: 70 MB
 parent process: 385 MB

After opening a new tab and closing the HTML+JavaScript tab (child process disappeared) and clicking all 3 "Free memory" options in "about:memory" -> parent process still uses 328MB (see memory-report4.json.gz)

Another 2 hours later the parent process uses 360MB with a single empty tab open.
I have a feeling that with the newer Nightly version the memory usage increase is slower, still it should not increase at all. Is there an internal log in the Nightly version that accumulates more and more data over time?
(In reply to Nicholas Nethercote [:njn] from comment #23)
> (In reply to Ting-Yu Chou [:ting] from comment #22)
> > "./mach run --dmd" with my local dmd enabled build (on windows) does not
> > bring up firefox's window.
> 
> Try the first patch in bug 1241684.

Still the same. :(
Tried also :aklotz's patch in bug 1241684, the window comes up after a while but runs very slowly, I can't even enter the url...
As this doesn't happen on Linux, also DMD on Windows is too slow to use, I guess that the tool I am creating (bug 1299747) which intercept NtAllocateVirtualMemory() is probably my best bet.
:dmajor noticed me that Windows Performance Analyzer can log VirtualAlloc(), I'll try to use it to diagnose this issue.
I used WPR to record for an hour (nightly with only one tab for graph.html) with "Heap usage" selected in the options. The system-heap-allocated (sum of the size of heaps from GetProcessHeaps()) grows from 4.8MB to 5.83MB, but the memory graph doesn't contain "Heap Allocations" that I expect.
Flags: needinfo?(dmajor)
We use jemalloc which completely bypasses the system heap. You'll need to select "VirtualAlloc usage". Note that this will sometimes blame the wrong allocations, because jemalloc can re-use blocks without WPA's knowledge, but if you let the leak build up then its stacks will eventually stand out nonetheless.

Keep in mind that the tracing and analysis don't have to be on the same machine. If you are unable to reproduce the issue locally, maybe Mark would be willing to collect a trace for you.
Flags: needinfo?(dmajor)
Whiteboard: [MemShrink] → [MemShrink:P2]
From WPA, this is the only stack contains heap allocation and with impacting size, which seems is for growing a vector:

ntoskrnl.exe! ?? ::NNGAKEGL::`string'
ntoskrnl.exe!NtAllocateVirtualMemory
ntoskrnl.exe!KiSystemServiceCopyEnd
ntdll.dll!NtAllocateVirtualMemory
wow64.dll!whNtAllocateVirtualMemory
wow64.dll!Wow64SystemServiceEx
wow64cpu.dll!ServiceNoTurbo
wow64.dll!Wow64KiUserCallbackDispatcher
wow64.dll!Wow64LdrpInitialize
ntdll.dll!_LdrpInitialize
ntdll.dll!LdrInitializeThunk
ntdll.dll!ZwAllocateVirtualMemory
ntdll.dll!RtlpAllocateHeap
ntdll.dll!RtlpAllocateHeapInternal
ntdll.dll!RtlAllocateHeap
d3d11.dll!operator new
d3d11.dll!std::vector<NDXGI::CKMResource *,std::allocator<NDXGI::CKMResource *> >::_Reallocate
d3d11.dll!NDXGI::CDevice::SharedResourceCreationPreamble
d3d11.dll!CResource<ID3D11Texture2D1>::CLS::FinalConstruct
d3d11.dll!CDevice::CreateLayeredChild
d3d11.dll!NDXGI::CDeviceChild<IDXGIResource1,IDXGISwapChainInternal>::FinalConstruct
d3d11.dll!NDXGI::CResource::FinalConstruct
d3d11.dll!NDXGI::CDevice::CreateLayeredChild
d3d11.dll!NOutermost::CDevice::CreateLayeredChild
d3d11.dll!CDevice::CreateTexture2D_Worker
d3d11.dll!CDevice::OpenSharedResourceInternalImpl
d3d11.dll!CDevice::OpenSharedResourceInternal_Worker
d3d11.dll!CDevice::OpenSharedResource
xul.dll!mozilla::layers::DXGITextureHostD3D11::OpenSharedHandle
xul.dll!mozilla::layers::DXGITextureHostD3D11::LockInternal
xul.dll!mozilla::layers::ImageHost::Lock
xul.dll!mozilla::layers::ImageHost::Composite
xul.dll!<lambda_f8ca4c562a69e28655c41d64843e58d9>::operator()
xul.dll!mozilla::layers::RenderWithAllMasks<<lambda_f8ca4c562a69e28655c41d64843e58d9> >
xul.dll!mozilla::layers::CanvasLayerComposite::RenderLayer
xul.dll!mozilla::layers::RenderLayers<mozilla::layers::ContainerLayerComposite>
xul.dll!mozilla::layers::ContainerRender<mozilla::layers::ContainerLayerComposite>
xul.dll!mozilla::layers::ContainerLayerComposite::RenderLayer
xul.dll!mozilla::layers::RenderLayers<mozilla::layers::RefLayerComposite>
xul.dll!mozilla::layers::ContainerRender<mozilla::layers::RefLayerComposite>
xul.dll!mozilla::layers::RefLayerComposite::RenderLayer
xul.dll!mozilla::layers::RenderLayers<mozilla::layers::ContainerLayerComposite>
xul.dll!mozilla::layers::ContainerRender<mozilla::layers::ContainerLayerComposite>
xul.dll!mozilla::layers::ContainerLayerComposite::RenderLayer
xul.dll!mozilla::layers::LayerManagerComposite::Render
xul.dll!mozilla::layers::LayerManagerComposite::UpdateAndRender
xul.dll!mozilla::layers::LayerManagerComposite::EndTransaction
xul.dll!mozilla::layers::CompositorBridgeParent::CompositeToTarget
xul.dll!mozilla::layers::CompositorVsyncScheduler::Composite
xul.dll!mozilla::detail::RunnableMethodImpl<void (__thiscall mozilla::layers::CompositorVsyncScheduler::*)(mozilla::TimeStamp),1,1,mozilla::TimeStamp>::Run
xul.dll!MessageLoop::RunTask
xul.dll!MessageLoop::DeferOrRunPendingTask
xul.dll!MessageLoop::DoWork
xul.dll!base::MessagePumpForUI::DoRunLoop
xul.dll!base::MessagePumpWin::Run
xul.dll!MessageLoop::RunHandler
xul.dll!MessageLoop::Run
xul.dll!base::Thread::ThreadMain
xul.dll!`anonymous namespace'::ThreadFunc
kernel32.dll!BaseThreadInitThunk
ntdll.dll!__RtlUserThreadStart
ntdll.dll!_RtlUserThreadStart
[Root]
:milan, do you have any ideas for comment 31?
Flags: needinfo?(milan)
WPA showed count 726 and impacting size 5.574MB for the stack in comment 31, note the graph has duration ~250s and I ran WPR for ~4h15m.
Summary: Multi-GiB "system-heap-allocated" memory usage → Slowly growing "system-heap-allocated" memory usage
:nical, you've been looking at canvas recently; take a look at comment 31 in particular.
Flags: needinfo?(milan) → needinfo?(nical.bugzilla)
If the allocations are happening in the driver and Ting-Yu can't reproduce the issue, we'd need to trace allocations on a computer that reproduces the bug in order to have meaningful data. So let's be careful when considering the stack in comment 31, whatever is allocating that much memory could be elsewhere.

That said, I dug a bit around OpenSharedHandle and it could be (wild guess) that the content process keeps sending new textures to the compositor and the later does not composite the frames because the window is not presented by the OS. D3D/DXGI don't deallocate resources when the reference count that we can observe goes to zero, but rather when that happens *and* there isn't some internal operations queued in the driver which may refer to the resource. So opening new DXGI handles and letting go of them without present or flush could be causing these resources to accumulate (until the next present/flush). We should make sure to flush the device if we are receiving transactions and not presenting them.
Again, just a guess. The investigation should continue with other possible causes until we have verified that this is what's happening.
Flags: needinfo?(nical.bugzilla)
(In reply to Mark from comment #24)
> I have a feeling that with the newer Nightly version the memory usage
> increase is slower, still it should not increase at all.

Mark, what I am seeing now is system-heap-allocated grows slowly with the latest nightly. For example, system-heap-allocated grows from 3.65 MB to 4.87 MB after running for 1h29m with only one tab (graph.html) opened in foreground (nightly 20161006030208). I'd like to double confirm if this is also what you observe now, so we can know whether the stack in comment 31 is something we need to work on.

Please be noted what :nical said in comment 35, it causes differences if you have the browser present or not (e.g., minimized).

Thank you.
Flags: needinfo?(ulli.luftpumpe)
I will test.

I have an NVIDIA GeForce GTX 750 Ti graphics card and the newest driver on Windows 7 x64.
I prefer to do a suspend to disk (hibernation) when turning off my computer without closing all applications. So Firefox typically runs for multiple days (2-4) until it crashes. The stable Firefox version regularly quits with a crash on my system after it ran "too long".

Firefox allocates a lot of memory for HTML5 videos on YouTube (on other video sites probably too) without releasing all of it. After playing a lot of videos there is an increase in frame skipping. I think this problem is related. The browser becomes more and more laggy and CPU usage increases on a single core to the maximum, Firefox still works for a while but suddenly starts to freeze around 3GB allocated process memory. First for short periods and then completely after a while. Around that time pages start to become partly or completely black sometimes (failed allocation requests?), hovering with the mouse around or selecting the invisible text often makes it partly visible again. This didn't occure that often when Youtube used a Flash based video player.
It appears to be a graphics/texture allocation problem. The graphics memory (backed by normal RAM) is not released fully when closing a tab. I assume new allocations take increasingly longer until they start to fail, resulting in black pages and freezes. The task manager for example starts to appear much slower than before when Firefox becomes laggy, the problem effects other application's graphics elements too despite more than 50% free CPU usage. After quitting/crashing Firefox the issues with other applications disappear.

On Windows XP similar things happened. While Firefox was open and used for a longer time newly started applications reported "out of memory" despite enough free RAM. GUI elements like menu bars/trees where incomplete then, widgets were missing. This indicates a problem with graphics allocations on Windows. After quitting Firefox the problem disappeared.

On Windows 7 x64 when the browser starts to become very laggy, "sessionstore" doesn't record new actions anymore. It stops working at some time, while the global history has no issues. When restoring the session I'm often missing a few hours of opened and closed tabs if Firefox ran "too long". Maybe this problem is related. I don't know if this still happens with the current Nightly.
(In reply to Mark from comment #37)
> Firefox allocates a lot of memory for HTML5 videos on YouTube (on other
> video sites probably too) without releasing all of it. After playing a lot
> of videos there is an increase in frame skipping. I think this problem is
> related. The browser becomes more and more laggy and CPU usage increases on

Mark, thanks for the summary.

Your guess makes sense and that could be related to the issue here. But it'd be better to file a new bug for it because it has different steps to reproduce and different results. We can set "Depends on" or "Blocks" to connect the two bugs if we find out they're really related. 

Let's focus on the increasing system-heap-allocated with your graph.html here.
My "system-heap-allocated" measurement.
 Process: Main Process

While keeping tab "graph.html" visible:
 at start: 22.45 MB
 1 hour later: 116.08 MB

While keeping tab "graph.html" not visible, tab "about:memory" visible:
 at start: 14.44 MB
 5 hours later: 14.40 MB
Flags: needinfo?(ulli.luftpumpe)
(In reply to Mark from comment #39)
> While keeping tab "graph.html" visible:
>  at start: 22.45 MB
>  1 hour later: 116.08 MB

This is much worse than what I saw on my computer.

Could you use Windows Performance Recorder to record this and upload the files somewhere? Note you need to have "Heap usage" and "VirtualAlloc usage" on in the options of "Resource Analysis". Thank you for your help.
Flags: needinfo?(ulli.luftpumpe)
> Could you use Windows Performance Recorder to record this and upload the
> files somewhere? Note you need to have "Heap usage" and "VirtualAlloc usage"
> on in the options of "Resource Analysis".

I've sent you (Ting-Yu Chou [:ting]) an Email with a link to my WPR log.
Flags: needinfo?(ulli.luftpumpe)
I've got the email, thank you. I'll download and check the record on Wednesday when I am in the office.
The WPR files have commit stack "n/a" on WPA, it could be the "Detail level" of WPR was set to "Light". Sorry I didn't mention the setting of this in comment 40 as I left it as default, but would you mind to do it again and with following settings in WPR for another 1 hour run?

  Performance scenario: General
  Detail level: Verbose
  Logging mode: Memory
  First level triage: selected
  Heap usage: selected
  VirtualAlloc usage: selected
Flags: needinfo?(ulli.luftpumpe)
I used "Verbose" level previously already but not "First level triage". Where can I see this "commit stack" in WPA?

I've realized the update check using the NVIDIA helper tool for my graphics  driver always reported that there is no newer version, which was incorrect. So now I really have the newest driver installed, the previous one actually was over a year old. Sorry.

With the new driver "system-heap-allocated" only increases with ca. 8 MB for 10 hours.
Do you still want me to run WPR? Logging one hour is probably not enough.
Flags: needinfo?(ulli.luftpumpe) → needinfo?(janus926)
(In reply to Mark from comment #44)
> I used "Verbose" level previously already but not "First level triage".
> Where can I see this "commit stack" in WPA?

After you open the record file with WPA, right click on the "VirtualAlloc Commit LifeTimes" graph in the "Graph Explorer" panel at left, select "Add graph to Analysis View", you'll see the graph and a table at right. Right click on the table header which probably has "Line", "Process", "Address", "Count", etc., it will show a list of attributes that can be shown on the table, select "Commit Stack", then you'll see it.

> With the new driver "system-heap-allocated" only increases with ca. 8 MB for
> 10 hours.
> Do you still want me to run WPR? Logging one hour is probably not enough.

Yes, please. This number looks similar to what I saw in comment 29 and 36, but it'd be better to have your record to confirm we're facing the same issue.
Flags: needinfo?(janus926)
Mark sent me an email with a link to another WPR record file at 11/13. In that file, there're 3 firefox processes, I assume the one that uses largest size of memory is the parent process which is pid 4920.

In the process, there're 2 stacks contribute to impacting size, one has 0.891MB in the impacting size and "n/a" in the commit stack, the other one has 5.379MB and stack:

ntoskrnl.exe!NtAllocateVirtualMemory
ntoskrnl.exe!KiSystemServiceCopyEnd
ntoskrnl.exe!KiServiceLinkage
dxgmms1.sys!<PDB not found>
dxgmms1.sys!<PDB not found>
dxgmms1.sys!<PDB not found>
dxgmms1.sys!<PDB not found>
dxgkrnl.sys!DXGDEVICE::CreateVidMmAllocations<_DXGK_ALLOCATIONINFO>
dxgkrnl.sys!DXGDEVICE::CreateAllocation
dxgkrnl.sys!DXGDEVICE::CreateStandardAllocation
dxgkrnl.sys!DXGPRESENT::PrepareStagingBuffer
dxgkrnl.sys!DXGCONTEXT::Present
dxgkrnl.sys!DxgkPresent
win32k.sys!NtGdiDdDDIPresent
ntoskrnl.exe!KiSystemServiceCopyEnd
wow64win.dll!NtGdiDdDDIPresent
wow64win.dll!whNtGdiDdDDIPresent
wow64.dll!Wow64SystemServiceEx
wow64cpu.dll!ServiceNoTurbo
wow64.dll!RunCpuSimulation
wow64.dll!Wow64LdrpInitialize
ntdll.dll! ?? ::FNODOBFM::`string'
ntdll.dll!LdrInitializeThunk
gdi32.dll!NtGdiDdDDIPresent
d3d11.dll!NDXGI::CDevice::PresentCB
nvwgf2um.dll!<PDB not found>
nvwgf2um.dll!<PDB not found>
nvwgf2um.dll!<PDB not found>
nvwgf2um.dll!<PDB not found>
nvwgf2um.dll!<PDB not found>
nvwgf2um.dll!<PDB not found>
d3d11.dll!NDXGI::CDevice::Present
dxgi.dll!CDXGISwapChain::PresentImpl
dxgi.dll!CDXGISwapChain::Present
xul.dll!mozilla::layers::CompositorD3D11::EndFrame
xul.dll!mozilla::layers::LayerManagerComposite::Render
xul.dll!mozilla::layers::LayerManagerComposite::UpdateAndRender
xul.dll!mozilla::layers::LayerManagerComposite::EndTransaction
xul.dll!mozilla::layers::CompositorBridgeParent::CompositeToTarget
xul.dll!mozilla::layers::CompositorVsyncScheduler::Composite
xul.dll!mozilla::detail::RunnableMethodImpl<void (__thiscall mozilla::layers::CompositorVsyncScheduler::*)(mozilla::TimeStamp),1,1,mozilla::TimeStamp>::Run
xul.dll!MessageLoop::RunTask
xul.dll!MessageLoop::DeferOrRunPendingTask
xul.dll!MessageLoop::DoWork
xul.dll!base::MessagePumpForUI::DoRunLoop
xul.dll!base::MessagePumpWin::Run
xul.dll!MessageLoop::RunHandler
xul.dll!MessageLoop::Run
xul.dll!base::Thread::ThreadMain
xul.dll!`anonymous namespace'::ThreadFunc
kernel32.dll!BaseThreadInitThunk
ntdll.dll!__RtlUserThreadStart
ntdll.dll!_RtlUserThreadStart
[Root]

Note the stack in comment 31 is also recorded but it has impacting size 0.

So I am going to close this as WFM because 1) the original issue "having 2+ GiB of system-heap-allocated" goes away after Mark upgraded to the latest Nvidia driver, and 2) the stack in comment 31 isn't an issue from his latest WPR record file.

Mark, please reopen if you think there's anything else we can do and thank you for all your help. :)
Status: UNCONFIRMED → RESOLVED
Closed: 8 years ago
Resolution: --- → WORKSFORME
See Also: → 1347227
Component: General → Graphics
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: