Bug 1280916 (Open) — opened 8 years ago, updated 2 years ago

Memory leak when using Content Security Policy and WebSockets

Categories: Core :: DOM: Core & HTML, defect, P3
Version: 47 Branch
People: Reporter: browser_defects; Assignee: Unassigned
Keywords: memory-leak
Whiteboard: [MemShrink:P2]
Attachments: 1 file

Attached file leak_CSP.zip
User Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:47.0) Gecko/20100101 Firefox/47.0
Build ID: 20160604131506

Steps to reproduce:

1. Download and install nodejs (https://nodejs.org/en/)
2. Download and unzip the attached prototype
3. Open shell and navigate to the folder extracted in point 2
4. Start the server by typing: node server.js
5. Open Firefox and navigate to http://localhost:8888
6. Let it run for several hours and monitor the memory (Task Manager on Windows).


Actual results:

Memory usage increases continuously.

Video recordings can be seen here:
https://youtu.be/_8qO1XceFVI
https://youtu.be/sB9lB2g8_P4
https://youtu.be/pDNZEHDBtwc


Expected results:

Memory usage should be stable.

When you open the server.js file and comment out the Content-Security-Policy header, there is NO leak and the prototype works as expected.
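
For reference, a rough sketch of what a server like the attached server.js might look like. This is hypothetical: the exact CSP value, the use of the 'ws' npm package, and all names here are assumptions, not the actual attachment contents.

  const http = require('http');
  const fs = require('fs');
  const { WebSocketServer } = require('ws'); // assumption: the attachment presumably bundles its WebSocket library

  const server = http.createServer((req, res) => {
    // Commenting out this header makes the leak disappear.
    res.setHeader('Content-Security-Policy',
                  "default-src 'self'; img-src blob:; connect-src ws://localhost:8888");
    res.setHeader('Content-Type', 'text/html');
    res.end(fs.readFileSync('index.html')); // the test page
  });

  const wss = new WebSocketServer({ server });
  wss.on('connection', (socket) => {
    // Continuously push small binary frames, roughly matching the ~34 KB
    // buffers that show up in the about:memory report further down.
    const timer = setInterval(() => socket.send(Buffer.alloc(34040)), 16);
    socket.on('close', () => clearInterval(timer));
  });

  server.listen(8888);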
Eric, does this bug report fall into your bucket? Thanks
Flags: needinfo?(erahm)
(In reply to Pascal Chevrel:pascalc from comment #1)
> Eric, does this bug report fall into your bucket? Thanks

Thanks, I'll take a look.
Flags: needinfo?(erahm)
Whiteboard: [MemShrink]
Alright, I can repro on mozilla-central on Linux. Diff after 3 hours. Clearly we're leaking DOM memory-file data:

> 1,738.26 MB (100.0%) -- explicit
> ├────870.64 MB (50.09%) -- dom/memory-file-data/large
> │    ├──435.41 MB (25.05%) ── file(length=34040, sha1=17fb416b234176ec37d1758a3f7432f7cd03a7b6) [13096]
> │    └──435.23 MB (25.04%) ── file(length=34213, sha1=9f2037f576d4fb9ba67a2890026e84faf58239d6) [13094]
> ├────757.89 MB (43.60%) ── heap-unclassified
> ├─────67.39 MB (03.88%) -- js-non-window
> │     ├──66.79 MB (03.84%) -- zones/zone(0xNNN)
> │     │  ├──24.61 MB (01.42%) ++ compartment([System Principal], file:///home/erahm/dev/mozilla-central/obj-x86_64-pc-linux-gnu-clang/dist/bin/components/ContentProcessSingleton.js)
> │     │  ├──21.75 MB (01.25%) ++ compartment([System Principal], XPConnect Privileged Junk Compartment)
> │     │  └──20.43 MB (01.18%) ++ (44 tiny)
> │     └───0.60 MB (00.03%) ++ (2 tiny)
> ├─────18.48 MB (01.06%) ++ window-objects/top(http://localhost:8888/, id=NNN)
> ├─────17.73 MB (01.02%) -- heap-overhead
> │     ├──18.25 MB (01.05%) ── bookkeeping
> │     └──-0.52 MB (-0.03%) ++ (2 tiny)
> └──────6.12 MB (00.35%) ++ (9 tiny)


It doesn't look like we're leaking object URLs:

> 0 (100.0%) -- dom-media-stream-urls
> └──0 (100.0%) ++ owner(http://localhost:8888/)

Kyle is it possible this is a dup of bug 1272078?
Flags: needinfo?(khuey)
It may not be an exact duplicate, but it looks highly related. Can you tell why the blob actors are being created? (i.e., set breakpoints in the different SendPBlobConstructor functions; there are several.)
Flags: needinfo?(khuey)
Flags: needinfo?(erahm)
Flags: needinfo?(amarchesini)
I tried setting breakpoints on various |SendPBlobConstructor| functions in both chrome and content processes, but didn't see them getting invoked.
Flags: needinfo?(erahm)
This is interesting. The issue here is not that we're leaking Blobs; it's that the CC is not running as it should.
There is nothing related to IPC code: in this test we create a BlobMemoryImpl for every ws.onmessage. This Blob is used to create a URL, and it's shown as an image.
If we do a manual CC from about:memory, the number decreases. It doesn't go to 0 because we receive a continuous stream of onmessage events, but otherwise it works as it should.

The question is: why doesn't the CC run more often here? Maybe that's correct... but maybe we should run the CC more often when we have memory Blobs?
Flags: needinfo?(amarchesini) → needinfo?(bugs)
(In reply to Andrea Marchesini (:baku) from comment #6)
> The question is: why CC doesn't run so often here? Maybe it's correct... but
> maybe we should call CC more often when we have Memory blobs?

The CC is almost certainly running, if this is really leaking across multiple hours. The CC usually runs every 10 seconds. What is likely happening is that some object is leaking with incremental CC: if some object is addrefed/released in the middle of incremental CC (through a weak reference, say), then it will never be freed. Triggering a manual CC would force a non-incremental CC. Though it is a little surprising that there's no non-incremental CC over that long a period of time. But anyways, something like this happened before with WebAudio, where some timer thing kept poking at an object, keeping everything alive.
Flags: needinfo?(continuation)
Yeah, the CC tends to run often, except in the rare cases when the browser really is idle (meaning perhaps just some about:blank page and the user not using the browser at all).

Another thing: if we have some large blobs in memory which might be garbage, perhaps the code should call JS_updateMallocCounter to tell the JS engine to run GC sooner rather than later.
Flags: needinfo?(bugs)
This issue is somehow related to iCC + GC. The workflow of the test is:

1. WebSocket creates a BlobMemory object using nsContentUtils::CreateBlobBuffer() and stores it in a JSValue.
2. WebSocket dispatches a MessageEvent with this JSValue.
3. The test revokes any previous blob URL (see next point).
4. The test creates a new blob URL from that Blob object.
5. The test stores the blob URL in an image (the page side of steps 3-5 is sketched below).
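
A rough sketch of the page side of steps 3-5; hypothetical, the real test page may differ:

  const img = document.createElement('img');
  document.body.appendChild(img);

  const ws = new WebSocket('ws://localhost:8888');
  ws.binaryType = 'blob'; // each message then arrives as a Blob

  let previousUrl = null;
  ws.onmessage = (event) => {
    // Step 3: revoke the previous blob URL.
    if (previousUrl) {
      URL.revokeObjectURL(previousUrl);
    }
    // Steps 4 and 5: create a new blob URL and show it as an image.
    previousUrl = URL.createObjectURL(event.data);
    img.src = previousUrl;
  };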

I added a counter to the Blob CTOR and DTOR, and to the BlobMemoryImpl CTOR and DTOR. What I can see is that the number of Blob objects doesn't go back to ~0 when iCC runs. We never go back to low numbers, but always stay above 800. The number increases and increases up to 22,000; then sometimes it goes down (GC + iCC?), but never to ~0. The lowest number I see is 860 objects. Everything is freed when the document is closed.

In my test I stopped the workflow at point 2 by removing ws.onmessage from the test completely, and we get exactly the same behavior.
I also changed the WebSocket code so that we don't dispatch onmessage events at all (just point 1), and yes, we get the same leaky behavior in that case too.
Flags: needinfo?(continuation)
You need to set the browser to log all CCs; then I'd do an about:memory CC so that whatever it is gets cleaned up, then look at the logs to see what was holding it alive during the iCC. I can take a look next week when I'm back from PTO if you haven't had a chance in the meantime.
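
For anyone following along, getting those logs usually means setting the Gecko cycle-collector logging environment variables; this assumes they still work the same way:

  # Log every CC, not just the shutdown one, to a chosen directory:
  MOZ_CC_LOG_ALL=1 MOZ_CC_LOG_DIRECTORY=/tmp/cc-logs ./firefox

  # Then trigger a CC from about:memory and grep the resulting
  # cc-edges.*.log files for the suspect Blob addresses.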
I would like to see how this debugging is done. I'll try on my own this week, and if I have problems I'll ping you when you are back. Thanks.
Let me know if you need help. I'll clear my needinfo for now.
Status: UNCONFIRMED → NEW
Ever confirmed: true
Flags: needinfo?(continuation)
(In reply to Andrea Marchesini (:baku) from comment #11)
> I would like to see how this debugging thing is done. I'll try by my own
> this week and in case I have problems I ping you when you are back. Thanks.

baku, did you get a chance to look at this? We'll hold off on triaging until there's a bit more analysis.
Flags: needinfo?(amarchesini)
Not yet. I'll ping mccr8 today if he is back from PTO.
Flags: needinfo?(amarchesini)
I am back.
baku, let us know if you need assistance from Andrew.
Flags: needinfo?(amarchesini)
Whiteboard: [MemShrink] → [MemShrink:P2]
Andrew, I tried what we discussed on IRC, but without success. Can you help me with this?
Flags: needinfo?(amarchesini) → needinfo?(continuation)
Andrea, do you have a test case that triggers this issue without requiring that I install Node?
Flags: needinfo?(continuation) → needinfo?(amarchesini)
(In reply to Andrew McCreight [:mccr8] from comment #18)
> Andrea, do you have a test case that triggers this issue without requiring
> that I install Node?

No. I use the example attached here with some changes. I'll upload my simplified version.
Flags: needinfo?(amarchesini)
Baku and I looked at this. There are a bunch of Blob objects that show up in the final CC log (when you have the test case open and then shut down the browser) but are not actually leaked. These Blobs have a single reference, some JS reflector, which in turn seems to be held alive by events, which makes sense. The events show up in the CC log with 1 unknown reference. I told baku to ask smaug about it at some point, because I don't know enough about events to debug leaks involving them when they don't show up in CC graphs.
Component: Untriaged → DOM
Product: Firefox → Core
(In reply to Andrew McCreight [:mccr8] from comment #20)
> I told baku to ask smaug about it at some point because I don't
> know enough about events to debug leaks involving them when they don't show
> up in CC graphs.

Hi :baku, do you have updates on the events part mentioned in comment 20? thanks!
Flags: needinfo?(amarchesini)
> Hi :baku, do you have updates on the events part mentioned in comment 20?
> thanks!

I debugged this issue in bug 1301863 and NI'ed the JS people.
Flags: needinfo?(amarchesini)
Priority: -- → P2
See Also: → 1370836
See Also: → 1301863
Moving to P3 because there has been no activity for at least a year.
See https://github.com/mozilla/bug-handling/blob/master/policy/triage-bugzilla.md#how-do-you-triage for more information
Priority: P2 → P3
Component: DOM → DOM: Core & HTML
Severity: normal → S3