Closed Bug 1350533 Opened 7 years ago Closed 5 years ago

Crash firefox when uploading files to cloud.mail.ru

Categories

(Core :: Networking, defect, P2)

52 Branch
defect

RESOLVED WORKSFORME

People

(Reporter: lioncub, Unassigned)

References

Details

(Whiteboard: [MemShrink:P1][necko-next])

Attachments

(1 file)

Attached file memory-report.json.gz
User Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:52.0) Gecko/20100101 Firefox/52.0
Build ID: 20170317213149

Steps to reproduce:

FF 52.0.1
Ubuntu 16.04 x64
RAM = 8GB
When I upload files to cloud.mail.ru, Firefox starts to consume memory.
The attached memory report (from about:memory) was taken after uploading 568 files (~6-7 GB).


Actual results:

Memory is not released after upload.
Memory usage grows by roughly the size of the uploaded files.
Firefox crashes.


Expected results:

Normal memory usage, and Firefox keeps working normally.
Severity: normal → critical
Whiteboard: [MemShrink]
Hi lioncub,

Could you also provide the crash report ID? It can be found on the "about:crashes" page. Thanks
Flags: needinfo?(lioncub)
Sorry, but it says: "No crash reports were sent."
Flags: needinfo?(lioncub)
Explicit Allocations

7,417.79 MB (100.0%) -- explicit
├──6,633.52 MB (89.43%) ── heap-unclassified
├────224.00 MB (03.02%) -- window-objects
│    ├──124.63 MB (01.68%) ++ (20 tiny)
│    └───99.37 MB (01.34%) -- top(about:memory#end0, id=177)
│        ├──97.04 MB (01.31%) ++ active/window(about:memory#end0)
│        └───2.32 MB (00.03%) ++ js-zone(0x7f8f44b82000)
├────200.82 MB (02.71%) -- js-non-window
│    ├──136.33 MB (01.84%) -- zones
│    │  ├──106.86 MB (01.44%) ++ zone(0x7f8fe8b21000)
│    │  └───29.47 MB (00.40%) ++ (7 tiny)
│    └───64.49 MB (00.87%) ++ (2 tiny)
├────151.19 MB (02.04%) ++ (21 tiny)
├────119.04 MB (01.60%) ++ add-ons
└─────89.22 MB (01.20%) ++ heap-overhead

Something is consuming a humongous amount of heap-unclassified memory. Going to redirect to someone who's probably better suited to diagnose high memory usage.
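
For scale, the share can be checked directly from the tree above. A minimal Python sketch; the regular expression and the helper name `parse_memory_line` are illustrative assumptions, not part of about:memory:

```python
import re

# Matches about:memory tree rows such as
# "├──6,633.52 MB (89.43%) ── heap-unclassified".
ROW = re.compile(r"([\d,]+\.\d+) MB \((\d+\.\d+)%\) (?:--|──|\+\+) (.+)$")

def parse_memory_line(line):
    """Return (megabytes, percent, label) for one about:memory row, or None."""
    m = ROW.search(line)
    if m is None:
        return None
    mb = float(m.group(1).replace(",", ""))
    return mb, float(m.group(2)), m.group(3).strip()

total_mb, _, _ = parse_memory_line("7,417.79 MB (100.0%) -- explicit")
unclassified_mb, pct, label = parse_memory_line(
    "├──6,633.52 MB (89.43%) ── heap-unclassified")

# Recompute the share from the two absolute figures.
share = round(100 * unclassified_mb / total_mb, 2)
```

The recomputed share agrees with the 89.43% shown in the report, so nearly nine tenths of the explicit allocations have no memory reporter attributing them to a subsystem.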

erahm, do you have time to look at this?
Flags: needinfo?(erahm)
I'll create a test account and see if a DMD run turns up anything.
Flags: needinfo?(erahm)
Whiteboard: [MemShrink] → [MemShrink:P1]
Whelp the site is in Russian so I'm not sure there's much I can do here :( What we really need is for someone to do a DMD run while uploading files to that site.
Let me try.
How do I start Firefox with DMD enabled?
ni? for adding DMD details
Flags: needinfo?(erahm)
It looks like you're on Linux, so I'll give you details for that. For DMD, the easy way (for me) is to use a local build [1]. Roughly, that would be:

> echo "ac_add_options --enable-dmd" >> .mozconfig
> ./mach build
> ./mach run --dmd

We *do* offer pre-built DMD builds for nightly, but they're debug only so a bit slow.

> # Get the latest build of mozilla-central from our build server
> wget https://index.taskcluster.net/v1/task/gecko.v2.mozilla-central.latest.firefox.linux64-debug/artifacts/public/build/target.tar.bz2
> tar xjf target.tar.bz2
> cd firefox
> # Make a temporary profile
> mkdir tmp_profile
> # Now run firefox with DMD enabled using the temporary profile
>  LD_PRELOAD=libdmd.so LD_LIBRARY_PATH=. ./firefox --no-remote --profile tmp_profile

Then follow your previous steps, go to 'about:memory', and select 'Save' under 'Save DMD output'. This will save logs to a temp directory. The paths will be printed to your console, something like the following:

> DMD[20700] opened /tmp/dmd-1493157403-20700.json.gz for writing
> DMD[20700] Dump 1 {
> DMD[20700]   Constructing the heap block list...
> DMD[20700]   Constructing the stack trace table...
> DMD[20700]   Constructing the stack frame table...
> DMD[20700] }
> DMD[20700] opened /tmp/dmd-1493157403-20749.json.gz for writing
> DMD[20749] Dump 1 {
> DMD[20749]   Constructing the heap block list...
> DMD[20749]   Constructing the stack trace table...
> DMD[20749]   Constructing the stack frame table...
> DMD[20749] }

You can upload those files here.
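
If the console output is long, the dump paths can be pulled out with a few lines of Python. This is a sketch that assumes the "opened ... for writing" line format shown above; the helper name `dmd_dump_paths` is made up for illustration:

```python
import re

# Matches lines like
# "DMD[20700] opened /tmp/dmd-1493157403-20700.json.gz for writing".
OPENED = re.compile(r"^DMD\[(\d+)\] opened (\S+) for writing$")

def dmd_dump_paths(console_output):
    """Return the dump file paths announced in DMD console output, in order."""
    paths = []
    for line in console_output.splitlines():
        m = OPENED.match(line.strip())
        if m:
            paths.append(m.group(2))
    return paths

sample = """\
DMD[20700] opened /tmp/dmd-1493157403-20700.json.gz for writing
DMD[20700] Dump 1 {
DMD[20700]   Constructing the heap block list...
DMD[20700] }
DMD[20700] opened /tmp/dmd-1493157403-20749.json.gz for writing
"""
paths = dmd_dump_paths(sample)
```

Each process writes its own dump, so expect one path per Firefox process.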

[1] https://developer.mozilla.org/en-US/docs/Mozilla/Performance/DMD#Desktop_Firefox_(Linux)
Flags: needinfo?(erahm)
Hey lioncub, is comment 8 enough to get you started to help us diagnose this?
Flags: needinfo?(lioncub)
Hey! Yes. One moment, I am building Firefox with --enable-dmd.
Flags: needinfo?(lioncub)
> $ LD_PRELOAD=libdmd.so LD_LIBRARY_PATH=/usr/lib/firefox firefox --no-remote -P
> DMD[26850] $DMD is undefined
> DMD[26850] $DMD is undefined
> DMD[26950] $DMD is undefined
> DMD[26996] $DMD is undefined
> DMD[27068] $DMD is undefined
> 1493963708824	addons.update-checker	WARN	Update manifest for firefox@getpocket.com did not contain an updates property
> 1493963708900	addons.update-checker	WARN	Update manifest for aushelper@mozilla.org did not contain an updates property
> 1493963708926	addons.update-checker	WARN	Update manifest for webcompat@mozilla.org did not contain an updates property
> 1493963708952	addons.update-checker	WARN	Update manifest for e10srollout@mozilla.org did not contain an updates property
> 1493963709049	addons.update-checker	WARN	Update manifest for {972ce4c6-7e08-4474-a285-3208198ce6fd} did not contain an updates property
> 1493963709813	addons.xpi	WARN	Add-on firefox-hotfix@mozilla.org is not compatible with application version.
> 1493963710143	addons.xpi	WARN	Add-on firefox-hotfix@mozilla.org is not compatible with application version.
> DMD[26850] opened /tmp/dmd-1493966961-26850.json.gz for writing
> DMD[26850] Dump 1 {
> DMD[26850]   Constructing the heap block list...
> DMD[26850]   Constructing the stack trace table...
> DMD[26850]   Constructing the stack frame table...
> DMD[26850] }
> DMD[26850] opened /tmp/dmd-1493966961-26996.json.gz for writing
> DMD[26996] Dump 1 {
> DMD[26996]   Constructing the heap block list...
> DMD[26996]   Constructing the stack trace table...
> DMD[26996]   Constructing the stack frame table...
> DMD[26996] }

Files here https://cloud.mail.ru/public/nfQj/TJVPz4k1i
Thanks for the DMD data! Hey erahm, anything actionable there?
Flags: needinfo?(erahm)
This is super helpful! Unfortunately the DMD report isn't symbolicated, so we can see that all the bytes are coming from the same stack but not what that is:

> Unreported {
>   517,626 blocks in heap block record 1 of 7,571
>   2,120,196,096 bytes (2,120,196,096 requested / 0 slop)
>   Individual block sizes: 4,096 x 517,626
>   97.50% of the heap (97.50% cumulative)
>   98.98% of unreported (98.98% cumulative)
>   Allocated at {
>     #01: replace_malloc[/usr/lib/firefox/libdmd.so +0x6653]
>     #02: ???[/usr/lib/firefox/libxul.so +0x9bbec8]
>     #03: ???[/usr/lib/firefox/libxul.so +0x9bbfd2]
>     #04: ???[/usr/lib/firefox/libxul.so +0x9b93d0]
>     #05: ???[/usr/lib/firefox/libxul.so +0x9ae9fc]
>     #06: ???[/usr/lib/firefox/libxul.so +0xa239c7]
>     #07: ???[/usr/lib/firefox/libxul.so +0x9b65c3]
>     #08: ???[/usr/lib/firefox/libxul.so +0x9b8cc2]
>     #09: ???[/usr/lib/firefox/libxul.so +0x9bc8e5]
>     #10: ???[/usr/lib/firefox/libxul.so +0x9bdaa2]
>     #11: ???[/usr/lib/firefox/libxul.so +0x9d4266]
>     #12: ???[/usr/lib/firefox/libxul.so +0x9d284d]
>     #13: ???[/usr/lib/firefox/libxul.so +0x9eb8e9]
>     #14: ???[/usr/lib/firefox/libxul.so +0xcd266b]
>     #15: ???[/usr/lib/firefox/libxul.so +0xcbc3f6]
>     #16: ???[/usr/lib/firefox/libxul.so +0x9d03c3]
>     #17: ???[/usr/lib/firefox/libnspr4.so +0x23f45]
>     #18: ???[/lib/x86_64-linux-gnu/libpthread.so.0 +0x76ba]
>     #19: clone[/lib/x86_64-linux-gnu/libc.so.6 +0x10682d]
>     #20: ??? (???:???)
>   }
> }
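
The figures in this record are self-consistent, which a quick calculation confirms. A small Python sketch (variable names are illustrative; the numbers are copied from the record above):

```python
blocks = 517_626                # "517,626 blocks in heap block record 1"
block_size = 4_096              # "Individual block sizes: 4,096 x 517,626"
reported_bytes = 2_120_196_096  # "2,120,196,096 bytes"

# Every block has the same size, so the total is a plain product.
total_bytes = blocks * block_size
gib = total_bytes / 2**30
```

The product matches the reported byte count exactly, and it comes to about 1.97 GiB allocated in uniform 4 KiB blocks from a single call site.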

But since this is a local build, lioncub should be able to run the report locally. Can you do a local run of dmd.py [1] and paste the first entry here?

So essentially I want you to run the following from your firefox checkout:

> $OBJDIR/dist/bin/dmd.py dmd-1493966961-26996.json.gz

Here '$OBJDIR' is the directory where we put build artifacts. Its name often starts with 'obj-'. Please let me know if you need help running the command.

[1] https://developer.mozilla.org/en-US/docs/Mozilla/Performance/DMD#Processing_the_output_2
Flags: needinfo?(erahm) → needinfo?(lioncub)
$ /data/firefox/firefox-53.0+build6/obj-x86_64-linux-gnu/dist/bin/dmd.py dmd-1493966961-26996.json.gz
Traceback (most recent call last):
  File "/data/firefox/firefox-53.0+build6/obj-x86_64-linux-gnu/dist/bin/dmd.py", line 889, in <module>
    main()
  File "/data/firefox/firefox-53.0+build6/obj-x86_64-linux-gnu/dist/bin/dmd.py", line 881, in main
    digest = getDigestFromFile(args, args.input_file)
  File "/data/firefox/firefox-53.0+build6/obj-x86_64-linux-gnu/dist/bin/dmd.py", line 257, in getDigestFromFile
    fixStackTraces(inputFile, isZipped, opener)
  File "/data/firefox/firefox-53.0+build6/obj-x86_64-linux-gnu/dist/bin/dmd.py", line 242, in fixStackTraces
    for line in inputFile:
  File "/usr/lib/python2.7/gzip.py", line 464, in readline
    c = self.read(readsize)
  File "/usr/lib/python2.7/gzip.py", line 268, in read
    self._read(readsize)
  File "/usr/lib/python2.7/gzip.py", line 303, in _read
    self._read_gzip_header()
  File "/usr/lib/python2.7/gzip.py", line 197, in _read_gzip_header
    raise IOError, 'Not a gzipped file'
IOError: Not a gzipped file
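
The `IOError` above is raised by Python's gzip module when the file header is not a gzip header, which suggests the copy being read was not a valid gzip file; the thread does not say why. One way to check a file before handing it to dmd.py is to look at its first two bytes. A sketch, where `looks_gzipped` is a hypothetical helper and the temporary files exist only for demonstration:

```python
import gzip
import os
import tempfile

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream (RFC 1952)

def looks_gzipped(path):
    """Return True if the file at `path` starts with the gzip magic bytes."""
    with open(path, "rb") as f:
        return f.read(2) == GZIP_MAGIC

with tempfile.TemporaryDirectory() as tmp:
    good = os.path.join(tmp, "dump.json.gz")
    bad = os.path.join(tmp, "not-really.json.gz")
    with gzip.open(good, "wb") as f:
        f.write(b"{}")                        # a real gzip file
    with open(bad, "wb") as f:
        f.write(b"<html>not a dump</html>")   # wrong content, right extension
    good_ok = looks_gzipped(good)
    bad_ok = looks_gzipped(bad)
```

A file that fails this check, despite its `.json.gz` name, would need to be fetched or copied again.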
Sorry, Comment 14 is not correct.

On the computer where firefox is installed:
>$ ./dmd.py dmd-1493966961-26996.json.gz 
>Traceback (most recent call last):
>  File "./dmd.py", line 889, in <module>
>    main()
>  File "./dmd.py", line 881, in main
>    digest = getDigestFromFile(args, args.input_file)
>  File "./dmd.py", line 257, in getDigestFromFile
>    fixStackTraces(inputFile, isZipped, opener)
>  File "./dmd.py", line 213, in fixStackTraces
>    import fix_linux_stack as fixModule
>ImportError: No module named fix_linux_stack

On the computer where the deb package was created (Firefox is not installed):
https://cloud.mail.ru/public/Ao1c/37JWBs8at
Flags: needinfo?(lioncub)
(In reply to lioncub from comment #15)
> Sorry, Comment 14 is not correct.
> 
> On the computer where firefox is installed:
> >$ ./dmd.py dmd-1493966961-26996.json.gz 
> >Traceback (most recent call last):
> >  File "./dmd.py", line 889, in <module>
> >    main()
> >  File "./dmd.py", line 881, in main
> >    digest = getDigestFromFile(args, args.input_file)
> >  File "./dmd.py", line 257, in getDigestFromFile
> >    fixStackTraces(inputFile, isZipped, opener)
> >  File "./dmd.py", line 213, in fixStackTraces
> >    import fix_linux_stack as fixModule
> >ImportError: No module named fix_linux_stack
> 
> On the computer where the deb package was created (Firefox is not installed):
> https://cloud.mail.ru/public/Ao1c/37JWBs8at

Are you running 'dmd.py' from within the object directory where firefox was built? ie this command:

> $OBJDIR/dist/bin/dmd.py dmd-1493966961-26996.json.gz

That's where fix_linux_stack.py is.
Flags: needinfo?(lioncub)
Yes. On the computer where the deb package was built (Firefox is not installed):
>$ /data/firefox/firefox-53.0+build6/obj-x86_64-linux-gnu/dist/bin/dmd.py ~/dmd-1493966961-26996.json.gz >dmd.txt
>$ tar czf ~/dmd.tgz ~/dmd.txt
The resulting dmd.tgz is here -> https://cloud.mail.ru/public/Ao1c/37JWBs8at
Flags: needinfo?(lioncub)
> dmd.tgz this -> https://cloud.mail.ru/public/Ao1c/37JWBs8at

erahm, any comments? Thanks
Flags: needinfo?(erahm)
We think this is probably a backpressure problem while uploading. Bug 1277681 stopped us from crashing outright, but now we are most likely reading data faster than we can transfer it.

Patrick, bkelly thinks this is probably a blob upload causing backpressure. Do we have any bugs on file to fix that? Also are there any bugs on file to report this memory? It would help a lot in identifying these issues in the future.
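
The effect described here (reading faster than sending, so the buffer grows) can be illustrated with a toy simulation. This is a sketch only: the function `simulate`, its parameters, and the tick model are all hypothetical and say nothing about how Necko actually buffers uploads.

```python
def simulate(read_per_tick, send_per_tick, total, high_water=None):
    """Return the peak number of buffered bytes while moving `total` bytes.

    With high_water=None the reader never pauses (no backpressure).
    Otherwise the reader skips a tick whenever the buffer already holds
    at least `high_water` bytes.
    """
    buffered = peak = read = 0
    while read < total or buffered > 0:
        if read < total and (high_water is None or buffered < high_water):
            n = min(read_per_tick, total - read)
            buffered += n
            read += n
        peak = max(peak, buffered)
        buffered -= min(send_per_tick, buffered)  # the slower network side
    return peak

peak_unbounded = simulate(read_per_tick=4, send_per_tick=1, total=100)
peak_bounded = simulate(read_per_tick=4, send_per_tick=1, total=100,
                        high_water=8)
```

Without a high-water mark the buffer peaks at 76 of the 100 bytes; with the mark set to 8 it never holds more than 11. Pausing the reader bounds memory use no matter how large the upload is, provided the sender does not need the whole body up front.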
Component: Untriaged → Networking
Flags: needinfo?(erahm) → needinfo?(mcmanus)
Product: Firefox → Core
I think valentin might be in a good position to help you. bkelly's suggestion rings true to me fwiw.
Flags: needinfo?(mcmanus) → needinfo?(valentin.gosu)
Well, AFAIK we don't support chunked encoding uploads, right?  So necko needs the entire body in the parent before it will send the request so that it can set the content-length correctly.  So I don't think any amount of back pressure will work here.

We might want to store exceptionally large upload data in temp files during upload.
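
To make the Content-Length constraint concrete, here is a sketch of the two HTTP/1.1 body framings. It only illustrates the wire format and is not Necko code; both function names are made up, and whether a given server accepts chunked request bodies is a separate question.

```python
def fixed_length_body(chunks):
    """Buffer every chunk so Content-Length can be sent up front."""
    body = b"".join(chunks)  # the whole upload sits in memory here
    header = b"Content-Length: " + str(len(body)).encode("ascii") + b"\r\n\r\n"
    return header + body

def chunked_body(chunks):
    """Yield the body in HTTP/1.1 chunked framing, one piece at a time."""
    yield b"Transfer-Encoding: chunked\r\n\r\n"
    for chunk in chunks:
        if chunk:  # a zero-length chunk would terminate the body early
            yield b"%x\r\n" % len(chunk) + chunk + b"\r\n"
    yield b"0\r\n\r\n"  # last-chunk marker, no trailers

pieces = [b"hello", b"world!"]
fixed = fixed_length_body(pieces)
chunked = b"".join(chunked_body(pieces))
```

The fixed-length form cannot emit its header until the total size is known, which is why the sender has to hold (or be able to measure) the full body first. The chunked form prefixes each piece with its own size, so nothing needs to be accumulated.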
(In reply to Ben Kelly [reviewing, but slowly][:bkelly] from comment #21)
> Well, AFAIK we don't support chunked encoding uploads, right?  So necko
> needs the entire body in the parent before it will send the request so that
> it can set the content-length correctly.  So I don't think any amount of
> back pressure will work here.

It does seem so. I'll start working on this next week.

> We might want to store exceptionally large upload data in temp files during
> upload.

Why would that help?
Assignee: nobody → valentin.gosu
Depends on: 1367375
Flags: needinfo?(valentin.gosu)
Whiteboard: [MemShrink:P1] → [MemShrink:P1][necko-active]
(In reply to Valentin Gosu [:valentin] from comment #22)
> > We might want to store exceptionally large upload data in temp files during
> > upload.
> 
> Why would that help?

From talking with Patrick in the past, my impression is that chunked-encoded uploads are only possible for h2.  It's not a general solution because we can't support it on h1.x.

Storing a large upload body in a file would simply be a way to get a fixed length stream that doesn't consume application memory.  I was thinking we could trigger this mode if we exceeded some threshold (100MB?) while trying to accumulate a fixed-length upload stream.
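
A rough Python sketch of that idea, under stated assumptions: the function name `accumulate_upload`, its return shape, and the default threshold are hypothetical, with the 100 MB figure taken from the tentative number above. It is not how Necko is or would be implemented.

```python
import io
import tempfile

def accumulate_upload(chunks, threshold=100 * 1024 * 1024):
    """Collect chunks into a fixed-length stream, spilling to a temp file
    once more than `threshold` bytes have been buffered.

    Returns (stream, length, spilled); `stream` is positioned at offset 0.
    """
    stream = io.BytesIO()
    length = 0
    spilled = False
    for chunk in chunks:
        length += len(chunk)
        if not spilled and length > threshold:
            # Move what has been buffered so far to disk, then keep
            # appending there instead of in memory.
            disk = tempfile.TemporaryFile()
            disk.write(stream.getvalue())
            stream, spilled = disk, True
        stream.write(chunk)
    stream.seek(0)
    return stream, length, spilled

small_stream, small_len, small_spilled = accumulate_upload(
    [b"a" * 10], threshold=100)
big_stream, big_len, big_spilled = accumulate_upload(
    [b"a" * 60, b"b" * 60], threshold=100)
```

Either way the caller ends up with a stream of known length, so Content-Length can still be set, but a large body no longer sits in application memory. Python's standard library offers similar behaviour through `tempfile.SpooledTemporaryFile` and its `max_size` argument.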
See Also: → 1380105
Bulk priority update: https://bugzilla.mozilla.org/show_bug.cgi?id=1399258
Priority: -- → P1
Assignee: valentin.gosu → nobody
Priority: P1 → P2
Whiteboard: [MemShrink:P1][necko-active] → [MemShrink:P1][necko-next]
Valentin, do you know if this is still an issue we need to look into?
Flags: needinfo?(valentin.gosu)
I don't think the situation has changed here.
Junior, any chance the backpressure mechanism from bug 1280629 could be applied for uploads as well (in reverse)?
Flags: needinfo?(valentin.gosu) → needinfo?(juhsu)
(In reply to Eric Rahm [:erahm] (ni? for review in phab) from comment #26)
> Valentin, do you know if this is still an issue we need to look into?

Although this might not be an issue anymore, as per: Bug 1380105 comment #6
> I've supposed it is solved. There is not a bug anymore since version 56.*.

I'd also ask the reporter if they could reproduce the bug, but their account seems to be disabled.
(In reply to Valentin Gosu [:valentin] from comment #27)
> I don't think the situation has changed here.
> Junior, any chance the backpressure mechanism from bug 1280629 could be
> applied for uploads as well (in reverse)?

Per Comment 21, the memory consumption is not caused by back pressure.
Moreover, back pressure control doesn't mitigate the leak, which seems to be gone.
Flags: needinfo?(juhsu)
Closing this WORKSFORME per comment #28.
Status: UNCONFIRMED → RESOLVED
Closed: 5 years ago
Resolution: --- → WORKSFORME