Closed Bug 1318262 Opened 8 years ago Closed 7 years ago

Memory leak and OOM crash in ASAN Nightly while idling on drudgereport.com

Categories

(Core :: Memory Allocator, defect)

x86_64
Linux
defect
Not set
critical

Tracking

()

RESOLVED INCOMPLETE
Tracking Status
firefox53 --- affected

People

(Reporter: geeknik, Unassigned)

Details

(Keywords: crash, Whiteboard: [MemShrink])

Attachments

(1 file, 2 obsolete files)

ASAN Nightly Build ID 20161117010803

STR:
1) Launch ASAN Nightly with a clean profile
2) Visit http://drudgereport.com
3) Walk away for approx 90 minutes and come back to a crash

Drudge Report refreshes the page every X seconds, and Nightly uses more and more memory with each successive refresh until it crashes:

==49744==ERROR: AddressSanitizer failed to allocate 0x10000 (65536) bytes of memory at address 0x6250018e0 (error code: 12)
ERROR: Failed to mmap
[Child 49783] ###!!! ABORT: Aborting on channel error.: file /home/worker/workspace/build/src/ipc/glue/MessageChannel.cpp, line 2148
[Child 49783] ###!!! ABORT: Aborting on channel error.: file /home/worker/workspace/build/src/ipc/glue/MessageChannel.cpp, line 2148
ASAN:DEADLYSIGNAL
=================================================================
==49783==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000000 (pc 0x0000004e114b bp 0x7f1cda979090 sp 0x7f1cda979080 T2)

###!!! [Child][MessageChannel] Error: (msgtype=0x2,name=PAPZCTreeManager::Msg_ContentReceivedInputBlock) Channel error: cannot send/recv

    #0 0x4e114a in mozalloc_abort(char const*) /home/worker/workspace/build/src/memory/mozalloc/mozalloc_abort.cpp:33:5
    #1 0x7f1cdd575df5 in Abort(char const*) /home/worker/workspace/build/src/xpcom/base/nsDebugImpl.cpp:449:3
    #2 0x7f1cdd575b9c in NS_DebugBreak /home/worker/workspace/build/src/xpcom/base/nsDebugImpl.cpp:405:7
    #3 0x7f1cde4da6ef in mozilla::ipc::MessageChannel::OnChannelErrorFromLink() /home/worker/workspace/build/src/ipc/glue/MessageChannel.cpp:2148:13
    #4 0x7f1cde4df993 in OnChannelError /home/worker/workspace/build/src/ipc/glue/MessageLink.cpp:367:5
    #5 0x7f1cde4df993 in non-virtual thunk to mozilla::ipc::ProcessLink::OnChannelError() /home/worker/workspace/build/src/ipc/glue/MessageLink.cpp:359
    #6 0x7f1cde495c3b in event_process_active_single_queue /home/worker/workspace/build/src/ipc/chromium/src/third_party/libevent/event.c:1350:4
    #7 0x7f1cde495c3b in event_process_active /home/worker/workspace/build/src/ipc/chromium/src/third_party/libevent/event.c:1420
    #8 0x7f1cde495c3b in event_base_loop /home/worker/workspace/build/src/ipc/chromium/src/third_party/libevent/event.c:1621
    #9 0x7f1cde4550d1 in base::MessagePumpLibevent::Run(base::MessagePump::Delegate*) /home/worker/workspace/build/src/ipc/chromium/src/base/message_pump_libevent.cc:364:7
    #10 0x7f1cde44f538 in RunInternal /home/worker/workspace/build/src/ipc/chromium/src/base/message_loop.cc:232:3
    #11 0x7f1cde44f538 in RunHandler /home/worker/workspace/build/src/ipc/chromium/src/base/message_loop.cc:225
    #12 0x7f1cde44f538 in MessageLoop::Run() /home/worker/workspace/build/src/ipc/chromium/src/base/message_loop.cc:205
    #13 0x7f1cde46f6e1 in base::Thread::ThreadMain() /home/worker/workspace/build/src/ipc/chromium/src/base/thread.cc:180:3
    #14 0x7f1cde47023c in ThreadFunc(void*) /home/worker/workspace/build/src/ipc/chromium/src/base/platform_thread_posix.cc:38:3
    #15 0x7f1cf814b0a3 in start_thread /build/glibc-daoqzt/glibc-2.19/nptl/pthread_create.c:309
    #16 0x7f1cf725262c in clone /build/glibc-daoqzt/glibc-2.19/misc/../sysdeps/unix/sysv/linux/x86_64/clone.S:111

AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV /home/worker/workspace/build/src/memory/mozalloc/mozalloc_abort.cpp:33:5 in mozalloc_abort(char const*)
Thread T2 (Chrome_ChildThr) created by T0 (Web Content) here:
    #0 0x49a869 in __interceptor_pthread_create /builds/slave/moz-toolchain/src/llvm/projects/compiler-rt/lib/asan/asan_interceptors.cc:238:3
    #1 0x7f1cde46f2fb in CreateThread /home/worker/workspace/build/src/ipc/chromium/src/base/platform_thread_posix.cc:137:14
    #2 0x7f1cde46f2fb in Create /home/worker/workspace/build/src/ipc/chromium/src/base/platform_thread_posix.cc:148
    #3 0x7f1cde46f2fb in base::Thread::StartWithOptions(base::Thread::Options const&) /home/worker/workspace/build/src/ipc/chromium/src/base/thread.cc:98
    #4 0x7f1cde4e1b77 in mozilla::ipc::ProcessChild::ProcessChild(int) /home/worker/workspace/build/src/ipc/glue/ProcessChild.cpp:24:5
    #5 0x7f1ce5c81d39 in ContentProcess /home/worker/workspace/build/src/obj-firefox/dist/include/mozilla/dom/ContentProcess.h:31:7
    #6 0x7f1ce5c81d39 in XRE_InitChildProcess /home/worker/workspace/build/src/toolkit/xre/nsEmbedFunctions.cpp:619
    #7 0x4dfb5b in content_process_main /home/worker/workspace/build/src/browser/app/../../ipc/contentproc/plugin-container.cpp:197:19
    #8 0x4dfb5b in main /home/worker/workspace/build/src/browser/app/nsBrowserApp.cpp:438
    #9 0x7f1cf718bb44 in __libc_start_main /build/glibc-daoqzt/glibc-2.19/csu/libc-start.c:287

==49783==ABORTING

real	79m22.088s
user	2m29.460s
sys	5m0.448s

Happens pretty consistently.
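For reference, the `error code: 12` in the "failed to allocate" line above looks like a raw `errno` value, which on Linux is `ENOMEM`. Here is a minimal Python sketch for decoding it, assuming the number really is an errno; the helper name `describe_errno` is hypothetical and not part of any tool used in this bug:

```python
import errno
import os


def describe_errno(code: int) -> str:
    """Return 'NAME: message' for a numeric errno value on this platform."""
    # errno.errorcode maps numeric codes to symbolic names such as 'ENOMEM'.
    name = errno.errorcode.get(code, "UNKNOWN")
    # os.strerror returns the platform's human-readable message for the code.
    return f"{name}: {os.strerror(code)}"


if __name__ == "__main__":
    # 12 is the code reported in the ASan output above.
    print(describe_errno(12))
```

The exact message text depends on the platform's C library, so only the symbolic name is worth relying on.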
Component: Untriaged → Memory Allocator
Product: Firefox → Core
I'll see if I can repro. I'd guess we're going to see a ton of orphaned nodes. cc'ing mccr8 in case the cycle collector (CC) comes into play.
Whiteboard: [MemShrink]
Can you reproduce this outside of ASan? What does the memory report saved from about:memory look like?
Attached file cc log (obsolete)
Attached file gc log (obsolete)
I hope these are what you were looking for.
Disregard these logs; I attached them to the wrong bug. ): I'll get back to you on your request.
On a Windows 8.1 machine, I left the 16 November 2016 build of Nightly x64 running overnight. When I came back to it today, Drudge Report was still loaded and the browser was a little sluggish but responsive; it definitely did not crash.
Attachment #8811922 - Attachment is obsolete: true
Attachment #8811923 - Attachment is obsolete: true
I did a 20 minute run and didn't see anything of note in the memory reports. I did run into a crash about 10 minutes later, but that has since been fixed (bug 1317415). I'll do another test run once I get a fresh build up.
In this case, I haven't seen the OOM crash happen quicker than 60 or 70 minutes. I haven't run into that other crash on this website or any others (just while fuzzing Nightly).
Okay, I ran overnight but couldn't repro an OOM (or high memory). I'm on a Linux64 debug build, so I suppose it's possible there's some poor interaction with ASAN.
Interestingly enough, while going through 10000 testcases generated with BF3 overnight, loading 1 per second, the browser OOM'd and crashed after 79m 23.406s (compared to 79m 22.088s with just leaving DrudgeReport open in the browser). If it matters, I'm running these tests in a Debian 8 x64 VMware instance with 4GB RAM, 3 vCPUs, and 20GB disk allocated. I kind of want to call this the 79 minute bug now. (;

==12536==ERROR: AddressSanitizer failed to allocate 0x8000 (32768) bytes of SetAlternateSignalStack (error code: 12)
ERROR: Failed to mmap
[Child 12620] ###!!! ABORT: Aborting on channel error.: file /home/worker/workspace/build/src/ipc/glue/MessageChannel.cpp, line 2148
[Child 12620] ###!!! ABORT: Aborting on channel error.: file /home/worker/workspace/build/src/ipc/glue/MessageChannel.cpp, line 2148
ASAN:DEADLYSIGNAL
=================================================================
==12620==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000000 (pc 0x0000004e114b bp 0x7f89eb398090 sp 0x7f89eb398080 T2)
    #0 0x4e114a in mozalloc_abort(char const*) /home/worker/workspace/build/src/memory/mozalloc/mozalloc_abort.cpp:33:5
    #1 0x7f89edf58455 in Abort(char const*) /home/worker/workspace/build/src/xpcom/base/nsDebugImpl.cpp:449:3
    #2 0x7f89edf581fc in NS_DebugBreak /home/worker/workspace/build/src/xpcom/base/nsDebugImpl.cpp:405:7
    #3 0x7f89eeebce2f in mozilla::ipc::MessageChannel::OnChannelErrorFromLink() /home/worker/workspace/build/src/ipc/glue/MessageChannel.cpp:2148:13
    #4 0x7f89eeec20d3 in OnChannelError /home/worker/workspace/build/src/ipc/glue/MessageLink.cpp:367:5
    #5 0x7f89eeec20d3 in non-virtual thunk to mozilla::ipc::ProcessLink::OnChannelError() /home/worker/workspace/build/src/ipc/glue/MessageLink.cpp:359
    #6 0x7f89eee7837b in event_process_active_single_queue /home/worker/workspace/build/src/ipc/chromium/src/third_party/libevent/event.c:1350:4
    #7 0x7f89eee7837b in event_process_active /home/worker/workspace/build/src/ipc/chromium/src/third_party/libevent/event.c:1420
    #8 0x7f89eee7837b in event_base_loop /home/worker/workspace/build/src/ipc/chromium/src/third_party/libevent/event.c:1621
    #9 0x7f89eee37811 in base::MessagePumpLibevent::Run(base::MessagePump::Delegate*) /home/worker/workspace/build/src/ipc/chromium/src/base/message_pump_libevent.cc:364:7
    #10 0x7f89eee31c78 in RunInternal /home/worker/workspace/build/src/ipc/chromium/src/base/message_loop.cc:232:3
    #11 0x7f89eee31c78 in RunHandler /home/worker/workspace/build/src/ipc/chromium/src/base/message_loop.cc:225
    #12 0x7f89eee31c78 in MessageLoop::Run() /home/worker/workspace/build/src/ipc/chromium/src/base/message_loop.cc:205
    #13 0x7f89eee51e21 in base::Thread::ThreadMain() /home/worker/workspace/build/src/ipc/chromium/src/base/thread.cc:180:3
    #14 0x7f89eee5297c in ThreadFunc(void*) /home/worker/workspace/build/src/ipc/chromium/src/base/platform_thread_posix.cc:38:3
    #15 0x7f8a088c30a3 in start_thread /build/glibc-daoqzt/glibc-2.19/nptl/pthread_create.c:309
    #16 0x7f8a079ca62c in clone /build/glibc-daoqzt/glibc-2.19/misc/../sysdeps/unix/sysv/linux/x86_64/clone.S:111

AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV /home/worker/workspace/build/src/memory/mozalloc/mozalloc_abort.cpp:33:5 in mozalloc_abort(char const*)
Thread T2 (Chrome_ChildThr) created by T0 (Web Content) here:
    #0 0x49a869 in __interceptor_pthread_create /builds/slave/moz-toolchain/src/llvm/projects/compiler-rt/lib/asan/asan_interceptors.cc:238:3
    #1 0x7f89eee51a3b in CreateThread /home/worker/workspace/build/src/ipc/chromium/src/base/platform_thread_posix.cc:137:14
    #2 0x7f89eee51a3b in Create /home/worker/workspace/build/src/ipc/chromium/src/base/platform_thread_posix.cc:148
    #3 0x7f89eee51a3b in base::Thread::StartWithOptions(base::Thread::Options const&) /home/worker/workspace/build/src/ipc/chromium/src/base/thread.cc:98
    #4 0x7f89eeec42b7 in mozilla::ipc::ProcessChild::ProcessChild(int) /home/worker/workspace/build/src/ipc/glue/ProcessChild.cpp:24:5
    #5 0x7f89f6624b69 in ContentProcess /home/worker/workspace/build/src/obj-firefox/dist/include/mozilla/dom/ContentProcess.h:31:7
    #6 0x7f89f6624b69 in XRE_InitChildProcess /home/worker/workspace/build/src/toolkit/xre/nsEmbedFunctions.cpp:619
    #7 0x4dfb5b in content_process_main /home/worker/workspace/build/src/browser/app/../../ipc/contentproc/plugin-container.cpp:197:19
    #8 0x4dfb5b in main /home/worker/workspace/build/src/browser/app/nsBrowserApp.cpp:438
    #9 0x7f8a07903b44 in __libc_start_main /build/glibc-daoqzt/glibc-2.19/csu/libc-start.c:287

==12620==ABORTING

real	79m23.406s
user	10m44.976s
sys	5m27.472s
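To put a number on how close the two runs are, the `real` lines from both timing outputs can be converted to seconds and compared. This is only a rough sketch: the helper name `parse_time_line` is hypothetical, and it assumes the `XmY.YYYs` format shown in the two outputs in this bug.

```python
import re

# Matches lines such as "real\t79m22.088s" from the shell's `time` output.
_TIME_RE = re.compile(
    r"^(?P<label>real|user|sys)\s+(?P<min>\d+)m(?P<sec>\d+(?:\.\d+)?)s$"
)


def parse_time_line(line: str) -> tuple[str, float]:
    """Return (label, total_seconds) for one line of `time` output."""
    match = _TIME_RE.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognized time line: {line!r}")
    seconds = int(match["min"]) * 60 + float(match["sec"])
    return match["label"], seconds


if __name__ == "__main__":
    # Wall-clock times copied from the two crash reports in this bug.
    _, idle_run = parse_time_line("real\t79m22.088s")
    _, fuzz_run = parse_time_line("real\t79m23.406s")
    print(f"difference: {fuzz_run - idle_run:.3f} s")  # prints "difference: 1.318 s"
```

Two very different workloads dying within about 1.3 seconds of each other suggests the crash time is set by something other than the page content, though two data points are not enough to say what.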
Why do you think this is an OOM? This looks like an IPC issue; maybe we're leaking shared memory?
Flags: needinfo?(geeknik)
Well, when I'm awake and running the tests, I watch free memory on the VM drop steadily from 3GB to 0 while swap usage climbs from 0 to 1GB, and that's about when the crash normally happens. Here is an about:memory measurement as we creep closer to the 79 minute mark.
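For anyone who wants to capture the same trend without watching the VM by hand, here is a rough sketch of a logger that samples `/proc/meminfo` once a minute. It assumes a Linux guest; the function names `parse_meminfo`, `sample` and `watch` are hypothetical, while `MemFree`, `SwapTotal` and `SwapFree` are standard `/proc/meminfo` fields reported in kB.

```python
import time


def parse_meminfo(text: str) -> dict[str, int]:
    """Parse /proc/meminfo-style text into {field_name: value}."""
    fields = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines without a "Name: value" shape
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[key.strip()] = int(parts[0])
    return fields


def sample(path: str = "/proc/meminfo") -> tuple[int, int]:
    """Return (free_memory_kb, swap_used_kb) from the given meminfo file."""
    with open(path) as f:
        info = parse_meminfo(f.read())
    swap_used = info.get("SwapTotal", 0) - info.get("SwapFree", 0)
    return info.get("MemFree", 0), swap_used


def watch(interval_s: float = 60.0, samples: int = 90) -> None:
    """Print one timestamped sample per interval, for a bounded number of samples."""
    for _ in range(samples):
        free_kb, swap_used_kb = sample()
        stamp = time.strftime("%H:%M:%S")
        print(f"{stamp} MemFree={free_kb} kB SwapUsed={swap_used_kb} kB", flush=True)
        time.sleep(interval_s)
```

Calling `watch()` in a second terminal while the browser idles would log roughly 90 minutes of samples, which could be attached alongside the about:memory report.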
Flags: needinfo?(geeknik)
We think this is isolated to an interaction with ASAN builds, so I'm not sure there's anything we can do.
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → INCOMPLETE
I don't think this is limited to just ASAN builds. While running 32-bit Windows Firefox 50.1.0 under WinDbg with ApplicationVerifier, I've managed to generate `A heap allocation was leaked` messages after 60-90 minutes of having one tab open to drudgereport.com. Working on generating more useful information.
Here's what I've got so far. It looks like it might be related to D3D9, DXVA and/or how we interface with igdumd32.dll:

=======================================
VERIFIER STOP 00000900: pid 0xE44: A heap allocation was leaked. 

	D1E4EF18 : Address of the leaked allocation. Run !heap -p -a <address> to get additional information about the allocation.
	FE367F84 : Address to the allocation stack trace. Run dps <address> to view the allocation stack.
	E7240FE0 : Address of the owner dll name. Run du <address> to read the dll name.
	6F770000 : Base of the owner dll. Run .reload <dll_name> = <address> to reload the owner dll. Use 'lm' to get more information about the loaded and unloaded modules.


=======================================
This verifier stop is continuable.
After debugging it use `go' to continue.

=======================================

(e44.1f84): Break instruction exception - code 80000003 (first chance)
eax=ebb03fc0 ebx=00000000 ecx=000001a1 edx=000000e6 esi=66a52e70 edi=d1e4ef18
eip=66af37f9 esp=0b16f160 ebp=0b16f374 iopl=0         nv up ei pl nz na po nc
cs=0023  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00000202
vrfcore!VerifierStopMessageEx+0x599:
66af37f9 cc              int     3
1:072> !heap -p -a D1E4EF18
    address d1e4ef18 found in
    _DPH_HEAP_ROOT @ fe111000
    in busy allocation (  DPH_HEAP_BLOCK:         UserAddr         UserSize -         VirtAddr         VirtSize)
                                d1d008bc:         d1e4ef18               e4 -         d1e4e000             2000
    61979c2c verifier!AVrfDebugPageHeapAllocate+0x0000023c
    7727f093 ntdll!RtlDebugAllocateHeap+0x0000003c
    771e81c6 ntdll!RtlpAllocateHeap+0x00001676
    771e5c95 ntdll!RtlpAllocateHeapInternal+0x00000155
    771e5b32 ntdll!RtlAllocateHeap+0x00000032
    66afaeaa vrfcore!VfCoreRtlAllocateHeap+0x0000002a
    66a4572c vfbasics!AVrfpRtlAllocateHeap+0x000000dc
    7515d122 KERNELBASE!LocalAlloc+0x00000082
    66a46316 vfbasics!AVrfpLocalAlloc+0x000000a6
*** ERROR: Symbol file could not be found.  Defaulted to export symbols for C:\WINDOWS\SysWOW64\igdumd32.dll - 
    6f771201 igdumd32!OpenAdapter+0x00000051
    6b6b822a d3d9!CreateDeviceLHDDI+0x00000459
    6b6b8ee4 d3d9!D3D9CreateDirectDrawObject+0x00000182
    6b6b6480 d3d9!FetchDirectDrawData+0x00000103
    6b6b91b9 d3d9!InternalDirectDrawCreate+0x00000180
    6b6a7274 d3d9!CEnum::CEnum+0x00000245
    6b6a1c6a d3d9!Direct3DCreate9Impl+0x00000090
    6b6a1bd0 d3d9!Direct3DCreate9Ex+0x00000020
    5d1b6b78 xul!mozilla::D3D9DXVA2Manager::Init+0x000000aa [c:\builds\moz2_slave\m-rel-w32-00000000000000000000\build\src\dom\media\platforms\wmf\dxva2manager.cpp @ 285]
    5d1b5e4d xul!mozilla::DXVA2Manager::CreateD3D9DXVA+0x0000004b [c:\builds\moz2_slave\m-rel-w32-00000000000000000000\build\src\dom\media\platforms\wmf\dxva2manager.cpp @ 500]
    5c0cd737 xul!mozilla::ipc::MessagePump::Run+0x00000072 [c:\builds\moz2_slave\m-rel-w32-00000000000000000000\build\src\ipc\glue\messagepump.cpp @ 100]

 
1:072> dps FE367F84
fe367f84  fe27b80c
fe367f88  0000a004
fe367f8c  00140000
fe367f90  61979c2c verifier!AVrfDebugPageHeapAllocate+0x23c
fe367f94  7727f093 ntdll!RtlDebugAllocateHeap+0x3c
fe367f98  771e81c6 ntdll!RtlpAllocateHeap+0x1676
fe367f9c  771e5c95 ntdll!RtlpAllocateHeapInternal+0x155
fe367fa0  771e5b32 ntdll!RtlAllocateHeap+0x32
fe367fa4  66afaeaa vrfcore!VfCoreRtlAllocateHeap+0x2a
fe367fa8  66a4572c vfbasics!AVrfpRtlAllocateHeap+0xdc
fe367fac  7515d122 KERNELBASE!LocalAlloc+0x82
fe367fb0  66a46316 vfbasics!AVrfpLocalAlloc+0xa6
fe367fb4  6f771201 igdumd32!OpenAdapter+0x51
fe367fb8  6b6b822a d3d9!CreateDeviceLHDDI+0x459
fe367fbc  6b6b8ee4 d3d9!D3D9CreateDirectDrawObject+0x182
fe367fc0  6b6b6480 d3d9!FetchDirectDrawData+0x103
fe367fc4  6b6b91b9 d3d9!InternalDirectDrawCreate+0x180
fe367fc8  6b6a7274 d3d9!CEnum::CEnum+0x245
fe367fcc  6b6a1c6a d3d9!Direct3DCreate9Impl+0x90
fe367fd0  6b6a1bd0 d3d9!Direct3DCreate9Ex+0x20
fe367fd4  5d1b6b78 xul!mozilla::D3D9DXVA2Manager::Init+0xaa [c:\builds\moz2_slave\m-rel-w32-00000000000000000000\build\src\dom\media\platforms\wmf\dxva2manager.cpp @ 285]
fe367fd8  5d1b5e4d xul!mozilla::DXVA2Manager::CreateD3D9DXVA+0x4b [c:\builds\moz2_slave\m-rel-w32-00000000000000000000\build\src\dom\media\platforms\wmf\dxva2manager.cpp @ 500]
fe367fdc  5c0cd737 xul!mozilla::ipc::MessagePump::Run+0x72 [c:\builds\moz2_slave\m-rel-w32-00000000000000000000\build\src\ipc\glue\messagepump.cpp @ 100]
fe367fe0  00000000
fe367fe4  00000000
fe367fe8  0000a001
fe367fec  00150000
fe367ff0  61979e42 verifier!AVrfDebugPageHeapFree+0xc2
fe367ff4  7727f89e ntdll!RtlDebugFreeHeap+0x3c
fe367ff8  771e995d ntdll!RtlpFreeHeap+0xd3d
fe367ffc  771e8bd8 ntdll!RtlFreeHeap+0x758
fe368000  66afaf7b vrfcore!VfCoreRtlFreeHeap+0x2b


1:072> du E7240FE0
e7240fe0  "igdumd32.dll"