Closed Bug 878449 Opened 11 years ago Closed 11 years ago

Fix mismatch between CoInitialize and CoUninitialize causing crash in mozilla::widget::JumpListBuilder::AddListToBuild @ CCliModalLoop::CCliModalLoop

Categories

(Core :: Widget: Win32, defect)

23 Branch
All
Windows 7
defect
Not set
critical

Tracking

()

VERIFIED FIXED
mozilla24
Tracking Status
firefox21 --- unaffected
firefox22 + unaffected
firefox23 + verified
firefox24 --- verified

People

(Reporter: scoobidiver, Assigned: bbondy)

References

()

Details

(4 keywords, Whiteboard: win7 SP1 only use URL)

Crash Data

Attachments

(1 file, 1 obsolete file)

It first showed up in 24.0a1/20130531. The regression range is:
http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=f66d956d188e&tochange=3c6f2394995d
It's likely a regression from bug 870529.

Signature 	CCliModalLoop::CCliModalLoop(unsigned long, unsigned long, unsigned long, int) More Reports Search
UUID	af19298b-ef69-4b17-a0d8-a1c8e2130531
Date Processed	2013-05-31 22:55:47
Uptime	127
Last Crash	12.1 hours before submission
Install Age	1.4 hours since version was first installed.
Install Time	2013-05-31 21:32:57
Product	Firefox
Version	24.0a1
Build ID	20130531031002
Release Channel	nightly
OS	Windows NT
OS Version	6.1.7601 Service Pack 1
Build Architecture	x86
Build Architecture Info	GenuineIntel family 15 model 4 stepping 9
Crash Reason	EXCEPTION_ACCESS_VIOLATION_READ
Crash Address	0x8
App Notes 	
AdapterVendorID: 0x8086, AdapterDeviceID: 0x2572, AdapterSubsysID: 00000000, AdapterDriverVersion: 6.1.7600.16385
D3D10 Layers? D3D10 Layers- D3D9 Layers? D3D9 Layers- WebGL? EGL? EGL- WebGL- 
Processor Notes 	sp-processor06_phx1_mozilla_com_2935:2012
EMCheckCompatibility	True
Adapter Vendor ID	0x8086
Adapter Device ID	0x2572
Total Virtual Memory	2147352576
Available Virtual Memory	1523089408
System Memory Use Percentage	71
Available Page File	1148592128
Available Physical Memory	302206976

Bugzilla - Report this bug in Firefox, Core, Plug-Ins, or Toolkit
Crashing Thread
Frame 	Module 	Signature 	Source
0 	ole32.dll 	CCliModalLoop::CCliModalLoop 	
1 	ole32.dll 	CAptRpcChnl::SendReceive 	
2 	ole32.dll 	CCtxComChnl::SendReceive 	
3 	ole32.dll 	NdrExtpProxySendReceive 	
4 	rpcrt4.dll 	NdrpProxySendReceive 	
5 	rpcrt4.dll 	NdrClientCall2 	
6 	ole32.dll 	ObjectStublessClient 	
7 	ole32.dll 	ObjectStubless 	
8 	xul.dll 	mozilla::widget::JumpListBuilder::AddListToBuild 	widget/windows/JumpListBuilder.cpp:286
9 	xul.dll 	NS_InvokeByIndex 	xpcom/reflect/xptcall/src/md/win32/xptcinvoke.cpp:70
10 	xul.dll 	XPC_WN_CallMethod 	js/xpconnect/src/XPCWrappedNativeJSOps.cpp:1485
11 	mozjs.dll 	js::Invoke 	js/src/jsinterp.cpp:395
12 	mozjs.dll 	js::Interpret 	js/src/jsinterp.cpp:2219
13 	mozjs.dll 	js::RunScript 	js/src/jsinterp.cpp:352
14 	mozjs.dll 	js::Invoke 	js/src/jsinterp.cpp:441
15 	mozjs.dll 	JS_CallFunctionValue 	js/src/jsapi.cpp:5883
...	

More reports at:
https://crash-stats.mozilla.com/report/list?signature=CCliModalLoop%3A%3ACCliModalLoop%28unsigned+long%2C+unsigned+long%2C+unsigned+long%2C+int%29
Bug 870529 is just re-enabling the jump list functionality that was accidentally disabled by bug 755724.
Version: 24 Branch → 23 Branch
There are reports of this before as well, just not while the platform resource splitting bug has landed.
(In reply to Brian R. Bondy [:bbondy] from comment #3)
> There are reports of this before as well, just not while the platform
> resource splitting bug has landed.
They don't have the above stack trace.
There are about 30 crashes per build in 24.0a1 (#11 crasher over the last day) and in 23.0a2 (#7 crasher over the last day) so a top crasher.
Keywords: topcrash
Bug 870529 was uplifted to Beta 4.
Version: 23 Branch → 22 Branch
I see in a report comment "I added a bookmark that would allow me to launch an explorer window. When I click it, the browser crashes."
Any idea how to do this ?
I found and tried something using the IE Tab addon: http://onlineitpro.com/?p=385 ,
but no sign of crash.
It happens only on Windows 7 SP1.

It's correlated to the MP3 decoder in 23.0a2:
  CCliModalLoop::CCliModalLoop(unsigned long, unsigned long, unsigned long, int)|EXCEPTION_ACCESS_VIOLATION_READ (86 crashes)
    100% (86/86) vs.  15% (327/2125) MP3DMOD.DLL (Microsoft MP3 Decoder DMO)
        100% (86/86) vs.  13% (278/2125) 6.1.7600.16385
    100% (86/86) vs.  19% (397/2125) mlang.dll (Multi Language Support DLL)
        100% (86/86) vs.  16% (338/2125) 6.1.7600.16385
     91% (78/86) vs.   9% (201/2125) libGLESv2.dll (ANGLE libGLESv2 Dynamic Link Library)
         67% (58/86) vs.   5% (109/2125) 23.0.0.4902
         23% (20/86) vs.   1% (29/2125) 23.0.0.4903
     91% (78/86) vs.   9% (201/2125) libEGL.dll (ANGLE libEGL Dynamic Link Library)
         67% (58/86) vs.   5% (109/2125) 23.0.0.4902
         23% (20/86) vs.   1% (29/2125) 23.0.0.4903
     91% (78/86) vs.   9% (201/2125) D3DCompiler_43.dll (Direct3D HLSL Compiler 9.29.952.3111)
Summary: crash in mozilla::widget::JumpListBuilder::AddListToBuild @ CCliModalLoop::CCliModalLoop → [Win7 SP1] crash in mozilla::widget::JumpListBuilder::AddListToBuild @ CCliModalLoop::CCliModalLoop
Assignee: nobody → netzen
(In reply to Scoobidiver from comment #6)
> Bug 870529 was uplifted to Beta 4.
Apparently the real regression is in 23.0 (no crashes in 22.0b4), that bug has only showed it.

Almost all comments talk about playing the Pyramid Solitaire Saga game.
Version: 22 Branch → 23 Branch
Good news about not being in v22 so it gives us more time to fix. We're having trouble figuring out what's causing this to crash. 

I tried reproducing with staging my places.sqlite so it would generate a jump list entry for the Pyramid Solitaire Saga game (Facebook) but it didn't crash.
(In reply to Brian R. Bondy [:bbondy] from comment #10)
> I tried reproducing with staging my places.sqlite so it would generate a
> jump list entry for the Pyramid Solitaire Saga game (Facebook) but it didn't
> crash.
It happens when playing not visiting.
Ya I went to the page to get its favicon but didn't actually play.  Unless someone is confident it'll happen within some timeframe of playing I don't think I'll have time to play it in hopes I get it eventually :)
As you probably could guess almost all URLs are https://apps.facebook.com/pyramidsolitairesaga/ with different query strings.
Keywords: qawanted
(In reply to Scoobidiver from comment #9)
> (In reply to Scoobidiver from comment #6)
> > Bug 870529 was uplifted to Beta 4.
> Apparently the real regression is in 23.0 (no crashes in 22.0b4), that bug
> has only showed it.
> 
> Almost all comments talk about playing the Pyramid Solitaire Saga game.

It's really strange that you mentioned we aren't getting this crash on v22 because the diff of widget\windows between beta and aurora tips don't show any differences even close to jump list code :'(
(In reply to Brian R. Bondy [:bbondy] from comment #14)
> It's really strange that you mentioned we aren't getting this crash on v22
> because the diff of widget\windows between beta and aurora tips don't show
> any differences even close to jump list code :'(
I see eight Widget-Win32 bugs that first landed in 23.0: https://bugzilla.mozilla.org/buglist.cgi?f1=cf_status_firefox22&list_id=6858673&o1=notequals&resolution=FIXED&o2=notequals&query_format=advanced&f2=cf_status_firefox22&v1=fixed&component=Widget%3A%20Win32&v2=verified&product=Core&target_milestone=mozilla23
Thanks! Looked through those just now but came to the same conclusion as before with the diff that none of them seem to be related to jump list code. I could try backing them out locally and seeing if I can still reproduce, but I can't reproduce to begin with :)
Maybe we could try temporarily backing out bug 849647 on aurora to see if it helps?
(In reply to Brian R. Bondy [:bbondy] from comment #17)
> Maybe we could try temporarily backing out bug 849647 on aurora to see if it
> helps?

Let's try speculative fixes now, since Aurora 23 updates will be disabled as of Monday and the Beta 23 build will also get kicked off at the same time.
I'm going to attempt to reproduce this now, sorry for the delay.
QA Contact: anthony.s.hughes
I crashed almost immediately when loading https://apps.facebook.com/pyramidsolitairesaga/.

1. Installed Firefox 23.0a2 Aurora 2013-06-18
2. Started Firefox Aurora with a new profile
3. Installed Flash 11.7
4. Loaded https://apps.facebook.com/pyramidsolitairesaga/
5. After it loaded and started playing I click the X on a Facebook overlay notifying me what data the app was using.

I crashed at this point, here is the report:
> https://crash-stats.mozilla.com/report/index/bacd84f7-e4a5-46c4-8656-4444d2130618

I'm not sure if these steps are reproducible or coincidental yet. Investigating further.
Heh, I crashed again while reporting the bug comment above and letting this app run in the background:

https://crash-stats.mozilla.com/report/index/bp-8b4bffa2-335d-4470-8602-780a92130618

Let me see if I can regress this...
On Aurora it seems to regress between 2013-06-02 and 2013-06-03 for me:
> http://hg.mozilla.org/releases/mozilla-aurora/pushloghtml?fromchange=d57f75168193&tochange=2229cf072c0d

Bug 807529 or Bug 875609 seem to be jumplist related landings on Aurora within this timeframe. Brian, can you have a look?
Flags: needinfo?(netzen)
Keywords: qawanted
I think you mean Bug 870529 and Bug 875609.
Those are what enabled jump lists to work again, as it was broken on all channels before this landed.

Those are also on Beta which doesn't have a crash.

Let me know if you can reproduce consistently, then I can make a try build with the suspect patch backed out. But if you can't reproduce easily enough then that would be pointless.
Flags: needinfo?(netzen)
I can reproduce consistently but not with the same steps. On the affected builds I can usually crash within 5 minutes of using the app. I'd be happy to test any try builds you can provide.
Since you can reproduce fairly easily, could you try to reproduce with the following build? 

http://ftp.mozilla.org/pub/mozilla.org/firefox//try-builds/bbondy@mozilla.com-f04a4f8b9552/try-win32/firefox-24.0a1.en-US.win32.installer.exe

That has the suspect bug I mentioned above backed out (bug 849647 - remove message order prioritization).  If it still reproduces with that build then I'll spend some more time reproducing and try various things.
Flags: needinfo?(anthony.s.hughes)
That build crashed for me.

FWIW, you can get the VM I'm reproducing this in off the file server under QA's virtualization folder. I can point you to the VM off this bug if it helps you reproduce it.
Flags: needinfo?(anthony.s.hughes)
Even stranger, that bugs wasn't really related, but at least remotely, but the remaining ones mentioned in Comment 15 that are different between beta and aurora are completely unrelated.

Anthony, it may be worth testing on mozilla-beta while I work on this more to be sure it's not crashing there.  Maybe for some reason for example it crashes but we aren't getting reports.
Whiteboard: win7 SP1 only use URL https://apps.facebook.com/pyramidsolitairesaga/
I've been playing with this on Firefox 22.0b6 for about 20 minutes and it hasn't crashed yet (usually crash within 5 minutes). I'll keep it running for a couple hours just to be sure but I think it's safe to say Fx22 is unaffected.
OK Thanks Anthony, I just wanted to be sure in case it wasn't being reported or was coming in as a different signature.  Sounds like it's fine.
(In reply to Anthony Hughes, Mozilla QA (:ashughes) from comment #26)
> That build crashed for me.
> 
> FWIW, you can get the VM I'm reproducing this in off the file server under
> QA's virtualization folder. I can point you to the VM off this bug if it
> helps you reproduce it.

I wonder if this has something to do with single core systems, like vms. We ran into strange single core problems with oopp that seem similar.
Talked to ashughes on IRC and am able to reproduce on Win7SP1 if I have a blank profile. I think before when I staged my history I was adding only one of the favicons that get generated on the jump list and having a clean profile make the difference to reproducing.

Reproduced this with a VM as well (single core).
(In reply to Jim Mathies [:jimm] from comment #30)
> I wonder if this has something to do with single core systems, like vms.
It doesn't seem so based on correlations per core counts in 23.0a2:
     17% (8/46) vs.  22% (411/1844) x86 with 1 cores
     50% (23/46) vs.  43% (785/1844) x86 with 2 cores
      0% (0/46) vs.   1% (10/1844) x86 with 3 cores
     33% (15/46) vs.  26% (475/1844) x86 with 4 cores
      0% (0/46) vs.   1% (17/1844) x86 with 6 cores
      0% (0/46) vs.   8% (140/1844) x86 with 8 cores
      0% (0/46) vs.   0% (6/1844) x86 with 12 core
Found the cause!
The bug causing this was introduced in bug 799315 but WMF wasn't enabled until later.

The problem is this:
http://dxr.mozilla.org/mozilla-central/source/content/media/wmf/WMFReader.cpp#l163
We do a CoInitializeEx which actually fails, but then later we unconditionally CoUninitialize();

This frees COM resources early and messes things up. I think we should probably try to fix this even on beta if it's not too late.
Blocks: 799315
No longer blocks: 870529
Nice catch.

(In reply to Brian R. Bondy [:bbondy] from comment #33)
> The problem is this:
> http://dxr.mozilla.org/mozilla-central/source/content/media/wmf/WMFReader.
> cpp#l163
> We do a CoInitializeEx which actually fails, 

Why does this fail? This is happening on a non-main thread on which we use WMF, so it should be mostly unaffected by Gecko?

I imagine that if CoInitialize is failing, we'd fail to use WMF too... Which would be bad.
Most of the code calls CoInitialize and not CoInitializeEx and so that implies STA.
Just a guess but it's failing because here you're trying to do MTA which changes the concurrency model.  I'm not sure why or if this thread is being re-used but it seems to be the cause of the crash. I can no longer reproduce with the fix and could consistently reproduce before the fix.  Will submit later today.
Confirmed, it returns:
hr = 0x80010106 Cannot change thread mode after it is set.
(In reply to Chris Pearce (:cpearce) from comment #34)
> Nice catch.
> 
> (In reply to Brian R. Bondy [:bbondy] from comment #33)
> > The problem is this:
> > http://dxr.mozilla.org/mozilla-central/source/content/media/wmf/WMFReader.
> > cpp#l163
> > We do a CoInitializeEx which actually fails, 
> 
> Why does this fail? This is happening on a non-main thread on which we use
> WMF, so it should be mostly unaffected by Gecko?

Setting a breakpoint locally with a debug m-c build showed that this is being called on the main thread for me.


> I imagine that if CoInitialize is failing, we'd fail to use WMF too... Which
> would be bad.

Nope it would still work because it's already initialized.
Attached patch Fix CoInitialize/CoUninitialize mismatch - rev1 (obsolete) — — Splinter Review
Confirmed that the initialize call is succeeding now.
Attachment #765123 - Flags: review?(paul)
Comment on attachment 765123 [details] [diff] [review]
Fix CoInitialize/CoUninitialize mismatch - rev1

(In reply to Brian R. Bondy [:bbondy] from comment #37)
> (In reply to Chris Pearce (:cpearce) from comment #34)
> > Nice catch.
> > 
> > (In reply to Brian R. Bondy [:bbondy] from comment #33)
> > > The problem is this:
> > > http://dxr.mozilla.org/mozilla-central/source/content/media/wmf/WMFReader.
> > > cpp#l163
> > > We do a CoInitializeEx which actually fails, 
> > 
> > Why does this fail? This is happening on a non-main thread on which we use
> > WMF, so it should be mostly unaffected by Gecko?
> 
> Setting a breakpoint locally with a debug m-c build showed that this is
> being called on the main thread for me.

That's a thread-safety bug, WMFReader::OnDecodeThreadStart() is *not* supposed to be called on the main thread. We assert on the line above the CoInitializeEx call that WMFReader::OnDecodeThreadStart() is *not* called on the main thread.

Maybe the vtable is being corrupted somehow, and that's why we're calling this? Can you post the callstack in which WMFReader::OnDecodeThreadStart() is being called on the main thread?

Re: calling CoInitializeEx vs CoInitialize, the WMF documentation recommends that we use MTA:
http://msdn.microsoft.com/en-us/library/windows/desktop/ee892371%28v=vs.85%29.aspx

I've had no exposure to marshaling between MTA and STA... But based on my reading of the above, if we switch to using STA we'd have to marshal every WMF object we create with an MTA proxy before we passed it into WMF if we switched the decode thread to STA.
Attachment #765123 - Flags: feedback-
Implemented nit.
Attachment #765123 - Attachment is obsolete: true
Attachment #765123 - Flags: review?(paul)
Attachment #765160 - Flags: review?(paul)
(In reply to Chris Pearce (:cpearce) from comment #39)

> That's a thread-safety bug

MTA should be used if you need COM to handle the locking of access to the object.
I understand that MS is recommending you use MTA for WMF; however, you're not doing that, you're using STA on the main thread.
Sounds like you have another bug to post unrelated to this.

> , WMFReader::OnDecodeThreadStart() is *not*
> supposed to be called on the main thread. 

What it is supposed to do, and what it does are 2 different things.  This patch is about fixing a mismatch between CoInitialize and CoUninitialize. I've changed this to MTA CoInitializeEx, but the effect are exactly same when it's on the main thread, the thread is already initialized as STA. And that won't change with any amount of extra CoInitializeEx MTA calls.

WMF is not working as you expect so please post a different bug about that, this bug is not related to that.  This bug is about fixing the side effects and mass chaos it's causing outside of WMFReader.  I would be very surprised if the bug I was trying to fix is the only crash caused by this.

> We assert on the line above the
> CoInitializeEx call that WMFReader::OnDecodeThreadStart() is *not* called on
> the main thread.

Actually you assert this:
> NS_ASSERTION(mDecoder->OnDecodeThread(), "Should be on decode thread.");

And from what I understand that can sometimes be the main thread and sometimes not. In particular see this code in MediaBufferDecoder:

"> MOZ_ASSERT(!mThreadPool == NS_IsMainThread(),
 "We should be on the main thread only if we don't have a thread pool");

---

> 
> Maybe the vtable is being corrupted somehow, and that's why we're calling
> this? 

No, that's not the case.

> Can you post the callstack in which WMFReader::OnDecodeThreadStart()
> is being called on the main thread?

Sure, but as mentioned, please handle that in a different bug:

On thread with name: "Main Thread"	
> xul.dll!mozilla::WMFReader::OnDecodeThreadStart()  Line 83	C++
> xul.dll!mozilla::MediaDecodeTask::Decode()  Line 482	C++
> xul.dll!mozilla::MediaDecodeTask::Run()  Line 382	C++
> xul.dll!mozilla::MediaBufferDecoder::SyncDecodeMedia(const char * aContentType, unsigned char * aBuffer, unsigned int aLength, mozilla::WebAudioDecodeJob & aDecodeJob)  Line 769	C++
> xul.dll!mozilla::dom::AudioContext::CreateBuffer(JSContext * aJSContext, mozilla::dom::TypedArray<unsigned char,&JS_GetArrayBufferData,&JS_GetObjectAsArrayBuffer,&JS_NewArrayBuffer> & aBuffer, bool aMixToMono, mozilla::ErrorResult & aRv)  Line 176 + 0x2d bytes	C++
> xul.dll!mozilla::dom::AudioContextBinding::createBuffer(JSContext * cx, JS::Handle<JSObject *> obj, mozilla::dom::AudioContext * self, const JSJitMethodCallArgs & args)  Line 219 + 0x25 bytes	C++
> xul.dll!mozilla::dom::AudioContextBinding::genericMethod(JSContext * cx, unsigned int argc, JS::Value * vp)  Line 911 + 0x26 bytes	C++
> mozjs.dll!js::CallJSNative(JSContext * cx, int (JSContext *, unsigned int, JS::Value *)* native, const JS::CallArgs & args)  Line 349 + 0x19 bytes	C++
> mozjs.dll!js::Invoke(JSContext * cx, JS::CallArgs args, js::MaybeConstruct construct)  Line 388 + 0x16 bytes	C++
> mozjs.dll!js::Interpret(JSContext * cx, js::StackFrame * entryFrame, js::InterpMode interpMode, bool useNewType)  Line 2212 + 0x2a bytes	C++
> mozjs.dll!js::RunScript(JSContext * cx, js::StackFrame * fp)  Line 345 + 0x11 bytes	C++
> mozjs.dll!js::Invoke(JSContext * cx, JS::CallArgs args, js::MaybeConstruct construct)  Line 401 + 0x12 bytes	C++
> mozjs.dll!js::Invoke(JSContext * cx, const JS::Value & thisv, const JS::Value & fval, unsigned int argc, JS::Value * argv, JS::Value * rval)  Line 434 + 0x33 bytes	C++
> mozjs.dll!JS_CallFunctionValue(JSContext * cx, JSObject * objArg, JS::Value fval, unsigned int argc, JS::Value * argv, JS::Value * rval)  Line 5886 + 0x36 bytes	C++
> xul.dll!mozilla::dom::EventHandlerNonNull::Call(JSContext * cx, JS::Handle<JSObject *> aThisObj, nsDOMEvent & event, mozilla::ErrorResult & aRv)  Line 41 + 0x46 bytes	C++
> xul.dll!mozilla::dom::EventHandlerNonNull::Call<nsISupports *>(nsISupports * const & thisObj, nsDOMEvent & event, mozilla::ErrorResult & aRv, mozilla::dom::CallbackObject::ExceptionHandling aExceptionHandling)  Line 58 + 0x2b bytes	C++
> xul.dll!nsJSEventListener::HandleEvent(nsIDOMEvent * aEvent)  Line 248	C++
> xul.dll!nsEventListenerManager::HandleEventSubType(nsListenerStruct * aListenerStruct, const mozilla::dom::CallbackObjectHolder<mozilla::dom::EventListener,nsIDOMEventListener> & aListener, nsIDOMEvent * aDOMEvent, mozilla::dom::EventTarget * aCurrentTarget, nsCxPusher * aPusher)  Line 937 + 0x1d bytes	C++
> xul.dll!nsEventListenerManager::HandleEventInternal(nsPresContext * aPresContext, nsEvent * aEvent, nsIDOMEvent * * aDOMEvent, mozilla::dom::EventTarget * aCurrentTarget, nsEventStatus * aEventStatus, nsCxPusher * aPusher)  Line 1009 + 0x1e bytes	C++
> xul.dll!nsEventListenerManager::HandleEvent(nsPresContext * aPresContext, nsEvent * aEvent, nsIDOMEvent * * aDOMEvent, mozilla::dom::EventTarget * aCurrentTarget, nsEventStatus * aEventStatus, nsCxPusher * aPusher)  Line 329	C++
> xul.dll!nsEventTargetChainItem::HandleEvent(nsEventChainPostVisitor & aVisitor, bool aMayHaveNewListenerManagers, nsCxPusher * aPusher)  Line 204	C++
> xul.dll!nsEventTargetChainItem::HandleEventTargetChain(nsEventChainPostVisitor & aVisitor, nsDispatchingCallback * aCallback, bool aMayHaveNewListenerManagers, nsCxPusher * aPusher)  Line 331	C++
> xul.dll!nsEventDispatcher::Dispatch(nsISupports * aTarget, nsPresContext * aPresContext, nsEvent * aEvent, nsIDOMEvent * aDOMEvent, nsEventStatus * aEventStatus, nsDispatchingCallback * aCallback, nsCOMArray<mozilla::dom::EventTarget> * aTargets)  Line 635 + 0x19 bytes	C++
> xul.dll!nsEventDispatcher::DispatchDOMEvent(nsISupports * aTarget, nsEvent * aEvent, nsIDOMEvent * aDOMEvent, nsPresContext * aPresContext, nsEventStatus * aEventStatus)  Line 693 + 0x1d bytes	C++
> xul.dll!nsDOMEventTargetHelper::DispatchDOMEvent(nsEvent * aEvent, nsIDOMEvent * aDOMEvent, nsPresContext * aPresContext, nsEventStatus * aEventStatus)  Line 331 + 0x19 bytes	C++
> xul.dll!nsXHREventTarget::DispatchDOMEvent(nsEvent * aEvent, nsIDOMEvent * aDOMEvent, nsPresContext * aPresContext, nsEventStatus * aEventStatus)  Line 71 + 0x1f bytes	C++
> xul.dll!nsXMLHttpRequest::DispatchDOMEvent(nsEvent * aEvent, nsIDOMEvent * aDOMEvent, nsPresContext * aPresContext, nsEventStatus * aEventStatus)  Line 232 + 0x1f bytes	C++
> xul.dll!nsXMLHttpRequest::DispatchProgressEvent(nsDOMEventTargetHelper * aTarget, const nsAString_internal & aType, bool aLengthComputable, unsigned __int64 aLoaded, unsigned __int64 aTotal)  Line 1486	C++
> xul.dll!nsXMLHttpRequest::ChangeStateToDone()  Line 2226	C++
> xul.dll!nsXMLHttpRequest::OnStopRequest(nsIRequest * request, nsISupports * ctxt, tag_nsresult status)  Line 2182	C++
> xul.dll!nsCORSListenerProxy::OnStopRequest(nsIRequest * aRequest, nsISupports * aContext, tag_nsresult aStatusCode)  Line 579 + 0x28 bytes	C++
> xul.dll!nsStreamListenerTee::OnStopRequest(nsIRequest * request, nsISupports * context, tag_nsresult status)  Line 52 + 0x28 bytes	C++
> xul.dll!mozilla::net::nsHttpChannel::OnStopRequest(nsIRequest * request, nsISupports * ctxt, tag_nsresult status)  Line 5070	C++
> xul.dll!nsInputStreamPump::OnStateStop()  Line 556	C++
> xul.dll!nsInputStreamPump::OnInputStreamReady(nsIAsyncInputStream * stream)  Line 375 + 0xb bytes	C++
> xul.dll!nsInputStreamReadyEvent::Run()  Line 83	C++
> xul.dll!nsThread::ProcessNextEvent(bool mayWait, bool * result)  Line 626 + 0x19 bytes	C++
> xul.dll!NS_ProcessNextEvent(nsIThread * thread, bool mayWait)  Line 238 + 0x17 bytes	C++
> xul.dll!mozilla::ipc::MessagePump::Run(base::MessagePump::Delegate * aDelegate)  Line 82 + 0xe bytes	C++
> xul.dll!MessageLoop::RunInternal()  Line 220	C++
> xul.dll!MessageLoop::RunHandler()  Line 213	C++
> xul.dll!MessageLoop::Run()  Line 187	C++
> xul.dll!nsBaseAppShell::Run()  Line 165	C++
> xul.dll!nsAppShell::Run()  Line 113 + 0x9 bytes	C++
> xul.dll!nsAppStartup::Run()  Line 269 + 0x1c bytes	C++
> xul.dll!XREMain::XRE_mainRun()  Line 3856 + 0x25 bytes	C++

---

> 
> Re: calling CoInitializeEx vs CoInitialize, the WMF documentation recommends
> that we use MTA:
> http://msdn.microsoft.com/en-us/library/windows/desktop/ee892371%28v=vs.
> 85%29.aspx

That sounds like you should stop using the main thread since it's STA.
Or evaluate changing the main thread to MTA outside of this bug.
If you'd like to disable WMF completely on Beta, Aurora and Nightly because you are afraid of it using STA then please post a different bug for that as well.

> I've had no exposure to marshaling between MTA and STA... But based on my
> reading of the above, if we switch to using STA....

Let me stop you there, you're already using STA when you're on the main thread.
From what I understand in general btw you use STA when you don't need COM to handle the locking of the object you're using.

> we'd have to marshal every
> WMF object we create with an MTA proxy before we passed it into WMF if we
> switched the decode thread to STA.

So don't use the main thread.
Summary: [Win7 SP1] crash in mozilla::widget::JumpListBuilder::AddListToBuild @ CCliModalLoop::CCliModalLoop → Fix mismatch between CoInitialize and CoUninitialize causing crash in mozilla::widget::JumpListBuilder::AddListToBuild @ CCliModalLoop::CCliModalLoop
Whiteboard: win7 SP1 only use URL https://apps.facebook.com/pyramidsolitairesaga/ → win7 SP1 only use URL
Thanks for explaining, and posting that call stack, it's very enlightening.

That's a WebAudio callstack, which is why I didn't anticipate that caller. I will file a follow up in WebAudio about this.
Comment on attachment 765160 [details] [diff] [review]
Fix CoInitialize/CoUninitialize mismatch - rev2

Review of attachment 765160 [details] [diff] [review]:
-----------------------------------------------------------------

::: content/media/wmf/WMFReader.cpp
@@ +80,5 @@
>  WMFReader::OnDecodeThreadStart()
>  {
>    NS_ASSERTION(mDecoder->OnDecodeThread(), "Should be on decode thread.");
> +
> +  // XXX This is sometimes called on the main thread so will definitely fail.

I'd say "WebAudio can call this on the main thread..."
Attachment #765160 - Flags: feedback+
(In reply to Chris Pearce (:cpearce) from comment #42)
> Thanks for explaining, and posting that call stack, it's very enlightening.
> 
> That's a WebAudio callstack, which is why I didn't anticipate that caller. I
> will file a follow up in WebAudio about this.

great, thanks!

-- 

I'll do the comment nit after the review but before pushing.
Filed bug 885185 for tracking the Web Audio main thread WMF issue.
Blocks: 885185
No longer depends on: 885185
Comment on attachment 765160 [details] [diff] [review]
Fix CoInitialize/CoUninitialize mismatch - rev2

Review of attachment 765160 [details] [diff] [review]:
-----------------------------------------------------------------

r+ for now, but we should definitely followup in bug 885185.

Also, r=padenot, r=paul being another person at Mozilla (Paul Rouget).

::: content/media/wmf/WMFReader.cpp
@@ +80,5 @@
>  WMFReader::OnDecodeThreadStart()
>  {
>    NS_ASSERTION(mDecoder->OnDecodeThread(), "Should be on decode thread.");
> +
> +  // XXX This is sometimes called on the main thread so will definitely fail.

To be specific, this is caused by calling something like |AudioContext.createBuffer(ArrayBuffer)| with ArrayBuffer being compressed (say, mp3) data. This is a terrible synchronous API that is in the spec, but even the spec says that you should prefer the async version.

Having a NS_ASSERTION that we know can fail with a specific API usage is not great, but this is more in the scope of bug 885185, I guess.
Attachment #765160 - Flags: review?(paul) → review+
(In reply to Paul Adenot (:padenot) from comment #46)
> Comment on attachment 765160 [details] [diff] [review]
> Fix CoInitialize/CoUninitialize mismatch - rev2
> 
> Review of attachment 765160 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Having a NS_ASSERTION that we know can fail with a specific API usage is not
> great, but this is more in the scope of bug 885185, I guess.

The assertion doesn't fail, OnDecodeThread() returns true if the current thread is equal to mDecodeThread, which is in this case the main thread.
(In reply to Chris Pearce (:cpearce) from comment #42)
> That's a WebAudio callstack

Ah! That also explains why this is a problem that's starting in 23, as AFAIK we only started WebAudio support there (WMF support by itself is older than that).
Are there any other cases where it uses the main thread? I'm asking to see if I should request uplifting to beta.
https://hg.mozilla.org/mozilla-central/rev/a67425aa4728
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla24
Comment on attachment 765160 [details] [diff] [review]
Fix CoInitialize/CoUninitialize mismatch - rev2

[Approval Request Comment]
Bug caused by (feature/regressing bug #): bug 799315, but we didn't know about the crash(es) until WebAudio landed. 
User impact if declined: COM resources can be cleaned up early on the main thread, causing any component that uses COM on the main thread to fail. 
Testing completed (on m-c, etc.): Not yet but I tested with a local build and confirmed the crash no longer happens.
Risk to taking this patch (and alternatives if risky): Very low
String or IDL/UUID changes made by this patch: none
Attachment #765160 - Flags: approval-mozilla-aurora?
(In reply to Brian R. Bondy [:bbondy] from comment #49)
> I'm asking to see if I should request uplifting to beta.

There is no real "beta" branch right now. 22 is already on -release and we will only respin the 22 candidates for really grave issues. I think we don't have any evidence of crashes on 22 due to this, so it wouldn't qualify for a candidate respin, I'd think.
That said, 23 will go from aurora to beta within a week, and it's very good we are able to fix this before we build beta 1 there.
Sounds good, padenot could you still confirm about if it's ever called on the main thread anyway in any other case? If it is we can search for other crashes on beta but not aurora and link them here.  I suspect that's the only case on the main thread though?
> If it is we can search for other crashes on beta but not *only on* aurora+ and link them here
This bug was introduced by WebAudio and WebAudio is preffed off in Firefox 22, so if we can't get the fix here into 22 it's not the end of the world.

The only calls to OnDecodeThreadStart() are these:

http://mxr.mozilla.org/mozilla-central/search?string=-%3Eondecodethreadstart%28%29&find=&findi=&filter=^[^\0]*%24&hitlimit=&tree=mozilla-central

And only the OnDecodeThreadStart() call made by WebAudio can happen on the main thread.
Actually, bug 885505 was just filed; so Web Audio isn't actually preffed on in Firefox 23 (Aurora).
Also note, that the "media.webaudio.enabled" pref is guarded by a #ifndef RELEASE_BUILD define in {all,firefox,etc}.js, so it's enabled in Aurora 23, but won't ship in Firefox 23 Release builds.
Attachment #765160 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Keywords: verifyme
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:23.0) Gecko/20100101 Firefox/23.0
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:25.0) Gecko/20130626 Firefox/25.0
Mozilla/5.0 (Windows NT 6.1; rv:23.0) Gecko/20100101 Firefox/23.0
Mozilla/5.0 (Windows NT 6.1; rv:25.0) Gecko/20130626 Firefox/25.0

Verified as fixed on Firefox 23 Beta 1 (buildID: 20130625125232) and latest Nightly (buildID: 20130625031238).
Mozilla/5.0 (Windows NT 6.1; rv:24.0) Gecko/20100101 Firefox/24.0
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:24.0) Gecko/20100101 Firefox/24.0

Verified as fixed on Firefox 24 beta 1 (buildID: 20130806170643).
Status: RESOLVED → VERIFIED
Keywords: verifyme
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: