Open Bug 1142799 Opened 9 years ago Updated 2 years ago

Make NS_DispatchToMainThread infallible

Categories

(Core :: XPCOM, defect)

x86_64
Linux
defect

Tracking

()

People

(Reporter: erahm, Unassigned)

References

(Blocks 1 open bug)

Details

Attachments

(3 files, 1 obsolete file)

|NS_DispatchToMainThread| returns an nsresult, presumably it can fail. A majority of code *does not* check this (although a fair amount of code does). Doing some rough grepping showed that there are ~542 calls where the rv is not checked, ~160 where it is checked (this is a pretty rough count, it seems to miss about 170 calls).

There is also |NS_INLINE_DECL_THREADSAFE_REFCOUNTING_WITH_MAIN_THREAD_DESTRUCTION| used in gfx/layers/ipc which has a call to |NS_DispatchToMainThread| and does not check it's rv (it shows up 4 times).

It's possible this doesn't matter and nothing should check the rv (that seems wrong, but I'm open to discussion). If that's the case we could at least add some documentation to the effect of: "this rv is pointless, move along."

On the other hand we might consider adding an annotation indicating the rv should be checked and fixing up the calling locations. The most straightforward way this is dealt with currently is:

> MOZ_ALWAYS_TRUE(NS_SUCCEEDED(NS_DispatchToMainThread(r)));

Some code uses an if:

> if (NS_FAILED(NS_DispatchToMainThread(runnable))) {

Other code stores the rv, and presumably does something with it:

> nsresult rv = NS_DispatchToMainThread(r);

For background, this issue came to my attention due to static analysis pointing out the potential for memleaks if the call to |NS_DispatchToMainThread| fails.
Calls to |NS_DispatchToMainThread| found with:

> hg st -mac0n | xargs -0 grep -n -E 'NS_DispatchToMainThread\('
Calls that seem to check the rv of |NS_DispatchToMain| found by running:

>  hg st -mac0n | xargs -0 grep -n -E 'NS_DispatchToMainThread\(' | grep -E 'if |=|TRUE'
Unchecked |NS_DispatchToMain| calls found with:

> hg st -mac0n | xargs -0 grep -n -E 'NS_DispatchToMainThread\(' | grep -v -E 'if |=|TRUE|return '
So it's been a little while since I looked at this in the context of bug 715376, but what I remember from that bug is:

- Some calls there's simply nothing useful to be done on failure (e.g. dispatching tasks from nsIRunnable::Run) or similar;
- Some calls can legitimately fail because they're being run post thread/xpcom shutdown (!), and there's no main thread event loop to dispatch them to.  (This is, in fact, the only way that dispatching can fail, short of passing in a null event.)

I guess those questions are separate from the memory leaks that can happen.

We could try to make NS_DispatchToMainThread follow the rules that RegisterStrongMemoryReporter and friends take, which is to hold a strong reference to the passed-in argument during the execution of NS_DispatchToMainThread.  I'm not totally keen on this, as it introduces another bending of the XPCOM reference-counting rules.  But it might be easier than fixing up all the current pass-the-result-of-|new| uses that we already have--and somehow making sure people don't screw up future code.

Perhaps a static analysis pass that pattern-matches:

  NS_DispatchToMainThread(new T(...))

would be worthwhile there.  You'd probably also have to catch:

  NS_DispatchToMainThread(NS_NewRunnableMethod{,With*}...)

and NS_DispatchToCurrentThread variants, and so forth...
I've added bug 1142831 to track the discussion of memleaks specifically.
See Also: → 1142831
Please see prior history also, bug 1038955 at least and I think there are a few others. I think we should make this infallible.
In history we have cases where if the dispatch fails (maybe cause it's late on shutdown, or for other reasons) we don't care. Those calls are marked with a cast to (void). Could you please filter (void) casts out, we often used those to mark we don't care about the result.
I also don't think it's a good idea to always abort if the dispatching fails, unless there is also a version that doesn't abort, to be used where we don't care.
(In reply to Marco Bonardo [::mak] from comment #7)
> In history we have cases where if the dispatch fails (maybe cause it's late
> on shutdown, or for other reasons) we don't care. Those calls are marked
> with a cast to (void). Could you please filter (void) casts out, we often
> used those to mark we don't care about the result.
> I also don't think it's a good idea to always abort if the dispatching
> fails, unless there is also a version that doesn't abort, to be used where
> we don't care.

There are 7/542 instances that use void casts. I agree that aborting in |NS_DispatchToMainThread| on failure probably isn't the best solution (my understanding is that it can fail during shutdown, which seems legit).

I believe in a previous thread :bsmedberg suggested asserting if xpcom had never been properly initialized, which does seem to make sense.
Unchecked calls updated to remove explicit void casts.
Attachment #8576995 - Attachment is obsolete: true
I possibly agree that there are currently bugs where event dispatch can fail during shutdown. However, all of those cases are incorrect by design and should be fixed: the end state we want to see here is where we can make event dispatch infallible. Here are the cases I know about:

* Dispatching an event to a non-main thread which is shutting down or has already shut down. This is a logic error, and we should make those into asserts, fix them, and then make it infallible.
* Dispatching an event to the main thread before XPCOM has started up or after we hit the point of no return in shutdown (xpcom-shutdown-threads, basically). This is also a pretty serious logic error, worthy of the same sort of assert/fix/abort handling.

I'm strongly opposed to the notion that we should be going through the tree and trying to fix all of the call sites to handle error cases. In many cases good error handling is impossible or more dangerous than leaking.
(In reply to Benjamin Smedberg  [:bsmedberg] from comment #10)
> I possibly agree that there are currently bugs where event dispatch can fail
> during shutdown. However, all of those cases are incorrect by design and
> should be fixed: the end state we want to see here is where we can make
> event dispatch infallible. Here are the cases I know about:
> 
> * Dispatching an event to a non-main thread which is shutting down or has
> already shut down. This is a logic error, and we should make those into
> asserts, fix them, and then make it infallible.
> * Dispatching an event to the main thread before XPCOM has started up or
> after we hit the point of no return in shutdown (xpcom-shutdown-threads,
> basically). This is also a pretty serious logic error, worthy of the same
> sort of assert/fix/abort handling.
> 
> I'm strongly opposed to the notion that we should be going through the tree
> and trying to fix all of the call sites to handle error cases. In many cases
> good error handling is impossible or more dangerous than leaking.

We discussed this further during the memshrink meeting and there was general consensus that we should make |NS_DispatchToMainThread| infallible. I'm happy to start with adding a release assertion and filing follow ups for any test related failures.
Assignee: nobody → erahm
Depends on: 1145893
Try run w/ asserts, looks like most bustage is related to nsCertVerificationThread: https://treeherder.mozilla.org/#/jobs?repo=try&revision=4923edb3fd8a
Depends on: 1146580
Summary: Fix inconsistent checking of NS_DispatchToMainThread's return value → Make NS_DispatchToMainThread infallible
void NS_DispatchToMainThread(nsIRunnable* aEvent,
                             uint32_t aDispatchFlags = NS_DISPATCH_NORMAL,
                             nsresult* aError = nullptr);

would provide that the method can assert when aError = nullptr and otherwise assume that callers will handle *aError.
I'm not actively pursuing this.
Assignee: erahm → nobody
Severity: normal → S3
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: