Closed Bug 1449637 Opened 6 years ago Closed 3 years ago

Allow nsAppExitEvent to set main process exit code

Categories

(Toolkit :: Startup and Profile System, enhancement, P3)

enhancement

Tracking

()

RESOLVED DUPLICATE of bug 1675329

People

(Reporter: ato, Unassigned)

References

Details

Marionette calls Services.startup.quit(Ci.nsIAppStartup.eForceQuit) if
it is unable to bind to the designated port on startup.  Internally,
this calls nsAppStartup::Quit which despatches a nsAppExitEvent
[1] to the curren thread.

As far as I can tell this eventually ends up in the IPC subsystem
instructing various different subprocesses and threads to finish
their work and join.

As whimboo points out in [2] it would be rather useful if we could use
an exit code to communicate to the parent process of Firefox, in this
case geckodriver, that Firefox failed to start because it failed to
bind to a port.  An exit code is the perfect IPC mechanism for doing this.

An API such as Servics.startup.exitCode = N or .setExitCode(N)
would be sufficient.  This could be picked up by nsAppExitEvent and
be propagated to the appropriate exit point.  I didn’t actually
find the eventual exit point in Gecko.

  [1] https://searchfox.org/mozilla-central/rev/028cd8342899357d80fafba2d528a0fd5819e316/toolkit/components/startup/nsAppStartup.cpp#103-121
  [2] https://github.com/mozilla/geckodriver/issues/979#issuecomment-376941655
Would this be a good idea in principle?
Flags: needinfo?(dtownsend)
Component: General → Startup and Profile System
A previous variant of this discussion occurred in bug 894697.
See Also: → 894697
Thanks, jryans for doing some archeology.

Quoting bsmedberg in https://bugzilla.mozilla.org/show_bug.cgi?id=894697#c24:

> As noted several times, my issue with providing this functionality
> is that it's likely to be incorrect in some edge cases (various
> kinds of profile restarts), and AFAIK we don't have any automated
> test suites which can be used to test this behavior. So unless
> we're going to fix this in all the cases and have automated tests,
> I continue to oppose adding an API which is an untested footgun.

The first point about identifying the edge cases is still valid.

The second about tests is important, but the situation since this
comment has changed and we now have Marionette which can use to test this.
(In reply to Andreas Tolfsen ‹:ato› from comment #3)
> Thanks, jryans for doing some archeology.
> 
> Quoting bsmedberg in https://bugzilla.mozilla.org/show_bug.cgi?id=894697#c24:
> 
> > As noted several times, my issue with providing this functionality
> > is that it's likely to be incorrect in some edge cases (various
> > kinds of profile restarts), and AFAIK we don't have any automated
> > test suites which can be used to test this behavior. So unless
> > we're going to fix this in all the cases and have automated tests,
> > I continue to oppose adding an API which is an untested footgun.
> 
> The first point about identifying the edge cases is still valid.

I'm not sure there are many edge cases left here, unless you're testing profile selection through the startup profile manager.

> The second about tests is important, but the situation since this
> comment has changed and we now have Marionette which can use to test this.

Yeah definitely valuable.

I think my only question is whether we've explored other options for getting this data in another method, maybe by logging to stdout/stderr and having marionette capture that? Something like that would likely be a lot easier than feeding the exit code through to the eventual main return.
Flags: needinfo?(dtownsend)
(In reply to Dave Townsend [:mossop] from comment #4)
> I think my only question is whether we've explored other options for getting
> this data in another method, maybe by logging to stdout/stderr and having
> marionette capture that? Something like that would likely be a lot easier
> than feeding the exit code through to the eventual main return.

That would not help other processes which started the Firefox process and use the exit code to determine the quit reason of the process. Also the shell would still see a 0, which in case of bug 1370520 wouldn't be appropriate.

So Marionette is just one of those tools, but can be used for testing this proposed feature because it also evaluates the exit code of the handled child process.
(In reply to Henrik Skupin (:whimboo) from comment #5)
> (In reply to Dave Townsend [:mossop] from comment #4)
> > I think my only question is whether we've explored other options for getting
> > this data in another method, maybe by logging to stdout/stderr and having
> > marionette capture that? Something like that would likely be a lot easier
> > than feeding the exit code through to the eventual main return.
> 
> That would not help other processes which started the Firefox process and
> use the exit code to determine the quit reason of the process.

Presumably there aren't any other such processes currently since we always exit with code 0 right now.

> Also the
> shell would still see a 0, which in case of bug 1370520 wouldn't be
> appropriate.

Why not appropriate?
(In reply to Dave Townsend [:mossop] from comment #6)
> > That would not help other processes which started the Firefox process and
> > use the exit code to determine the quit reason of the process.
> 
> Presumably there aren't any other such processes currently since we always
> exit with code 0 right now.
>
> > Also the
> > shell would still see a 0, which in case of bug 1370520 wouldn't be
> > appropriate.
> 
> Why not appropriate?

Currently other processes falsely assume that Firefox quit without problems in case of content crashes and MOZ_CRASHREPORTER_SHUTDOWN set. So quitting with a more appropriate exit code, would help to better diagnose problems. Or should we use a different approach as normally shutting down Firefox in those cases? Maybe a MOZ_CRASH? Please see bug 1370520.
(In reply to Henrik Skupin (:whimboo) from comment #7)
> (In reply to Dave Townsend [:mossop] from comment #6)
> > > That would not help other processes which started the Firefox process and
> > > use the exit code to determine the quit reason of the process.
> > 
> > Presumably there aren't any other such processes currently since we always
> > exit with code 0 right now.
> >
> > > Also the
> > > shell would still see a 0, which in case of bug 1370520 wouldn't be
> > > appropriate.
> > 
> > Why not appropriate?
> 
> Currently other processes falsely assume that Firefox quit without problems
> in case of content crashes and MOZ_CRASHREPORTER_SHUTDOWN set. So quitting
> with a more appropriate exit code, would help to better diagnose problems.
> Or should we use a different approach as normally shutting down Firefox in
> those cases? Maybe a MOZ_CRASH? Please see bug 1370520.

If a main process crash is detectable in the ways you need then that's probably going to be by far the simplest thing to implement.

That said, this request for setting the exit code does seem to keep coming up (and I bet headless mode could make use of it) so I don't have objections to seeing it implemented, I just expect it to be hairy.

My suggestion for the API would be to just add an optional second argument to nsIAppStartup.quit for the exit code. I would also  add a new constant to nsIAppStartup for each exit code you want to use, this will help different components not use conflicting exit codes.
Probably worth getting Nathan's input here too as the owner of XPCOM he may have opinions.
Flags: needinfo?(nfroyd)
(In reply to Dave Townsend [:mossop] from comment #9)
> Probably worth getting Nathan's input here too as the owner of XPCOM he may
> have opinions.

The proposed API sounds completely reasonable and obviously useful to people today.  If we can have some tests for it, I don't see why we shouldn't add it.
Flags: needinfo?(nfroyd)
Priority: -- → P3

So I assume bug 1675329 fixed that, right? So it looks like we can dupe this bug.

Flags: needinfo?(dtownsend)

I completely forgot about this bug. Looks like I even implemented the same API I suggested two years ago.

Status: NEW → RESOLVED
Closed: 3 years ago
Flags: needinfo?(dtownsend)
Resolution: --- → DUPLICATE
You need to log in before you can comment on or make changes to this bug.