Closed Bug 1434376 Opened 6 years ago Closed 6 years ago

Make BrowserUtils.promiseReflowed and BrowserUtils.promiseLayoutFlushed reliable

Categories

(Firefox :: General, enhancement, P1)

enhancement

Tracking

()

RESOLVED FIXED
Firefox 60
Tracking Status
firefox60 --- fixed

People

(Reporter: mconley, Assigned: mconley)

References

(Blocks 3 open bugs)

Details

(Whiteboard: [fxperf])

Attachments

(3 files)

These two helpers were added in bug 1383367.

Their intent is to make it easier to schedule JavaScript to run after the next reflow has occurred, if a reflow is scheduled. If there is no reflow scheduled, it should run the JavaScript immediately. The hope was to create some functions that would make it easier for our front-end developers to avoid synchronous layout flushes from JavaScript code that needs to query layout or style information from the UI.

Unfortunately, it turns out that one of the underlying assumptions powering these helpers is incorrect. See dbaron's bug 1424823 comment 8 for details.

This bug is about making BrowserUtils.promiseReflowed and BrowserUtils.promiseLayoutFlushed do what is expected of them. We should do this work in cooperation with folks from Layout to make sure our assumptions are correct along the line.
Whiteboard: [fxperf]
Blocks: 1356532
Blocks: 1424823
I'm pretty sure this bug is a P1 regardless of which component it ends up into.

We just can't build what is a significant portion of the performance work we planned for the next six months on an unreliable foundation.

Panos, you mentioned you could help with making sure this bug gets the attention it deserves, can you still help here?
Flags: needinfo?(past)
Priority: -- → P1
Summary: Make BrowserUtils.promiseReflowed and BrowserUtils.promiseLayoutFlushed more reliable → Make BrowserUtils.promiseReflowed and BrowserUtils.promiseLayoutFlushed reliable
Carrying over dbaron's bug 1424823 comment 8 for reference.

---

So from a quick look, this code is pretty far away from doing what I think it intends to do.

promiseLayoutFlushed takes a flushType argument that can be one of three things:  "style", "layout", or "display".

The underling code calls PresShell::NeedsFlush:
https://searchfox.org/mozilla-central/rev/4611b9541894e90a421debb57ddbbcff55c2f369/layout/base/nsIPresShell.h#603-616
which:
 * if the argument is "style", returns true if a style flush is pending
 * if the argument is "layout", returns true if a style or layout flush is pending
 * if the argument is "display", always returns true

It then expects a reflow observer to be called later.  A reflow observer will be called here, I think:
https://searchfox.org/mozilla-central/rev/4611b9541894e90a421debb57ddbbcff55c2f369/layout/base/PresShell.cpp#8804
which means when a reflow actually happens.

Now a style flush being pending doesn't mean that reflow will happen.  It could cause reflow to happen, or it might not -- it depends on what the style changes do, and whether they trigger reflow.

It's perhaps a plausible claim that a layout flush being pending means reflow will happen.  On the other hand, it might be easily refutable (for example, by checking that all callers of https://searchfox.org/mozilla-central/search?q=SetNeedLayoutFlush&case=true&path= actually do something that causes a reflow).  And proving it to be true would require thinking through a lot of edge cases.

However, despite that, it's easy to demonstrate that promiseLayoutFlushed is broken for all three argument types:
 * "style", since there's no guarantee that style being dirty will cause a reflow (for example, if only a color has been changed)
 * "layout", for the same reason as "style", since needsFlush returns true for "layout" if a style flush is needed
 * "display", since needsFlush *always* returns true, and there's certainly no guarantee there will always be a reflow.

I think if you need a general mechanism for this, it needs to be baked much further into layout code.  It's worth figuring out what the use cases are and how general that mechanism needs to be, though, before doing a lot of work.
Also my question from bug 1424823 comment 10:

In the meantime, to fix the immediate shortcomings of this API, is it possible to change something in the DOM or call an API that will _guarantee_ an asynchronous reflow, even if not necessary, whenever we think we _might_ have one pending? This would probably make things slower in some cases, but is better than the current situation because these are the cases where now we have deadlocks.
Flags: needinfo?(dbaron)
I'm not sure about the semantics these functions want, but I'd naively suggest something like:

  if (!needFlush()) { resolve() } else { requestAnimationFrame(() => resolve()) }

That gives you a very naive implementation of that, that guarantees to eventually run the animation frame callback with clean layout, with a gotcha:

Of course another caller that schedules another animation frame callback, or listener for an event that runs in between the flushes from the refresh driver and the frame callbacks (I think only resize events? I'd need to dig up the code) could cause a sync flush to be needed. What to do in those cases is up to you, you could schedule another frame callback, or assert it doesn't happen, etc...
(In reply to Emilio Cobos Álvarez [:emilio] from comment #4)
> I'm not sure about the semantics these functions want, but I'd naively
> suggest something like:
> 
>   if (!needFlush()) { resolve() } else { requestAnimationFrame(() =>
> resolve()) }
> 
> That gives you a very naive implementation of that, that guarantees to
> eventually run the animation frame callback with clean layout, with a gotcha:
> 
> Of course another caller that schedules another animation frame callback, or
> listener for an event that runs in between the flushes from the refresh
> driver and the frame callbacks (I think only resize events? I'd need to dig
> up the code) could cause a sync flush to be needed. What to do in those
> cases is up to you, you could schedule another frame callback, or assert it
> doesn't happen, etc...

Err, I had misread our code initially, and Florian correctly pointed out that we call frame request callbacks before layout:

  https://searchfox.org/mozilla-central/rev/c56f656febb1aa58c72b218699e24a76425bd284/layout/base/nsRefreshDriver.cpp#1902

But that conversation was interesting, here's a bit more context about what's the point and requirements of these functions:

  https://mozilla.logbot.info/fx-team/20180201#c14226498
The rest of bug 1424823 comment 10 for context:

Basically we want to avoid synchronous reflows. So, my understanding is that whenever we need to call .getBoundingClientRect() we have to do it in the promiseLayoutFlushed("layout") callback. I think the "style" case exists for when we need to call .getComputedStyle(). I'm not sure what the "display" case is about.
Just summarizing a long discussion in #fx-team[1] here:

Ultimately, it sounds like what we want here is a way for BrowserUtils to know that a refresh driver tick resulting in either a style or layout flush is:

1) Coming up because the DOM has been dirtied
2) If (1) is true, we need to know when that flush (either style OR layout) has completed.

I think this would give us what we need to be sure that the DOM is safe to query layout/style information from.

[1]: https://mozilla.logbot.info/fx-team/20180201
Relevant bug that smaug filed back in the day with some interesting commentary: https://www.w3.org/Bugs/Public/show_bug.cgi?id=28644

Apparently, smaug is looking for some help in convincing folks that this is a web API worth standardizing and shipping.

Also interesting is this PresShell method: https://searchfox.org/mozilla-central/rev/f41017f5cd7b5f3835d282fbe011d2899574fc2b/layout/base/PresShell.cpp#9373

which might give us half of what we want here.
Hey dbaron,

Would you be opposed to someone exposing a new mechanism on nsIDOMWindowUtils (or anywhere available to privileged script, really) that takes a JS callback, registers a nsAPostRefreshObserver on the associated PresShell[1], then in the observer's DidRefresh, has the observer deregister itself and fire the callback?

Also, supposing the above is acceptable, are there any RAII primitives lying around that we could use to prevent DOM writes while the JS callback is executing?

[1]: https://searchfox.org/mozilla-central/rev/f41017f5cd7b5f3835d282fbe011d2899574fc2b/layout/base/nsIPresShell.h#1619
Comment on attachment 8948058 [details]
Bug 1434376 - Introduce ChromeOnly window.promiseDocumentFlushed to detect when refresh driver ticks have completed.

This appears to give me what I wanted here, save for protection from DOM manipulation within the callback (which I might fix in a follow-up with both linting and crashing).

I'm happy to bikeshed on the name - requestNotifyDidRefresh was the first thing I came up with and I'm not attached to it by any means.

Hey bz, would something like this be acceptable in our DOM code?
Attachment #8948058 - Flags: feedback?(bzbarsky)
For the record, I considered using the pre-existing MozAfterPaint instead, but from my examination of the code (and someone feel free to correct me here), MozAfterPaint only* fires after invalidations, which might not occur even after a style and layout calculation.

* There are also cases where MozAfterPaint fires after a timer notification, but mantaroh is looking to get rid of that case in bug 1419226.
Comment on attachment 8948058 [details]
Bug 1434376 - Introduce ChromeOnly window.promiseDocumentFlushed to detect when refresh driver ticks have completed.

https://reviewboard.mozilla.org/r/217692/#review223466


Static analysis found 1 defect in this patch.
 - 1 defect found by clang-tidy

You can run this analysis locally with:
 - `./mach static-analysis check path/to/file.cpp` (C/C++)


If you see a problem in this automated review, please report it here: http://bit.ly/2y9N9Vx


::: dom/base/nsGlobalWindowInner.cpp:7352
(Diff revision 1)
> +    : mShell(aShell)
> +    , mCallback(&aCallback)
> +  {
> +  }
> +
> +  virtual ~NotifyDidRefreshListener()

Warning: Use '= default' to define a trivial destructor [clang-tidy: modernize-use-equals-default]
Comment on attachment 8948058 [details]
Bug 1434376 - Introduce ChromeOnly window.promiseDocumentFlushed to detect when refresh driver ticks have completed.

https://reviewboard.mozilla.org/r/217692/#review223470

::: dom/base/nsGlobalWindowInner.cpp:7362
(Diff revision 1)
> +    ErrorResult rv;
> +    mCallback->Call(rv);
> +    rv.SuppressException();

Just do:

    mCallback->Call();
  
if you don't care about exceptions from it.

::: dom/base/nsGlobalWindowInner.cpp:7367
(Diff revision 1)
> +    ErrorResult rv;
> +    mCallback->Call(rv);
> +    rv.SuppressException();
> +
> +    if (!mShell->RemovePostRefreshObserver(this)) {
> +      MOZ_ASSERT_UNREACHABLE("Unable to unregister post-refresh observer! Leaking it instead of leaving garbage registered");

This isn't unreachable at all.  The JS callback could have torn down the presshell, for example.

Observers that call code for which they know what all the side-effects are have it easy... ;)

We could remove ourselves _before_ calling mCallback.  But even that can fail if an earlier NotifyDidRefreshListener killed the presshell.

Maybe we should just take the approach  CounterStyleCleaner does: store a reference to the refresh driver in our class, so we can guarantee we remove ourselves.

That said, I don't understand how these "delete self when refresh driver ticks" things avoid leaking if it never ticks before it dies....  That's not a problem your patch is making worse, per se, but I really don't like the idea of leaking things that entrain JS here.

::: dom/base/nsGlobalWindowInner.cpp:7404
(Diff revision 1)
> +  if (!shell) {
> +    aError.Throw(NS_ERROR_FAILURE);
> +    return;
> +  }
> +
> +  shell->AddPostRefreshObserver(new NotifyDidRefreshListener(shell, aCallback));

If that returns false (though it generally shouldn't if we just got the shell from GetShell(), I'd think) this would leak...
Attachment #8948058 - Flags: feedback?(bzbarsky) → feedback+
Comment on attachment 8948059 [details]
Bug 1434376 - Switch over all uses of BrowserUtils.promiseLayoutFlushed to window.promiseDocumentFlushed.

https://reviewboard.mozilla.org/r/217694/#review223504

::: toolkit/modules/BrowserUtils.jsm:648
(Diff revision 1)
> +        win.requestNotifyDidRefresh(() => {
>            resolve(callback());
> +        });

Note that this Promise is not fulfilled soon without the patch for bug 1193394.  I *think* the Promise is fulfilled after other script run.  So it might be possible that somthing that makes layout information stale sneaks into there.

Also we resolve FontFaceSet.ready promise after flushing layout [1], so with the patch for 1193394, this ready promise might sneak into requestNotifyDidRefresh if there is no micro task checkpoint between them.  I did search code in browser/, fortunately it seems that we haven't used the ready promise there.

Another thing I noticed is that gfxSVGGlyphsDocument also is a nsAPostRefreshObserver, and it seems to me that it might end up calling FrameNeedsReflow(). 


[1] https://hg.mozilla.org/mozilla-central/file/ef1fefe4c6d1/layout/base/nsRefreshDriver.cpp#l1921

[2] https://searchfox.org/mozilla-central/source/gfx/thebes/gfxSVGGlyphs.cpp#182

[3] https://searchfox.org/mozilla-central/source/layout/generic/nsTextFrame.cpp#681
Comment on attachment 8948059 [details]
Bug 1434376 - Switch over all uses of BrowserUtils.promiseLayoutFlushed to window.promiseDocumentFlushed.

https://reviewboard.mozilla.org/r/217694/#review223504

> Note that this Promise is not fulfilled soon without the patch for bug 1193394.  I *think* the Promise is fulfilled after other script run.  So it might be possible that somthing that makes layout information stale sneaks into there.
> 
> Also we resolve FontFaceSet.ready promise after flushing layout [1], so with the patch for 1193394, this ready promise might sneak into requestNotifyDidRefresh if there is no micro task checkpoint between them.  I did search code in browser/, fortunately it seems that we haven't used the ready promise there.
> 
> Another thing I noticed is that gfxSVGGlyphsDocument also is a nsAPostRefreshObserver, and it seems to me that it might end up calling FrameNeedsReflow(). 
> 
> 
> [1] https://hg.mozilla.org/mozilla-central/file/ef1fefe4c6d1/layout/base/nsRefreshDriver.cpp#l1921
> 
> [2] https://searchfox.org/mozilla-central/source/gfx/thebes/gfxSVGGlyphs.cpp#182
> 
> [3] https://searchfox.org/mozilla-central/source/layout/generic/nsTextFrame.cpp#681

Thanks, hiro! I appreciate you helping me find these issues. How about this for a solution:

1. I alter my nsAPostRefreshObserver to check `nsIPresShell::NeedFlush` for `FlushType::Style` before running the JS callback. If `NeedFlush` returns true, I bail out and wait for the next tick (since one is queued up due to the flush being required).
2. I wait for bug 1193394 to land before I land this code so that we ensure that the Promise is fulfilled soon enough to not let any intermediate script run between the callback firing and the Promise resolving

Would that approach address your concerns?
ni?ing hiro for comment 17. I think I can safely clear the ni?'s on dbaron and past.
Flags: needinfo?(past)
Flags: needinfo?(hikezoe)
Flags: needinfo?(dbaron)
(In reply to Mike Conley (:mconley) (:⚙️) from comment #17)
> Comment on attachment 8948059 [details]
> Bug 1434376 - Switch BrowserUtils.promiseReflowed to the new
> requestNotifyDidRefresh API.
> 
> https://reviewboard.mozilla.org/r/217694/#review223504
> 
> > Note that this Promise is not fulfilled soon without the patch for bug 1193394.  I *think* the Promise is fulfilled after other script run.  So it might be possible that somthing that makes layout information stale sneaks into there.
> > 
> > Also we resolve FontFaceSet.ready promise after flushing layout [1], so with the patch for 1193394, this ready promise might sneak into requestNotifyDidRefresh if there is no micro task checkpoint between them.  I did search code in browser/, fortunately it seems that we haven't used the ready promise there.
> > 
> > Another thing I noticed is that gfxSVGGlyphsDocument also is a nsAPostRefreshObserver, and it seems to me that it might end up calling FrameNeedsReflow(). 
> > 
> > 
> > [1] https://hg.mozilla.org/mozilla-central/file/ef1fefe4c6d1/layout/base/nsRefreshDriver.cpp#l1921
> > 
> > [2] https://searchfox.org/mozilla-central/source/gfx/thebes/gfxSVGGlyphs.cpp#182
> > 
> > [3] https://searchfox.org/mozilla-central/source/layout/generic/nsTextFrame.cpp#681
> 
> Thanks, hiro! I appreciate you helping me find these issues. How about this
> for a solution:
> 
> 1. I alter my nsAPostRefreshObserver to check `nsIPresShell::NeedFlush` for
> `FlushType::Style` before running the JS callback. If `NeedFlush` returns
> true, I bail out and wait for the next tick (since one is queued up due to
> the flush being required).

Yeah, it's similar to what I was going to suggest (but forgot).  What I was going to suggest is to add MOZ_ASSERT(!NeedFlush()) somehow because I thought we can allow stale layout in the case where we have multiple promiseReflowed observers.  But it seems not?  (As for FontFaceSet.ready and gfxSVGGlyphsDocument, I guess we haven't used in browser code and will not allow using in future?)

> 2. I wait for bug 1193394 to land before I land this code so that we ensure
> that the Promise is fulfilled soon enough to not let any intermediate script
> run between the callback firing and the Promise resolving

We've been working hard on bug 1193394, but we are not yet able to guarantee when it can be landed.  I think we can land this bug if there is no trouble on try. :)
Flags: needinfo?(hikezoe)
Comment on attachment 8948059 [details]
Bug 1434376 - Switch over all uses of BrowserUtils.promiseLayoutFlushed to window.promiseDocumentFlushed.

https://reviewboard.mozilla.org/r/217694/#review223886

::: toolkit/modules/BrowserUtils.jsm:648
(Diff revision 1)
> +        win.requestNotifyDidRefresh(() => {
>            resolve(callback());
> +        });

You'll need a second try-catch block inside the requestNotifyDidRefresh callback to propagate errors.
Attachment #8948059 - Flags: feedback?(hikezoe)
Comment on attachment 8948058 [details]
Bug 1434376 - Introduce ChromeOnly window.promiseDocumentFlushed to detect when refresh driver ticks have completed.

https://reviewboard.mozilla.org/r/217692/#review223470

> This isn't unreachable at all.  The JS callback could have torn down the presshell, for example.
> 
> Observers that call code for which they know what all the side-effects are have it easy... ;)
> 
> We could remove ourselves _before_ calling mCallback.  But even that can fail if an earlier NotifyDidRefreshListener killed the presshell.
> 
> Maybe we should just take the approach  CounterStyleCleaner does: store a reference to the refresh driver in our class, so we can guarantee we remove ourselves.
> 
> That said, I don't understand how these "delete self when refresh driver ticks" things avoid leaking if it never ticks before it dies....  That's not a problem your patch is making worse, per se, but I really don't like the idea of leaking things that entrain JS here.

> That's not a problem your patch is making worse, per se, but I really don't like the idea of leaking things that entrain JS here.

I suspect we don't want to hand lifetime management over to the RefreshDriver or PresShell... right? But perhaps we can add a new function to `nsAPostRefreshObserver` that gets called before RefreshDriver destruction, so that these things have a chance to tear themselves down. I could file a follow-up bug for that if the idea is worth pursuing.
> But perhaps we can add a new function to `nsAPostRefreshObserver` that gets called before RefreshDriver
> destruction, so that these things have a chance to tear themselves down.

That would work, I think...
See Also: → 1436152
(In reply to Boris Zbarsky [:bz] (no decent commit message means r-) from comment #24)
> That would work, I think...

Cool - filed bug 1436152.
Comment on attachment 8948059 [details]
Bug 1434376 - Switch over all uses of BrowserUtils.promiseLayoutFlushed to window.promiseDocumentFlushed.

This patch looks fine to me.

As for the other patch, :bz's comment noticed that we probably need to check that the presshell is still alive after calling DidRefresh().
Attachment #8948059 - Flags: feedback?(hikezoe) → feedback+
(In reply to Hiroyuki Ikezoe (:hiro) from comment #27)
> As for the other patch, :bz's comment noticed that we probably need to check
> that the presshell is still alive after calling DidRefresh().

Ah, right! Thanks.
Comment on attachment 8948059 [details]
Bug 1434376 - Switch over all uses of BrowserUtils.promiseLayoutFlushed to window.promiseDocumentFlushed.

https://reviewboard.mozilla.org/r/217694/#review224244

::: toolkit/modules/BrowserUtils.jsm:644
(Diff revision 3)
>     *
>     * @param {Document} doc
>     * @param {function} callback
>     * @returns {Promise}
>     */
>    promiseReflowed(doc, callback) {

There are no external consumers of promiseReflowed, so I would just inline this entire function.

::: toolkit/modules/BrowserUtils.jsm:646
(Diff revision 3)
> -      observer = new ReflowObserver(doc);
> -      reflowObservers.set(doc, observer);
> -    }
> -
>      return new Promise((resolve, reject) => {
> -      observer.callbacks.push(() => {
> +      let win = doc.ownerGlobal;

nit: we can inline this variable.

::: toolkit/modules/BrowserUtils.jsm:661
(Diff revision 3)
>    /**
>     * Calls the given function as soon as a layout flush of the given
>     * type is not necessary, and returns a promise which resolves to the
>     * callback's return value after it executes.
>     *
>     * The function *must not trigger any reflows*, or make any changes
>     * which would require a layout flush.
>     *
>     * @param {Document} doc
>     * @param {string} flushType
>     *        The flush type required. Must be one of:
>     *
>     *          - "style"
>     *          - "layout"
>     *          - "display"
>     * @param {function} callback
>     * @returns {Promise}
>     */

This comment needs updating with a bit more context and/or a reference to the MDN article. In particular, this needs a description of when the "style" and "layout" flush types should be used.

Apparently, needsFlush for the "display" case always returns true, and there is only one consumer, which probably wants the "layout" case instead:

https://dxr.mozilla.org/mozilla-central/rev/0d806b3230fe4767fa70cdee57db87d1e9a5ba49/browser/components/customizableui/CustomizeMode.jsm#2426

We can probably rewrite this single consumer and stop supporting this argument type. We can then likely remove the FLUSH_TYPES array.

::: toolkit/modules/BrowserUtils.jsm:679
(Diff revision 3)
>     *          - "layout"
>     *          - "display"
>     * @param {function} callback
>     * @returns {Promise}
>     */
>    async promiseLayoutFlushed(doc, flushType, callback) {

nit: in a separate changeset, we could consider renaming promiseLayoutFlushed to promiseDidRefresh or something, to align the naming with the underlying API.
Attachment #8948059 - Flags: review?(paolo.mozmail)
I've added some tests now, and I'm pushing to try. I'm going to wait for a green try run before re-requesting review.
Comment on attachment 8948059 [details]
Bug 1434376 - Switch over all uses of BrowserUtils.promiseLayoutFlushed to window.promiseDocumentFlushed.

https://reviewboard.mozilla.org/r/217694/#review224350

::: commit-message-41963:8
(Diff revision 4)
> +Note that promiseDocumentFlushed was chosen over promiseDidRefresh or promiseRefreshed
> +to avoid potential confusion over the actual network-level refresh of browsers or
> +documents.

Makes sense!

::: toolkit/modules/BrowserUtils.jsm:666
(Diff revision 4)
> -    });
> -  },
> +      } catch (e) {
> +        reject(e);

Uh, sorry for not noticing it earlier, but the _outer_ try-catch block is actually not necessary because any exception in the executor function will already reject the new Promise.
Assignee: nobody → mconley
Status: NEW → ASSIGNED
Comment on attachment 8948059 [details]
Bug 1434376 - Switch over all uses of BrowserUtils.promiseLayoutFlushed to window.promiseDocumentFlushed.

https://reviewboard.mozilla.org/r/217694/#review225146

We should report to the console any error excluding the one thrown when the document goes away, in particular to report the NS_ERROR_DOM_NO_MODIFICATION_ALLOWED_ERR case.

::: browser/components/customizableui/PanelMultiView.jsm:400
(Diff revision 6)
>          // binding for the <panelmultiview> element may still be disconnected.
>          // In this case, give the layout code a chance to run.
>          if (!this.connected) {
> -          await BrowserUtils.promiseLayoutFlushed(this.document, "layout",
> -                                                  () => {});
> +          try {
> +            await this.window.promiseDocumentFlushed(() => {});
> +          } catch (e) {

nit: "catch (ex)" instead of "catch (e)". (I didn't bother to mention this before when you were touching a single file, but you're adding many instances here.)
Attachment #8948059 - Flags: review?(paolo.mozmail)
Comment on attachment 8948058 [details]
Bug 1434376 - Introduce ChromeOnly window.promiseDocumentFlushed to detect when refresh driver ticks have completed.

https://reviewboard.mozilla.org/r/217692/#review225150

Is the reason for this not being implemented as "document.promiseFlushed" that it is simpler or more logical to have this on the window object?
Comment on attachment 8949157 [details]
Bug 1434376 - Add basic tests for window.promiseDocumentFlushed.

https://reviewboard.mozilla.org/r/218558/#review225152

::: dom/base/test/browser_promiseDocumentFlushed.js:70
(Diff revision 3)
> +  await Assert.rejects(window.promiseDocumentFlushed(() => {
> +    dirtyStyleAndLayout();
> +  }));

Add an argument to Assert.rejects to check for the specific exception message that is expected in this case. This is the most common way to use it:

Assert.rejects(fn, /substring of the expected message/);

Same for all the other calls.
Comment on attachment 8949157 [details]
Bug 1434376 - Add basic tests for window.promiseDocumentFlushed.

https://reviewboard.mozilla.org/r/218558/#review225152

> Add an argument to Assert.rejects to check for the specific exception message that is expected in this case. This is the most common way to use it:
> 
> Assert.rejects(fn, /substring of the expected message/);
> 
> Same for all the other calls.

Would it be sufficient to check for error codes? I'll post a new patch to show you what I mean.
Attachment #8948058 - Flags: review?(bzbarsky)
Attachment #8949157 - Flags: review?(bzbarsky)
Blocks: 1437512
There are three more edge cases that occurred to me. We may consider addressing those immediately or in a follow-up shortly after. They don't strictly block the feature, but if unaddressed they may cause headaches in the future.

Firstly, we have to add a test that makes sure that two or more consecutive invocations of promiseDocumentFlushed result in all the provided callbacks being called first, and all the promise resolution callbacks being called afterwards. This is not explicitly coded this way in the patch, and it might be the case now because of current Promise scheduling, but if I remember correctly we're adding more microtask checkpoints, meaning that this behavior may break in the future.

Secondly, if one of the callbacks required a flush, we currently raise an exception for all the subsequent callbacks. To avoid this we may either check whether NeedFlush is already true before invoking each callback, change the code structure and keep a corresponding flag around while we cycle through the invocations, or pause the invocations and wait for the next refresh driver tick when this situation occurs. We should have a test for this too.

Lastly, we should either forbid invoking promiseDocumentFlushed from within the callback, or ensure we handle re-entrancy properly. Also with a test :-)
Hi Mike, is there anything you want to do before asking for review on the patches? The current implementation of promiseLayoutFlushed is blocking some improvement work to PanelMultiView. As I mentioned, we may address comment 47 in a follow-up.
Flags: needinfo?(mconley)
(In reply to :Paolo Amadini from comment #51)
> Hi Mike, is there anything you want to do before asking for review on the
> patches? The current implementation of promiseLayoutFlushed is blocking some
> improvement work to PanelMultiView. As I mentioned, we may address comment
> 47 in a follow-up.

Hey Paolo, sorry about the radio-silence here. I haven't done a good job of updating this bug with my current status.

I'm struggling with a test failure right now. One of the changes I made (as per our discussion) was to resolve the promiseDocumentFlushed Promise without running the callback if the window is closed before the refresh driver tick. We were rejecting before, but if we resolve, that means we can stop wrapping all usage with try/catch (and we'll only reject when something has gone horribly wrong, like the developer has dirtied the DOM inside the callback).

The problem is that in my test for this window-closing scenario, the Promise resolves, but for some reason, the await inside the add_task doesn't "notice" and resume the test function allowing it to finish. I think that also means that, in practice, any async function that uses window.promiseDocumentFlushed won't resume if it awaits and the window closes before the refresh driver tick.

I've been working with till and arai in #jsapi to figure this out. arai made a discovery last night:

<arai> till, mconley: the promise callback job is not executed because `global->IsDying()` is true here: https://searchfox.org/mozilla-central/rev/9011be0a172711bc243e50dfca16d42e877bf4ec/xpcom/base/CycleCollectedJSContext.cpp#207
7:18 AM so, the job is popped from the queue, and just discarded
7:18 AM <arai> I wonder if the situation becomes different if bug 1193394 gets fixed
7:19 AM <arai> the Run above is called here https://searchfox.org/mozilla-central/rev/9011be0a172711bc243e50dfca16d42e877bf4ec/dom/promise/Promise.cpp#531
7:19 AM the job is popped in previous line
7:20 AM so, at least with current implementation, promise callback in dying global is not guaranteed to be executed
7:24 AM <arai> even if the promise is already fulfilled
7:26 AM (haven't yet confirmed but I think the global is dying because the window is closed by `await BrowserTestUtils.closeWindow(win);`, just before `await promise`
7:33 AM the check for `global->IsDying()` is not touched by https://bugzilla.mozilla.org/attachment.cgi?id=8950788&action=diff
7:33 AM so, I think the same thing will happen even after bug 1193394
7:45 AM <arai> at least this works  https://pastebin.mozilla.org/9078002
7:45 AM but I'm not sure if that's the right fix
7:46 AM <arai> I guess I should check what the spec says about dying global

So I think I need to clear this up before I request review.
Flags: needinfo?(mconley)
Okay, I think I have this working now. I think I've found a way of connecting the Promise's lifetime to the caller's global.

New patches coming.
Comment on attachment 8948058 [details]
Bug 1434376 - Introduce ChromeOnly window.promiseDocumentFlushed to detect when refresh driver ticks have completed.

https://reviewboard.mozilla.org/r/217692/#review226532

::: dom/base/nsGlobalWindowInner.cpp:875
(Diff revision 8)
> +  void Call(nsIPresShell* aShell = nullptr)
> +  {
> +    MOZ_ASSERT(nsContentUtils::IsSafeToRunScript());
> +
> +    if (aShell) {
> +      MOZ_ASSERT(!aShell->NeedFlush(FlushType::Style));

You can't assert that.  If you had multiple PromiseDocumentFlushedCallbacks, an earlier one could have messed up.

::: dom/base/nsGlobalWindowInner.cpp:888
(Diff revision 8)
> +    ErrorResult error;
> +    JS::Rooted<JS::Value> returnVal(RootingCx());
> +    mCallback->Call(&returnVal, error);
> +
> +    if (aShell && aShell->NeedFlush(FlushType::Style)) {
> +      error.ThrowDOMException(NS_ERROR_DOM_NO_MODIFICATION_ALLOWED_ERR,

error may already have an exception on it.  You can't just clobber it like that....  Probably OK to just check for !error.Failed() around this?

::: dom/base/nsGlobalWindowInner.cpp:1678
(Diff revision 8)
>      tmp->DisconnectAndClearGroupMessageManagers();
>      NS_IMPL_CYCLE_COLLECTION_UNLINK(mChromeFields.mGroupMessageManagers)
>    }
>  
>    NS_IMPL_CYCLE_COLLECTION_UNLINK(mPendingPromises)
> +  NS_IMPL_CYCLE_COLLECTION_UNLINK(mDocumentFlushedResolvers)

Don't you need to traverse it too?

::: dom/base/nsGlobalWindowInner.cpp:7461
(Diff revision 8)
> +    return nullptr;
> +  }
> +
> +  nsIPresShell* shell = mDoc->GetShell();
> +  if (!shell) {
> +    aError.Throw(NS_ERROR_FAILURE);

What if the shell just hasn't been created yet?  What's the expected behavior in that case?

::: dom/base/nsGlobalWindowInner.cpp:7466
(Diff revision 8)
> +    aError.Throw(NS_ERROR_FAILURE);
> +    return nullptr;
> +  }
> +
> +  // We need to associate the lifetime of the Promise to the lifetime
> +  // of the callers global. That way, if the window we're observing for

"caller's"

::: dom/base/nsGlobalWindowInner.cpp:7466
(Diff revision 8)
> +    aError.Throw(NS_ERROR_FAILURE);
> +    return nullptr;
> +  }
> +
> +  // We need to associate the lifetime of the Promise to the lifetime
> +  // of the callers global. That way, if the window we're observing for

s/for//

::: dom/base/nsGlobalWindowInner.cpp:7484
(Diff revision 8)
> +
> +  RefPtr<PromiseDocumentFlushedResolver> flushResolver =
> +    new PromiseDocumentFlushedResolver(resultPromise, aCallback);
> +
> +  if (!shell->NeedFlush(FlushType::Style)) {
> +    flushResolver->Call(shell);

Um... Having this sync callback here is pretty weird from my point of view.

::: dom/base/nsGlobalWindowInner.cpp:7521
(Diff revision 8)
> +  }
> +
> +  nsIPresShell* shell = mDoc->GetShell();
> +  MOZ_ASSERT(shell);
> +  if (!shell) {
> +    // Is this possible?

Probably not, in a DidRefresh notification.

::: dom/base/nsGlobalWindowInner.cpp:7541
(Diff revision 8)
> +  }
> +
> +  rejectionGuard.release();
> +
> +  for (const auto& documentFlushedCallback : mDocumentFlushedResolvers) {
> +    documentFlushedCallback->Call(shell);

Each of those calls runs script synchronously.  After the first one, you have no guarantee that "shell" (or for that matter "this") is alive.

::: dom/webidl/Window.webidl:466
(Diff revision 8)
> +   *
> +   * @param {function} callback
> +   * @returns {Promise}
> +   */
> +  [Throws, Func="nsGlobalWindowInner::IsPrivilegedChromeWindow"]
> +  Promise<any> promiseDocumentFlushed(PromiseDocumentFlushedCallback callback);

So why do we have the sync callback?

Would it not make more sense to just return a Promise and if people want callbacks to run, they can .then() it.

If we want to differentiate between "we're being torn down" and "refresh driver ticked" we can encode that in the resolution value for the Promise.

::: layout/style/nsStyleStruct.h:2335
(Diff revision 8)
>    {
>      MOZ_ASSERT(mType == StyleBasicShapeType::Polygon, "expected polygon");
>      mFillRule = aFillRule;
>    }
>  
> -  Position& GetPosition() {
> +  mozilla::Position& GetPosition() {

Seems unrelated?  Why is this needed?
Attachment #8948058 - Flags: review?(bzbarsky) → review-
Comment on attachment 8948058 [details]
Bug 1434376 - Introduce ChromeOnly window.promiseDocumentFlushed to detect when refresh driver ticks have completed.

https://reviewboard.mozilla.org/r/217692/#review226532

> You can't assert that.  If you had multiple PromiseDocumentFlushedCallbacks, an earlier one could have messed up.

This is true, but I also want to make it next to (if not totally) impossible to accidentally land code that queues up flushes inside any of the callbacks. I'm hoping that a debug-build crash will be enough to catch anything that accidentally slips through.

For example, a consumer of the API might wrap promiseDocumentFlushed in a try/catch, and just swallow the NS_ERROR_DOM_NO_MODIFICATION_ALLOWED_ERR without doing anything. I'd really like to avoid that kind of abuse.

Given the above, is the MOZ_ASSERT still unreasonable? Should I use something else?

> What if the shell just hasn't been created yet?  What's the expected behavior in that case?

I don't think I'm fully understanding here. If the shell hasn't been created yet, won't `shell` be `nullptr`? And if so, well, we throw an NS_ERROR_FAILURE. I thought that seemed like a reasonably reaction.

I don't think that's the right interpretation of your question - but I don't know how else to read it. Can you please re-phrase it?

> Um... Having this sync callback here is pretty weird from my point of view.

I guess I just really want to avoid running the JS at any time where we're not 100% sure that style and layout flushes aren't queued up. FWIW, this is how BrowserUtils.promiseLayoutFlushed worked as well.

Is this prohibitively weird?

> Each of those calls runs script synchronously.  After the first one, you have no guarantee that "shell" (or for that matter "this") is alive.

The script running within these callbacks should only ever be _reading_ style and layout values, so if the script somehow causes the shell or window to go away, something's gone wrong. I'm happy to diagnostic-assert that the shell and window are still present after each call. Or would you recommend something else? Holding a (kung-fu-death) grip on the shell / this while running each callback?

> So why do we have the sync callback?
> 
> Would it not make more sense to just return a Promise and if people want callbacks to run, they can .then() it.
> 
> If we want to differentiate between "we're being torn down" and "refresh driver ticked" we can encode that in the resolution value for the Promise.

I want to ensure that the callback() runs at a very particular time - _right_ after the DidRefresh driver tick, to avoid the possibility that other Promises in the microtask queue end up resolving first, and queueing up style or layout flushes before we get a chance to run.

Is that concern unfounded?

> Seems unrelated?  Why is this needed?

When I imported nsRefreshDriver.h into nsGlobalInnerWindow, there was apparently enough ambiguity over things called "Position" over here in this other header (which is presumably imported as a result of nsRefreshDriver.h), that I had to disambiguate.
Thanks for the fast and thorough review, bz! I have a few return questions - please see above in comment 58.
Flags: needinfo?(bzbarsky)
> I also want to make it next to (if not totally) impossible to accidentally
> land code that queues up flushes inside any of the callbacks

I guess my past experience with C++ asserting things that depended on JS maintaining some invariant is ... not hopeful.  Typically a light application of fuzzing demonstrated that the JS had or could be cajoled into having side effects it did not intend.

> I'm hoping that a debug-build crash will be enough to catch anything that accidentally slips through.

I guess we can try....  I wonder how many people even run debug builds.  And whether we'll have enough test coverage to catch anything useful there.

> Should I use something else?

MOZ_DIAGNOSTIC_ASSERT would probably be better, if we're trying to really catch problems with it.  Not sure we want to go as far as MOZ_RELEASE_ASSERT.

The lack of information about the misbehaving callee might end up annoying.

> If the shell hasn't been created yet, won't `shell` be `nullptr`?

Yes.  Let me give a concrete example:

   <iframe style="display: none"></iframe>
   <script>
     var i = document.querySelector("iframe");
     i.style.display = "";
     i.contentWindow.promiseDocumentFlushed(() => 1);
   </script>

Do you expect that promiseDocumentFlushed call to throw?  Because with this patch it will: we toggled style on the iframe, but haven't processed that style change yet, so there is no presshell in there yet.  As soon as we flush styles on ourselves that presshell in the child will appear....

> I just really want to avoid running the JS at any time where we're not 100% sure that style and layout flushes aren't queued up.

I guess I just feel like things that sometimes reenter you sync and sometimes async are pretty hard to work with.

At the very least, this behavior needs to be clearly documented.

> The script running within these callbacks should only ever be _reading_ style and layout values,

OK, but _reading_ style in general flushes layout on the parent, which can trigger resize events and arbitrary script execution.

Not that the parent flush happens even if our presshell needs no style flush, precisely because changes to parent layout can cause our styles to change (and cause us to need a style flush).

> to avoid the possibility that other Promises in the microtask queue end up resolving first

OK.  This should be very clearly documented, because it's non-obvious.  ;)
Flags: needinfo?(bzbarsky)
Blocks: 1383367
(In reply to Boris Zbarsky [:bz] (no decent commit message means r-) from comment #60)
> > to avoid the possibility that other Promises in the microtask queue end up resolving first
> 
> OK.  This should be very clearly documented, because it's non-obvious.  ;)

Yes, and in particular, as I mentioned in comment 47, all the pending callbacks of promiseDocumentFlushed will be called first, and all the Promise resolution handlers will be invoked afterwards. In fact the promise resolution handlers will very likely modify the DOM. That's why we need both a synchronous callback and a resolution Promise.

Mike, can you add a very simple test case for execution order?
(In reply to Boris Zbarsky [:bz] (no decent commit message means r-) from comment #60)
>    <iframe style="display: none"></iframe>
>    <script>
>      var i = document.querySelector("iframe");
>      i.style.display = "";
>      i.contentWindow.promiseDocumentFlushed(() => 1);
>    </script>

Looks like this can become a test case too. Thanks :-)
Comment on attachment 8948058 [details]
Bug 1434376 - Introduce ChromeOnly window.promiseDocumentFlushed to detect when refresh driver ticks have completed.

https://reviewboard.mozilla.org/r/217692/#review226766

::: dom/base/nsGlobalWindowInner.cpp:875
(Diff revision 8)
> +  void Call(nsIPresShell* aShell = nullptr)
> +  {
> +    MOZ_ASSERT(nsContentUtils::IsSafeToRunScript());
> +
> +    if (aShell) {
> +      MOZ_ASSERT(!aShell->NeedFlush(FlushType::Style));

In any case, if you have three promiseDocumentFlushed callbacks scheduled and it is the first one that dirtied the DOM, you don't want to assert again for the other two calls, or even raise an exception for what it's worth.

If you want an assertion it should be logically placed near the error.ThrowDOMException call, and it should be fired even if the callback raised a different exception. Unfortunately, I don't know if an assertion would allow you to write tests that pass...

::: dom/base/nsGlobalWindowInner.cpp:7540
(Diff revision 8)
> +    return;
> +  }
> +
> +  rejectionGuard.release();
> +
> +  for (const auto& documentFlushedCallback : mDocumentFlushedResolvers) {

nit: the for loop variable should be called documentFlushedResolver, to avoid confusion with the synchronous callback stored inside the resolver.
(In reply to :Paolo Amadini from comment #63)
> If you want an assertion it should be logically placed near the
> error.ThrowDOMException call, and it should be fired even if the callback
> raised a different exception. Unfortunately, I don't know if an assertion
> would allow you to write tests that pass...

Also note that if the callback dirtied the DOM and _then_ called getBoundingClientRect, the exception would not be thrown and the assertion would not fire. This would be easily caught by tests for synchronous reflows, though.
Comment on attachment 8948058 [details]
Bug 1434376 - Introduce ChromeOnly window.promiseDocumentFlushed to detect when refresh driver ticks have completed.

https://reviewboard.mozilla.org/r/217692/#review226774

::: dom/base/nsGlobalWindowInner.cpp:877
(Diff revision 8)
> +      // If no shell was passed, that means that the window was in the process
> +      // of shutting down. We'll just resolve without running the callbacks.
> +      mPromise->MaybeResolveWithUndefined();
> +      return;

Can we still call the callbacks in this case, which may likely raise exceptions but maybe not, and then resolve or reject the returned Promise?

It seems to me we may just want to keep the natural code flow in the caller, instead of adding a special case which is to resolve the Promise to "undefined". The behavior is likely broken in both cases, but probably easier to follow when debugging.
Comment on attachment 8949157 [details]
Bug 1434376 - Add basic tests for window.promiseDocumentFlushed.

https://reviewboard.mozilla.org/r/218558/#review226770

::: dom/base/test/browser_promiseDocumentFlushed.js:1
(Diff revision 5)
> +"use strict";

Public domain header?

::: dom/base/test/browser_promiseDocumentFlushed.js:32
(Diff revision 5)
> +/**
> + * Returns true if exception ex is a NO_MODIFICATION_ALLOWED_ERR
> + * DOMException, meaning that the callback passed to
> + * promiseDocumentFlushed possibly caused style or layout flushing to be
> + * queued.
> + *
> + * @param {Exception} An exception that promiseDocumentFlushed rejected
> + *        its Promise with.
> + * @returns {bool} true if the exception is NO_MODIFICATION_ALLOWED_ERR.
> + */
> +function isNotModifiedException(ex) {
> +  return (ex instanceof DOMException &&
> +          ex.code == DOMException.NO_MODIFICATION_ALLOWED_ERR);
> +}

I think we can just inline the function, with no comment.

The documentation of the exception should be added to the description in "Window.webidl" instead.

::: dom/base/test/browser_promiseDocumentFlushed.js:52
(Diff revision 5)
> +add_task(async function setup() {
> +  registerCleanupFunction(cleanTheDOM);
> +});
> +
> +/**
> + * Tests that if the DOM is dirty, that BTU.promiseDocumentFlushed will

Obsolete references to BrowserTestUtils.

::: dom/base/test/browser_promiseDocumentFlushed.js:55
(Diff revision 5)
> +
> +/**
> + * Tests that if the DOM is dirty, that BTU.promiseDocumentFlushed will
> + * resolve once layout and style have been flushed.
> + */
> +add_task(async function test_pDF() {

nit: Remove _pDF from test function names and call this "test_basic", as it's redundant with the test file name.

Also, I would just always spell out the function name fully in code comments, since PDF definitely isn't a unique enough acronym for text searches ;-)

::: dom/base/test/browser_promiseDocumentFlushed.js:59
(Diff revision 5)
> +    let utils = document.defaultView
> +                        .QueryInterface(Ci.nsIInterfaceRequestor)
> +                        .getInterface(Ci.nsIDOMWindowUtils);
> +    Assert.ok(!utils.needsFlush(Ci.nsIDOMWindowUtils.FLUSH_STYLE),
> +              "No flushes are required.");

Just add an assertNoFlushPending() helper. You can probably also set up gDOMWindowUtils utils once.

::: dom/base/test/browser_promiseDocumentFlushed.js:113
(Diff revision 5)
> +/**
> + * Tests that dirtying just the style of the window still lets
> + * BTU.promiseDocumentFlushed be resolved quickly.
> + */
> +add_task(async function test_dirtying_only_style_pDF() {
> +  await window.promiseDocumentFlushed(() => {});
> +
> +  dirtyStyle();
> +
> +  let utils = document.defaultView
> +                      .QueryInterface(Ci.nsIInterfaceRequestor)
> +                      .getInterface(Ci.nsIDOMWindowUtils);
> +  Assert.ok(utils.needsFlush(Ci.nsIDOMWindowUtils.FLUSH_STYLE),
> +            "Style needs to be flushed.");
> +
> +  await window.promiseDocumentFlushed(() => {});
> +
> +  Assert.ok(!utils.needsFlush(Ci.nsIDOMWindowUtils.FLUSH_STYLE),
> +            "No flushes are required.");
> +
> +  await cleanTheDOM();
> +});

Looks like this wants to be similar to "test_basic", probably you should just place this next to it, and do the same check inside the callback instead of after the Promise is resolved.

::: dom/base/test/browser_promiseDocumentFlushed.js:153
(Diff revision 5)
> +    // This shouldn't happen.
> +    called = true;

See my previous comment in the bug, we may want to test "return 5" instead and that "await promise" is 5.

::: dom/base/test/browser_promiseDocumentFlushed.js:166
(Diff revision 5)
> +
> +/**
> + * Test that values returned by the callback passed to rDF get
> + * passed down through the Promise resolution.
> + */
> +add_task(async function test_can_get_results_from_callback() {

I would move this right after "test_basic" and add another test that calls element.getBoundingClientRect() inside the callback.

These are the main use cases and having them at the beginning of the file provides a demonstration of how the function should be used.

Can we test that this doesn't trigger a synchronous reflow in the same way we do in the dedicated test in the browser window?
Attachment #8949157 - Flags: review?(paolo.mozmail)
Comment on attachment 8948059 [details]
Bug 1434376 - Switch over all uses of BrowserUtils.promiseLayoutFlushed to window.promiseDocumentFlushed.

https://reviewboard.mozilla.org/r/217694/#review226776

::: browser/components/places/content/browserPlacesViews.js:1605
(Diff revision 8)
> -      // TODO: This should use promiseLayoutFlushed("layout"), so that
> +      // TODO: This should use promiseDocumentFlushed, so that
>        // _updateNodesVisibilityTimerCallback can use getBoundsWithoutFlush. But
>        // for yet unknown reasons doing that causes intermittent failures,
>        // apparently due the flush not happening in a meaningful time.
>        window.requestAnimationFrame(this._updateNodesVisibilityTimerCallback.bind(this));

nit: Maybe file a Places bug and reference it here? We likely know the reasons why this disn't work now.

::: toolkit/modules/BrowserUtils.jsm:390
(Diff revision 8)
>      }
>      let bounds = dwu.getBoundsWithoutFlushing(toolbarItem);
>      if (!bounds.height) {
> -      let document = element.ownerDocument;
> +      await window.promiseDocumentFlushed(() => {
> -      await BrowserUtils.promiseLayoutFlushed(document, "layout", () => {
>          bounds = dwu.getBoundsWithoutFlushing(toolbarItem);

nit: We should probably also have a bug on file for converting getBoundsWithoutFlushing calls inside promiseDocumentFlushed callbacks into standards-compliant getBoundingClientRect calls.

(Might not make sense here because we call the function anyways outside the callback, but may help us remove references to "dwu" in other parts of the code.)
Attachment #8948059 - Flags: review?(paolo.mozmail)
Comment on attachment 8948059 [details]
Bug 1434376 - Switch over all uses of BrowserUtils.promiseLayoutFlushed to window.promiseDocumentFlushed.

https://reviewboard.mozilla.org/r/217694/#review226778
Attachment #8948059 - Flags: review+
(In reply to Boris Zbarsky [:bz] (no decent commit message means r-) from comment #60)
> MOZ_DIAGNOSTIC_ASSERT would probably be better, if we're trying to really
> catch problems with it.  Not sure we want to go as far as MOZ_RELEASE_ASSERT.
> 
> The lack of information about the misbehaving callee might end up annoying.

That's true. I suppose we have to find a balance here - hopefully developers who use this API get the message that they shouldn't dirty the DOM inside of it. I'm inclined to start with a blunter instrument like MOZ_DIAGNOSTIC_ASSERT, and only consider relaxing it if it turns out that there's real developer pain here.

So I'll switch to MOZ_DIAGNOSTIC_ASSERT, and then fallback to causing the Promise to reject if we're in release builds (though it really shouldn't get that far).

> 
> Do you expect that promiseDocumentFlushed call to throw?  Because with this
> patch it will: we toggled style on the iframe, but haven't processed that
> style change yet, so there is no presshell in there yet.  As soon as we
> flush styles on ourselves that presshell in the child will appear....

I see - thanks for the concrete example. In this case, I think it's fine for us to throw - I would expect in this case that we'd need to use:

i.style.display = "";

window.promiseDocumentFlushed(() => {}).then(() => {
  i.contentWindow.promiseDocumentFlushed(() => {
    // Take some measurement
    return measurement;
  }).then((measurement) => {
    // Do something with the measurement.
  });
});

And I think that's okay. I don't expect us to need to use this API for iframe contentWindows, fwiw.

> I guess I just feel like things that sometimes reenter you sync and
> sometimes async are pretty hard to work with.
> 
> At the very least, this behavior needs to be clearly documented.

Agreed. I'll try to make this clearer in my next revision.

> 
> OK, but _reading_ style in general flushes layout on the parent, which can
> trigger resize events and arbitrary script execution.
> Not that the parent flush happens even if our presshell needs no style
> flush, precisely because changes to parent layout can cause our styles to
> change (and cause us to need a style flush).

Hm. Just so we're clear, am I understanding that it's possible for there to be nested PresShells, where the parent PresShell reports that it needs a flush but the child PresShell reports that it does not, and that attempts to query for style or layout information in the child will cause a flush to occur, despite the fact that the child said that it doesn't need to flush?

> 
> OK.  This should be very clearly documented, because it's non-obvious.  ;)

Agreed.


ni?ing for the question about the nested PresShells.
Flags: needinfo?(bzbarsky)
> hopefully developers who use this API get the message that they shouldn't dirty the DOM inside of it.

My point is that most of the time JS authors have no way to know whether they're dirtying the DOM.  It depends on what they do, exactly, and on what JS ran before them.  :(

> In this case, I think it's fine for us to throw

OK.

> it's possible for there to be nested PresShells, where the parent PresShell reports
> that it needs a flush but the child PresShell reports that it does not

Correct.

> and that attempts to query for style or layout information in the child
> will cause a flush to occur, despite the fact that the child said that it doesn't need to flush?

Correct.  More precisely, they will cause a flush on the parent, which can cause the child to need a flush, which will then be performed.

Again, if we're not using this API on subframes, then this is a non-issue.

In fact, we could just have this API always throw for subframes, so we don't have to worry about those cases....
Flags: needinfo?(bzbarsky)
(In reply to Boris Zbarsky [:bz] (no decent commit message means r-) from comment #70)
> In fact, we could just have this API always throw for subframes, so we don't
> have to worry about those cases....

Yeah, let's do that. Thanks!
Let me get one more take on this assertion thing. paolo, do you think crashing with a MOZ_DIAGNOSTIC_ASSERT is the right move here? Part of me really likes the idea of really making it next to impossible to use this thing incorrectly going into the future - but I'm also aware that crashing like this gives zero feedback to a developer beyond the fact that something's gone wrong.

What behaviour do you think is most appropriate here?
Flags: needinfo?(paolo.mozmail)
I wouldn't use MOZ_DIAGNOSTIC_ASSERT here. Especially if we start using promiseDocumentFlushed extensively, it might be possible that some offending code is added that isn't triggered except for some special circumstances, for example having the pointer hovering on a certain part of the UI.

If something like this happens, it would result in apparently random crashes during normal usage of the browser, making it unusable for people who maybe are working on a completely different area, and with little information for a front-end developer to figure out what is going on. It would likely appear as an annoying platform bug.

We can also recover perfectly from any such error, so crashing doesn't seem necessary.

Instead, we could fail regression tests when we detect an issue. We don't have a helper function for this, unfortunately, but creating a rejected Promise without any handler would have this effect. We can have a stack trace and a helpful message linking to the MDN article. We can also log a message unconditionally to the Browser Console during normal execution, so that even if callers swallow the exception we return, we still have a track of what has happened.

Separately, we might consider having a preference that can be set during regression tests, and it would use a mutation observer and/or a reflow observer to check for issues during the execution of the callback. This would be able to catch an additional class of misbehavior like the one I mentioned in comment 64, and provide more details on which mutation caused the issue.

I actually recommend filing a separate bug for adding these diagnostics, and land a first version of the basic functionality that doesn't check for correct callback behavior at all. We don't have this anyways in promiseLayoutFlushed.
Flags: needinfo?(paolo.mozmail)
Blocks: 1440070
Okay, I've tried to scope down what I want to do in this bug so that we can get a minimal behaviour landed, and file follow-ups.

The big changes are:

1. I've changed how mDocumentFlushedResolvers is cycle collected. This is my interpretation of what I should do based on a conversation I had with smaug. It's possible my interpretation is wrong, inefficient, or non-idiomatic.
2. I've made re-entering promiseDocumentFlushed from within a callback throw an exception, and included a test
3. The callbacks are called even when the window is going away
4. I wrote a test for execution order
5. I applied the test feedback I received
6. I fleshed out the Window.webidl documentation a bit more.

Here are things that I have _excluded_ from this patch, and will file follow-ups to do instead:

1. Throwing and/or otherwise spewing out hard-to-miss diagnostic information if the DOM is dirtied inside a callback, and making sure that information is visible and useful to front-end developers
2. Throwing if promiseDocumentFlushed is called upon a subframe
3. Detecting and / or throwing if a flush is triggered from within a callback

Try push is up, and I'll request review if and when it comes back satisfactorily green: https://treeherder.mozilla.org/#/jobs?repo=try&revision=a831128253b9a1a55722b3969c543caee30c4d6d
Comment on attachment 8948059 [details]
Bug 1434376 - Switch over all uses of BrowserUtils.promiseLayoutFlushed to window.promiseDocumentFlushed.

Sorry paolo, I think I accidentally cleared your r+ here. :/ And you might need to (re)-issue it in MozReview if I'm to use autoland.
Attachment #8948059 - Flags: review?(paolo.mozmail)
Attachment #8949157 - Flags: review?(paolo.mozmail)
ni?ing bz for a review on the first patch in this series when you have time.
Flags: needinfo?(bzbarsky)
Comment on attachment 8948059 [details]
Bug 1434376 - Switch over all uses of BrowserUtils.promiseLayoutFlushed to window.promiseDocumentFlushed.

https://reviewboard.mozilla.org/r/217694/#review228270
Attachment #8948059 - Flags: review?(paolo.mozmail) → review+
Comment on attachment 8949157 [details]
Bug 1434376 - Add basic tests for window.promiseDocumentFlushed.

https://reviewboard.mozilla.org/r/218558/#review228272

This looks great. Thanks!

::: dom/base/test/browser_promiseDocumentFlushed.js:65
(Diff revision 6)
> +/**
> + * Tests that if the DOM is dirty, that promiseDocumentFlushed will
> + * resolve once layout and style have been flushed.
> + */
> +add_task(async function test_basic() {
> +  dirtyStyleAndLayout();

nit: assertFlushesRequired() here too

::: dom/base/test/browser_promiseDocumentFlushed.js:148
(Diff revision 6)
> +});
> +
> +/**
> + * Test that if rDF is requested on a window that closes before it
> + * gets a chance to do a refresh driver tick, the rDF Promise is
> + * still resolved, even though the callback is not called.

...and the callback is still called.
Attachment #8949157 - Flags: review?(paolo.mozmail) → review+
Comment on attachment 8948058 [details]
Bug 1434376 - Introduce ChromeOnly window.promiseDocumentFlushed to detect when refresh driver ticks have completed.

https://reviewboard.mozilla.org/r/217692/#review228698

r=me with the issues below fixed.

::: dom/base/nsGlobalWindowInner.cpp:1570
(Diff revision 9)
>    NS_IMPL_CYCLE_COLLECTION_TRAVERSE(mChromeFields.mGroupMessageManagers)
>  
>    NS_IMPL_CYCLE_COLLECTION_TRAVERSE(mPendingPromises)
> +
> +  for (size_t i = 0; i < tmp->mDocumentFlushedResolvers.Length(); i++) {
> +    NS_IMPL_CYCLE_COLLECTION_TRAVERSE(mDocumentFlushedResolvers[i]->mPromise);

You need to traverse mCallback too, no?

I assume this is not traversing mDocumentFlushedResolvers in general because we don't want to add cycle collection to those?  But then why are they refcounted at all?

::: dom/base/nsGlobalWindowInner.cpp:1668
(Diff revision 9)
>      NS_IMPL_CYCLE_COLLECTION_UNLINK(mChromeFields.mGroupMessageManagers)
>    }
>  
>    NS_IMPL_CYCLE_COLLECTION_UNLINK(mPendingPromises)
> +  for (size_t i = 0; i < tmp->mDocumentFlushedResolvers.Length(); i++) {
> +    NS_IMPL_CYCLE_COLLECTION_UNLINK(mDocumentFlushedResolvers[i]->mPromise);

Need to unlink mCallback.  And clear this list after we're done with this, so we don't accidentally walk this list of unlinked things.

::: dom/base/nsGlobalWindowInner.cpp:7476
(Diff revision 9)
> +  RefPtr<Promise> resultPromise = Promise::Create(global, aError);
> +  if (aError.Failed()) {
> +    return nullptr;
> +  }
> +
> +  RefPtr<PromiseDocumentFlushedResolver> flushResolver =

So if the resolver class were not refcounted, you'd use UniquePtr here instead and in the list and I think it should all work.

::: dom/base/nsGlobalWindowInner.cpp:7525
(Diff revision 9)
> +
> +void
> +nsGlobalWindowInner::DidRefresh()
> +{
> +  auto rejectionGuard = MakeScopeExit([&] {
> +    mObservingDidRefresh = false;

Why is the member set before walking the list here, but after other places?
Attachment #8948058 - Flags: review+
Comment on attachment 8948058 [details]
Bug 1434376 - Introduce ChromeOnly window.promiseDocumentFlushed to detect when refresh driver ticks have completed.

https://reviewboard.mozilla.org/r/217692/#review228698

> Why is the member set before walking the list here, but after other places?

Because I'm inconsistent, I guess. :) Fixed.
Comment on attachment 8948058 [details]
Bug 1434376 - Introduce ChromeOnly window.promiseDocumentFlushed to detect when refresh driver ticks have completed.

https://reviewboard.mozilla.org/r/217692/#review228724
We're sorry, Autoland could not rebase your commits for you automatically. Please manually rebase your commits and try again.

hg error in cmd: hg rebase -s a0b2ac0de159913e12089a149febc74151391c1d -d f89b5611098e: rebasing 448626:a0b2ac0de159 "Bug 1434376 - Introduce ChromeOnly window.promiseDocumentFlushed to detect when refresh driver ticks have completed. r=bz"
merging dom/base/nsGlobalWindowInner.cpp
merging dom/base/nsGlobalWindowInner.h
merging dom/webidl/Window.webidl
merging layout/style/nsStyleStruct.h
warning: conflicts while merging dom/base/nsGlobalWindowInner.h! (edit, then use 'hg resolve --mark')
unresolved conflicts (see hg resolve, then hg rebase --continue)
Pushed by mconley@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/4e6a21e50be0
Introduce ChromeOnly window.promiseDocumentFlushed to detect when refresh driver ticks have completed. r=bz
https://hg.mozilla.org/integration/autoland/rev/1b474fb6d798
Add basic tests for window.promiseDocumentFlushed. r=Paolo
https://hg.mozilla.org/integration/autoland/rev/2ed5aefc5bc2
Switch over all uses of BrowserUtils.promiseLayoutFlushed to window.promiseDocumentFlushed. r=Paolo
Thanks.

Ooof, bit by unified builds here; looks like windows.h includes a macro called GetCurrentTime, so if you end up shuffling code such that suddenly windows.h is included where another function called GetCurrentTime is used, you get an explosion like this.

It appears that elsewhere in the tree, we work around this by just undefining the macro in the file where the problem appears. Example: https://searchfox.org/mozilla-central/rev/88e89261ea81860f44a44ed068cf5839aea5def6/dom/animation/Animation.h#30-34

I'll just do the same for SystemTimeConverter.h, I guess.
Flags: needinfo?(mconley)
I'm rather annoyed at myself for not noticing that this was broken in my try pushes. I have a new Windows-only try push with the latest patches here: https://treeherder.mozilla.org/#/jobs?repo=try&revision=67d664f9319ac94cbe8b6dde61d31a912a03b1db
Flags: needinfo?(bzbarsky)
We're sorry, Autoland could not rebase your commits for you automatically. Please manually rebase your commits and try again.

hg error in cmd: hg rebase -s 2ffe618eedcd7127c6611bb6b77833a59828adc0 -d 2da204077c8f: rebasing 448789:2ffe618eedcd "Bug 1434376 - Introduce ChromeOnly window.promiseDocumentFlushed to detect when refresh driver ticks have completed. r=bz"
merging dom/base/nsGlobalWindowInner.cpp
merging dom/base/nsGlobalWindowInner.h
merging dom/webidl/Window.webidl
merging layout/style/nsStyleStruct.h
rebasing 448790:dc14eb131c3d "Bug 1434376 - Add basic tests for window.promiseDocumentFlushed. r=Paolo"
merging dom/base/test/browser.ini
rebasing 448791:8efb346d7f03 "Bug 1434376 - Switch over all uses of BrowserUtils.promiseLayoutFlushed to window.promiseDocumentFlushed. r=Paolo" (tip)
merging browser/base/content/tabbrowser.xml
merging browser/components/customizableui/CustomizeMode.jsm
merging browser/components/customizableui/PanelMultiView.jsm
merging browser/components/places/content/browserPlacesViews.js
merging toolkit/modules/BrowserUtils.jsm
warning: conflicts while merging toolkit/modules/BrowserUtils.jsm! (edit, then use 'hg resolve --mark')
unresolved conflicts (see hg resolve, then hg rebase --continue)
Pushed by mconley@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/b5128504011c
Introduce ChromeOnly window.promiseDocumentFlushed to detect when refresh driver ticks have completed. r=bz
https://hg.mozilla.org/integration/autoland/rev/fccbba9cb959
Add basic tests for window.promiseDocumentFlushed. r=Paolo
https://hg.mozilla.org/integration/autoland/rev/b636251b75ab
Switch over all uses of BrowserUtils.promiseLayoutFlushed to window.promiseDocumentFlushed. r=Paolo
Thanks apavel.

I have a patch in bug 1401072 that I think will fix this. It's on autoland. I've got another try build cooking that's doing a bunch of retriggers to make sure that this is resolved, and then I'll re-land this stuff.
Depends on: 1401072
Flags: needinfo?(mconley)
I'm sufficiently satisfied with my 10 retriggers on each platform that the test failure has gone away thanks to the patch in bug 1401072. I'm re-landing now. Third time's the charm.
Pushed by mconley@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/0961e7b6ec48
Introduce ChromeOnly window.promiseDocumentFlushed to detect when refresh driver ticks have completed. r=bz
https://hg.mozilla.org/integration/autoland/rev/deaf7aae5dbc
Add basic tests for window.promiseDocumentFlushed. r=Paolo
https://hg.mozilla.org/integration/autoland/rev/521ceb5e9dff
Switch over all uses of BrowserUtils.promiseLayoutFlushed to window.promiseDocumentFlushed. r=Paolo
Depends on: 1440973
https://hg.mozilla.org/mozilla-central/rev/0961e7b6ec48
https://hg.mozilla.org/mozilla-central/rev/deaf7aae5dbc
https://hg.mozilla.org/mozilla-central/rev/521ceb5e9dff
Status: ASSIGNED → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 60
Depends on: 1441154
Blocks: 1441168
Blocks: 1441173
Depends on: 1442020
Depends on: 1442218
Depends on: 1450171
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: