Closed Bug 1330375 Opened 7 years ago Closed 6 years ago

interleaved innerText calls and display:none sets trigger significantly more reflow overhead than chrome

Categories

(Core :: Layout: Tables, defect)

50 Branch
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla63
Performance Impact high
Tracking Status
firefox63 --- fixed

People

(Reporter: guignot.patrick, Assigned: u480271)

References

(Blocks 1 open bug)

Details

(Keywords: perf, Whiteboard: [wptsync upstream])

Attachments

(5 files)

User Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:50.0) Gecko/20100101 Firefox/50.0
Build ID: 20161209095719

Steps to reproduce:

Go to http://patrickguignot.free.fr/sf/nouvelles_sf.html

Search something in the top right search bar.


Actual results:

For example when I search for "Egan, Greg" it gave me the result page after the following delays : 

Chrome 55.0.2883.87 (on Ubuntu 16.04) : 4 seconds
Firefox 50.1.0 (on Ubuntu 16.04) : 22 seconds


Expected results:

Same performance for Chrome and Firefox.
gecko profiler : https://clptr.io/2j9ajmT
Status: UNCONFIRMED → NEW
Component: Untriaged → DOM: Core & HTML
Ever confirmed: true
Keywords: perf
Product: Firefox → Core
We're spending lots of time reflowing, according to the profile. I think this falls under Layout.
Component: DOM: Core & HTML → Layout
Component: Layout → Layout: Tables
Attached file Nouvelles SF.htm
Regarding Node.textContent, Firefox is x6 slower than chrome.
That is very different testcase. The issue this bug is about is layout flush inside innerText.
textContent doesn't deal with layout.
The new bug should be filed on XPConnect.
Or possibly DOM.
Whiteboard: [qf]
Whiteboard: [qf] → [qf:p1]
Whiteboard: [qf:p1] → [qf:p2]
Whiteboard: [qf:p2] → [qf:p1]
Whiteboard: [qf:p1] → [qf:i60][qf:p1]
Whiteboard: [qf:i60][qf:p1] → [qf:f60][qf:p1]
Whiteboard: [qf:f60][qf:p1] → [qf:f61][qf:p1]
This is still pretty slow.  It takes 10 seconds on my new MacBook Pro, almost a quarter of which is under nsTableRowFrame::GetRowIndex.

https://perfht.ml/2JEhb5w
The doSearch() function in the script takes a bit over a second because it loops over all table rows, and for each row gets its innerText (which flushes layout) then toggles a class name on the element.
Actually basically all of the time is under doSearch() (not sure why in the profile there are two paths to it).  Since we flush layout after inspecting each row, we end up reflowing the entire table many times.

Maybe we can make innerText more clever about when it needs to do its flush?
But those classes only change the background color, why are we reflowing?
If the item doesn't match the search string, then sets class="hidden", which makes the row display:none.
Ah, thought they were only the entry0 / entry1 classes.
Whiteboard: [qf:f61][qf:p1] → [qf:f64][qf:p1]
Whiteboard: [qf:f64][qf:p1] → [qf:p1:f64]
Chrome's performance on this testcase is an existence proof that we can do better - by avoiding the flushes, speeding them up, or both.
Summary: Performance problem compared to Chrome → innerleaved innerText calls and display:none sets trigger 5x more reflow overhead than chrome
Summary: innerleaved innerText calls and display:none sets trigger 5x more reflow overhead than chrome → interleaved innerText calls and display:none sets trigger 5x more reflow overhead than chrome
My local testing shows ~4s in chrome vs ~7s in firefox since comment 0, so we've improved things a fair bit. But there's clearly room for more.
Summary: interleaved innerText calls and display:none sets trigger 5x more reflow overhead than chrome → interleaved innerText calls and display:none sets trigger significantly more reflow overhead than chrome
Here's a profile of searching abcdefg in a recent nightly: https://perfht.ml/2MrXuiN

Note that there's a separate-looking section at the end, but I'm pretty sure that's just the result of us kicking over to the JIT and that messing up the stack traces in the profiler.
Also note that, while proper reflow is the lion's share of the work (57%), we spend 31% of the time computing reflow inputs in BasicTableLayoutStrategy::ComputeColumnIntrinsicISizes. The flame graph makes this easier to visualize. This could theoretically account for the delta with chrome.
Reading the spec, it is unclear to me why we need a layout flush there. A frame construction flush should be enough for innerText. The text processing steps might need factoring some code out from current reflow path, but that's probably not that bad.

The current code seems to rely on line boxes, but as far as I can see, the spec doesn't need that. All the transformation should be derivable from DOM and style without flushing layout.
Does anything in https://github.com/rocallahan/innerText-spec hint at why we might need to flush layout?
Hmmm... :first-line can definitely be a valid reason why we need layout... I wonder if we can ignore that. But even if we can, it would still be tricky if we collect text based on frame. Maybe we should make that based on DOM instead.
Do we also need to flush layout to get the results of text-transform?
I don't think so. I think that code should be pretty independent, but I cannot say for certain without figuring out the actual code. Even if it's in reflow code, we should be able to factor it out. I see no reason it needs to depend on geometry, so we can well handle it independently with just style available. That may lead to unnecessary work if we end up needing to recalculate it in reflow but I don't think that's something hot enough to worry about.
So more fundamentally, white space processing is something that happens during reflow and which must be reflected by innerText.  For example

  <div style="white-space:normal"> A     B C </div>

should give "A B C".

https://html.spec.whatwg.org/#inner-text-collection-steps iterates over the DOM tree, but is defined in terms of the CSS boxes for those elements.  For each node we call into nsIFrame::GetRenderedText, which relies on reflow having been done.

I was thinking that we need a kind of limited reflow the element we call innerText on, since we shouldn't care about changes in size coming from the parent.  That seems kind of tricky.

David had a good suggestion though: we can flush style, and then see whether we need to flush layout afterwards.  We can skip flushing layout if we can detect that the element or its descendants are not dirty.  And the way we can do that is by checking whether the element has either of the two dirty bits (NS_FRAME_IS_DIRTY or NS_FRAME_HAS_DIRTY_DESCENDANTS) or if any ancestor has NS_FRAME_IS_DIRTY.  We need to check for NS_FRAME_IS_DIRTY on ancestors since that is something that implies NS_FRAME_IS_DIRTY on all descendants.

One complication is that we might have a reflow root somewhere inside the subtree we're calling innerText on, which means that the element we're calling innerText on doesn't have NS_FRAME_HAS_DIRTY_DESCENDANTS.  This might not matter currently since we don't have any reflow roots that have text inside them that can affect innerText.  But for completeness we probably want to check whether any of the reflow roots has the target element as an ancestor.
Assignee: nobody → dglastonbury
(In reply to Cameron McCormack (:heycam) (busy until July 12) from comment #23)
> One complication is that we might have a reflow root somewhere inside the
> subtree we're calling innerText on, which means that the element we're
> calling innerText on doesn't have NS_FRAME_HAS_DIRTY_DESCENDANTS.  This
> might not matter currently since we don't have any reflow roots that have
> text inside them that can affect innerText.  But for completeness we
> probably want to check whether any of the reflow roots has the target
> element as an ancestor.

In particular, the "reflow roots" that we care about here are the ones in PresShell::mDirtyRoots.  (We don't need to care about all frames with NS_FRAME_REFLOW_ROOT set, just the dirty ones, and they're in that array.)
Yeah, that seems like a nice trick!

Please add a perf-reftest for this to make sure we don't regress it.
With the attached changes, searching for "2" goes from "hanging" firefox and causing a slow script warning (~25s) to completing 
"instantly" (225ms).
Testing with http://w3c-test.org/html/dom/elements/the-innertext-idl-attribute/getter.html results in the same 4 failures as before.
Comment on attachment 8992239 [details]
Bug 1330375 - P2: Reduce layout flushes in nsGenericHTMLElement::GetInnerText.

https://reviewboard.mozilla.org/r/257128/#review263960

Looks good.  Mostly comments about comments, plus one correctness issue.

::: dom/html/nsGenericHTMLElement.cpp:2913
(Diff revision 1)
> -  if (!GetPrimaryFrame(FlushType::Layout)) {
> +  // So more fundamentally, white space processing is something that happens
> +  // during reflow and which must be reflected by innerText.  For example:
> +  //
> +  //     <div style="white-space:normal"> A     B C </div>
> +  //
> +  // Should give "A B C".

Let's word this comment to make it clear that this is one example of why innerText depends on layout.  (My comment in Bugzilla starting "So more fundamentally" was in reference to other less common cases in previous comments.)

How about this instead:

// innerText depends on layout. For example, white space processing is
// something that happens during reflow and which must be reflected by
// innerText.  So for
//
//   <div style="white-space:normal"> A     B C </div>
//
// innerText should give "A B C".

::: dom/html/nsGenericHTMLElement.cpp:2920
(Diff revision 1)
> +  // https://html.spec.whatwg.org/#inner-text-collection-steps iterates over the
> +  // DOM tree, but is defined in terms of the CSS boxes for those elements.  For
> +  // each node we call into nsIFrame::GetRenderedText, which relies on reflow
> +  // having been done.

This probably doesn't need to be mentioned.

::: dom/html/nsGenericHTMLElement.cpp:2925
(Diff revision 1)
> +  // The approach take here to avoid the expense of reflow is to flush style and
> +  // then see whether it's necessary to flush layout afterwards. Flushing layout
> +  // can be skipped if we can detect that the element or its descendants are not
> +  // dirty.

s/take here/taken here/

::: dom/html/nsGenericHTMLElement.cpp:2930
(Diff revision 1)
> +  // The way we do that is by checking whether the element has either of the two
> +  // dirty bits (NS_FRAME_IS_DIRTY or NS_FRAME_HAS_DIRTY_DESCENDANTS) or if any
> +  // ancestor has NS_FRAME_IS_DIRTY.  We need to check for NS_FRAME_IS_DIRTY on
> +  // ancestors since that is something that implies NS_FRAME_IS_DIRTY on all
> +  // descendants.

Can you move this comment to just above the loop that does this work, and replace the comment you have there?  They're basically saying the same thing.

::: dom/html/nsGenericHTMLElement.cpp:2960
(Diff revision 1)
> +  }
> +
> +  // `NS_FRAME_HAS_DIRTY_CHILDREN` signals the sub-tree starting at `frame` is
> +  // dirty and needs reflowing.  Otherwise walk in-flow parents looking for
> +  // `NS_FRAME_IS_DIRTY`.
> +  bool dirty = frame && frame->HasAnyStateBits(NS_FRAME_HAS_DIRTY_CHILDREN);

We also need to check for whether this frame has NS_FRAME_IS_DIRTY, since it's possible to have that set but not NS_FRAME_HAS_DIRTY_CHILDREN.

It might be the fact that the only style changes we can make that set NS_FRAME_IS_DIRTY but not NS_FRAME_HAS_DIRTY_CHILDREN won't have an effect on innerText, but I think we should check it to be sure.

(One example that should set dirty bits like this is a change to say margin-top.)
Attachment #8992239 - Flags: review?(cam) → review-
Comment on attachment 8992240 [details]
Bug 1330375 - P3: Add WPT test to ensure innerText works with dynamic changes.

https://reviewboard.mozilla.org/r/257130/#review263964

::: testing/web-platform/tests/html/dom/elements/the-innertext-idl-attribute/dynamic-getter.html:47
(Diff revision 1)
> +
> +testText("<div id='target'>abc<div id='child'>def", "abc",
> +         "display: none applied to child element", function() {
> +           setStyle("child", "display", "none");
> +         });
> +testText("<div id='parent'><div id='target'>abc", "abc",

You should make this test case have a different result if parent weren't display none, e.g. have a leading space in the text child of <div id='target'>.
Attachment #8992240 - Flags: review?(cam) → review+
Comment on attachment 8992241 [details]
Bug 1330375 - P1: Helper for checking for ancestor of dirty reflow root.

https://reviewboard.mozilla.org/r/257132/#review263966

::: testing/talos/talos/tests/perf-reftest/innertext-1-ref.html:17
(Diff revision 1)
> +
> +  perf_start();
> +  let rows = document.getElementsByTagName("tr");
> +  for (r = 0; r < rows.length; r++) {
> +    let row = rows[r];
> +    s = row.innerText;

`let s = row.innerText;` or even just `row.innerText;`.

::: testing/talos/talos/tests/perf-reftest/util.js:40
(Diff revision 1)
>  }
>  
> +function build_table(rows, cols, text) {
> +  rows = rows || 1;
> +  cols = cols || 1;
> +  text = text || null;

Probably doesn't matter too much this would end up using the string "null" by default, since Array.fill() will stringify its argument.  Maybe use "" instead?
Attachment #8992241 - Flags: review?(cam) → review+
Hi Dan, can you please provide a link to push on try with the talos perf-reftest suite running on all platforms? This is just so we can ensure that this test is green on all platforms, and also we can see if the new overall duration of the perf-reftest suite is acceptable. Thank you!
Flags: needinfo?(dglastonbury)
Comment on attachment 8992241 [details]
Bug 1330375 - P1: Helper for checking for ancestor of dirty reflow root.

https://reviewboard.mozilla.org/r/257132/#review264100

Have you tried running the talos per-reftest-e10s suite with the new test added?

I applied all the patches locally on OSX, did a new build and ran the perf-reftest-e10s suite (i.e. ./mach talos-test --suite perf-reftest-e10s) and innertext-1.html times out. Even when you try to load innertext1.html directly in the browser (outside of talos) it doesn't load.

Pleae revisit the new test, thanks!
Attachment #8992241 - Flags: review?(rwood) → review-
(In reply to Robert Wood [:rwood] from comment #34)
> Hi Dan, can you please provide a link to push on try with the talos
> perf-reftest suite running on all platforms? This is just so we can ensure
> that this test is green on all platforms, and also we can see if the new
> overall duration of the perf-reftest suite is acceptable. Thank you!

I've made three pushes for all platforms while testing (oldest to latest): 
* https://treeherder.mozilla.org/#/jobs?repo=try&revision=7e24e3c1930b55f6e380c28aa7fe174679a7e131
* https://treeherder.mozilla.org/#/jobs?repo=try&revision=94b2eaed9fa0710e2b1ceb4322cc3c1d62bd6027
* https://treeherder.mozilla.org/#/jobs?repo=try&revision=1efec40eda7b36500124ed9c934f005b90b9e2df

Eg: linux64
* https://treeherder.mozilla.org/perf.html#/comparesubtest?originalProject=mozilla-central&newProject=try&newRevision=7e24e3c1930b55f6e380c28aa7fe174679a7e131&originalSignature=ba00a06e08d0481c8b387e5454d141fdf8764631&newSignature=ba00a06e08d0481c8b387e5454d141fdf8764631&framework=1&selectedTimeRange=172800
* https://treeherder.mozilla.org/perf.html#/comparesubtest?originalProject=mozilla-central&newProject=try&newRevision=94b2eaed9fa0710e2b1ceb4322cc3c1d62bd6027&originalSignature=ba00a06e08d0481c8b387e5454d141fdf8764631&newSignature=ba00a06e08d0481c8b387e5454d141fdf8764631&framework=1&selectedTimeRange=172800
* https://treeherder.mozilla.org/perf.html#/comparesubtest?originalProject=mozilla-central&newProject=try&newRevision=1efec40eda7b36500124ed9c934f005b90b9e2df&originalSignature=ba00a06e08d0481c8b387e5454d141fdf8764631&newSignature=ba00a06e08d0481c8b387e5454d141fdf8764631&framework=1&selectedTimeRange=172800

Is this the information you're looking for?
Flags: needinfo?(dglastonbury) → needinfo?(rwood)
(In reply to Robert Wood [:rwood] from comment #35)
> I applied all the patches locally on OSX, did a new build and ran the
> perf-reftest-e10s suite (i.e. ./mach talos-test --suite perf-reftest-e10s)
> and innertext-1.html times out. Even when you try to load innertext1.html
> directly in the browser (outside of talos) it doesn't load.

Curious. I've tested locally on linux. I'll try on OSX.
Comment on attachment 8992239 [details]
Bug 1330375 - P2: Reduce layout flushes in nsGenericHTMLElement::GetInnerText.

https://reviewboard.mozilla.org/r/257128/#review263960

> We also need to check for whether this frame has NS_FRAME_IS_DIRTY, since it's possible to have that set but not NS_FRAME_HAS_DIRTY_CHILDREN.
> 
> It might be the fact that the only style changes we can make that set NS_FRAME_IS_DIRTY but not NS_FRAME_HAS_DIRTY_CHILDREN won't have an effect on innerText, but I think we should check it to be sure.
> 
> (One example that should set dirty bits like this is a change to say margin-top.)

NS_FRAME_IS_DIRTY is checked on the first pass of the loop:

```C
for (nsIFrame* parent = frame;
   ...
```

Perhaps `parent` is misleading and could have a different name?
Comment on attachment 8992239 [details]
Bug 1330375 - P2: Reduce layout flushes in nsGenericHTMLElement::GetInnerText.

https://reviewboard.mozilla.org/r/257128/#review263960

> NS_FRAME_IS_DIRTY is checked on the first pass of the loop:
> 
> ```C
> for (nsIFrame* parent = frame;
>    ...
> ```
> 
> Perhaps `parent` is misleading and could have a different name?

I see!  I glossed over that.  If you want to avoid `parent` it would be fine just to just modify `frame` (and make it a while loop).
Comment on attachment 8992239 [details]
Bug 1330375 - P2: Reduce layout flushes in nsGenericHTMLElement::GetInnerText.

https://reviewboard.mozilla.org/r/257128/#review264246
Attachment #8992239 - Flags: review- → review+
Comment on attachment 8992239 [details]
Bug 1330375 - P2: Reduce layout flushes in nsGenericHTMLElement::GetInnerText.

https://reviewboard.mozilla.org/r/257128/#review264310

::: dom/html/nsGenericHTMLElement.cpp:2965
(Diff revision 1)
> +  bool dirty = frame && frame->HasAnyStateBits(NS_FRAME_HAS_DIRTY_CHILDREN);
> +  for (nsIFrame* parent = frame;
> +       parent && !dirty;
> +       parent = parent->GetInFlowParent())
> +  {
> +    dirty |= parent->HasAnyStateBits(NS_FRAME_IS_DIRTY);

We probably discussed this last week, but why don't we need to check the children?

If one of the children is a reflow root, will we have dirtied the parent chain?

We'd still need to go down the tree, right?

In any case it's probably worth a comment here.
I think it may be cheaper to traverse the pres shell's list of dirty reflow roots than to traverse the subtree.  (Particularly since it's likely to be nearly empty most of the time.)
(In reply to Dan Glastonbury (:kamidphish) | needinfo me from comment #36)

> Is this the information you're looking for?

Yes thank you it was this one that I was looking for [1]. The perf-reftest suite with the new inntertext-1 test is all green on all platforms, awesome, and the overall job duration is fine (8-12 min). I don't know why it timed out when I ran it locally though, I'll rebuild and import your patches and try it again.

[1] https://treeherder.mozilla.org/#/jobs?repo=try&revision=1efec40eda7b36500124ed9c934f005b90b9e2df
Flags: needinfo?(rwood)
Applied your patches locally on OSX on latest inbound build. Ran the talos perf-reftest suite (./mach talos-test --suite perf-reftest-e10s) and innertext-1.html times out after 60 seconds on each page cycle:

15:50:38     INFO -  PID 10337 | __WARNTimeout (1/3) exceeded on http://localhost:51865/tests/perf-reftest/innertext-1.html__WARN
15:51:08     INFO -  PID 10337 | __WARNTimeout (2/3) exceeded on http://localhost:51865/tests/perf-reftest/innertext-1.html__WARN
15:51:12     INFO -  PID 10337 | Cycle 1(-1): loaded http://localhost:51865/tests/perf-reftest/innertext-1.html (next: http://localhost:51865/tests/perf-reftest/innertext-1-ref.html)
15:51:43     INFO -  PID 10337 | __FAILTimeout in perf-reftest__FAIL
15:51:43     INFO -  PID 10337 | __FAILTimeout (3/3) exceeded on http://localhost:51865/tests/perf-reftest/innertext-1.html__FAIL

Ultimately resulting in:
16:02:46     INFO -  TEST-UNEXPECTED-ERROR | perf_reftest | Timeout in perf-reftest
...
16:02:46     INFO -  TalosError: Timeout in perf-reftest
16:02:46     INFO -  TEST-INFO took 1145485ms
16:02:46     INFO -  SUITE-END | took 1145s

I don't know why it works on your try push but not locally, unless your try push is not the latest patch perhaps? Regardless this cannot land until the perf-reftest passes locally as well. Thanks in advance for having a look at this.

Note: innertext-1-ref.html works fine locally, but it's innertext-1.html that times out.
(In reply to Robert Wood [:rwood] from comment #44)
> Note: innertext-1-ref.html works fine locally, but it's innertext-1.html
> that times out.

I tested locally on my macOS machine. Debug build timed out for both tests. In an optimized build, innertest-1.html takes ~1400ms to run. (./mach talos-test --suite perf-reftest-e10s is also fine and doesn't time out).

If innertext-1-ref.html works but innertext-1.html times out, that sounds like the optimization hasn't been applied in you local build.
Attachment #8992239 - Flags: review+ → review?(cam)
Comment on attachment 8992239 [details]
Bug 1330375 - P2: Reduce layout flushes in nsGenericHTMLElement::GetInnerText.

https://reviewboard.mozilla.org/r/257128/#review264644

::: dom/html/nsGenericHTMLElement.cpp:2950
(Diff revision 2)
> +  }
> +
> +  // Check for dirty reflow roots in the subtree from targetFrame; this requires
> +  // a reflow flush.
> -    nsIPresShell* presShell = nsContentUtils::GetPresShellForContent(this);
> +  nsIPresShell* presShell = nsContentUtils::GetPresShellForContent(this);
> +  bool dirty = presShell && presShell->FrameIsAncestorOfDirtyRoot(frame);

What guarantees that frame is non-null here? (hint: I think nothing, and this will crash on display: none subtrees wouldn't it?).

You can also use frame->PresShell() instead, if you're only using it when there's actually a frame.

::: dom/html/nsGenericHTMLElement.cpp:2971
(Diff revision 2)
> +  }
> +
> +  if (!GetPrimaryFrame()) {
>      // NOTE(emilio): We need to check the presshell is initialized in order to
>      // ensure the document is styled.
>      if (!presShell || !presShell->DidInitialize() ||

I think the shell could have actually gone away here now that you flush after getting it, so you should make it store a strong reference or get it again here...

And my note comment is no longer relevant since the implementation of IsOrHasAncestorWithDisplayNone changed.

::: layout/base/PresShell.cpp:4565
(Diff revision 2)
>    nsAutoCauseReflowNotifier reflowNotifier(this);
>    mFrameConstructor->NotifyCounterStylesAreDirty();
>  }
>  
> +bool
> +PresShell::FrameIsAncestorOfDirtyRoot(nsIFrame* aFrame) {

nit: Brace to the next line.

::: layout/base/nsIPresShell.h:508
(Diff revision 2)
>  
>    virtual void NotifyCounterStylesAreDirty() = 0;
>  
>    /**
> +   * Deterimine if aFrame is an ancestor of any dirty roots.
> +   */

nit: The comment doesn't tell much, I'd remove it.

::: layout/base/nsIPresShell.h:509
(Diff revision 2)
>    virtual void NotifyCounterStylesAreDirty() = 0;
>  
>    /**
> +   * Deterimine if aFrame is an ancestor of any dirty roots.
> +   */
> +  virtual bool FrameIsAncestorOfDirtyRoot(nsIFrame* aFrame) = 0;

Instead of making this virtual can we move mDirtyRoots to `nsIPresShell`?

There's no good reason for stuff to be virtual in PresShell other than "I haven't got to devirtualize everything".

Also should probably be const.
Attachment #8992239 - Flags: review?(emilio) → review-
(In reply to Dan Glastonbury (:kamidphish) | needinfo me from comment #45)
> (In reply to Robert Wood [:rwood] from comment #44)
> > Note: innertext-1-ref.html works fine locally, but it's innertext-1.html
> > that times out.
> 
> I tested locally on my macOS machine. Debug build timed out for both tests.
> In an optimized build, innertest-1.html takes ~1400ms to run. (./mach
> talos-test --suite perf-reftest-e10s is also fine and doesn't time out).
> 
> If innertext-1-ref.html works but innertext-1.html times out, that sounds
> like the optimization hasn't been applied in you local build.

:jmaher, please see ^. A new perf-reftest which fails unless the local build is an opt build. I know in production we only run on opt builds - but do we also want to require anyone running talos locally (at least perf-reftest) to have to use an opt build?
Flags: needinfo?(jmaher)
I think requiring opt builds to run talos or raptor is reasonable.  I am not sure if all the other performance tests work in debug mode, I suspect many do.  

I would like to ensure this is tested on opt/pgo on all platforms retriggered many times to avoid timeouts.
Flags: needinfo?(jmaher)
Comment on attachment 8992241 [details]
Bug 1330375 - P1: Helper for checking for ancestor of dirty reflow root.

https://reviewboard.mozilla.org/r/257132/#review264848

Ok, thanks guys. I applied the patches again locally (OSX) and did an opt build (instead of debug) and this time the talos perf-reftest suite passed.

Note: I updated the perf-reftest documentation on the wiki to indicate that this test now requires you to have an optimal build, otherwise it will timeout (https://wiki.mozilla.org/Performance_sheriffing/Talos/Tests#perf-reftest)

r+ on condition of multiple green try runs of perf-reftest on all platforms on both opt and pgo (as :jmaher noted in Comment 51) - with your final patches. Thanks!
Attachment #8992241 - Flags: review?(rwood) → review+
Thanks all. I agree that debug build run times for perf tests don't matter, since they might be arbitrarily long depending on debug-only assertions and whatnot.

Dan, per discussion in yesterday's meeting, it sounds like we might be able to do this as a simple mochitest using reflow statistics. Want to give that a try? It's certainly cheaper long-term than a perf-reftest, since it doesn't need to be run on physical hardware.
Flags: needinfo?(dglastonbury)
And to be clear: What I care about is that we add regression tests when we fix performance bugs. If we can do that with mochitests, that's nice because it costs us less on CI. But for situations where there's no easy mochitest solution, we also need something like perf-reftest, and for that solution to be scalable. I'm chatting with jmaher and bz tomorrow about that very topic.
Thanks to everyone for their feedback on the perf-reftest issue.

Bobby, I'll take a look at implementing a mochitest using the reflow statistics.  If we can reduce the number of iterations that would also help speed up the debug build case.
Flags: needinfo?(dglastonbury)
Comment on attachment 8992239 [details]
Bug 1330375 - P2: Reduce layout flushes in nsGenericHTMLElement::GetInnerText.

https://reviewboard.mozilla.org/r/257128/#review264876

::: dom/html/nsGenericHTMLElement.cpp:2950
(Diff revision 2)
> +  }
> +
> +  // Check for dirty reflow roots in the subtree from targetFrame; this requires
> +  // a reflow flush.
> -    nsIPresShell* presShell = nsContentUtils::GetPresShellForContent(this);
> +  nsIPresShell* presShell = nsContentUtils::GetPresShellForContent(this);
> +  bool dirty = presShell && presShell->FrameIsAncestorOfDirtyRoot(frame);

You're right. Oops.

::: dom/html/nsGenericHTMLElement.cpp:2971
(Diff revision 2)
> +  }
> +
> +  if (!GetPrimaryFrame()) {
>      // NOTE(emilio): We need to check the presshell is initialized in order to
>      // ensure the document is styled.
>      if (!presShell || !presShell->DidInitialize() ||

Does that mean that: 

```C
if (!presShell || !presShell->DidInitialize() ||
    IsOrHasAncestorWithDisplayNone(this)) {
```

can be replaced by:

```C
if (IsOrHasAncestorWithDisplayNone(this)) {
```
Comment on attachment 8992239 [details]
Bug 1330375 - P2: Reduce layout flushes in nsGenericHTMLElement::GetInnerText.

https://reviewboard.mozilla.org/r/257128/#review264962

::: dom/html/nsGenericHTMLElement.cpp:2971
(Diff revision 2)
> +  }
> +
> +  if (!GetPrimaryFrame()) {
>      // NOTE(emilio): We need to check the presshell is initialized in order to
>      // ensure the document is styled.
>      if (!presShell || !presShell->DidInitialize() ||

Yes, it can go away. Any of those imply IsOrHasAncestorWithDisplayNone will return true.
Comment on attachment 8993554 [details]
Bug 1330375 - P4: GetInnerText perf regression test.

https://reviewboard.mozilla.org/r/258246/#review265302

r=me assuming this fails without your patch.
Attachment #8993554 - Flags: review?(cam) → review+
Comment on attachment 8992241 [details]
Bug 1330375 - P1: Helper for checking for ancestor of dirty reflow root.

https://reviewboard.mozilla.org/r/257132/#review265398

Thanks!
Attachment #8992241 - Flags: review?(emilio) → review+
Comment on attachment 8992239 [details]
Bug 1330375 - P2: Reduce layout flushes in nsGenericHTMLElement::GetInnerText.

https://reviewboard.mozilla.org/r/257128/#review265400

Looks fine, thank you!

::: dom/html/nsGenericHTMLElement.cpp:2967
(Diff revision 4)
> +  // Flush layout if we determined a reflow is required.
> +  if (dirty && doc) {
> +    doc->FlushPendingNotifications(FlushType::Layout);
> +  }
> +
> +  if (!GetPrimaryFrame() && IsOrHasAncestorWithDisplayNone(this)) {

Given `IsOrHasAncestorWithDisplayNone` is fast now, we can even remove `!GetPrimaryFrame()`, which is intended just as an optimization.
Attachment #8992239 - Flags: review?(emilio) → review+
Comment on attachment 8993554 [details]
Bug 1330375 - P4: GetInnerText perf regression test.

https://reviewboard.mozilla.org/r/258246/#review265302

It certainly does. In local testing, there are 351 reflows during the iterations of the table. With the optimization, there are none.
Pushed by dglastonbury@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/b4ae7a07d29a
P1: Helper for checking for ancestor of dirty reflow root. r=emilio,heycam,rwood
https://hg.mozilla.org/integration/autoland/rev/28c07aae7d24
P2: Reduce layout flushes in nsGenericHTMLElement::GetInnerText. r=emilio,heycam
https://hg.mozilla.org/integration/autoland/rev/28200567c44f
P3: Add WPT test to ensure innerText works with dynamic changes. r=heycam
https://hg.mozilla.org/integration/autoland/rev/36a6d366b9c1
P4: GetInnerText perf regression test. r=heycam
Created web-platform-tests PR https://github.com/web-platform-tests/wpt/pull/12176 for changes under testing/web-platform/tests
Whiteboard: [qf:p1:f64] → [qf:p1:f64][wptsync upstream]
Status: NEW → ASSIGNED
Upstream web-platform-tests status checks passed, PR will merge once commit reaches central.
Performance Impact: --- → P1
Whiteboard: [qf:p1:f64][wptsync upstream] → [wptsync upstream]
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: