1409981 - 2.02 - 2.06% JS (windows7-32) regression on push b63029f0a1c2 (Thu Oct 5 2017)

Reporter

Description

•

7 years ago

We have detected an awsy regression from push:

https://hg.mozilla.org/integration/autoland/pushloghtml?fromchange=c82b0e52307cd9282e7b4deba130b97611423459&tochange=b63029f0a1c2660815f21044b63259b14e113a3e

As author of one of the patches included in that push, we need your help to address this regression.

Regressions:

  2%  JS summary windows7-32 opt      82,093,963.51 -> 83,785,201.08
  2%  JS summary windows7-32 pgo      81,908,758.16 -> 83,565,147.16


You can find links to graphs and comparison views for each of the above tests at: https://treeherder.mozilla.org/perf.html#/alerts?id=10052

On the page above you can see an alert for each affected platform as well as a link to a graph showing the history of scores for this test. There is also a link to a treeherder page showing the jobs in a pushlog format.

To learn more about the regressing test(s), please see: https://developer.mozilla.org/en-US/docs/Mozilla/Performance/AWSY

Ionuț Goldan [:igoldan]

Reporter

Updated

•

7 years ago

Component: Untriaged → Marionette

Product: Firefox → Testing

Ionuț Goldan [:igoldan]

Reporter

Comment 1

•

7 years ago

:whimboo I don't see bug 1375660 modify actual Firefox code. But still changes to Marionette caused these regressions.
Considering these changes to the way we do our tests, I guess we should accept these results as the new baseline for JS summaries. Am I right or were these results unexpected?

Flags: needinfo?(hskupin)

Henrik Skupin [:whimboo][⌚️UTC+1]

Comment 2

•

7 years ago

There is certainly the risk of a possible performance regression. The code as included is used by the Selenium project, and we haven't updated it for a very long time. So it means we pulled all the changes in from the last 4 years to make Marionette more stable in interacting with elements. On the other side we are working on finally getting rid of those atoms. The work therefore is covered on bug 1354203. That's a step by step process, and it might still take a bit for us to get this work done.

Lately we got one atom removed which was done via bug 1275273 by the following two changesets:

https://hg.mozilla.org/integration/mozilla-inbound/rev/70b93f9bfa0c
https://hg.mozilla.org/integration/mozilla-inbound/rev/5a914f741768

Could someone please check if that improved our situation a bit?

On the other side I'm not that sure in how those data is collected and when. Talos tests should not make use of Marionette at all, and as such those atoms should not be loaded, and used. When I check the subtest page from the above Talos alert I see:

https://treeherder.mozilla.org/perf.html#/comparesubtest?originalProject=autoland&originalRevision=022f5922d17668fbf5f8895434ae9b1766d94916&newProject=autoland&newRevision=876b36faeae2bb860bdba58514df27f8e64f37b3&originalSignature=e85733640e1dc465ef7cc48888a17bfa406180b8&newSignature=e85733640e1dc465ef7cc48888a17bfa406180b8&framework=4

Base - 022f5922d176 (autoland) - sshih@mozilla.com - New - 876b36faeae2 (autoland) - rchien@mozilla.com -

So couldn't this also have been caused by the changes on bug 1405624?

Flags: needinfo?(hskupin)

Joel Maher ( :jmaher ) (UTC -8)

Comment 3

•

7 years ago

alerts are almost 100% inaccurate based on scheduling and in-tree oddness- in this case there was a need to backfill 10 revisions as we didn't add data for that- if you look at the graph, you can see where the alert was and the large volume of data between the regression and the alert (that is all backfilled, that is not a bug in the alert code).

As for marionette + talos- you are correct, we have no marionette code in talos, this is the AWSY benchmark and we use marionette there.  In fact we get alerts from 7 different frameworks and sheriff 5 of them, Talos is just one and makes up about half of the total metrics.

Henrik Skupin [:whimboo][⌚️UTC+1]

Comment 4

•

7 years ago

I see. So I had a look at AWSY and as it looks like there is only a single command (`findElement`) which would require the atoms. But what we currently do is to always import atoms.js in different Javascript modules of Marionette. This might indeed be an overhead, which we could reduce by using `defineLazyGetter` for specifically this module.

The bump in memory usage would only be visible if a command is executed, which itself makes use of a Selenium atom. In case of AWSY it would still be the case at least for `driver.js` but not for any `listener.js` framescript which gets injected for new tabs.

Joel, I could check if that works, but could you please tell me how to confirm improvements for the AWYS Talos job?

Also what would be an acceptable increase, and did the removal of `isElementSelect` as asked in my last comment did actually make a difference? We would really like to keep the current atoms in Marionette.

Flags: needinfo?(jmaher)

Joel Maher ( :jmaher ) (UTC -8)

Comment 5

•

7 years ago

in this case you are only seeing win7 awsy regressions, so |./mach try -p win32 -u awsy -t none --rebuild 3|

If we are changing the tool, this isn't a regression end users of Firefox are going to see.  Possibly end users of Marionette will see more resources consumed- but this test is targetting Firefox, not Marionette users.

Flags: needinfo?(jmaher)

Henrik Skupin [:whimboo][⌚️UTC+1]

Comment 6

•

7 years ago

Joel, so you are saying it would be fine to keep the new values as baseline? Just asking because I don't want to spend time on investigation and fixes which actually are not necessary given that this only occurs when Marionette is active.

Flags: needinfo?(jmaher)

Joel Maher ( :jmaher ) (UTC -8)

Comment 7

•

7 years ago

correct, feel free to close this as wontfix as this is a tooling change.

Flags: needinfo?(jmaher)

Andreas Tolfsen ❲:ato❳

Comment 8

•

7 years ago

Unfortunately, instrumenting the browser for automation necessarily
imposes a certain overhead, and I think we need to accept these new
numbers as our new baseline numbers for as long as we rely on the
Selenium atoms.

There is https://bugzil.la/1354203 to ween us off atoms, and this
will hopefully in the long term remove the performance problem.

Henrik Skupin [:whimboo][⌚️UTC+1]

Comment 9

•

7 years ago

Great to see agreement here. Going to WONTFIX this bug then. Thanks everyone involved.

Status: NEW → RESOLVED

Closed: 7 years ago

Resolution: --- → WONTFIX

BMO Automation

Updated

•

1 year ago

Product: Testing → Remote Protocol

Bugzilla

Quick Search

2.02 - 2.06% JS (windows7-32) regression on push b63029f0a1c2 (Thu Oct 5 2017)

Categories

(Remote Protocol :: Marionette, defect)

Tracking

(Not tracked)

People

(Reporter: igoldan, Unassigned)

References

Details

(Keywords: perf, regression)

Crash Data

Security

(public)

User Story

Description

Updated

Comment 1

Comment 2

Comment 3

Comment 4

Comment 5

Comment 6

Comment 7

Comment 8

Comment 9

Updated