Closed Bug 1392907 Opened 7 years ago Closed 7 years ago

Intermittent /webdriver/tests/minimize_window.py | minimize_window.py::test_minimize_twice_is_idempotent

Categories

(Remote Protocol :: Marionette, defect, P5)

Version 3
defect

Tracking

(Not tracked)

RESOLVED WORKSFORME

People

(Reporter: intermittent-bug-filer, Unassigned)

References

Details

(Keywords: intermittent-failure, Whiteboard: [stockwell unknown])

This is a new test which got added by Andreas via bug 1391691.

[task 2017-08-23T03:29:48.047594Z] 03:29:48     INFO - TEST-UNEXPECTED-FAIL | /webdriver/tests/minimize_window.py | minimize_window.py::test_minimize_twice_is_idempotent 
[task 2017-08-23T03:29:48.048331Z] 03:29:48     INFO - session = <webdriver.client.Session object at 0x7f76f7b40190>
[task 2017-08-23T03:29:48.049002Z] 03:29:48     INFO - 
[task 2017-08-23T03:29:48.049598Z] 03:29:48     INFO -     def test_minimize_twice_is_idempotent(session):
[task 2017-08-23T03:29:48.049940Z] 03:29:48     INFO -         assert session.execute_script("return document.hidden") is False
[task 2017-08-23T03:29:48.050269Z] 03:29:48     INFO -     
[task 2017-08-23T03:29:48.050574Z] 03:29:48     INFO -         first_response = minimize(session)
[task 2017-08-23T03:29:48.050895Z] 03:29:48     INFO -         assert_success(first_response)
[task 2017-08-23T03:29:48.051662Z] 03:29:48     INFO - >       assert session.execute_script("return document.hidden") is True
[task 2017-08-23T03:29:48.052336Z] 03:29:48     INFO - E       assert <bound method Session.execute_script of <webdriver.client.Session object at 0x7f76f7b40190>>('return document.hidden') is True
[task 2017-08-23T03:29:48.053037Z] 03:29:48     INFO - E        +  where <bound method Session.execute_script of <webdriver.client.Session object at 0x7f76f7b40190>> = <webdriver.client.Session object at 0x7f76f7b40190>.execute_script

Andreas, is there anything obvious? Otherwise we will have to wait for more data.
Flags: needinfo?(ato)
I don’t see anything obvious here.
Flags: needinfo?(ato)
the failure rate has increased this week, so far in the last 7 days we have 22 instances of this:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1392907

the failures are on linux64 debug.  Something to keep an eye on!
Andreas, you were working on those changes for Marionette. Can you please have a look at those failures?
Flags: needinfo?(ato)
I’m unable to reproduce this locally (even after ~200 reruns) so
I can only speculate towards the cause of this.  We saw originally
from OrangeFactor that the instability first manifested itself
on the autoland and mozilla-inbound trees, then later spread to
mozilla-central.

The WebDriver:MinimizeWindow command was added fairly recently and
hasn’t been touched by many commits.  When I correlate the dates
given by OrangeFactor it isn’t far fetched to think that it was
https://bugzil.la/1391691 that introduced the instability.

In particular it adds this test:

> def test_minimize_twice_is_idempotent(session):
>     assert session.execute_script("return document.hidden") is False
> 
>     first_response = minimize(session)
>     assert_success(first_response)
>     assert session.execute_script("return document.hidden") is True
> 
>     second_response = minimize(session)
>     assert_success(second_response)
>     assert session.execute_script("return document.hidden") is True

It is known that DOM properties, such as document.hidden, are not
propagated synchronously.  Earlier tests relied on the window
rect’s "state" field which we are trying to move away from because
it didn’t make it into level 1 of the WebDriver specification.
Relying on DOM properties has historically shown to be less reliable
than using internal XPCOM state such as ChromeWindow.windowState.

But it also changes the code to bring the window out
of the minimised state, meaning it no longer relies on
WebDriver:MinimizeWindow but on WebDriver:SetWindowRect to restore
the window:

> -def _restore_normal_window_state(session):
> +def _restore_window_state(session):
>      """If the window is maximized, minimized, or fullscreened it will
>      be returned to normal state.
>  
>      """
> -    state = session.window.state
> -    if state == "maximized":
> -        session.window.maximize()
> -    elif state == "minimized":
> -        session.window.minimize()
> -    elif state == "fullscreen":
> -        session.window.fullscreen()
> +    if session.window.state in ("maximized", "minimized", "fullscreen"):
> +        session.window.size = (800, 600)

However, essentially, the WebDriver:SetWindowRect command uses the
same technique as WebDriver:MinimizeWindow did before:

>   // Restore window and wait for the window state to change.
>   async function restoreWindow() {
>     return new Promise(resolve => {
>       win.addEventListener("sizemodechange", resolve, {once: true});
>       win.restore();
>     });
>   }

If we look at the failures on the autoland tree specifically, we see
a slight decline in intermittents from 31 August, but I’m not sure
if this is incidental or what might indeed have caused that trend:

	https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1392907&tree=autoland&startday=2017-08-21&endday=2017-09-05

When you look at all the trees combined, there is an entry for OS X
10.10 opt which is very surprising, considering that we don’t run
this test on that platform.  Overall, however, it looks like the trend is
that this hits debug and ccov most frequently.

It is hard to say whether it is the sizemodechange event that fires
too soon or the document.hidden property getting populated too
late.  We can theorise that sizemodechange fires at a high rate and
that we break off the promise too early, but I think a more likely
explanation is that window state operations in GTK are expressly
asynchronous.

The sizemodechange event is fired on a callback from GTK, but
because the X11 protocol is asynchronous, GTK has no way to reliably
telling _when_ the operation has completed.  Because Gecko uses this
callback for firing sizemodechange it is not advisable to rely on
this event to tell whether the window has successfully completed its
transition.

In WebDriver:SetWindowRect we try to combat this with
idleDispatchToMainThread and requestAnimationFrame, which only
through empirical experiments have proven to delay the return to an
adequate satisfaction.

>   // Throttle resize event by forcing the event queue to flush and delay
>   // until the main thread is idle.
>   function optimisedResize(resolve) {
>     return () => Services.tm.idleDispatchToMainThread(() => {
>       win.requestAnimationFrame(resolve);
>     });
>   }

Perhaps we should employ the same trick here and see if that
improves the situation.  That is, unfortunately, the best advice I
can give considering I’m unable to reproduce it.
Flags: needinfo?(ato)
Depends on: 1396618
The throttle patch didn't help here at all. It's still a high failure rate after it got landed.
There are no more failures since late Sep 13th. Maybe a patch landed by that time which fixed it?
Andreas, which bug could have fixed that failure?
Flags: needinfo?(ato)
(In reply to Henrik Skupin (:whimboo) from comment #14)
> Andreas, which bug could have fixed that failure?

It’s been a while.  If it’s important I can go digging.
Flags: needinfo?(ato)
I don't think it's necessary anymore. Lets mark the bug as WFM then.
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → WORKSFORME
Moving bug to Remote Protocol::Marionette component per bug 1815831.
Component: geckodriver → Marionette
Product: Testing → Remote Protocol
You need to log in before you can comment on or make changes to this bug.