Closed Bug 1334035 Opened 7 years ago Closed 5 years ago

Intermittent test_execute_script.py TestExecuteContent.test_window_set_timeout_is_not_cancelled | AssertionError: setTimeout already fired

Categories

(Testing :: Marionette Client and Harness, defect, P5)

Version 3
ARM
Android
defect

Tracking

(firefox53 disabled, firefox54 disabled, firefox55 disabled)

RESOLVED WONTFIX
mozilla55
Tracking Status
firefox53 --- disabled
firefox54 --- disabled
firefox55 --- disabled

People

(Reporter: intermittent-bug-filer, Unassigned)

References

Details

(Keywords: intermittent-failure, mobile, Whiteboard: [stockwell disabled])

This is almost Android-only, starting with the bump to tier 2 for Android Marionette tests.

:ato - Could you have a look?
Blocks: 1332205
Flags: needinfo?(ato)
Some more details:

[task 2017-02-16T03:10:27.673685Z] 03:10:27  WARNING -  TEST-UNEXPECTED-FAIL | test_execute_script.py TestExecuteContent.test_window_set_timeout_is_not_cancelled | AssertionError: setTimeout already fired
[task 2017-02-16T03:10:27.674102Z] 03:10:27     INFO -  Traceback (most recent call last):
[task 2017-02-16T03:10:27.674520Z] 03:10:27     INFO -    File "/home/worker/workspace/build/venv/local/lib/python2.7/site-packages/marionette_harness/marionette_test/testcases.py", line 166, in run
[task 2017-02-16T03:10:27.674758Z] 03:10:27     INFO -      testMethod()
[task 2017-02-16T03:10:27.675721Z] 03:10:27     INFO -    File "/home/worker/workspace/build/tests/marionette/tests/testing/marionette/harness/marionette_harness/tests/unit/test_execute_script.py", line 258, in test_window_set_timeout_is_not_cancelled
[task 2017-02-16T03:10:27.675778Z] 03:10:27     INFO -      "setTimeout already fired")

Related test code:

https://dxr.mozilla.org/mozilla-central/rev/0eef1d5a39366059677c6d7944cfe8a97265a011/testing/marionette/harness/marionette_harness/tests/unit/test_execute_script.py#242-263

It's definitely a race as described in lines 254 and 255. 

Actually I do not understand what we are testing here. Also which call to execute_script canceled it? Given that Andreas wrote this test, he might be the best person to answer that.
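For illustration only, the shape of this kind of race can be reproduced outside Marionette with the standard library. This is a minimal sketch, not the test's actual code: `timer_fired_before_check`, `timer_delay`, and `harness_delay` are hypothetical names, and a `threading.Timer` stands in for the page's `setTimeout`.

```python
import threading
import time


def timer_fired_before_check(timer_delay, harness_delay):
    """Return True if a timer due after timer_delay seconds has already
    fired when a check runs harness_delay seconds after scheduling it."""
    fired = threading.Event()
    # Stand-in for the page scheduling a setTimeout callback.
    timer = threading.Timer(timer_delay, fired.set)
    timer.start()
    try:
        # Stand-in for the time the harness needs before its first
        # assertion; a slow build makes this delay longer.
        time.sleep(harness_delay)
        return fired.is_set()
    finally:
        # No-op if the timer already fired; otherwise stops it.
        timer.cancel()


# Fast environment: the check runs well before the timer is due.
fast = timer_fired_before_check(timer_delay=1.0, harness_delay=0.01)
# Slow environment: the check runs after the timer is due.
slow = timer_fired_before_check(timer_delay=0.05, harness_delay=0.3)
print(fast, slow)
```

When the harness delay exceeds the timer delay, an "assert it has not fired yet" check fails even though nothing was cancelled, which matches the "setTimeout already fired" message in the log above.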
I suspect this is similar to what we saw in https://bugzilla.mozilla.org/show_bug.cgi?id=1306848.  Please see my comment there, https://bugzilla.mozilla.org/show_bug.cgi?id=1306848#c21, for a general explanation of why we might be seeing timing-related issues in slow environments.

Does this occur in Android opt as well as Android debug?  If so, I’m inclined to say we should bump the timeouts up to see if that rectifies the problem.  If it only happens on debug, I’m somewhat inclined to disable the test because we don’t want to negatively impact opt.

(Tangential: WPT follows a slightly different strategy, whereby all timeout durations are multiplied by 4.  After much experimentation it was found that debug builds are on average 3 to 4 times slower than optimised builds.  However, it would be difficult to apply such a fix to Marionette tests because individual tests hard-code a lot of numbers.)
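A minimal sketch of the multiplier idea, assuming the factor is applied only on debug builds. The names `DEBUG_TIMEOUT_MULTIPLIER` and `scaled_timeout` are hypothetical and are not taken from WPT or Marionette:

```python
# Factor quoted in the comment above; the constant name is hypothetical.
DEBUG_TIMEOUT_MULTIPLIER = 4


def scaled_timeout(base_seconds, is_debug, multiplier=DEBUG_TIMEOUT_MULTIPLIER):
    """Return the timeout to use for a given build type.

    Debug builds get base_seconds * multiplier; optimised builds keep
    the base value unchanged.
    """
    if is_debug:
        return base_seconds * multiplier
    return base_seconds


print(scaled_timeout(4, is_debug=True))
print(scaled_timeout(4, is_debug=False))
```

A helper like this only works if tests route every duration through it, which is the difficulty with hard-coded numbers.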
Flags: needinfo?(ato)
(In reply to Andreas Tolfsen from comment #5)
> Does this occur in Android opt as well as Android debug?

Only Debug, but that's (maybe) because the Android Marionette test jobs are only run against Debug builds.

Interestingly, I see none of these failures on mozilla-central or integration branches over the last few days.
In that case, I would argue this is also a test we should consider disabling.

Because of the nature of Marionette, and of functional tests in general, I wonder about the benefits of running the Mn job on Android debug at all.  We run debug jobs on all other platforms which should give us sufficient coverage on the non-optimised front.  Since Android debug is so much slower, even than other debug platforms, I wonder if it is worth the time we are spending on diagnosing and adapting the Mn tests to work on Android debug.
No failures on trunk since Feb 16; I don't know why.

If this comes back on Android, let's disable it immediately, as suggested in comment 8.
Assignee: nobody → gbrown
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → WORKSFORME
(In reply to Andreas Tolfsen ‹:ato› from comment #8)
> In that case, I would argue this is also a test we should consider disabling.
https://treeherder.mozilla.org/logviewer.html#?job_id=86387382&repo=autoland
Status: RESOLVED → REOPENED
Resolution: WORKSFORME → ---
Pushed by gbrown@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/d3fe2f6730ad
Skip Marionette test_window_set_timeout_is_not_cancelled on Android; r=me,test-only
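The patch itself is not reproduced in this bug. As a rough sketch of what a platform-conditional skip looks like, here is an equivalent using only the standard library's `unittest`; the `IS_ANDROID` flag and the skip message are assumptions for illustration, not the harness's real mechanism:

```python
import unittest

# Hypothetical flag; a real harness would detect the platform itself.
IS_ANDROID = True


class TestExecuteContent(unittest.TestCase):
    @unittest.skipIf(IS_ANDROID, "Bug 1334035 - intermittent on Android")
    def test_window_set_timeout_is_not_cancelled(self):
        # Never reached while the skip condition holds.
        self.fail("test body ran although it should have been skipped")


suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestExecuteContent)
result = unittest.TestResult()
suite.run(result)
print(len(result.skipped), result.wasSuccessful())
```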
Given the discussion here, I don't see any need to leave-open, but if anyone disagrees, please mark accordingly, or re-open.
Whiteboard: [stockwell disabled]
https://hg.mozilla.org/mozilla-central/rev/d3fe2f6730ad
Status: REOPENED → RESOLVED
Closed: 7 years ago → 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla55
The test is still disabled.
Assignee: gbrown → nobody
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Priority: -- → P5

Let's try to unskip once Mn jobs run with the faster Android 7.0 x86 emulator.

Depends on: 1500509
OS: Unspecified → Android
No longer depends on: 1500509

We moved away from the ARM platform and this failure no longer occurs on packet.net with Android 7.0 x86_64.

Status: REOPENED → RESOLVED
Closed: 7 years ago → 5 years ago
Hardware: Unspecified → ARM
Resolution: --- → WONTFIX
Product: Testing → Remote Protocol
Moving bug to Testing::Marionette Client and Harness component per bug 1815831.
Component: Marionette → Marionette Client and Harness
Product: Remote Protocol → Testing