Closed Bug 1444831 Opened 6 years ago Closed 6 years ago

Intermittent Wd Automation Error: mozprocess timed out after 1000 seconds running ['python', '-u', '/builds/worker/workspace/build/tests/web-platform/runtests.py', '--log-raw=-'] due to excessive logging of hyper crate

Categories

(Testing :: web-platform-tests, defect, P5)

Version 3
defect

Tracking

(firefox59 wontfix, firefox60 fixed, firefox61 fixed)

RESOLVED FIXED
mozilla61
Tracking Status
firefox59 --- wontfix
firefox60 --- fixed
firefox61 --- fixed

People

(Reporter: whimboo, Assigned: ato)

References

Details

(Keywords: intermittent-failure)

Bug 1407383 fixed a hang in wdspec jobs for an issue in mozlog. Here we seem to hang when starting the runner:

https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1407383&startday=2018-03-07&endday=2018-03-11&tree=all

[task 2018-03-09T22:02:26.685Z] 22:02:26     INFO - PID 2998 | Headers { Connection: close
[task 2018-03-09T22:02:26.687Z] 22:02:26     INFO - PID 2998 | , Content-Type: application/json; charset=utf-8
[task 2018-03-09T22:02:26.687Z] 22:02:26     INFO - PID 2998 | , Cache-Control: no-cache
[task 2018-03-09T22:02:26.688Z] 22:02:26     INFO - PID 2998 | , Content-Length: 13
[task 2018-03-09T22:02:26.688Z] 22:02:26     INFO - PID 2998 | , Date: Fri, 09 Mar 2018 22:02:16 GMT
[task 2018-03-09T22:02:26.688Z] 22:02:26     INFO - PID 2998 | , }]
[task 2018-03-09T22:02:26.689Z] 22:02:26     INFO - PID 2998 | 1520632936362	hyper::server::response	DEBUG	write 13 bytes
[task 2018-03-09T22:02:26.689Z] 22:02:26     INFO - PID 2998 | 1520632936362	hyper::server::response	TRACE	ending
[task 2018-03-09T22:02:26.691Z] 22:02:26     INFO - Starting runner
[task 2018-03-09T22:19:36.599Z] 22:19:36     INFO - Automation Error: mozprocess timed out after 1000 seconds running ['/builds/worker/workspace/build/venv/bin

Interestingly this is visible since we see the excessive logging of hyper which I filed as bug 1444431. Maybe that is related.
The actual failure count is low yet, so moving to P5.
Priority: P1 → P5
The patch on bug 1444431 fixed it in central, but it was missed to also uplift to beta. I requested it, and we should see a drop of the failure soon.
(In reply to OrangeFactor Robot from comment #3)
> * mozilla-release: 1

This is actually a hang which was fixed by bug 1407383 but which didn't make it to release.
I wonder if too much log output actually causes this hang. It reminds me to a warning as put into the Python subprocess `Popen.wait()` documentation:

> This will deadlock when using stdout=PIPE and/or stderr=PIPE and the child process generates enough output to a pipe such that it blocks waiting for the OS pipe buffer to accept more data. Use communicate() to avoid that. 

James and Andreas, how do we communicate with the subprocesses?
Flags: needinfo?(james)
Flags: needinfo?(ato)
For the WebDriver servers, wptrunner uses mozprocess.ProcessHandler’s
processOutputLine:

    https://searchfox.org/mozilla-central/rev/028cd8342899357d80fafba2d528a0fd5819e316/testing/web-platform/tests/tools/wptrunner/wptrunner/webdriver_server.py#55-61

It does look like mozprocesses uses subprocess.PIPE internally.
Flags: needinfo?(ato)
Blocks: 1450938
Since bug 1444431 has been uplifted to mozilla-beta those hangs are gone.
Assignee: nobody → ato
Status: NEW → RESOLVED
Closed: 6 years ago
Flags: needinfo?(james)
Resolution: --- → FIXED
Summary: Intermittent Wd Automation Error: mozprocess timed out after 1000 seconds running ['/builds/worker/workspace/build/venv/bin/python', '-u', '/builds/worker/workspace/build/tests/web-platform/runtests.py', '--log-raw=-'] due to hang in "start runner" → Intermittent Wd Automation Error: mozprocess timed out after 1000 seconds running ['python', '-u', '/builds/worker/workspace/build/tests/web-platform/runtests.py', '--log-raw=-'] due to excessive logging of hyper crate
Target Milestone: --- → mozilla61
You need to log in before you can comment on or make changes to this bug.