Open Bug 1440726 Opened 6 years ago Updated 1 year ago

Starting fixture servers takes a long time on Windows

Categories

(Testing :: Marionette Client and Harness, enhancement, P3)

Version 3
enhancement

Tracking

(Not tracked)

People

(Reporter: whimboo, Unassigned)

Details

(Keywords: perf)

Attachments

(1 file)

I'm using a Windows 7 VM for debugging Marionette issues on Windows. Whereby I use shared folders so I can re-use the Mercurial checkout from my Linux host. 

As what I noticed it takes a fair amount of time (~40s) until Marionette harness actually starts the tests. The hang here is specifically in `get_url` of serve.py. Sending the data is fine, but then it takes a while for the parent to actually receive the content. The following output comes from the recv method from both the parent and child:

********* starting server: http ****
********* started server ****
1519406837.91: got lock
1519406852.92: got lock            <==== 15s delay
1519406852.92: got payload: (('get_url', ('/',)),)
1519406852.92: got payload: ('http://127.0.0.1:60285/',)
1519406852.92: got lock
******** get schema ****

********* starting server: https ****
********* started server ****
1519406853.92: got lock
1519406874.63: got lock            <==== 21s delay
1519406874.65: got payload: (('get_url', ('/',)),)
1519406874.65: got lock
1519406874.65: got payload: ('https://127.0.0.1:60286/',)
********* get schema ****

Given that all the code runs locally inside the VM I don't think the shared folders should be the problem here. Andreas, given that you implemented that, do you have an idea?
Flags: needinfo?(ato)
The first thing that springs to mind is that there could be
something wrong with the way multiprocessing.Lock is used in
testing/marionette/harness/marionette_harness/runner/serve.py’s
BlockingChannel class.  I don’t entirely understand why it would
cause a ~15s delay though, because if there is a lockup the lock
should wait indefinitely…

I also don’t seem to be able to run the file at all, which makes
it hard to look in to:

> % ./serve.py
> Traceback (most recent call last):
>   File "./serve.py", line 21, in <module>
>     from . import httpd
> ValueError: Attempted relative import in non-package
Flags: needinfo?(ato)
Attached file serve debug output
Maybe the import error happens because we call that file directly, so it doesn't recognize that it is part of the package? You can run it by changing the import line to:

> from marionette_harness.runner import httpd

So I added a couple of debug lines and you can see the output here in the attachment. While for MacOS all looks fine, the output on Windows is strange. What really opens a question is the changed channel handle of the client channel. While multiprocess.Pipe() returned us 184 and 204, later 184 and 12 is used. Where does 12 come from? Is the client channel somehow re-created, so that this causes the delay?

Running the code through cProfile I get:

> Windows:  11.727    5.864   11.727    5.864 {method 'recv' of '_multiprocessing.Connection' objects}
> MacOS:    0.026    0.013    0.026    0.013 {method 'recv' of '_multiprocessing.Connection' objects}
Priority: -- → P3
Severity: normal → S3
Product: Testing → Remote Protocol
Component: Marionette → Marionette Client and Harness
Product: Remote Protocol → Testing
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: