Closed Bug 1112773 Opened 10 years ago Closed 9 years ago

fix mozharness script to use only the cloned talos repo, not the cached one for running talos

Categories

(Testing :: Talos, defect)

Type: defect
Priority: Not set
Severity: normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: jmaher, Assigned: jlund)

References

Details

(Whiteboard: [good next bug][lang=python])

Attachments

(6 files, 1 obsolete file)

In bug 1037619, we showed that the mozharness script for Talos has issues, specifically when it comes to defining the tests.

We build our configuration (parse command-line options, etc.) using the cached copy of talos on the box, but the cloned copy is then used to actually run the tests. This results in conflicts whenever the cached copy has not been updated.
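A quick way to see which copy wins is to check where Python resolves the talos package from. A minimal diagnostic sketch (not part of the mozharness script, and assuming some copy of talos is importable in the environment being checked):

> # Hedged diagnostic: print which talos copy Python actually picks up.
> # If __file__ points into the virtualenv's site-packages, the cached install
> # is shadowing the fresh clone.
> import talos  # assumes a talos package is importable here
> print(talos.__file__)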

We can fix this, and it should be fairly easy. Let's look in talos.py:
http://hg.mozilla.org/build/mozharness/file/fa4104a8ca30/mozharness/mozilla/testing/talos.py

I believe our core problem is here:
http://hg.mozilla.org/build/mozharness/file/fa4104a8ca30/mozharness/mozilla/testing/talos.py#l542
jmaher, can you explain this in more detail?
Flags: needinfo?(jmaher)
Simarpreet Singh, once you fix bug 1078619, let me know if you would be interested in looking into this issue. As always, no rush.
ok, here is how I believe we can solve this:
1) remove talos from the virtualenv_modules list: http://hg.mozilla.org/build/mozharness/file/tip/mozharness/mozilla/testing/talos.py#l163
2) while removing this code, consider keeping pyyaml: http://hg.mozilla.org/build/mozharness/file/tip/mozharness/mozilla/testing/talos.py#l549

there might be some path-related issues after doing this, but I suspect it might just be this easy :) (a rough sketch of the change follows below)
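A minimal sketch of what steps 1) and 2) amount to, assuming the spot in talos.py where virtualenv_modules is defaulted looks roughly like the line linked above; the module names other than pyyaml are illustrative placeholders, not the real contents of the file:

> # Hedged sketch, not the real mozharness code: 'talos' is no longer in the
> # list, while pyyaml stays (run_tests.py reads talos.yml), so the cloned repo
> # becomes the only talos code that ever runs.
> def default_virtualenv_modules(kwargs):
>     kwargs.setdefault('config', {})
>     kwargs['config'].setdefault(
>         'virtualenv_modules',
>         ['pyyaml', 'mozinstall', 'mozdevice'],  # illustrative; no 'talos'
>     )
>     return kwargs
>
> print(default_virtualenv_modules({}))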
Flags: needinfo?(jmaher)
Okay, here are some changes that I made while looking at this and bug 1078619. Keep in mind that bug 1078619 is still a WIP, so we may need more than what is here.

Question for you, :jmaher: is bug 1037619 only reproduced when we try to run the tests with --activeTests tp5o_scroll:glterrain, or is it reproducible in general? For some reason I wasn't able to run the glterrain test locally [1], so I tried with --activeTests tresize:tcanvasmark instead.


[1]: http://pastebin.mozilla.org/8092800
Attachment #8539750 - Flags: feedback?(jmaher)
Comment on attachment 8539750 [details] [diff] [review]
[WIP] Removing talos logic from virtualenv_modules.

Review of attachment 8539750 [details] [diff] [review]:
-----------------------------------------------------------------

For feedback this is great stuff.  The reason glterrain is failing is that we added it to PerfConfigurator.py (when defining a new test and config), and it is then referenced from the cached version of talos even though we clone the valid sources.  I do not think there is a way to reproduce this locally, unless you removed a test from talos, ran mozharness+talos, added the test back, ran it again, and verified that it referenced the test properly.  This is only an issue on specific versions of the software (really just a few machines).

All in all this set of changes makes it more robust and less confusing.

Let's clean up the developer configs (or do that in a different bug), and when you have a new patch, I will be happy to put it up on the try server.

::: configs/developer_config.py
@@ +19,5 @@
>          ("http://tooltool.pvt.build.mozilla.org/build", "https://secure.pub.build.mozilla.org/tooltool/pvt/build")
>      ],
> +    "python_webserver": True,
> +    "webroot": '%s/build/talos-data' % os.getcwd(),
> +    "virtualenv_path": '%s/build/venv' % os.getcwd(),

this might be making assumptions; also, the webroot should be in the current directory when developing locally.  When I run talos locally, it is from the talos/ subdir, so os.getcwd() will not work for the virtualenv path.
Attachment #8539750 - Flags: feedback?(jmaher) → feedback+
Comment on attachment 8539750 [details] [diff] [review]
[WIP] Removing talos logic from virtualenv_modules.

Review of attachment 8539750 [details] [diff] [review]:
-----------------------------------------------------------------

::: configs/developer_config.py
@@ +19,5 @@
>          ("http://tooltool.pvt.build.mozilla.org/build", "https://secure.pub.build.mozilla.org/tooltool/pvt/build")
>      ],
> +    "python_webserver": True,
> +    "webroot": '%s/build/talos-data' % os.getcwd(),
> +    "virtualenv_path": '%s/build/venv' % os.getcwd(),

Another approach for this is to not have webroot and virtualenv_path be configurable parameters.

On a patch I'm working on I do something similar for .avds_dir; I remove it from the official config and then calculate it inside of query_abs_dir().
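A small sketch of that pattern, using a mozharness-style query_abs_dirs(); the class, directory names, and keys here are made up for illustration rather than taken from the real configs:

> # Hedged sketch: derive webroot and the virtualenv path instead of exposing
> # them as config options.
> import os
>
> class TalosLikeScript(object):
>     def __init__(self, base_work_dir):
>         self.base_work_dir = base_work_dir
>         self._abs_dirs = None
>
>     def query_abs_dirs(self):
>         # compute once, cache, and never read these from the config
>         if self._abs_dirs is None:
>             build_dir = os.path.join(self.base_work_dir, 'build')
>             self._abs_dirs = {
>                 'abs_work_dir': build_dir,
>                 'abs_webroot_dir': os.path.join(build_dir, 'talos-data'),
>                 'abs_virtualenv_dir': os.path.join(build_dir, 'venv'),
>             }
>         return self._abs_dirs
>
> print(TalosLikeScript(os.getcwd()).query_abs_dirs()['abs_webroot_dir'])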
Assignee: nobody → s244sing
I'm not too sure how to proceed with this. On one hand, the current code that was checked in (committed and running in production; see [1]) works fine and is good for now, but in the long run (as :jmaher mentioned) we should be looking at getting rid of the cached talos install and picking talos up from a local repo checkout. Added to that is :armenzg's feedback from the last comment: remove the overrides altogether.

Should we at this point file a new bug for the two separate tasks highlighted by the feedback points above, or is there a better way? Looking for some feedback on how to tackle this from here.

[1]: https://bugzilla.mozilla.org/show_bug.cgi?id=1078619
Flags: needinfo?(jmaher)
Flags: needinfo?(armenzg)
For the record, the changes on developer_config.py have already landed.
The only line that has not landed is:
+    "webroot": '%s/build/talos-data' % os.getcwd(),
as it is not necessary IIUC.

Simarpreet, could you please attach an up-to-date patch?
Could you please also check if the webroot parameter is needed?
I don't think we will need to separate into two bugs.

jmaher, IIUC I think we're ready to test his patch on try.
Flags: needinfo?(armenzg)
ok, let's get an updated patch with the latest bits from mozharness, and I would be glad to test this on the try server!
Flags: needinfo?(jmaher)
(In reply to Armen Zambrano - Automation & Tools Engineer (:armenzg) from comment #8)
> For the record, the changes on developer_config.py have already landed.
> The only line that has not landed is:
> +    "webroot": '%s/build/talos-data' % os.getcwd(),
> as it is not necessary IIUC.

Hmm, I think I'm missing something here. I pulled in the latest changes from upstream and I don't see this line in the file. Here's the last changeset relevant to this file: [1]
> 
> Simarpreet, could you please attach an up-to-date patch?
> Could you please also check if the webroot parameter is needed?
> I don't think we will need to separate into two bugs.
> 
> jmaher, IIUC I think we're ready to test his patch on try.


-------

More details below:

So from the logs I see something interesting. First I see the commit where the webroot line was added:

>changeset:   3444:7e44c212e585
>parent:      3442:69d77fccd27a
>user:        Simarpreet Singh <s244sing@uwaterloo.ca>
>date:        Sun Nov 30 17:25:22 2014 -0500
>summary:     [PATCH][v1]: Adding paths for webroot and venv on Linux.

Then I see another commit in the series which seems to be the current state of developer_config.py.

>changeset:   3491:7204ff2ff48a
>user:        Simarpreet Singh <s244sing@uwaterloo.ca>
>date:        Mon Dec 22 14:46:56 2014 -0500
>summary:     Bug 1078619 - Allow to run talos jobs as a developer. r=armenzg


The git mirror hosted on github also reveals the same for developer_config.py [2]

Am I missing something here?


[1]: https://hg.mozilla.org/build/mozharness/rev/7204ff2ff48a
[2]: https://github.com/mozilla/build-mozharness/blob/master/configs/developer_config.py
Flags: needinfo?(armenzg)
(In reply to Simarpreet Singh from comment #10)
> (In reply to Armen Zambrano - Automation & Tools Engineer (:armenzg) from
> comment #8)
> > For the record, the changes on developer_config.py have already landed.
> > The only line that has not landed is:
> > +    "webroot": '%s/build/talos-data' % os.getcwd(),
> > as it is not necessary IIUC.
> 
> Hmm I think I'm missing something here, I pulled in the latest changes from
> upstream and I don't see this line to be in the file. Here's the last
> changeset relevant to this file: [1] 

Webroot was not needed after all.

Good hunting though! I'm glad to see that you're comfortable hunting things down through VCS!
Flags: needinfo?(armenzg)
Thanks for checking on it.

Once you attach the next patch without the developer_config.py changes, jmaher can give you a hand testing this in the production systems.
Flags: needinfo?(s244sing)
Okay, so I've tested my changes on top of the developer_config.py changes as they currently stand today.

I tested with the following testline:

> $ python scripts/talos_script.py --suite chromez --add-option --branchName=Firefox-Non-PGO 
> --system-bits 64 --cfg talos/linux_config.py --download-symbols ondemand --use-talos-json --blob-upload-branch Firefox-Non-PGO 
> --cfg developer_config.py --installer-url http://ftp.mozilla.org/pub/mozilla.org/firefox/nightly/latest-trunk/firefox-37.0a1.en-US.linux-x86_64.tar.bz2 
> --binary-path ./talos/bin/talos

Some things to note here (this might be common knowledge but I was not aware of them and discovered them as part of making this work):

1) This patch is built on top of the developer_config.py changes that were pushed in earlier.
2) The --branch_name param from earlier testing seems to have changed. With the current talos binary, --branchName is the correct option to use.
3) I had to manually specify --binary-path to point at the talos bin.


I have a few follow up questions on this:

1) I also manually tested the talos binary by invoking the following testline:

> $ ./talos/bin/talos --noisy --debug -v --executablePath ./build/application/firefox/firefox 
> --title simar-talos-test --symbolsPath https://ftp-ssl.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/mozilla-inbound-linux64/1421110424/firefox-38.0a1.en-US.linux-x86_64.crashreporter-symbols.zip  
> --activeTests tresize:tcanvasmark  --output talos.yml --branchName Mozilla-Inbound-Non-PGO 
> --datazilla-url https://datazilla.mozilla.org/talos --authfile /home/simar/mozilla/oauth.txt --develop

My current blocker is trying to run the glterrain test, as previously pointed out in bug 1037619. I don't seem to have the right tests in my talos clone (specifically talos/page_load_test/tp5n/).



2) If we are giving the developer more freedom and ease of use, should we also clone additional tests like glterrain for them as part of the clone-talos task from within the talos setup scripts?
Attachment #8539750 - Attachment is obsolete: true
Flags: needinfo?(s244sing) → needinfo?(jmaher)
Attachment #8547953 - Flags: review?(jmaher)
overall this is great stuff.

Regarding tp5n, I think we should put tp5n.zip on tooltool; then the script can download it for local developers as well as production.  Quite possibly ignore glterrain for now, as it is run with tp5o_scroll (a more recent addition), until we get support for tp5n.zip on tooltool and easier developer runs.

Ideally I would prefer to avoid the talos binary, and just use 'python run_tests.py ...'.  That means that we would require these steps to run talos:
1) cd talos/talos
2) python perfconfigurator.py ... --output talos.yml
3) python run_tests.py talos.yml
Flags: needinfo?(jmaher)
Comment on attachment 8547953 [details] [diff] [review]
[PATCH]: Remove talos as a virtualenv module.

Review of attachment 8547953 [details] [diff] [review]:
-----------------------------------------------------------------

this looks great.  I should test this out on try server.
Attachment #8547953 - Flags: review?(jmaher) → review+
some errors on osx:
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/jmaher@mozilla.com-998e1d8017ba/try-macosx64/try_snowleopard_test-chromez-snow-bm106-tests1-macosx-build1.txt.gz

09:25:15    ERROR - caught OS error 2: No such file or directory while running ['/builds/slave/talos-slave/test/build/venv/bin/talos', '--noisy', '--debug', '-v', '--executablePath', '/builds/slave/talos-slave/test/build/application/Nightly.app/Contents/MacOS/firefox', '--title', 't-snow-r4-0015', '--symbolsPath', 'https://ftp-ssl.mozilla.org/pub/mozilla.org/firefox/try-builds/jmaher@mozilla.com-998e1d8017ba/try-macosx64/firefox-38.0a1.en-US.mac.crashreporter-symbols.zip', '--activeTests', u'tresize:tcanvasmark', '--results_url', 'http://graphs.mozilla.org/server/collect.cgi', '--output', 'talos.yml', '--branchName', 'Try', '--datazilla-url', 'https://datazilla.mozilla.org/talos', '--authfile', '/builds/slave/talos-slave/test/oauth.txt', '--webServer', 'localhost']


here is a similar command on linux:
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/jmaher@mozilla.com-998e1d8017ba/try-linux/try_ubuntu32_hw_test-chromez-bm103-tests1-linux-build1341.txt.gz

08:22:17     INFO - Running command: ['/home/cltbld/talos-slave/test/build/venv/bin/talos', '--noisy', '--debug', '-v', '--executablePath', '/builds/slave/talos-slave/test/build/application/firefox/firefox', '--title', 'talos-linux32-ix-042', '--symbolsPath', 'https://ftp-ssl.mozilla.org/pub/mozilla.org/firefox/try-builds/jmaher@mozilla.com-998e1d8017ba/try-linux/firefox-38.0a1.en-US.linux-i686.crashreporter-symbols.zip', '--activeTests', 'tresize:tcanvasmark', '--results_url', 'http://graphs.mozilla.org/server/collect.cgi', '--output', 'talos.yml', '--branchName', 'Try-Non-PGO', '--datazilla-url', 'https://datazilla.mozilla.org/talos', '--authfile', '/builds/slave/talos-slave/test/oauth.txt', '--webServer', 'localhost'] in /builds/slave/talos-slave/test/build


Honestly, I am not sure why this doesn't work. I suspect it is because we are referencing '/builds/slave/talos-slave/test/build/venv/bin/talos' and that isn't available for some reason on OS X? Maybe it is cached on Linux.
(In reply to Joel Maher (:jmaher) from comment #14)
> overall this is great stuff.
> 
> regarding tp5n, I think we should put tp5n.zip on tooltool, then the script
> can download it for local developers as well as production.  Quite possibly
> ignore glterrain for now as it is run with tp5o_scroll (a more recent
> addition) until we get support for tp5n.zip on tooltool and easier developer
> runnings.
> 

Linking bug 1121049 here for the record.

> Ideally I would prefer to avoid the talos binary, and just use 'python
> run_tests.py ...'.  That means that we would require these steps to run
> talos:
> 1) cd talos/talos
> 2) python perfconfigurator.py ... --output talos.yml
> 3) python run_tests.py talos.yml

Cool, this seems to have succeeded in my local testing. Here are the steps I followed:

1) Installed relevant dependencies: mozversion, datazilla, mozcrash, mozhttpd
2) Ran the following testline:
> ./talos/bin/PerfConfigurator -v --executablePath ./build/application/firefox/firefox --title simar-talos-test --symbolsPath https://ftp-ssl.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/mozilla-inbound-linux64/1421110424/firefox-38.0a1.en-US.linux-x86_64.crashreporter-symbols.zip  
> --activeTests tresize:tcanvasmark  --output talos.yml --branchName Mozilla-Inbound-Non-PGO 
> --datazilla-url https://datazilla.mozilla.org/talos --authfile /home/simar/mozilla/oauth.txt --develop

3) talos/talos/run_tests.py ./talos.yml

Firefox launched and tests were run.
Great, if you can get these steps to work, then we can try them out in a patch that makes the same changes in the mozharness script :)
This patch fixes the paths and command calls to invoke PerfConfigurator directly rather than going through talos.py.
Attachment #8550969 - Flags: review?(jmaher)
Comment on attachment 8550969 [details] [diff] [review]
[PATCH]: Fixing paths to PerfConfigurator for talos tests.

Review of attachment 8550969 [details] [diff] [review]:
-----------------------------------------------------------------

just one question

::: mozharness/mozilla/testing/talos.py
@@ +596,5 @@
> +                                            env=env)
> +        # Call run_tests on generated talos.yml
> +        run_tests = "../talos/talos/run_tests.py"
> +        options = "./talos.yml"
> +        command = [run_tests] + [options]

could you make this:
command = [run_tests, '--noisy', '--debug'] + [options]
Attachment #8550969 - Flags: review?(jmaher) → review+
Added back the verbosity options; the log was glad to print more.
Attachment #8552864 - Flags: review?(jmaher)
Comment on attachment 8552864 [details] [diff] [review]
[PATCH]: Adding back verbosity options to run_tests.py

Review of attachment 8552864 [details] [diff] [review]:
-----------------------------------------------------------------

awesome, are we all ready to land this?
Attachment #8552864 - Flags: review?(jmaher) → review+
(In reply to Simarpreet Singh from comment #23)
> try commit:
> https://treeherder.mozilla.org/#/jobs?repo=try&revision=636ffa2e3392

Sorry the last one didn't include the new changes. This is the updated try server commit: https://treeherder.mozilla.org/#/jobs?repo=try&revision=0537444086b6

One thing to note is that there are some failures, like [1] for instance, that are not related to the actual bug in question but rather are being tracked under this bug: https://bugzilla.mozilla.org/show_bug.cgi?id=1125176
simar, jmaher: how goes things here? Resolving this would be a huge win as we are running into another issue where we have lost a bunch more slaves due to failed talos updates. If releng can assist, let me know. Thanks for work so far on this.
Flags: needinfo?(s244sing)
Flags: needinfo?(jmaher)
Blocks: 1141416
(In reply to Jordan Lund (:jlund) from comment #26)
> simar, jmaher: how goes things here? Resolving this would be a huge win as
> we are running into another issue where we have lost a bunch more slaves due
> to failed talos updates. If releng can assist, let me know. Thanks for work
> so far on this.

IIRC, my last talks with jmaher were along the lines of landing this on ash. There was a time when some other stuff was broken, so jmaher concluded that it'd be better to delay this landing by a few weeks.

We can give it a shot now if things are stable and see how the last patch fares.
Flags: needinfo?(s244sing)
I am not sure what else is blocking this.  now that we have mozharness pinning, this could realistically land on the trunk branch.  Should we give it one more try run?  Simar, what else do you see as being a problem?
Flags: needinfo?(jmaher)
(In reply to Joel Maher (:jmaher) from comment #28)
> I am not sure what else is blocking this.  now that we have mozharness
> pinning, this could realistically land on the trunk branch.  Should we give
> it one more try run?  Simar, what else do you see as being a problem?

It shouldn't have any, but I'll still verify with another try run. There were some intermittent issues along the lines of this bug [1], but I haven't seen them again.

[1]: https://bugzilla.mozilla.org/show_bug.cgi?id=1125176
let me know if you have issues with a try run.
(In reply to Simarpreet Singh from comment #29)
> (In reply to Joel Maher (:jmaher) from comment #28)
> > I am not sure what else is blocking this.  now that we have mozharness
> > pinning, this could realistically land on the trunk branch.  Should we give
> > it one more try run?  Simar, what else do you see as being a problem?
> 
> It shouldn't have any, but I'll still try this with another try run. There
> were some intermittent issues that were on the lines of this bug [1] but I
> haven't seen this again.
> 
> [1]: https://bugzilla.mozilla.org/show_bug.cgi?id=1125176

\o/ /me bows in gratitude
hey simar, how did the try run go? This would be *really* cool if you fixed this.
Flags: needinfo?(s244sing)
(In reply to Jordan Lund (:jlund) from comment #32)
> hey simar, how did the try run go? This would be *really* cool if you fixed
> this.

I see a bunch of revisions associated with Simar's address, the most recent dated Mar 23:

https://treeherder.mozilla.org/#/jobs?repo=try&author=s244sing@uwaterloo.ca

None of them seem to have passed though.

Simar: do you have time to continue working on this?
(In reply to Chris Cooper [:coop] from comment #33)
> (In reply to Jordan Lund (:jlund) from comment #32)
> > hey simar, how did the try run go? This would be *really* cool if you fixed
> > this.
> 
> I see a bunch of revisions associated with Simar's address, the most recent
> dated Mar 23:
> 
> https://treeherder.mozilla.org/#/jobs?repo=try&author=s244sing@uwaterloo.ca
> 
> None of them seem to have passed though.
> 
> Simar: do you have time to continue working on this?

I'm honestly lost as to why the previous builds have been failing. Here is my synopsis of the changes I've made and the ones that I'm sure had an impact:

1) I had a build running successfully with some of my changes in this try commit: [1]

2) After some discussion with :jmaher (read above) we decided to refactor some of the code (wrt the talos module).

3) The only changes I added after 19dd55f057b5 [1] are the following:
ffa48b8c5b1c SS Adding back verbosity options to run_tests.py
357457e2655e SS Fixing paths to PerfConfigurator for talos tests.
8cbcfdb492d0 SS talos: Remove talos as a module.


I'm willing to look into this further if someone could let me know what exactly happened between the two try commits. The commits that I made (listed above) are minor code refactors that I don't think caused the breakage.

[1]:  https://treeherder.mozilla.org/#/jobs?repo=try&revision=19dd55f057b5
Flags: needinfo?(s244sing)
> I'm willing to look into this further if someone could let me know what
> exactly happened between the two try-commits. The commits that I made
> (listed above) are minor code refactors that I think didn't impact the
> breakage.
> 
> [1]:  https://treeherder.mozilla.org/#/jobs?repo=try&revision=19dd55f057b5

hmm, I'd imagine it would be pretty hard to diagnose changes between try commits. If you don't think your minor changes could have caused the bustage, without looking into this too much, have you tried starting from a clean slate (without your patches) and see if 1 build + 1 test job runs green? Or tried again by pulling latest changes and applying your patches on top?

you can use the rdiff hg extension and diff your local repo against a remote url (like canonical m-c). That way you can see all the changes you've made.
(In reply to Jordan Lund (:jlund) [PTO until May 13th] from comment #35) 
> hmm, I'd imagine it would be pretty hard to diagnose changes between try
> commits. If you don't think your minor changes could have caused the
> bustage, without looking into this too much, have you tried starting from a
> clean slate (without your patches) and see if 1 build + 1 test job runs
> green? Or tried again by pulling latest changes and applying your patches on
> top?
> 
> you can use the rdiff hg extension and diff your local repo against a remote
> url (like canonical m-c). That way you can see all the changes you've made.

Hi Simar,

You were quite close.

I took your combined patches, ran them against try, and was able to reproduce the same errors. I found two problems with your patch:

1) Path issues: you were invoking the talos scripts from /builds/slave/test/build as ../talos/talos/*.py, but that path doesn't exist. The correct path from that working dir is ../../talos-data/talos/*.py. This may be related to bug 1143018.

2) Missing module: once the talos path was correct, I discovered the mozprofile module wasn't being installed.

Here is a minimal try run with this patch:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=72924730bba4

Simar: we're probably not ready to land yet because we'll want to verify the results on other platforms first. If you have time to take this patch and continue with it, that would be great.
Attachment #8599594 - Flags: review?(jmaher)
Attachment #8599594 - Flags: feedback?(s244sing)
Comment on attachment 8599594 [details] [diff] [review]
[mozharness] Use cloned talos

Review of attachment 8599594 [details] [diff] [review]:
-----------------------------------------------------------------

thanks for reviving this patch!

::: mozharness/mozilla/testing/talos.py
@@ +588,5 @@
>              self.mkdir_p(env['MOZ_UPLOAD_DIR'])
>          env = self.query_env(partial_env=env, log_level=INFO)
>          # sets a timeout for how long talos should run without output
>          output_timeout = self.config.get('talos_output_timeout', 3600)
> +        # Call PerfConfigurator to generate talos.yml

this comment should be:
# Call run_tests.py to run the tests specified in talos.yml
Attachment #8599594 - Flags: review?(jmaher) → review+
(In reply to Chris Cooper [:coop] from comment #36)
> Created attachment 8599594 [details] [diff] [review]
> [mozharness] Use cloned talos
> 
> (In reply to Jordan Lund (:jlund) [PTO until May 13th] from comment #35) 
> > hmm, I'd imagine it would be pretty hard to diagnose changes between try
> > commits. If you don't think your minor changes could have caused the
> > bustage, without looking into this too much, have you tried starting from a
> > clean slate (without your patches) and see if 1 build + 1 test job runs
> > green? Or tried again by pulling latest changes and applying your patches on
> > top?
> > 
> > you can use the rdiff hg extension and diff your local repo against a remote
> > url (like canonical m-c). That way you can see all the changes you've made.
> 
> Hi Simar,
> 
> You were quite close.
> 
> I took your combined patches, ran them against try, and was able to
> reproduce the same errors. I found two problems with your patch:
> 
> 1) Path issues: you were invoking the talos scripts from
> /builds/slave/test/build as ../talos/talos/*.py, but that path doesn't
> exist. The correct path from that working dir is
> ../../talos-data/talos/*.py. This may be related to bug 1143018.
> 
> 2) Missing module: once the talos path was correct, I discovered the
> mozprofile module wasn't being installed.
> 
> Here is a minimal try run with this patch:
> 
> https://treeherder.mozilla.org/#/jobs?repo=try&revision=72924730bba4
> 
> Simar: we're probably not ready to land yet because we'll want to verify the
> results on other platforms first. If you have time to take this patch and
> continue with it, that would be great.

Hi Chris,

Thanks a lot for looking into this. I'll try to test the suggestions you've made on other platforms as well. Hopefully we'll be able to iron things out soon.
I did a quick try run yesterday and found that the patch fails on Windows:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=9acc23fe69f3

I need to figure out the trychooser platform invocation to test on Mac, although I suspect it will behave like Linux.
(In reply to Chris Cooper [:coop] from comment #39)
> I did a quick try run yesterday and found that the patch fails on Windows:

I bet that at least part of the solution here is to use os.path.join() to construct the paths to the PerfConfigurator.py and run_tests.py. Good ol' Windows!
I contacted Simar over email. I'll be wrapping up the work that Simar and coop have done here. Try push incoming.
Assignee: s244sing → jlund
Status: NEW → ASSIGNED
(In reply to Chris Cooper [:coop] from comment #40)
> (In reply to Chris Cooper [:coop] from comment #39)
> > I did a quick try run yesterday and found that the patch fails on Windows:
> 
> I bet that at least part of the solution here is to use os.path.join() to
> construct the paths to the PerfConfigurator.py and run_tests.py. Good ol'
> Windows!

Turns out that os.path.join didn't help here. Taking a deeper look at this, it seems like c['webroot'] points to an abs path of our talos-data dir, which is what we were trying to reach with a relative path. I think it makes sense to just use webroot as it's configured and absolute:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=d14a7e2208c9

jmaher: I see that we copy scripts out of the local talos clone checkout (e.g. /builds/slave/test/build/talos_repo) into webroot (e.g. /builds/slave/talos-data/talos). When we call PerfConfigurator.py and run_tests.py, should we be using webroot's copy or the local checkout? Does it matter?
Flags: needinfo?(jmaher)
webroot is only set up for the apache server; we don't run scripts from there.  Ideally we would have a fresh clone in /builds/slave/test/build/talos_repo, and when we call PerfConfigurator.py, it would use the copy there and not something in a virtualenv.
Flags: needinfo?(jmaher)
(In reply to Joel Maher (:jmaher) from comment #43)
> webroot is only setup for the apache server, we don't run scripts from
> there.  Ideally we would have a fresh clone in
> /builds/slave/test/build/talos_repo and when we call Perfconfigurator.py, it
> uses the copy there and not something in a virtualenv.

thanks for the quick reply. Okay, I'll switch it from talos-data to talos_repo.
Attachment #8599594 - Flags: feedback?(s244sing)
I had time to revisit this and hash out the last few hiccups.

interdiff from last r+: http://people.mozilla.org/~jlund/150529_use_cloned_talos-mh-interdiff.patch

interdiff changes:
1) use the actual repo checkout (not the talos-data copy)
2) explicitly pass the python interpreter, because winDOEs and the repo isn't on the PATH (both changes are sketched below)

trial runs:
linux/windows - https://treeherder.mozilla.org/#/jobs?repo=try&revision=13a278a0acd0
macosx - https://treeherder.mozilla.org/#/jobs?repo=try&revision=69e124da1574
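Roughly, the two interdiff changes amount to assembling the talos commands like this; the paths, helper name, and option lists are illustrative, not copied from the real talos.py:

> # Hedged sketch: build the PerfConfigurator/run_tests calls against the repo
> # checkout, passing the interpreter explicitly so Windows doesn't rely on
> # PATH lookups or shebang handling.
> import os
> import sys
>
> def build_talos_commands(abs_work_dir, python=sys.executable):
>     talos_dir = os.path.join(abs_work_dir, 'talos_repo', 'talos')
>     perfconfigurator_cmd = [
>         python, os.path.join(talos_dir, 'PerfConfigurator.py'),
>         '--output', 'talos.yml',  # plus the real suite/branch options
>     ]
>     run_tests_cmd = [
>         python, os.path.join(talos_dir, 'run_tests.py'),
>         '--noisy', '--debug', 'talos.yml',
>     ]
>     return perfconfigurator_cmd, run_tests_cmd
>
> for cmd in build_talos_commands('/builds/slave/test/build'):
>     print(' '.join(cmd))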
Attachment #8613025 - Flags: review?(jmaher)
Comment on attachment 8613025 [details] [diff] [review]
150529_use_cloned_talos-mh.patch

Review of attachment 8613025 [details] [diff] [review]:
-----------------------------------------------------------------

Overall this looks good.  Let me know if my list of modules is concerning.

::: mozharness/mozilla/testing/talos.py
@@ +159,4 @@
>                                                'run-tests',
>                                                ])
>          kwargs.setdefault('config', {})
> +        kwargs['config'].setdefault('virtualenv_modules', ["mozinstall", "mozdevice", "pyyaml", "mozversion", "datazilla", "mozcrash", "mozhttpd", "mozprofile"])

here  are the specific requirements.txt contents:
PyYAML
*mozlog==2.6
mozcrash==0.13
mozdevice==0.40
*mozfile==1.1
mozhttpd==0.5
*mozinfo==0.7
*moznetwork==0.24
*mozprocess==0.21
mozversion==0.8
mozprofile==0.23
*httplib2

the ones with a * are missing from the list up there.  Maybe we don't need them as they are subdependencies?
Attachment #8613025 - Flags: review?(jmaher) → review+
judging by my try push results:

17:04:05     INFO - Getting output from command: ['C:\\slave\\test\\build\\venv\\Scripts\\pip', 'freeze']
17:04:05     INFO - Copy/paste: C:\slave\test\build\venv\Scripts\pip freeze
17:04:05     INFO - Reading from file tmpfile_stdout
17:04:05     INFO - Using _rmtree_windows ...
17:04:05     INFO - Using _rmtree_windows ...
17:04:05     INFO - Current package versions:
17:04:05     INFO -   mozInstall == 1.12
17:04:05     INFO -   cache-flusher == 1.0.4
17:04:05     INFO -   distribute == 0.6.14
17:04:05     INFO -   talos == 0.0
17:04:05     INFO -   pywin32 == 216
17:04:05     INFO -   mozlog == 2.6
17:04:05     INFO -   manifestparser == 1.1
17:04:05     INFO -   moznetwork == 0.24
17:04:05     INFO -   mozfile == 1.1
17:04:05     INFO -   mozsystemmonitor == 0.0
17:04:05     INFO -   mozdevice == 0.40
17:04:05     INFO -   mozprocess == 0.21
17:04:05     INFO -   PyYAML == 3.11
17:04:05     INFO -   mozprofile == 0.23
17:04:05     INFO -   mozinfo == 0.7
17:04:05     INFO -   blobuploader == 1.2.4
17:04:05     INFO -   mozcrash == 0.13
17:04:05     INFO -   httplib2 == 0.7.4
17:04:05     INFO -   mozhttpd == 0.5
17:04:05     INFO -   oauth2 == 1.5.211
17:04:05     INFO -   blessings == 1.5.1
17:04:05     INFO -   mozversion == 0.8
17:04:05     INFO -   requests == 1.2.3
17:04:05     INFO -   docopt == 0.6.1


It looks like we get all those modules for free as sub-dependencies of others, but I suppose it is better to be explicit than hopeful :)


remote:   https://treeherder.mozilla.org/#/jobs?repo=try&revision=e8c5027ae727
remote:   https://treeherder.mozilla.org/#/jobs?repo=try&revision=59948a058837
Attachment #8613114 - Flags: review?(jmaher)
Comment on attachment 8613114 [details] [diff] [review]
150529_use_cloned_talos-mh-interdiff2.patch

Review of attachment 8613114 [details] [diff] [review]:
-----------------------------------------------------------------

thanks!   In the near future we will be able to remove a few of these dependencies, but we cannot do it now.
Attachment #8613114 - Flags: review?(jmaher) → review+
Comment on attachment 8613114 [details] [diff] [review]
150529_use_cloned_talos-mh-interdiff2.patch

remote:   https://hg.mozilla.org/build/mozharness/rev/af81f40c0d13

thanks! checked in.
oh, backed out for media_tests failure:
http://hg.mozilla.org/build/mozharness/rev/bbfd393a3457

we only run this on linux64 other- in there we do import statements and instead we can find another way to access the functions we need.
Depends on: 1171159
Tree logs suggest this is working for 'o' and all talos suites across platforms. Are we done here?
sounds like it.  Thanks for pushing on this jlund!
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Hello :jlund,

I am looking at mozharness / talos import issues (I suspect because of this change).

We can't do imports like:

from talos import talosconfig

inside talos, since it is no longer a package from a mozharness point of view. But this works locally, and I would like to be able to use talos as a package again to avoid this kind of issue.

I am thinking about installing talos this way inside mozharness:

> self.create_virtualenv()  # nothing to install
> self.install_module(self.talos_path, editable=True)

I think the editable mode (which will do a pip install -e /path/to/talos) will avoid the issue we had here.

What do you think?
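For reference, a rough sketch of what the editable install amounts to, assuming the talos checkout ships a setup.py; the helper name and path are made up:

> # Hedged sketch: an editable install drops a link to the checkout into the
> # virtualenv instead of copying a snapshot, so `from talos import talosconfig`
> # resolves to the cloned sources. Roughly equivalent to:
> #   pip install -e /path/to/talos
> import subprocess
> import sys
>
> def install_talos_editable(talos_path):
>     subprocess.check_call(
>         [sys.executable, '-m', 'pip', 'install', '-e', talos_path]
>     )
>
> # install_talos_editable('/builds/slave/test/build/talos_repo')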
Flags: needinfo?(jlund)
See Also: → 1189021
replied in bug 1189021 :)
Flags: needinfo?(jlund)