967630 - Add an option to trychooser to select Talos profiling options

Reporter

Description

•

10 years ago

I had a discussion with jmaher and catlee. Sounds like mozharness has access to the commit message so we could parse the TryChooser syntax and set an environment variable before invoking the talos job.

http://hg.mozilla.org/build/mozharness/file/607e2b60deeb/mozharness/mozilla/testing/talos.py

Viktor Stanchev [:vikstrous]

Assignee

Updated

•

10 years ago

Assignee: nobody → vstanchev

Steve Fink [:sfink] [:s:]

Comment 1

•

10 years ago

Attached patch directives (obsolete) — Details — Splinter Review

I've done this in a local patch to my mozharness-based hazard analysis builds. Here's the patch I'm using, which might be useful as an example.

Steve Fink [:sfink] [:s:]

Comment 2

•

10 years ago

sorry, I guess I should comment it now that it's not just for private use

Steve Fink [:sfink] [:s:]

Comment 3

•

10 years ago

Attached patch directives — Details — Splinter Review

Attachment #8381484 - Attachment is obsolete: true

Viktor Stanchev [:vikstrous]

Assignee

Comment 4

•

10 years ago

Attached patch mozharness_talos_profiler (obsolete) — Details — Splinter Review

I considered several approaches and I ended up picking the last one. Here's the list of ideas:

1. Parse mozharness: parameters from the try server message in a generic way in the base class of all tests. This way we use any parameter that we could use when calling mozharness from the command line.

2. Parse mozharness: just like in 1. as well as --talos "<options>" for options that should be passed through to talos's command line arguments.

3. Write a mixin that any test can use to read mozharness: parameters

4. Parse only two specific values from mozharness: for the profiler

I didn't pick options 1 and 2 because the config parsing system for mozharness is big and scary. I would have to come up with some scheme for merging configs from the command line and from the commit message.

I didn't pick option 2 because it would require an awkward nesting of command line parameter lists.

I didn't pick option 3 for a similar reason to option 1. A generic implementation is difficult.

I picked option 4 because it was the fastest and easier thing to do for our use case.

I also updated this wiki page with some more useful info about running mozharness with local talos and firefox. I don't know mozharness well enough to make the instructions very general, but hopefully this will help the next person who tries to do this.

https://wiki.mozilla.org/ReleaseEngineering/Mozharness/How_to_run_tests_as_a_developer

Flags: needinfo?(bgirard)

Benoit Girard (:BenWa)

Reporter

Comment 5

•

10 years ago

That solves our use case pretty well. I don't think there is a lot of need for other things to extend TryChooser. If that's the case this is the right decision. Joel do you agree?

Flags: needinfo?(bgirard) → needinfo?(jmaher)

Joel Maher ( :jmaher ) (UTC -8)

Comment 6

•

10 years ago

I really don't understand how this parsing code is called.  I am still not sure how this allows us to run spsprofiling on try server.

Flags: needinfo?(jmaher)

Viktor Stanchev [:vikstrous]

Assignee

Comment 7

•

10 years ago

Buildbot calls mozharness with some parameters in a json at `buildbot_json_path`. Mozharness can access these parameters in `self.buildbot_config` after `self.read_buildbot_config()` is called (if a buildbot_json_path is given). One of the parameters is the list of commits submitted to the try server. It looks at the commit message of the last commit and finds 'mozharness:' in it. If it's present, it takes all parameters from 'mozharness:' to the end or until it sees another all alpha sequence followed by a `:` (such as 'try: <stuff>'). It extracts --spsProfile and --spsProfileInterval from this. The net result is that you can write on your commit message something like:

mozharness: --spsProfile --spsProfileInterval 10 try: -b o -p linux -u none -t none

Flags: needinfo?(jmaher)

Viktor Stanchev [:vikstrous]

Assignee

Comment 8

•

10 years ago

Mozharness is the thing that calls Talos. It passes the profiling options along to Talos.

Joel Maher ( :jmaher ) (UTC -8)

Comment 9

•

10 years ago

my question is related to the code, previously I assumed this was in the talos repository, now I see it is in the mozharness repository!

This approach seems great.

Flags: needinfo?(jmaher)

Benoit Girard (:BenWa)

Reporter

Comment 10

•

10 years ago

Comment on attachment 8383808 [details] [diff] [review]
mozharness_talos_profiler

Look ok to me. There doesn't seem to be a review for mozharness talos so I'm tentatively asking you Joel.

Attachment #8383808 - Flags: review?(jmaher)

Attachment #8383808 - Flags: feedback+

Joel Maher ( :jmaher ) (UTC -8)

Comment 11

•

10 years ago

Comment on attachment 8383808 [details] [diff] [review]
mozharness_talos_profiler

Review of attachment 8383808 [details] [diff] [review]:
-----------------------------------------------------------------

i think these look fine, lets ask jlund as he knows more about mozharness.  

jlund- do these changes look good for mozharness changes?

Attachment #8383808 - Flags: review?(jmaher) → review?(jlund)

Jordan Lund (:jlund)

Comment 12

•

10 years ago

Comment on attachment 8383808 [details] [diff] [review]
mozharness_talos_profiler

Review of attachment 8383808 [details] [diff] [review]:
-----------------------------------------------------------------

awesome! thanks for the patch and working on this. I have some feedback below.

::: mozharness/mozilla/testing/talos.py
@@ +137,5 @@
> +            "dest": "sps_profile_interval",
> +            "type": "int",
> +            "default": 0,
> +            "help": "The interval between samples taken by the profiler (milliseconds)"
> +        }],

Nice, this will be friendly outside of automation

@@ +159,5 @@
>                                                'install',
>                                                'run-tests',
>                                                ])
>          kwargs.setdefault('config', {})
> +        kwargs['config'].setdefault('virtualenv_modules', ["talos", "mozinstall==1.6"])

I too experimented with this as talos requires mozinfo 0.4. My understanding was that jmaher was testing out mozinfo=0.7 so we may not need to hard code mozinstall 1.6. Were you using this with mach or is there a reason we need to stay back at 1.6?

@@ +192,5 @@
> +    # Example try commit message:
> +    #   mozharness: --spsProfile try: <stuff>
> +    def _pre_config_lock(self, rw_config):
> +        if not self.buildbot_config:
> +            self.read_buildbot_config()

So, looking at our existing talos script, it seems we call read_buildbot_config already twice -- In tbpl logs you can see it being run at the start of the mozharness script and again because it is in the list of actions: 'read-buildbot-config'

I have two concerns with adding this:

1) we should only need to call read_buildbot_config() once (the props never change on subsequent calls) and outputting it again repeatedly is noisy. So even though you have 'if not self.buildbot_config', _pre_config_lock will be called prior to everything else anyway.

2) adding this patch, we will never be able call this talos script outside of automation (buildbot).
Right now, we can pass things like '--no-read-buildbot-config' and that removes it out of list of actions. But _pre_config_lock will be called no matter what on every script run. This means that things like mach/local mozharness calls will fail.

I think this whole talos script needs a clean up (it shouldn't call read_buildbot_config() twice in the first place). But I'd prefer to avoid making it worse with a forced third call (see below for a suggested alternative).

@@ +449,5 @@
> +        if self.sps_profile:
> +          options.append('--spsProfile')
> +        # configure profiling interval
> +        if self.sps_profile_interval:
> +          options.extend(['--spsProfileInterval', str(self.sps_profile_interval)])

So I think we can query things here since read-buildbot-config will already be called by this time so we could do something like (instead of _pre_config_lock)

sps_profile_options = self.query_sps_profile_options()
if sps_profile_options:
    options.extend(sps_profile_options)

# and then in the query method
def query_sps_profile_options(self):
    sps_results = []
    # first let's see if we passed CLI or config options for spsProfile 
    if self.sps_profile:
        # this was set outside of automation and if present,
        # we will go with this
	sps_results.append('--spsProfile')
        if self.sps_profile_interval:
	    # configure profiling interval
            # XXX is this a requirement (is it coupled) with --spsProfile?
            # XXX and if so, should we fail if both are not present?
	    sps_results.extend(
                ['--spsProfileInterval', str(self.sps_profile_interval)]
            )
        # return this as it has highest priority
        return sps_results
    # now let's make sure that we didn't set this via try commit msg
    if self.buildbot_config:
         # this is inside automation
         # now let's see if we added spsProfile specs
         junk, junk, opts = self.buildbot_config['sourcestamp']['changes'][-1]['comments'].partition('mozharness:')
         if opts:
             opts = re.sub(r'\w+:.*', '', opts).strip().split(' ')
         else:
             opts = []
         if "--spsProfile" in opts:
             sps_results.append('--spsProfile')
         try:
             idx = opts.index('--spsProfileInterval')
             if len(opts) > idx + 1:
                   sps_results.extend(
                        ['--spsProfileInterval', str(opts[idx + 1])]
                   )
         except ValueError:
              pass
    return sps_results


NOTE: wrote this in code review so it is sure to have many mistakes. Hopefully it explains my thoughts.

Finally, I'm assuming the spsProfileInterval is variable and will change per commit message. But if it is not, could we put it in the mozharness config (http://mxr.mozilla.org/build/source/mozharness/configs/talos/linux_config.py) or inside talos json(http://mxr.mozilla.org/mozilla-central/source/testing/talos/talos.json)? Then we would only need to parse for '--spsProfile' and there would be less chance of screwing up the msg pattern.

Attachment #8383808 - Flags: review?(jlund) → review-

Joel Maher ( :jmaher ) (UTC -8)

Comment 13

•

10 years ago

I have upgraded talos to use mozinfo 0.7 last week, we are not bound by a specific version of mozinstall, but I see that as helping to futureproof things.

Viktor Stanchev [:vikstrous]

Assignee

Comment 14

•

10 years ago

(In reply to Jordan Lund (:jlund) from comment #12)
> @@ +159,5 @@
> >                                                'install',
> >                                                'run-tests',
> >                                                ])
> >          kwargs.setdefault('config', {})
> > +        kwargs['config'].setdefault('virtualenv_modules', ["talos", "mozinstall==1.6"])
> 
> I too experimented with this as talos requires mozinfo 0.4. My understanding
> was that jmaher was testing out mozinfo=0.7 so we may not need to hard code
> mozinstall 1.6. Were you using this with mach or is there a reason we need
> to stay back at 1.6?

Yeah, I was running it through mach. I had a lot of changes and I lost track of which ones are essential. This one was just to make mach work. I can take this out.

> 
> @@ +192,5 @@
> > +    # Example try commit message:
> > +    #   mozharness: --spsProfile try: <stuff>
> > +    def _pre_config_lock(self, rw_config):
> > +        if not self.buildbot_config:
> > +            self.read_buildbot_config()
> 
> So, looking at our existing talos script, it seems we call
> read_buildbot_config already twice -- In tbpl logs you can see it being run
> at the start of the mozharness script and again because it is in the list of
> actions: 'read-buildbot-config'
> 
> I have two concerns with adding this:
> 
> 1) we should only need to call read_buildbot_config() once (the props never
> change on subsequent calls) and outputting it again repeatedly is noisy. So
> even though you have 'if not self.buildbot_config', _pre_config_lock will be
> called prior to everything else anyway.
> 
> 2) adding this patch, we will never be able call this talos script outside
> of automation (buildbot).
> Right now, we can pass things like '--no-read-buildbot-config' and that
> removes it out of list of actions. But _pre_config_lock will be called no
> matter what on every script run. This means that things like mach/local
> mozharness calls will fail.
> 
> I think this whole talos script needs a clean up (it shouldn't call
> read_buildbot_config() twice in the first place). But I'd prefer to avoid
> making it worse with a forced third call (see below for a suggested
> alternative).

Well, I don't understand why the buildbot config isn't just read whenever it exists. Reading a config file is not expensive. Why is there an action to read it?

The problem I ran into when that action was enabled is that the action does more than just read the json and in my case I wasn't actually running it through buidbot, so it would just break if I used it.

> 
> @@ +449,5 @@
> > +        if self.sps_profile:
> > +          options.append('--spsProfile')
> > +        # configure profiling interval
> > +        if self.sps_profile_interval:
> > +          options.extend(['--spsProfileInterval', str(self.sps_profile_interval)])
> 
> So I think we can query things here since read-buildbot-config will already
> be called by this time so we could do something like (instead of
> _pre_config_lock)
> 
> sps_profile_options = self.query_sps_profile_options()
> if sps_profile_options:
>     options.extend(sps_profile_options)
> 
> # and then in the query method
> def query_sps_profile_options(self):
>     sps_results = []
>     # first let's see if we passed CLI or config options for spsProfile 
>     if self.sps_profile:
>         # this was set outside of automation and if present,
>         # we will go with this
> 	sps_results.append('--spsProfile')
>         if self.sps_profile_interval:
> 	    # configure profiling interval
>             # XXX is this a requirement (is it coupled) with --spsProfile?
>             # XXX and if so, should we fail if both are not present?

There's a default value if we just don't set it. There's no need to fail.

> 	    sps_results.extend(
>                 ['--spsProfileInterval', str(self.sps_profile_interval)]
>             )
>         # return this as it has highest priority

I'm not actually sure which one should have higher priority. In my patch buildbot's commit message had a higher priority. I think your order makes a bit more sense.

>         return sps_results
>     # now let's make sure that we didn't set this via try commit msg
>     if self.buildbot_config:
>          # this is inside automation
>          # now let's see if we added spsProfile specs
>          junk, junk, opts =
> self.buildbot_config['sourcestamp']['changes'][-1]['comments'].
> partition('mozharness:')
>          if opts:
>              opts = re.sub(r'\w+:.*', '', opts).strip().split(' ')
>          else:
>              opts = []
>          if "--spsProfile" in opts:
>              sps_results.append('--spsProfile')
>          try:
>              idx = opts.index('--spsProfileInterval')
>              if len(opts) > idx + 1:
>                    sps_results.extend(
>                         ['--spsProfileInterval', str(opts[idx + 1])]
>                    )
>          except ValueError:
>               pass
>     return sps_results
> 
> 
> NOTE: wrote this in code review so it is sure to have many mistakes.
> Hopefully it explains my thoughts.

In this code if we get the --spsProfile flag from one method we can't get the interval form the other method. I'd rather keep the two separate so that even if we override the --spsProfile on the command line it still uses the interval specified by buildbot. I'm not sure how much this matters, but I think it's a bit confusing if adding --spsProfile manually changes the interval.

The reason I did my changes in _pre_config_lock was so that it's easier to generalize and so that we can make sure that the same code path is taken whether the config is set through the command line or through buildbot. In a perfect world I would make mozharness: be parsed exactly as additional command line parameters, but I couldn't find any simple way to do that.

> 
> Finally, I'm assuming the spsProfileInterval is variable and will change per
> commit message. But if it is not, could we put it in the mozharness config
> (http://mxr.mozilla.org/build/source/mozharness/configs/talos/linux_config.
> py) or inside talos
> json(http://mxr.mozilla.org/mozilla-central/source/testing/talos/talos.
> json)? Then we would only need to parse for '--spsProfile' and there would
> be less chance of screwing up the msg pattern.

Well, I find modifying talos.json as a way to change parameters for try runs really non-intuitive. That is one of the reasons why we are doing this in the first place. I would rather not have two ways to interact with the profiler on talos tests and require both ways to be used simultaneously.

Viktor Stanchev [:vikstrous]

Assignee

Updated

•

10 years ago

Flags: needinfo?(jlund)

Jordan Lund (:jlund)

Comment 15

•

10 years ago

> Yeah, I was running it through mach. I had a lot of changes and I lost track
> of which ones are essential. This one was just to make mach work. I can take
> this out.

Let's keep it out of here but I think you're right, it's probably better to have. Let's isolate it in an attachment here at some point: Bug 935997 - mach talos-test doesn't work, complains about mozinfo==0.4

> Well, I don't understand why the buildbot config isn't just read whenever it
> exists. Reading a config file is not expensive. Why is there an action to
> read it?

Ya I hear ya. read_buildbot_config() does two things though 1) parses the json file and 2) saves the json as a dict obj. I think it makes things clearer if we call that once and then check for the presence of self.buildbot_config (the dict obj) from then on after.

> 
> The problem I ran into when that action was enabled is that the action does
> more than just read the json and in my case I wasn't actually running it
> through buidbot, so it would just break if I used it.

hmm, I think you're showing me I am wrong on behaviour. I thought read_buildbot_config would be fatal if it was called without the presence of the buildbot props but it seems that logic comes from postflight methods:

yes, the talos script looks to use TestMixin as a base class. Both talos and this use a postflight_{action} call for read_buildbot_config (if you define postflight and preflight methods with action names, they will be called). In these postflights, we demand the buildbot props to be there otherwise we will fail in automation to get things like installer urls.

It seems that whoever wrote the talos script, realized this and even though they called read_buildbot_config outside of actions, they made sure to include the postflight call (manually): http://mxr.mozilla.org/build/source/mozharness/mozharness/mozilla/testing/talos.py#204

So I suppose my 2nd concern is not an issue and we will still be able to run this out of automation if we call it in _pre_config_lock.

However, I still feel like we should keep self.buildbot_config and self.config as seperate dicts and not overwrite self.config with buildbot_config's vals (even in _pre_config_lock).

But I'm open to be persuaded otherwise.


> I'm not actually sure which one should have higher priority. In my patch
> buildbot's commit message had a higher priority. I think your order makes a
> bit more sense.

ya, I'm not sure either. I suppose in buildbot/automation we wouldn't pass '--spsProfile' and outside of automation, we wouldn't have buildbot_config with try messages. So really both can't exist. Unless we are planning on adding --spsProfile to every talos run and we add that as extra_args when we call this mozharness script from buildbot.

I have always felt that CLI and then the cfg-file themselves take priority. But I see the argument that if this is a single try push and we explicitly say in the message what values we want, maybe that should take precedence. Maybe if we just add a self.info("") explicitly saying which one we are taking I'm fine with either way.

> In this code if we get the --spsProfile flag from one method we can't get
> the interval form the other method. I'd rather keep the two separate so that
> even if we override the --spsProfile on the command line it still uses the
> interval specified by buildbot. I'm not sure how much this matters, but I
> think it's a bit confusing if adding --spsProfile manually changes the
> interval.

ya makes sense.

> 
> The reason I did my changes in _pre_config_lock was so that it's easier to
> generalize and so that we can make sure that the same code path is taken
> whether the config is set through the command line or through buildbot. In a
> perfect world I would make mozharness: be parsed exactly as additional
> command line parameters, but I couldn't find any simple way to do that.
>

ya, I still don't like the idea of overwriting self.config items with self.buildbot_config items.

So how about this: self.sps_profile{_interval} are originally assigned by self.config like you have in the __init__. They can both be re-assigned if commit msg were given from builbdot. We don't worry about calling read_buildbot_config again but stick with something like I suggested before:

def query_sps_profile_options(self):
    sps_results = []
    # first let's see if we explicitly set the sps profile options
    if self.buildbot_config:
        # this is inside automation
        # now let's see if we added spsProfile specs
        junk, junk, opts = self.buildbot_config['sourcestamp']['changes'][-1]['comments'].partition('mozharness:')
        if opts:
            opts = re.sub(r'\w+:.*', '', opts).strip().split(' ')
            if "--spsProfile" in opts:
                # overwrite whatever was set here.
                self.sps_profile = True
            try:
                idx = opts.index('--spsProfileInterval')
                if len(opts) > idx + 1:
                    # overwrite whatever was set here.
                    self.sps_profile_interval = opts[idx + 1])
            except ValueError:
                pass
    # finally, if sps_profile is set, we add that to the talos options
    if self.sps_profile:
        sps_results.append('--spsProfile')
        if self.sps_profile_interval:
            sps_results.extend(
                ['--spsProfileInterval', str(self.sps_profile_interval)]
            )
    # now let's make sure that we didn't set this via try commit msg
    return sps_results

Notice, I realized you said that if self.sps_profile_interval but self.sps_profile is true we should let talos choose the default. But here I added: if self.sps_profile is not assigned, don't even bother with self.sps_profile_interval.

> Well, I find modifying talos.json as a way to change parameters for try runs
> really non-intuitive. That is one of the reasons why we are doing this in
> the first place. I would rather not have two ways to interact with the
> profiler on talos tests and require both ways to be used simultaneously.

OK cool, sounds good, just thought I'd see what you thought.

Thanks. Let me know if I am still way off base.

Flags: needinfo?(jlund)

Jordan Lund (:jlund)

Updated

•

10 years ago

Whiteboard: [mozharness]

Viktor Stanchev [:vikstrous]

Assignee

Comment 16

•

10 years ago

Attached patch mozharness_talos_profiler2 — Details — Splinter Review

Attachment #8383808 - Attachment is obsolete: true

Viktor Stanchev [:vikstrous]

Assignee

Updated

•

10 years ago

Attachment #8390003 - Flags: review?(jlund)

Jordan Lund (:jlund)

Comment 17

•

10 years ago

Comment on attachment 8390003 [details] [diff] [review]
mozharness_talos_profiler2

lgtm

I can't say if the parse logic works as I have'nt tested.

We were talking about testing this out and I think you made progress outside of automation. Were you able to determine if:

1) talos was unaffected by this change (for when you did not add sps profile options)
2) this worked via CLI options
3) this worked via simulated try commit message from a fake buildprops.json

If yes to all those 3 then let's r+ and land on mozharness default.

From there, since this could potentially affect all talos suites, I'd like to see how things look on cypress[1]. After that, we can merge to production and try on try (the real test!).

Thanks for the patch! The ability to change this via try msg sounds useful :)

[1] cypress is the mozharness testing branch where we try out every suite against vanilla mozilla-central and the default mozharness branch. If you need triggering build jobs to this, I can do one after you commit to mozharness. Instructions are here: https://wiki.mozilla.org/ReleaseEngineering/SpecialBranches#Cypress

Attachment #8390003 - Flags: review?(jlund) → review+

Steve Fink [:sfink] [:s:]

Comment 18

•

10 years ago

(In reply to Jordan Lund (:jlund) from comment #17)
> Thanks for the patch! The ability to change this via try msg sounds useful :)

Note that this patch refers to the "try message", it doesn't restrict itself to that. A push to any tree with a "mozharness:" directive in the commit message will get this special handling.

I don't see very many mistaken "try: " commit messages in mozilla-central's history (11 total), so this probably doesn't matter too much. But it's something to keep in mind.

I don't know if it would be better to restrict this to the try branch (which would make it annoying to test on Cypress), or if we should add a commit hook to reject commit messages with "\btry: -" and "\bmozharness: " in them. Or doing nothing's not so bad, either.

Viktor Stanchev [:vikstrous]

Assignee

Comment 19

•

10 years ago

(In reply to Jordan Lund (:jlund) from comment #17)
> Comment on attachment 8390003 [details] [diff] [review]
> mozharness_talos_profiler2
> 
> lgtm
> 
> I can't say if the parse logic works as I have'nt tested.
> 
> We were talking about testing this out and I think you made progress outside
> of automation. Were you able to determine if:
> 
> 1) talos was unaffected by this change (for when you did not add sps profile
> options)

Yes

> 2) this worked via CLI options

Yes

> 3) this worked via simulated try commit message from a fake buildprops.json

Yes

> 
> If yes to all those 3 then let's r+ and land on mozharness default.
> 
> From there, since this could potentially affect all talos suites, I'd like
> to see how things look on cypress[1]. After that, we can merge to production
> and try on try (the real test!).
> 
> Thanks for the patch! The ability to change this via try msg sounds useful :)
> 
> [1] cypress is the mozharness testing branch where we try out every suite
> against vanilla mozilla-central and the default mozharness branch. If you
> need triggering build jobs to this, I can do one after you commit to
> mozharness. Instructions are here:
> https://wiki.mozilla.org/ReleaseEngineering/SpecialBranches#Cypress

How do I get this committed to mozharness? Does someone else have to do it for me?

Viktor Stanchev [:vikstrous]

Assignee

Comment 20

•

10 years ago

(In reply to Steve Fink [:sfink] from comment #18)
> (In reply to Jordan Lund (:jlund) from comment #17)
> > Thanks for the patch! The ability to change this via try msg sounds useful :)
> 
> Note that this patch refers to the "try message", it doesn't restrict itself
> to that. A push to any tree with a "mozharness:" directive in the commit
> message will get this special handling.
> 
> I don't see very many mistaken "try: " commit messages in mozilla-central's
> history (11 total), so this probably doesn't matter too much. But it's
> something to keep in mind.
> 
> I don't know if it would be better to restrict this to the try branch (which
> would make it annoying to test on Cypress), or if we should add a commit
> hook to reject commit messages with "\btry: -" and "\bmozharness: " in them.
> Or doing nothing's not so bad, either.

That's a good point. It would be a weird issue to debug if it does happen, but I suspect the first thing anyone would do if they see a performance degradation is look at the last commit that caused it and it would include "mozharness: --spsProfile" so maybe that's not too hard to figure out.

Jordan Lund (:jlund)

Comment 21

•

10 years ago

(In reply to Steve Fink [:sfink] from comment #18)
> (In reply to Jordan Lund (:jlund) from comment #17)
> > Thanks for the patch! The ability to change this via try msg sounds useful :)
> 
> Note that this patch refers to the "try message", it doesn't restrict itself
> to that. 

Ah ya that's true.

> I don't know if it would be better to restrict this to the try branch (which
> would make it annoying to test on Cypress), or if we should add a commit
> hook to reject commit messages with "\btry: -" and "\bmozharness: " in them.

My vote is for this. If we accidentally push to another branch other than try, it would make sense to have a hook that would catch it. That goes beyond the times when that accidental push contains '--spsProfile' in the commit message.

Smells like bug worthy to me even if, as you say, there have only been 11 accidental try pushes to m-c.

Jordan Lund (:jlund)

Comment 22

•

10 years ago

> How do I get this committed to mozharness? Does someone else have to do it
> for me?

1) push to default at any time
2) merge m-c -> cypress (if you want to test, which we should here)[1]
3) it will be merged to production via release engineering ~3 times a week.

https://wiki.mozilla.org/ReleaseEngineering/SpecialBranches#Cypress

Jordan Lund (:jlund)

Comment 23

•

10 years ago

Comment on attachment 8390003 [details] [diff] [review]
mozharness_talos_profiler2

pushed to default: https://hg.mozilla.org/build/mozharness/rev/0417eb4a2ad5

Attachment #8390003 - Flags: checked-in+

Benoit Girard (:BenWa)

Reporter

Comment 24

•

10 years ago

I think the more likely case is a developer pushing to try with multiple try syntax in his queue. Normally only the topmost push is picked up. Not that I think it should change the outcome of the decision.

Jordan Lund (:jlund)

Comment 25

•

10 years ago

and the cypress jobs: https://tbpl.mozilla.org/?showall=all&tree=Cypress&rev=b01286b4ed37

Viktor, keep an eye on the talos jobs for:

1) they are all green
2) the command line args for the talos jobs should be the same as before
3) performance shouldn't have changed: https://datazilla.mozilla.org/?start=1394128127&stop=1394732927&product=Firefox&repository=Cypress

*NOTE: you can click cancel on the other jobs that are not talos after the builds have completed to save resources

I don't think there will be any perf diff WRT (3) if '--spsProfile' wasn't added accidentally.

We could add a fake commit message in cypress on another push or else try it on 'try' and back it out if need be.

Viktor Stanchev [:vikstrous]

Assignee

Comment 26

•

10 years ago

Looks normal to me. If you give it a fake commit message with "mozharness: --spsProfile" it will cause talos to error out because it doesn't have that option yet (bug 967635), so that should be pretty easy to test.

Flags: needinfo?(jlund)

Jordan Lund (:jlund)

Comment 27

•

10 years ago

sounds good! I guess if you know how to use the commit message pattern, you should know it's not implemented on talos yet so that seems OK :)

Flags: needinfo?(jlund)

Viktor Stanchev [:vikstrous]

Assignee

Comment 28

•

10 years ago

Attached patch trychooser (obsolete) — Details — Splinter Review

These are my proposed changes to the try chooser. I could make them more generalized, but I think there is no need yet.

Viktor Stanchev [:vikstrous]

Assignee

Updated

•

10 years ago

Attachment #8398733 - Flags: review?(sphink)

Steve Fink [:sfink] [:s:]

Updated

•

10 years ago

Attachment #8398733 - Flags: review?(sphink) → review+

Viktor Stanchev [:vikstrous]

Assignee

Comment 29

•

10 years ago

Attached patch trychooser (obsolete) — Details — Splinter Review

Looks like we need "mozharness: ..." at the end instead of the beginning of it won't be passed to mozharness. https://bugzilla.mozilla.org/show_bug.cgi?id=967635#c27

Attachment #8398733 - Attachment is obsolete: true

Attachment #8399579 - Flags: review?(sphink)

Steve Fink [:sfink] [:s:]

Comment 30

•

10 years ago

Comment on attachment 8399579 [details] [diff] [review]
trychooser

Review of attachment 8399579 [details] [diff] [review]:
-----------------------------------------------------------------

What requires the ordering? I thought the try parser allowed unrecognized args, even at the end, but perhaps the mozharness talos one doesn't?

Anyway, this patch is clearly fine. Hopefully it accomplishes what you need it to.

Attachment #8399579 - Flags: review?(sphink) → review+

Viktor Stanchev [:vikstrous]

Assignee

Comment 31

•

10 years ago

Here we I put "mozharness:" first and it was not passed to moznarness. Only the part of the message after "try:" was sent to mozharness - https://tbpl.mozilla.org/?tree=Try&rev=dd76971bd49f

Steve Fink [:sfink] [:s:]

Comment 32

•

10 years ago

(In reply to Viktor Stanchev [:vikstrous] from comment #31)
> Here we I put "mozharness:" first and it was not passed to moznarness. Only
> the part of the message after "try:" was sent to mozharness -
> https://tbpl.mozilla.org/?tree=Try&rev=dd76971bd49f

What the heck? You're right. Here's the message shown in tbpl for the try push:

dd76971bd49f Viktor Stanchev – mozharness: --spsProfile try: try: -b o -p linux,linux64,macosx64,win32 -u none -t all Bug 978585 - Loader lock deadlock on startup when initializing NS_StackWalk

And here's a summarized portion of the log file

    "sourcestamp": {
        "branch": "try-linux-talos", 
        "changes": [
            {
                "rev": "dd76971bd49f", 
                "who": "vstanchev@mozilla.com", 
                "comments": "try: try: -b o -p linux,linux64,macosx64,win32 -u none -t all Bug 978585 - Loader lock deadlock on startup when initializing NS_StackWalk", 
            }
        ], 
        "revision": "dd76971bd49f"
    }

so something stripped off everything before "try:". /me digs into buildbotcustom code...

Jordan Lund (:jlund)

Comment 33

•

10 years ago

off a quick look it seems like for our sendchanges we: http://mxr.mozilla.org/build/source/buildbotcustom/steps/misc.py#129

and for our trychooser scheduling: http://mxr.mozilla.org/build/source/buildbotcustom/misc_scheduler.py#63

so I am guessing we strip off everything up until the regex 'try:'

Viktor Stanchev [:vikstrous]

Assignee

Comment 34

•

10 years ago

Steve, I don't have commit access. Can you commit attachment 8399579 [details] [diff] [review] for me? We are getting the talos changes committed right now. Is there any extra work to be done to get the new revision deployed?

Flags: needinfo?(sphink)

Viktor Stanchev [:vikstrous]

Assignee

Comment 35

•

10 years ago

Attached patch trychooser2 — Details — Splinter Review

Added a link to https://wiki.mozilla.org/Buildbot/Talos/Profiling

Attachment #8399579 - Attachment is obsolete: true

Steve Fink [:sfink] [:s:]

Comment 36

•

10 years ago

Ok, <bbc>/steps/misc.py sends the message through try_parser's processMessage() before constructing the comment for the sendchange that will trigger a dependent job. processMessage() grabs out the portions it recognizes and returns them as an array. This was done in bug 697802 to shorten the commit message and sanitize it for passing to Window's command interpreter.

Ugh. This feels brittle.

(Sorry, I had this comment open in a tab for... a month? Just sending it now. No action required.)

Steve Fink [:sfink] [:s:]

Updated

•

10 years ago

Attachment #8411322 - Flags: review+

Steve Fink [:sfink] [:s:]

Updated

•

10 years ago

Attachment #8411322 - Flags: checked-in+

Steve Fink [:sfink] [:s:]

Comment 37

•

10 years ago

http://hg.mozilla.org/build/tools/rev/51e2b2115841

Benoit Girard (:BenWa)

Reporter

Updated

•

10 years ago

Blocks: 1000536

Ed Morley [:emorley]

Updated

•

10 years ago

Keywords: trychooser

Ed Morley [:emorley]

Updated

•

10 years ago

No longer blocks: 1000536

Steve Fink [:sfink] [:s:]

Comment 39

•

10 years ago

(In reply to Viktor Stanchev [:vikstrous] from comment #34)
> Steve, I don't have commit access. Can you commit attachment 8399579 [details] [diff] [review]
> [details] [diff] [review] for me? We are getting the talos changes committed
> right now. Is there any extra work to be done to get the new revision
> deployed?

Sorry, missed this last question. The change needs to be merged to the production branch, and the masters reconfigured. #releng buildduty will do that automatically every so often. Maybe they already have for this one? You can ask there.

Flags: needinfo?(sphink)

Jordan Lund (:jlund)

Comment 40

•

10 years ago

https://bugzilla.mozilla.org/attachment.cgi?id=8411322&action=edit is in the tools repo. Unlike buildbot-configs and buildbotcustom, we just have the one default branch in tools so there won't be any merging, we just need to reconfig.

last reconfig was just now: https://wiki.mozilla.org/ReleaseEngineering/Maintenance#Reconfigs_.2F_Deployments

This should be live in production :)

Benoit Girard (:BenWa)

Reporter

Comment 41

•

10 years ago

(In reply to Jordan Lund (:jlund) from comment #40)
> This should be live in production :)

I don't see it:
http://trychooser.pub.build.mozilla.org/

Perhaps there is a delay?

Jordan Lund (:jlund)

Comment 42

•

10 years ago

oh sorry, I didn't look at the actual patch. tools repo is deployed in a number of places and not always involved with buildbot continuous integration.

let me find out how http://trychooser.pub.build.mozilla.org/ is updated.

Jordan Lund (:jlund)

Updated

•

10 years ago

Depends on: 1001136

Jordan Lund (:jlund)

Comment 43

•

10 years ago

\o/ looks like it worked :)

Benoit Girard (:BenWa)

Reporter

Updated

•

10 years ago

Status: NEW → RESOLVED

Closed: 10 years ago

Resolution: --- → FIXED

Nobody; OK to take it and work on it

Updated

•

7 years ago

Component: Tools → General

directives 10 years ago Steve Fink [:sfink] [:s:] 2.66 KB, patch		Details \| Diff \| Splinter Review
directives 10 years ago Steve Fink [:sfink] [:s:] 2.97 KB, patch		Details \| Diff \| Splinter Review
mozharness_talos_profiler 10 years ago Viktor Stanchev [:vikstrous] 5.15 KB, patch	jlund : review- BenWa : feedback+	Details \| Diff \| Splinter Review
mozharness_talos_profiler2 10 years ago Viktor Stanchev [:vikstrous] 4.68 KB, patch	jlund : review+ jlund : checked-in+	Details \| Diff \| Splinter Review
trychooser 10 years ago Viktor Stanchev [:vikstrous] 2.57 KB, patch	sfink : review+	Details \| Diff \| Splinter Review
trychooser 10 years ago Viktor Stanchev [:vikstrous] 2.57 KB, patch	sfink : review+	Details \| Diff \| Splinter Review
trychooser2 10 years ago Viktor Stanchev [:vikstrous] 2.95 KB, patch	sfink : review+ sfink : checked-in+	Details \| Diff \| Splinter Review