Closed Bug 1055794 Opened 10 years ago Closed 9 years ago

Use runner on Windows

Tracking

(Not tracked)

Status:

RESOLVED FIXED

People

(Reporter: dustin, Assigned: mrrrgn)

References

Details

(Whiteboard: [windows])

Attachments

(7 files, 15 obsolete files)

runner windows buildbot task 10 years ago Morgan Phillips [:mrrrgn] 73 bytes, text/plain		Details
https://github.com/mrrrgn/build-runner/tree/windoze 10 years ago Morgan Phillips [:mrrrgn] 51 bytes, text/plain		Details
runner-windows.zip 10 years ago Morgan Phillips [:mrrrgn] 8.50 KB, application/zip		Details
start-runner.bat 10 years ago Morgan Phillips [:mrrrgn] 407 bytes, text/plain		Details
start-runner.bat 10 years ago Morgan Phillips [:mrrrgn] 545 bytes, text/plain		Details
Set interpreter explicitly in runner. 10 years ago Morgan Phillips [:mrrrgn] 46 bytes, text/x-github-pull-request	rail : review+ mrrrgn : checked-in+	Details \| Review
runner-windows.zip 10 years ago Morgan Phillips [:mrrrgn] 8.50 KB, application/zip		Details
start-runner.bat 10 years ago Morgan Phillips [:mrrrgn] 160 bytes, text/plain		Details
start-runner.bat 10 years ago Morgan Phillips [:mrrrgn] 513 bytes, text/plain	dustin : review+	Details
start-runner-win64.bat 10 years ago Morgan Phillips [:mrrrgn] 3.13 KB, text/plain	dustin : review+	Details
runner.windows.diff 9 years ago Morgan Phillips [:mrrrgn] 20.64 KB, patch	dustin : review+ markco : review-	Details \| Diff \| Splinter Review
runner.windows.diff 9 years ago Morgan Phillips [:mrrrgn] 22.82 KB, patch		Details \| Diff \| Splinter Review
runner.windows.diff 9 years ago Morgan Phillips [:mrrrgn] 22.82 KB, patch		Details \| Diff \| Splinter Review
runner.windows.diff 9 years ago Morgan Phillips [:mrrrgn] 22.82 KB, patch	rail : review+ mrrrgn : checked-in-	Details \| Diff \| Splinter Review
opt.tar.gz 9 years ago Morgan Phillips [:mrrrgn] 4.32 MB, application/x-gzip		Details
opt-1.8.tar.gz 9 years ago Morgan Phillips [:mrrrgn] 16.10 KB, application/x-gzip		Details
opt-1.8-fixed-halt.tar.gz 9 years ago Morgan Phillips [:mrrrgn] 16.00 KB, application/x-gzip	markco : feedback+	Details
opt.tar.gz 9 years ago Morgan Phillips [:mrrrgn] 19.07 KB, application/x-gzip		Details
opt.tar.gz 9 years ago Morgan Phillips [:mrrrgn] 18.77 KB, application/x-gzip	grenade : feedback+	Details
opt.tar.gz 9 years ago Morgan Phillips [:mrrrgn] 18.66 KB, application/x-gzip		Details
opt.tar.gz 9 years ago Morgan Phillips [:mrrrgn] 18.70 KB, application/x-gzip		Details
opt.tar.gz 9 years ago Morgan Phillips [:mrrrgn] 18.71 KB, application/x-gzip		Details

Dustin J. Mitchell [:dustin] (he/him)

Reporter

Description

•

10 years ago

Start runner at startup, and then run Buildbot from it instead of from a scheduled task.

Morgan Phillips [:mrrrgn]

Assignee

Updated

•

10 years ago

Assignee: nobody → winter2718

Morgan Phillips [:mrrrgn]

Assignee

Comment 2

•

10 years ago

Attached file runner windows buildbot task — Details

This task will start buildbot on windows when run under cygwin via:

python runner.py -c runner.cfg tasks.d

Morgan Phillips [:mrrrgn]

Assignee

Comment 3

•

10 years ago

Attached file https://github.com/mrrrgn/build-runner/tree/windoze (obsolete) — Details

This branch can be used in its entirety to start runner on a Windows host. All of the changes made here can be accomplished via puppet templates, since only paths are being changed.

Morgan Phillips [:mrrrgn]

Assignee

Comment 4

•

10 years ago

Edit * You can launch buildbot just fine without using cygwin/bash.exe

Morgan Phillips [:mrrrgn]

Assignee

Updated

•

10 years ago

Depends on: 1099212

Dustin J. Mitchell [:dustin] (he/him)

Reporter

Comment 5

•

10 years ago

(In reply to Morgan Phillips [:mrrrgn] from comment #4)
> Edit * You can launch buildbot just fine without using cygwin/bash.exe

This is scary!  I vaguely recall that the effects of not running with bash were subtle -- some particular job that relied on PATH or some other environment variable.  I wonder if we shouldn't try to change this separately, just to minimize the moving pieces.

The new buildbot.py seems to share a lot in common with modules/runner/templates/tasks/buildbot.py.erb -- could those be combined?

Morgan Phillips [:mrrrgn]

Assignee

Comment 6

•

10 years ago

(In reply to Dustin J. Mitchell [:dustin] from comment #5)
> (In reply to Morgan Phillips [:mrrrgn] from comment #4)
> > Edit * You can launch buildbot just fine without using cygwin/bash.exe
> 
> This is scary!  I vaguely recall that the effects of not running with bash
> were subtle -- some particular job that relied on PATH or some other
> environment variable.  I wonder if we shouldn't try to change this
> separately, just to minimize the moving pieces.
> 
> The new buildbot.py seems to share a lot in common with
> modules/runner/templates/tasks/buildbot.py.erb -- could those be combined?

I reused the runner template exactly; in fact, if we can use puppet with the windows hosts, it will only require stubbing out template variables (same as the recent OSX changes).

In the task itself I start runner via bash.exe from the subprocess command itself so I think starting runner via bash.exe as well is redundant; especially since I'm using the '--login' argument which stubs out environment variables. https://github.com/mrrrgn/build-runner/blob/windoze/tasks.d/4-buildbot.py#L16

I'm using the '--login' argument because I noticed it in the current startup batch script btw.
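As a rough illustration of the approach described above (not the actual contents of 4-buildbot.py), a task could wrap its command in a bash login shell like this. The helper names are hypothetical, and the default bash.exe location is the MSYS path mentioned later in this bug; treat both as assumptions.

```python
import subprocess

# Assumed default: the MSYS bash location mentioned later in this bug.
MSYS_BASH = r"c:\mozilla-build\msys\bin\bash.exe"


def bash_login_command(command, bash=MSYS_BASH):
    """Return an argv list that runs `command` inside a bash login shell.

    A login shell reads its profile scripts first, which is where PATH and
    similar variables usually get set up.
    """
    return [bash, "--login", "-c", command]


def run_in_login_shell(command, bash=MSYS_BASH):
    """Run `command` through the login shell and return its exit code."""
    return subprocess.call(bash_login_command(command, bash))
```

Because the task already wraps its command this way, starting runner itself from bash.exe would add a second shell layer without changing the environment the command sees.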

Dustin J. Mitchell [:dustin] (he/him)

Reporter

Comment 7

•

10 years ago

Ah, that makes sense then -- indeed, double-bashing probably isn't necessary!

Morgan Phillips [:mrrrgn]

Assignee

Comment 8

•

10 years ago

Attached file runner-windows.zip (obsolete) — Details

Morgan Phillips [:mrrrgn]

Assignee

Comment 9

•

10 years ago

Could you help me get this deployed?

Flags: needinfo?(mcornmesser)

Mark Cornmesser [:markco] OOO 2024/04/15

Comment 10

•

10 years ago

For sure. What all are we looking to deploy? runner-windows.zip? And schedule task to start runner?

Flags: needinfo?(mcornmesser)

Morgan Phillips [:mrrrgn]

Assignee

Comment 11

•

10 years ago

(In reply to Mark Cornmesser [:markco] from comment #10)
> For sure. What all are we looking to deploy? runner-windows.zip? And
> schedule task to start runner?

Yay, yes, basically we should replace start-buildbot.bat (or whatever script is starting buildbot) with something that starts runner. To start runner, we just want to cd into its directory and use python. I've been using a batch script that just does:

cd c:\mozilla-build\runner
python runner.py -c runner.cfg tasks.d

I'll attach the batch script for review.

Morgan Phillips [:mrrrgn]

Assignee

Comment 12

•

10 years ago

Attached file start-runner.bat (obsolete) — Details

This assumes that runner-windows.zip has been unpacked to c:\mozilla-build\runner

Attachment #8524051 - Flags: review?(mcornmesser)

Dustin J. Mitchell [:dustin] (he/him)

Reporter

Comment 13

•

10 years ago

Let's not dump more stuff in mozilla-build :)

On POSIX, runner ends up in /opt/runner.  I'd be game to create C:\opt to replicate that, or put it in C:\tools (a la bug 1097349) and maybe even move runner to /tools on POSIX.

Also, mrrrgn, please check in a procedure for creating the ZIP file - even if we won't be using that particular technique in the puppety future.

Morgan Phillips [:mrrrgn]

Assignee

Comment 14

•

10 years ago

(In reply to Dustin J. Mitchell [:dustin] from comment #13)
> Let's not dump more stuff in mozilla-build :)
> 
> On POSIX, runner ends up in /opt/runner.  I'd be game to create C:\opt to
> replicate that, or put it in C:\tools (a la bug 1097349) and maybe even move
> runner to /tools on POSIX.
> 
> Also, mrrrgn, please check in a procedure for creating the ZIP file - even
> if we won't be using that particular technique in the puppety future.

Good point, and very fair. For the time being I'll create a branch that we can grab from GitHub come to think of it. github.com/mozilla/build-runner/tree/windows

Morgan Phillips [:mrrrgn]

Assignee

Updated

•

10 years ago

Attachment #8524051 - Attachment is obsolete: true

Attachment #8524051 - Flags: review?(mcornmesser)

Morgan Phillips [:mrrrgn]

Assignee

Comment 15

•

10 years ago

Attached file start-runner.bat (obsolete) — Details

I'm not sure if it's acceptable to allow mkdir/git clone to fail; but I'm going with it initially here. Especially as this is expected to be a temporary measure.

Attachment #8524239 - Flags: review?(dustin)

Morgan Phillips [:mrrrgn]

Assignee

Comment 16

•

10 years ago

Comment on attachment 8524239 [details]
start-runner.bat

Eh, actually, going to try for something a smidge better here. This is kind of lame even for a temporary fix.

Attachment #8524239 - Attachment is obsolete: true

Attachment #8524239 - Flags: review?(dustin)

Morgan Phillips [:mrrrgn]

Assignee

Comment 17

•

10 years ago

Attached file Set interpreter explicitly in runner. — Details

I had coerced runner to work on windows via shell=True in Popen, then forgot about how brittle that was. In short, setting shell=True somehow allows us to take advantage of the nice environment set up by cygwin's bash and get around the fact that cmd.exe can't deal with hashbangs natively.

That sort of dependency could lead to lots of confusion if it ever breaks down. These changes make the interpreter being used for each task explicit. We can use this on windows to make things behave nice in a way that's simple to debug.
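To make the idea concrete, here is a minimal sketch of choosing an interpreter explicitly instead of relying on shell=True. It is not the code from the pull request; the extension-to-interpreter mapping and function names are hypothetical.

```python
import os
import subprocess
import sys

# Hypothetical mapping from task file extension to the interpreter argv.
# On Windows nothing honors a "#!" line, so the interpreter is spelled out.
INTERPRETERS = {
    ".py": [sys.executable],
}


def build_command(task_path, interpreters=INTERPRETERS):
    """Return the argv used to run a task, with an explicit interpreter."""
    extension = os.path.splitext(task_path)[1].lower()
    interpreter = interpreters.get(extension)
    if interpreter is None:
        # No interpreter configured: run the file directly (e.g. a .bat).
        return [task_path]
    return list(interpreter) + [task_path]


def run_task(task_path, env=None):
    """Run a task without shell=True and return its exit code."""
    return subprocess.call(build_command(task_path), env=env)
```

With the interpreter named up front, a failure points at a specific executable rather than at whatever shell happened to be on the PATH.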

Attachment #8522628 - Attachment is obsolete: true

Attachment #8523972 - Attachment is obsolete: true

Attachment #8524299 - Flags: review?(catlee)

Morgan Phillips [:mrrrgn]

Assignee

Updated

•

10 years ago

Attachment #8524299 - Flags: review?(catlee) → review?(rail)

Mark Cornmesser [:markco] OOO 2024/04/15

Comment 18

•

10 years ago

Is cygwin going to be a prerequisite for runner?

Morgan Phillips [:mrrrgn]

Assignee

Comment 19

•

10 years ago

(In reply to Mark Cornmesser [:markco] from comment #18)
> Is cygwin going to be a prerequisite for runner?

It is. It runs buildbot via c:\mozilla-build\msys\bin\bash.exe

Justin Wood (:Callek)

Comment 20

•

10 years ago

(In reply to Morgan Phillips [:mrrrgn] from comment #19)
> (In reply to Mark Cornmesser [:markco] from comment #18)
> > Is cygwin going to be a prerequisite for runner?
> 
> It is. It runs buildbot via c:\mozilla-build\msys\bin\bash.exe

to be clear, this is not cygwin, it's MSYS. Cygwin has a whole host of conflicting things with MSYS and some complications with keeping both around.

Morgan Phillips [:mrrrgn]

Assignee

Comment 21

•

10 years ago

Attached file runner-windows.zip — Details

This is the latest runner taken from the runner windows repo: https://github.com/mozilla/build-runner-windows/archive/0.0.0.zip

Morgan Phillips [:mrrrgn]

Assignee

Comment 22

•

10 years ago

Attached file start-runner.bat (obsolete) — Details

Here I'm assuming that c:\opt\runner will exist and be populated by the archive at: https://github.com/mozilla/build-runner-windows/archive/0.0.0.zip

Attachment #8524754 - Flags: feedback?(dustin)

Rail Aliiev [:rail]

Comment 23

•

10 years ago

Comment on attachment 8524299 [details] [review]
Set interpreter explicitly in runner.

It was merged

Attachment #8524299 - Flags: review?(rail) → review+

Morgan Phillips [:mrrrgn]

Assignee

Comment 24

•

10 years ago

(In reply to Justin Wood (:Callek) from comment #20)
> (In reply to Morgan Phillips [:mrrrgn] from comment #19)
> > (In reply to Mark Cornmesser [:markco] from comment #18)
> > > Is cygwin going to be a prerequisite for runner?
> > 
> > It is. It runs buildbot via c:\mozilla-build\msys\bin\bash.exe
> 
> to be clear, this is not cygwin, it's MSYS. Cygwin has a whole host of
> conflicting things with MSYS and some complications with keeping both around.

I was confused, I see. I went with c:\mozilla-build\msys\bin\bash.exe because I noticed that's how buildbot is currently being started.

Morgan Phillips [:mrrrgn]

Assignee

Updated

•

10 years ago

Attachment #8524299 - Flags: checked-in+

Morgan Phillips [:mrrrgn]

Assignee

Comment 25

•

10 years ago

I'm noticing compilation errors on my test machine. I looked at one in production and noticed that it had this environment variable set: MAKE_MODE=unix

I believe this difference is causing the problem. http://dev-master1.srv.releng.scl3.mozilla.com:8046/builders/WINNT%205.2%20mozilla-central%20leak%20test%20non-unified/builds/89/steps/compile/logs/stdio

Of course, the real root of the issue is, I don't know how these environment variables are being set. We'll need to delay rolling out windows runner until I figure this out.

Dustin J. Mitchell [:dustin] (he/him)

Reporter

Comment 26

•

10 years ago

I don't have any particular advice -- I know env variables are a little different on Windows, and I'm sure MSYS bash and Python are doing some interesting things to make the world look POSIXy, too.  I am happy to see that the errors aren't subtle, though -- it looks like when you hit on the fix it will be pretty clear that it works.  And hopefully you'll have a good idea of *why* it works, too -- then we can have less cargo to cult around in the future :)

Morgan Phillips [:mrrrgn]

Assignee

Comment 27

•

10 years ago

Attached file start-runner.bat — Details

Attachment #8524754 - Attachment is obsolete: true

Attachment #8524754 - Flags: feedback?(dustin)

Attachment #8525481 - Flags: review?(dustin)

Morgan Phillips [:mrrrgn]

Assignee

Comment 28

•

10 years ago

Attached file start-runner-win64.bat — Details

Attachment #8525483 - Flags: review?(dustin)

Morgan Phillips [:mrrrgn]

Assignee

Comment 29

•

10 years ago

I'm trying not to cargo cult anything; but it's still a bit opaque to me how all of the parts of the environment are working together. That said, these startup scripts are working with runner, as placed in c:\opt\runner

Armen [:armenzg]

Comment 30

•

10 years ago

There are some environment variables added directly to the Windows machines through the imaging process.
Others through the buildbot start up scripts (msys/.bat files), others through b-c and others through mozharness (not sure if that applies here at all). I'm not saying that I agree with any of it :)

If you paste the log I can have a look.

I have looked at releng repos, mozilla-build and trunk and I can't find where MAKE_MODE comes from.
Does GPO install anything funky?

Morgan Phillips [:mrrrgn]

Assignee

Comment 31

•

10 years ago

This is working in my tests with the following steps btw:

mkdir c:\opt
cd c:\opt
wget https://github.com/mozilla/build-runner-windows/archive/0.0.0.zip
unzip 0.0.0
mv build-runner-windows-0.0.0 runner

then use the start-runner.bat or start-runner-win64.bat scripts.
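For hosts without wget/unzip/mv, the unpack-and-rename part of the steps above could be sketched in Python roughly as follows. The function name is made up for illustration, the archive is assumed to be downloaded already, and the top-level directory name is taken from the mv command above.

```python
import os
import zipfile


def unpack_runner(archive_path, opt_dir,
                  extracted_name="build-runner-windows-0.0.0"):
    """Extract the runner archive into opt_dir and rename it to 'runner'.

    Mirrors the manual mkdir / unzip / mv steps; downloading the archive
    is left out of this sketch.
    """
    os.makedirs(opt_dir, exist_ok=True)
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(opt_dir)
    target = os.path.join(opt_dir, "runner")
    os.rename(os.path.join(opt_dir, extracted_name), target)
    return target
```

Like the manual steps, this fails if a runner directory already exists, which is probably what you want on a fresh machine.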

Dustin J. Mitchell [:dustin] (he/him)

Reporter

Comment 32

•

10 years ago

Comment on attachment 8525481 [details]
start-runner.bat

The environment, shell, and MSYS details hidden in these patches remain mysterious to me, but I don't see anything I don't like.

Attachment #8525481 - Flags: review?(dustin) → review+

Dustin J. Mitchell [:dustin] (he/him)

Reporter

Updated

•

10 years ago

Attachment #8525483 - Flags: review?(dustin) → review+

Morgan Phillips [:mrrrgn]

Assignee

Updated

•

9 years ago

No longer blocks: 1052581

Morgan Phillips [:mrrrgn]

Assignee

Comment 33

•

9 years ago

Attached patch runner.windows.diff (obsolete) — Details — Splinter Review

The only thing I'm not clear on is how to create a startup service. I only copied start-buildbot.xml to try to make this happen. Is there more to it?

Attachment #8544231 - Flags: review?(dustin)

Dustin J. Mitchell [:dustin] (he/him)

Reporter

Comment 34

•

9 years ago

Comment on attachment 8544231 [details] [diff] [review]
runner.windows.diff

Review of attachment 8544231 [details] [diff] [review]:
-----------------------------------------------------------------

Minor bits from me, but I'd like Mark's feedback too.

::: modules/buildslave/manifests/startup/runner.pp
@@ +43,5 @@
> +                'c:/programdata/puppetagain/start-buildbot.xml':
> +                    ensure => absent;
> +            }
> +
> +        }

These probably aren't necessary - when we deploy this in prod, we'll be reimaging machines from the current non-puppet status.

::: modules/runner/manifests/init.pp
@@ +8,5 @@
>      include runner::service
>      include runner::settings
>      include packages::mozilla::python27
>  
> +    $interpreter = $runner::settings::interpreter

Might want to move this closer to the template usage below, just for clarity (I wondered for a while why the python::virtualenv parameter 'python' wasn't set to $interpreter)

::: modules/runner/manifests/service.pp
@@ +67,5 @@
> +            include dirs::programdata::puppetagain
> +
> +            $builder_username = $users::builder::username
> +            $puppet_semaphore = 'C:\ProgramData\PuppetAgain\puppetcomplete.semaphore'
> +            # Batch file to start buildbot

s/buildbot/runner/g

@@ +88,5 @@
> +            shared::execonce { "startrunner":
> +                command => "$schtasks /Create /XML c:\\programdata\\puppetagain\\start-runner.xml /tn StartRunner",
> +                require => File['c:/programdata/puppetagain/start-runner.xml'],
> +            }
> +            service {

I don't think runner ends up being a service here - instead, it's a scheduled task.  Mark can confirm..

Attachment #8544231 - Flags: review?(mcornmesser)

Attachment #8544231 - Flags: review?(dustin)

Attachment #8544231 - Flags: review+

Mark Cornmesser [:markco] OOO 2024/04/15

Comment 35

•

9 years ago

Comment on attachment 8544231 [details] [diff] [review]
runner.windows.diff

Review of attachment 8544231 [details] [diff] [review]:
-----------------------------------------------------------------

::: modules/runner/manifests/service.pp
@@ +88,5 @@
> +            shared::execonce { "startrunner":
> +                command => "$schtasks /Create /XML c:\\programdata\\puppetagain\\start-runner.xml /tn StartRunner",
> +                require => File['c:/programdata/puppetagain/start-runner.xml'],
> +            }
> +            service {

It doesn't appear that runner would be a service here. Is there something in the runner.exe that installs and starts it as a service? 
Is the "service {"runner":" resource throwing an error or warning?

Mark Cornmesser [:markco] OOO 2024/04/15

Updated

•

9 years ago

Attachment #8544231 - Flags: review?(mcornmesser) → review-

Morgan Phillips [:mrrrgn]

Assignee

Comment 36

•

9 years ago

(In reply to Mark Cornmesser [:markco] from comment #35)
> Comment on attachment 8544231 [details] [diff] [review]
> runner.windows.diff
> 
> Review of attachment 8544231 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: modules/runner/manifests/service.pp
> @@ +88,5 @@
> > +            shared::execonce { "startrunner":
> > +                command => "$schtasks /Create /XML c:\\programdata\\puppetagain\\start-runner.xml /tn StartRunner",
> > +                require => File['c:/programdata/puppetagain/start-runner.xml'],
> > +            }
> > +            service {
> 
> It doesn't appear that runner would be a service here. Is there something
> in the runner.exe that installs and starts it as a service? 
> Is the "service {"runner":" resource throwing an error or warning?

The service here does not throw a warning. Rather, not having it caused an error thanks to some other steps having a requirement for a runner "service" existing before they will execute. I added the service to avoid putting in tons of conditionals.

Dustin J. Mitchell [:dustin] (he/him)

Reporter

Comment 37

•

9 years ago

That a Service resource for an undefined service doesn't error out sounds like a puppet bug, which could be patched at any time.  That would be a nasty surprise down the road.

Rather than conditionals, you could define a $runner_service_requirements = [] on Windows and = [Service['runner']] elsewhere.

Morgan Phillips [:mrrrgn]

Assignee

Comment 38

•

9 years ago

Attached patch runner.windows.diff (obsolete) — Details — Splinter Review

This addresses the issues up above. Seems to install properly on my test box.

Attachment #8544231 - Attachment is obsolete: true

Attachment #8550996 - Flags: review?(dustin)

Morgan Phillips [:mrrrgn]

Assignee

Updated

•

9 years ago

Attachment #8550996 - Flags: review?(dustin) → review?(mcornmesser)

Morgan Phillips [:mrrrgn]

Assignee

Comment 39

•

9 years ago

Attached patch runner.windows.diff (obsolete) — Details — Splinter Review

Attachment #8550996 - Attachment is obsolete: true

Attachment #8550996 - Flags: review?(mcornmesser)

Attachment #8550997 - Flags: review?(mcornmesser)

Morgan Phillips [:mrrrgn]

Assignee

Comment 40

•

9 years ago

Attached patch runner.windows.diff — Details — Splinter Review

Sorry for the new patches, there was a template problem which only affected osx puppet clients. Fixed now.

Attachment #8550997 - Attachment is obsolete: true

Attachment #8550997 - Flags: review?(mcornmesser)

Attachment #8551423 - Flags: review?(mcornmesser)

Rail Aliiev [:rail]

Comment 41

•

9 years ago

Comment on attachment 8551423 [details] [diff] [review]
runner.windows.diff

Review of attachment 8551423 [details] [diff] [review]:
-----------------------------------------------------------------

stamping the xmls and bat files. Otherwise looks straightforward.

Attachment #8551423 - Flags: review?(mcornmesser) → review+

Morgan Phillips [:mrrrgn]

Assignee

Updated

•

9 years ago

Attachment #8551423 - Flags: checked-in+

Morgan Phillips [:mrrrgn]

Assignee

Comment 42

•

9 years ago

I don't mean to add more load; but is there any way we can go ahead and deploy these changes under our current model (non-puppet)? That will allow us to track the status of our windows hosts via the runner dashboard, which comes in handy for seeing problems pop up.

I can create a zip archive with the compiled bat/xml files if necessary. Runner is already available in our pip repository on releg-puppet2.srv.releng.scl3.mozilla.com

Flags: needinfo?(mcornmesser)

Justin Wood (:Callek)

Comment 43

•

9 years ago

(In reply to Morgan Phillips [:mrrrgn] from comment #42)
> I don't mean to add more load; but is there any way we can go ahead and
> deploy these changes under our current model (non-puppet)? That will allow
> us to track the status of our windows hosts via the runner dashboard, which
> comes in handy for seeing problems pop up.
> 
> I can create a zip archive with the compiled bat/xml files if necessary.
> Runner is already available in our pip repository on
> releg-puppet2.srv.releng.scl3.mozilla.com

fwiw, I support this, *or* I support puppet on windows turned on without runner. I don't support a split in what one does vs the other.

E.g. if we have runner in GPO we should have runner in puppet, if we don't we shouldn't. Having a divergence here will incur much more cost in terms of turning on puppet-on-windows (imho) since if there are issues we won't know what caused them easily, and have a harder time triaging/fixing them.

Morgan Phillips [:mrrrgn]

Assignee

Comment 44

•

9 years ago

I backed out the changes for now, I'll sit on the patch until windows-puppet deploys or we can get runner going via GPO.

Morgan Phillips [:mrrrgn]

Assignee

Updated

•

9 years ago

Attachment #8551423 - Flags: checked-in+ → checked-in-

Mark Cornmesser [:markco] OOO 2024/04/15

Comment 45

•

9 years ago

(In reply to Morgan Phillips [:mrrrgn] from comment #42)
> I don't mean to add more load; but is there any way we can go ahead and
> deploy these changes under our current model (non-puppet)? That will allow
> us to track the status of our windows hosts via the runner dashboard, which
> comes in handy for seeing problems pop up.
> 
> I can create a zip archive with the compiled bat/xml files if necessary.
> Runner is already available in our pip repository on
> releg-puppet2.srv.releng.scl3.mozilla.com

I am down to do this, but let's get Amy's opinion/OK or not OK to divert time to it.

Flags: needinfo?(mcornmesser) → needinfo?(arich)

Amy Rich [:arr] [:arich]

Comment 46

•

9 years ago

Talked with Morgan and Mark, and this sounds like a great use of a day or two to get us a bunch of reporting. Proceed, please!

Flags: needinfo?(arich)

Morgan Phillips [:mrrrgn]

Assignee

Comment 47

•

9 years ago

Attached file opt.tar.gz (obsolete) — Details

This has everything you should need to get runner working. After you extract it to c:\, try running c:\opt\start-runner.bat; that should start buildbot.

Attachment #8551889 - Flags: feedback?(mcornmesser)

Mark Cornmesser [:markco] OOO 2024/04/15

Updated

•

9 years ago

Blocks: 1126523

Morgan Phillips [:mrrrgn]

Assignee

Comment 48

•

9 years ago

Attached file opt-1.8.tar.gz (obsolete) — Details

This version of runner is compatible with python 2.6 and later, and comes with a new script, install-runner.bat, which will put runner into the buildbot virtual environment. The start-runner.bat script has also been modified in light of this.

This should work on any windows host by just un-tarring to /c/
then running /c/opt/install-runner.bat and /c/opt/start-runner.bat

Attachment #8551889 - Attachment is obsolete: true

Attachment #8551889 - Flags: feedback?(mcornmesser)

Attachment #8557738 - Flags: feedback?(mcornmesser)

Morgan Phillips [:mrrrgn]

Assignee

Comment 49

•

9 years ago

Attached file opt-1.8-fixed-halt.tar.gz (obsolete) — Details

Windows is failing on halt.py because of the hashbang, this changes the halt script to NT batch. It also fixes a bug in the start-runner.bat script.

Attachment #8557738 - Attachment is obsolete: true

Attachment #8557738 - Flags: feedback?(mcornmesser)

Attachment #8558694 - Flags: feedback?(mcornmesser)

Mark Cornmesser [:markco] OOO 2024/04/15

Comment 50

•

9 years ago

A GPO is ready to push runner out to the domain machines. 

How do we want to proceed?

Morgan Phillips [:mrrrgn]

Assignee

Comment 51

•

9 years ago

(In reply to Mark Cornmesser [:markco] from comment #50)
> A GPO is ready to push runner out to the domain machines. 
> 
> How do we want to proceed?

Sweet! I saw the data from your test make a nice blip on the runner dashboards.

Can we roll it out on a subset of the machines first? The first issue I can imagine is some interference between the old buildbot startup and the runner buildbot startup. 

In any case, I'm ready for this as soon as possible.

Mark Cornmesser [:markco] OOO 2024/04/15

Comment 52

•

9 years ago

I am going to needinfo Jordan on this. 

Jordan, are there a few 2008 machines we could roll runner out to? Even though we are in the beginning steps of transitioning 2008 to Puppet managed, I think this is worthwhile since the time of the transition is largely unknown.

Flags: needinfo?(jlund)

Mark Cornmesser [:markco] OOO 2024/04/15

Updated

•

9 years ago

Attachment #8558694 - Flags: feedback?(mcornmesser) → feedback+

Jordan Lund (:jlund)

Comment 53

•

9 years ago

(In reply to Mark Cornmesser [:markco] from comment #52)
>
> Jordan, are there a few 2008 machines we could roll runner out to?

sure. feel free to do 1 at a time or whatever works for you with say:
b-2008-ix-0100 -> b-2008-ix-0109

the above gives you 10 to eventually get up to and they are numerically beside each other. I checked each one out. It's hard to find a slave that has a wall of solid green build as I think win64 and some specific builders are not the most reliable. At any rate, the above do not seem any worse than the rest :)

Flags: needinfo?(jlund)

Mark Cornmesser [:markco] OOO 2024/04/15

Comment 54

•

9 years ago

Thanks Jordan!

I have disabled 0100 through 0104 to start. If the current builds finish in a timely manner I will move them into the testing OU tonight. If not I will move them in the morning.

Mark Cornmesser [:markco] OOO 2024/04/15

Comment 55

•

9 years ago

0100 through 0104 have been moved to the testing OU. I forced a group policy update and rebooted each machine.

They are ready to be re-enabled.

Phil Ringnalda (:philor)

Comment 56

•

9 years ago

They got reenabled, then I just disabled them, since they burn the compile step with "which: python2.7: unknown command".

Morgan Phillips [:mrrrgn]

Assignee

Comment 57

•

9 years ago

I modified b-2008-ix-0102 and put it back into play for the night to see if the situation improves. Sadly, the Windows path situation is black magic to me, but I used a working start-buildbot.bat from a normal slave and copied it over the buildbot task in runner. 

Keeping an eye on it, and will disable again if need be.

Phil Ringnalda (:philor)

Comment 58

•

9 years ago

Please can we test this in some other way, like say with a dev master running non-production jobs?

Either 0104 got reenabled too, or I missed disabling it, and so it burned l10n nightlies all night long, and because of the way we report them (to nowhere, basically), instead of being retriggered and wasting several hours of developer and sheriff time like the rest of the jobs we're burning, those just got thrown on the floor, and too bad if anyone wanted to test those locales today.

Phil Ringnalda (:philor)

Updated

•

9 years ago

Blocks: b-2008-ix-0101

Phil Ringnalda (:philor)

Updated

•

9 years ago

Blocks: b-2008-ix-0100

Phil Ringnalda (:philor)

Updated

•

9 years ago

Blocks: b-2008-ix-0103

Phil Ringnalda (:philor)

Updated

•

9 years ago

Blocks: b-2008-ix-0104

Phil Ringnalda (:philor)

Updated

•

9 years ago

Blocks: b-2008-ix-0102

Mark Cornmesser [:markco] OOO 2024/04/15

Comment 59

•

9 years ago

I am going to go ahead and re-image those machines. They should be ready to go back in production within the next couple of hours.

Jordan Lund (:jlund)

Comment 60

•

9 years ago

(In reply to Mark Cornmesser [:markco] from comment #59)
> I am going to go ahead and re-image those machines. They should be ready to
> go back in production within the next couple of hours.

IIUC - we are re-imaging them to pre-runner. so they are fine to all enable at once when ready. thanks markco :)

Morgan Phillips [:mrrrgn]

Assignee

Updated

•

9 years ago

Blocks: 1140425

Morgan Phillips [:mrrrgn]

Assignee

Comment 61

•

9 years ago

So, runner is handling jobs here: http://dev-master2.bb.releng.use1.mozilla.com:8046/buildslaves/ix-mn-w0864-001. It looks like there was a problem with the environment when we tried to deploy with GPO. I've solved the Makefile issue by adding this to the runner config:

[env]
MAKE_MODE=unix
MSYSTEM=MINGW32

However, I'm not convinced we won't run into a path issue. The good news is, it's simple to transplant the correct environment variables (from a working machine) to the runner config. Once we can deploy with puppet, this _should_ be trivial to get right (we manually set the path for some of our OSX machines in their runner config already for example).
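For readers unfamiliar with how an [env] section like the one above can end up in a task's environment, here is a small sketch using Python 3's configparser. It is an assumption about the general mechanism, not runner's actual code, and the function names are hypothetical.

```python
import configparser
import os

EXAMPLE_CONFIG = """
[env]
MAKE_MODE=unix
MSYSTEM=MINGW32
"""


def load_env_overrides(config_text):
    """Return the [env] section of a config as a plain dict."""
    parser = configparser.ConfigParser(interpolation=None)
    # configparser lowercases option names by default; environment
    # variable names should keep their case.
    parser.optionxform = str
    parser.read_string(config_text)
    if not parser.has_section("env"):
        return {}
    return dict(parser.items("env"))


def build_task_env(config_text, base_env=None):
    """Copy the base environment and apply the [env] overrides on top."""
    env = dict(os.environ if base_env is None else base_env)
    env.update(load_env_overrides(config_text))
    return env
```

Transplanting variables from a working machine then amounts to adding more KEY=value lines under [env].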

I'll continue running builds on this slave, and any other I can get my hands on to look for troubles.

Morgan Phillips [:mrrrgn]

Assignee

Comment 62

•

9 years ago

Attached file opt.tar.gz (obsolete) — Details

This has everything fixed up, and a bulk clobber task.

Attachment #8558694 - Attachment is obsolete: true

Morgan Phillips [:mrrrgn]

Assignee

Comment 63

•

9 years ago

bulk clobbering works => https://pastebin.mozilla.org/8833439 and jobs are completing successfully on our test machine: http://buildbot-master87.bb.releng.scl3.mozilla.com:8101/buildslaves/b-2008-ix-0058 I think we're ready to roll this baby out.

Morgan Phillips [:mrrrgn]

Assignee

Comment 64

•

9 years ago

Two problems with the latest deploy:

1.) The PATH set by runner is messed up because of problems with slashes (this is the joy of Windows paths)
"PATH=C:\mozilla-buildpython27;C:\mozilla-buildbuildbotve\scripts; .... " It should be
"PATH=C:\mozilla-build\python27;C:\mozilla-build\buildbotve\scripts; ...."

2.) Runner and the old system are running side by side. One of the boxes was in the process of running two jobs at the same time, last I checked. B-2008-IX-0057
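As a rough illustration of problem 1: the comment does not say which layer dropped the backslashes, so the sketch below only assumes that something between the config file and the process treats a backslash as an escape character. The function drop_escapes is a hypothetical stand-in for that layer, and the two conversion helpers are likewise made up for this example. Storing entries with forward slashes and converting them back with ntpath.normpath is one way to sidestep the problem.

```python
import ntpath
import re


def drop_escapes(value):
    """Hypothetical stand-in for a layer that treats backslash as an escape."""
    return re.sub(r"\\(.)", r"\1", value)


def to_config_form(windows_path):
    """Store a single path entry with forward slashes, which nothing unescapes."""
    return windows_path.replace("\\", "/")


def to_windows_form(config_path):
    """Convert a forward-slash entry back to native Windows separators."""
    return ntpath.normpath(config_path)


original = r"C:\mozilla-build\python27"
print(drop_escapes(original))        # backslash separators are lost
safe = to_config_form(original)
print(drop_escapes(safe))            # unchanged: no backslashes left to consume
print(to_windows_form(safe) == original)
```

For a full PATH value, the same helpers would be applied to each entry after splitting on ";".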

Morgan Phillips [:mrrrgn]

Assignee

Comment 65

•

9 years ago

We can tell when a job has been launched by runner because it will have the "RUNNER_CONFIG_CMD" environment variable set. See: http://buildbot-master87.bb.releng.scl3.mozilla.com:8101/builders/WINNT%206.1%20x86-64%20try%20spidermonkey_try-plain%20build/builds/60/steps/run_script/logs/stdio
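A small sketch of that check, for anyone scripting it. Only the variable name RUNNER_CONFIG_CMD comes from this bug; the helper names are hypothetical, and the assumption that a job's stdio log lists its environment as NAME=value lines is mine.

```python
import os

RUNNER_MARKER = "RUNNER_CONFIG_CMD"


def launched_by_runner(env=None):
    """True if the given environment (default: this process) carries the marker."""
    env = os.environ if env is None else env
    return RUNNER_MARKER in env


def log_shows_runner(log_text):
    """Scan a job's stdio log for the marker in its environment dump.

    Assumes the log prints the environment one NAME=value pair per line.
    """
    return any(
        line.strip().startswith(RUNNER_MARKER + "=")
        for line in log_text.splitlines()
    )


# Made-up log fragments, not copied from a real job.
sample_with = "  PATH=C:/mozilla-build/python27\n  RUNNER_CONFIG_CMD=<placeholder>\n"
sample_without = "  PATH=C:/mozilla-build/python27\n"
print(log_shows_runner(sample_with), log_shows_runner(sample_without))
```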

Morgan Phillips [:mrrrgn]

Assignee

Comment 66

•

9 years ago

I can create a new config file that attempts to mitigate this slashing issue.

Morgan Phillips [:mrrrgn]

Assignee

Comment 67

•

9 years ago

Attached file opt.tar.gz (obsolete) — Details

Here opt/runner/runner.cfg has been modified so that its PATH starts with the Python 2.7 directory. This is to keep the system's default Python from being picked up; the system Python breaks jobs because it's 2.6.
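The ordering matters because Windows searches PATH entries from left to right, so the first directory containing python.exe wins. A minimal sketch of putting one entry first, with a hypothetical helper name and a made-up starting PATH (the C:\Python26 location in particular is an assumption, not something stated in this bug):

```python
def prepend_path_entry(path_value, entry, sep=";"):
    """Put entry first in a PATH-style string, dropping any later duplicates."""
    rest = [
        p for p in path_value.split(sep)
        if p and p.lower() != entry.lower()
    ]
    return sep.join([entry] + rest)


# Hypothetical starting PATH where the system Python would be found first.
old_path = r"C:\Python26;C:\mozilla-build\python27;C:\Windows"
new_path = prepend_path_entry(old_path, r"C:\mozilla-build\python27")
print(new_path)
```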

Attachment #8605323 - Attachment is obsolete: true

Attachment #8608879 - Flags: feedback?(rthijssen)

Rob Thijssen [:grenade (EET/UTC+0300)]

Comment 68

•

9 years ago

I've updated the runner gpo files with the modified opt. I don't know what's going on with the 2 buildbot processes on 0057. The runner test pool is 0050 to 0054 so (I think) that one shouldn't be running the new runner stuff.

Rob Thijssen [:grenade (EET/UTC+0300)]

Comment 69

•

9 years ago

cancelling fb req...

Rob Thijssen [:grenade (EET/UTC+0300)]

Comment 70

•

9 years ago

Comment on attachment 8608879 [details]
opt.tar.gz

deployed

Attachment #8608879 - Flags: feedback?(rthijssen) → feedback-

Rob Thijssen [:grenade (EET/UTC+0300)]

Updated

•

9 years ago

Attachment #8608879 - Flags: feedback- → feedback+

Morgan Phillips [:mrrrgn]

Assignee

Comment 71

•

9 years ago

Looking at the jobs running on the 0050 to 0054 machines, I'm not seeing the RUNNER_CONFIG_CMD environment variable set (which is always set by runner), so it looks like the buildbot here was still started using the old script. :(

Amy Rich [:arr] [:arich]

Updated

•

9 years ago

Blocks: 1154062

Morgan Phillips [:mrrrgn]

Assignee

Comment 72

•

9 years ago

Attached file opt.tar.gz (obsolete) — Details

The Windows runner was trying to run every task with Python, which broke batch scripts.
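One way to avoid that class of problem is to pick the interpreter from the task's file extension. The sketch below is not the change that went into the tarball; the function name, the extension table, and the example file names are all assumptions made for illustration.

```python
import os
import sys


def command_for_task(task_path, python=sys.executable):
    """Build an argv list for a task based on its file extension."""
    ext = os.path.splitext(task_path)[1].lower()
    if ext == ".py":
        return [python, task_path]
    if ext in (".bat", ".cmd"):
        # Batch scripts need the Windows command interpreter, not Python.
        return ["cmd.exe", "/c", task_path]
    # Anything else is assumed to be directly executable.
    return [task_path]


# Hypothetical task file names.
print(command_for_task("example_task.py", python="python.exe"))
print(command_for_task("example_task.bat"))
```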

Attachment #8608879 - Attachment is obsolete: true

Morgan Phillips [:mrrrgn]

Assignee

Comment 73

•

9 years ago

Installed the new tarball on B-2008-IX-0057 with successful clobber and bb initialization.

Rob Thijssen [:grenade (EET/UTC+0300)]

Comment 74

•

9 years ago

New opt installed via GPO on:
 - B-2008-IX-0050
 - B-2008-IX-0051
 - B-2008-IX-0052
 - B-2008-IX-0053
 - B-2008-IX-0054

Morgan Phillips [:mrrrgn]

Assignee

Comment 75

•

9 years ago

There's a clobberer bug. Have we seen this before on Windows?

2015-05-28 06:06:27,078 - try_w32-d_sm-plaindebug-000000:Server is forcing a clobber, initiated by rvandermeulen@mozilla.com
2015-05-28 06:06:27,078 - try_w32-d_sm-plaindebug-000000:Clobbering...
2015-05-28 06:06:27,078 - Skipping buildprops.json
2015-05-28 06:06:27,079 - Skipping last-clobber
2015-05-28 06:06:27,079 - Skipping scripts
2015-05-28 06:06:27,079 - Removing src.deleteme/
c:/opt/clobberer.py:163: DeprecationWarning: BaseException.message has been deprecated as of Python 2.6
  log.error("[clobber failed]: %s", e.message)
2015-05-28 06:06:27,082 - [clobber failed]:

Morgan Phillips [:mrrrgn]

Assignee

Comment 76

•

9 years ago

Attached file opt.tar.gz (obsolete) — Details

Attachment #8612249 - Attachment is obsolete: true

Morgan Phillips [:mrrrgn]

Assignee

Comment 77

•

9 years ago

Attached file opt.tar.gz — Details

Fixes a string encoding error that the clobberer client had (windows filesystem encoding was not utf-8 compatible) and introduces better logging.
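The actual patch is in the tarball and not shown here, so the following is only a sketch of the two ideas, written for current Python rather than the 2.x interpreter the clobberer ran on. The names describe_error and safe_text are hypothetical. The first helper always logs the exception type, so a failure with an empty message no longer produces a blank "[clobber failed]:" line; the second decodes filesystem bytes without raising on sequences that are not valid UTF-8.

```python
import logging

log = logging.getLogger("clobberer-sketch")


def describe_error(exc):
    """Include the exception type so an empty message still says something."""
    return "%s: %s" % (type(exc).__name__, exc)


def safe_text(raw, encoding="utf-8"):
    """Decode bytes from the filesystem, replacing undecodable sequences."""
    if isinstance(raw, bytes):
        return raw.decode(encoding, errors="replace")
    return raw


try:
    raise OSError()  # an exception with no message at all
except OSError as e:
    log.error("[clobber failed]: %s", describe_error(e))
```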

Morgan Phillips [:mrrrgn]

Assignee

Updated

•

9 years ago

Attachment #8614672 - Attachment is obsolete: true

Morgan Phillips [:mrrrgn]

Assignee

Updated

•

9 years ago

Blocks: 930826

Comment 78

•

9 years ago

Working to deploy the new opt.tar.gz to b-2008-ix-0050 - 0054.

Comment 79

•

9 years ago

The WMI filters and item-level targets for the GPO need some review before this gets pushed, but it will be done today.

Comment 80

•

9 years ago

The GPO logic is now correct and the install only runs once. I checked 0054 and 0052, and they look like they are getting the right config and runner is starting.

Morgan Phillips [:mrrrgn]

Assignee

Updated

•

9 years ago

Blocks: 1173945

Updated

•

9 years ago

Depends on: 1169789

Comment 81

•

9 years ago

expanding runner gpo to encompass b-2008-ix-0020 - 0059

Comment 82

•

9 years ago

Per conversations with mrrrgn, the expanded try roll-out looks good. We are going to let this run over the weekend and chat about expanding to all try machines during the work week.

Morgan Phillips [:mrrrgn]

Assignee

Comment 83

•

9 years ago

Looking to roll out runner to all try machines today.

Comment 84

•

9 years ago

All try machines confirmed. Per conversations at Whistler, b-2008-ix-0171 is using runner. Next step is to add 0171 - 0173, as the rest of 017x are in try and using runner already.

Q

Comment 85

•

9 years ago

The content of the current wmi filter used to include machines in the runner deployment process:

select * from Win32_ComputerSystem where (Name LIKE "B-2008-IX-005%" OR Name LIKE "B-2008-IX-004%" OR Name LIKE "B-2008-IX-003%" OR Name LIKE "B-2008-IX-002%" OR NAME = "B-2008-IX-0060" OR NAME = "B-2008-IX-0061" OR NAME = "B-2008-IX-0062" OR NAME = "B-2008-IX-0063" OR NAME = "B-2008-IX-0064" OR NAME = "B-2008-IX-0170" OR NAME = "B-2008-IX-0173" OR NAME = "B-2008-IX-0174" OR NAME = "B-2008-IX-0175" OR NAME = "B-2008-IX-0176" OR NAME = "B-2008-IX-0177" OR NAME = "B-2008-IX-0178" OR NAME = "B-2008-IX-0179" OR NAME = "B-2008-IX-0180" OR NAME = "B-2008-IX-0181" OR NAME = "B-2008-IX-0182" OR NAME = "B-2008-IX-0183" OR NAME = "B-2008-IX-0184" )

And to exclude runner machines from having the buildbot schedule task reinstalled:

select * from Win32_ComputerSystem where NOT  (Name LIKE "B-2008-IX-005%" OR Name LIKE "B-2008-IX-004%" OR Name LIKE "B-2008-IX-003%" OR Name LIKE "B-2008-IX-002%" OR NAME = "B-2008-IX-0060" OR NAME = "B-2008-IX-0061" OR NAME = "B-2008-IX-0062" OR NAME = "B-2008-IX-0063" OR NAME = "B-2008-IX-0064" OR NAME = "B-2008-IX-0170" OR NAME = "B-2008-IX-0173" OR NAME = "B-2008-IX-0174" OR NAME = "B-2008-IX-0175" OR NAME = "B-2008-IX-0176" OR NAME = "B-2008-IX-0177" OR NAME = "B-2008-IX-0178" OR NAME = "B-2008-IX-0179" OR NAME = "B-2008-IX-0180" OR NAME = "B-2008-IX-0181" OR NAME = "B-2008-IX-0182" OR NAME = "B-2008-IX-0183" OR NAME = "B-2008-IX-0184" )

Comment 86

•

9 years ago

Adding B-2008-IX-017x to the mix. This only adds 0171 - 0173 to the build runner testing pool, as the other machines are already in try or testing for Puppet.

Comment 87

•

9 years ago

Per IRC discussions added the rest of 018x and 006x for the sake of filter cleanliness and tracking:

select * from Win32_ComputerSystem where (Name LIKE "B-2008-IX-018%" OR Name LIKE "B-2008-IX-017%" OR NAME LIKE "B-2008-IX-006%" OR Name LIKE "B-2008-IX-005%" OR Name LIKE "B-2008-IX-004%" OR Name LIKE "B-2008-IX-003%" OR Name LIKE "B-2008-IX-002%")

select * from Win32_ComputerSystem where NOT (Name LIKE "B-2008-IX-018%" OR Name LIKE "B-2008-IX-017%" OR NAME LIKE "B-2008-IX-006%" OR Name LIKE "B-2008-IX-005%" OR Name LIKE "B-2008-IX-004%" OR Name LIKE "B-2008-IX-003%" OR Name LIKE "B-2008-IX-002%")
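Since the include and exclude filters must always stay exact complements of each other, it may be less error-prone to generate both from one list. A small sketch, assuming the prefix list below (copied from the filters above) is the single source of truth; the function name is hypothetical. It writes "Name" with consistent casing, which as far as I know makes no difference because WQL keywords and property names are case-insensitive.

```python
def wmi_name_filter(prefixes, exact=(), negate=False):
    """Build a WQL filter on Win32_ComputerSystem.Name from prefixes and names."""
    clauses = ['Name LIKE "%s%%"' % p for p in prefixes]
    clauses += ['Name = "%s"' % n for n in exact]
    condition = "(" + " OR ".join(clauses) + ")"
    if negate:
        condition = "NOT " + condition
    return "select * from Win32_ComputerSystem where " + condition


RUNNER_PREFIXES = [
    "B-2008-IX-018", "B-2008-IX-017", "B-2008-IX-006", "B-2008-IX-005",
    "B-2008-IX-004", "B-2008-IX-003", "B-2008-IX-002",
]

include_filter = wmi_name_filter(RUNNER_PREFIXES)               # runner deployment
exclude_filter = wmi_name_filter(RUNNER_PREFIXES, negate=True)  # buildbot task reinstall
print(include_filter)
print(exclude_filter)
```

Expanding the pool then means adding one prefix to the list and regenerating both strings.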

Comment 88

•

9 years ago

Expanding to include b-2008-ix-000*, b-2008-ix-001* and b-2008-ix-016*. That adds 27 machines using runner.

Rob Thijssen [:grenade (EET/UTC+0300)]

Updated

•

9 years ago

Blocks: 1183535

Rob Thijssen [:grenade (EET/UTC+0300)]

Updated

•

9 years ago

Whiteboard: [windows]

Rob Thijssen [:grenade (EET/UTC+0300)]

Updated

•

9 years ago

Depends on: 1185909

Chris Cooper [:coop] (he/him)

Updated

•

9 years ago

Blocks: 1190868

Morgan Phillips [:mrrrgn]

Assignee

Updated

•

9 years ago

Status: NEW → RESOLVED

Closed: 9 years ago

Resolution: --- → FIXED

Nobody; OK to take it and work on it

Updated

•

6 years ago

Component: General Automation → General
