Closed Bug 1286123 Opened 8 years ago Closed 8 years ago

Slave loan request for a try-linux64-ec2 vm

Categories

(Infrastructure & Operations Graveyard :: CIDuty, task)

x86_64
Linux
task
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: impossibus, Assigned: impossibus)

Details

(Whiteboard: [buildduty][capacity][buildslaves][loaner])

I need a loaner to run desktop builds as part of working on bug 1278698. Thanks!
Email sent to maja for further instructions. 

Loaning machines: 
- try-linux64-ec2

Hi Maja, 

I am going to assign this to you to keep track of the loan. 

When you are finished with the loan forever, please comment stating so here in the bug, and mark the bug as RESOLVED.

By the way, now that this aws instance has been created, starting and stopping it can happen in a flash!
If you are not going to be using this machine for multiple hours, let us know in this bug and we can stop it.

Comment again when you want it started back up.
* For faster turnaround, ping #releng (look for nick with 'buildduty')
Assignee: nobody → mjzffr
re: irc today, maja had issues getting mock working. after some trial and error we discovered that `runner` is in charge of setting up mock. I thought recreating the machine, so we would have secrets and /opt/runner/* paths available again, would do the trick.

unfortunately, it seems like even after re-create and reboots, runner refused to run on its own. not sure why...

[root@dev-linux64-ec2-maja.dev.releng.use1.mozilla.com ~]# runlevel
N 3
[root@dev-linux64-ec2-maja.dev.releng.use1.mozilla.com ~]# chkconfig --list | grep runner
runner          0:off   1:off   2:on    3:on    4:on    5:on    6:off

so I forced it:
[root@dev-linux64-ec2-maja.dev.releng.use1.mozilla.com ~]# /etc/init.d/runner start
Starting runner

which failed miserably on some random runner task. not surprised since this is not a worker/slave in the pool. I removed all the tasks bar the `/opt/runner/tasks.d/3-config_mockbuild` task. and reran:

2016-07-14 15:23:00,753 - DEBUG - running pre-task hook: /opt/runner/task_hook.py {"try_num": 1, "max_retries": 5, "task": "3-config_mockbuild", "result": "RUNNING"}
No twistd long found, can't find last_build_name
2016-07-14 15:23:01,756 - DEBUG - 3-config_mockbuild: starting (max time 360s)
2016-07-14 15:23:02,758 - DEBUG - 3-config_mockbuild: OK
2016-07-14 15:23:02,759 - DEBUG - running post-task hook: /opt/runner/task_hook.py {"try_num": 1, "max_retries": 5, "task": "3-config_mockbuild", "result": "OK"}

and now we have mock path available:

[root@dev-linux64-ec2-maja.dev.releng.use1.mozilla.com ~]# ls /etc/mock_mozilla/
b2g.cfg                             logging.ini                         mozilla-centos6-x86_64.cfg          mozilla-f16-x86_64.cfg
config-templates/                   mozilla-centos6-x86_64-android.cfg  mozilla-f16-i386.cfg                site-defaults.cfg

I then cleaned the machine of secrets and rebooted it.
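
For anyone repeating the "keep only the mock task" step above, here is a minimal sketch in Python. It is an illustration only, not what was actually run on this machine: it moves the other tasks into a backup directory instead of removing them, and the function name and backup location are hypothetical. Only `/opt/runner/tasks.d` and `3-config_mockbuild` come from this bug.

```python
import shutil
from pathlib import Path


def isolate_runner_task(tasks_dir, keep, backup_dir):
    """Move every entry in tasks_dir except `keep` into backup_dir.

    Returns the sorted names of the entries that were moved aside.
    Hypothetical helper: the steps described in this bug removed the
    other tasks outright; moving them is an assumption made here so the
    change can be undone.
    """
    tasks = Path(tasks_dir)
    backup = Path(backup_dir)
    if not (tasks / keep).exists():
        raise FileNotFoundError(f"{keep} not found in {tasks}")
    backup.mkdir(parents=True, exist_ok=True)
    moved = []
    for entry in sorted(tasks.iterdir()):
        if entry.name != keep:
            shutil.move(str(entry), str(backup / entry.name))
            moved.append(entry.name)
    return moved
```

On a loaner this might be called as `isolate_runner_task("/opt/runner/tasks.d", "3-config_mockbuild", "/root/runner-tasks.bak")` (the backup path is made up), followed by `/etc/init.d/runner start` as shown earlier.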

new fqdn/ip: dev-linux64-ec2-maja.dev.releng.use1.mozilla.com (10.134.53.30)


maja: can you try again? same original user/pw from loan email

this raises some questions. are we expecting runner to run at least once post aws_create_instance.py? are we expecting developers to be able to run linux builds with mock when loaning slaves? If yes to either of these, why is runner (and thus mock setup) running during loan create process? What's changed?
Flags: needinfo?(mjzffr)
(In reply to Jordan Lund (:jlund) from comment #2)

> If yes to either of these, why is runner (and thus mock setup) running during loan create process? What's
> changed?

sentence above should be: "why is runner (and thus mock setup) *not* running"

(In reply to Jordan Lund (:jlund) from comment #2)

> this raises some questions. are we expecting runner to run at least once
> post aws_create_instance.py? are we expecting developers to be able to run
> linux builds with mock when loaning slaves? If yes to either of these, why
> is runner (and thus mock setup) running during loan create process? What's
> changed?

I created another instance and ran some tests on it. Runner seems to work when the instance is created, but we have to clean the machine before we can loan it, and that cleanup deletes /opt/runner, so runner stops.
Since `runner` is in charge of setting up mock and /opt/runner is deleted before the loan, developers should not expect to be able to run Linux builds with mock on a loaned slave, unless runner is configured to run again, as Jordan did on Maja's machine.
(In reply to Jordan Lund (:jlund) from comment #2)
> new fqdn/ip: dev-linux64-ec2-maja.dev.releng.use1.mozilla.com (10.134.53.30)
> 
> maja: can you try again? same original user/pw from loan email

I can't connect.

$ ssh dev-linux64-ec2-maja.dev.releng.use1.mozilla.com
ssh: connect to host dev-linux64-ec2-maja.dev.releng.use1.mozilla.com port 22: Operation timed out

I am connected to VPN. Tried connecting to the given IP as well.

Also tried previous hostname just in case:
$  ssh cltbld@try-linux64-ec2-maja.try.releng.use1.mozilla.com
ssh: Could not resolve hostname try-linux64-ec2-maja.try.releng.use1.mozilla.com: nodename nor servname provided, or not known
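
The two failures above are different in kind: the first is a TCP timeout on port 22, the second is a name that does not resolve. A rough sketch for telling them apart without ssh, using only the Python standard library, could look like this. The function names are hypothetical, and the labels are just a convenience, not anything releng tooling provides.

```python
import socket


def classify_connect_error(exc):
    """Map a connection exception to a short label."""
    if isinstance(exc, socket.gaierror):
        return "dns"        # hostname did not resolve
    if isinstance(exc, (socket.timeout, TimeoutError)):
        return "timeout"    # no answer, e.g. blocked by VPN or firewall rules
    if isinstance(exc, ConnectionRefusedError):
        return "refused"    # host reachable but nothing listening on the port
    return "other"


def probe(host, port=22, timeout=5.0):
    """Try a plain TCP connection and report 'ok' or a failure label."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "ok"
    except OSError as exc:
        return classify_connect_error(exc)
```

A "timeout" result for a host that does resolve points at network access rather than at the machine itself, so that is the case worth raising with buildduty.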
Flags: needinfo?(mjzffr)
(In reply to Andrei Obreja [:aobreja][:buildduty] from comment #4)
> As `runner` is in charge of setting up mock and before the loan /opt/runner
> is deleted we expecting developers not to be able to run linux builds with
> mock when loaning slave except the case when they will configure runner to
> run again as you Jordan did on Maja's machine.

Is this documented anywhere? 

If not, adding the necessary steps to https://wiki.mozilla.org/ReleaseEngineering/Mozharness/How_to_run_tests_as_a_developer would be helpful.
Flags: needinfo?(aobreja)
(In reply to Maja Frydrychowicz (:maja_zf) from comment #5)
> (In reply to Jordan Lund (:jlund) from comment #2)
> > new fqdn/ip: dev-linux64-ec2-maja.dev.releng.use1.mozilla.com (10.134.53.30)
> > 
> > maja: can you try again? same original user/pw from loan email
> 
> I can't connect.

Was able to connect after :jlund changed VPN perms, I believe.

The setup-mock step progresses further, but I still hit errors for missing items like `/builds/gapi.data`:
>09:59:24     INFO - Copy/paste: mock_mozilla -r mozilla-centos6-x86_64 --copyin --unpriv /builds/gapi.data /builds/gapi.data
>09:59:24     INFO -  INFO: mock_mozilla.py version 1.0.3 starting...
>09:59:24     INFO -  State Changed: init plugins
>09:59:24     INFO -  INFO: selinux disabled
>09:59:24     INFO -  State Changed: start
>09:59:24     INFO -  State Changed: lock buildroot
>09:59:24     INFO -  Mock Version: 1.0.3
>09:59:24     INFO -  INFO: Mock Version: 1.0.3
>09:59:24     INFO -  INFO: copying /builds/gapi.data to /builds/mock_mozilla/mozilla-centos6-x86_64/root/builds/gapi.data
>09:59:24     INFO -  ERROR: [Errno 2] No such file or directory: '/builds/gapi.data'
>09:59:24     INFO -  Traceback (most recent call last):
>09:59:24     INFO -    File "/usr/sbin/mock_mozilla", line 862, in <module>
>09:59:24     INFO -      main(retParams)
>09:59:24     INFO -    File "/usr/sbin/mock_mozilla", line 823, in main
>09:59:24     INFO -      shutil.copy(src, dest)
>09:59:24     INFO -    File "/usr/lib64/python2.6/shutil.py", line 84, in copy
>09:59:24     INFO -      copyfile(src, dst)
>09:59:24     INFO -    File "/usr/lib64/python2.6/shutil.py", line 50, in copyfile
>09:59:24     INFO -      with open(src, 'rb') as fsrc:
>09:59:24     INFO -  IOError: [Errno 2] No such file or directory: '/builds/gapi.data'

I will see how far I can get with just faking these missing pieces, but again, please point me to any info regarding how to set up the prerequisites properly. Thanks for your help so far.
Ok, I think I have enough of the mozharness steps working to do what I need for now. I ended up commenting out a few of the mock_files specified in scripts/configs/builds/releng_base_linux_64_builds.py
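
As an alternative to commenting entries out by hand, a small sketch like the following could separate the `mock_files` entries whose source path exists on the loaner from those that do not. It assumes each entry is a tuple whose first element is the source path, which matches the `--copyin` call in the log above but should be checked against the actual config. The helper name is hypothetical and this is not part of mozharness.

```python
import os


def split_mock_files(mock_files, exists=os.path.exists):
    """Split mock_files entries into (present, missing) by source path.

    Assumption: each entry is a tuple whose first element is the source
    path on the host. `exists` is injectable so the check can be tested
    without touching the filesystem.
    """
    present, missing = [], []
    for entry in mock_files:
        src = entry[0]
        (present if exists(src) else missing).append(entry)
    return present, missing
```

For example, `split_mock_files([("/builds/gapi.data", "/builds/gapi.data")])` on this loaner would be expected to report that entry as missing, given the `IOError` in the log above.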
Email sent to maja for further instructions. 

New loaning machine: 
- try-linux64-ec2-maja-2.try.releng.use1.mozilla.com

Hi Maja, 

I am going to assign this to you to keep track of the loan. 
I have created a new try-linux64 instance on which we first let runner run before removing it.
Please run your tests on this new instance and give us feedback.
Flags: needinfo?(aobreja)
Thanks! 
I'm going to continue using dev-linux64-ec2-maja.dev.releng.use1.mozilla.com since it does enough of what I need right now. Email sent to Andrei with details.
I'm done with this loaner.
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Machine dev-linux64-ec2-maja.dev.releng.use1.mozilla.com was terminated.
Component: Loan Requests → Buildduty
Product: Release Engineering → Infrastructure & Operations
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard