Closed Bug 1593483 Opened 5 years ago Closed 5 years ago

generic-worker looks for deploymentId in the wrong place

Categories

(Taskcluster :: Workers, defect)

Type: defect
Priority: Not set
Severity: normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: dustin, Assigned: pmoore)

Details

Attachments

(1 file)

For the aws and google provider types, the worker configuration (including deploymentId) lives in the individual launchConfigs of the worker pool definition. This is not where generic-worker looks for it.

There's a question of what to do if there are multiple launchConfigs, and they have different deploymentIds. Let's ignore that for the moment, and plan to fix this in a nice way in bug 1593482.

Released in generic-worker 16.5.3.

Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED

Unfortunately there was another bug: if generic-worker found a generic-worker config file on the host, it did not replace the function used to fetch the latest deployment ID, so it kept using the default, which just checks the local file.

The solution I've implemented is to have a gwconfig.Provider interface that has an UpdateConfig(c *gwconfig.Config) error method for updating the generic-worker config settings, and a NewestDeploymentID() (string, error) method for fetching the newest deployment ID.
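
As a rough illustration of the shape of that interface, here is a minimal, self-contained sketch. Only the two method signatures come from the description above; the local Config struct (standing in for gwconfig.Config), its DeploymentID field, and the StaticProvider type are assumptions made up for the example, not the actual generic-worker code.

```go
package main

import (
	"errors"
	"fmt"
)

// Config stands in for gwconfig.Config. The single field is an assumption
// made for this sketch, not the real struct layout.
type Config struct {
	DeploymentID string
}

// Provider mirrors the two methods described in the comment above.
type Provider interface {
	// UpdateConfig fills in generic-worker config settings.
	UpdateConfig(c *Config) error
	// NewestDeploymentID reports the newest deployment ID known to the provider.
	NewestDeploymentID() (string, error)
}

// StaticProvider is a hypothetical in-memory provider, for illustration only.
type StaticProvider struct {
	Newest string
}

// UpdateConfig copies the provider's deployment ID into the config.
func (p *StaticProvider) UpdateConfig(c *Config) error {
	if c == nil {
		return errors.New("nil config")
	}
	c.DeploymentID = p.Newest
	return nil
}

// NewestDeploymentID returns the fixed value held by the provider.
func (p *StaticProvider) NewestDeploymentID() (string, error) {
	return p.Newest, nil
}

func main() {
	var p Provider = &StaticProvider{Newest: "d1"}
	c := &Config{}
	if err := p.UpdateConfig(c); err != nil {
		fmt.Println("update failed:", err)
		return
	}
	id, err := p.NewestDeploymentID()
	if err != nil {
		fmt.Println("lookup failed:", err)
		return
	}
	fmt.Println(c.DeploymentID, id)
}
```

Keeping both methods on one interface means each provider type can decide for itself where the deployment ID lives, which is the point of the fix.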

The provider is set on startup, so the correct method will be used for fetching the deployment ID. The UpdateConfig method of the provider will be called if there is not already a generic-worker config file available, otherwise the cached config file will be loaded. This is also how updating config worked before, but the difference is that the deployment ID fetching logic is now independent of whether there is a cached config file or not.
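
The startup behaviour described above could be sketched roughly as follows. This is not the real implementation: the JSON file format, the loadConfig helper, the fixedProvider type, and the local Config struct are all hypothetical and exist only to show that config loading prefers the cached file, while the deployment ID lookup always goes through the provider.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"os"
	"path/filepath"
)

// Config stands in for gwconfig.Config; the field and JSON key are assumptions.
type Config struct {
	DeploymentID string `json:"deploymentId"`
}

// Provider has the two methods described in the comment above.
type Provider interface {
	UpdateConfig(c *Config) error
	NewestDeploymentID() (string, error)
}

// fixedProvider is a hypothetical provider that always reports the same ID.
type fixedProvider struct{ id string }

func (f fixedProvider) UpdateConfig(c *Config) error        { c.DeploymentID = f.id; return nil }
func (f fixedProvider) NewestDeploymentID() (string, error) { return f.id, nil }

// loadConfig (hypothetical) loads the cached config file if one exists and
// only asks the provider to populate the config when the file is missing.
func loadConfig(path string, p Provider) (*Config, error) {
	c := &Config{}
	data, err := os.ReadFile(path)
	switch {
	case err == nil:
		if err := json.Unmarshal(data, c); err != nil {
			return nil, err
		}
		return c, nil
	case errors.Is(err, os.ErrNotExist):
		if err := p.UpdateConfig(c); err != nil {
			return nil, err
		}
		return c, nil
	default:
		return nil, err
	}
}

func main() {
	dir, err := os.MkdirTemp("", "gw-sketch")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(dir)

	// The provider is chosen once, at startup.
	p := fixedProvider{id: "new"}

	// Write a cached config file that carries an older deployment ID.
	cached := filepath.Join(dir, "config.json")
	if err := os.WriteFile(cached, []byte(`{"deploymentId":"old"}`), 0o600); err != nil {
		panic(err)
	}

	c, err := loadConfig(cached, p)
	if err != nil {
		panic(err)
	}
	// The newest ID still comes from the provider, not from the cached file.
	newest, err := p.NewestDeploymentID()
	if err != nil {
		panic(err)
	}
	fmt.Println("cached:", c.DeploymentID, "newest:", newest)
}
```

In this sketch the cached file wins for config loading, yet the provider still reports a different newest deployment ID, which is the situation the earlier release handled incorrectly.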

Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Assignee: nobody → pmoore
Status: REOPENED → ASSIGNED

Note: I'm deploying to aws-provisioner-v1/gwci-linux-beta from the PR branch to test that the changes work.

Success!

2019/11/05 19:50:55 UTC New deploymentId found! "A4FTIg5EQTW3thMLb3aZug" => "fred" - therefore shutting down!

This was after running two different tasks on the same aws-provisioner-v1/gwci-linux-beta us-west-2 worker (i-0c2b19391977926c1). So things are at least working on AWS Provisioner.
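
For readers unfamiliar with the check behind that log line, the comparison might look something like the sketch below. The checkDeployment helper and its signature are hypothetical; only the message format is modelled on the log line quoted above, and the real generic-worker logic may differ.

```go
package main

import "fmt"

// checkDeployment (hypothetical) compares the deployment ID the worker
// started with against the newest one reported by the lookup function.
// It returns whether the worker should shut down, plus a log message.
func checkDeployment(current string, newest func() (string, error)) (bool, string, error) {
	latest, err := newest()
	if err != nil {
		return false, "", err
	}
	if latest == current {
		return false, "", nil
	}
	msg := fmt.Sprintf("New deploymentId found! %q => %q - therefore shutting down!", current, latest)
	return true, msg, nil
}

func main() {
	// The lookup function here is a stand-in for a provider's NewestDeploymentID.
	lookup := func() (string, error) { return "fred", nil }

	shutdown, msg, err := checkDeployment("A4FTIg5EQTW3thMLb3aZug", lookup)
	if err != nil {
		fmt.Println("check failed:", err)
		return
	}
	if shutdown {
		fmt.Println(msg)
	}
}
```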

I'll also deploy using AWS Provider, and check it works there too.

I've updated pmoore-test/gwci-linux to use the same AMIs under AWS Provider instead of AWS Provisioner. I successfully submitted a task, and I manually updated the worker pool config to change the deployment IDs. I'm now tailing the logs on i-0f9fa55fe51ca14e6 (after modifying its Security Groups to allow ssh) to wait for the deployment ID check. In ten minutes or so we should see if this is working under AWS Provider too.

Success under AWS Provider too:

2019/11/05 20:37:01 UTC New deploymentId found! "KSjGK16ySI-A7Cs-DGAnLg" => "d1" - therefore shutting down!

Released in generic-worker 16.5.4.

Status: ASSIGNED → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED