Closed Bug 1459376 Opened 6 years ago Closed 6 years ago

Log information about files downloaded via "mounts"

Categories

(Taskcluster :: Workers, enhancement)

enhancement
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: gps, Assigned: pmoore)

References

Details

Attachments

(1 file)

AFAICT, when "mounts" is used to retrieve remote files (either from a URL or a task artifact), there is no indication of this in the task output unless an error occurs.

There are a handful of problems with this:

a) We don't how long downloads are taking. This makes it hard to understand what a task is doing, whether downloads are too slow, etc.

b) It isn't obvious things are working as expected. If I'm looking at the log of a task that is supposed to download something, I want to see confirmation that that actually occurred. e.g. if I'm debugging a "file not found" error I'm not sure if that's because the "mounts" download failed, whether I'm referencing the incorrect local path, etc.

c) Undermines security auditing.

I'd ideally like to see the following in log output:

* The time it took to download. This could be logging an elapsed time. Or it could be inferred by printing multiple lines with "started download" "ended download" etc.

* The URL being downloaded, ideally including any HTTP redirects that are followed.

* The local path where the file is saved.

* The total size of the downloaded file.

* The hash of the downloaded file (for security auditing).
I like this. I'll take a stab at it.
:gps, I've taken an initial stab at this in https://github.com/taskcluster/generic-worker/pull/89

I'm stopping for the day now, but please feel free to take a look.

I've implemented the logging you requested, and also made "sha256" an optional property for url content / artifact content in the task mounts, so that the task can declare the required SHA256 of content it refers to. If the SHA256 doesn't match, the task will fail with MalformedPayload (perhaps a different resolution would be better).

I haven't written an unit tests yet, but this is the work I'm parking for the evening, and will continue tomorrow.

Thanks for raising the bug!
Flags: needinfo?(gps)
Assignee: nobody → pmoore
Status: NEW → ASSIGNED
Comment on attachment 8973787 [details] [review]
Github Pull Request for generic-worker

Details in the PR...
Attachment #8973787 - Flags: review?(jhford)
Attachment #8973787 - Flags: review?(gps)
(In reply to Gregory Szorc [:gps] from comment #0)

> * The URL being downloaded, ideally including any HTTP redirects that are
> followed.

In the case of artifacts, I don't include the URL since they are signed URLs, so logging them could be a problem if they are private artifacts. However, in the PR I am logging the taskID and the artifact name.

Feel free to suggest alternative wording or layout of the log lines themselves, e.g. to make them easier to parse, or to include explicit time measurements of activities etc etc.
Depends on: 1460267
I've added the dependency, as the win10 CI is broken at the moment which means we can't run test the changes on win10 at the moment, so we shouldn't really roll out without those tests running.
Comment on attachment 8973787 [details] [review]
Github Pull Request for generic-worker

This is more an r+ of the design, not the implementation, as I'm not qualified to review the code.
Flags: needinfo?(gps)
Attachment #8973787 - Flags: review?(gps) → review+
Attachment #8973787 - Flags: review?(jhford) → review+
Commits pushed to master at https://github.com/taskcluster/generic-worker

https://github.com/taskcluster/generic-worker/commit/1365abcfbfa1b48dea05da83e6d1a6add71a217d
Bug 1459376 - log activity in [mounts] feature and add optional sha256 property for downloadable content

https://github.com/taskcluster/generic-worker/commit/3809e99e74f5915dd50b4954a804dd57560f7ffd
Bug 1459376 - Don't retain downloads where the declared/required/expected SHA256 doesn't match calculated/actual value

https://github.com/taskcluster/generic-worker/commit/da37f61d4cfc321a6c76003f4a0b095c6aaca857
Merge pull request #89 from taskcluster/bug1459376

Bug 1459376 - log actions taken by mounts feature, and validate SHA256 of content
Released in generic-worker 10.8.0.
Does this mean we can rely on this feature being available <now> (or at least a short time from <now> once all workers are recycled)? i.e. can I start submitting patches that use the new feature?
I've released to staging, and have a try push incoming at the moment using the staging worker types. If that goes well, we should be able to deploy to production in the next few days. :-)
Depends on: 1461901
Status: ASSIGNED → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Component: Generic-Worker → Workers
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: