Closed Bug 1123390 Opened 9 years ago Closed 6 years ago

Use allthethings as the canonical location for builder->machine mappings

Categories

(Release Engineering :: General, defect)

x86_64
Linux
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: coop, Unassigned)

References

Details

(Whiteboard: [buildduty][slavehealth])

We'd like to start pulling more accurate and timely data into slave health (bug 1123371). As part of this we'd like to include the data/graphs generated by nthomas' reporting scripts, e.g.:

http://cruncher.srv.releng.scl3.mozilla.com/~nthomas/pending5/running.html
http://cruncher.srv.releng.scl3.mozilla.com/~nthomas/pending5/pending.html

I think the list of running jobs would be particularly great to add as a replacement for the current display of "Working" machines in slave health.

I'd like to make sure slave health is using the same algorithms for parsing the pending/running queues to make sure our results are the same in both places.
(In reply to Chris Cooper [:coop] from comment #0)
> I think the list of running jobs would be particularly great to add as a
> replacement for the current display of "Working" machines in slave health.

Not completely sure what you mean here, but there's the age-old problem of schedulerdb having no clue about which slave is doing the job. You'd have to combine in pulse data or something.
 
> I'd like to make sure slave health is using the same algorithms for parsing
> the pending/running queues to make sure our results are the same in both
> places.

Feel free to poke around at cruncher:~nthomas/bb_dash/pending5/. The quick version is that each running/pending job is mapped to a slavepool using allthethings.json, and there's a rough heuristic to convert the slavepool hash to a human-friendly name. The data is then dumped into CSV files and plotted with gnuplot (there's talk of moving this part to grafana in bug 1119647). It's fairly robust, except for zombies left behind when jobs get deprecated, or big changes in the slave pool layouts.
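
For anyone who wants to reproduce the mapping without reading the scripts on cruncher, here is a minimal sketch of the idea in Python. It is not the code under pending5/. It assumes allthethings.json has a "builders" map whose entries carry a "slavepool" hash and a "slavepools" map from hash to a list of slave names; the job dicts with a "buildername" key, the "unmapped" bucket (standing in for the zombies mentioned above), and the common-prefix naming heuristic are all hypothetical stand-ins.

```python
import csv
import io
import os
import re
from collections import Counter


def pool_label(slaves):
    """Illustrative heuristic: name a pool by the common prefix of its slaves."""
    prefix = os.path.commonprefix(sorted(slaves))
    # Drop a trailing separator and machine number, e.g. "-00".
    label = re.sub(r"[-_]?\d*$", "", prefix)
    return label or "unknown"


def count_jobs_by_pool(jobs, allthethings):
    """Count jobs per pool label; jobs with no known builder land in 'unmapped'."""
    builders = allthethings["builders"]    # assumed key
    pools = allthethings["slavepools"]     # assumed key
    counts = Counter()
    for job in jobs:
        builder = builders.get(job["buildername"])
        if builder is None:
            # Builder no longer exists in allthethings (a deprecated job).
            counts["unmapped"] += 1
            continue
        slaves = pools.get(builder["slavepool"], [])
        counts[pool_label(slaves)] += 1
    return counts


def to_csv(counts):
    """Serialize the counts as CSV text suitable for plotting."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["pool", "jobs"])
    for pool, n in sorted(counts.items()):
        writer.writerow([pool, n])
    return out.getvalue()
```

Feeding the pending and running queues through count_jobs_by_pool separately would give one CSV per queue. If slave health used the same function, both places would at least agree on which pool a job belongs to.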
Summary: Synchronize the running/pending parsing algorithms between slave health and nthomas' reports → Use allthethings as the canonical location for builder->machine mappings
See Also: → 1112777
Assignee: coop → nobody
Whiteboard: [buildduty][slave health] → [buildduty][slavehealth]
Component: Tools → General
Not useful after tcmigration is complete.
Status: ASSIGNED → RESOLVED
Closed: 6 years ago
Resolution: --- → WONTFIX