Closed Bug 1240605 Opened 8 years ago Closed 5 years ago

Set _NT_SYMBOL_PATH on Windows test machines

Categories

(Infrastructure & Operations Graveyard :: CIDuty, task)

Unspecified
Windows
task
Not set
normal

Tracking

(Not tracked)

RESOLVED INCOMPLETE

People

(Reporter: bugzilla, Unassigned)

Details

I'd like to start a discussion about improving symbolication of call stacks in our Windows builds, in particular the lack of symbolication for system DLLs when we log a call stack (due to assertions, crashes, etc.).

Specifically, I'd like us to consider setting _NT_SYMBOL_PATH so that the Microsoft dbghelp library can locate Microsoft's symbols.

These symbols may either be accessed via HTTP on demand, or cached locally for later use. Obviously in the latter case, they do take up a non-trivial amount of disk space. Every time there is an update of an OS component, the symbol client may need to download one or more new symbol files.

Of course, these could be cached on a centralized server, or the local cache could be purged monthly, or something else... anyway, I'm curious what the RelEng opinion is on doing something like this.
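To make the "purge the local cache monthly" option concrete, here is a minimal sketch in Python. It assumes the cache is an ordinary directory tree and uses each file's last-modified time as the age criterion. The function name `purge_old_symbols` and the 30-day default are illustrative assumptions, not anything RelEng has agreed on.

```python
import os
import time


def purge_old_symbols(cache_dir, max_age_days=30, now=None):
    """Delete files under cache_dir older than max_age_days; return removed paths.

    Hypothetical helper: age is judged by last-modified time, which is an
    assumption. A real policy might prefer last-access time instead.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # seconds per day
    removed = []
    for root, _dirs, files in os.walk(cache_dir):
        for name in files:
            path = os.path.join(root, name)
            if os.path.getmtime(path) < cutoff:
                os.remove(path)
                removed.append(path)
    return removed
```

Run from a scheduled task, something like `purge_old_symbols(r"C:\Symbols")` would keep the cache bounded, at the cost of re-downloading any purged symbols that are still needed.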
(In reply to Aaron Klotz [:aklotz] (please use needinfo) from comment #0) 
> These symbols may either be accessed via HTTP on demand, or cached locally
> for later use. Obviously in the latter case, they do take up a non-trivial
> amount of disk space. Every time there is an update of an OS component, the
> symbol client may need to download one or more new symbol files.
> 
> Of course, these could be cached on a centralized server, or the local cache
> could be purged monthly, or something else... anyway, I'm curious what the
> RelEng opinion is on doing something like this.

We definitely want to facilitate easier debugging. If we can do so while keeping the machine images as simple as possible, all the better. If we can access the symbols on-demand, that would be the best. 

We don't tend to rev the underlying OS or components very often because it can mess with performance data and cause other test failures. We need a period to validate a new baseline when this happens.

Do all the Windows test machine types support this variable, i.e. w10/w8/w7/xp?
(In reply to Chris Cooper [:coop] from comment #1)
> Do all the Windows test machine types support this variable, i.e.
> w10/w8/w7/xp?
Yes.
The most trivial setting for this variable is:
_NT_SYMBOL_PATH=srv*http://msdl.microsoft.com/download/symbols

If we were to create a downstream store to cache the files locally, we'd set it to:
_NT_SYMBOL_PATH=srv*C:\Symbols*http://msdl.microsoft.com/download/symbols
where C:\Symbols is an absolute path to a directory created for that purpose.
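As a rough illustration of the two forms above, the following Python sketch builds the value and puts it into a copy of an environment mapping that a test harness could pass to a child process. The helper names `build_nt_symbol_path` and `env_with_symbol_path` are hypothetical. Only the `srv*...` syntax and the server URL come from this comment.

```python
import os

# Symbol server URL as given in the comment above.
MS_SYMBOL_SERVER = "http://msdl.microsoft.com/download/symbols"


def build_nt_symbol_path(server=MS_SYMBOL_SERVER, cache_dir=None):
    """Return an _NT_SYMBOL_PATH value of the form srv*[cache_dir*]server."""
    parts = ["srv"]
    if cache_dir:
        parts.append(cache_dir)  # optional local downstream store
    parts.append(server)
    return "*".join(parts)


def env_with_symbol_path(cache_dir=None, base_env=None):
    """Copy an environment mapping and set _NT_SYMBOL_PATH on the copy."""
    env = dict(os.environ if base_env is None else base_env)
    env["_NT_SYMBOL_PATH"] = build_nt_symbol_path(cache_dir=cache_dir)
    return env
```

A harness could then launch a test process with `subprocess.run(cmd, env=env_with_symbol_path(r"C:\Symbols"))`, which scopes the variable to that process instead of setting it machine-wide.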
Looping in Rob to get his take here.

Rob: what's the correct way to do this so that it covers all our test systems? Is this something puppet can manage now, or do we still need to make changes in https://hg.mozilla.org/build/buildbotcustom/file/d56b91bb3b1f/env.py ?
Flags: needinfo?(rthijssen)
Puppet can manage this, but the timing isn't good for submitting a Puppet patch this week (we're transitioning from puppetless EC2 2008 instances to puppeted ones). After this week, when the dust starts to settle, it's a simple patch/commit/merge.

markco or myself can assist.
Flags: needinfo?(rthijssen)
(In reply to Rob Thijssen (:grenade - GMT) from comment #5)
> Puppet can manage this but the timing isn't good to submit a puppet patch
> this week (transitioning from puppetless ec2 2008 instances to puppeted ones
> this week). After this week, when the dust starts to settle, its a simple
> patch/commit/merge.
> 
> markco or myself can assist.

Totally fair. Thanks for the info.
Component: Platform Support → Buildduty
Product: Release Engineering → Infrastructure & Operations
Found in triage.
Hey Chris/Rob, any updates on this bug? Do we still need it open?
Flags: needinfo?(rthijssen)
Flags: needinfo?(coop)
(In reply to Danut Labici [:dlabici] from comment #7)
> Found in triage,
> Hey Chris/Rob any updates on this bug? Do we still need it open?

i suspect not. if someone wants to set an env var on taskcluster windows builders, it's trivial. they can create a pull request to add it to the manifests. here's an example:
https://github.com/mozilla-releng/OpenCloudConfig/blob/462d9fd/userdata/Manifest/gecko-1-b-win2012.json#L568-L581
Flags: needinfo?(rthijssen)
(In reply to Rob Thijssen (:grenade UTC+2) from comment #8)
> i suspect not. if someone wants to set an env var on taskcluster windows
> builders, it's trivial. they can create a pull request to add it the
> manifests. here's an example:
> https://github.com/mozilla-releng/OpenCloudConfig/blob/462d9fd/userdata/Manifest/gecko-1-b-win2012.json#L568-L581

...and a corresponding example from a tester: https://github.com/mozilla-releng/OpenCloudConfig/blob/462d9fd37275e37224194349109ff06f981b5481/userdata/Manifest/gecko-t-win10-64-gpu-b.json#L324

Aaron: is this still wanted?
Flags: needinfo?(coop) → needinfo?(aklotz)
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → INCOMPLETE
Flags: needinfo?(aklotz)
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard