Closed
Bug 1391275
Opened 7 years ago
Closed 7 years ago
Windows buildbot test jobs fail: Unable to establish SSL connection.
Categories
(Infrastructure & Operations Graveyard :: CIDuty, task, P1)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: aryx, Unassigned)
References
Details
(Whiteboard: [stockwell infra])
Trees are closed for this, this hits across branches.
Push with issues: https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&revision=b5d33e3a84200f0e5868e3429b5ad581a04441b6&filter-resultStatus=testfailed&filter-resultStatus=busted&filter-resultStatus=exception&filter-resultStatus=retry&filter-resultStatus=usercancel&filter-resultStatus=runnable
Log: https://treeherder.mozilla.org/logviewer.html#?job_id=123879462&repo=mozilla-inbound
--07:18:54-- https://hg.mozilla.org/build/tools/raw-file/default/buildfarm/utils/archiver_client.py
=> `archiver_client.py'
Resolving hg.mozilla.org... 63.245.215.25, 63.245.215.102
Connecting to hg.mozilla.org|63.245.215.25|:443... connected.
Unable to establish SSL connection.
program finished with exit code 1
Comment 1•7 years ago
Log from a previously working job:
--06:50:11-- https://hg.mozilla.org/build/tools/raw-file/default/buildfarm/utils/archiver_client.py
=> `archiver_client.py'
Resolving hg.mozilla.org... 63.245.215.25, 63.245.215.102
Connecting to hg.mozilla.org|63.245.215.25|:443... connected.
WARNING: Certificate verification error for hg.mozilla.org: certificate signature failure
HTTP request sent, awaiting response... 200 Script output follows
Length: 12,179 (12K) [text/x-python]
0K .......... . 100% 8.42 MB/s
06:50:11 (8.42 MB/s) - `archiver_client.py' saved [12179/12179]
Note that the working job already hit a certificate verification warning before the actual error started occurring.
Comment 3•7 years ago
Let the record show that we're executing the following command:
$ 'bash' '-c' 'wget -Oarchiver_client.py --no-check-certificate --tries=10 --waitretry=3 https://hg.mozilla.org/build/tools/raw-file/default/buildfarm/utils/archiver_client.py'
We then run the downloaded archiver_client.py and use it to download other things, like mozharness.
The use of --no-check-certificate completely undermines security. We might as well be using plaintext. This was a pre-existing security bug and it is a good thing this insecure download stopped working. Unfortunately, it is causing a tree closure :/
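For illustration only, here is a minimal Python 3 sketch of what a verified version of that download could look like. It is not what the buildbot jobs run; the function names `make_verifying_context` and `fetch_verified`, and the `opener` parameter, are hypothetical. The retry loop only roughly mirrors --tries=10 --waitretry=3.

```python
import ssl
import time
import urllib.request


def make_verifying_context():
    """Return a client context that validates the certificate chain and hostname."""
    # create_default_context() turns on CERT_REQUIRED and hostname checking.
    return ssl.create_default_context()


def fetch_verified(url, dest, tries=10, wait_retry=3, opener=urllib.request.urlopen):
    """Download url to dest, retrying on failure; verification is never disabled."""
    context = make_verifying_context()
    last_error = None
    for attempt in range(1, tries + 1):
        try:
            with opener(url, context=context) as response:
                data = response.read()
            with open(dest, "wb") as handle:
                handle.write(data)
            return len(data)
        except OSError as error:  # URLError and ssl.SSLError both subclass OSError
            last_error = error
            if attempt < tries:
                time.sleep(wait_retry)
    raise last_error
```

With this shape, a handshake or certificate failure surfaces as an exception after the retries are exhausted instead of being silently tolerated.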
Comment 4•7 years ago
It's an issue between the OpenSSL version, which is sending an SSLv2 ClientHello, and Zeus, which is dropping it (after the upgrade). :fox2mike will downgrade Zeus, and IT will follow up with Zeus.
* Support for SSLv2 ClientHellos has been removed in this release. Following the earlier removal of support for SSLv2, the traffic manager has previously accepted an SSLv2-compatible ClientHello when acting as a TLS server, so long as it indicated that the client was willing to upgrade to SSLv3/TLS records for the remainder of the connection. Following the upgrade, the traffic manager will respond to SSLv2-compatible ClientHellos with an SSL alert message, before dropping the connection. VTM-34550
OpenSSL 0.9.x appears to, by default, send an SSLv2 ClientHello unless instructed otherwise through, for instance, "openssl s_client -tls1" or whatever your Python/Wget/Curl/etc mechanisms are for indicating a similar "minimum protocol version of TLS 1.0".
It appears that Zeus has broken compatibility with all OpenSSL 0.9.x clients.
Revert and discuss with Zeus Actual.
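To make the distinction in the release note concrete, the following Python sketch tells the two ClientHello framings apart from the first bytes a client sends. It is an illustration of the wire formats only, not Zeus's actual logic, and the function name `classify_client_hello` is hypothetical.

```python
def classify_client_hello(first_bytes):
    """Guess the framing of a ClientHello from the first bytes on the wire."""
    if len(first_bytes) < 3:
        return "unknown"
    # SSLv2 framing: 2-byte length with the high bit set, then message type 1
    # (CLIENT-HELLO). This is what an SSLv2-compatible hello looks like.
    if first_bytes[0] & 0x80 and first_bytes[2] == 0x01:
        return "sslv2-compatible"
    # SSLv3/TLS framing: record content type 22 (handshake), then major version 3.
    if first_bytes[0] == 0x16 and first_bytes[1] == 0x03:
        return "tls-record"
    return "unknown"
```

Per the release note, a server in the upgraded state would answer the first kind with an SSL alert and drop the connection, while still accepting the second kind.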
Comment 6•7 years ago
Just noting here for the record that we rolled Zeus back to 17.1 by 10:10 PDT.
I just reopened inbound/autoland since the retriggers were coming back green.
Comment 8•7 years ago
Richard's analysis looks right. It looks like you can maybe get wget to do this with "--secure-protocol=protocol" (https://linux.die.net/man/1/wget) [the fact that TLSv1 is what they list as the newest version is terrifying]. Maybe time to switch to curl? GPS, good catch on --no-check-certificate
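As a rough Python analogue of pinning the protocol on the client side, a sketch like the one below sets a protocol floor on the context. It assumes Python 3.7 or newer linked against a modern OpenSSL, which is not what the affected machines had; the name `make_client_context` is hypothetical, and the TLS 1.2 default is a present-day choice rather than the TLS 1.0 floor discussed in this bug.

```python
import ssl


def make_client_context(minimum=ssl.TLSVersion.TLSv1_2):
    """Build a verifying client context that will not negotiate below `minimum`."""
    context = ssl.create_default_context()
    # The handshake fails rather than falling back to an older protocol.
    context.minimum_version = minimum
    return context
```

The resulting context can be passed to `urllib.request.urlopen(url, context=...)` or used to wrap a socket.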
I'm not sure what --secure-protocol implies beyond ClientHello - it may simply be specifying the Hello version, or it might have deeper effects in the wget code. Recompiling wget (and python) against openssl 1.x instead of the current 0.x would likely resolve the issue without any changes to code, command lines, and so forth.
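One quick way to see which OpenSSL a given Python build is linked against is sketched below. It says nothing about the library wget itself uses, and the helper name `is_pre_1_0` is hypothetical.

```python
import ssl


def is_pre_1_0(version_info=ssl.OPENSSL_VERSION_INFO):
    """True when the (major, minor, ...) tuple describes an OpenSSL 0.x build."""
    return tuple(version_info[:2]) < (1, 0)


if __name__ == "__main__":
    # ssl.OPENSSL_VERSION is the human-readable version string of the linked library.
    print(ssl.OPENSSL_VERSION, "(0.x)" if is_pre_1_0() else "(1.0 or newer)")
```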
Comment 10•7 years ago
(In reply to Gregory Szorc [:gps] from comment #3)
> Let the record show that we're executing the following command:
>
> $ 'bash' '-c' 'wget -Oarchiver_client.py --no-check-certificate --tries=10 --waitretry=3 https://hg.mozilla.org/build/tools/raw-file/default/buildfarm/utils/archiver_client.py'
>
> We then run the downloaded archiver_client.py and use it to download other things, like mozharness.
>
> The use of --no-check-certificate completely undermines security. We might as well be using plaintext. This was a pre-existing security bug and it is a good thing this insecure download stopped working. Unfortunately, it is causing a tree closure :/
bug 798025 :-(
Updated•7 years ago
Whiteboard: [stockwell infra]
Updated•7 years ago
Priority: -- → P1
Comment 13•7 years ago
The solution was to roll back the Zeus code. Webops is investigating moving to the LTS branch of the Zeus code until we can decommission the machines using older SSL versions.
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Updated•6 years ago
Product: Release Engineering → Infrastructure & Operations
Updated•4 years ago
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard