Closed
Bug 1387541
Opened 7 years ago
Closed 6 years ago
check for signing servers' ssl cert expiration
Categories
(Infrastructure & Operations Graveyard :: CIDuty, task, P2)
Infrastructure & Operations Graveyard
CIDuty
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: mozilla, Assigned: arny)
Details
Attachments
(4 files)
10.91 KB, image/png
33.45 KB, patch (jlund: review-, mozilla: review+)
726 bytes, patch (riman: review+)
549 bytes, patch (bcrisan: review+)
Related: bug 673303, bug 1373986
Reporter | ||
Comment 1•7 years ago
|
||
https://community.letsencrypt.org/t/it-there-a-command-to-show-how-many-days-certificate-you-have/11351 shows several scripts that help show ssl cert expiration dates.
Some can run on the server itself, given a path (e.g. ssl-cert-check) or:
openssl x509 -noout -dates -in /path/to/cert.pem
This can run outside the server, as long as it can reach the ssl port:
echo | openssl s_client -connect SERVER:PORT -servername FQDN 2>/dev/null | openssl x509 -noout -dates
So it's a question of whether we want to run this check on each server, or have a central location checking the other locations.
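A minimal sketch of both approaches as shell functions, for illustration only. It leans on `openssl x509 -checkend N`, which exits non-zero when the certificate expires within the next N seconds. The function names and the 90-day default are assumptions made up for this example, not anything deployed.

```shell
#!/bin/sh
# Sketch only: function names and the 90-day default are illustrative.

# Convert a number of days into seconds for `openssl x509 -checkend`.
days_to_seconds() {
    echo $(( $1 * 86400 ))
}

# Local check: exit 0 if the cert file ($1) is still valid for at
# least $2 days (default 90), non-zero otherwise.
cert_file_ok() {
    openssl x509 -noout -checkend "$(days_to_seconds "${2:-90}")" -in "$1" >/dev/null
}

# Remote check: fetch the cert served at HOST:PORT ($1) and apply the
# same test. Only works from a host that can reach the ssl port.
remote_cert_ok() {
    echo | openssl s_client -connect "$1" -servername "${1%%:*}" 2>/dev/null |
        openssl x509 -noout -checkend "$(days_to_seconds "${2:-90}")" >/dev/null
}
```

Usage would look like `cert_file_ok /path/to/cert.pem 90 || echo "renew soon"` on the server itself, or `remote_cert_ok SERVER:PORT 90` from a central host.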
Updated•6 years ago
|
Priority: -- → P2
Comment 2•6 years ago
|
||
Some of our certs are going to expire in about 6 months. We'll get bitten by this again unless we get notifications ahead of time :)
Updated•6 years ago
|
Component: Release Automation → Buildduty
QA Contact: catlee → jlund
Assignee | ||
Comment 3•6 years ago
|
||
Usually, the certificate issuer sends a notice to the admin email for the certificate 90 days before it expires, so the admin can renew it in time. If we want a second check, maybe we can set up a cron job with a script like https://github.com/Matty9191/ssl-cert-check so we get notices before the cert expires. That script can check both local and remote certificates, and it can also work as a nagios plugin. It is probably best to run the check from a central location, provided all the ssl domains are reachable remotely.
Comment 4•6 years ago
|
||
Thanks Attila. 90 days should be sufficient but it will be a matter of where it's sent and who catches it. Do we know who owns it?
Assignee | ||
Comment 5•6 years ago
|
||
According to the new SSL standards, the admin email can be one of administrator/postmaster/webmaster/admin/hostmaster@domain, or any other address that appears in the whois data of the domain. These addresses were used to validate the domain when the ssl certificate was issued. Another address that can receive the notice is the email account used to place the SSL order. Can anyone give me the list of the ssl domains so I can check them? E.g. the mozilla.org cert is issued by www.digicert.com
Comment 6•6 years ago
|
||
@aki - ciduty (buildduty) would like to help here but we have some questions. At a minimum it sounds like we need a check on each of our signing servers from an outside host that has the right network flows. If the date of expiry is within some near window, we alert via nagios. The team needs some context about how these certs were generated (by us) and where they are used (e.g. signingscript -> signing server), as well as where you would recommend we run these checks and how. aki - would you have 15min to meet with attila (arny)?
Updated•6 years ago
|
Assignee: nobody → acraciun
Comment 7•6 years ago
|
||
I'll leave it to Attila to reach out to you
Reporter | ||
Comment 8•6 years ago
|
||
The certs are self-signed, per https://mana.mozilla.org/wiki/display/RelEng/Signing#Signing-SSL They're currently used from all buildbot masters, all buildbot workers, funsize signing workers, partner-repack1, and signing scriptworkers. When esr52 dies and we have an off-cycle partner repack story from taskcluster, we'll narrow this list down to just the signing scriptworkers. I believe the ports are the signing ports: 9100, 9110, 9120. I also believe the cert is shared across all signing servers and ports, so as long as that assumption is true, checking one may be sufficient.
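One way to test the "cert is shared across all ports" assumption is to compare the fingerprint served on each signing port. This is a rough sketch: the function names are made up for illustration, and it has to run from a host that is allowed to reach those ports.

```shell
#!/bin/sh
# Sketch only: helper names are illustrative.

# Succeeds if stdin holds at least one non-empty line and all
# non-empty lines are identical.
all_same() {
    [ "$(sort -u | grep -c .)" -eq 1 ]
}

# Print the SHA-256 fingerprint of the cert served at HOST:PORT ($1),
# or an "unreachable" marker if no cert could be fetched.
fingerprint_of() {
    fp=$(echo | openssl s_client -connect "$1" 2>/dev/null |
        openssl x509 -noout -fingerprint -sha256 2>/dev/null)
    echo "${fp:-unreachable $1}"
}

# shared_cert HOST PORT...: exit 0 if every port serves the same cert,
# 1 if the certs differ, 2 if any port could not be checked.
shared_cert() {
    host=$1; shift
    fps=$(for port in "$@"; do fingerprint_of "$host:$port"; done)
    case $fps in *unreachable*) return 2 ;; esac
    printf '%s\n' "$fps" | all_same
}
```

For example, `shared_cert signing4.srv.releng.scl3.mozilla.com 9100 9110 9120` would confirm whether checking a single port is enough for that server.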
Reporter | ||
Comment 9•6 years ago
|
||
It may be possible to also do `openssl x509 -noout -dates -in` against the host cert.
The signing scriptworker host.cert is here: https://github.com/mozilla-releng/signingscript/blob/master/signingscript/data/host.cert
We use that for everything but esr52. The esr52 host.cert is here: https://hg.mozilla.org/build/tools/file/tip/release/signing/host.cert
These are public files, so we could even create a cron'ed taskcluster task that runs periodically, that:
* clones tools from hg
* clones signingscript from git
* runs `openssl x509 -noout -dates -in host.cert` against both host.cert files
* checks the notBefore and notAfter dates to make sure they're still valid, and will still be valid for x amount of time (90 days?)
** if true, exit 0
** if false, exit non-zero
* has taskclusterNotify set so it emails on failure, similar to the cot-gpg-keys expiration monitoring hook https://tools.taskcluster.net/hooks/project-releng/cot-gpg-keys%2Fexpiration
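The date check in the steps above could be sketched roughly as follows. This assumes GNU date (for `date -d`); the function names and the 90-day default are illustrative, and `openssl x509` only reads the first certificate in a file, so a file holding several certs would need to be split first.

```shell
#!/bin/sh
# Sketch only; assumes GNU date. Function names are illustrative.

# Print whole days between "now" ($2, epoch seconds; defaults to the
# current time) and an openssl-style end date ($1), e.g.
# "Aug 30 00:00:00 2018 GMT".
days_left() {
    end=$(date -u -d "$1" +%s) || return 2
    now=${2:-$(date -u +%s)}
    echo $(( (end - now) / 86400 ))
}

# Exit 0 if the cert at path $1 stays valid for at least $2 days
# (default 90), 1 if it does not, 2 if the end date cannot be read.
check_host_cert() {
    not_after=$(openssl x509 -noout -enddate -in "$1" | cut -d= -f2)
    [ -n "$not_after" ] || return 2
    [ "$(days_left "$not_after")" -ge "${2:-90}" ]
}
```

A periodic task would then download each host.cert, run `check_host_cert host.cert 90`, and rely on the non-zero exit status to trigger the failure notification.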
Reporter | ||
Comment 10•6 years ago
|
||
(In reply to Aki Sasaki [:aki] from comment #9)
> The esr52 host.cert is here:
> https://hg.mozilla.org/build/tools/file/tip/release/signing/host.cert
We should use default, not tip: https://hg.mozilla.org/build/tools/file/default/release/signing/host.cert
> These are public files, so we could even create a cron'ed taskcluster task
> that runs periodically, that:
>
> * clones tools from hg
> * clones signingscript from git
We could also just wget those 2 URLs and avoid git and mercurial completely.
Assignee | ||
Comment 11•6 years ago
|
||
Aki, I tried to connect to the signing hosts listed at https://mana.mozilla.org/wiki/display/RelEng/Signing#Signing-Hosts on the ports you mentioned, but it looks like there are no open ports. I also checked the signing-linux-18.srv.releng.usw2.mozilla.com server for any process listening on these ports, but there is nothing. Maybe we need to check the certs locally on each server?
Also, at https://mana.mozilla.org/wiki/display/RelEng/Signing#Signing-Inventory the entries that expire soonest are:
Releases & Nightlies Authenticode (SHA2) 6/23/2017 6/28/2019 Linux signing machines
Releases & Nightlies GPG 6/22/2017 6/22/2019 All signing machines
Is the list up to date?
Reporter | ||
Comment 12•6 years ago
|
||
(In reply to Attila Craciun [:arny] from comment #11)
> Aki, I have tried to connect to Signing-Host from
> https://mana.mozilla.org/wiki/display/RelEng/Signing#Signing-Hosts on the
> ports you mentioned, but looks like there are no open ports. Also, I have
> checked the signing-linux-18.srv.releng.usw2.mozilla.com server for any
> process running on this ports but there is nothing.
>
> Maybe we need to check the certs locally on each server?
It's possible. We have a firewall that restricts those ports to an allowlist of IPs: https://hg.mozilla.org/build/puppet/file/tip/modules/fw/manifests/networks.pp#l191
signing-linux-18 is a signing scriptworker; it talks to the signing servers, e.g. signing4 and mac-v2-signing1. The ports and certs are on the signing servers, not the signing scriptworkers.
Right now, for the ssl certs, host.cert is probably the easiest solution. I'm not sure if the openssl check checks both certs, or if we have to split it apart for expiration checking.
> Also, at
> https://mana.mozilla.org/wiki/display/RelEng/Signing#Signing-Inventory the
> quickest that expire are:
>
> Releases & Nightlies Authenticode (SHA2) 6/23/2017 6/28/2019 Linux signing
> machines
>
> Releases & Nightlies GPG 6/22/2017 6/22/2019 All signing machines
>
>
> Is the list up to date?
It should be, but we're not always good at keeping that up to date. Otherwise we could just check the list rather than running automation to check expiration :)
For signing key expiration, you're not going to be able to log into the signing servers to check the keys, which will make writing the nagios checks difficult. If we can get you throwaway or dep keys of each type, you might be able to write the checks against those, and then we can deploy those checks on the production instances.
Assignee | ||
Comment 13•6 years ago
|
||
I tried to ssh to signing4.srv.releng.scl3.mozilla.com with my user; it asks me for DUO, but then I get:
debug1: Offering RSA public key: /home/arny/.ssh/id_rsa
debug1: Authentications that can continue: publickey
debug1: No more authentication methods to try.
Permission denied (publickey).
I don't think I'll need ssh access, because https://hg.mozilla.org/build/tools/file/default/release/signing/host.cert has two certificates for the following FQDNs:
Valid From: August 30, 2017
Valid To: August 30, 2018
mac-v2-signing1.srv.releng.scl3.mozilla.com
mac-v2-signing2.srv.releng.scl3.mozilla.com
mac-v2-signing3.srv.releng.scl3.mozilla.com
mac-v2-signing4.srv.releng.scl3.mozilla.com
mac-v2-signing6.srv.releng.scl3.mozilla.com
mac-v2-signing7.srv.releng.scl3.mozilla.com
(looks like the one below was renewed recently)
Valid From: January 31, 2018
Valid To: January 31, 2019
mac-v2-signing8.srv.releng.mdc1.mozilla.com
mac-v2-signing9.srv.releng.mdc1.mozilla.com
mac-v2-signing10.srv.releng.mdc1.mozilla.com
mac-depsigning1.srv.releng.mdc1.mozilla.com
mac-depsigning2.srv.releng.mdc1.mozilla.com
mac-depsigning3.srv.releng.mdc1.mozilla.com
signing7.srv.releng.mdc1.mozilla.com
signing8.srv.releng.mdc1.mozilla.com
signing9.srv.releng.mdc1.mozilla.com
signing4.srv.releng.scl3.mozilla.com
signing5.srv.releng.scl3.mozilla.com
signing6.srv.releng.scl3.mozilla.com
Are all of these the signing servers? If yes, then it is probably best to create one new certificate covering all the above servers, valid for at least 3 years, and upload it to the repo in place of the current host.cert? :)
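A list of FQDNs like the one above can be pulled out of a certificate from the command line. This is a small sketch that assumes the names are stored as DNS entries in the Subject Alternative Name extension, which the thread does not confirm; the function names are illustrative.

```shell
#!/bin/sh
# Sketch only: assumes the FQDNs are DNS subjectAltName entries.

# Turn "DNS:a, DNS:b" text on stdin into one hostname per line.
san_names() {
    tr ',' '\n' | sed -n 's/^[[:space:]]*DNS://p'
}

# Print the DNS names found in the first certificate of file $1.
cert_names() {
    openssl x509 -noout -text -in "$1" | grep 'DNS:' | san_names
}
```

Running `cert_names cert1.pem` against each certificate taken from host.cert would show which servers each one covers.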
Assignee | ||
Comment 14•6 years ago
|
||
Yup, it looks like these are the signing servers, according to https://github.com/mozilla/build-puppet/blob/master/modules/buildmaster/templates/passwords.py.erb
So, we can:
1) create a cronjob on each server to check the ssl expiry date with openssl x509 -noout -dates -in /path/to/cert.pem
2) add a nagios plugin to check this.
3) use another central location to do the checks; however, once we have nagios, I think that is the best central location :).
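For the nagios option, a plugin mainly has to print a one-line status and return the conventional plugin exit codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). A rough sketch of that mapping follows; the function name is made up for illustration, the 90-day warning threshold echoes the window discussed in this bug, and the 30-day critical threshold is purely an assumption.

```shell
#!/bin/sh
# Sketch only: name and thresholds are illustrative.

# nagios_status DAYS_LEFT [WARN_DAYS] [CRIT_DAYS]
# Prints a status line and returns the matching plugin exit code.
nagios_status() {
    days=$1; warn=${2:-90}; crit=${3:-30}
    # Anything that is not an (optionally negative) integer is UNKNOWN.
    case ${days#-} in
        ''|*[!0-9]*) echo "UNKNOWN - could not determine cert expiry"; return 3 ;;
    esac
    if [ "$days" -le "$crit" ]; then
        echo "CRITICAL - cert expires in $days days"; return 2
    elif [ "$days" -le "$warn" ]; then
        echo "WARNING - cert expires in $days days"; return 1
    fi
    echo "OK - cert expires in $days days"; return 0
}
```

The days-left value would come from whatever reads the certificate's notAfter date, so the same mapping works for a local check or a central one.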
Comment 15•6 years ago
|
||
(In reply to Attila Craciun [:arny] from comment #13)
> I tried to ssh to signing4.srv.releng.scl3.mozilla.com with my user, it ask
> me for DUO but later I get:
yeah, you won't be able to access the signing servers directly. fwiw - I don't have access either :)
Assignee | ||
Comment 16•6 years ago
|
||
Aki, what do you think?
Comment 17•6 years ago
|
||
If there is no way to check the cert remotely from a machine with valid network flows, this is probably not in ciduty's scope.
Reporter | ||
Comment 18•6 years ago
|
||
It seems like a cron-hook to check the two host.certs is a good path forward.
Assignee | ||
Comment 19•6 years ago
|
||
Why are both certificates in one file? openssl x509 -noout -dates -in /path/to/cert.pem checks only the first cert, so we have to split the file like:
cat host.cert |awk 'split_after == 1 {n++;split_after=0} /-----END CERTIFICATE-----/ {split_after=1}{print > "cert" n ".pem"}'
This will produce cert.pem and cert1.pem, which can be checked with openssl.
How about the notification? We can use a central server to pull the file, split it, check it and send notifications. Is there any Linux box with a mail server? Like a jumphost?
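A slightly more defensive variant of the split, as a sketch. It differs from the one-liner in a few deliberate ways: output files are numbered from 1 (cert1.pem, cert2.pem), only lines between the BEGIN and END markers are written (so a trailing blank line does not create an extra file), and the output name is built in a variable first, since an unparenthesized concatenation after `>` is ambiguous in some awk implementations. The function name and output layout are illustrative.

```shell
#!/bin/sh
# Sketch only: function name and output naming are illustrative.

# split_certs BUNDLE OUTDIR
# Writes each PEM certificate in BUNDLE to OUTDIR/cert1.pem,
# OUTDIR/cert2.pem, and so on.
split_certs() {
    awk -v dir="$2" '
        /-----BEGIN CERTIFICATE-----/ { n++; file = dir "/cert" n ".pem"; inside = 1 }
        inside                        { print > file }
        /-----END CERTIFICATE-----/   { inside = 0; close(file) }
    ' "$1"
}
```

After `split_certs host.cert /tmp`, each resulting file can be passed to `openssl x509 -noout -dates -in` on its own.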
Reporter | ||
Comment 20•6 years ago
|
||
(In reply to Attila Craciun [:arny] from comment #19)
> Why are both certificates in one file?
Signtool only allows for one cert file, so we put the certs in the same file.
> The openssl x509 -noout -dates -in /path/to/cert.pem checks only the first
> ssl, so we have to split the files like:
>
> cat host.cert |awk 'split_after == 1 {n++;split_after=0} /-----END
> CERTIFICATE-----/ {split_after=1}{print > "cert" n ".pem"}'
> - this will produce cert.pem and cert1.pem which can be checked with openssl.
Sounds good to me :)
> How about the notification? We can use an central server, pull the file,
> split it, check it and send notifications. Any Linux box with a mail
> servers? Like a jumphost?
Also sounds good. I'm open about the server. Is there a machine that ciduty uses for things like this? Cruncher or aws-manager2?
If you write a script, where to run it is a little easier than otherwise. Running it on a machine would probably require a small puppet module; we could have our pick of machines to run it on once that's written. Or we could write a cron hook to run the script and run it on taskcluster.
Updated•6 years ago
|
Product: Release Engineering → Infrastructure & Operations
Assignee | ||
Comment 21•6 years ago
|
||
Today I tested the ssl check script from comment 3 on rejh1.srv.releng.mdc1.mozilla.com and it works fine (see attachment). I manually split the cert file and checked one cert from the command line. Now, if you all agree to use this jumphost, I can write the script which will do all the "hard work" and then, via puppet, put it in /etc/cron.weekly.
Assignee | ||
Comment 22•6 years ago
|
||
Van, what do you think about using one of the jumphosts?
Comment 23•6 years ago
|
||
(In reply to Attila Craciun [:arny] from comment #22)
> Van, what do you think using one of the jumphost?
unping van, relops own the jumphost infra afaik.
Attila, as aki pointed out in comment 20, you should be able to use many servers within our network. The jumphosts are used when we are trying to reach a host from outside our network. e.g. from our local machine.
Given that, I believe we could use something like our miscellaneous worker host, cruncher: https://dxr.mozilla.org/build-central/source/puppet/manifests/moco-nodes.pp#318
To do so, we would want to add a new cron job via puppet. Here are the cron jobs we currently have on cruncher: https://dxr.mozilla.org/build-central/source/puppet/modules/cruncher/manifests/cron.pp
Flags: needinfo?(acraciun)
Comment 24•6 years ago
|
||
(In reply to Jordan Lund (:jlund) from comment #23)
> (In reply to Attila Craciun [:arny] from comment #22)
> > Van, what do you think using one of the jumphost?
>
> unping van, relops own the jumphost infra afaik.
>
> Attila, as aki pointed out in comment 20, you should be able to use many
> servers within our network. The jumphosts are used when we are trying to
> reach a host from outside our network. e.g. from our local machine.
>
> Given that, I believe we could use something like our miscellaneous worker
> host, cruncher:
> https://dxr.mozilla.org/build-central/source/puppet/manifests/moco-nodes.pp#318
>
> To do so, we would want to add a new cron job via puppet. Here are the cron
> jobs we currently have on cruncher:
> https://dxr.mozilla.org/build-central/source/puppet/modules/cruncher/manifests/cron.pp
needinfo attila: does this help?
Assignee | ||
Comment 25•6 years ago
|
||
Yes, got it. I'll do a patch and upload it here for review.
Flags: needinfo?(acraciun)
Assignee | ||
Comment 26•6 years ago
|
||
Attached the patch.
Attachment #8981392 -
Flags: review?(jlund)
Attachment #8981392 -
Flags: review?(aki)
Reporter | ||
Comment 27•6 years ago
|
||
Comment on attachment 8981392 [details] [diff] [review] check_ssl.diff Thanks!
Attachment #8981392 -
Flags: review?(aki) → review+
Comment 28•6 years ago
|
||
Comment on attachment 8981392 [details] [diff] [review] check_ssl.diff
Review of attachment 8981392 [details] [diff] [review]:
-----------------------------------------------------------------
Thanks arny! Looks good. We should fix up the email name since as it stands, this will go to a black hole. r- till that gets addressed. Since this landed already, a follow up patch is fine.
::: modules/cruncher/manifests/cron.pp
@@ +12,4 @@
> '/etc/cron.d/allthethings':
> mode => '0644',
> content => template('cruncher/allthethings_cron.erb');
> + '/usr/local/bin/cert_check.sh':
perhaps the bash script that cron runs should be added here: https://dxr.mozilla.org/build-central/source/puppet/modules/cruncher/manifests/init.pp#4 or in its own file, e.g. check_cert.pp
Not a big deal, it's also reasonable to leave it beside the cron definition here.
::: modules/cruncher/templates/check_sign_srv_ssl_exp.sh.erb
@@ +11,5 @@
> +
> +cat $CERT | awk 'split_after == 1 {n++;split_after=0} /-----END CERTIFICATE-----/ {split_after=1}{print > "/tmp/cert" n ".pem"}'
> +
> +sh $CHECKSSL -c $CERT1 -x $DAYS -ab -e builddutty@mozilla.com
> +sh $CHECKSSL -c $CERT2 -x $DAYS -ab -e builddutty@mozilla.com
s/builddutty/buildduty/
Better yet, we can now use ciduty@mozilla.com
Attachment #8981392 -
Flags: review?(jlund) → review-
Assignee | ||
Comment 29•6 years ago
|
||
Updated notification email.
Attachment #8982134 -
Flags: review?(riman)
Updated•6 years ago
|
Attachment #8982134 -
Flags: review?(riman) → review+
Comment 30•6 years ago
|
||
Pushed by acraciun@mozilla.com: https://hg.mozilla.org/build/puppet/rev/a9f248ebbabe Updated notification email for the signing servers' ssl cert expiration. Bug 1387541. r=riman.
Assignee | ||
Comment 31•6 years ago
|
||
:jlund: I checked with :dragrom when I made the first patch and he suggested putting it in cron.pp. I wanted to put it somewhere else; however, since the cron job needs that bash script, we wanted to make sure it is there each time the job runs.
Assignee | ||
Comment 32•6 years ago
|
||
Now that the script is in place, how can I access the server so I can run the check manually? Maybe a program like mailx is missing. I tried to ssh (with my ssh user) but I get:
channel 2: open failed: administratively prohibited: open failed
Stdio forwarding request failed: Session open refused by peer
ssh_exchange_identification: Connection closed by remote host
Comment 33•6 years ago
|
||
(In reply to Attila Craciun [:arny] from comment #32)
> Now that the script is in place, how can I can access the server so I can do
> a manually run for the check. Maybe there is a missing program like mailx. I
> have tried to ssh but get (with my ssh user).
>
> channel 2: open failed: administratively prohibited: open failed
> Stdio forwarding request failed: Session open refused by peer
> ssh_exchange_identification: Connection closed by remote host
Sorry, please needinfo me as I don't check cc bugmail daily. Strange, you should be able to get to cruncher through the jumphost:
[jlund@cruncher-aws.srv.releng.usw2.mozilla.com ~]$ sudo cat /home/acraciun/.ssh/authorized_keys
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAACAQDXYkHS2dBgka49LH7lwQaJx3ORpFUiTSvNCcA5rzPPFRql1y6/qcLEDmw0+v4JRFMubrrp1WfL5XLGqyPc2RBX5r3Z0eJbI9rEWxU3Lkp02XR7oRpOoY8Ffhx0yZ4nxXU+BaA3NK+i4kM6UMhH0e7KCPW+9ACKNEgUNSI2UZmsS+kL/veIcfyipPDVX44u06I1ytekZ/x/uXRZcZjf1jMsVYacuOUyfzAvkfW+5XEFxvwkf71uSE73C/dw7Tt3TCsi5qptqN4AcYF5XobksykN8fwwjWbza+JuQhiTz9LvfQfvOhhdvbv/PphC1+pmSaAjVHfiMPUmjykRD/9XcfJMQN7eY8jACA1h2JAb+kG4ABnmQElzCPH8ygerwaWQ682cxMmGzwqFRDEhP9Qo4J2qR9lPaznOooHk2psOsS/h0dtFWGJEfgOIJkmb9Nf8HDzkUEx0n3rY9yqbxu6HZw7EJisKOqLudIJ4eZa/AudSnkcsFDV1SWx3zC32IRv3N76uwqNjHgU7qLOeyFH1y2U91QdxdkjTiSuCe0TzNW5zjSzOSyVEfjTyHqFZuSCT55ZvG/hrelrrNuQVgV050o9e/WWp8dsZCzddMaoVw+VJnCpt/eZSwhTtxn/rOiUqQ/gVDLhdE9jI/IrJndjCVcc3D0Ayq89QrfEapFrs1Vahqw== Mozilla_key
Assignee | ||
Comment 34•6 years ago
|
||
The ssh login works now. I ran a check and it looks like I made a mistake in the script name. This patch fixes it.
Attachment #8984805 -
Flags: review?(bcrisan)
Updated•6 years ago
|
Attachment #8984805 -
Flags: review?(bcrisan) → review+
Assignee | ||
Comment 35•6 years ago
|
||
:aki, can you push the patch to github? Looks like I don't have access. fatal: unable to access 'https://github.com/mozilla/build-puppet.git/': The requested URL returned error: 403
Flags: needinfo?(aki)
Reporter | ||
Comment 36•6 years ago
|
||
You need to fork the repo, push the patch to a branch on your own fork, and create a PR. Do you want to try that, or would you like me to?
Flags: needinfo?(aki)
Assignee | ||
Comment 37•6 years ago
|
||
Forked the repo, pushed, and created the PR. Does it look good?
Flags: needinfo?(aki)
Comment 39•6 years ago
|
||
:arny - can we close this bug? Has it been verified?
Flags: needinfo?(acraciun)
Assignee | ||
Comment 40•6 years ago
|
||
:aki - I'll close it next week, after I receive the cron email with the warning. How do we proceed once we get a warning about the ssl expiration? Do we generate a new cert and push it to the repo?
Flags: needinfo?(acraciun) → needinfo?(aki)
Reporter | ||
Comment 41•6 years ago
|
||
Sure, or file a bug for a new ssl cert for someone in releng to create and deploy. As long as we have heads up about the expiration, we're good.
Flags: needinfo?(aki)
Assignee | ||
Comment 42•6 years ago
|
||
Looks like the warning emails were sent from root@localhost.localdomain, and ciduty AT mozilla dot com was not accepting them. I created a PR with a patch: https://github.com/mozilla-releng/build-puppet/pull/101/commits/cccea32cca5c7c5890f475a6029e108c7471facf Please merge, thanks.
Flags: needinfo?(aki)
Comment 44•6 years ago
|
||
It worked! The email was sent from root@cruncher-aws.srv.releng.usw2.mozilla.com. I think we are done here. Please reopen if I'm wrong.
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Updated•4 years ago
|
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard