Closed
Bug 1184422
Opened 9 years ago
Closed 9 years ago
repo failures breaking b2g builds
Categories
(Infrastructure & Operations Graveyard :: CIDuty, task)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: hwine, Unassigned)
References
Details
A bad commit landed on upstream master for "repo". The b2g build system insists on taking updates for "repo", so all build slaves got into a bad state, forcing the trees to be closed.
Comment 1•9 years ago
Gaia and trunk trees were closed at 18:11 PT for this.
Severity: normal → blocker
Comment 2•9 years ago (Reporter)
workaround to get trees open again:
- stop mirroring from https://gerrit.googlesource.com/git-repo to git.mozilla.org
- strip offending commit from tip of master on vcs-sync host
- force push to git.mozilla.org (required temporary unset of denyNonFastForward protection)
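The steps above could be sketched roughly as follows. This is only an illustration that builds the git command lines without running them. The remote name `origin`, the assumption that the offending commit is the single commit at the tip of `master`, and the function name `plan_workaround` are all hypothetical. `receive.denyNonFastForwards` is the git setting that would have to be relaxed on the receiving repository. Stopping the mirroring itself is not shown, because the thread does not say how vcs-sync is controlled.

```python
def plan_workaround(remote="origin", branch="master"):
    """Return the workaround's git commands as argv lists (nothing is executed)."""
    on_server = [
        # Temporarily allow non-fast-forward pushes on the receiving repository.
        ["git", "config", "receive.denyNonFastForwards", "false"],
    ]
    on_sync_host = [
        # Assumes the bad commit is exactly the tip commit of `branch`.
        ["git", "checkout", branch],
        ["git", "reset", "--hard", "HEAD~1"],
        # Rewrite the published branch so builders stop picking up the bad commit.
        ["git", "push", "--force", remote, branch],
    ]
    restore = [
        # Put the protection back once the forced push has landed.
        ["git", "config", "receive.denyNonFastForwards", "true"],
    ]
    return {"server": on_server, "sync_host": on_sync_host, "restore": restore}


if __name__ == "__main__":
    for where, commands in plan_workaround().items():
        for argv in commands:
            print(f"[{where}] {' '.join(argv)}")
```

Returning the commands instead of executing them keeps the sketch safe to run anywhere; a forced push rewrites published history, so each step would need review before being run by hand.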
Comment 3•9 years ago
Trees reopened at 21:02 PT.
Comment 4•9 years ago (Reporter)
(In reply to Hal Wine [:hwine] (use NI) from comment #2)
> workaround to get trees open again:
> - stop mirroring from https://gerrit.googlesource.com/git-repo to
> git.mozilla.org
NOTE: once the issue is fixed, the vcs-sync mirroring of git-repo will need to be restarted. A bug should be filed in Developer Services :: General requesting that.
Comment 5•9 years ago
We're seeing frequent 600s timeouts again like we had a month ago. I wonder if we ended up with a corrupted repo during all of this again. Trees re-closed.
https://treeherder.mozilla.org/logviewer.html#?job_id=11756591&repo=mozilla-inbound
Comment 6•9 years ago
Taking a look...
Comment 7•9 years ago
No output during the ten minutes before the timeout; will try to hop on a node to take a look:
22:44:34 INFO - repo has been initialized in /builds/slave/b2g_m-in_l64-b2g-haz_dep-00000/build
22:54:34 INFO - Automation Error: mozprocess timed out after 600 seconds running ['./config.sh', '-q', 'emulator-jb', '/builds/slave/b2g_m-in_l64-b2g-haz_dep-00000/build/tmp_manifest/emulator-jb.xml']
22:54:34 ERROR - timed out after 600 seconds of no output
22:54:34 ERROR - Return code: -9
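The log shows the harness giving up on ./config.sh after 600 seconds without output; a return code of -9 is consistent with the process having been killed with SIGKILL. A minimal sketch of that kind of no-output watchdog, using only the Python standard library, is below. It is not mozprocess's actual implementation, the function name `run_with_output_timeout` is hypothetical, and it relies on select() working on pipes, so it assumes a POSIX host.

```python
import os
import selectors
import subprocess
import sys


def run_with_output_timeout(argv, output_timeout):
    """Run argv; kill it if it produces no output for output_timeout seconds.

    Returns (return_code, timed_out, captured_output_bytes).
    """
    proc = subprocess.Popen(argv, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    fd = proc.stdout.fileno()
    selector = selectors.DefaultSelector()
    selector.register(fd, selectors.EVENT_READ)
    chunks = []
    timed_out = False
    try:
        while True:
            # The timer restarts on every chunk, so only silence counts.
            if not selector.select(timeout=output_timeout):
                timed_out = True
                proc.kill()  # SIGKILL on POSIX, reported as return code -9
                break
            chunk = os.read(fd, 4096)
            if not chunk:  # EOF: the child closed its output
                break
            chunks.append(chunk)
    finally:
        selector.close()
        proc.wait()
        proc.stdout.close()
    return proc.returncode, timed_out, b"".join(chunks)


if __name__ == "__main__":
    silent = [sys.executable, "-c", "import time; time.sleep(30)"]
    print(run_with_output_timeout(silent, output_timeout=1.0)[:2])
```

Note that a watchdog like this only reports that the command went quiet. It cannot say which child process stalled, which matches what the log shows.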
Comment 8•9 years ago
Were there any changes made to the 'repo forall'-based repository verification step that was being done?
Updated•9 years ago
Component: Gaia::Build → Buildduty
Product: Firefox OS → Release Engineering
QA Contact: bugspam.Callek
Comment 9•9 years ago
The last time we saw this, and there was a corrupted repo, the smoking gun was a hung git process on a host. There wasn't any log output indicating which repository was hanging everything because repo spawns VCS commands in parallel.
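Because the commands run in parallel and the log does not name the repository that hung, one way to narrow it down is to run a check per project with its own timeout and report which ones never finish. The sketch below is an illustration only: `find_hung_projects` is a hypothetical helper, not part of repo or mozharness, and the per-project command is whatever the caller supplies.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def find_hung_projects(commands, timeout):
    """Run one command per project in parallel; return the names that timed out.

    commands maps a project name to an argv list, for example
    {"some-project": ["git", "-C", "/hypothetical/path", "fsck"]}.
    """
    def run_one(item):
        name, argv = item
        try:
            subprocess.run(
                argv,
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
                timeout=timeout,
            )
            return name, False
        except subprocess.TimeoutExpired:
            # subprocess.run kills the child before raising, so nothing is left behind.
            return name, True

    with ThreadPoolExecutor(max_workers=max(1, len(commands))) as pool:
        results = list(pool.map(run_one, commands.items()))
    return sorted(name for name, hung in results if hung)


if __name__ == "__main__":
    demo = {
        "fast": [sys.executable, "-c", "pass"],
        "stuck": [sys.executable, "-c", "import time; time.sleep(30)"],
    }
    print(find_hung_projects(demo, timeout=3))
```

Keeping the project name attached to each result is the point of the sketch: the timeout is reported per repository instead of once for the whole parallel run.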
Comment 10•9 years ago (Reporter)
No corrupted repo this time.
Comment 11•9 years ago (Reporter)
Okay, closing this bug. This bug is about a specific issue with 'repo', which was addressed by the workaround in comment 2. The 600s timeout appears to be a separate issue that happened to occur at the same time: it has the same tree impact, but it is reported differently and appears to have a different root cause. A new bug will be opened for that. The root cause of this bug remains the way 'repo' is being deployed by mozharness in buildbot land. I'll open another bug to find a long-term fix for that.
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Updated•6 years ago
Product: Release Engineering → Infrastructure & Operations
Updated•4 years ago
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard