Closed
Bug 1127304
Opened 9 years ago
Closed 9 years ago
Flame device fails to start after flashing latest build M-C and M-I
Categories
(Core :: Graphics: CanvasWebGL, defect)
Tracking
()
People
(Reporter: RobertC, Unassigned)
References
Details
(Keywords: qablocker, regression, smoketest, Whiteboard: [fromAutomation])
Attachments
(1 file)
675.09 KB,
text/plain
After flashing with the latest mozilla-inbound build the device fails to start. We tried restarting the device, adb reboot, but the device still fails to start.

Device info:
Flame 319MB
v18D-1

Regression info from Jenkins sanity runs.

Last working:
Device firmware (base): L1TC100118D0
Device firmware (date): 29 Jan 2015 00:20:38
Device firmware (incremental): eng.cltbld.20150129.032025
Device firmware (release): 4.4.2
Device identifier: flame
Gaia date: 28 Jan 2015 10:25:55
Gaia revision: 9d2378a9ef09
Gecko build: 20150128235034
Gecko revision: e5c85f765f2d
Gecko version: 38.0a1

First broken:
Device firmware (base): L1TC100118D0
Device firmware (date): 29 Jan 2015 01:18:49
Device firmware (incremental): eng.cltbld.20150129.041838
Device firmware (release): 4.4.2
Device identifier: flame
Gaia date: 28 Jan 2015 10:25:55
Gaia revision: 9d2378a9ef09
Gecko build: 20150129005832
Gecko revision: fc58ed477ccb
Gecko version: 38.0a1

It looks like between these builds there were only gecko changes.
gecko diff: http://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=e5c85f765f2d&tochange=fc58ed477ccb
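The gecko diff link in this report is the last working and first broken Gecko revisions plugged into a pushloghtml query. A minimal sketch of building that link, assuming only the URL shape visible in the report; the helper name pushlog_url and its parameters are made up for illustration and are not part of any Mozilla tool:

```python
from urllib.parse import urlencode


def pushlog_url(repo_path, last_good, first_bad, host="http://hg.mozilla.org"):
    """Build a pushloghtml link covering a regression range.

    repo_path is the repository path on the host, for example
    "integration/mozilla-inbound". The fromchange/tochange parameter
    names are copied from the link in the report.
    """
    query = urlencode({"fromchange": last_good, "tochange": first_bad})
    return f"{host}/{repo_path}/pushloghtml?{query}"


# Revisions taken from the "Last working" and "First broken" blocks.
inbound_range = pushlog_url(
    "integration/mozilla-inbound", "e5c85f765f2d", "fc58ed477ccb"
)
print(inbound_range)
```

The same helper would cover the mozilla-central range by passing "mozilla-central" as the repository path.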
Comment 1•9 years ago
This is now in mozilla-central:

Last good:
Device firmware (base): L1TC100118D0
Device firmware (date): 28 Jan 2015 18:52:23
Device firmware (incremental): eng.cltbld.20150128.215212
Device firmware (release): 4.4.2
Device identifier: flame
Gaia date: 28 Jan 2015 10:25:55
Gaia revision: 9d2378a9ef09
Gecko build: 20150128183748
Gecko revision: 6bfc0e1c4b29
Gecko version: 38.0a1

First bad:
Device firmware (base): L1TC100118D0
Device firmware (date): 29 Jan 2015 06:43:07
Device firmware (incremental): eng.cltbld.20150129.094256
Device firmware (release): 4.4.2
Device identifier: flame
Gaia date: 28 Jan 2015 10:25:55
Gaia revision: 9d2378a9ef09
Gecko build: 20150129060452
Gecko revision: a98d16e6a3b4
Gecko version: 38.0a1

As stated above, after flashing with 20150129060452 the device won't start.
Comment 2•9 years ago
gecko diff: https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=6bfc0e1c4b29&tochange=a98d16e6a3b4
Updated•9 years ago
Summary: Flame fails to start after flashing with latest m-i build → Flame device fails to start after flashing latest build M-C and M-I
Comment 3•9 years ago
[Blocking Requested - why for this release]:
Will fail smoketests and automation.

http://hg.mozilla.org/integration/mozilla-inbound/rev/ea243bbbb45c
looking through the pushlogs, could this be the culprit? Bob Owens can you take a look?
blocking-b2g: --- → 3.0?
Flags: needinfo?(bobowen.code)
Comment 4•9 years ago
(In reply to Peter Bylenga [:PBylenga] from comment #3)
> [Blocking Requested - why for this release]:
> Will fail smoketests and automation.
>
> http://hg.mozilla.org/integration/mozilla-inbound/rev/ea243bbbb45c
> looking through the pushlogs, could this be the culprit? Bob Owens can you
> take a look?

Hmm, all of that patch should be within Windows #ifdefs, so I don't see how it could have affected B2G.
Flags: needinfo?(bobowen.code)
Comment 5•9 years ago
I rebuilt with ea243bbbb45c backed out but that didn't fix the crash.
Device lab is essentially fully down with black-screened phones. I'll dig further to make sure it's from this, but seems highly likely.
Comment 7•9 years ago
We have a busted nightly env today for fxOS and QA is blocked completely. Can we get some engineering help to find the culprit, keeping the pushlog from comment #2 in mind?

Note: QA is trying to bisect it further, and it looks like http://hg.mozilla.org/integration/mozilla-inbound/rev/ea243bbbb45c may not be the culprit.
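For the bisection mentioned here, a binary search over the pushlog range is usually enough. A rough sketch, assuming the revisions are ordered oldest to newest, the first one boots and the last one does not, and every build after the regressing push is also broken. The function first_bad_revision, the is_bad callback, and the placeholder revision names are all hypothetical; in practice is_bad would flash a build of that revision and report whether the device black-screens.

```python
def first_bad_revision(revisions, is_bad):
    """Return the earliest revision for which is_bad(revision) is true.

    Assumes revisions[0] is known good and revisions[-1] is known bad.
    """
    good, bad = 0, len(revisions) - 1
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(revisions[mid]):
            bad = mid   # regression is at mid or earlier
        else:
            good = mid  # regression is after mid
    return revisions[bad]


# Placeholder names only; real use would list the pushes from the pushlog.
pushes = ["last-good", "push-1", "push-2", "push-3", "first-bad"]
broken_from = pushes.index("push-2")  # pretend push-2 introduced the crash
culprit = first_bad_revision(pushes, lambda rev: pushes.index(rev) >= broken_from)
print(culprit)
```

Each probe costs one flash and boot attempt, so a range of N pushes needs roughly log2(N) device cycles.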
Comment 8•9 years ago
alright, so we crash at:
I/Gecko (13287): RUNTIME ASSERT: Uninitialized GL function: fEGLImageTargetTexture2D
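Since the logcat attached to this bug is large, a small filter can pull out lines like the one above. A minimal sketch, assuming the brief logcat layout shown in this comment (priority letter, tag, pid in parentheses); the function name find_runtime_asserts and the second sample line are made up for illustration:

```python
import re

# Matches e.g. "I/Gecko (13287): RUNTIME ASSERT: <message>"
ASSERT_RE = re.compile(
    r"^[VDIWEF]/Gecko\s*\(\s*(\d+)\):\s*RUNTIME ASSERT:\s*(.+)$"
)


def find_runtime_asserts(logcat_text):
    """Return (pid, message) pairs for every Gecko RUNTIME ASSERT line."""
    hits = []
    for line in logcat_text.splitlines():
        match = ASSERT_RE.match(line.strip())
        if match:
            hits.append((int(match.group(1)), match.group(2)))
    return hits


sample = (
    "I/Gecko (13287): some unrelated message\n"  # invented filler line
    "I/Gecko (13287): RUNTIME ASSERT: Uninitialized GL function: "
    "fEGLImageTargetTexture2D\n"
)
hits = find_runtime_asserts(sample)
print(hits)
```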
Comment 9•9 years ago
That comes from GLContext.h's ASSERT_SYMBOL_PRESENT, which is called for this function here:
http://mxr.mozilla.org/mozilla-central/source/gfx/gl/GLContext.h#2354

So we're working with a GLContext that has some uninitialized functions, for some reason.
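To illustrate the failure mode, here is a toy model of a function table with a presence check in front of every call. This is not Gecko's code: the real check is the C++ macro named above, and the class FunctionTable, the exception type, and the stand-in lambda are hypothetical. It only shows why a symbol that was never loaded fails at the first call rather than at context creation.

```python
class UninitializedSymbolError(RuntimeError):
    """Raised when a function slot was never filled in by the loader."""


class FunctionTable:
    def __init__(self, loaded):
        # name -> callable, or None if the symbol lookup failed or was skipped
        self._symbols = dict(loaded)

    def call(self, name, *args):
        func = self._symbols.get(name)
        if func is None:
            # Analogous in spirit to the assert message in the logcat.
            raise UninitializedSymbolError(f"Uninitialized GL function: {name}")
        return func(*args)


table = FunctionTable({
    "fBindTexture": lambda target, texture: (target, texture),  # stand-in
    "fEGLImageTargetTexture2D": None,  # never loaded
})

print(table.call("fBindTexture", 1, 2))
try:
    table.call("fEGLImageTargetTexture2D", 0, None)
except UninitializedSymbolError as exc:
    print(exc)
```

A context can therefore look healthy until some code path reaches the one slot that was left empty.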
Comment 10•9 years ago
which means it's likely to be https://hg.mozilla.org/mozilla-central/rev/176166c0bae9 (bug 1124394), which reworks GLContext.h and .cpp a bit.
Updated•9 years ago
Flags: needinfo?(jgilbert)
Comment 11•9 years ago
I confirm that backing out https://hg.mozilla.org/mozilla-central/rev/176166c0bae9 fixes the issue.
I backed it out and pushed the backout to inbound in https://hg.mozilla.org/integration/mozilla-inbound/rev/b556a1f684ed I'll merge that backout around before the next b2g nightlies get scheduled in an hour and a half.
Comment 13•9 years ago
In the future, run a DEBUG build if you're hitting runtime asserts. Runtime asserts mean something is seriously messed up. Debug asserts have a good chance of telling us what.
Flags: needinfo?(jgilbert)
The attached logcat has the error. The logcat was attached at the creation of the bug.
Automation requires the engineering build to be used (Marionette).

I think the engineering build has a limited debug output log?
Comment 15•9 years ago
QAWanted to check the next nightly M-C after 4pm when it's available.
Updated•9 years ago
QA Contact: pcheng
Comment 16•9 years ago
(In reply to Naoki Hirata :nhirata (please use needinfo instead of cc) from comment #14)
> The attached logcat has the error. the logcat was attached at the creation
> of the bug.
> Automation requires the engineering build to be used (Marionette).
>
> I think the engineering build has a limited debug output log?

I'm not clear on what an engineering build is. Does it include a full DEBUG build of gecko? That is what we need.
Comment 17•9 years ago
(In reply to Peter Bylenga [:PBylenga] from comment #15)
> QAWanted to check the next nightly M-C after 4pm when it's available.

Verified that issue is fixed on latest nightly M-C after 4pm. Device can be flashed and enters FTU.

Device: Flame (nightly user build, 319MB mem)
BuildID: 20150129160230
Gaia: 8238eeacc7030b2cdbf7ab4eba2f36779b702599
Gecko: 29b05d283b00
Gonk: e7c90613521145db090dd24147afd5ceb5703190
Version: 38.0a1 (3.0 master)
Firmware Version: v18D-1
User Agent: Mozilla/5.0 (Mobile; rv:38.0) Gecko/38.0 Firefox/38.0
Updated•9 years ago
Status: RESOLVED → VERIFIED
Updated•9 years ago
QA Whiteboard: [QAnalyst-Triage?] → [QAnalyst-Triage+]
Flags: needinfo?(ktucker)
Comment 18•9 years ago
(In reply to Jeff Gilbert [:jgilbert] from comment #16)
> (In reply to Naoki Hirata :nhirata (please use needinfo instead of cc) from
> comment #14)
> > The attached logcat has the error. the logcat was attached at the creation
> > of the bug.
> > Automation requires the engineering build to be used (Marionette).
> >
> > I think the engineering build has a limited debug output log?
>
> I'm not clear on what an engineering build is. Does it include a full DEBUG
> build of gecko? That is what we need.

No, we usually don't run DEBUG builds on devices because they are too slow (it's only bearable on high end devices like the N5, but not on a flame). If you really need it I can do a debug build with the patch that was backed out and give you a stack trace.
Comment 19•9 years ago
(In reply to Fabrice Desré [:fabrice] from comment #18)
> (In reply to Jeff Gilbert [:jgilbert] from comment #16)
> > (In reply to Naoki Hirata :nhirata (please use needinfo instead of cc) from
> > comment #14)
> > > The attached logcat has the error. the logcat was attached at the creation
> > > of the bug.
> > > Automation requires the engineering build to be used (Marionette).
> > >
> > > I think the engineering build has a limited debug output log?
> >
> > I'm not clear on what an engineering build is. Does it include a full DEBUG
> > build of gecko? That is what we need.
>
> No, we usually don't run DEBUG builds on devices because they are too slow
> (it's only bearable on high end devices like the N5, but on a flame). If you
> really need it I can do a debug build with the patch that was backed out and
> give you a stack trace.

It's not a stack trace I need, it's the additional asserts. Just because it seems to run properly in opt builds doesn't mean it's not broken under the surface. Ignoring debug builds is a huge hit to the confidence of a clean run.

DEBUG+OPT builds are fine. Getting assert coverage is the important part. I really do understand that working with slow devices sucks, but skipping DEBUG builds completely is risky.

Since we've already backed it out, I'm going to be doing these investigations myself anyway, so there's no need for you to duplicate work here.

I want the results from a DEBUG run because such things are often caused by situations explicitly caught by our sanity asserts. If this case is indeed like that, I would have been able to immediately recommend a fix, instead of having to slosh this in and out of landing again.
(In reply to Jeff Gilbert [:jgilbert] from comment #19)
> (In reply to Fabrice Desré [:fabrice] from comment #18)
> > (In reply to Jeff Gilbert [:jgilbert] from comment #16)
> > > (In reply to Naoki Hirata :nhirata (please use needinfo instead of cc) from
> > > comment #14)
> > > > The attached logcat has the error. the logcat was attached at the creation
> > > > of the bug.
> > > > Automation requires the engineering build to be used (Marionette).
> > > >
> > > > I think the engineering build has a limited debug output log?
> > >
> > > I'm not clear on what an engineering build is. Does it include a full DEBUG
> > > build of gecko? That is what we need.
> >
> > No, we usually don't run DEBUG builds on devices because they are too slow
> > (it's only bearable on high end devices like the N5, but on a flame). If you
> > really need it I can do a debug build with the patch that was backed out and
> > give you a stack trace.
>
> It's not a stack trace I need, it's the additional asserts. Just because it
> seems to run properly in opt builds doesn't mean it's not broken under the
> surface. Ignoring debug builds is a huge hit to the confidence of a clean
> run.
>
> DEBUG+OPT builds are fine. Getting assert coverage is the important part. I
> really do understand that working with slow devices sucks, but skipping
> DEBUG builds completely is risky.
>
> Since we're already backed it out, I'm going to be doing these
> investigations myself anyways, so there's no need for you to duplicate work
> here.
>
> My desire for the results from a DEBUG run are because often these such
> things are caused by situations explicitly caught by our sanity asserts. If
> this case is indeed like this, I would have been able to immediately
> recommend a fix, instead of having to slosh this in and out of landing again.

Thank you for expressing the context of why you need the DEBUG version run.
Based on what you're asking, it seems like a case-by-case basis; QA can't run this all the time because it's pretty costly to run each time.

Automation runs with a partial debug; engineering builds are based off of VARIANT=eng, which has some DEBUG=1... I can't recall what levels and where; it's flagged at build time in the build time script.

Having said that, I saw: https://pvtbuilds.mozilla.org/pvt/mozilla.org/b2gotoro/tinderbox-builds/b2g-inbound-flame-kk-eng-debug/
According to the log, the build script contains B2G_DEBUG=1

When asked, we could probably run these.
Comment 21•9 years ago
Ok, great, thanks for the clarification!
Comment 22•9 years ago
The nightly build 2015-01-29-16-02-30 runs okay, test triggered by Jenkins.
Report: http://mozilla-twqa.github.io/Gaiatest-Reports/2015/01/20150129160230/index.html
Build: https://pvtbuilds.mozilla.org/pvt/mozilla.org/b2gotoro/nightly/mozilla-central-flame-kk/2015/01/2015-01-29-16-02-30/
Comment 23•9 years ago
We do run automated monkey tests against debug builds from the b2g-inbound branch. These picked up the crash, but there's not a lot of information provided. Perhaps we can make these more useful, or add some sanity tests based on debug builds. Where would we see the information on the additional asserts, would they be in the logcat?
Flags: needinfo?(jgilbert)
Comment 24•9 years ago
Moving the smoketest blocker bug to the component where the regression came from.
blocking-b2g: 2.5? → 2.5+
Component: General → Canvas: WebGL
Product: Firefox OS → Core
Updated•8 years ago
Flags: needinfo?(jgilbert)