Closed
Bug 1234898
Opened 8 years ago
Closed 8 years ago
Fullimg mar off of taskcluster fails to build appropriately.
Categories
(Taskcluster :: General, defect)
Tracking
(Not tracked)
RESOLVED
WONTFIX
People
(Reporter: nhirata, Unassigned)
Details
(Keywords: qablocker, Whiteboard: [dogfood-blocker])
Attachments
(3 files)
1. Update device with: https://queue.taskcluster.net/v1/task/OnnzQ-rTRLubS0MFR-ECYA/runs/0/artifacts/private/build/fota-aries-update-fullimg.mar

Expected: device updates and boots
Actual: device doesn't update.

logcat:
W/linker ( 976): could not load library "/system/b2g/libmozglue.so" from LD_PRELOAD for "getprop"; caused by library "/system/b2g/libmozglue.so" not found
E/Diag_Lib( 359): qmi_init: Not initialized, calling process init sequence
E/Diag_Lib( 359): Setting internal use port to rmnet0
W/linker ( 1028): could not load library "/system/b2g/libmozglue.so" from LD_PRELOAD for "getprop"; caused by library "/system/b2g/libmozglue.so" not found
D/charger_monitor( 506): init target 500000
W/linker ( 1035): could not load library "/system/b2g/libmozglue.so" from LD_PRELOAD for "getprop"; caused by library "/system/b2g/libmozglue.so" not found

Note: works fine when building locally. There's something wrong with the taskcluster image or build system.
Reporter
Comment 1•8 years ago
Un-mar-ing the mar file and sideloading it fails to install.
Reporter
Updated•8 years ago
Whiteboard: [dogfood-blocker]
Reporter
Comment 2•8 years ago
Alexandre Lissy and I did a preliminary investigation, and the issue lies in the Taskcluster builds.
Reporter
Comment 3•8 years ago
Note: the full zip file is fine when flashing the device, and the OTA mar file is fine. Something is going wrong with the zip process for fota-aries-update-fullimg.mar. Alexandre told me that the update.zip inside the mar is based on the files for aries.zip. The mar does unpack without error, so I think the main problem lies in the zipping of the update.zip file on taskcluster. It works fine locally.
Flags: needinfo?(jopsen)
Reporter
Comment 4•8 years ago
What's even more odd is that today's build: https://tools.taskcluster.net/task-inspector/#W4AHfVjAT0WHdZied1KckA/0 does unmar and sideload. My guess is that there is some sort of race condition that the build hits every now and then. I'm not sure I understand the difference yet.
Reporter
Comment 5•8 years ago
[ to note, we have a dogfood blocker in today's build so we can't use it ]
Reporter
Comment 6•8 years ago
I think there's a possibility that the cache might be corrupted. I built without cache on taskcluster and it seemed to work fine as well. I'm not sure how to expose this.
Reporter
Comment 7•8 years ago
Note that this logcat output can also appear when the update server is misconfigured for a FOTA build. Having said this, I double-checked and the issue still occurs with this build. https://tools.taskcluster.net/task-inspector/#HMZj1bt9Sz2y5WqaQmzYow/1 was kicked off with build.sh -j1 for troubleshooting.
Reporter
Comment 8•8 years ago
A build made using mach shows this issue as well: https://tools.taskcluster.net/task-inspector/#TakfjYt8RtizBxrjKzdkVw/
Made a new dogfood build using j1 to see if this works: https://tools.taskcluster.net/task-inspector/#CHzIWbKVTSmtRkfJ9cjXKA/
Reporter
Comment 9•8 years ago
Even with j1 the update.zip fails. The failure seems to occur in the packaging of the zip file, as the mar file unpacks just fine.
Reporter
Comment 10•8 years ago
I added a sleep 10 in two places and it seems to work for one try. Need to do further testing to see if this resolves the issue.
Comment 11•8 years ago
> I added a sleep 10 in two places and it seems to work for one try.
> Need to do further testing to see if this resolves the issue.
Sounds like a race condition in a build system I'm completely unfamiliar with.
Flags: needinfo?(jopsen)
Comment 12•8 years ago
May I suggest we also try adding a sha1sum call on the resulting zip and MAR files, so we can compare the results in the end?
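For anyone wanting to run the same comparison outside the build, here is a minimal sketch in Python. It is only an illustration, not what the build scripts do: the function names (sha1_of, compare_artifacts) and the two directory arguments are hypothetical, and it assumes a known-good and a failing set of artifacts have been downloaded locally under the same file names.

```python
import hashlib
from pathlib import Path


def sha1_of(path, chunk_size=1 << 20):
    """Return the hex SHA-1 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_artifacts(good_dir, bad_dir,
                      names=("update.zip", "fota-aries-update-fullimg.mar")):
    """Map each artifact name to (good digest, bad digest, identical?).

    `good_dir` and `bad_dir` are hypothetical local download directories.
    """
    report = {}
    for name in names:
        good = sha1_of(Path(good_dir) / name)
        bad = sha1_of(Path(bad_dir) / name)
        report[name] = (good, bad, good == bad)
    return report
```

If the digests of the MAR match but the device still ends up with a broken update.zip, that would point away from the build and toward the download or extraction step.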
Comment 13•8 years ago
(In reply to Naoki Hirata :nhirata (please use needinfo instead of cc) from comment #8)
> A build made using mach shows this issue as well :
> https://tools.taskcluster.net/task-inspector/#TakfjYt8RtizBxrjKzdkVw/
>
> Made a new dogfood build using j1 to see if this works:
> https://tools.taskcluster.net/task-inspector/#CHzIWbKVTSmtRkfJ9cjXKA/
I took the fota-aries-update-fullimg.mar from your first link, ran mar.py -x -j on it, and got a readable update.zip. Then I ran adb sideload on it. And then, well, it works.
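Whether an extracted update.zip is "readable" can also be checked programmatically before sideloading. A minimal sketch using Python's standard zipfile module follows; check_update_zip is a hypothetical helper name, and the path is wherever the extracted file ended up. It only verifies the zip structure and per-member CRCs, not whether recovery will accept the package.

```python
import zipfile


def check_update_zip(path):
    """Return (ok, detail) for the archive at `path` (hypothetical helper)."""
    if not zipfile.is_zipfile(path):
        return False, "not a zip archive (no end-of-central-directory record found)"
    try:
        with zipfile.ZipFile(path) as archive:
            # testzip() reads every member and returns the first one whose
            # CRC or header is bad, or None if all members are fine.
            bad_member = archive.testzip()
    except zipfile.BadZipFile as exc:
        return False, f"corrupt archive: {exc}"
    if bad_member is not None:
        return False, f"bad member {bad_member!r}"
    return True, "all members passed the CRC check"
```

Running this on both a failing and a working update.zip would at least show whether the breakage is visible at the zip level.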
Comment 14•8 years ago
Put another way, I am starting to doubt there is a bug in the build system itself. On the other hand, I got at least one report of a failing update, from Julien. He investigated and found that the update.zip on the device was broken. Then he cleaned up, redownloaded the MAR, and it worked. I also got a report from Ben, but we could not get logs to confirm. However, I am starting to be convinced he might have hit the same condition, since after three retries it applied properly.
Flags: needinfo?(felash)
Flags: needinfo?(bfrancis)
Comment 15•8 years ago
Ok, adb sideload works, it starts to flash partition and then it fails with: "E:Error in /tmp/update.zip (Status 0)" during flashing of system partition.
Comment 16•8 years ago
(In reply to Alexandre LISSY :gerard-majax from comment #15)
> Ok, adb sideload works, it starts to flash partition and then it fails with:
> "E:Error in /tmp/update.zip (Status 0)" during flashing of system partition.
Now that is fun, because the device does boot with the new system, but I am missing the homescreen? Could it be a resurgence of the gaia profile building race condition?
Comment 17•8 years ago
Homescreen is running but empty
Comment 18•8 years ago
(In reply to Alexandre LISSY :gerard-majax from comment #17)
> Homescreen is running but empty
That might be because of going back and forth between new and old builds:
> 01-09 15:14:40.489 1147 1147 E Default Home Screen: [JavaScript Error: "NS_ERROR_STORAGE_CONSTRAINT: " {file: "app://homescreen.gaiamobile.org/gaia_build_defer_index.js" line: 800}]
Comment 19•8 years ago
Sorry, I don't have any more logs. I also no longer have the failed update.log (I guess it came from applying update.mar), but I remember there was something about an error while building the zip.
Flags: needinfo?(felash)
Comment 20•8 years ago
Maybe the race is triggered by the multiple forced dependencies we have in gonk-misc/Android.mk against system.img, which trigger a rebuild of gecko and gaia? These were added in the early days to be extra cautious.
Comment 21•8 years ago
Naoki, that would remove all dependency against system.img avoiding rebuilding that partition and Gecko/Gaia. Maybe this is the root cause of that race?
Flags: needinfo?(nhirata.bugzilla)
Comment 22•8 years ago
I doubt this is the issue I had: likely I downloaded the same .mar in both my failed and successful attempts, only how it was extracted differed... That means this isn't a build issue IMO.
Reporter
Comment 23•8 years ago
I hit the same issues as you when dealing with this: sideloading a file manually un-mar-ed with -x -j would finish with an error at the end and show the error banner. It doesn't seem like anything in the install actually failed; at the same time, an upgrade that isn't clean will freak end users out and is a bad user experience. It's almost as if a misplaced EOF or something similar after zipping or marring is causing the error to be thrown. That's just a guess. I'm not sure why the sleeps seem to help...
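The "misplaced EOF" guess could be tested by looking at what follows the zip end-of-central-directory (EOCD) record. The sketch below is an assumption-laden illustration, not something taken from the build scripts: trailing_bytes_after_eocd is a hypothetical helper, it reads the whole file into memory, it ignores ZIP64, and it trusts the last "PK\x05\x06" signature it finds.

```python
import struct

EOCD_SIGNATURE = b"PK\x05\x06"
EOCD_FIXED_SIZE = 22  # size of the fixed part of the EOCD record


def trailing_bytes_after_eocd(path):
    """Return the number of bytes after the EOCD record, or None if none is found.

    0 means the archive ends exactly where the EOCD says it should; a positive
    value means extra data was appended; a negative value means the archive
    comment is cut short.
    """
    with open(path, "rb") as fh:
        data = fh.read()
    offset = data.rfind(EOCD_SIGNATURE)
    if offset < 0 or offset + EOCD_FIXED_SIZE > len(data):
        return None
    # The last two bytes of the fixed record hold the archive comment length.
    (comment_length,) = struct.unpack_from("<H", data, offset + 20)
    expected_end = offset + EOCD_FIXED_SIZE + comment_length
    return len(data) - expected_end
```

A non-zero result on a failing update.zip and zero on a working one would support the guess; identical results on both would suggest looking elsewhere.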
Flags: needinfo?(nhirata.bugzilla)
Updated•8 years ago
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → WONTFIX
Updated•7 years ago
Flags: needinfo?(bfrancis)