Closed Bug 1161734 Opened 9 years ago Closed 9 years ago

Blank and flashing thumbnails in Gallery app

Categories

(Core :: Graphics: ImageLib, defect)

ARM
Gonk (Firefox OS)
defect
Not set
normal

Tracking

()

RESOLVED WORKSFORME
Tracking Status
b2g-v2.0 --- unaffected
b2g-v2.1 --- unaffected
b2g-v2.2 --- affected
b2g-master --- affected

People

(Reporter: ntroast, Assigned: seth)

References

()

Details

(Keywords: regression, Whiteboard: [caf priority: p2][CR 830488])

Attachments

(5 files)

Attached video IMG_0160.mp4
STR:
Restrict total memory via |fastboot oem mem 320m|
Load ~100 images on the device
Open Gallery app and scroll

Video of the symptom described is attached.
Note that this reproduces with either MDP or GPU composition.

Gecko: 302f6a4afaa718c116fed3c6cbc61212242a002a
Gaia: 265ca0bc9408c21fc4b25a259fcee7fb642cd06b
Recent regression?
Flags: needinfo?(seth)
I can reproduce on master with yesterday's nightly, on a 319mb flame.  I'm using a mix of photos and test images, so there is nothing special here about the test images shown in the video.
QA Contact: ktucker
Whiteboard: [CR 833577]
Whiteboard: [CR 833577] → [caf priority: p2][CR 833577]
To be more clear: I have reproduce the blank thumbnails on master, not the flickering thumbnails. 

Sometimes I can reproduce just by scrolling quickly to the end of my images. When I do this, sometimes the last four thumbnails are blank gray.  But this doesn't seem to happen often.

What does make this reproduce easily is tapping on one of the thumbnails, then tapping the back button, then scrolling. I almost always see gray and/or black squares when I do that.
This sounds like a low-memory issue.

I would expect that bug 1148696 would take care of the flashing, and indeed from what David is saying it sounds like there's no flashing on master. We should probably uplift bug 1148696 to b2g 2.2.

I'm not sure why we're seeing black/gray thumbnails. I'd expect that to happen in a situation where we are so low-memory that we decide to not even try to decode the image. However, I definitely would not expect that to be the case here, particularly not on the 2.2 branch, which doesn't have image locking enabled right now. That should only happen when we're on the edge of OOM.

I'll put this on my todo-list to investigate, but unfortunately I can't get to it right away. Also, it'd be a great help to have a regression range.
What I *can* do right away is request uplift for bug 1148696. I'll do that now.
This issue occurs on the Flame 3.0 and 2.2

Thumbnails appear blank and flash.

Environmental Variables:
Device: Flame 3.0 (Full Flash)(KK)(319mb)
BuildID: 20150506010204
Gaia: 3e6fd1e0a478af2c95d09ce95c2c6de2de2fec14
Gecko: ba44099cbd07
Gonk: a9f3f8fb8b0844724de32426b7bcc4e6dc4fa2ed
Version: 40.0a1 (3.0) 
Firmware Version: v18D-1
User Agent: Mozilla/5.0 (Mobile; rv:40.0) Gecko/40.0 Firefox/40.0

Environmental Variables:
Device: Flame 2.2(Full Flash)(KK)(319mb)
BuildID: 20150506002501
Gaia: 772a9491909abd02dc67278dd453746e2dd358a8
Gecko: 3af6a0a79227
Gonk: ab265fb203390c70b8f2a054f38cf4b2f2dad70a
Version: 37.0 (2.2) 
Firmware Version: v18D-1
User Agent: Mozilla/5.0 (Mobile; rv:37.0) Gecko/37.0 Firefox/37.0

------------------------------

This issue does not occur on the Flame 2.1 and 2.0

Thumbnails always load and do not flicker.

Environmental Variables:
Device: Flame 2.1 (Full Flash)(KK)(319mb)
BuildID: 20150506001242
Gaia: b4a03b7ee61de5a479b3cf0916f47e91a43b0f50
Gecko: 4493015380ab
Gonk: ab265fb203390c70b8f2a054f38cf4b2f2dad70a
Version: 34.0 (2.1) 
Firmware Version: v18D-1
User Agent: Mozilla/5.0 (Mobile; rv:34.0) Gecko/34.0 Firefox/34.0

Environmental Variables:
Device: Flame 2.0 (Full Flash)(KK)(319mb)
BuildID: 20150506000201
Gaia: 84898cadf28b1a1fcd03b726cff658de470282f0
Gecko: e67ed29e8ad0
Gonk: ab265fb203390c70b8f2a054f38cf4b2f2dad70a
Version: 32.0 (2.0) 
Firmware Version: v18D-1
User Agent: Mozilla/5.0 (Mobile; rv:32.0) Gecko/32.0 Firefox/32.0
This issue was written here bug 1145284. It was verified as fixed on master but never uplifted to 2.2.
Here is a video of what I am seeing on master http://youtu.be/Awyi5aJIgWQ
This is also easy to reproduce on 2.2. I see the blank squares and flashing. Sometimes a thumbnail will be displayed and then will go blank and flashing will begin. Very deja-vu. We've had flickering issues before when memory was low and this looks a lot like that.

This is pretty clearly something we should block on.
blocking-b2g: 2.2? → 2.2+
Whiteboard: [caf priority: p2][CR 833577] → [caf priority: p2][CR 830488]
(In reply to KTucker [:KTucker] from comment #8)
> Here is a video of what I am seeing on master http://youtu.be/Awyi5aJIgWQ

This video is private. Can you make it public?
(In reply to KTucker [:KTucker] from comment #7)
> This issue was written here bug 1145284. It was verified as fixed on master
> but never uplifted to 2.2.

There must be something else going on here, since people are reporting seeing this on master.
(In reply to Seth Fowler [:seth] from comment #11)
> (In reply to KTucker [:KTucker] from comment #7)
> > This issue was written here bug 1145284. It was verified as fixed on master
> > but never uplifted to 2.2.
> 
> There must be something else going on here, since people are reporting
> seeing this on master.

On master we've got memory regressions based on the fact that the Nuwa process isn't working right.  Just launching gallery then pressing the home button is often enough to cause gallery to OOM (with a 319mb flame) so that could be a confounding factor on master.
In addition to the sometimes flickering or missing thumbnails, I also see flickering images (and very occasional missing images) when I tap on a thumbnail and flick through the images. The image is loaded and it looks like it gets decoded as second time or something.  Could that be related?
(In reply to David Flanagan [:djf] from comment #12)
> On master we've got memory regressions based on the fact that the Nuwa
> process isn't working right.  Just launching gallery then pressing the home
> button is often enough to cause gallery to OOM (with a 319mb flame) so that
> could be a confounding factor on master.

Oh whoa! That could indeed be a confounding factor. What's the bug number?
Sorry about that Seth. You should be able to view the video now http://youtu.be/Awyi5aJIgWQ

Seems to me that we have multiple issues going on here. I flashed to the reported regression window on bug 1145284 and notice that the last working does not seem to have the flickering issue but does have missing thumbnails...

At this point should we look for a regression window for just the missing thumbnails?
(In reply to KTucker [:KTucker] from comment #15)
> At this point should we look for a regression window for just the missing
> thumbnails?

I'm not sure, but I think they're two different issues, and probably bug 1145284 fixed the flickering. So yeah, it'd be good to figure out a regression window for just the missing thumbnails.
(In reply to David Flanagan [:djf] from comment #13)
> In addition to the sometimes flickering or missing thumbnails, I also see
> flickering images (and very occasional missing images) when I tap on a
> thumbnail and flick through the images. The image is loaded and it looks
> like it gets decoded as second time or something.  Could that be related?

Possibly. Are you seeing this on 2.2, or on master? I ask because I suspect the double-decode is pretty much the same thing as flickering, and that's something that bug 1145284 should've fixed.
Seth:

The Nuwa process issue on master is bug 1151672. The fact that it makes it really easy to OOM is bug 1161189.

The double-decode when swiping between images was on 2.2. I don't think it is on master, but I'll let you know if I see it there.
I've edited my prefs.js file to add:

  user_pref("image.mem.allow_locking_in_content_processes", true);

which is (I think) the same as uplifting 1148696.

It seems to fix the flickering problem, but does not fix the missing thumbnail problem.

And, it makes bug 1163102 manifest in 2.2.
Allowing locking also seems to fix the more subtle flicker of full-size images I described in comment #13

But it seems to make the missing thumbnail problem worse, and I've double checked that bug 1163102 occurs when the pref is true and does not occur when the pref is false.
One more thing I've noticed. If you reboot or restart b2g and quickly launch the gallery, sometimes the Nuwa process is not ready yet and gallery is launched as a direct child of the B2G process instead of Nuwa. This happens all the time on master, and is the subject of bug 1151672. But it can also happen on 2.2 if you launch quickly after a reboot.

I gather that when Nuwa is not ready, apps use more memory than they should. But I have verified that this bug occurs in both cases. It is not specific to the Nuwa-related memory regression.
As far as I know bug 1163102 was fixed by bug 1162282.

Are you sure that the missing thumbnail problem is not the same as bug 1163102? Is it possible we're just throwing when trying to draw those thumbnails?
Attached file about-memory-2.zip
At Seth's request, I'm attaching an about-memory report from a 319mb Flame running today's m-c nightly build. The report was generated while the gallery app was displaying a bunch of gray boxes where thumbnails should have been, and immediately after I took a screenshot of those gray boxes and had drawImage() fail (bug 1163102)
Component: Gaia::Gallery → ImageLib
Product: Firefox OS → Core
OK, the problem here is definitely that we're locking too many images, which is getting us close to OOM. I'm going to investigate; will report back.
Depends on: 1163187
Attached file about-memory-0.zip
This second about:memory attachment is also taken right after a failed drawImage() for a screenshot, but with a modified gallery that does not have the -moz-crisp-edges CSS that is being removed in bug 1163187.
I'm guessing the new about:memory report was captured while looking at a full-size image? In this one, all of the thumbnails are unlocked, and almost none of them have surfaces. However, 11 full-size images are locked. Many of them have two surfaces, one at the intrinsic size and one downscaled version.

It's unclear to me why the 11 full-size images are locked. These are CSS background images, and thus they stay locked as long as the associated style sticks around, but I'd think we'd drop that style as soon as we drop the element. Probably it doesn't go away until the GC runs, though.
The new about:memory report was captured while looking at a bunch of gray boxes where thumbnails should have been.  I got the gallery app into that state by looking at and panning through some (I forget how many, but it could have been 11) full-size images.

I don't think I ever zoomed in on any of the images, so I think there would be only a single background-image for each.

The Gallery app creates three MediaFrame objects, and the MediaFrame reuses a single element for displaying images. It just changes the background-image property of that element. So there wouldn't be 11 elements waiting for garbage collection.  But there could be three elements each of which had about 4 different background-image values.
I filed bug 1163215 to modify MediaFrame so that Gallery is using <img> elements instead of background-image for the full-size images it displays. Possibly that will help here.
https://bugzilla.mozilla.org/show_bug.cgi?id=1163225#c5

It seems that unrendered grey thumbnail shows icons on top left corner when tapped to display its image, and then exits to the list view by tapping the back button.  Repeating this causes most of the unrendered grey thumbnails to show the icon.  But there seem to be a handful of grey thumbnails which do not display the icon, and when those thumbnails are tapped, it does not show the proper picture - it just shows the grey screen as well.
I can see the broken image thing that No-Jun reports. It did not occur with Friday's nightly build, but it does occur with today's nightly build.

Another thing that is new in today's nightly is that when swiping between full size images, when we get a blank screen that does not display the image it is supposed to display, I sometimes (but I think not always) see a message like this:

E/Gallery ( 4445): [JavaScript Error: "Image corrupt or truncated: blob:app://gallery.gaiamobile.org/77182b92-4ad9-4358-ba53-96c7e35124be" {file: "blob:app://gallery.gaiamobile.org/77182b92-4ad9-4358-ba53-96c7e35124be" line: 0}]

I wonder if Seth's fix for 1162282 could have anything to do with this...

To be clear, I don't think that this new behavior is any worse than the old behavior. Just noting that something has changed.
Bug 1162282 is definitely unrelated to the change in behavior with today's nightly build. That's caused by bug 1124084 landing, combined with the backout of decode-on-draw-only on B2G. I'm doing my best to fix the issues as quickly as I can. I'll back DDD out again on B2G if I can't fix them fairly quickly.
Flags: needinfo?(seth)
Attached image Video app thumbnails
Video app shows this issue too when loaded with 200+ clips. One thing to note is that when clips with black thumbnail are tapped, and go back to the list view, the thumbnail begins to appear, as seen from the screenshot.
Had the same results as Comment 34 for video app.  Used 216 small videos.  Noticed the first time scrolling they'd all appear but after additional scroll events more would disappear.

Environmental Variables:
Device: Flame 3.0 KK 319mb
BuildID: 20150511010202
Gaia: 6089234ace8b294a8feef064387604bae16254e3
Gecko: d8420a541d1c
Gonk: 040bb1e9ac8a5b6dd756fdd696aa37a8868b5c67
Version: 40.0a1 (3.0) 
Firmware Version: v18D-1
User Agent: Mozilla/5.0 (Mobile; rv:40.0) Gecko/40.0 Firefox/40.0
Thanks Peter. I can triple-confirm the issue with thumbnails in the video app.

The video app uses background-image for thumbnails, but this hasn't really been a problem before, and I'd rather not have to change to <img> for 2.2.  (I'm just assuming that this will reproduce on 2.2, just as the gallery issue does.)

Seth: do you want a separate bug for the video app issue, or shall we treat it as part of this one?
Flags: needinfo?(seth)
Seth's actively working on these, but is on PTO today. Please ping me directly if we need to resolve anything before Monday.
Seth, please assign this to yourself if you are actively looking into this. Thanks much!
(In reply to David Flanagan [:djf] from comment #36)
> The video app uses background-image for thumbnails, but this hasn't really
> been a problem before, and I'd rather not have to change to <img> for 2.2. 
> (I'm just assuming that this will reproduce on 2.2, just as the gallery
> issue does.)

Is it terribly difficult to switch to <img> for 2.2? Because a "good" fix for background-image probably cannot be uplifted to 2.2.

What we *could* do is disable image locking just for background-image on 2.2. I'm open to that, but I still think we'd be better off updating the video app to use <img> in that case.

> Seth: do you want a separate bug for the video app issue, or shall we treat
> it as part of this one?

It probably needs a separate bug.

I'm going to create some tracking bugs for the underlying technical issues in a minute, because there are really only a couple of technical issues here, but more bugs are being filed for the *symptoms* than I can easily keep track of.
Flags: needinfo?(seth)
Assignee: nobody → seth
Depends on: 1166134
Depends on: 1166136
OK, bug 1166134 and bug 1166136 cover the underlying issues that are causing us to use too much memory in the Gallery app (and apparently the Video app, though I haven't personally investigated that much).

I expect bug 1166134 to be more-or-less fixed once the existing blocking bugs get r+'d and landed.

Bug 1166136 needs a more detailed investigation.

Once everything is fixed on master, we'll need to decide how to solve the problem on 2.2. (Obviously we can just uplift everything, but if we don't want to do that, we may need some 2.2-specific hacks.)
Setting ni? on :djf for the question in comment #39.
Flags: needinfo?(dflanagan)
It is probably not too hard to switch the Video app to use <img> instead of background-image for its thumbnails. I just don't think that anyone has touched that code in a long time, so no one who currently works on Video is familiar with it.

If we need to, we'll update the Video app.

Given all the other bug fixes that have landed, I think we should check again to see if there is a problem with Video before we do anything.

Peter: could you check whether the Video app symptom you reported in comment 35 still occurs on master or on 2.2?  And if it does, would you file a bug about that and assign it to me, please?
Flags: needinfo?(dflanagan) → needinfo?(pbylenga)
Or maybe KTucker would be better to ask, since they're the assigned QA person for this bug...
Flags: needinfo?(ktucker)
I have over 200 videos on my device and no flashing thumbnails were observed in the video app on the latest Flame 3.0 or 2.2.

However, there are randomly missing thumbnails in the video app as stated in Comment 35 so I will write up a separate issue on this and need info you David.

Device: Flame 3.0 (Full Flash)(KK)(319mb)
BuildID: 20150521010203
Gaia: 5a7f87b1505ba89b586372cbbbe9507d1016c40c
Gecko: b9424d63fe35
Gonk: 040bb1e9ac8a5b6dd756fdd696aa37a8868b5c67
Version: 41.0a1 (3.0) 
Firmware Version: v18D-1
User Agent: Mozilla/5.0 (Mobile; rv:41.0) Gecko/41.0 Firefox/41.0 

Environmental Variables:
Device: Flame 2.2(Full Flash)(KK)(319mb)
BuildID: 20150521002508
Gaia: bc42fbc12d622bffd7e8afcb8d56f8a1d9773c60
Gecko: 6e4eaf59efda
Gonk: bd9cb3af2a0354577a6903917bc826489050b40d
Version: 37.0 (2.2) 
Firmware Version: v18D-1
User Agent: Mozilla/5.0 (Mobile; rv:37.0) Gecko/37.0 Firefox/37.0
Flags: needinfo?(ktucker)
Kevin took care of the NI on me, NI me again if anything further is required.
Flags: needinfo?(pbylenga)
Hi Seth,
Should we close this one and track on new bug Kevin is going to open?
Thanks.
Flags: needinfo?(seth)
(In reply to Josh Cheng [:josh] from comment #46)
> Hi Seth,
> Should we close this one and track on new bug Kevin is going to open?
> Thanks.

Yeah, I think that's a good idea. There are too many Gallery bugs currently open, and we ended up uplifting the fix for this one to 2.2 in a different bug, so let's go ahead and resolve this.
Status: NEW → RESOLVED
Closed: 9 years ago
Flags: needinfo?(seth)
Resolution: --- → FIXED
Hi Seth,
I changed the resolution to WORKFORME as the current behavior is not the same as initial reported.
blocking-b2g: 2.2+ → ---
Resolution: FIXED → WORKSFORME
Clearing tag on resolved issue
Flags: needinfo?(ktucker)
QA Whiteboard: [QAnalyst-Triage+]
Flags: needinfo?(ktucker)
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: