Closed Bug 1455597 Opened 6 years ago Closed 6 years ago

Crash: Didn't find a cached resource with that ID!

Tracking

Status:

RESOLVED FIXED

Milestone:

mozilla62

Tracking Flags:

Flag            Tracking   Status
firefox-esr52   ---        unaffected
firefox-esr60   ---        unaffected
firefox59       ---        unaffected
firefox60       ---        unaffected
firefox61       ---        disabled
firefox62       ---        disabled

People

(Reporter: jan, Assigned: kats)

References

(Blocks 1 open bug)

Details

(Keywords: crash, nightly-community)

Crash Data

Attachments

(1 file)

Bug 1455597 - Flush the transaction to remove the pipeline before shutting down the WebRenderAPI. 6 years ago Kartikaya Gupta (email:kats@mozilla.staktrace.com) 59 bytes, text/x-review-board-request	sotaro : review+	Details

Darkspirit

Reporter

Description

•

6 years ago

Seen on Socorro. The fix for bug 1446588 didn't stop crashes with this reason.

[@ static void core::option::expect_failed | static union core::result::Result<T> webrender::resource_cache::ResourceCache::get_cached_image ]
bp-a5506d84-c971-4af5-aa1e-4c7ed0180420 20180418230818 Win10
> Didn't find a cached resource with that ID!

[@ static void std::panicking::rust_panic_with_hook ]
bp-c51d97f8-b0da-4554-80e7-ea5640180418 20180417100054 Win10
> Didn't find a cached resource with that ID!

Milan Sreckovic [:milan] (needinfo for best results)

Updated

•

6 years ago

Blocks: stage-wr-trains

Priority: -- → P2

Darkspirit

Reporter

Updated

•

6 years ago

Crash Signature: [@ static void core::option::expect_failed | static union core::result::Result<T> webrender::resource_cache::ResourceCache::get_cached_image ] [@ static void std::panicking::rust_panic_with_hook ] → [@ static void core::option::expect_failed | static union core::result::Result<T> webrender::resource_cache::ResourceCache::get_cached_image ] [@ static void std::panicking::rust_panic_with_hook ] [@ mozalloc_abort | abort | panic_abort::__rust_start_pa…

OS: Windows 10 → All

Darkspirit

Reporter

Updated

•

6 years ago

Crash Signature: panic_abort::__rust_start_panic::abort::h091e61b1e9ef8f82 | panic_abort::__rust_start_panic | core::option::expect_failed::h665835dead85fc51 | webrender::resource_cache::ResourceCache::get_cached_image::hfd1d2328a8cb4c77 ] → panic_abort::__rust_start_panic::abort::h091e61b1e9ef8f82 | panic_abort::__rust_start_panic | core::option::expect_failed::h665835dead85fc51 | webrender::resource_cache::ResourceCache::get_cached_image::hfd1d2328a8cb4c77 ] [@ mozalloc_abort | abort | co…

Darkspirit

Reporter

Updated

•

6 years ago

Crash Signature: core::option::expect_failed::h7f635057bfba806a | webrender::resource_cache::ResourceCache::get_cached_image::hffd646c4c32aa8bc ] → core::option::expect_failed::h7f635057bfba806a | webrender::resource_cache::ResourceCache::get_cached_image::hffd646c4c32aa8bc ] [@ mozalloc_abort | abort | core::option::expect_failed::h7f635057bfba806a | webrender::resource_cache::ResourceCache::get_c…

status-firefox62: --- → disabled

Darkspirit

Reporter

Updated

•

6 years ago

Crash Signature: webrender::resource_cache::ResourceCache::get_cached_image::h6a69122e591884d8 ] [@ mozalloc_abort | abort | core::option::expect_failed::h7f635057bfba806a | webrender::resource_cache::ResourceCache::get_cached_image::hf33258fd364ddd14 ] → webrender::resource_cache::ResourceCache::get_cached_image::h6a69122e591884d8 ] [@ mozalloc_abort | abort | core::option::expect_failed::h7f635057bfba806a | webrender::resource_cache::ResourceCache::get_cached_image::hf33258fd364ddd14 ] [@ mozalloc_abo…

Kartikaya Gupta (email:kats@mozilla.staktrace.com)

Assignee

Comment 1

•

6 years ago

I'm seeing this crash pretty much every time I shut down or update WR-enabled nightly on macOS.

Kartikaya Gupta (email:kats@mozilla.staktrace.com)

Assignee

Comment 2

•

6 years ago

This is the #2 WR-specific topcrash. We should address this before turning on WR in nightly. I seem to be able to reproduce it so I'll take a look.

Assignee: nobody → bugmail

Blocks: stage-wr-nightly
No longer blocks: stage-wr-trains

Kartikaya Gupta (email:kats@mozilla.staktrace.com)

Assignee

Comment 3

•

6 years ago

This might actually be #1 after combining the signatures.

Darkspirit

Reporter

Updated

•

6 years ago

Crash Signature: webrender::resource_cache::ResourceCache::get_cached_image::h8b37fbc8303bff09 ] → webrender::resource_cache::ResourceCache::get_cached_image::h8b37fbc8303bff09 ] [@ mozalloc_abort | abort | core::option::expect_failed::hd528a7bd2cab115d | webrender::resource_cache::ResourceCache::get_cached_image::h4a711d8897b20ee7 ]

:gerard-majax

Comment 4

•

6 years ago

I have seen this crash 5 times within the last 60 minutes on Linux (Ubuntu 18.04) with an Intel GPU, even when I just step away from my laptop.

:gerard-majax

Comment 5

•

6 years ago

And just to be clear, this started just after I upgraded from 20180530220105 to 20180531101452.

:gerard-majax

Comment 6

•

6 years ago

I'm wondering if there's any link with those kernel messages:
> Jun  1 11:39:44 portable-alex kernel: [1329561.799780] [drm] Reducing the compressed framebuffer size. This may lead to less power savings than a non-reduced-size. Try to increase stolen memory size if available in BIOS.
> Jun  1 11:40:03 portable-alex kernel: [1329581.395141] [drm] Reducing the compressed framebuffer size. This may lead to less power savings than a non-reduced-size. Try to increase stolen memory size if available in BIOS.

At least, it looks like each time there's one, I get a crash very soon after (< 60s). I've already set the maximum video memory in the BIOS (512MB).

Kartikaya Gupta (email:kats@mozilla.staktrace.com)

Assignee

Comment 7

•

6 years ago

I haven't been able to reproduce it on a local build yet, so I haven't been able to confirm. But my theory is this: when a tab is closed, we run through WRBP::ClearResources. This sends a transaction to clear out the display list and remove the pipeline at [1]. This transaction gets sent to the scene builder thread where it may take a bit of time.

Meanwhile, a few lines down, at [2], we release the WebRenderAPI clone for that tab, which should trigger the ClearNamespace at [3]. This happens in a message sent directly to the RenderBackend thread, so it can jump ahead of the DL/pipeline removal and put WR into a state where it still has the DL/pipeline but has already cleared all the cached images. If it tries to render in that state (a render might already be in flight), it might hit this error.

[1] https://searchfox.org/mozilla-central/rev/38bcf897f1fa19c1eba441a611cf309482e0d6e5/gfx/layers/wr/WebRenderBridgeParent.cpp#1607
[2] https://searchfox.org/mozilla-central/rev/38bcf897f1fa19c1eba441a611cf309482e0d6e5/gfx/layers/wr/WebRenderBridgeParent.cpp#1621
[3] https://searchfox.org/mozilla-central/rev/38bcf897f1fa19c1eba441a611cf309482e0d6e5/gfx/webrender_api/src/api.rs#988
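
Editorial sketch of the ordering problem described in this comment. It is a toy model in Python, not Gecko or WebRender code: the two queues, the message names, and the `simulate` function are hypothetical stand-ins, and the bad interleaving is fixed by hand rather than left to thread timing. The `flush_before_release` flag models the idea in the attached patch title, assuming that "flush" means waiting until the scene builder has forwarded the pipeline removal.

```python
from collections import deque


def simulate(flush_before_release):
    """Replay a tab close and report what an in-flight render would see."""
    # Hypothetical state: one pipeline whose display list references one image.
    pipelines = {"tab": ["img-1"]}
    image_cache = {"img-1": b"pixels"}

    scene_builder = deque()  # slow path: transactions wait here first
    backend = deque()        # messages the render backend handles in order

    def drain_scene_builder():
        # The scene builder hands its finished transactions to the backend.
        while scene_builder:
            backend.append(scene_builder.popleft())

    # ClearResources: the pipeline removal travels via the scene builder.
    scene_builder.append("remove_pipeline")
    if flush_before_release:
        drain_scene_builder()

    # Releasing the API clone: this message goes straight to the backend.
    backend.append("clear_namespace")
    # A render that was already in flight.
    backend.append("render")
    # Without a flush, the scene builder only catches up now.
    drain_scene_builder()

    for message in backend:
        if message == "remove_pipeline":
            pipelines.pop("tab", None)
        elif message == "clear_namespace":
            image_cache.clear()
        elif message == "render":
            for image_keys in pipelines.values():
                for key in image_keys:
                    if key not in image_cache:
                        return "panic: Didn't find a cached resource with that ID!"
    return "ok"


if __name__ == "__main__":
    print(simulate(flush_before_release=False))
    print(simulate(flush_before_release=True))
```

In this model the unflushed run renders a pipeline whose images are already gone, while the flushed run removes the pipeline before the namespace is cleared, so the render has nothing stale to look up.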

Kartikaya Gupta (email:kats@mozilla.staktrace.com)

Assignee

Comment 8

•

6 years ago

I made a try push with logging [1] and I was able to reproduce on that. Indeed, it seems like the namespace is being cleared and then we request the image after that. Will try to put together a fix.

[1] https://treeherder.mozilla.org/#/jobs?repo=try&revision=0919751458bc2db546a9a3dc2aac55143e7d23a2

Kartikaya Gupta (email:kats@mozilla.staktrace.com)

Assignee

Comment 9

•

6 years ago

This might fix it: https://treeherder.mozilla.org/#/jobs?repo=try&revision=fd2323486caab76edd458c4b46c444f551d819c5

Comment hidden (mozreview-request)

Review commit: https://reviewboard.mozilla.org/r/248570/diff/#index_header
See other reviews: https://reviewboard.mozilla.org/r/248570/

:gerard-majax

Comment 11

•

6 years ago

(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #9)
> This might fix it:
> https://treeherder.mozilla.org/#/jobs?repo=try&revision=fd2323486caab76edd458c4b46c444f551d819c5

Thanks. So far, toggling the pref |gfx.webrender.async-scene-build| seems to have helped a lot: nearly three hours without any crash! I'm going to test with your binaries out of TaskCluster :)

Kartikaya Gupta (email:kats@mozilla.staktrace.com)

Assignee

Updated

•

6 years ago

Blocks: 1452513

Manish Goregaokar [:manishearth]

Updated

•

6 years ago

Crash Signature: webrender::resource_cache::ResourceCache::get_cached_image::h8b37fbc8303bff09 ] [@ mozalloc_abort | abort | core::option::expect_failed::hd528a7bd2cab115d | webrender::resource_cache::ResourceCache::get_cached_image::h4a711d8897b20ee7 ] → webrender::resource_cache::ResourceCache::get_cached_image::h8b37fbc8303bff09 ] [@ mozalloc_abort | abort | core::option::expect_failed::hd528a7bd2cab115d | webrender::resource_cache::ResourceCache::get_cached_image::h4a711d8897b20ee7 ] [@ mozalloc_abo…

Kartikaya Gupta (email:kats@mozilla.staktrace.com)

Assignee

Comment 13

•

6 years ago

Hm, for some reason the try push seems to indicate that the Windows debug build is hanging on startup. Not sure what's going on there; I will investigate.

:gerard-majax

Comment 14

•

6 years ago

(In reply to Alexandre LISSY :gerard-majax from comment #11)
> (In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #9)
> > This might fix it:
> > https://treeherder.mozilla.org/#/jobs?repo=try&revision=fd2323486caab76edd458c4b46c444f551d819c5
> 
> Thanks. So far, toggling the pref |gfx.webrender.async-scene-build| seems to
> have helped a lot: nearly three hours without any crash! I'm going to test
> with your binaries out of TaskCluster :)

So, this has been rock stable since my comment. Looks good, :kats!

Kartikaya Gupta (email:kats@mozilla.staktrace.com)

Assignee

Comment 15

•

6 years ago

Shoot, the patch introduces a deadlock situation because it blocks on WR while holding the indirect layer trees lock at https://searchfox.org/mozilla-central/rev/38bcf897f1fa19c1eba441a611cf309482e0d6e5/gfx/layers/ipc/CompositorBridgeParent.cpp#504
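
Editorial sketch of the kind of deadlock described here. It is a toy model in Python, not the Gecko code: `tree_lock` stands in for the indirect layer trees lock, and the worker thread stands in for whatever the blocking WR call depends on. That the dependency needs the same lock is an assumption made for illustration; the thread does not say which code path takes it. Timeouts are used so the example terminates instead of hanging.

```python
import threading


def flush_completes(hold_lock_during_wait, timeout=0.2):
    """Return True if the simulated flush finishes, False if it gets stuck."""
    tree_lock = threading.Lock()  # stand-in for the indirect layer trees lock
    flushed = threading.Event()   # set once the simulated WR side is done

    def wr_side():
        # Assumption: finishing the flush needs the same lock at some point.
        if tree_lock.acquire(timeout=timeout):
            try:
                flushed.set()
            finally:
                tree_lock.release()

    worker = threading.Thread(target=wr_side)
    if hold_lock_during_wait:
        # Problem pattern: block on the other side while still holding the lock.
        with tree_lock:
            worker.start()
            done = flushed.wait(timeout=2 * timeout)
    else:
        # Alternative ordering: finish the locked work first, then block.
        with tree_lock:
            pass  # work that needs the lock would go here
        worker.start()
        done = flushed.wait(timeout=2 * timeout)
    worker.join()
    return done


if __name__ == "__main__":
    print(flush_completes(hold_lock_during_wait=True))
    print(flush_completes(hold_lock_during_wait=False))
```

With real blocking calls instead of timeouts, the first variant would never return. The thread does not spell out how the updated patch avoids this, so the second variant is only one generic way to break the cycle.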

Kartikaya Gupta (email:kats@mozilla.staktrace.com)

Assignee

Comment 16

•

6 years ago

https://treeherder.mozilla.org/#/jobs?repo=try&revision=a93a88b934822534d54155e73acf22cd81e1067a should be better

Comment hidden (mozreview-request)

Comment on attachment 8982595 [details]
Bug 1455597 - Flush the transaction to remove the pipeline before shutting down the WebRenderAPI.

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/248570/diff/1-2/

Kartikaya Gupta (email:kats@mozilla.staktrace.com)

Assignee

Comment 18

•

6 years ago

That's looking good :)

Comment hidden (Intermittent Failures Robot)

1 failure in 655 pushes (0.002 failures/push) was associated with this bug in the last 7 days.

Repository breakdown:
* try: 1

Platform breakdown:
* windows10-64-qr: 1

For more details, see:
https://treeherder.mozilla.org/intermittent-failures.html#/bugdetails?bug=1455597&startday=2018-05-28&endday=2018-06-03&tree=trunk

Sotaro Ikeda [:sotaro]

Comment 20

•

6 years ago

mozreview-review

Comment on attachment 8982595 [details]
Bug 1455597 - Flush the transaction to remove the pipeline before shutting down the WebRenderAPI.

https://reviewboard.mozilla.org/r/248570/#review254944

Looks good!

Attachment #8982595 - Flags: review?(sotaro.ikeda.g) → review+

Pulsebot

Comment 21

•

6 years ago

Pushed by kgupta@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/bcebfd54b4e1
Flush the transaction to remove the pipeline before shutting down the WebRenderAPI. r=sotaro

Kartikaya Gupta (email:kats@mozilla.staktrace.com)

Assignee

Updated

•

6 years ago

Blocks: 1440545

Dorel Luca [:dluca]

Comment 22

•

6 years ago

bugherder

https://hg.mozilla.org/mozilla-central/rev/bcebfd54b4e1

Status: NEW → RESOLVED

Closed: 6 years ago

Resolution: --- → FIXED

Target Milestone: --- → mozilla62

Kartikaya Gupta (email:kats@mozilla.staktrace.com)

Assignee

Updated

•

6 years ago

Blocks: 1467850
