Crash in [@ AsyncShutdownTimeout | IOUtils: waiting for profileBeforeChange IO to complete | JSON store: writing data]
Categories
(Toolkit Graveyard :: OS.File, defect)
Tracking
(firefox-esr102 wontfix, firefox106 unaffected, firefox107 wontfix, firefox108+ verified, firefox109 verified)
People
(Reporter: bmaris, Assigned: nalexander)
References
Details
(Keywords: crash, topcrash)
Crash Data
Attachments
(2 files)
- 235 bytes, text/javascript
- 48 bytes, text/x-phabricator-request (diannaS: approval-mozilla-beta+)
Crash report: https://crash-stats.mozilla.org/report/index/e3dc9b15-2dd2-44c5-9968-ee93f0221109
MOZ_CRASH Reason: [Parent 12616, Main Thread] ###!!! ABORT: file resource://gre/modules/JSONFile.sys.mjs:124
Top 10 frames of crashing thread:
0 xul.dll NS_DebugBreak xpcom/base/nsDebugImpl.cpp:496
1 xul.dll nsDebugImpl::Abort xpcom/base/nsDebugImpl.cpp:129
2 xul.dll XPTC__InvokebyIndex
3 xul.dll NS_InvokeByIndex xpcom/reflect/xptcall/md/win32/xptcinvoke_x86_64.cpp:57
3 xul.dll CallMethodHelper::Invoke js/xpconnect/src/XPCWrappedNative.cpp:1626
3 xul.dll CallMethodHelper::Call js/xpconnect/src/XPCWrappedNative.cpp:1179
3 xul.dll XPCWrappedNative::CallMethod js/xpconnect/src/XPCWrappedNative.cpp:1125
4 xul.dll XPC_WN_CallMethod js/xpconnect/src/XPCWrappedNativeJSOps.cpp:965
5 xul.dll CallJSNative js/src/vm/Interpreter.cpp:459
5 xul.dll js::InternalCallOrConstruct js/src/vm/Interpreter.cpp:547
Found in
- Firefox 107.0 RC
Affected versions
- Firefox 107.0 RC
- Latest Nightly 108.0a1
Unaffected versions
- Firefox 106.0 RC
Tested platforms
- Affected platforms: Windows 10
Steps to reproduce:
- Open Task Manager
- Create a new profile
- Add the user.js attached to the profile
- Open Firefox
- Exit Firefox about 2 seconds after it starts
- Take a look in the Task Manager
Actual result
- The Firefox process remains listed in Task Manager and its memory usage keeps increasing until it crashes.
Notes:
- I think this is a Remote Settings issue.
- Essentially, it crashes when I switch the Remote Settings server from Prod to Stage and interrupt the process.
- If I install the extension from here and wait for the switch/sync process for Stage to finish, exiting no longer causes a crash.
- Please change the component if this is not the correct one. I took the component from bug 1782990.
Comment 1•2 years ago (Reporter)
Comment 2•2 years ago
The bug is linked to a topcrash signature, which matches the following criteria:
- Top 20 desktop browser crashes on release
- Top 10 desktop browser crashes on nightly
:serg, could you consider increasing the severity of this top-crash bug?
For more information, please visit auto_nag documentation.
Comment 3•2 years ago
The bug is marked as tracked for firefox108 (nightly). We have limited time to fix this; the soft freeze is today. However, the bug still isn't assigned and has low severity.
:serg, could you please find an assignee and increase the severity for this tracked bug? If you disagree with the tracking decision, please talk with the release managers.
For more information, please visit auto_nag documentation.
Comment 4•2 years ago
Hi Nick:
I guess this issue is a duplicate of Bug 1782924? It looks like the crash still exists.
Comment 5•2 years ago (Assignee)
(In reply to Dimi Lee [:dimi][:dlee] from comment #4)
Hi Nick:
I guess this issue is a duplicate of Bug 1782924? It looks like the crash still exists.
It's possible. The JSONFile apparatus is used by multiple features/files; is there any indication that we're seeing issues with targeting.state.json vs. any other consumer? I don't see anything in the crash report, unfortunately. Keeping NI since the action here might be to annotate the crash reports more effectively.
:nalexander is there still a chance to fix this for fx108?
Comment 7•2 years ago (Assignee)
(In reply to Dianna Smith [:diannaS] from comment #6)
:nalexander is there still a chance to fix this for fx108?
For 108? I doubt it; we don't have a lot of time. I'll try to investigate this now. I think the first order of business is to get the actual JSONFile instance into the crash report in some way, and then to figure out what we see on Nightly. Looks like that's Bug 1782990, which I've just attached a patch for. We could uplift to 108 pretty aggressively, I think, and hope to get data on which file is actually causing these crashes; if it is targeting.state.json, it's likely the set of targeting attributes has grown and one of the new attributes is causing issues. We shall see.
Comment 8•2 years ago (Assignee)
This fixes a missing await and tries to avoid updating snapshot data during shutdown.
Neither seems particularly likely to impact shutdown crashes, but we have no better theory for the underlying cause of these crashes. It's possible that the amount of data in the targeting snapshot is sufficiently large that we see sufficiently many sufficiently slow writes to cause the shutdown mechanism to time out.
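As a rough illustration of the two changes described above (awaiting the write, and skipping snapshot updates once shutdown has begun), here is a minimal sketch. It is not the actual patch: `SnapshotStore`, `isShuttingDown`, and `writeToDisk` are hypothetical names, and the disk write is simulated with a timer rather than real I/O.

```javascript
// Hypothetical model for illustration; this is not the Firefox JSONFile API.
class SnapshotStore {
  constructor() {
    this.isShuttingDown = false; // assumed flag, set when shutdown begins
    this.data = {};
    this.writesCompleted = 0;
  }

  // Simulated slow disk write (a timer stands in for real I/O).
  writeToDisk() {
    return new Promise(resolve => {
      setTimeout(() => {
        this.writesCompleted += 1;
        resolve();
      }, 10);
    });
  }

  // Returns true if the snapshot was written, false if it was skipped.
  async updateSnapshot(newData) {
    if (this.isShuttingDown) {
      // Avoid starting new writes once shutdown has begun.
      return false;
    }
    this.data = newData;
    // Without this `await`, the caller would continue (and report success)
    // before the write had actually finished.
    await this.writeToDisk();
    return true;
  }
}
```

With the `await` in place, a caller that awaits `updateSnapshot()` only proceeds once the simulated write has completed, and any call made after `isShuttingDown` is set returns without touching the data.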
Depends on D162891
Pushed by nalexander@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/de83175de08f Try to avoid shutdown crashes writing targeting snapshot. r=barret,bhearsum
Comment 10•2 years ago
Backed out for causing Geckoview failures.
Backout link: https://hg.mozilla.org/integration/autoland/rev/bc4345d1f7364dff4e87397b7917aa7ba557b34c
Failure log: https://treeherder.mozilla.org/logviewer?job_id=397936466&repo=autoland&lineNumber=2694
Comment 11•1 year ago (Assignee)
This was backed out due to a weirdness: the GeckoView autofill storage layer isn't really a JSONFile, it just uses the bones to manage serialization. I've worked around that by exposing sanitizedBasename to the constructor. Try build percolating at https://treeherder.mozilla.org/jobs?repo=try&revision=027c20e43762c7de0e2038bc57f440392480b54d.
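To make the workaround concrete, here is a minimal sketch of a store that accepts an optional sanitizedBasename in its constructor and otherwise derives one from a file path. Only the name `sanitizedBasename` and the shape of the resulting label come from this bug; `JsonStore`, `deriveSanitizedBasename`, `blockerName`, the sanitization rules, and the example path are assumptions made for illustration and do not reflect the real JSONFile implementation.

```javascript
// Hypothetical sketch; only `sanitizedBasename` is a name taken from this bug.
function deriveSanitizedBasename(path) {
  // Take the final path component, accepting both / and \ separators.
  const filename = path.split(/[\\\/]/).pop();
  // Drop a trailing ".json" and replace unusual characters (assumed rules).
  return filename.replace(/\.json$/i, "").replace(/[^a-zA-Z0-9_.-]/g, "_");
}

class JsonStore {
  constructor({ path, sanitizedBasename } = {}) {
    // Callers not backed by a real file can pass the label directly.
    this.sanitizedBasename =
      sanitizedBasename ?? (path ? deriveSanitizedBasename(path) : "unknown");
  }

  // Label identifying this store when it blocks shutdown.
  get blockerName() {
    return `JSON store: writing data for '${this.sanitizedBasename}'`;
  }
}

// A storage layer without a real path supplies its own label.
const autofillStore = new JsonStore({ sanitizedBasename: "autofill" });
// A file-backed store derives the label from its (assumed) path.
const targetingStore = new JsonStore({ path: "/profile/targeting.snapshot.json" });
```

Passing the label explicitly lets a consumer that only reuses the serialization machinery still show up under a recognizable name in crash annotations.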
Comment 12•1 year ago
bugherder
Comment 13•1 year ago
The patch landed in nightly and beta is affected.
:nalexander, is this bug important enough to require an uplift?
- If yes, please nominate the patch for beta approval.
- If no, please set status-firefox108 to wontfix.
For more information, please visit auto_nag documentation.
Comment 14•1 year ago (Assignee)
Comment on attachment 9305003 [details]
Bug 1799823 - Try to avoid shutdown crashes writing targeting snapshot. r?barret
Beta/Release Uplift Approval Request
- User impact if declined: Almost none: this is expected to reduce some crashes.
- Is this code covered by automated tests?: No
- Has the fix been verified in Nightly?: No
- Needs manual test from QE?: No
- If yes, steps to reproduce:
- List of other uplifts needed: Bug 1782990
- Risk to taking this patch: Low
- Why is the change risky/not risky? (and alternatives if risky): It should reduce a certain type of crash and do little else.
- String changes made/needed:
- Is Android affected?: No
Comment on attachment 9305003 [details]
Bug 1799823 - Try to avoid shutdown crashes writing targeting snapshot. r?barret
Approved for 108.0b9
Comment 16•1 year ago
bugherder uplift
Comment 17•1 year ago (Reporter)
Using the steps from comment 0 I'm still getting a crash but with two different signatures:
- 108.0b9 - https://crash-stats.mozilla.org/report/index/fa4d11e1-d4f8-4580-a979-559f40221205
- Latest Nightly 109.0a1 - https://crash-stats.mozilla.org/report/index/35b45c87-7c55-4ff9-8d8b-37eed0221205 / https://crash-stats.mozilla.org/report/index/02e6e442-1bf3-40a3-b1f6-84de80221205
Every time I test this I get a different crash signature, though. Should we mark this bug as verified fixed, since I'm unable to get this particular crash signature?
Comment 18•1 year ago (Assignee)
(In reply to Bogdan Maris [:bogdan_maris], Release Desktop QA from comment #17)
Using the steps from comment 0 I'm still getting a crash but with two different signatures:
- 108.0b9 - https://crash-stats.mozilla.org/report/index/fa4d11e1-d4f8-4580-a979-559f40221205
- Latest Nightly 109.0a1 - https://crash-stats.mozilla.org/report/index/35b45c87-7c55-4ff9-8d8b-37eed0221205 / https://crash-stats.mozilla.org/report/index/02e6e442-1bf3-40a3-b1f6-84de80221205
Every time I test this I get a different crash signature, though. Should we mark this bug as verified fixed, since I'm unable to get this particular crash signature?
Yes, I think so, particularly because Bug 1782990 specifically annotates this crash signature. Could you attach a screen capture of this? My guess is that switching the Remote Settings service endpoint invalidates a bunch of data and starts fetching new data. Something in that long chain is likely not shutdown-aware, causing eventual shutdown hangs. Would you mind filing a new ticket for these crashes so that we can keep this ticket for this specific JSONFile instance blocking shutdown? Thanks!
Comment 19•1 year ago
This landed (the autoland commit is missing here) and the signature changed to AsyncShutdownTimeout | IOUtils: waiting for profileBeforeChange IO to complete | JSON store: writing data for 'targeting.snapshot', which is the third-most frequent crash for Nightly.
Should this bug be reopened or a new one created?
Comment 20•1 year ago (Assignee)
(In reply to Sebastian Hengst [:aryx] (needinfo me if it's about an intermittent or backout) from comment #19)
This landed (the autoland commit is missing here) and the signature changed to AsyncShutdownTimeout | IOUtils: waiting for profileBeforeChange IO to complete | JSON store: writing data for 'targeting.snapshot', which is the third-most frequent crash for Nightly. Should this bug be reopened or a new one created?
New ticket please. This is a pretty general "we're doing lots of things and sometimes we shut down in the middle, which is hard to handle" situation, but we might be able to annotate crashes with the actual task at hand, or we could restrict the set of tasks, sacrificing some future flexibility for current stability.
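The idea of annotating a shutdown timeout with the task still in progress can be sketched with a small model. This is not Firefox's AsyncShutdown API: `ShutdownBarrier`, `addBlocker`, and `wait` are hypothetical names, and the timeout value is arbitrary. The barrier waits for all registered blockers and, if the timeout fires first, reports which named blockers are still pending.

```javascript
// Hypothetical model of a shutdown barrier; not Firefox's AsyncShutdown API.
class ShutdownBarrier {
  constructor() {
    this.pending = new Map(); // blocker name -> promise
  }

  addBlocker(name, promise) {
    this.pending.set(name, promise);
    // Remove the blocker once its work settles, successfully or not.
    const done = () => this.pending.delete(name);
    promise.then(done, done);
  }

  // Resolves with { timedOut, stillPending } once all blockers settle
  // or after timeoutMs, whichever comes first.
  async wait(timeoutMs) {
    let timer;
    const timeout = new Promise(resolve => {
      timer = setTimeout(() => resolve("timeout"), timeoutMs);
    });
    const all = Promise.allSettled([...this.pending.values()]).then(() => "done");
    const outcome = await Promise.race([all, timeout]);
    clearTimeout(timer);
    return {
      timedOut: outcome === "timeout",
      stillPending: [...this.pending.keys()],
    };
  }
}
```

On a timeout, `stillPending` names the specific work that held up shutdown, which is the kind of detail a crash annotation could carry instead of a generic "writing data" label.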
Comment 21•1 year ago (Reporter)
(In reply to Nick Alexander :nalexander [he/him] from comment #18)
(In reply to Bogdan Maris [:bogdan_maris], Release Desktop QA from comment #17)
Using the steps from comment 0 I'm still getting a crash but with two different signatures:
- 108.0b9 - https://crash-stats.mozilla.org/report/index/fa4d11e1-d4f8-4580-a979-559f40221205
- Latest Nightly 109.0a1 - https://crash-stats.mozilla.org/report/index/35b45c87-7c55-4ff9-8d8b-37eed0221205 / https://crash-stats.mozilla.org/report/index/02e6e442-1bf3-40a3-b1f6-84de80221205
Every time I test this I get a different crash signature, though. Should we mark this bug as verified fixed, since I'm unable to get this particular crash signature?
Yes, I think so, particularly because Bug 1782990 specifically annotates this crash signature. Could you attach a screen capture of this? My guess is that switching the Remote Settings service endpoint invalidates a bunch of data and starts fetching new data. Something in that long chain is likely not shutdown-aware, causing eventual shutdown hangs. Would you mind filing a new ticket for these crashes so that we can keep this ticket for this specific JSONFile instance blocking shutdown? Thanks!
I'm not sure what screen capture I should make, but I did add one showing all the steps to bug 1804859, which I logged for all the other crashes I get when following the same steps as here. It feels strange to mark this bug as verified fixed after Sebastian's comment, but since there is a new bug logged for that signature, I'll go ahead and change the status.
Comment 22•1 year ago
If this is at all related to bug 1804757 I am still hitting it in 110: https://crash-stats.mozilla.org/report/index/8e3ed12d-b7f3-400a-bd77-f3a360230227