Closed Bug 1589092 Opened 5 years ago Closed 4 years ago

Crash in [@ gsignal]

Categories

(Core :: Widget: Gtk, defect)

71 Branch
Desktop
Linux
defect
Not set
critical

RESOLVED FIXED
mozilla72
Tracking Status
firefox-esr68 --- fixed
firefox71 --- fixed
firefox72 --- fixed

People

(Reporter: atrif, Assigned: karlt)

Details

(4 keywords, Whiteboard: [adv-main71-][adv-esr68.3-])

Crash Data

This bug is for crash report bp-d362a261-33a1-4d44-87d6-df52f0191016.

Top 10 frames of crashing thread:

0 libc-2.27.so gsignal 
1 libc-2.27.so abort 
2 libc-2.27.so __fsetlocking 
3 libc-2.27.so __stack_chk_fail 
4 libc-2.27.so __stack_chk_fail 
5 libxul.so nsRetrievalContextX11::WaitForX11Content widget/gtk/nsClipboardX11.cpp
6  @0x3f04fa8fd22c 
7 libxul.so _fini 
8 libxul.so <name omitted> netwerk/cache2/CacheStorageService.cpp:373
9 libxul.so nsTString<char>* nsTArray_Impl<nsTString<char>, nsTArrayInfallibleAllocator>::AppendElements<nsTString<char>, nsTArrayInfallibleAllocator> xpcom/ds/nsTArray.h:2427

Unfortunately, I don't have STR for this. It happened once at random while I was browsing various websites, and then, after another crash, Firefox was stuck crashing at the profile manager.
I don't know the exact component for this issue. Please feel free to change it.

Small volume crash in nightly - appears to be single installations crashing. Comments:

  • drag/drop file to google docs
  • Crashes immediately after I hit play
  • Tried to play music on YouTube Music and the browser crashed almost immediately.

URLs:

I have managed to reproduce this crash while following this STR:

  1. Launch Firefox.
  2. Access the following website and navigate a little by scrolling up and down.

Expected Results

  • Firefox is stable.

Actual Results

  • Firefox crashes.

I looked at some of these crashes, and there are a number of places in Gecko code that are calling into this method and crashing, including mozilla::dom::SpeechDispatcherService::Setup and webrtc::videocapturemodule::DeviceInfoLinux::EventCheck.

Looking slightly closer to the top of the stack, these crashes seem to have one or more of __stack_chk_fail and __chk_fail in them, which are somehow related to stack protection, and there are some mentions of it in breakpad.

Gabriele, do you have any idea what might be going on with the crashes, or who might know? Thanks.

Flags: needinfo?(gsvelto)

I have inspected the stack manually in the minidump and there are a few things worth noting:

  • The stack has been somehow smashed, that's why we have __stack_chk_fail calls at the top
  • The code seems to be related to a clipboard operation, not only because there's a pointer to nsClipboardX11.cpp but also because manually inspecting the stack reveals this truncated string: selection-clear-. That might be from a GTK selection-clear-event, which is related to the clipboard
  • Whatever happened corrupted the data on the stack: the area where that string is present has some parts cleared out with zeros and some parts holding the poison pattern e5e5e5e5, so we might be dealing with a use-after-free bug

A few guesses based on this information: this might be a GTK widget bug and there's a high chance it's related to the clipboard-handling code. It's possible that we take the address of an object on the stack and keep using it after it's been destroyed which leads to stack corruption.
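For anyone triaging similar minidumps, here is a minimal sketch of the kind of check described above: counting aligned 4-byte words made up entirely of the poison byte in a copied-out stack region. The function name CountPoisonWords and the idea of feeding it a byte vector are assumptions for illustration; this is not breakpad or Gecko code, and a high count is only a hint, not proof of a use-after-free.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical triage helper: counts aligned 4-byte words in `bytes` whose
// four bytes all equal `poison` (0xe5 by default, matching the e5e5e5e5
// pattern mentioned in this bug). Trailing bytes that do not fill a whole
// word are ignored.
std::size_t CountPoisonWords(const std::vector<std::uint8_t>& bytes,
                             std::uint8_t poison = 0xe5) {
  std::size_t count = 0;
  for (std::size_t i = 0; i + 4 <= bytes.size(); i += 4) {
    if (bytes[i] == poison && bytes[i + 1] == poison &&
        bytes[i + 2] == poison && bytes[i + 3] == poison) {
      ++count;
    }
  }
  return count;
}
```

Running this over the bytes of a suspicious frame gives a quick number to compare against neighbouring frames that look healthy.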

I don't know much about clipboard handling in the GTK widget so somebody who knows it better should have a look. Also because there's the poison pattern on the stack, and because this might be triggered by data copied by the user to the clipboard I'd say this is security-sensitive.

Flags: needinfo?(gsvelto)
Component: General → Widget: Gtk

I filed bug 1589604 to get better signatures out of this. Once it lands the signature on this bug will change and we'll be able to tell this particular stack apart from the crashes in the speech code. I've had a look at those stacks too and those look like a bug in libspeechd rather than in Firefox.

A few more hints about the clipboard issue: all crashes seem to happen early, uptime is usually very short.

This crash has a better stack: https://crash-stats.mozilla.org/report/index/a58c28b7-466c-4328-8a5c-5ba550191016

This is on the stack: https://hg.mozilla.org/mozilla-central/annotate/8398e65a3d7e02ad66d0225a1ef2a8312e16ef8f/editor/libeditor/TextEditorDataTransfer.cpp#l467

Which reinforces the idea that this is clipboard related. I also found more truncated strings on the stack that look like this:

dget.CLIPBOARD_M
................
selection-event-
................

All the dots are bytes set to zero. The two rows are 16 bytes wide as if someone had cleared out those areas.
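The rows above follow the usual hex-dump convention. As a rough sketch of how such rows can be produced from raw stack bytes, assuming a hypothetical helper RenderRows (not part of any crash tooling mentioned here), printable ASCII is kept and every other byte, including zero, is shown as a dot:

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: splits `bytes` into rows of `width` bytes and renders
// each byte as itself if it is printable ASCII (0x20..0x7e), otherwise as '.'.
// Zero bytes therefore show up as dots, as in the rows quoted in this comment.
std::vector<std::string> RenderRows(const std::vector<std::uint8_t>& bytes,
                                    std::size_t width = 16) {
  std::vector<std::string> rows;
  if (width == 0) {
    return rows;  // Nothing sensible to render with zero-width rows.
  }
  for (std::size_t start = 0; start < bytes.size(); start += width) {
    std::string row;
    for (std::size_t i = start; i < bytes.size() && i < start + width; ++i) {
      const std::uint8_t b = bytes[i];
      row.push_back((b >= 0x20 && b <= 0x7e) ? static_cast<char>(b) : '.');
    }
    rows.push_back(row);
  }
  return rows;
}
```

Note that a literal '.' byte and a non-printable byte look identical in this rendering, so the raw hex still has to be checked before concluding that an area was zeroed.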

If you can reproduce the crash, can you run Firefox with the MOZ_LOG="Widget:5, WidgetClipboard:5" environment variable set and attach the clipboard log here? Thanks.

Bugbug thinks this bug is a regression, but please revert this change in case of error.

Keywords: regression
Group: dom-core-security

Improvements in our symbol-scraping scripts plus the prefix changes in bug 1589604 changed the signature of this crash so I've updated it. Unfortunately the signature is still covering multiple unrelated stacks so we'll have to add more functions to the prefix list still. In the meantime I found these three crashes with interesting comments:

https://crash-stats.mozilla.com/report/index/083fccd7-f5db-4e6f-b1a4-99fd10191015

Crash happened as I was pasting text into the search bar of one window, and was shortly after dragging and dropping a youtube link on a tab in another window.

https://crash-stats.mozilla.com/report/index/f163146a-48f2-4d54-a712-11cc80191015

Pasting text into the search / url bar again.

https://crash-stats.mozilla.com/report/index/8cc0d484-9182-4b39-95fe-783370191015

Oh, pasting text in a web page's text box also causes crashes. (If it helps, the text is "libvdpau_i965.so", so fairly short and just printable ASCII characters.)

Unfortunately the reporter did not leave his e-mail address, otherwise I would have contacted him.

Crash Signature: [@ gsignal] → [@ gsignal] [@ raise | abort | __libc_message]

Here's the proper signature for this crash. I'll try to reprocess all the old crashes so that we can tell them apart.

Crash Signature: [@ gsignal] [@ raise | abort | __libc_message] → [@ gsignal] [@ raise | abort | __libc_message] [@ raise | abort | __libc_message | __fortify_fail_abort | __stack_chk_fail | nsRetrievalContextX11::WaitForX11Content | nsRetrievalContextX11::GetTargets]

https://searchfox.org/mozilla-central/rev/d7537a9cd2974efa45141d974e5fc7c727f1a78f/widget/gtk/nsClipboardX11.cpp#136

I expect something is freed in the WaitForX11Content() loop. The loop is supposed to process only clipboard events, but a selection-clear-event is delivered when the clipboard content is no longer valid. Maybe the clipboard source/target window is closed/released, so context->cbWidget is invalid. I think Karl may have more insight here.

Flags: needinfo?(karlt)

While looking at the crashes again I noticed that we're skipping over the nsRetrievalContextX11::WaitForX11Content and because the stack is corrupted we get multiple signatures depending on what stack scanning finds in the frame above. I'll try and add all the relevant signatures. In the meantime this crash has the best stack trace I could find:

https://crash-stats.mozilla.org/report/index/1f301810-b152-4751-a5f4-0e9c80191021

Crash Signature: [@ gsignal] [@ raise | abort | __libc_message] [@ raise | abort | __libc_message | __fortify_fail_abort | __stack_chk_fail | nsRetrievalContextX11::WaitForX11Content | nsRetrievalContextX11::GetTargets] → [@ gsignal] [@ raise | abort | __libc_message] [@ raise | abort | __libc_message | __fortify_fail_abort | __stack_chk_fail | nsRetrievalContextX11::WaitForX11Content | _fini] [@ raise | abort | __libc_message | __fortify_fail_abort | __stack_chk_fail |…

More signatures. Note that the old ones like gsignal are disappearing. The new signatures have revealed at least three unrelated crashes hiding behind that one.

Crash Signature: [@ gsignal] [@ raise | abort | __libc_message] [@ raise | abort | __libc_message | __fortify_fail_abort | __stack_chk_fail | nsRetrievalContextX11::WaitForX11Content | _fini] [@ raise | abort | __libc_message | __fortify_fail_abort | __stack_chk_fail |… → [@ gsignal] [@ raise | abort | __libc_message | __fortify_fail_abort | __stack_chk_fail | nsRetrievalContextX11::WaitForX11Content | <name omitted> | mozilla::TextEditor::CanPaste] [@ raise | abort | __libc_message | __fortify_fail_abort | __stack_chk_f…
Crash Signature: [@ gsignal] [@ raise | abort | __libc_message | __fortify_fail_abort | __stack_chk_fail | nsRetrievalContextX11::WaitForX11Content | <name omitted> | mozilla::TextEditor::CanPaste] [@ raise | abort | __libc_message | __fortify_fail_abort | __stack_chk_f… → [@ gsignal] [@ nsRetrievalContextX11::GetTargets] [@ nsRetrievalContextX11::WaitForX11Content] [@ raise | abort | __libc_message | __fortify_fail_abort | __stack_chk_fail | nsRetrievalContextX11::WaitForX11Content | <name omitted> | mozilla::TextEditor…

I've seen crash reports with libgtk-3.so.0.2200.30 and with libgtk-3.so.0.2404.7. The latter corresponds to GTK+ 3.24.11, which is the latest version available, and the former suggests this is not due to any recent change in GTK.

I looked through the code involved but saw only a couple of potential issues, and I can't see that either would lead to this.

DispatchSelectionNotifyEvent() doesn't initialize the GdkWindow *requestor field of the GDK_SELECTION_NOTIFY GdkEvent. The event could be zero initialized to completely rule out such problems, but I don't see the field used by GTK with GDK_SELECTION_NOTIFY events.
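To illustrate the zero-initialization idea only, here is a sketch using made-up stand-in types. FakeWindow, FakeSelectionEvent, kFakeSelectionNotify and MakeSelectionNotify are all hypothetical; the struct is not the real GdkEventSelection layout and this is not the patch for this bug. The point is that value-initializing the event leaves any field the code forgets to set, such as requestor, as a null pointer rather than stack garbage.

```cpp
#include <cstdint>

// Hypothetical stand-in for GdkWindow.
struct FakeWindow {};

// Hypothetical stand-in for a selection event. The real GDK struct has a
// different layout; only the presence of a `requestor` pointer matters here.
struct FakeSelectionEvent {
  int type;
  FakeWindow* window;
  std::uint32_t time;
  FakeWindow* requestor;
};

// Arbitrary illustrative value, not the real GDK_SELECTION_NOTIFY constant.
constexpr int kFakeSelectionNotify = 1;

FakeSelectionEvent MakeSelectionNotify(FakeWindow* window, std::uint32_t time) {
  FakeSelectionEvent event{};  // Value-initialization zeroes every member.
  event.type = kFakeSelectionNotify;
  event.window = window;
  event.time = time;
  return event;  // `requestor` was never assigned, so it is still nullptr.
}
```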

It is usual to hold strong references when dispatching events, but DispatchSelectionNotifyEvent() and DispatchPropertyNotifyEvent() don't hold strong references to the widget or window objects. However, I haven't found a path where these objects could be released.

(In reply to Martin Stránský [:stransky] from comment #11)

I expect something is freed in the WaitForX11Content() loop. The loop is supposed to process only clipboard events, but a selection-clear-event is delivered when the clipboard content is no longer valid. Maybe the clipboard source/target window is closed/released, so context->cbWidget is invalid.

Holding a strong reference to the cbWidget object during the dispatch would be sensible.
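A sketch of that pattern, modelled with std::shared_ptr instead of GObject reference counting. FakeWidget, FakeContext and DispatchWithStrongRef are hypothetical names, and this is not the code in nsClipboardX11.cpp. The handler deliberately drops the context's reference while the event is being handled, which is the scenario the strong reference guards against.

```cpp
#include <functional>
#include <memory>

// Hypothetical widget whose event handler may release the owner's reference.
struct FakeWidget {
  int eventsSeen = 0;
  std::function<void()> onEvent;

  void HandleEvent() {
    if (onEvent) {
      onEvent();  // May reset the context's pointer to this widget.
    }
    ++eventsSeen;  // Touches `this` after the handler has run.
  }
};

// Hypothetical retrieval context holding the widget.
struct FakeContext {
  std::shared_ptr<FakeWidget> widget;
};

// Dispatches one event while holding a local strong reference, so the widget
// stays alive until dispatch has finished even if the handler clears
// `context.widget`. Returns the widget's event count, or -1 without a widget.
int DispatchWithStrongRef(FakeContext& context) {
  std::shared_ptr<FakeWidget> keepAlive = context.widget;
  if (!keepAlive) {
    return -1;
  }
  keepAlive->HandleEvent();
  return keepAlive->eventsSeen;
}
```

Without the local keepAlive copy, the increment in HandleEvent() would run on a destroyed object in this model whenever the handler resets context.widget.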

We are not expecting "selection-clear-event"s during the loop, but it is plausible that the string remains on the stack from a recent gtk_clipboard_set_with_data() call overriding a previous selection.
Similarly, a recent gtk_clipboard_set_can_store() could plausibly leave CLIPBOARD_MANAGER on the stack.
I haven't found "selection-event-" in GTK source code.

Flags: needinfo?(karlt)
Depends on: 1592502

(In reply to Karl Tomlinson (:karlt) from comment #14)

I haven't found "selection-event-" in GTK source code.

That was a typo on my part, the string fragment I found on the stack is selection-clear- which I assume corresponds to GTK's selection-clear-event.

Adding yet another signature

Crash Signature: [@ gsignal] [@ nsRetrievalContextX11::GetTargets] [@ nsRetrievalContextX11::WaitForX11Content] [@ raise | abort | __libc_message | __fortify_fail_abort | __stack_chk_fail | nsRetrievalContextX11::WaitForX11Content | <name omitted> | mozilla::TextEditor… → [@ gsignal] [@ nsRetrievalContextX11::GetTargets] [@ nsRetrievalContextX11::WaitForX11Content] [@ raise | abort | __libc_message] [@ raise | abort | __libc_message | __fortify_fail_abort | __stack_chk_fail | nsRetrievalContextX11::WaitForX11Content | …

Calling this sec-audit because it looks like there are multiple issues and individual sub bugs are getting filed on them.

Keywords: sec-audit

Two brand new signatures, the upper frames in the stacks are the same so same issue.

Crash Signature: [@ gsignal] [@ nsRetrievalContextX11::GetTargets] [@ nsRetrievalContextX11::WaitForX11Content] [@ raise | abort | __libc_message] [@ raise | abort | __libc_message | __fortify_fail_abort | __stack_chk_fail | nsRetrievalContextX11::WaitForX11Content | … → [@ gsignal] [@ nsRetrievalContextX11::GetTargets] [@ nsRetrievalContextX11::WaitForX11Content] [@ raise | abort | __libc_message] [@ raise | abort | __libc_message | __fortify_fail_abort | __fortify_fail | __chk_fail | __fdelt_chk | nsRetrievalContext…

(In reply to Gabriele Svelto [:gsvelto] from comment #12)

While looking at the crashes again I noticed that we're skipping over the nsRetrievalContextX11::WaitForX11Content

Do you know why that is being skipped?

(In reply to Karl Tomlinson (:karlt) from comment #19)

Do you know why that is being skipped?

It must be in Socorro's prefix list. I don't know when it was added but it's possible that in the past we had multiple crash stacks starting with nsRetrievalContextX11::WaitForX11Content and so someone added it to the list.

The prefix list contains ".*WaitFor" (presumably for WaitForSingleObject/WaitForMultipleObjects and friends).
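To make the effect of that entry concrete, here is a simplified model of prefix-list behaviour as it shows up in the signatures on this bug: each frame is appended to the signature, and generation continues past any frame that matches a prefix pattern and stops at the first frame that does not. BuildSignature is a hypothetical function, the start-anchored matching is an assumption, and this is not Socorro's actual implementation.

```cpp
#include <regex>
#include <string>
#include <vector>

// Simplified, hypothetical model of signature generation with a prefix list.
// Frames are joined with " | ". A frame matching any prefix pattern (anchored
// at the start of the frame name) lets generation continue to the next frame;
// the first non-matching frame is appended and ends the signature.
std::string BuildSignature(const std::vector<std::string>& frames,
                           const std::vector<std::regex>& prefixPatterns) {
  std::string signature;
  for (const std::string& frame : frames) {
    if (!signature.empty()) {
      signature += " | ";
    }
    signature += frame;

    bool isPrefix = false;
    for (const std::regex& pattern : prefixPatterns) {
      if (std::regex_search(frame, pattern,
                            std::regex_constants::match_continuous)) {
        isPrefix = true;
        break;
      }
    }
    if (!isPrefix) {
      break;
    }
  }
  return signature;
}
```

With a pattern like ".*WaitFor" in the list, nsRetrievalContextX11::WaitForX11Content is treated as a prefix frame, so whatever stack scanning finds next on a corrupted stack ends up in the signature. That would explain the many variants recorded above.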

There's been only one crash this week under these signatures. While it does have stack corruption that seems to be related to the clipboard, I couldn't find the selection-clear-event string fragment on the stack, so my guess is that bug 1592502 fixed this. As for the rest of the crashes that hid behind the original gsignal signature, I fixed most of them under bug 1591869. I think it might be time to close this bug.

No crashes in a while, I'm closing this one.

Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Group: dom-core-security → core-security-release
Assignee: nobody → karlt
Target Milestone: --- → mozilla72
Whiteboard: [adv-main71-][adv-esr68.3-]
Group: core-security-release