Closed Bug 1048164 Opened 10 years ago Closed 9 years ago

"Greasemonkey" add-on does not work with e10s

Categories

(Firefox :: Extension Compatibility, defect)

x86_64
Windows 7
defect
Not set
normal

Tracking

()

RESOLVED FIXED
Tracking Status
e10s + ---

People

(Reporter: colsson1990, Unassigned)

References

Details

(Keywords: addon-compat, dogfood, Whiteboard: [e10s-top-addon])

I tested Greasemonkey using the script "Manga OnlineViewer" (https://greasyfork.org/scripts/1319-manga-onlineviewer, the only one I use)
Greasemonkey does realize that the script matches the site, but it does not execute the script.
Thanks for testing, colsson! I'll update our list of add-ons.

Can you please share a link that points directly to an affected manga page and does not require a membership? That will make our testing easier.
Blocks: e10s-addons
tracking-e10s: --- → +
Keywords: addon-compat
Of course:
http://www.mangahere.co/manga/ansatsu_kyoushitsu/c096/
When working properly, the script should transform this site after about one second.
hi Anthony and Aaron, if you have any questions about add-on support for multiprocess Firefox (e10s), just drop by the #e10s IRC channel on irc.mozilla.org. MDN also has a good introduction:

https://developer.mozilla.org/en-US/Add-ons/Working_with_multiprocess_Firefox
Status: UNCONFIRMED → NEW
Ever confirmed: true
https://github.com/greasemonkey/greasemonkey/issues/2004

What's the time frame for this becoming a big deal?

P.S. Aaron was the original creator but has not been maintainer for several years.
Our current plan is to enable e10s in Firefox 36 or 37. The Firefox 36 development cycle will start on the Nightly release channel in October, but won't hit the Beta and Release channels until January and February 2015, respectively:

https://wiki.mozilla.org/RapidRelease/Calendar
Depends on: 1048755
No longer depends on: 1048755
(In reply to Anthony Lieuallen from comment #4)
> https://github.com/greasemonkey/greasemonkey/issues/2004
> 
> What's the time frame for this becoming a big deal?
> 
> P.S. Aaron was the original creator but has not been maintainer for several
> years.

The current plan calls for enabling on Nightly for a few days, probably next week.

Do you know what function is breaking exactly? It may be possible for me to make your rewrite easier.
Flags: needinfo?(arantius)
I've only had time to skim the docs, but my guess is this will need to be a big rewrite.  Lots of message passing to know about A) when to start scripts running and B) export data there back to the UI.
Flags: needinfo?(arantius)
Ok I:

* Downloaded ftp://ftp.mozilla.org/pub/firefox/nightly/latest-mozilla-aurora/firefox-34.0a2.en-US.linux-x86_64.tar.bz2 ( Mozilla/5.0 (X11; Linux x86_64; rv:34.0) Gecko/20100101 Firefox/34.0 / Built from https://hg.mozilla.org/releases/mozilla-aurora/rev/e20869e87e23 )
* Started it with a new blank profile.
* Enabled browser.tabs.remote.autostart
* Restarted

The entire browser fails to work.  No web pages will load.  Things look funny (tabs titles are underlined?).  Have I just hit a bad build?  Have I done something wrong?  I certainly can't develop an extension this way.
Whiteboard: [e10s-top-addon]
tab titles are underlined to denote e10s so people can tell them apart.
You also need to use Nightly. Good stuff isn't on Aurora yet.

Flipping that pref on aurora, beta, or release is not recommended. :)
*facepalm* wrong "latest" link.  Sorry for noise.
please NEEDINFO me if you can't get Nightly to behave and/or you come across an api that if it just 'magically' worked correctly in e10s, would fix all your problems.
TL/DR: How do I observe "document-element-inserted" under e10s?


Ok some (meager) progress made.  First and foremost is the issue that I have absolutely no idea now how the scope/lifetime of components/modules/chrome scripts/frame scripts relate to each other.  Nor what to expect to work where or how.  Things that used to work don't.

That said, what's apparently most broken is our flow for detecting content loading into tabs.  Today we:

1) Load a chrome script which
2) Ensures an object in a component is set up which
3) Sets up an observer for "document-element-inserted" which
4) Detects new content windows and runs some scripts, then
5) Attaches a DOMContentLoaded (and load) listener to that window and
6) Uses whichever fires first (e.g. images never DOMContentLoad) to remove both then run other scripts

It appears that the existing observer in the component only sees chrome windows now.  I tried to move the observer to the chrome script but it never observes anything.  I then tried to move it into the frame script but that completely fails with:

    JavaScript error: , line 0: Error: Permission denied to access property 'toString'
    JavaScript error: , line 0: Error: Permission denied to access property 'message'
    JavaScript error: , line 0: uncaught exception: unknown (can't convert to string)
    
Which specifically appears when I include (the last line of)

    var observerService = Components.classes['@mozilla.org/observer-service;1']
       .getService(Components.interfaces.nsIObserverService);
    observerService.addObserver(this, 'document-element-inserted', true);

in the frame script, and disappears when I remove it.  I _can_ successfully listen for just DOMContentLoaded in the frame script, but that's not good enough to support all our features.

In the mean time I'll continue trying to make the rest of the features that only depend on DOMContentLoaded work, to see how close I can get there.
Flags: needinfo?(ally)
I'm working on a patch to make Greasemonkey work better without any changes to itself, via CPOWs and some other shim machinery. However, it would be great to make it work with just frame scripts.

I'm not sure why the document-element-inserted observer wouldn't work from a frame script. I think it ought to. Unfortunately, the error message is totally broken because of bug 1067574. Hopefully we can get that fixed in the next few days and make progress here.

By the way, by "chrome script" do you mean a script attached to a XUL overlay with a <script> tag?
> By the way, by "chrome script" do you mean a script attached to a XUL overlay with a <script> tag?

Exactly.  Is there a better word/phrase?
Not that I know of. Makes sense.
Flags: needinfo?(ally)
FYI: My above failure with observing document-element-inserted was totally PEBKAC, it works in a frame script.  I didn't set up QueryInterface on my newly defined object, so the observer refused to use it.  It was indeed bug 1067574 that ate the error otherwise I probably would have figured it out ~instantly.
New failure, from inside a frame script:

    dump('\n\n\nTry to get profile directory:\n');
    var x;
    try {
      x = Components.classes["@mozilla.org/file/directory_service;1"];
    } catch(e) {
      dump('FAILED 1:\n'+e+'\n');
    }
    try {
      x = x.getService(Components.interfaces.nsIProperties);
    } catch(e) {
      dump('FAILED 2:\n'+e+'\n');
    }
    try {
      x = x.get("ProfD", Components.interfaces.nsIFile);
    } catch(e) {
      dump('FAILED 3:\n'+e+'\n');
    }
    dump('\n\n');
  

This (expanded test) code logs:

    Try to get profile directory:
    FAILED 3:
    [Exception... "Component returned failure code: 0x80004005 (NS_ERROR_FAILURE) [nsIProperties.get]"  nsresult: "0x80004005 (NS_ERROR_FAILURE)"  location: "JS frame :: chrome://greasemonkey/content/frame.js :: FrameRunner.prototype.runScripts :: line 119"  data: no]
    
I even moved it down into the (now working) observer method to make sure it wasn't running too early; it always does this.  It can't get the ProfD property?  NS_ERROR_FAILURE doesn't tell me a lot.  This, of course, works from a chrome script, just not from a frame script.

Is this perhaps an intentional frame/chrome access block?  If so, how would/could I have known this without asking?
Yes, frame scripts can't access the profile directory. We eventually want to sandbox the content process, so frame scripts won't be able to access the disk at all. Until that work is finished, we're trying to make it harder to access files so people aren't surprised by sandboxing.

We're working on drafts of documentation right now. I'll try to get this information included.
Oh, ****.  If only the frame script can access the content window, and the frame script also can't read files, then Greasemonkey is effectively broken.  The scripts it runs are files.  This is an important feature: script authors need to be able to edit these files.

Are we going to need to read the files contents then post them as messages from chrome (return value of a sync message?) into content every single time?
(In reply to Anthony Lieuallen from comment #21)
> Are we going to need to read the files contents then post them as messages
> from chrome (return value of a sync message?) into content every single time?

Yes. Although you could avoid sending the contents if the modification date hasn't changed.
Anthony, if you have a moment, would you mind posting the code you have that causes the bad exception message? I foolishly lost my testcase for bug 1067574 and now we're having trouble reproducing it.
Flags: needinfo?(arantius)
Should be https://github.com/arantius/greasemonkey/blob/4c22d03f136177cfb0c4bae38ebf36e55b8d36cf/content/frame-runner.js#L17

Just trying to .addObserver() where the first argument is a javascript object with no QueryInterface.
Flags: needinfo?(arantius)
Adding "dogfood" keyword so this bug shows up on the e10s wiki's list of known issues:

https://wiki.mozilla.org/Electrolysis#Known_Issues
Keywords: dogfood
Finally getting back to actively working on this.  The situation currently is:

1) Firefox of "Mozilla/5.0 (X11; Linux x86_64; rv:36.0) Gecko/20100101 Firefox/36.0" with "Built from https://hg.mozilla.org/mozilla-central/rev/7e3c85754d32"
2) about:preferences, general, Enable E10S currently OFF

If I use the latest Greasemonkey at the master branch (no e10s support yet), it has an "observerService.addObserver(aService, 'document-element-inserted', false);" in our component's code.  That component observes exactly one document-element-inserted per page being loaded in the browser.

If I switch to the in-progress e10s branch, that same component does a "messageManager.loadFrameScript("chrome://greasemonkey/content/framescript.js", true);", and that frame script at the top level does "observerService.addObserver(observer, 'document-element-inserted', false);".  This frame script observes eleven document-element-inserted for each page loaded in the browser.  (Turning E10S on seems to produce 8 rather than 11, but anything beyond 1 is an issue for us.)

Behavior seems the same all the way back to LTS 31.1 (which generates 13).  Putting the observer in the frame script (which is the only way it works once e10s is enabled) seems to be the problem.
It's hard to know what's going on without a more specific example. I tried this myself and it seemed to work. One thing to keep in mind is that frame scripts get run once per frame, and there can be more frames than just the browser tabs. Observers, though, are global. So if there are three tabs open and they all have frame scripts that do the addObserver, then you'll get three notifications.

One way to see if this is happening is to put a dump statement right before you call addObserver. If you're adding the observer N times, then it will get notified N times.

One way to work around this would be to have the frame scripts load a JSM and put your observer there. JSMs are global per-process, so only one will ever be added.
Ok. I can indeed move the observer into a module.  But then I have no message manager, how do I communicate with the rest of the world when the observer triggers?  The frame which can do so is now a separate scope.

I tried, but turns out I was developing with e10s off and when I turn it back on Firefox just crashes.
(In reply to Anthony Lieuallen from comment #28)
> Ok. I can indeed move the observer into a module.  But then I have no
> message manager, how do I communicate with the rest of the world when the
> observer triggers?  The frame which can do so is now a separate scope.

Once the observer fires, you can get the frame script scope and message manager via this code:

function chromeGlobalForDocument(document)
{
    return document
      .defaultView
      .QueryInterface(Ci.nsIInterfaceRequestor)
      .getInterface(Ci.nsIWebNavigation)
      .QueryInterface(Ci.nsIDocShellTreeItem)
      .rootTreeItem
      .QueryInterface(Ci.nsIInterfaceRequestor)
      .getInterface(Ci.nsIContentFrameMessageManager);
}

(Sorry, it's almost comically long and weird.)

So your observer could do something like this:

observe: function(document, topic, data) {
    var chromeGlobal = chromeGlobalForDocument(document);
    chromeGlobal.sendAsyncMessage("GreaseMonkey:DocumentLoaded", {foo: bar});

    // If the frame script contains the line `this.bar = 10;` at the top level,
    // then we'll get 10 here.
    var bar = chromeGlobal.bar;
}

> I tried, but turns out I was developing with e10s off and when I turn it
> back on Firefox just crashes.

Could you please file a separate bug with a link to the crash dump?

Sorry this is so rocky, but it's really helpful to get an outside perspective on all this stuff. It will help us improve our documentation and focus development on the pain points you're experiencing.
Progress!  My existing code was finding the message manager by:

  return aContentWin.QueryInterface(Ci.nsIInterfaceRequestor)
      .getInterface(Ci.nsIWebNavigation)
      .QueryInterface(Ci.nsIDocShell)
      .QueryInterface(Ci.nsIInterfaceRequestor)
      .getInterface(Ci.nsIContentFrameMessageManager);

But when I use your version I don't crash.  Switching nsIDocShell to nsIDocShellTreeItem and then .rootTreeItem seems to be key, not that I have any idea why.  Also: I don't know how to generate/save/link to a crash dump.  Maybe I once did before the crash was isolated inside a tab, but now all I get is a 'reload' button.


However, it seems like the JSM is getting imported twice.  Which attaches the observer twice; one works and one does not (as best I can tell).  When it fails, it looks like:

  JavaScript error: resource://greasemonkey/contentObserver.js, line 219: NS_NOINTERFACE: Component returned failure code: 0x80004002 (NS_NOINTERFACE) [nsIInterfaceRequestor.getInterface]

Line 219 is the ".QueryInterface(Ci.nsIInterfaceRequestor)" line.  And a bunch of "Error: Permission denied to access property" or "can't convert to string" coming from I'm not sure where yet.

Checking whether I can successfully pull out the nsIContentFrameMessageManager immediately when the observer fires (and stopping when I can't) works around that failure.


I have no idea how to diagnose why the JSM is getting imported twice.  What's happening now is:


1) Component registers itself as app-startup observer
   https://github.com/arantius/greasemonkey/blob/5492694dae81d5cdca297a1ff15d57d6ecb34ddd/components/greasemonkey.js#L75
2) Which loads a frame script with the global message manager at that time
   https://github.com/arantius/greasemonkey/blob/5492694dae81d5cdca297a1ff15d57d6ecb34ddd/components/greasemonkey.js#L56
3) Which imports a JSM in the content process (right?) once it runs
   https://github.com/arantius/greasemonkey/blob/5492694dae81d5cdca297a1ff15d57d6ecb34ddd/content/framescript.js#L15
4) Which registers an observer:
   https://github.com/arantius/greasemonkey/blob/5492694dae81d5cdca297a1ff15d57d6ecb34ddd/modules/contentObserver.js#L238


Except step 4 happens twice, where I'd expect step 3 to import the already-existing module, where the observer has already been registered.
It looks like the frame script is running in some kind of chrome docshell tree(?) in addition to the content stuff. By only attaching the observer when rootTreeItem.itemType == typeContent (see https://github.com/Ventero/greasemonkey/commit/cebadca54bb8c2dd9c664cb8202a025c52aa0ee4), the problems mentioned by Anthony in comment 30 disappear.

Is this behavior of the frame script also running in some kind of chrome-ish docshell intended? Or is there maybe a problem with how we load the frame script?
OK, I think I know what's going on.

1. The reason the module is being imported twice is because it gets imported once in the chrome process and once in the content process. The chrome process loads frames for things like about:newtab in its own process.

2. The NS_NOINTERFACE exception is my fault. Upon further testing, it looks like the code I provided only works in the content process. If you run it in the chrome process, it gets the root docshell for the browser.xul window, which doesn't have a message manager.

So I don't think what I proposed before will work. However, I think there's a simpler way to do this. In the frame script, it seems like the following should work:

let obs = {
  observe: function(doc, topic, data) {
    if (!doc.defaultView) return;
    if (doc.defaultView.top !== content) return;
    // do stuff
  }
};
Services.obs.addObserver(obs, "document-element-inserted", false);
addEventListener("unload", () => {
  Services.obs.removeObserver(obs, 'document-element-inserted')
}, false);

Sorry for all the confusion. Hopefully this new version will work better.
FYI current status is now that at least basic script injection seems to function.  Various other things are broken still.  Right now the top thing is that Greasemonkey has its own persistent-value-store API.  A while back when Firefox discouraged use of the preferences store, we moved into a SQLite store (whatever Services.storage.openDatabase() does).  But from content scope, I can't get to the file where that's stored.  Can't create the nsIFile to pass in.  Trying to reference ProfD gives:

   [Exception... "Component returned failure code: 0x80004005 (NS_ERROR_FAILURE) [nsIProperties.get]" nsresult: "0x80004005 (NS_ERROR_FAILURE)" location: "JS frame :: resource://greasemonkey/util/scriptDir.js :: :: line 6" data: no]

Ugh.


A secondary concern: when this error gets raised, attempting to try/catch it and referencing the error object gives

    Permission denied to access property 'valueOf' 

On the caught error object.  This is also new.
Moved everything back out of the JSM and into the frame script as comment #32 suggested and it seems to function generally as expected.


But I'm completely stuck again.  Greasemonkey relies on nsIContentPolicy to detect and redirect load of a ".user.js" file, for installing new scripts.  This interface is network related I guess, so the existing component implementing this interface sees only chrome/about/etc. resources for the main chrome process, and no actual content.

But how do I implement this interface in a frame script?  How do I generate a component with a class ID and register it with the category manager in a content process?

So I gave up and switched to an http-on-modify-request observer.  But it fires twice.  I'm *probably* adding the observer twice (logging shows so), but I only have one frame script and it only attaches once.  But ... frame scripts, how do they work?! [insert ICP magnet meme image]  Why would a line that only exists once get called twice?  Or do I perhaps need another magical condition to ignore certain ignored events?  Whichever "if (!doc.defaultView) return; if (doc.defaultView.top !== content) return;" is ignoring?  But I don't have a document in scope for that in this observer?

In fact, when I load a new tab, I see only one "attach" logged, but still observe every event twice.
Ally: do you know how Anthony can use nsIContentPolicy from a frame script? Or why his observer frame script is seeing multiple http-on-modify-request events?
Flags: needinfo?(ally)
http-on-* events we don't have shims for, but have considered them. Bill is more of an authority on the subject than I am.
Flags: needinfo?(ally) → needinfo?(wmccloskey)
Sorry I dropped the ball on this. I intended to respond much earlier.

Here's an example of registering an nsIContentPolicy in the child process:
http://mxr.mozilla.org/mozilla-central/source/toolkit/components/addoncompat/RemoteAddonsChild.jsm#137

Unfortunately, this has the same problem as before where each tab's frame script will try to re-register the content policy, which won't work. So you would need to do the nsIContentPolicy registration from a JSM that would get loaded by the frame script.

The http-on-modify-request issue is more tricky. You're only allowed to register http-on-modify-request observers in the parent process. I think it should see all HTTP traffic though, including stuff initiated by the content process. Since this would run in the parent process, there's no reason to do it from a frame script.
Flags: needinfo?(wmccloskey)
Hmm, I probably shouldn't have said "The http-on-modify-request issue is more tricky." It might be the best way to go. I just haven't tried it myself.
It's probably too late for any of this feedback to matter, but I need to vent.

This is absolutely no question the worst change I've ever seen to Firefox, as an extension developer for at least eight years.  There's effectively no documentation.  There's a bunch about the message manager and passing messages around, but that's simple/obvious stuff.  A mention that it exists and a link to the relevant IDLs would be sufficient.  The things that actually matter are not documented.

Let's take a recent case as an example.

1. My nsIContentPolicy, running in a component in the parent process, doesn't work.  I don't know this because it was mentioned in the list of things that don't work anymore.  I know it because I had to test every single possible feature and discover what does and does not work.
2. It does "work", it's just that the completely undocumented scope of _how_ it works now is confusing.  I can still set it up and it still gets called, but it only sees chrome requests that happen in the parent process.
3. Move it into the frame script in the child process.  Whoops, for some undocumented reason, the frame script is imported several times and it breaks.
4. Move it into a JSM which gets imported into the frame script -- once per actual process now.  Phew.
5. Oh, but I want to do a simple check to see whether the URI I'm handling is a file:// inside the temporary directory, and act conditionally on that.  Too bad!  No file access!  Just think about instantiating an nsIFile and you get an angry exception.
6. Ugh, fine, I'll set up a message name, put a handler for it in just the right place, and then I have to pass a message, pick the "arguments" out of it to handle this check in the parent process which can create an nsIFile, use that to return a value, then unwrap the list of potentially many return values (even though I *know* I'm only going to return one boolean) in the child.  (And I only want to call .contains() on two file references!).
7. Nope!  How does the code running in a JSM send a message?!  The documentation conveniently forgets to mention in what scope the keywords it references are actually defined.  They're not defined in modules, which I'm forced to use.
8. Aha, it's inscrutable series of .QueryInterface()s and nsIInterfaceRequestors and .getInterface()s.
9. Finally!  After all that I've unwound all the undocumented changes to simply restore the existing functionality.
10. Turn e10s off to test.  Everything is broken again.  The behavior is different in several more undocumented ways, all the careful work from above fails.  Forget about ever releasing this code, unless you're willing to start over from step one again (several times).

This is the sort of process I've gone through dozens of times already.  As a developer, this STINKS.

I can't instantiate an nsIFile at all, even if I'm not going to read nor write a file.  I can't import the AddonManager at all, even just to check what my own installed version is.  The nsIContentPolicy shouldCall() method receives a <browser> when e10s is off, but an nsIDomWindow when it's on (repeat the whole "how do I get a message manager _now_?!" frustration).  The list goes on but this is just a few that are top-of-mind.
Another one I'm totally lost on:

We implement a custom protocol [1], so that (among other things) a script can e.g. put an image or CSS resource into a page without leaking the file:/// path inside the profile to the content page.

With E10S off (in nightly 38 as of today) this works fine.  With E10S on it complains "Firefox doesn't know how to open this address, because one of the following protocols (greasemonkey-script) isn't associated with any program or is not allowed in this context."  When I navigate directly to such a URL.

Do I need to force the component to be loaded and registered in every child process?  I gave that a shot, but if it's the right approach I don't quite know how to do it.

[1] https://github.com/greasemonkey/greasemonkey/blob/master/components/scriptProtocol.js
Additional info on similar issue:

Script insert a button and addeventlistenr. On activation it resets the GM saved data.
It was working fine and stopped working with FF 35.0

The code is simple:

function resetDB() {
  
  var keys = GM_listValues();
  console.log(keys.length + ' ' + keys);
  for (var i = 0, len = keys.length; i < len; i++) {
    GM_deleteValue(keys[i]);
  }
  var keys = GM_listValues();
  console.log(keys.length + ' ' + keys);
}

The error:
Error: Permission denied to access property 'toString'
Error: Permission denied to access property 'message'
uncaught exception: unknown (can't convert to string)
I'm pretty sure this bug is now completed.  Check out our nightly builds:

https://arantius.com/misc/gm-nightly/

to confirm.  We might not be 100% feature complete, but we appear to be compatible.  It was quite an exercise to reach this point.
The fixed version is now on AMO. I did a fairly basic test and it worked well.

Thank you, Anthony!
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.