Closed Bug 1255605 Opened 8 years ago Closed 8 years ago

Search bar broken on upgrade from 44 to 45.

Categories

(Firefox :: Search, defect)

45 Branch
Unspecified
All
defect
Not set
normal

Tracking

()

VERIFIED FIXED
Firefox 48
Tracking Status
firefox45 blocking verified
firefox46 + verified
firefox47 --- verified
firefox48 --- verified
firefox-esr45 --- verified
relnote-firefox --- 45+

People

(Reporter: amosonn, Assigned: florian)

References

Details

(Keywords: regression)

Attachments

(6 files, 3 obsolete files)

User Agent: Mozilla/5.0 (X11; Linux x86_64; rv:44.0) Gecko/20100101 Firefox/44.0
Build ID: 20160220080452

Steps to reproduce:

I upgraded from 44.0.2 to 45 (on arch-linux).


Actual results:

The search bar stopped working - all search engines disappeared, and the menu of about:preferences#search was all grayed-out. Switch to safe-mode didn't help.
Print to stdout was:
*************************
A coding exception was thrown and uncaught in a Task.

Full message: TypeError: engine is null
Full stack: this.ContentSearch._currentEngineObj<@resource://app/modules/ContentSearch.jsm:487:9
TaskImpl_run@resource://gre/modules/Task.jsm:315:40
TaskImpl@resource://gre/modules/Task.jsm:276:3
createAsyncFunction/asyncFunction@resource://gre/modules/Task.jsm:250:14
this.ContentSearch._currentStateObj<@resource://app/modules/ContentSearch.jsm:468:28
TaskImpl_run@resource://gre/modules/Task.jsm:315:40
TaskImpl@resource://gre/modules/Task.jsm:276:3
createAsyncFunction/asyncFunction@resource://gre/modules/Task.jsm:250:14
this.ContentSearch._onMessageGetState@resource://app/modules/ContentSearch.jsm:258:12
this.ContentSearch._onMessage<@resource://app/modules/ContentSearch.jsm:252:13
TaskImpl_run@resource://gre/modules/Task.jsm:315:40
Handler.prototype.process@resource://gre/modules/Promise.jsm -> resource://gre/modules/Promise-backend.js:933:23
this.PromiseWalker.walkerLoop@resource://gre/modules/Promise.jsm -> resource://gre/modules/Promise-backend.js:812:7
this.PromiseWalker.scheduleWalkerLoop/<@resource://gre/modules/Promise.jsm -> resource://gre/modules/Promise-backend.js:746:1

*************************
Does the search bar work in a fresh profile?
https://support.mozilla.org/en-US/kb/profile-manager-create-and-remove-firefox-profiles
Component: Untriaged → Search
Flags: needinfo?(amosonn)
Yes, even on my other non-fresh profiles, which have different search-engines configured.
Flags: needinfo?(amosonn)
I tried downgrading, "Restore Default Search Engines" and upgrading, but to no avail.
Could you try to use the tool mozregression to find a regression range in FF45, please.
See http://mozilla.github.io/mozregression/ for the FAQ. Run the command "mozregression --good=44" then copy the final pushlog provided in the console output.
Flags: needinfo?(amosonn)
https://hg.mozilla.org/integration/fx-team/pushloghtml?fromchange=dcd113d0102e773bb3ff1806bf4a5c484039c492&tochange=2d9c58098eb3fd5d79a05ccea2c1457dd4b2ec25

The regression tool also said 

17:56.48 INFO: Looks like the following bug has the changes which introduced the regression:
https://bugzilla.mozilla.org/show_bug.cgi?id=1203167

but that's pretty clear from the pushlog.
One (maybe) surprising thing is that in the nightly builds of the regression, the "bad" builds had default search engines, not a broken search bar as I've encountered.
Flags: needinfo?(amosonn)
Florian, your thoughts about this regression?
Blocks: 1203167
Status: UNCONFIRMED → NEW
Ever confirmed: true
Flags: needinfo?(florian)
OS: Unspecified → Linux
This appears to be happening with a specific profile, so in order to figure out what's happening, I would need to have access to that profile.

Could you please attach or email me the search-related files of your profile that reproduces the bug?

I will need:
- search.json
- search-metadata.json
- the searchplugins/ folder
Flags: needinfo?(florian) → needinfo?(amosonn)
Attached file search.json (obsolete) —
Attached file search-metadata.json (obsolete) —
Flags: needinfo?(amosonn)
Thanks! I'll need the searchplugins/ folder too to try to reproduce (you can zip it to attach it as a single file).
Yeah, I'm sorry I didn't write before, I intended to write just after uploading the file and then things happened :). The json-s I uploaded are after I've deleted all the search engines, and clicked "Restore Default Search Engines", so the searchplugins/ folder is empty. Still, the regression is there; for my local build the search bar is broken; for the nightly builds you can check the order of the engines, if it correctly loads the profile then Google is first, then Yahoo, then the rest; if it defaults, then it's something else.
I can upload the previous version, if you think it's better.
Thanks! Do you have a backup of the profile before the update to 45? That would be the best place to start debugging.

But I'll definitely also look at the files you already attached. Firefox should never end-up in a state with no search plugin left.

By the way, builds after bug 1203167 landed will not use any of the files I requested, they are just read once for migration. The only relevant file (once it exists) is search.json.mozlz4. So... this file may also be a good place to start figuring out why you end-up with no search plugin at all (since bug 438599, we should at least make Google or Yahoo re-appear automatically if the profile somehow contains search settings deleting all engines).

Do you have add-ons installed that could play a role here? (I assumed not because comment 0 said safe-mode didn't help; but they could have had an impact during the migration.)
A user said he fixed it by deleting the file "search.json.mozlz4" from is profile folder.
http://forums.mozillazine.org/viewtopic.php?p=14539391#p14539391

Can you test by renaming this file (add .old e.g.) then restart Firefox.
Flags: needinfo?(amosonn)
(In reply to Loic from comment #13)
> A user said he fixed it by deleting the file "search.json.mozlz4" from is
> profile folder.
> http://forums.mozillazine.org/viewtopic.php?p=14539391#p14539391
> 
> Can you test by renaming this file (add .old e.g.) then restart Firefox.

If this "fixes" it, then I would really like to have a look at the broken search.json.mozlz4 file.
It indeed fixed it.
I'll upload the two mozlz4-s.
Attached file search.json
Attachment #8729847 - Attachment is obsolete: true
Attached file search-metadata.json
Attachment #8729849 - Attachment is obsolete: true
Attached file searchplugins
Flags: needinfo?(amosonn)
Tracking as it is critical.
Not Linux specific (of the two people who mentioned this on the mozillazine forum, one is running Windows 10 and the other Windows 7).
OS: Linux → All
There doesn't seem to be anything broken in the old files (search.json, search-metadata.json, searchplugins/*), as after removing search.json.mozlz4 the migration worked fine for you. I also tried them in a profile created with Firefox 44.0.2, and couldn't reproduce the bug.

Here is the pretty printed content of the broken search.json.mozlz4 file:
{
	version: 1,
	buildID: "20160308181531",
	locale: "en-US",
	visibleDefaultEngines: ["amazondotcom", "bing", "eBay", "google", "twitter", "wikipedia", "yahoo", "ddg"],
	metaData: {
		searchDefault: "Google",
		searchDefaultHash: "EbFeLgNOVI9RTSWJZh4LctYWQNvevvckIeT6lFzimwM=",
		searchDefaultExpir: 1460239051466
	},
	engines: []
}

So there are 2 problems here:
1. Why did we ever save this file with an empty array of engines? Given that the profile isn't the problem, and you can't reproduce, it's probably something that happened at the time of the migration. For example, maybe Firefox 45 was closed very quickly during its first startup, before the migration was completed?
2. With such a broken file in the profile, Firefox should still be functional, and should overwrite the file at the next startup.
This is the safest patch I would write. For something that'll be uplifted with limited testing, I didn't want to write a patch based on guesses.

The attached patch:
- prevents writing a cache file that doesn't contain any engine.
This case should never occur, but this bug noticed by at least 3 different users proves that there's somehow a race condition that causes us to attempt to do it. This may cause us to lose some data (likely unimportant data, as all the data contained in attachment 8729970 [details] was downloaded from Mozilla servers anyway), but this seems much better than breaking the file containing the user's search settings.
- ignores search.json.mozlz4 files containing no engine.
I had some hesitation on this part: should we take into account the metadata from such a file and force a cache update (to re-add the built-in engines from the omni.ja file), or should we ignore the cache completely and re-attempt to migrate older settings? I've decided to do the latter, because in the case reported here it would be the right thing to do: search-metadata.json had user valid data that the broken search.json.mozlz4 file didn't contain. In the future, such broken files shouldn't occur again, so we should never have a profile where search.json.mozlz4 has no engine and we don't have search.json/search-metadata.json to re-migrate from.

Interestingly, we had a case in our xpcshell tests (test_hidden.js) where we were writing a cache file without any engine. This was the result of a race between the code unInitializing the search service asynchronously, and the code re-initializing it synchronously. I fixed the test to wait for the uninit to be finished before forcing a sync init. For this race to happen for real users, there would need to be an add-on changing the value of the general.useragent.locale preference (this triggers an asynchronous re-initialization of the search service) and forcing a synchronous initialization of the search service immediately afterwards. This doesn't seem likely, so I still think we probably have another race (eg. between async-init and shutdown).
Attachment #8730246 - Flags: feedback?(adw)
Assignee: nobody → florian
Status: NEW → ASSIGNED
As far as I know, I didn't close FF too quickly. I do have a user-agent changed extension, however, which might change the useragent at startup, and this might trigger the race condition you described. Still doesn't explain why it worked the second time around.
(In reply to amosonn from comment #25)
> As far as I know, I didn't close FF too quickly. I do have a user-agent
> changed extension, however, which might change the useragent at startup, and
> this might trigger the race condition you described.

Can you give me the name of this extension?
Attached patch Patch v2Splinter Review
Now with tests.
Attachment #8730296 - Flags: review?(adw)
Attachment #8730246 - Attachment is obsolete: true
Attachment #8730246 - Flags: feedback?(adw)
Comment on attachment 8730296 [details] [diff] [review]
Patch v2

Review of attachment 8730296 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks for explaining.  Would you mind throwing new Error("message") instead of just "message" so that we get stack traces when that happens?
Attachment #8730296 - Flags: review?(adw) → review+
Florian, could you land this asap and ask for an uplift to 45 & 45esr? Merci!
Flags: needinfo?(florian)
Is this broken for 46/47/48 as well? Or only on 45?
It's late and I assume that Florian is gone for the day. Push to Try looks good, so after discussion with Drew on IRC, I'm landing it as-is on fx-team to keep this moving.
Comment on attachment 8730296 [details] [diff] [review]
Patch v2

Approval Request Comment
[Feature/regressing bug #]: bug 1203167
[User impact if declined]: Some users may be left with a Firefox unable to access any search plugin.
[Describe test coverage new/current, TreeHerder]: the patch contains xpcshell tests.
[Risks and why]: Risks as limited as I could; the patch only adds exceptions in cases that 'should never happen', to prevent user-visible consequences when these cases do happen for unknown reasons.
[String/UUID change made/needed]: none.

(In reply to Ryan VanderMeulen [:RyanVM] from comment #32)
> It's late and I assume that Florian is gone for the day. Push to Try looks
> good, so after discussion with Drew on IRC, I'm landing it as-is on fx-team
> to keep this moving.

Thanks!
Flags: needinfo?(florian)
Attachment #8730296 - Flags: approval-mozilla-release?
Attachment #8730296 - Flags: approval-mozilla-esr45?
Attachment #8730296 - Flags: approval-mozilla-beta?
Attachment #8730296 - Flags: approval-mozilla-aurora?
Flags: in-testsuite+
https://hg.mozilla.org/mozilla-central/rev/b21c3e5856ae
Status: ASSIGNED → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 48
Comment on attachment 8730296 [details] [diff] [review]
Patch v2

Needed for the dot releases.
Attachment #8730296 - Flags: approval-mozilla-release?
Attachment #8730296 - Flags: approval-mozilla-release+
Attachment #8730296 - Flags: approval-mozilla-esr45?
Attachment #8730296 - Flags: approval-mozilla-esr45+
Attachment #8730296 - Flags: approval-mozilla-beta?
Attachment #8730296 - Flags: approval-mozilla-beta+
Attachment #8730296 - Flags: approval-mozilla-aurora?
Attachment #8730296 - Flags: approval-mozilla-aurora+
(In reply to Florian Quèze [:florian] [:flo] from comment #26)
> (In reply to amosonn from comment #25)
> > As far as I know, I didn't close FF too quickly. I do have a user-agent
> > changed extension, however, which might change the useragent at startup, and
> > this might trigger the race condition you described.
> 
> Can you give me the name of this extension?

"User Agent Switcher". But as I said, it worked the second time around, without disabling extensions.
Added to the release notes "Fix an issue which could cause the list of search providers to be empty (1255605)"
Don't hesitate to propose a better wording
Verified that updating to Firefox 45.0.1 RC (Build ID: 20160315153207) and Firefox 45.0.1 ESR (Build ID: 20160316151906) fixes the search bar for the users with the search bar broken because of a bad search.json.mozlz4 file. The verification was done by updating from Firefox 45 RC and Firefox 45 ESR to Firefox 45.0.1 RC and Firefox 45.0.1 ESR.

Verified on Mac OS X 10.11, Windows 10 and Ubuntu 14.04.
Verified also that updating to Firefox 46 beta 2 (Build ID: 20160316065941), Firefox Developer Edition 47 (Build ID: 20160317004016) and Nightly 48 (20160317030235) fixes the search bar broken because of a bad search.json.mozlz4 file. The verification was done by updating from Firefox 46 beta 1 (Build ID: 20160307215824), Firefox Developer Edition 47 (Build ID: 20160311004048) and Nightly 48 (Build ID: 20160311030233).

Verified on Mac OS X 10.11, Windows 10 and Ubuntu 14.04.
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: