Closed Bug 1118958 Opened 9 years ago Closed 8 years ago

Sync fails on unitymedia DS-Lite networks

Categories

(Firefox :: Sync, defect, P4)

36 Branch
defect

Tracking

()

RESOLVED FIXED

People

(Reporter: ultraviolett2, Assigned: bobm)

References

Details

Attachments

(2 files)

User Agent: Mozilla/5.0 (X11; Linux i686; rv:34.0) Gecko/20100101 Firefox/34.0
Build ID: 20141201180357

Steps to reproduce:

* I am running Arch Linux with 32 bit, Firefox Version 34.0.5.
* I ran rm -rf ~/.mozilla/ to remove all user config files.
* I changed logonsuccess in about:config to true to see all logs
* After that I created a new sync account, and under manage I excluded everything from beeing synced except booksmarks.
* Only this client is connected to this account.
* Then I triggered manually a sync


Actual results:

1420667811202	Sync.ErrorHandler	DEBUG	Flushing file log.
1420667811205	Sync.Status	DEBUG	Status.service: error.sync.failed_partial => success.status_ok
1420667811208	Sync.ErrorHandler	DEBUG	Log cleanup threshold time: 1419803811208
1420667811208	Sync.ErrorHandler	DEBUG	No logs to clean up.
1420667812194	Sync.Tracker.Bookmarks	DEBUG	Saving changed IDs to bookmarks
1420668057692	Sync.Status	DEBUG	Status.service: success.status_ok => success.status_ok
1420668057692	Sync.Status	DEBUG	Status.service: success.status_ok => success.status_ok
1420668057695	Sync.ErrorHandler	DEBUG	Beginning user-triggered sync.
1420668057695	Sync.Service	DEBUG	User-Agent: Firefox/34.0.5 FxSync/1.36.0.20141201180357.
1420668057696	Sync.Service	INFO	Starting sync at 2015-01-07 23:00:57
1420668057696	Sync.SyncScheduler	DEBUG	Clearing sync triggers and the global score.
1420668057699	Sync.Status	INFO	Resetting Status.
1420668057699	Sync.Status	DEBUG	Status.service: success.status_ok => success.status_ok
1420668057701	Sync.BrowserIDManager	DEBUG	_ensureValidToken already has one
1420668059155	Sync.Resource	DEBUG	mesg: GET success 200 https://sync-189-us-west-2.sync.services.mozilla.com/1.5/17661343/info/collections
1420668059155	Sync.Resource	DEBUG	GET success 200 https://sync-189-us-west-2.sync.services.mozilla.com/1.5/17661343/info/collections
1420668059158	Sync.Service	DEBUG	Fetching global metadata record
1420668059159	Sync.Service	DEBUG	Weave Version: 1.36.0 Local Storage: 5 Remote Storage: 5
1420668059159	Sync.Service	INFO	Sync key is up-to-date: no need to upgrade.
1420668059159	Sync.Service	DEBUG	Fetching and verifying -- or generating -- symmetric keys.
1420668059159	Sync.Service	INFO	Testing info/collections: {"meta":1420667325.98,"crypto":1420666971.7,"clients":1420667685.57}
1420668059159	Sync.CollectionKeyManager	INFO	Testing for updateNeeded. Last modified: 1420666971.7
1420668059159	Sync.Synchronizer	DEBUG	Refreshing client list.
1420668059161	Sync.Engine.Clients	INFO	0 outgoing items pre-reconciliation
1420668059162	Sync.Engine.Clients	INFO	Records: 0 applied, 0 successfully, 0 failed to apply, 0 newly failed to apply, 0 reconciled.
1420668059163	Sync.Synchronizer	INFO	Updating enabled engines: 2 clients.
1420668059168	Sync.Engine.Bookmarks	DEBUG	First sync, uploading all items
1420668059178	Sync.Store.Bookmarks	DEBUG	No node for GUID mobile; returning no children.
1420668059178	Sync.Engine.Bookmarks	INFO	12 outgoing items pre-reconciliation
1420668059179	Sync.Engine.Bookmarks	DEBUG	Bookmarks backup starting.
1420668059197	Sync.Engine.Bookmarks	DEBUG	Bookmarks backup done.
1420668059198	Sync.Engine.Bookmarks	INFO	Records: 0 applied, 0 successfully, 0 failed to apply, 0 newly failed to apply, 0 reconciled.
1420668059205	Sync.Store.Bookmarks	DEBUG	No node for GUID mobile; returning no children.
1420668059242	Sync.Engine.Bookmarks	INFO	Uploading all of 12 records
1420668059243	Sync.BrowserIDManager	DEBUG	_ensureValidToken already has one
1420668059245	Sync.Collection	DEBUG	POST Length: 6066
1420668060161	Sync.Tracker.Clients	DEBUG	Saving changed IDs to clients
1420668060178	Sync.Tracker.Bookmarks	DEBUG	Saving changed IDs to bookmarks
1420668184605	Sync.Status	DEBUG	Status.sync: success.sync => error.login.reason.network
1420668184605	Sync.Status	DEBUG	Status.service: success.status_ok => error.sync.failed
1420668184606	Sync.Status	DEBUG	Status for engine bookmarks: error.engine.reason.unknown_fail
1420668184606	Sync.Status	DEBUG	Status.service: error.sync.failed => error.sync.failed_partial
1420668184606	Sync.ErrorHandler	DEBUG	bookmarks failed: Error: NS_ERROR_NET_RESET (resource://services-sync/resource.js:394:18) JS Stack trace: Res_post@resource.js:428:12 < SyncEngine.prototype._uploadOutgoing/doUpload<@engines.js:1384:20 < innerBind@util.js:525:42 < SyncEngine.prototype._uploadOutgoing@engines.js:1432:9 < SyncEngine.prototype._sync@engines.js:1483:7 < wrappedSync@bookmarks.js:214:11 < _sync@bookmarks.js:211:1 < WrappedNotify@util.js:148:21 < Engine.prototype.sync@engines.js:655:5 < _syncEngine@enginesync.js:199:7 < sync@enginesync.js:149:15 < onNotify@service.js:1275:7 < WrappedNotify@util.js:148:21 < WrappedLock@util.js:103:16 < _lockedSync@service.js:1265:12 < sync/<@service.js:1257:14 < WrappedCatch@util.js:77:16 < sync@service.js:1245:5
1420668184607	Sync.Synchronizer	INFO	Sync completed at 2015-01-07 23:03:04 after 126.91 secs.
1420668184608	Sync.Declined	DEBUG	Handling remote declined: []
1420668184609	Sync.Declined	DEBUG	Handling local declined: []
1420668184609	Sync.Declined	DEBUG	Declined changed? false
1420668184609	Sync.Service	INFO	No change to declined engines. Not reuploading meta/global.
1420668184610	Sync.SyncScheduler	DEBUG	Next sync in 600000 ms.
1420668184611	Sync.ErrorHandler	DEBUG	Some engines did not sync correctly.



Expected results:

I am having this bug since a few month, I think it started someday in November. I thought its something temporarely, but its not.
Component: Untriaged → Sync
I narrowed down the problem a bit. Receiving Bookmarks works, but sending the causes the problem. Furthermore, it seems to be related with the network connection. My Computers are connected via a cable modem to the internet. If I connect the via my mobile hotspot sync works. DNS settings are the same for both gateways, so its apparently depending on the connection type.

Again to mention, Sync has worked in the past but since a few month it generates the posted error log. I´d like to provide more information if it would be of any help. Maybe one of you guys can have a look into my account logs on server side. I or you could just create a test account. Furthermore I could try it with an own sync server, but I am not sure this would be of any benefit.

As far as I interpret the error log it has nothing to do with the sync itself, rather with login to the account. But that would not explain why receiving bookmarks does work.
follow up here: https://support.mozilla.org/en-US/questions/1045372#answer-697258
Version: 34 Branch → 36 Branch
Summary: Clean Profile and new account, still Firefox 34.0.5 sync error → Clean Profile and new account, still Firefox 34.0.5,35 and 36 sync error
Yes, these logs indicate a network error between you and the various sync servers.
Update: I narrowed down the problem. My guess is that the problem lies in my internet providers network (http://www.unitymedia.de/kabelbw/). I am connected through DS-Lite (http://de.wikipedia.org/wiki/IPv6#Dual-Stack_Lite_.28DS-Lite.29). However the support says nothing was changed on my configuration when my issues began, However, you never know. Or did something change on the FF-sync-servers, what might cause DS-Lite issues? Have you heard about other people having issues with DS-Lite?

I took one of my notebooks to a friend, who is also connected to  Unitymedia, but with DS without "Lite", and FF sync did work!! Unfortunately Unitymedia does reject my request to switch my DS-Lite, for what I never asked for, to real DS.
Tarek,
  Your team looks after the Sync servers now, right? Do you have any insights into comment 4?
Flags: needinfo?(tarek)
I can confirm ultraviolett2's findings. I am also connected through DS-Lite from the same ISP.
Status: UNCONFIRMED → NEW
Ever confirmed: true
Summary: Clean Profile and new account, still Firefox 34.0.5,35 and 36 sync error → Sync fails on unitymedia DS-Lite networks
 ultraviolett2@arcor.de,  seapossession@gmail.com,

Any chance you would be available to try some live troubleshooting on this issue?  I'm generally available during US/Pacific business hours, and marginally otherwise.  Please feel free to join #sync on Mozilla's IRC network: https://wiki.mozilla.org/IRC and hit us up there.
Flags: needinfo?(ultraviolett2)
Flags: needinfo?(seapossession)
https://support.mozilla.org/en-US/questions/1059072 is another user with the same problem with the same ISP and using DS-Lite
I made this bug report a while ago. Same problem (different title).

https://bugzilla.mozilla.org/show_bug.cgi?id=1141272
After contacting the support of unitymedia and giving them a link to this bug report they finaly switched me back to IPv4 about 2 weeks ago. Sync works now, so I can not assist you in live troubleshooting anymore.
Flags: needinfo?(ultraviolett2)
Flags: needinfo?(tarek)
Flags: needinfo?(seapossession)
(In reply to seapossession from comment #12)
> Created attachment 8604185 [details]
> error-1431364438347.txt - Last sync-log after testing with bobm

This log indicates that the problem was in talking to the Token service. And, consequently, explains why we weren't seeing the users' traffic to the Sync node.  We'll need to think of a way to track this on the Token service side.
> 1431455326772	FirefoxAccounts	ERROR	HAWK.signCertificate error: {"error":{},"errorString":"[Exception... \"NS_ERROR_NET_RESET\"  nsresult: \"0x804b0014 (NS_ERROR_NET_RESET)\"  location: \"JS frame :: resource://services-common/rest.js :: onStopRequest :: line 460\"  data: no]","message":null,"code":null,"errno":null}

This error appears to be happening even earlier in the flow, when talking to the FxA auth server to get a fresh signed certificate.
Here are two observations I made that might help to fix this bug:

1. This only happens on my laptop running linux (Arch and Ubuntu). Firefox sync is working fine on my desktop running Windows 8.1. It also works on my Android hardware as far as I can tell.
2. When I connect my laptop to the internet using my android phone as a hotspot, sync also work just fine.

I hope this helps somehow.
(In reply to seapossession from comment #16)
> I hope this helps somehow.

It does, thanks.
Summary: Sync fails on unitymedia DS-Lite networks → Sync fails on linux on unitymedia DS-Lite networks
Please let me know if you need me for further testing.
(In reply to seapossession from comment #18)
> Please let me know if you need me for further testing.

Can you try lowering your MTU?  We suspect you may be running in to the problem outlined here: https://rachelbythebay.com/w/2015/05/15/pmtud/
You are absolutely correct! Enabling tcp_mtu_probing did fix the problem. Firefox's syncing fine now. Thank you so much!!!
Another user on IRC via philipp today. Same setup.
there are other reports describing what appears to be a general bug in the UnityMedia network (a german cable provider with 3 million broadband-subscribers - not sure how man of those have DS-Lite). 
for example at https://freifunk-troisdorf.de/probleme-mit-unitymedia-ds-lite-kunden-behoben/#disqus_thread (german) - translation: "there's a problem with IPv4 to DS-Lite connections - packets over the MTU don't get fragmented (like they are supposed to be by the ipv4 standard), but will be rejected. In case a "don't fragment"-bit is set, no ICMP-packet is generated.
the problem can be worked around by reducing the packet size of the tunnel, by experience a MTU of 1312 in fastd worked as a solution."

not sure if this information makes it possible to address the problem from our side...
triage: setting to P4 - would love a contributor to help - MTV/SF offices don't have repro case in house.
Priority: -- → P4
I had a similar issue, but instead of it being ISP related - I simply had an MTU set to 9k on OSX 10.10.4 using Firefox 40.0b1 on bug id: 1180425
¡Hola!

From a Debian user on IRC:

"hg42: FYI: it was indeed a unitymedia provider problem, which is solved by setting sysctl -w net.ipv4.tcp_mtu_probing=1 (or add this to sysctl.conf), thanks, and bye"
Couldn't we get IPv6 on the Sync servers to sidestep this problem for UnityMedia users? That would also have the advantage of being somewhat faster [1] and of providing some resilience in the case of v4-only or v6-only network outages.

[1]: http://tinyurl.com/nrosrhn
Hallo,
it seems the outcome of my problem is quite similar to what is described here although the root cause is properly different.
Here it comes:
My company's network as been recently upgraded with the Zscaler proxy solution. I was able to regain internet access by following that thread [1]. Unfortunately that dind't worked out for Firefox Sync. I'm able to log-in (verified by getting a mail from Firefox Accounts) but Firefox isn't stil able to sycn. I suspect that is somehow related to the certification signing process/getting token8gaining access what is properly blocked by zscaler.
Before FF 29, it seems it was possible to solve that issue see here [2].

BTW. I've already started a thread on mozilla support [3].

(Getting IT involved is not an option. I'm glad that FF runs instead using the outdated IE 9. I'm fully aware of IT policies etc, that hasn't been a problem for some years.)

[1] https://superuser.com/questions/115349/firefox-this-connection-is-untrusted-behind-corporate-firewall
[2] http://www.discourse.net/2010/12/how-to-solve-the-dreaded-firefox-sync-error-while-signing-in-problem/#comment-1358006
[3] https://support.mozilla.org/de/questions/1088686#answer-796661

Here is the sync log:
1445238627600	Sync.Service	INFO	Loading Weave 1.43.0
1445238627603	Sync.Engine.Clients	DEBUG	Engine initialized
1445238627605	Sync.Engine.Clients	DEBUG	Resetting clients last sync time
1445238627658	Sync.Engine.Bookmarks	DEBUG	Engine initialized
1445238627661	Sync.Engine.Forms	DEBUG	Engine initialized
1445238627664	Sync.Engine.History	DEBUG	Engine initialized
1445238627666	Sync.Engine.Passwords	DEBUG	Engine initialized
1445238627668	Sync.Engine.Prefs	DEBUG	Engine initialized
1445238627671	Sync.Engine.Tabs	DEBUG	Engine initialized
1445238627671	Sync.Engine.Tabs	DEBUG	Resetting tabs last sync time
1445238627676	Sync.Engine.Addons	DEBUG	Engine initialized
1445238627677	Sync.Service	INFO	Mozilla/5.0 (Windows NT 6.1; WOW64; rv:41.0) Gecko/20100101 Firefox/41.0
1445238627679	Sync.SyncScheduler	DEBUG	Clearing sync triggers and the global score.
1445238627680	Sync.Status	DEBUG	Status.service: success.status_ok => success.status_ok
1445238627680	Sync.Service	DEBUG	Caching URLs under storage user base: https://sync-264-us-west-2.sync.services.mozilla.com/1.5/31260094/
1445238627680	Sync.Status	DEBUG	Status.service: success.status_ok => success.status_ok
1445238627680	Sync.AddonsReconciler	INFO	Registering as Add-on Manager listener.
1445238627680	Sync.AddonsReconciler	DEBUG	Adding change listener.
1445238627681	Sync.Tracker.History	INFO	Adding Places observer.
1445238627796	Sync.Engine.Greasemonkey	DEBUG	Engine initialized
1445238627799	Sync.Engine.AdblockPlus	DEBUG	Engine initialized
1445238627800	Sync.Status	DEBUG	Status.service: success.status_ok => success.status_ok
1445238629895	FirefoxAccounts	DEBUG	FxAccountsWebChannel registered: account_updates with origin https://accounts.firefox.com
1445238631431	Sync.Tracker.Greasemonkey	WARN	Changed IDs file greasemonkey contains non-object value.
1445238631525	FirefoxAccounts	INFO	returning partial account data as the login manager is locked.
1445238631677	FirefoxAccounts	DEBUG	FxAccountsProfileClient: Initialized
1445238631678	FirefoxAccounts	DEBUG	FxAccountsProfileClient: Requested profile
1445238631679	FirefoxAccounts	DEBUG	getOAuthToken enter
1445238631702	FirefoxAccounts	INFO	returning partial account data as the login manager is locked.
1445238631738	FirefoxAccounts	DEBUG	FxAccountsOAuthGrantClient Initialized
1445238631739	FirefoxAccounts	DEBUG	getOAuthToken fetching new token from: https://oauth.accounts.firefox.com/v1
1445238631739	FirefoxAccounts	DEBUG	enter getAssertion()
1445238632211	Sync.Status	DEBUG	Status.service: success.status_ok => success.status_ok
1445238632213	FirefoxAccounts	INFO	returning partial account data as the login manager is locked.
1445238632247	Sync.Service	DEBUG	User-Agent: Firefox/41.0.2 FxSync/1.43.0.20151014143721.
1445238632247	Sync.Service	INFO	Starting sync at 2015-10-19 09:10:32
1445238632247	Sync.Service	DEBUG	In sync: should login.
1445238632248	Sync.Status	DEBUG	Status.service: success.status_ok => success.status_ok
1445238632248	Sync.Status	DEBUG	Status.service: success.status_ok => success.status_ok
1445238632309	Sync.BrowserIDManager	INFO	Waiting for user to be verified.
1445238632309	FirefoxAccounts	DEBUG	already verified
1445238632309	Sync.BrowserIDManager	INFO	Starting fetch for key bundle.
1445238632310	Sync.BrowserIDManager	INFO	Unable to fetch keys (master-password locked?), so aborting token fetch
1445238632310	Sync.BrowserIDManager	INFO	Background fetch for key bundle done
1445238632310	Sync.Status	DEBUG	Status.login: success.login => success.login
1445238632310	Sync.Status	DEBUG	Status.service: success.status_ok => success.status_ok
1445238632310	Sync.Service	INFO	Logging in the user.
1445238632310	Sync.Service	DEBUG	Caching URLs under storage user base: https://sync-264-us-west-2.sync.services.mozilla.com/1.5/31260094/
1445238632431	FirefoxAccounts	DEBUG	got keyPair
1445238632431	FirefoxAccounts	DEBUG	getCertificateSigned: true true
1445238634221	Hawk	DEBUG	(Response) /certificate/sign: code: 401 - Status text: Unauthorized
1445238634221	Sync.RESTResponse	TRACE	Processing response headers.
1445238634222	Hawk	DEBUG	Clock offset vs https://api.accounts.firefox.com/v1: -2222
1445238634222	Hawk	DEBUG	Received 401 for /certificate/sign: retrying
1445238634657	Hawk	DEBUG	(Response) /certificate/sign: code: 401 - Status text: Unauthorized
1445238634657	Sync.RESTResponse	TRACE	Processing response headers.
1445238634657	Hawk	DEBUG	Clock offset vs https://api.accounts.firefox.com/v1: -1657
1445238634658	FirefoxAccounts	ERROR	error POSTing /certificate/sign: {"code":401,"errno":109,"error":"Unauthorized","message":"Missing authentication","info":"https://github.com/mozilla/fxa-auth-server/blob/master/docs/api.md#response-format"}
1445238634658	FirefoxAccounts	ERROR	HAWK.signCertificate error: {"code":401,"errno":109,"error":"Unauthorized","message":"Missing authentication","info":"https://github.com/mozilla/fxa-auth-server/blob/master/docs/api.md#response-format"}
1445238634658	FirefoxAccounts	ERROR	FxA rejecting with error AUTH_ERROR, details: {"code":401,"errno":109,"error":"Unauthorized","message":"Missing authentication","info":"https://github.com/mozilla/fxa-auth-server/blob/master/docs/api.md#response-format"}
1445238634660	FirefoxAccounts	ERROR	Could not retrieve profile data: Error: AUTH_ERROR (resource://gre/modules/FxAccounts.jsm:1377:18) JS Stack trace: FxAccountsInternal.prototype._error@FxAccounts.jsm:1377:18 < FxAccountsInternal.prototype._errorToErrorClass@FxAccounts.jsm:1365:14 < FxAccountsInternal.prototype.getOAuthToken<@FxAccounts.jsm:1309:13 < TaskImpl_run@Task.jsm:315:40 < Handler.prototype.process@Promise-backend.js:925:21 < this.PromiseWalker.walkerLoop@Promise-backend.js:801:7 < this.PromiseWalker.scheduleWalkerLoop/<@Promise-backend.js:740:39 < openModalWindow@nsPrompter.js:370:5 < ModalPrompter.prototype.openPrompt@nsPrompter.js:553:9 < ModalPrompter.prototype.nsIPrompt_promptPassword@nsPrompter.js:774:9 < ModalPrompter.prototype.promptPassword@nsPrompter.js:586:20 < ensureMPUnlocked@util.js:571:7 < this.BrowserIDManager.prototype.unlockAndVerifyAuthState@browserid_identity.js:464:10 < verifyLogin@service.js:707:5 < onNotify@service.js:1011:12 < WrappedNotify@util.js:148:21 < WrappedLock@util.js:103:16 < WrappedCatch@util.js:77:16 < login@service.js:1022:12 < sync/<@service.js:1272:14 < WrappedCatch@util.js:77:16 < sync@service.js:1268:5 < exports.Utils.yield@utils.js:361:12 < INIParser.prototype.process@filterStorage.js:870:7 < exports.IO.readFromFile/onProgress@io.js:97:15 < exports.IO.readFromFile/<@io.js:182:11 < TaskImpl_run@Task.jsm:314:40 < Handler.prototype.process@Promise-backend.js:922:23 < this.PromiseWalker.walkerLoop@Promise-backend.js:801:7 < this.PromiseWalker.scheduleWalkerLoop/<@Promise-backend.js:740:39 < this.Utils.openWindow@utils.jsm:154:10 < this.Utils.prompt@utils.jsm:287:1 < sm_restorePrompt@SessionManagerHelper.js:744:37 < sm_check_for_crash@SessionManagerHelper.js:679:32 < SessionManagerHelper.observe@SessionManagerHelper.js:243:5 < SessionStartup.prototype._onSessionFileRead@nsSessionStartup.js:132:5 < Handler.prototype.process@Promise-backend.js:922:23 < this.PromiseWalker.walkerLoop@Promise-backend.js:801:7 < this.PromiseWalker.scheduleWalkerLoop/<@Promise-backend.js:740:39
1445238634664	FirefoxAccounts	ERROR	FxA rejecting with error UNKNOWN_ERROR, details: Error: AUTH_ERROR (resource://gre/modules/FxAccounts.jsm:1377:18) JS Stack trace: FxAccountsInternal.prototype._error@FxAccounts.jsm:1377:18 < FxAccountsInternal.prototype._errorToErrorClass@FxAccounts.jsm:1365:14 < FxAccountsInternal.prototype.getOAuthToken<@FxAccounts.jsm:1309:13 < TaskImpl_run@Task.jsm:315:40 < Handler.prototype.process@Promise-backend.js:925:21 < this.PromiseWalker.walkerLoop@Promise-backend.js:801:7 < this.PromiseWalker.scheduleWalkerLoop/<@Promise-backend.js:740:39 < openModalWindow@nsPrompter.js:370:5 < ModalPrompter.prototype.openPrompt@nsPrompter.js:553:9 < ModalPrompter.prototype.nsIPrompt_promptPassword@nsPrompter.js:774:9 < ModalPrompter.prototype.promptPassword@nsPrompter.js:586:20 < ensureMPUnlocked@util.js:571:7 < this.BrowserIDManager.prototype.unlockAndVerifyAuthState@browserid_identity.js:464:10 < verifyLogin@service.js:707:5 < onNotify@service.js:1011:12 < WrappedNotify@util.js:148:21 < WrappedLock@util.js:103:16 < WrappedCatch@util.js:77:16 < login@service.js:1022:12 < sync/<@service.js:1272:14 < WrappedCatch@util.js:77:16 < sync@service.js:1268:5 < exports.Utils.yield@utils.js:361:12 < INIParser.prototype.process@filterStorage.js:870:7 < exports.IO.readFromFile/onProgress@io.js:97:15 < exports.IO.readFromFile/<@io.js:182:11 < TaskImpl_run@Task.jsm:314:40 < Handler.prototype.process@Promise-backend.js:922:23 < this.PromiseWalker.walkerLoop@Promise-backend.js:801:7 < this.PromiseWalker.scheduleWalkerLoop/<@Promise-backend.js:740:39 < this.Utils.openWindow@utils.jsm:154:10 < this.Utils.prompt@utils.jsm:287:1 < sm_restorePrompt@SessionManagerHelper.js:744:37 < sm_check_for_crash@SessionManagerHelper.js:679:32 < SessionManagerHelper.observe@SessionManagerHelper.js:243:5 < SessionStartup.prototype._onSessionFileRead@nsSessionStartup.js:132:5 < Handler.prototype.process@Promise-backend.js:922:23 < this.PromiseWalker.walkerLoop@Promise-backend.js:801:7 < this.PromiseWalker.scheduleWalkerLoop/<@Promise-backend.js:740:39
1445238677089	Sync.BrowserIDManager	DEBUG	unlockAndVerifyAuthState re-fetched credentials and is returning: success.status_ok
1445238677090	Sync.Service	DEBUG	Fetching unlocked auth state returned success.status_ok
1445238677091	FirefoxAccounts	DEBUG	already verified
1445238677092	Sync.BrowserIDManager	INFO	Getting an assertion from: https://token.services.mozilla.com/1.0/sync/1.5
1445238677092	FirefoxAccounts	DEBUG	enter getAssertion()
1445238677139	FirefoxAccounts	DEBUG	getKeyPair: already have a keyPair
1445238677139	FirefoxAccounts	DEBUG	getCertificateSigned: true true
1445238677746	Hawk	DEBUG	(Response) /certificate/sign: code: 401 - Status text: Unauthorized
1445238677746	Sync.RESTResponse	TRACE	Processing response headers.
1445238677747	Hawk	DEBUG	Clock offset vs https://api.accounts.firefox.com/v1: -1746
1445238677747	Hawk	DEBUG	Received 401 for /certificate/sign: retrying
1445238678157	Hawk	DEBUG	(Response) /certificate/sign: code: 401 - Status text: Unauthorized
1445238678157	Sync.RESTResponse	TRACE	Processing response headers.
1445238678157	Hawk	DEBUG	Clock offset vs https://api.accounts.firefox.com/v1: -1157
1445238678158	FirefoxAccounts	ERROR	error POSTing /certificate/sign: {"code":401,"errno":109,"error":"Unauthorized","message":"Missing authentication","info":"https://github.com/mozilla/fxa-auth-server/blob/master/docs/api.md#response-format"}
1445238678158	FirefoxAccounts	ERROR	HAWK.signCertificate error: {"code":401,"errno":109,"error":"Unauthorized","message":"Missing authentication","info":"https://github.com/mozilla/fxa-auth-server/blob/master/docs/api.md#response-format"}
1445238678159	Sync.BrowserIDManager	ERROR	Authentication error in _fetchTokenForUser: {"details":{"code":401,"errno":109,"error":"Unauthorized","message":"Missing authentication","info":"https://github.com/mozilla/fxa-auth-server/blob/master/docs/api.md#response-format"}}
1445238678159	Sync.Status	DEBUG	Status.login: success.login => error.login.reason.account
1445238678159	Sync.Status	DEBUG	Status.service: success.status_ok => error.login.failed
1445238678232	Sync.BrowserIDManager	ERROR	Failed to fetch a token for authentication: {"details":{"code":401,"errno":109,"error":"Unauthorized","message":"Missing authentication","info":"https://github.com/mozilla/fxa-auth-server/blob/master/docs/api.md#response-format"}}
1445238678327	FirefoxAccounts	DEBUG	FxAccountsProfileClient: Requested profile
1445238678327	FirefoxAccounts	DEBUG	getOAuthToken enter
1445238678335	FirefoxAccounts	DEBUG	FxAccountsOAuthGrantClient Initialized
1445238678335	FirefoxAccounts	DEBUG	getOAuthToken fetching new token from: https://oauth.accounts.firefox.com/v1
1445238678335	FirefoxAccounts	DEBUG	enter getAssertion()
1445238678365	FirefoxAccounts	DEBUG	getKeyPair: already have a keyPair
1445238678365	FirefoxAccounts	DEBUG	getCertificateSigned: true true
1445238678527	Sync.Resource	DEBUG	Channel redirect: https://sync-264-us-west-2.sync.services.mozilla.com/1.5/31260094/info/collections, https://gateway.zscaler.net/auD?origurl=https%3A%2F%2Fsync%2d264%2dus%2dwest%2d2%2esync%2eservices%2emozilla%2ecom%2f1%2e5%2f31260094%2finfo%2fcollections, 1
1445238678527	Sync.Resource	DEBUG	Ensuring load flags are set.
1445238678603	Sync.Resource	DEBUG	Channel redirect: https://gateway.zscaler.net/auD?origurl=https%3A%2F%2Fsync%2d264%2dus%2dwest%2d2%2esync%2eservices%2emozilla%2ecom%2f1%2e5%2f31260094%2finfo%2fcollections, https://gateway.zscaler.net/auT?origurl=https%3A%2F%2Fsync%2d264%2dus%2dwest%2d2%2esync%2eservices%2emozilla%2ecom%2f1%2e5%2f31260094%2finfo%2fcollections, 1
1445238678603	Sync.Resource	DEBUG	Ensuring load flags are set.
1445238678684	Sync.Resource	DEBUG	Channel redirect: https://gateway.zscaler.net/auT?origurl=https%3A%2F%2Fsync%2d264%2dus%2dwest%2d2%2esync%2eservices%2emozilla%2ecom%2f1%2e5%2f31260094%2finfo%2fcollections, https://sync-264-us-west-2.sync.services.mozilla.com/1.5/31260094/info/collections?_sm_byp=iVV6f6FjnQ7v61VM, 1
1445238678684	Sync.Resource	DEBUG	Ensuring load flags are set.
1445238679045	Hawk	DEBUG	(Response) /certificate/sign: code: 401 - Status text: Unauthorized
1445238679045	Sync.RESTResponse	TRACE	Processing response headers.
1445238679046	Hawk	DEBUG	Clock offset vs https://api.accounts.firefox.com/v1: -2046
1445238679046	Hawk	DEBUG	Received 401 for /certificate/sign: retrying
1445238679377	Sync.Resource	DEBUG	mesg: GET fail 401 https://sync-264-us-west-2.sync.services.mozilla.com/1.5/31260094/info/collections?_sm_byp=iVV6f6FjnQ7v61VM
1445238679378	Sync.Resource	DEBUG	GET fail 401 https://sync-264-us-west-2.sync.services.mozilla.com/1.5/31260094/info/collections?_sm_byp=iVV6f6FjnQ7v61VM
1445238679379	Sync.Service	WARN	401: login failed.
1445238679379	Sync.BrowserIDManager	DEBUG	_findCluster found existing clusterURL, so discarding the current token
1445238679380	FirefoxAccounts	DEBUG	already verified
1445238679380	Sync.BrowserIDManager	INFO	Getting an assertion from: https://token.services.mozilla.com/1.0/sync/1.5
1445238679380	FirefoxAccounts	DEBUG	enter getAssertion()
1445238679389	FirefoxAccounts	DEBUG	getKeyPair: already have a keyPair
1445238679390	FirefoxAccounts	DEBUG	getCertificateSigned: true true
1445238680481	Hawk	DEBUG	(Response) /certificate/sign: code: 401 - Status text: Unauthorized
1445238680481	Sync.RESTResponse	TRACE	Processing response headers.
1445238680481	Hawk	DEBUG	Clock offset vs https://api.accounts.firefox.com/v1: -1481
1445238680482	FirefoxAccounts	ERROR	error POSTing /certificate/sign: {"code":401,"errno":109,"error":"Unauthorized","message":"Missing authentication","info":"https://github.com/mozilla/fxa-auth-server/blob/master/docs/api.md#response-format"}
1445238680482	FirefoxAccounts	ERROR	HAWK.signCertificate error: {"code":401,"errno":109,"error":"Unauthorized","message":"Missing authentication","info":"https://github.com/mozilla/fxa-auth-server/blob/master/docs/api.md#response-format"}
1445238680483	FirefoxAccounts	ERROR	FxA rejecting with error AUTH_ERROR, details: {"code":401,"errno":109,"error":"Unauthorized","message":"Missing authentication","info":"https://github.com/mozilla/fxa-auth-server/blob/master/docs/api.md#response-format"}
1445238680484	FirefoxAccounts	ERROR	Could not retrieve profile data: Error: AUTH_ERROR (resource://gre/modules/FxAccounts.jsm:1377:18) JS Stack trace: FxAccountsInternal.prototype._error@FxAccounts.jsm:1377:18 < FxAccountsInternal.prototype._errorToErrorClass@FxAccounts.jsm:1365:14 < FxAccountsInternal.prototype.getOAuthToken<@FxAccounts.jsm:1309:13 < TaskImpl_run@Task.jsm:315:40 < Handler.prototype.process@Promise-backend.js:925:21 < this.PromiseWalker.walkerLoop@Promise-backend.js:801:7 < this.PromiseWalker.scheduleWalkerLoop/<@Promise-backend.js:740:39 < waitForSyncCallback@async.js:102:7 < makeSpinningCallback/callback.wait@async.js:145:27 < BrowserIDClusterManager.prototype._findCluster@browserid_identity.js:775:12 < setCluster@cluster.js:77:19 < verifyLogin@service.js:761:35 < onNotify@service.js:1011:12 < WrappedNotify@util.js:148:21 < WrappedLock@util.js:103:16 < WrappedCatch@util.js:77:16 < login@service.js:1022:12 < sync/<@service.js:1272:14 < WrappedCatch@util.js:77:16 < sync@service.js:1268:5 < exports.Utils.yield@utils.js:361:12 < INIParser.prototype.process@filterStorage.js:870:7 < exports.IO.readFromFile/onProgress@io.js:97:15 < exports.IO.readFromFile/<@io.js:182:11 < TaskImpl_run@Task.jsm:314:40 < Handler.prototype.process@Promise-backend.js:922:23 < this.PromiseWalker.walkerLoop@Promise-backend.js:801:7 < this.PromiseWalker.scheduleWalkerLoop/<@Promise-backend.js:740:39 < this.Utils.openWindow@utils.jsm:154:10 < this.Utils.prompt@utils.jsm:287:1 < sm_restorePrompt@SessionManagerHelper.js:744:37 < sm_check_for_crash@SessionManagerHelper.js:679:32 < SessionManagerHelper.observe@SessionManagerHelper.js:243:5 < SessionStartup.prototype._onSessionFileRead@nsSessionStartup.js:132:5 < Handler.prototype.process@Promise-backend.js:922:23 < this.PromiseWalker.walkerLoop@Promise-backend.js:801:7 < this.PromiseWalker.scheduleWalkerLoop/<@Promise-backend.js:740:39
1445238680487	FirefoxAccounts	ERROR	FxA rejecting with error UNKNOWN_ERROR, details: Error: AUTH_ERROR (resource://gre/modules/FxAccounts.jsm:1377:18) JS Stack trace: FxAccountsInternal.prototype._error@FxAccounts.jsm:1377:18 < FxAccountsInternal.prototype._errorToErrorClass@FxAccounts.jsm:1365:14 < FxAccountsInternal.prototype.getOAuthToken<@FxAccounts.jsm:1309:13 < TaskImpl_run@Task.jsm:315:40 < Handler.prototype.process@Promise-backend.js:925:21 < this.PromiseWalker.walkerLoop@Promise-backend.js:801:7 < this.PromiseWalker.scheduleWalkerLoop/<@Promise-backend.js:740:39 < waitForSyncCallback@async.js:102:7 < makeSpinningCallback/callback.wait@async.js:145:27 < BrowserIDClusterManager.prototype._findCluster@browserid_identity.js:775:12 < setCluster@cluster.js:77:19 < verifyLogin@service.js:761:35 < onNotify@service.js:1011:12 < WrappedNotify@util.js:148:21 < WrappedLock@util.js:103:16 < WrappedCatch@util.js:77:16 < login@service.js:1022:12 < sync/<@service.js:1272:14 < WrappedCatch@util.js:77:16 < sync@service.js:1268:5 < exports.Utils.yield@utils.js:361:12 < INIParser.prototype.process@filterStorage.js:870:7 < exports.IO.readFromFile/onProgress@io.js:97:15 < exports.IO.readFromFile/<@io.js:182:11 < TaskImpl_run@Task.jsm:314:40 < Handler.prototype.process@Promise-backend.js:922:23 < this.PromiseWalker.walkerLoop@Promise-backend.js:801:7 < this.PromiseWalker.scheduleWalkerLoop/<@Promise-backend.js:740:39 < this.Utils.openWindow@utils.jsm:154:10 < this.Utils.prompt@utils.jsm:287:1 < sm_restorePrompt@SessionManagerHelper.js:744:37 < sm_check_for_crash@SessionManagerHelper.js:679:32 < SessionManagerHelper.observe@SessionManagerHelper.js:243:5 < SessionStartup.prototype._onSessionFileRead@nsSessionStartup.js:132:5 < Handler.prototype.process@Promise-backend.js:922:23 < this.PromiseWalker.walkerLoop@Promise-backend.js:801:7 < this.PromiseWalker.scheduleWalkerLoop/<@Promise-backend.js:740:39
1445238680585	Hawk	DEBUG	(Response) /certificate/sign: code: 401 - Status text: Unauthorized
1445238680585	Sync.RESTResponse	TRACE	Processing response headers.
1445238680585	Hawk	DEBUG	Clock offset vs https://api.accounts.firefox.com/v1: -1585
1445238680585	Hawk	DEBUG	Received 401 for /certificate/sign: retrying
1445238680974	Hawk	DEBUG	(Response) /certificate/sign: code: 401 - Status text: Unauthorized
1445238680974	Sync.RESTResponse	TRACE	Processing response headers.
1445238680975	Hawk	DEBUG	Clock offset vs https://api.accounts.firefox.com/v1: -1974
1445238680975	FirefoxAccounts	ERROR	error POSTing /certificate/sign: {"code":401,"errno":109,"error":"Unauthorized","message":"Missing authentication","info":"https://github.com/mozilla/fxa-auth-server/blob/master/docs/api.md#response-format"}
1445238680975	FirefoxAccounts	ERROR	HAWK.signCertificate error: {"code":401,"errno":109,"error":"Unauthorized","message":"Missing authentication","info":"https://github.com/mozilla/fxa-auth-server/blob/master/docs/api.md#response-format"}
1445238680975	Sync.BrowserIDManager	ERROR	Authentication error in _fetchTokenForUser: {"details":{"code":401,"errno":109,"error":"Unauthorized","message":"Missing authentication","info":"https://github.com/mozilla/fxa-auth-server/blob/master/docs/api.md#response-format"}}
1445238680975	Sync.Status	DEBUG	Status.login: error.login.reason.account => error.login.reason.account
1445238680975	Sync.Status	DEBUG	Status.service: error.login.failed => error.login.failed
1445238680984	Sync.BrowserIDManager	INFO	Failed to fetch the cluster URL: {"details":{"code":401,"errno":109,"error":"Unauthorized","message":"Missing authentication","info":"https://github.com/mozilla/fxa-auth-server/blob/master/docs/api.md#response-format"}}
1445238680984	Sync.Service	DEBUG	Cluster value = null
1445238680984	Sync.Status	DEBUG	Status.login: error.login.reason.account => error.login.reason.account
1445238680984	Sync.Status	DEBUG	Status.service: error.login.failed => error.login.failed
1445238680985	Sync.SyncScheduler	DEBUG	Clearing sync triggers and the global score.
1445238680985	Sync.ErrorHandler	ERROR	Sync encountered a login error
Bug 1239790 implies this isn't just linux, so adjusting the title accordingly
OS: Linux → All
Hardware: x86 → All
Summary: Sync fails on linux on unitymedia DS-Lite networks → Sync fails on unitymedia DS-Lite networks
See Also: → 1263079
users affected by this issue are still a regular occurrence on various support channels.
Just dropping by to say that the workaround pointed out in comment 27 fixed the bug for me. Sync works now again!
(In reply to est31 from comment #35)
> Just dropping by to say that the workaround pointed out in comment 27 fixed
> the bug for me. Sync works now again!

Is there a windows equivalent to the line in the mentioned comment?
Since I don't see it stated anywhere explicitly in the comments - :bobm, could you give us a definitive yes/no/maybe on whether there's a plausible way for us to work around the issue on the server side?
Flags: needinfo?(bobm)
(In reply to Ryan Kelly [:rfkelly] from comment #37)
> Since I don't see it stated anywhere explicitly in the comments - :bobm,
> could you give us a definitive yes/no/maybe on whether there's a plausible
> way for us to work around the issue on the server side?

Short of lowering our MTU and therefore the throughput on the rest of our users, I don't think so.
Flags: needinfo?(bobm)
(In reply to Bob Micheletto [:bobm] from comment #39)
> Short of lowering our MTU and therefore the throughput on the rest of our
> users, I don't think so.

How much would it need to be lowered? While I'm making this up as I go along, it looks like the server is using an MTU of ~1500 bytes, and https://blog.cloudflare.com/path-mtu-discovery-in-practice/ implies a value of 1200 might resolve the problem (and comment 23 implies 1312 (+-28?) might even work). If that's true, is the reduction in throughput from that change so great that it justifies us effectively blocking all users who are using this kind of tunneling? A google search implies UnityMedia isn't the only ISP with this offering.

https://rachelbythebay.com/w/2015/05/15/pmtud/ also ends with the ominous "Filter ICMP at your peril, folks" - am I correct that we are doing such filtering, and if so, would not doing so help fix this?

I understand this is a tradeoff, but if it is at all possible, I don't think we should rely on affected users finding this bug and working out how to adjust their local MTU just to use Sync, even if there is a slight reduction in performance for some other users - much like the position cloudfare took (if I'm reading that article correctly, which I admit is a dubious proposition)
Flags: needinfo?(bobm)
(In reply to Mark Hammond [:markh] from comment #40)
> (In reply to Bob Micheletto [:bobm] from comment #39)

According to this document: http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/network_mtu.html#path_mtu_discovery it appears that AWS is doing the ICMP filtering, and we can add a rule to enable it to pass through.  That does not guarantee a solution to all cases, but it should be big improvement.  I much prefer that to a blanket MTU reduction.  I'll get that going in stage and we can probably move on this in production soon.

As far as measuring success, we should see a significant drop in our 499 rate if it's doing anything positive.
Flags: needinfo?(bobm)
(In reply to Bob Micheletto [:bobm] from comment #41)
> (In reply to Mark Hammond [:markh] from comment #40)
> > (In reply to Bob Micheletto [:bobm] from comment #39)

Set up in stage.  :kthiessen what kind of testing should we do?
Flags: needinfo?(kthiessen)
(In reply to est31 from comment #38)
> lukas, this might help:
> https://technet.microsoft.com/en-us/library/cc958871.aspx

Sadly, this did not do the trick.
From the comments I see above, I'm not sure how we'd trigger the failure case from a properly-configured network.

Mark, what do you have in mind for a test scenario here?
Flags: needinfo?(kthiessen)
(In reply to Karl Thiessen [:kthiessen] from comment #44)
> From the comments I see above, I'm not sure how we'd trigger the failure
> case from a properly-configured network.
> 
> Mark, what do you have in mind for a test scenario here?

I can reproduce the basic issue on Linux:
* |sudo ifconfig| tells me my default MTU is 1500 bytes.
* |sudo ifconfig eth0 mtu 3000| to change it to 3000
* Start a Firefox already configured with Sync, set |services.sync.log.appender.dump="Trace"|, restart and do a Sync.
* Console shows:
> 1466994459787	FirefoxAccounts	DEBUG	got keyPair
> 1466994459788	FirefoxAccounts	DEBUG	getCertificateSigned: true true
* Then after a minute or so:
> 1466994520980	Hawk	WARN	hawk request error: [Exception... "NS_ERROR_NET_RESET"  nsresult: "0x804b0014 (NS_ERROR_NET_RESET)"  location: ...

* Reset MTU to 1500 and restart the browser - it starts working again.

I think/hope that we expect it to work correctly on stage with a large MTU.

Note also that for some reason I could *not* reproduce this on Windows.
Flags: needinfo?(kthiessen)
I will check this out on Linux sometime this week.  Thank you, :markh, for the list of steps.
Flags: needinfo?(kthiessen)
Assignee: nobody → kthiessen
Yes, this appears to work fine in stage.  This config change can be pushed to prod whenever it is convenient.
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Thanks Karl, but let's keep this open until it actually does hit prod, so people experiencing the problem in the meantime can find this bug. I'm assigning to bobm in the meantime - Bob, please re-close this once it hits prod.
Assignee: kthiessen → bobm
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
(In reply to Mark Hammond [:markh] from comment #48)
> Thanks Karl, but let's keep this open until it actually does hit prod, so
> people experiencing the problem in the meantime can find this bug. I'm
> assigning to bobm in the meantime - Bob, please re-close this once it hits
> prod.

Pushed to production and looking good.
Status: REOPENED → RESOLVED
Closed: 8 years ago8 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: