Closed Bug 1316564 Opened 8 years ago Closed 7 years ago

[tier-3] Intermittent test_direct_update.py TestDirectUpdate.test_update, test_fallback_update.py TestFallbackUpdate.test_update | AssertionError: Additional update found due to watershed release 51.0a2

Categories

(Testing :: Firefox UI Tests, defect)

Version: 3
Type: defect
Priority: Not set
Severity: normal

Tracking

(firefox-esr52 fixed, firefox53+ wontfix, firefox54 fixed, firefox55 fixed)

RESOLVED FIXED
mozilla55

People

(Reporter: intermittent-bug-filer, Assigned: whimboo)

References

Details

(Keywords: intermittent-failure)

Attachments

(1 obsolete file)

I think that this is a side effect of bug 1314550, now for nightly and aurora builds. The underlying reason is probably a second nightly build on that channel within a day, combined with the delayed offering of partial updates due to slow processing of funsize tasks (bug 1284516).

There are two things to think about:

1. We should make the failure message more general, which means that we do not always talk about a watershed but just say that another update got offered. We can then add the version and build ID of the currently updated version so we can at least more easily verify the situation.

2. Bug 1285340, which covers a different expected target build ID, could be related here or even be the underlying issue. Given that we moved the check for further updates up by some lines, we do it first now. See https://hg.mozilla.org/mozilla-central/rev/fa64968dc3a1. But I'm not fully sure about that theory, because even when our tests run late, Balrog should immediately offer us an update to the most recent version. Maybe those two affected update tests were running at a time when the new updates were close to being ready, so we see an overlap.

I wonder if we could simply ignore further updates in case a target version and build ID have been given and those match the ones of the updated build. For beta and release builds it would mean that for testing watersheds we no longer get the failure which shows us that other updates are available. But I doubt that this would be a problem, and we can trigger a second ondemand run for watersheds only. With that we would even have green results - failures are always strange.
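A minimal sketch of what such a guard could look like, combined with the more general failure message from point 1 (all names here are hypothetical, not the real harness API):

```python
# Hypothetical sketch: only treat an additionally offered update as a failure
# when the build we just updated to is not the explicitly requested target.
def check_for_additional_update(updated_build, offered_update,
                                target_version=None, target_buildid=None):
    reached_target = (target_version and target_buildid and
                      updated_build['version'] == target_version and
                      updated_build['buildid'] == target_buildid)
    if offered_update and not reached_target:
        raise AssertionError(
            'Additional update offered ({} / {}) after updating to {} / {}'.format(
                offered_update.get('version'), offered_update.get('buildid'),
                updated_build['version'], updated_build['buildid']))
```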

Rail and Florin, what do you think?
Flags: needinfo?(rail)
Flags: needinfo?(florin.mezei)
(In reply to Henrik Skupin (:whimboo) from comment #1)
> 
> I wonder if we could simply ignore further updates in case of a target
> version and buildid have been given and those match the ones for the updated
> builds. For beta and release builds it would mean that for testing
> watersheds we do not get the failure anymore, which shows us that other
> updates are available. But I doubt that this should be a problem, and we can
> trigger a second ondemand run for watersheds only. With that we would have
> green results even - failures are always strange.
> 
> Rail and Florin, what do you think?

I agree with your statement here - I think it's fine to ignore further updates.
Flags: needinfo?(florin.mezei)
Sounds sane to me. I think with this approach you can actually test the watersheds as well. We would just need to schedule multiple tests. For example, if we have a watershed at 47.0.2 for Windows, then we can schedule two tests: 1) 46.0 -> 47.0.2, and 2) 47.0.2 -> 49.0.2 (latest).
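For illustration only, that scheduling could be expressed as a list of source/target pairs, each run as its own ondemand job (the actual scheduling lives in mozmill-ci, not in this sketch):

```python
# Hypothetical update paths for a watershed at 47.0.2 on Windows:
# each pair is scheduled as a separate ondemand update test run.
UPDATE_PATHS = [
    ('46.0', '47.0.2'),    # up to the watershed
    ('47.0.2', '49.0.2'),  # from the watershed to the latest release
]

for source, target in UPDATE_PATHS:
    print('schedule update test: {} -> {}'.format(source, target))
```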
Flags: needinfo?(rail)
Summary: Intermittent test_fallback_update.py TestFallbackUpdate.test_update | AssertionError: Additional update found due to watershed release 51.0a2 → Intermittent test_direct_update.py TestDirectUpdate.test_update, test_fallback_update.py TestFallbackUpdate.test_update | AssertionError: Additional update found due to watershed release 51.0a2
We have seen a large increase in those failures lately, and they even made it into beta now for Firefox 53.0b7:

https://treeherder.mozilla.org/#/jobs?repo=mozilla-beta&revision=46d5fe92c82d&group_state=expanded&filter-tier=3&filter-searchStr=Fxup-beta-cdntest(

Specifically the fallback update tests are affected here. As the log shows, the pre and post builds are identical, so an upgrade did not really happen:

07:18:32     INFO - [{'build_post': {'buildid': u'20170321004003',
07:18:32     INFO -                  'channel': u'aurora',
07:18:32     INFO -                  'disabled_addons': None,
07:18:32     INFO -                  'locale': u'de',
07:18:32     INFO -                  'mar_channels': set(['firefox-mozilla-aurora']),
07:18:32     INFO -                  'update_snippet': '<?xml version="1.0"?>\n<updates>\n    <update type="minor" displayVersion="54.0a2" appVersion="54.0a2" platformVersion="54.0a2" buildID="20170324004022">\n        <patch type="complete" URL="https://mozilla-nightly-updates.s3.amazonaws.com/mozilla-aurora/20170324004022/Firefox-mozilla-aurora-54.0a2-win32-de.complete.mar?versionId=7PqVcXh7_Mnu6EFXCoTtjcXlaPqpW7tj" hashFunction="sha512" hashValue="25b5f09c3ef977dacb6a67a3968b83031aa7a08df36ff1e6dccfb2d9f0a1fd026cdaf3556855ffe10cd9d45ac21e5386bfbb3fd37e5b5f074e42a2e089e70292" size="53720874"/>\n        <patch type="partial" URL="https://mozilla-nightly-updates.s3.amazonaws.com/mozilla-aurora/20170324004022/Firefox-mozilla-aurora-54.0a2-win32-de-20170321004003-20170324004022.partial.mar?versionId=MEsbg4wmbbelCIvFBUnZB6NMbz6reZda" hashFunction="sha512" hashValue="7d126dcaedce229d7a03230f80e048b9dc091a1a29fd6425618d01959e896c0a49387f3b1cea8c287f7d5e7815d4b7776de881e608c6af4b46dd781085bfbd83" size="11158787"/>\n    </update>\n</updates>',
07:18:32     INFO -                  'update_url': u'https://aus5.mozilla.org/update/6/Firefox/54.0a2/20170321004003/WINNT_x86-msvc-x64/de/aurora/Windows_NT%206.3.0.0%20(x64)(noBug1296630v1)(nowebsense)/SSE3/default/default/update.xml?force=1',
07:18:32     INFO -                  'user_agent': u'Mozilla/5.0 (Windows NT 6.3; WOW64; rv:54.0) Gecko/20100101 Firefox/54.0',
07:18:32     INFO -                  'version': u'54.0a2'},
07:18:32     INFO -   'build_pre': {'buildid': u'20170321004003',
07:18:32     INFO -                 'channel': u'aurora',
07:18:32     INFO -                 'disabled_addons': None,
07:18:32     INFO -                 'locale': u'de',
07:18:32     INFO -                 'mar_channels': set(['firefox-mozilla-aurora']),
07:18:32     INFO -                 'update_snippet': '<?xml version="1.0"?>\n<updates>\n    <update type="minor" displayVersion="54.0a2" appVersion="54.0a2" platformVersion="54.0a2" buildID="20170324004022">\n        <patch type="complete" URL="https://mozilla-nightly-updates.s3.amazonaws.com/mozilla-aurora/20170324004022/Firefox-mozilla-aurora-54.0a2-win32-de.complete.mar?versionId=7PqVcXh7_Mnu6EFXCoTtjcXlaPqpW7tj" hashFunction="sha512" hashValue="25b5f09c3ef977dacb6a67a3968b83031aa7a08df36ff1e6dccfb2d9f0a1fd026cdaf3556855ffe10cd9d45ac21e5386bfbb3fd37e5b5f074e42a2e089e70292" size="53720874"/>\n        <patch type="partial" URL="https://mozilla-nightly-updates.s3.amazonaws.com/mozilla-aurora/20170324004022/Firefox-mozilla-aurora-54.0a2-win32-de-20170321004003-20170324004022.partial.mar?versionId=MEsbg4wmbbelCIvFBUnZB6NMbz6reZda" hashFunction="sha512" hashValue="7d126dcaedce229d7a03230f80e048b9dc091a1a29fd6425618d01959e896c0a49387f3b1cea8c287f7d5e7815d4b7776de881e608c6af4b46dd781085bfbd83" size="11158787"/>\n    </update>\n</updates>',
07:18:32     INFO -                 'update_url': u'https://aus5.mozilla.org/update/6/Firefox/54.0a2/20170321004003/WINNT_x86-msvc-x64/de/aurora/Windows_NT%206.3.0.0%20(x64)(noBug1296630v1)(nowebsense)/SSE3/default/default/update.xml?force=1',
07:18:32     INFO -                 'user_agent': u'Mozilla/5.0 (Windows NT 6.3; WOW64; rv:54.0) Gecko/20100101 Firefox/54.0',
07:18:32     INFO -                 'version': u'54.0a2'},
07:18:32     INFO -   'fallback': True,
07:18:32     INFO -   'patch': {'buildid': u'20170324004022',
07:18:32     INFO -             'channel': u'aurora',
07:18:32     INFO -             'download_duration': 2.213,
07:18:32     INFO -             'is_complete': True,
07:18:32     INFO -             'size': 53720874,
07:18:32     INFO -             'type': u'minor',
07:18:32     INFO -             'url_mirror': u'https://mozilla-nightly-updates.s3.amazonaws.com/mozilla-aurora/20170324004022/Firefox-mozilla-aurora-54.0a2-win32-de.complete.mar?versionId=7PqVcXh7_Mnu6EFXCoTtjcXlaPqpW7tj',
07:18:32     INFO -             'version': u'54.0a2'},
07:18:32     INFO -   'success': False,
07:18:32     INFO -   'update_log': ['Performing a staged update',
07:18:32     INFO -                  'PATCH DIRECTORY C:\\Users\\mozauto\\AppData\\Local\\Mozilla\\updates\\5FC44D6E9F1A2441\\updates\\0',
07:18:32     INFO -                  'INSTALLATION DIRECTORY c:\\jenkins\\workspace\\mozilla-aurora_update\\build\\application.copy',
07:18:32     INFO -                  'WORKING DIRECTORY c:\\jenkins\\workspace\\mozilla-aurora_update\\build\\application.copy\\updated',
07:18:32     INFO -                  'Update already in progress! Exiting']}]

So it looks like there is an active update already in progress.
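For reference, the buildID and patch details being compared come straight from the update_snippet captured in the log; a quick standard-library way to inspect such a snippet (the file name is just a placeholder):

```python
import xml.etree.ElementTree as ET

# Load an update snippet like the one logged above (placeholder file name).
snippet = open('update.xml').read()

update = ET.fromstring(snippet).find('update')
print('offered buildID:', update.get('buildID'))
for patch in update.findall('patch'):
    print(patch.get('type'), patch.get('size'), patch.get('URL'))
```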

All this affects only Windows, not Linux and OS X. Matt, do you know of any changes to the updater on Windows which might have caused those failures? It doesn't happen every time, but the failures now occur more often than passing runs.
Flags: needinfo?(mhowell)
Maybe this is not related to any changes to the updater but to bug 1333014. That would make it a different test failure from this one. I will file a new bug once it's clearer to me.
Flags: needinfo?(mhowell)
Just for the record, I did take a look in the logs and I don't see any substantial Windows-specific changes in the updater since August (bug 1246972). The only recent change was a small fix for a moderate security issue (bug 1321814), but if that were causing problems I really don't think they would look like this failure.
For the record, the issue started showing up today. We didn't encounter this problem at all yesterday for 52.0.2 and 52.0.2 ESR when testing on esr-localtest or release-localtest with these same versions. 53.0b7 and 52.0.2 ESR have had some serious problems today on beta-cdntest and esr-cdntest.

Note that the problem was also seen on release today, but it seemed to have a diminished impact, as the tests passed on the next re-runs.
This is really very strange behavior. For now I will hijack this bug because it might still be related to what we partially see for aurora and central too.

So we download the partial update and make it invalid. As a result, after a restart Firefox complains that it cannot be applied and that a complete update has to be downloaded. When I click `Next` as usual, there is no download screen but the update is already there. In this case I see no logging info from the updater at all about this download. Which code downloads it? Clicking restart afterward to apply the update restarts Firefox, but without updating it.

What I also noticed is that two separate update staging directories are getting created. One is in `%appdata%/local/Mozilla/update/%hash%`, which is empty, and there is another one in `%appdata%/local/Mozilla/Firefox/firefox/update` which contains the downloaded MAR files.

Something else I noticed is that when the restart happens and the update should be applied, last-update.log afterward says that it tried to apply a partial update (UPDATE TYPE partial), and at the end it says it succeeded. Whatever we do here is clearly wrong.
Flags: needinfo?(mhowell)
Btw, I also tried this manually now and can reproduce it. We definitely do not upgrade Firefox after a restart in such a situation.
Ok, so in the manual case I might know what's going on. It may also apply to the update tests.

After downloading the partial update we apply the update immediately. A restart of Firefox would simply swap the versions and immediately use the new one. When I now mark the update.status file as "failed:6" AFTER the update has been applied, Firefox still knows about the updated version after a restart, but doesn't use it. As such the files remain in that subfolder. Trying to apply the complete MAR now fails because there is still an update process active.
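For context, the "mark it as failed" step boils down to overwriting update.status once the partial has been staged; roughly like this (a sketch only, with the directory passed in as a placeholder and the status string as quoted in this bug, not copied from the harness source):

```python
import os

def invalidate_staged_update(updates_dir):
    # Overwrite update.status so Firefox believes applying the partial failed
    # and falls back to downloading the complete MAR after the next restart.
    with open(os.path.join(updates_dir, 'update.status'), 'w') as f:
        f.write('failed:6\n')
```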

I think that we have to change our steps in how to test fallbacks, or turn off applying updates before Firefox starts.
What I really wonder is why the heck we have all those failures now! The underlying code that handles the applying step has been in our tests for ages and never failed until recently.
There was a recent change (bug 1348609) which should have made us prefer the path "%appdata%/local/Mozilla/update/%hash%" and never use the other one. I'm not immediately certain whether a problem with that patch would be able to cause these symptoms, but it's the most recent relevant change. Plus that change wasn't uplifted, so it should only be affecting nightly. Still working on it, leaving needinfo in place.
It sounds like this is also affecting beta and ESR. I had a look through the pushlog and the only change to recently hit beta that's even vaguely related to update is bug 1350064, but that's add-on update, and it looks like it didn't even make it in for beta 7 if I'm reading the log correctly. Also it hasn't been uplifted to ESR at all. So I can't identify any app update code change that could be responsible here.
Flags: needinfo?(mhowell)
From talking with everyone testing here, it sounds like we think this is likely to be a test failure across several channels, and we should go ahead with the release as planned. mhowell and whimboo can keep investigating.
Comparing 2 tests for en-US b6 to b7:

green https://firefox-ui-tests.s3.amazonaws.com/6ba85c35-2ef0-4efd-ac41-f6c3f8ac937a/log_info.log

04:41:49     INFO -   'update_log': ['Performing a replace request',
04:41:49     INFO -                  'PATCH DIRECTORY C:\\Users\\mozauto\\AppData\\Local\\Mozilla\\updates\\5B85A4075EAE3983\\updates\\0',
04:41:49     INFO -                  'INSTALLATION DIRECTORY c:\\jenkins\\workspace\\ondemand_update\\build\\application.copy',
04:41:49     INFO -                  'WORKING DIRECTORY c:\\jenkins\\workspace\\ondemand_update\\build\\application.copy\\updated',
04:41:49     INFO -                  'Begin moving destDir (c:\\jenkins\\workspace\\ondemand_update\\build\\application.copy) to tmpDir (c:\\jenkins\\workspace\\ondemand_update\\build\\application.copy.bak)',
04:41:49     INFO -                  'rename_file: proceeding to rename the directory',
04:41:49     INFO -                  'Begin moving newDir (c:\\jenkins\\workspace\\ondemand_update\\build\\application.copy.bak/updated) to destDir (c:\\jenkins\\workspace\\ondemand_update\\build\\application.copy)',
04:41:49     INFO -                  'rename_file: proceeding to rename the directory',
04:41:49     INFO -                  'Now, remove the tmpDir',
04:41:49     INFO -                  'ensure_remove: failed to remove file: c:\\jenkins\\workspace\\ondemand_update\\build\\application.copy.bak/api-ms-win-crt-convert-l1-1-0.dll, rv: -1, err: 13',
04:41:49     INFO -                  'ensure_remove: failed to remove file: c:\\jenkins
......



vs failed: https://firefox-ui-tests.s3.amazonaws.com/b38b3bd5-6665-4dd1-b007-79b2de9196ef/log_info.log

00:59:39     INFO -   'update_log': ['Performing a staged update',
00:59:39     INFO -                  'PATCH DIRECTORY C:\\Users\\mozauto\\AppData\\Local\\Mozilla\\updates\\5B85A4075EAE3983\\updates\\0',
00:59:39     INFO -                  'INSTALLATION DIRECTORY c:\\jenkins\\workspace\\ondemand_update\\build\\application.copy',
00:59:39     INFO -                  'WORKING DIRECTORY c:\\jenkins\\workspace\\ondemand_update\\build\\application.copy\\updated',
00:59:39     INFO -                  'UPDATE TYPE partial',
00:59:39     INFO -                  'PREPARE PATCH xul.dll',
00:59:39     INFO -                  'PREPARE PATCH wow_helper.exe',
00:59:39     INFO -                  'PREPARE PATCH voucher.bin',

....


The latter log looks like it applies a partial MAR, but it probably shouldn't?
Those two logs are from different parts of the update procedure. The green run also starts off by applying a partial, because that's how the test works. But the failed run didn't get as far as the step that's logged in the green run's update log before failing out for still mysterious reasons.
Yeah, most likely this is the case here because we do not upgrade Firefox. So the last-update.log is not getting updated. I'm off for the rest of the week, so I cannot investigate anything until Monday.
I've been trying to repro this locally on beta, and could not. Tried both actually running the test and also performing the steps manually; no failure either way.

I do get a test failure on aurora, but I don't think it's for the same reason. What I'm seeing is that the partial MAR at
https://mozilla-nightly-updates.s3.amazonaws.com/mozilla-aurora/20170329004027/Firefox-mozilla-aurora-54.0a2-win32-en-US-20170328004003-20170329004027.partial.mar?versionId=8DuWJW3ppUzMQPjxtEuptdRO_RMmCDpP
doesn't apply cleanly. I get this in my update.log:

Performing a staged update
PATCH DIRECTORY C:\Users\Matt Howell\AppData\Local\Mozilla\updates\26E70F8ABC2D3A34\updates\0
INSTALLATION DIRECTORY C:\Users\Matt Howell\Downloads\firefox
WORKING DIRECTORY C:\Users\Matt Howell\Downloads\firefox\updated
UPDATE TYPE partial
PREPARE PATCH xul.dll
[...]
EXECUTE PATCH omni.ja
LoadSourceFile: destination file size 9799029 does not match expected size 9799042
LoadSourceFile failed
### execution failed

which means that the size of the old omni.ja wasn't the same as what the patch file said to expect. I verified myself that the size in the MAR is actually wrong, so the updater didn't parse it incorrectly; the MAR itself was somehow not generated correctly.
Note that the above does not break normal updates, because the update service immediately falls back to the complete MAR.
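The size mismatch can be spotted without running the updater at all by comparing the on-disk size of the source file against the size the partial patch expects; a sketch using the values from the log above (the install path is just the local test installation mentioned there):

```python
import os

# Size the partial MAR expects for omni.ja, taken from the update.log excerpt above.
EXPECTED_SOURCE_SIZE = 9799042

actual = os.path.getsize(r'C:\Users\Matt Howell\Downloads\firefox\omni.ja')
if actual != EXPECTED_SOURCE_SIZE:
    print('LoadSourceFile would fail: {} != {}'.format(actual, EXPECTED_SOURCE_SIZE))
```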
Some late-to-the-party suggestions - rerun the 52.0.2 tests that passed earlier to see if they fail now. This is complicated by the change to the websense rules.

For our convenience:
52.0.2 - https://treeherder.mozilla.org/#/jobs?repo=mozilla-release&revision=e81854d6ce91f3174774a50c9c5813c33b9aac58&filter-tier=1&filter-tier=2&filter-tier=3&filter-searchStr=update
53.0b7 - https://treeherder.mozilla.org/#/jobs?repo=mozilla-beta&revision=46d5fe92c82d36b922703faeb7dd3a17b5df55e1&exclusion_profile=false&filter-tier=1&filter-tier=2&filter-tier=3&filter-searchStr=update

(In reply to Henrik Skupin (:whimboo) from comment #13)
> All this affects only Windows but not Linux and OS X. Matt, do you know of
> any changes for the updater on Windows which might have caused those
> failures? It's not always happening but with a higher frequency compared to
> passing tests.

Are we using the Windows maintenance service? If the machines are reused for many tests, is it possible that the app and service are out of sync? This could lead to test-history-dependent results, like a change from pass to fail if a nightly test sneaks in between the 52.0.2 and 53.0b7/52esr tests.
The logs indicate that the maintenance service is being used; it's not really needed for these tests to work, but it's probably good to have it so that we more closely simulate a "real" update, which would generally go through the service.

I don't think this is the issue though; the service and the application don't really need to be in sync by specific version, just their code signing certificates need to match up, and those haven't changed since 43.
Comment 27 was a red herring; the same fallback mechanism that rescues normal updates also kicks in to save this test.
The failure I saw earlier appears to have happened because I forgot to update my tree to aurora before switching to running the test against an aurora build. That means I do not actually have a local repro at all. :(
It looks as if after the first update is applied and the browser is shut down, there isn't any logging from the new instance, either from Marionette or gecko. That's making it pretty much impossible to tell anything from the logs, because the first update staging appears to be going fine (as in, looks identical to a passing run).
[Tracking Requested - why for this release]: having our update tests routinely fail for every beta and esr build can't really go on for too long...
(In reply to Nick Thomas [:nthomas] from comment #29)
> Are we using the maintenance service of Windows ? If the machines are reused
> for many tests, is it possible that the app and service are out of sync  ?

I uninstalled the maintenance service while testing this last week on one of the machines. The problem still persisted afterward. So I agree that disabling it shouldn't be an option right now, also given that we would lose coverage.

I asked Florin to re-run updates for source builds from 53.0b1 to 53.0b4, as it looks like those never raised these issues.

What I can do later is attach some logs to the bug from a run with only the fallback update test running and finally failing. Maybe that helps a bit more than the logs on Treeherder, which also include the partial update tests.
Flags: needinfo?(florin.mezei)
(In reply to Henrik Skupin (:whimboo) from comment #38)
> I asked Florin to re-run updates for source builds from 53.0b1 to 53.0b4. As
> it looks like those never raised those issues.

Today I re-ran all update tests for 53.0b8 on the beta channel, using only source builds < 53.0b5. No failures were encountered whatsoever, on either Linux or Windows.
Flags: needinfo?(florin.mezei)
Interesting. Can you repeat now with b5 and b6 => b8? I would like to know if those are also working now, or if those are still affected. If the latter is the case we have to check what landed between b4 and b5. Last time we actually only checked the changelog between b6 and b7, but it looks like the failure already came in with b5 and was somehow not visible at that time.
Flags: needinfo?(florin.mezei)
(In reply to Henrik Skupin (:whimboo) from comment #40)
> Interesting. Can you repeat now with b5 and b6 => b8? I would like to know
> if those are also working now, or if those are still affected. If the latter
> is the case we have to check what landed between b4 and b5. Last time we
> actually only checked the changelog between b6 and b7, but as it looks like
> the failure came in already for b5 but was somewhat not visible at this time.

Did this just now, and got 5 failed jobs out of 44, all on Windows (not classified in Treeherder):
- Win 7, 32-bit: en-US-53.0b6, fr-53.0b5
- Win 7, 64-bit: es-MX-53.0b5
- Win 8, 32-bit: en-US-53.0b6
- Win 8, 64-bit: zh-CN-53.0b5

Treeherder run: https://treeherder.mozilla.org/#/jobs?repo=mozilla-beta&revision=0a36cc120d6e&group_state=expanded&filter-tier=3&filter-searchStr=Fxup-beta(
Flags: needinfo?(florin.mezei)
Ok, so would you mind giving us a link to the hg pushlog for all the changes between 53.0b4 and 53.0b5?
(In reply to Henrik Skupin (:whimboo) from comment #42)
> Ok, so mind giving us a link with the hg pushlog for all the changes between
> 53.0b4 and 53.0b5?

https://hg.mozilla.org/releases/mozilla-beta/pushloghtml?fromchange=484f49860df6&tochange=eb20f71a4f43
Nothing in this pushlog actually looks suspicious to me.

So I think we have to iterate step-wise. Therefore I want to re-ask the question from comment 19. Matt, can you please check this again and tell me if we should avoid using staged updates, at least in the case of fallback updates?
Flags: needinfo?(mhowell)
Re: comment 19
The updater doesn't consider it a problem when the directory it's going to try and stage into already exists; it just deletes the existing directory before it starts patching. See [0]. So I don't think that's the issue.

[0] https://dxr.mozilla.org/mozilla-central/rev/38894655c89e68bcd8f45d31a0d3005f2c2b53db/toolkit/mozapps/update/updater/updater.cpp#3366
Flags: needinfo?(mhowell)
Matt, not sure if you gave the correct link here. The code I'm referring to is the one at this stage:

1) partial update has been downloaded
2) partial update has been correctly applied 
3) update.status file got modified with `failed:6\n`
4) Firefox restarts

What happens in step 4? The partial has been successfully applied, and a new subdirectory containing the updated Firefox version is still around under the installation dir. So my questions:

1) Where and when is this subdir getting deleted?
2) Where is the complete patch getting downloaded? We don't see anything in the software update window. It's just there already. Maybe the download is really that fast?
Flags: needinfo?(mhowell)
(In reply to Henrik Skupin (:whimboo) from comment #46)
> 1) Where and when is this subdir getting deleted?

I was answering the theory put forward in comment 19, which ends with:
> As such the files are remaining in that subfolder. Now trying to apply the complete mar fails because there is still an update process active.

The updater binary is the only thing that ever works with the staging directory, and the first thing it does there is to either create that directory if it doesn't exist or empty it out if it does. That's the code at the link I posted. So the fact that there's still a "failed" copy in there is never a problem; it's never even really noticed.

> 2) Where is the complete patch getting downloaded? We don't see anything in
> the software update window. It's just there already. Maybe the download is
> really that fast?

It starts downloading when the initial window appears, even though you don't see a progress bar until after you click OK. But note that, because of the situation I talked about in comment 27, there's an exception to the usual flow right now just for aurora. Since it's a complete update that we're setting to failed in that case, and not a partial as it would usually be, nothing is downloaded initially (because it's the complete update that would be the fallback) until the about box is brought up again to start the manual check.
Flags: needinfo?(mhowell)
While running ondemand update tests on beta-cdntest for 53.0b9 [1], I encountered this issue again for 5 Windows jobs:
- on the first re-build => 1 job passed, the other 4 failed
- on the second re-build, where I changed the TEST_PACKAGES_URL to point to the tinderbox builds containing the fix from bug 1351335 ([2] and [3]) => all 4 tests passed

So it would seem like the fix from bug 1351335 has helped with these test failures.

[1] - https://treeherder.mozilla.org/#/jobs?repo=mozilla-beta&revision=3a2e5c6e4ed3&group_state=expanded&filter-tier=3&filter-searchStr=Fxup-beta-cdntest(
[2] - https://archive.mozilla.org/pub/firefox/tinderbox-builds/mozilla-beta-win32/1491229643/firefox-53.0.en-US.win32.test_packages.json
[3] - https://archive.mozilla.org/pub/firefox/tinderbox-builds/mozilla-beta-win64/1491229643/firefox-53.0.en-US.win64.test_packages.json
(In reply to Florin Mezei, QA (:FlorinMezei) from comment #48)
> - on the second re-build - I changed the TEST_PACKAGES_URL to point to the
> tinderbox builds containing the fix from bug 1351335 ([2] and [3]) => all 4
> tests passed
> 
> So it would seem like the fix from bug 1351335 has helped with these test
> failures.

That would be interesting because I don't really see a relationship here. So could you please retrigger those passing tests with the other test package a couple of times? Maybe 10? I would like to know the failure rate. Thanks.
(In reply to Henrik Skupin (:whimboo) from comment #49)
> That would be interesting because I don't really see a relationship here. So
> could you please retrigger those passing tests with the other test package a
> couple of times? Maybe 10? I would like to know the failure rate. Thanks.

I've retriggered all 4 jobs that passed with the modified TEST_PACKAGES_URL, but this time with the original value (that does not have the fix from bug 1351335), 3 times each (so a total of 12 jobs) - I got 11 failures for a failure rate of 92%.
(In reply to Florin Mezei, QA (:FlorinMezei) from comment #48)
> So it would seem like the fix from bug 1351335 has helped with these test
> failures.

After running the ondemand update tests on the beta channel for Firefox 53 beta 9, it seems that my theory above does not stand, as I still got a bunch of failures even with the modified TEST_PACKAGES_URL. Apparently that was just a very lucky coincidence on beta-cdntest, and Henrik was right to think it's not related to this issue.
(In reply to Henrik Skupin (:whimboo) from comment #1)
> I think that this is a side-effect from bug 1314550 now for nightly and
> aurora builds. Underlying reason should be a possible second nightly build
> on that channel within a day, and the delayed offering for partial updates
> due to slow processing of funsize tasks (bug 1284516).
> 
> There are two things to think about:
> 
> 1. We should make the failure message more general. Which means that we do
> not always talk about watershed but just that another update got offered. We
> can then add the version and buildid of the currently updated version to at
> least being able to more easily verify that situation.
> 
> 2. Bug 1285340 which covers a different expected target build id could be
> related here or even be the underlying issue. Given that we moved the check
> for further updates up by some lines we do it first now. See
> https://hg.mozilla.org/mozilla-central/rev/fa64968dc3a1. But I'm not fully
> sure about that theory because even when our tests run late, Balrog should
> offer us an update to the most recent version immediately. Maybe those two
> affected update tests were running at a time when the new updates were close
> to be ready, so we see an overlap.
> 
> I wonder if we could simply ignore further updates in case of a target
> version and buildid have been given and those match the ones for the updated
> builds. For beta and release builds it would mean that for testing
> watersheds we do not get the failure anymore, which shows us that other
> updates are available. But I doubt that this should be a problem, and we can
> trigger a second ondemand run for watersheds only. With that we would have
> green results even - failures are always strange.
> 
> Rail and Florin, what do you think?

Given that we hijacked this bug for the recent update test failures on beta, I'm going to clone the bug to get the above done separately.
Ok, so I did a lot of testing on staging today for the latest Firefox 53.0 beta 9 and closely observed every single update test job on the Windows machine. By doing that I was able to see a substantial difference between a passing and a failing update test:

Passing test:
=============

We download the partial update, which gets applied. Then we make it invalid by storing `failed:6\n` in update.status. After a restart Firefox displays the old software update dialog with the failure message. Clicking on `Next` closes the dialog and opens the about window, which is used to download the complete MAR file. A restart of Firefox results in a successful update.

Failing test:
=============

We download the partial update, which gets applied. Then we make it invalid by storing `failed:6\n` in update.status. After a restart Firefox also displays the old software update dialog with the failure message. Clicking on `Next` does NOT close the window but keeps it open with a restart button. Somehow we do not fail even though we cannot interact with the about window, and we restart Firefox. Strangely the update does not get applied this way.



So in the failing case the complete MAR file is downloaded so fast that it is already stored locally when we hit the `Next` button in the old software update dialog. See comment 17.

Also, when I check the results I can see that the failure ONLY happens when we originally serve a partial update, not a complete one. For beta 9, all source builds earlier than beta 6 are offered a complete update only, and that works fine.

When downloading an update via the old software update window, our tests modify the pref `app.update.altwindowtype` and set its value to the software update dialog's window type.

https://dxr.mozilla.org/mozilla-central/rev/b043233ec04f06768d59dcdfb9e928142280f3cc/testing/firefox-ui/harness/firefox_ui_harness/testcases.py#187

In line 217 we check if the update is already downloaded, which would be the case in the above situation, and we return to `download_and_apply_forced_update()` after resetting the preference. Finally the restart is triggered, still not via the restart button, due to issues with builds < Firefox 52 (see bug 1304656 comment 7).
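Simplified, the flow around that pref looks roughly like this (hypothetical helper and attribute names; the real implementation is the linked testcases.py around lines 187-217):

```python
def download_via_update_wizard(dialog, set_pref, clear_pref):
    # Route the download through the old software update dialog instead of
    # the about window by pointing the update service at its window type.
    set_pref('app.update.altwindowtype', dialog.window_type)
    try:
        if dialog.patch_already_downloaded:
            # The complete MAR finished downloading while the error page was
            # still showing, so there is no download progress to wait for.
            return
        dialog.wait_for_download_finished()
    finally:
        clear_pref('app.update.altwindowtype')
```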

Matt, do you see anything obviously wrong with those lines of code, or with using this preference?
Flags: needinfo?(mhowell)
Thanks for doing what must have been a really tedious analysis to get that info. Using it, I've reproduced _a_ bug locally, but I'm not certain if it's the same one that you saw. Here's the process I'm seeing:

1) Starting from 53.0b8, update via the about window and close Firefox instead of clicking the restart button. This results in a partial update being successfully staged.
2) Overwrite the update.status file with "failed: 6".
3) Start the browser. Observe the "Update Failed" dialog, but don't click OK yet. The complete update downloads and stages while this window is showing.
4) Click OK, but only _after_ the update is downloaded and staged.
5) The progress window that appears next is broken; it hangs on "Connecting to the update server..."
6) Open the About window and click the restart button that is already showing there.
7) Observe that the update has not been applied.

The reason the update hasn't been applied is that step 4 causes the update.status file to be overwritten with "null". This is clearly a bug.

Henrik, from what you observed, does it seem plausible that the download of the complete update in step 3 is running so fast that it's done before the OK button gets clicked? If so, that would seem to explain the failures.

Right now I don't see any reason to think that the altwindowtype pref is causing trouble; it wouldn't affect the about window, and I'm not setting it for my manual tests.
Flags: needinfo?(mhowell) → needinfo?(hskupin)
What I also noticed is that the fallback update tests for initial partial MARs pass if the complete update does not get downloaded that fast! When the software update dialog shows the progress bar, everything works fine. It only fails when the MAR is already in the cache.
(In reply to Matt Howell [:mhowell] from comment #55)
> Henrik, from what you observed, does it seem plausible that the download of
> the complete update in step 3 is running so fast that it's done before the
> OK button gets clicked? If so, that would seem to explain the failures.

Yes, that is exactly what I see on our Windows boxes when the test failure happens! The situation might not occur in the wild, but for our tests it's annoying. Knowing this problem, do you have an idea why it has regressed lately? I assume you have to do more investigation before you can give an answer.
Flags: needinfo?(hskupin)
I'm not certain why this would have regressed recently. I don't see any recent enough changes to the updater UI/frontend code, except things like eslint fixes that shouldn't make any difference. There could have been a change somewhere else that ended up altering the caching behavior we get so as to make the download much faster, or something like that.

If we're right about this being the cause, then I think I have a patch that should fix it, by skipping the progress page that's not expecting to find an update already staged and going straight to the restart prompt instead. I'll post that shortly.
Attachment #8855063 - Attachment is obsolete: true
Attachment #8855063 - Flags: review?(robert.strong.bugs)
Filing a new bug for the above patch, since it may turn out to be unrelated to this test failure.
The patch discussed above (bug 1353917) is currently on autoland, so it should be in tonight's nightly. Henrik, what needs to happen to be able to verify whether that patch makes a difference for this failure? I don't feel great about just pushing it to beta and ESR before we actually know it fixes anything.
Flags: needinfo?(hskupin)
If it's in the next nightly build on central, we will hopefully see a difference on Saturday (+2 days), because we need this nightly as the source build and the nightly after next as the target build.

On central we only run en-US builds for updates, which means only 4 updates per platform version. But on Aurora we have 5 locales, which means 20 updates per platform version. I would suggest we get it uplifted to aurora and re-check everything on Monday. If the watershed test failures have stopped on central and aurora, that should be a good indication. But as said earlier, I'm not sure if those are identical. Anyway, if all works fine on both channels over the weekend, I would feel comfortable getting it uplifted to beta. Do you also have specific unit tests which can check that?
Flags: needinfo?(hskupin)
We do have unit tests that cover this code. But of course those are just unit tests; they don't verify the entire fallback procedure.

On your suggestion, I'll request aurora uplift, and we can see what we've got on Monday.
I think something else we should do here is add a check that Firefox has really been upgraded after the restart. This would have given us a much better failure message and less confusion. I will try to get a check for this added today.
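Such a check could be as small as comparing the build information collected before and after the restart, using the same build_pre/build_post structure that already shows up in the logs (a sketch, not the final patch):

```python
def assert_upgrade_applied(build_pre, build_post, target_buildid=None):
    # The restart must leave us on a different (and, if given, the expected) build.
    assert build_post['buildid'] != build_pre['buildid'], (
        'Firefox has not been upgraded: still running build {} ({})'.format(
            build_pre['buildid'], build_pre['version']))
    if target_buildid:
        assert build_post['buildid'] == target_buildid, (
            'Updated to build {} but expected {}'.format(
                build_post['buildid'], target_buildid))
```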
Depends on: 1355009
(In reply to Henrik Skupin (:whimboo) from comment #53)
> Given that we high-jacked this bug for the recent update test failures on
> beta I'm going to clone the bug to get the above done separately.

Forgot to add that this is bug 1353717 now.
I had a look at Orange Factor for this bug and this is how it looks:

https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1316564&startday=2016-10-01&endday=2017-04-12&tree=all

So it started on November 5th and recently had a huge spike because those failures are visible on beta. The first failing test was en-US-2, which means the source build was from November 3rd.

Checking a couple of test failures for beta all leads to bug 1355818, which seems to show the exact same underlying issue: the staged update is not correctly applied during a restart of Firefox.

I think we should keep this bug open until the first beta update tests for Firefox 54.0. I strongly believe that we can close it by then and track everything via bug 1355818.
The changelog is:

https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=3e73fd638e687a4d7f46613586e5156b8e2af846&tochange=ade8d4a63e57560410de106450f37b50ed71cca5

So there is bug 1314550, which is exactly the change that added this assertion check. But at that time we didn't have issues with applying the update during a restart.

There is also bug 1304656, but it got backed out and never landed again so far. 

Otherwise I don't see anything obvious. The application updater changes are mostly eslint fixes and look fine.
Summary: Intermittent test_direct_update.py TestDirectUpdate.test_update, test_fallback_update.py TestFallbackUpdate.test_update | AssertionError: Additional update found due to watershed release 51.0a2 → [tier-3] Intermittent test_direct_update.py TestDirectUpdate.test_update, test_fallback_update.py TestFallbackUpdate.test_update | AssertionError: Additional update found due to watershed release 51.0a2
The real underlying failure (bug 1355818) was hidden by this wrongly placed assertion. So the patch on bug 1355009 fixed that.
Assignee: nobody → hskupin
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Do we need to do something for ESR52 here still?
Flags: needinfo?(hskupin)
(In reply to Ryan VanderMeulen [:RyanVM] from comment #75)
> Do we need to do something for ESR52 here still?

No, the patch which fixed that already landed on esr52 via bug 1355009.
Flags: needinfo?(hskupin)
Target Milestone: --- → mozilla55
What I don't understand is why this failure still remains on esr52, and especially why only on Windows XP. This specific failure message doesn't exist anymore on that branch. So where are those boxes getting this failure from?

When I check the following failure:

https://treeherder.mozilla.org/logviewer.html#?repo=mozilla-esr52&job_id=100093785

I can see in the mozmill-ci job that the test package refers to:

https://archive.mozilla.org/pub/firefox/tinderbox-builds/mozilla-esr52-win32/1495049059/firefox-52.1.2.en-US.win32.test_packages.json

Which then downloads:

https://archive.mozilla.org/pub/firefox/tinderbox-builds/mozilla-esr52-win32/1495049059/firefox-52.1.2.en-US.win32.common.tests.zip

> % grep -r "watershed" tests/firefox-ui/
> tests/firefox-ui//harness/firefox_ui_harness/testcases.py:            # Bug 604364 - We do not support watershed releases yet.
> tests/firefox-ui//harness/firefox_ui_harness/testcases.py:                             'Additional update found due to watershed release {}'.format(

What? Why does this package contain old data? 

I will file a new bug to get this covered.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Depends on: 1366173
We shouldn't see this problem with the 52.2.0 release anymore. But keep in mind that it will be present if another security fix release is shipped from the 52.1.x relbranch.
Status: REOPENED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
We also landed it on the 52.1.x relbranch. So it should be completely fixed even if there is another security release coming up for esr52.