Closed Bug 1450690 Opened 6 years ago Closed 6 years ago

The telemetry event is not created for the opt-out-study enrollment

Categories

(Firefox :: Normandy Client, defect, P1)

defect

Tracking

()

VERIFIED FIXED
Firefox 61
Tracking Status
firefox-esr52 --- unaffected
firefox59 --- unaffected
firefox60 blocking verified
firefox61 blocking verified

People

(Reporter: JuliaC, Assigned: mythmon)

References

(Blocks 1 open bug)

Details

(Keywords: qablocker, regression)

Attachments

(1 file)

[Affected versions]:
- 61.0a1 (2018-04-02)
- 60.0b8 build1 (20180329154119)

[Affected platforms]:
- Windows 10 x64
- Ubuntu 16.04 x64
- macOS 10.13.3

[Preconditions]
- You need access to SHIELD Control Panel ( https://normandy-admin.stage.mozaws.net/ )
- Set the 'app.normandy.dev_mode' preference to 'true' to run recipes immediately on startup.
- Set the 'app.normandy.logging.level' preference to '0' to enable more logging.
- Set the 'security.content.signature.root_hash' preference to 'DB:74:CE:58:E4:F9:D0:9E:E0:42:36:BE:6C:C5:C4:F6:6A:E7:74:7D:C0:21:42:7A:03:BC:2F:57:0C:8B:9B:90'.
- Set the preference value for 'app.normandy.api_url' to 'https://normandy.stage.mozaws.net/api/v1'.

[Steps to reproduce]:
1. Open the SHIELD Control Panel.
2. Create, approve and publish a new opt-out-study recipe. 
3. Launch Firefox having the pre-requisites set and open:
   - browser console
   - about:studies
   - about:telemetry (locate the Events tab and choose "dynamic" in top right corner dropdown)
   - about:addons
  
[Expected result]:
- [step 3] - Browser console will show that the recipe has been successfully fetched and executed
           - about:studies will list the current study as active
           - about:telemetry will display the details about the current event
           - The chosen extension is downloaded, installed and listed as enabled in about:addons 

[Actual result]:
- [step 3] - The recipe is displayed as fetched and executed, but the 'Invalid extra key for event ["normandy", "enroll", "addon_study"].  (unknown)'  JS warning is displayed in Browser Console (see the screenshot https://drive.google.com/file/d/1BmOmdUbHXqJPm3ideuiBwZqS3ZezMX3q/view?usp=sharing) 
           - The current study is displayed as active in about:studies
           - There is no event created in about:telemetry for the current opt-out-study
           - The extension is successfully installed and enabled 

[Regression range]:
- The issue was introduced on 60.0a1 (2018-03-07) 
- It seems that it was regressed by bug 1439519

[Additional notes]
- If using a corrupt extension, the unenrollment event is properly created in about:telemetry
Gijs, does any of this sound familiar to you? I looked through the patch in bug 1439519 (supposedly the regressing patch), and I don't see anything related to Normandy. I also looked through the current state of the Normandy code, and don't see any issues. I verified in STMO that we don't have any events after 2018-03-07 like Adrian said. I'll keep digging on the Normandy side of this.
Flags: needinfo?(gijskruitbosch+bugs)
Can you point me to where/how Normandy informs telemetry about the extra events it uses (ie registers the types of events it will later send event telemetry for) ? It sounds like that bit is somehow not working / getting clobbered, so that by the time normandy then sends the events, they're not around anymore.

This is a little worrying because this is also affecting beta.

bug 1439519 made it so telemetry doesn't ask for blocklist information until after the blocklist is loaded (rather than forcing it to load). It means some of telemetry might be slower to finish starting up, but I don't think it should affect telemetry events...
Flags: needinfo?(gijskruitbosch+bugs) → needinfo?(mcooper)
[Tracking Requested - why for this release]: This is affecting the telemetry event creation for the opt-out-studies enrollment process.
Severity: normal → major
The actual culprit is more likely bug 1443587 (landed the same day) where I r+ed a micro-optimization I should've looked at more closely.

It appears as though since March 7 we've not allowed any `extra_keys` on any event that wasn't the first method&object combination registered per "event data" entry in `registerEvents`

So, for Normandy's events here:

    Services.telemetry.registerEvents(TELEMETRY_CATEGORY, {
      enroll: {
        methods: ["enroll"],
        objects: ["preference_study", "addon_study"],
        extra_keys: ["experimentType", "branch", "addonId", "addonVersion"],
        record_on_release: true,
      },

      enroll_failure: {
        methods: ["enrollFailed"],
        objects: ["addon_study"],
        extra_keys: ["reason"],
        record_on_release: true,
      },

      unenroll: {
        methods: ["unenroll"],
        objects: ["preference_study", "addon_study"],
        extra_keys: ["reason", "didResetValue", "addonId", "addonVersion"],
        record_on_release: true,
      },
    });

normandy.{enroll|unenroll}.addon_study has no supported extra keys (so any events recorded with any extra keys will be rejected). But normandy.{enroll|unenroll}.preference_study and normandy.enrollFailed.addon_study have some valid extra keys because they are the first method&objects registered in that event data item.

Adding to the problem: the test that should catch this (test_TelemetryEvent.js' test_dynamicEvents) didn't have an assert for this case and also wasn't actually being run.

So, this is my bug and I'll fix it rightaway.
Assignee: nobody → chutten
Status: NEW → ASSIGNED
Priority: -- → P1
Comment on attachment 8964657 [details]
bug 1450690 - Ensure extra_keys applies to more than just the first dynamic Telemetry Event

https://reviewboard.mozilla.org/r/233356/#review238948

Thank you for cleaning up after my stupidity.
Attachment #8964657 - Flags: review?(nfroyd) → review+
Comment on attachment 8964657 [details]
bug 1450690 - Ensure extra_keys applies to more than just the first dynamic Telemetry Event

https://reviewboard.mozilla.org/r/233356/#review238962

(With apologies for the driveby. I was a bit confused about comment 4 saying the test wasn't being run, so I looked at the patch...)

::: toolkit/components/telemetry/tests/unit/test_TelemetryEvents.js:300
(Diff revision 1)
>    Assert.equal(events.length, 2, "Should have recorded 2 events.");
>    Assert.equal(events[0][4], value, "Should have recorded the right value.");
>    Assert.equal(events[1][5].key1, value, "Should have recorded the right extra value.");
>  });
>  
> -add_task(function* test_dynamicEvents() {
> +add_task(async function test_dynamicEvents() {

Is this why the test "wasn't being run" (comment 4)? If so, should the patch make the same change to the other functions in this file?
Clearing ni for mythmon given the comments + patch.

IuliaC: in future, could you please include pushlog information when filing regressions? I assumed that the pushlog for the regression positively identified a single bug, but it now sounds like that was not the case...
Flags: needinfo?(mcooper)
(In reply to :Gijs (he/him) from comment #8)
> Clearing ni for mythmon given the comments + patch.
> 
> IuliaC: in future, could you please include pushlog information when filing
> regressions? I assumed that the pushlog for the regression positively
> identified a single bug, but it now sounds like that was not the case...

The mozregression builds have the telemetry disabled, so the regression info was based on manually narrowing down possible candidates for this issue. As I mentioned in comment 0, bug 1439519 _seemed_ to be one of them.
Comment on attachment 8964657 [details]
bug 1450690 - Ensure extra_keys applies to more than just the first dynamic Telemetry Event

https://reviewboard.mozilla.org/r/233356/#review239206

::: toolkit/components/telemetry/tests/unit/test_TelemetryEvents.js:300
(Diff revision 1)
>    Assert.equal(events.length, 2, "Should have recorded 2 events.");
>    Assert.equal(events[0][4], value, "Should have recorded the right value.");
>    Assert.equal(events[1][5].key1, value, "Should have recorded the right extra value.");
>  });
>  
> -add_task(function* test_dynamicEvents() {
> +add_task(async function test_dynamicEvents() {

Wow, looks like that's the reason. But how is that even possible? We have many tests that are defined that way. [See here](https://searchfox.org/mozilla-central/search?q=add_task(function*&case=false&regexp=false&path=).
(In reply to Iulia Cristescu, QA [:JuliaC] from comment #9)
> (In reply to :Gijs (he/him) from comment #8)
> > Clearing ni for mythmon given the comments + patch.
> > 
> > IuliaC: in future, could you please include pushlog information when filing
> > regressions? I assumed that the pushlog for the regression positively
> > identified a single bug, but it now sounds like that was not the case...
> 
> The mozregression builds have the telemetry disabled, so the regression info
> was based on manually narrowing down possible candidates for this issue. As
> I mentioned in comment 0, bug 1439519 _seemed_ to be one of them.

OK, that makes sense, but in that case (in future) you can manually create a pushlog URL for the dates to which you narrowed down the issue. :-)
Comment on attachment 8964657 [details]
bug 1450690 - Ensure extra_keys applies to more than just the first dynamic Telemetry Event

https://reviewboard.mozilla.org/r/233356/#review238962

> Is this why the test "wasn't being run" (comment 4)? If so, should the patch make the same change to the other functions in this file?

I'd rather keep this small for ease of uplift. Filing a bug for figuring out why generator-function xpcshell tests aren't being run is on my list :S
Comment on attachment 8964657 [details]
bug 1450690 - Ensure extra_keys applies to more than just the first dynamic Telemetry Event

https://reviewboard.mozilla.org/r/233356/#review239206

> Wow, looks like that's the reason. But how is that even possible? We have many tests that are defined that way. [See here](https://searchfox.org/mozilla-central/search?q=add_task(function*&case=false&regexp=false&path=).

Generator functions are supported in mochitests, but near as I can tell they aren't working in xpcshell. ...or maybe it's just generator functions without any `yield` in them? Beyond the scope of this bug.
Comment on attachment 8964657 [details]
bug 1450690 - Ensure extra_keys applies to more than just the first dynamic Telemetry Event

https://reviewboard.mozilla.org/r/233356/#review239252
Attachment #8964657 - Flags: review?(alessio.placitelli) → review+
Comment on attachment 8964657 [details]
bug 1450690 - Ensure extra_keys applies to more than just the first dynamic Telemetry Event

https://reviewboard.mozilla.org/r/233356/#review239206

> Generator functions are supported in mochitests, but near as I can tell they aren't working in xpcshell. ...or maybe it's just generator functions without any `yield` in them? Beyond the scope of this bug.

Still there's [quite a few](https://searchfox.org/mozilla-central/search?q=add_task(function*&case=false&regexp=false&path=unit) of them :(
Pushed by chutten@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/c0a3a4b5e11c
Ensure extra_keys applies to more than just the first dynamic Telemetry Event r=Dexter,froydnj
For those interested, I've filed bug 1451333 to investigate the generator/async thing. It's currently in Toolkit::Telemetry since ours are the only confirmed tests that are having this problem.
https://hg.mozilla.org/mozilla-central/rev/c0a3a4b5e11c
Status: ASSIGNED → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla61
:JuliaC would you be able to verify this fix on the next Nightly to see if this takes care of the problem?
Flags: needinfo?(iulia.cristescu)
(In reply to Chris H-C :chutten from comment #19)
> :JuliaC would you be able to verify this fix on the next Nightly to see if
> this takes care of the problem?

Sure, I will confirm the fix as soon as it is available. Keeping the ni until then.
Back with updates related to the fix that just got uplifted to the latest Nightly 61.0a1 (2018-04-05) build. The investigation took part across platforms (Windows 10 x64, Ubuntu 16.04 x86 and macOS 10.13.3).
Although all the active recipes are successfully fetched and executed in Browser Console:
- the main issue is still reproducible (the corresponding events for the opt-out-studies enrollment are not yet displayed in about:telemetry)
- more than that, the unenrollment events for the corrupt extensions are not displayed anymore in about:telemetry
- the 'Invalid extra key for event ["normandy", "enroll", "addon_study"].'  JS warning is not displayed anymore in Browser Console
These being said, Firefox 61 can be reconsidered as affected.
Status: RESOLVED → REOPENED
Flags: needinfo?(iulia.cristescu)
Resolution: FIXED → ---
I forgot to mention another side effect: the about:studies page is showing that "You have not participated in any studies", even though there are currently active studies taking place (as Browser Console also states).
Flags: needinfo?(chutten)
Target Milestone: mozilla61 → ---
Comment on attachment 8964657 [details]
bug 1450690 - Ensure extra_keys applies to more than just the first dynamic Telemetry Event

Approval Request Comment
[Feature/Bug causing the regression]: bug 1443587 
[User impact if declined]: 
Some Normandy events will not be able to be recorded.
Users enrolled in studies will be miscounted.
[Is this code covered by automated tests?]:
Not the Normandy parts, but now the Telemetry Events code is.
[Has the fix been verified in Nightly?]:
The fix fixes the rejection.
[Needs manual test from QE? If yes, steps to reproduce]: 
Nope, now covered under automated Telemetry Events testing.
[List of other uplifts needed for the feature/fix]:
None for now, though this bug will remain open for future investigations.
[Is the change risky?]:
No
[Why is the change risky/not risky?]:
It is a clean revert of an incorrect optimization, returning code to a previously-shipping state. 
And it adds tests.
[String changes made/needed]:
None.
Attachment #8964657 - Flags: approval-mozilla-beta?
I am unable to investigate this further as I do not have access to normandy stage or the tools and knowledge necessary to continue investigating.
Assignee: chutten → nobody
Component: Telemetry → Normandy Client
Flags: needinfo?(chutten)
Product: Toolkit → Firefox
(In reply to Iulia Cristescu, QA [:JuliaC] from comment #21)
> Back with updates related to the fix that just got uplifted to the latest
> Nightly 61.0a1 (2018-04-05) build. The investigation took part across
> platforms (Windows 10 x64, Ubuntu 16.04 x86 and macOS 10.13.3).
> Although all the active recipes are successfully fetched and executed in
> Browser Console:
> - the main issue is still reproducible (the corresponding events for the
> opt-out-studies enrollment are not yet displayed in about:telemetry)
> - more than that, the unenrollment events for the corrupt extensions are not
> displayed anymore in about:telemetry
> - the 'Invalid extra key for event ["normandy", "enroll", "addon_study"].' 
> JS warning is not displayed anymore in Browser Console
> These being said, Firefox 61 can be reconsidered as affected.

I just had an idea. The events that normandy uses are dynamic events (registered at runtime) so they won't appear in the Events section of about:telemetry unless you change the process you're looking at (the SELECT at the top-right) to "dynamic" (it defaults to "parent")

Is it possible that the events are present but hidden by poor interface design?
Flags: needinfo?(iulia.cristescu)
(In reply to Chris H-C :chutten from comment #25)
> 
> I just had an idea. The events that normandy uses are dynamic events
> (registered at runtime) so they won't appear in the Events section of
> about:telemetry unless you change the process you're looking at (the SELECT
> at the top-right) to "dynamic" (it defaults to "parent")
> 
> Is it possible that the events are present but hidden by poor interface
> design?

Indeed, they are dynamic events, but we already did this (used the "dynamic" section) every time we tested the about:telemetry stuff.
What is the actual plan here? Is it safe that this fix to be uplifted in this condition?
Status: REOPENED → RESOLVED
Closed: 6 years ago6 years ago
Flags: needinfo?(iulia.cristescu) → needinfo?(chutten)
Resolution: --- → INVALID
Status: RESOLVED → REOPENED
Resolution: INVALID → ---
Sorry for the involuntary status change!
I hope to have the fix uplifted as it fixes a real problem (that extra_keys don't work as advertised) and has some good test coverage. The problem it fixes was created in 60, so I'd like it to ship in 60.

As for what to do from here for this bug itself, I'm not sure. I don't have much Normandy domain knowledge (I'm only this active on the bug because I was poking around the Telemetry Events side), and I don't have access to normandy stage to try and reproduce it locally.

With :mythmon on PTO I'm not sure where to take this... +Cc :lonnen and :mostlygeek in hopes they'll know.
Flags: needinfo?(chutten)
I'll sync up with :rdalal and we will ping :chutten about helping him test it.
Comment on attachment 8964657 [details]
bug 1450690 - Ensure extra_keys applies to more than just the first dynamic Telemetry Event

we can probably uplift this to beta and work on sorting out verification in parallel.  approved for 60.0b12
Attachment #8964657 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Julien and I discussed this and decided to hold off on this for b12 until the unresolved questions in the bug are better answered, lest we end up putting Beta in a worse-off state.

https://hg.mozilla.org/releases/mozilla-beta/rev/64eeb41e2292
At Iulia's request, I created a Try build with bug 1443587 backed out on top of Beta for them to test. It was a bit non-trivial due to conflicts, but seems to pass tests OK. It's another option for 60 at least if we can't find a better path forward otherwise.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=54af3a74fd50578f2ea46923677ccc9ae0da6459
Blocks: 1443587
(In reply to Ryan VanderMeulen [:RyanVM] from comment #33)
> At Iulia's request, I created a Try build with bug 1443587 backed out on top
> of Beta for them to test. It was a bit non-trivial due to conflicts, but
> seems to pass tests OK. It's another option for 60 at least if we can't find
> a better path forward otherwise.
> 
> https://treeherder.mozilla.org/#/
> jobs?repo=try&revision=54af3a74fd50578f2ea46923677ccc9ae0da6459

Managed to test this try build on Windows 10 x64 and Ubuntu 16.04 x64 for opt-out-study enrollment and unenrollment processes. We can confirm that in this case: 
- the opt-out-study recipes are successfully fetched and executed in Browser Console
- the corresponding extensions are downloaded, installed and listed as enabled in about:addons
- all the active studies are properly displayed in about:studies
- the opt-out-study enrollment and unenrollment processes properly generates events in about:telemetry  
- the 'Invalid extra key for event ["normandy", "enroll", "addon_study"].  (unknown)'  JS warning is not displayed anymore in Browser Console

The server validation tests are also in progress, but their results will not be ready till Monday.
We're in our last full week of beta for 60 at this point (we build the RC on Monday). I'm inclined to land the backout patch on Beta at this point to get us back to a known-good state for 60 while we continue to work through the remaining inconsistencies for 61+. Does that sound reasonable to you, Michael?
Flags: needinfo?(mcooper)
Whoops, there's actually two more weeks until we build the RC. My question remains still, though :). I still think the backout is the lowest-risk option for 60 at this point.
As an update to comment 34, the server validation tests that were started on 04-13-2018 have the following state in this moment: 
- the Windows 10 client IDs are successfully listed in sql for the pref experiments enrollment and unenrollment processes
- sql displays no entries for the Windows 10 opt-out-studies enrollment and unenrollment processes 
- sql displays no entries for the macOS 10.13.3. and Ubuntu 16.04 pref experiments & opt-out-studies enrollment and unenrollment processes 
The second and the third results may be caused by some server delays (e.g. the one from bug 1453619). I will re-execute the queries to get some extra results by tomorrow EOD. 
I keep the track for above mentioned tests on this gdoc https://docs.google.com/document/d/1YdGr-TS_0LOPIAlNCF_HFHZSHloBmBlBMpvGsz6J1NM/edit?usp=sharing.
I think backing out the micro-optimization makes sense. It seems to fix the problem, and if the optimization is micro, it shouldn't be a large impact to back it out. That being said, Normandy is a victim here, not actually related to the change, so I can't comment about if this is the right solution, only that it is *a* solution.
Flags: needinfo?(mcooper)
I double-checked with froydnj and chutten and everyone is on board with the backout option for Beta60 at this point, so let's do that for today's 60.0b13 build. Note that I'm *not* backing out from trunk right now, so we'll still need to continue the investigation here for 61+.

https://hg.mozilla.org/releases/mozilla-beta/rev/1979fbb0fb9364067a2ae0bb2201166f1ef3c418
Comment on attachment 8964657 [details]
bug 1450690 - Ensure extra_keys applies to more than just the first dynamic Telemetry Event

clearing beta approval since this was never actually uplifted.
Attachment #8964657 - Flags: approval-mozilla-beta+
At this point, it's not clear to me who owns the next step here for Fx61, but this needs an active owner.
Severity: major → critical
Flags: needinfo?(nfroyd)
Flags: needinfo?(mcooper)
Flags: needinfo?(chutten)
See Also: → 1455096
The next step for Firefox Telemetry Team is to repair Telemetry.recordEvent for dynamic events. I've filed bug 1455096 for getting that patch landed for 61.
Flags: needinfo?(chutten)
I don't understand what comment 21 and comment 22 are expressing, and like comment 24, I think I'm out of my depth as far as investigating, not having normandy access at the very least.  Is it possible that we can write a xpcshell-only test to express what comment 0 is seeing?

Did you mean to reset the status for fx60?
Flags: needinfo?(nfroyd)
(In reply to Nathan Froyd [:froydnj] from comment #43)
> Did you mean to reset the status for fx60?

Yes, we fixed this for Fx60 by backing out bug 1443587. Commment 34 and 37 confirmed that backing it out on Beta got us back into a known-good state (and I'm sure Iulia will eventually follow-up in this bug after testing 60.0b13 as well).
Depends on: 1455096
I'm not driving this, but I'm available to help with the Normandy bits to reproduce problems or verify fixes.
Flags: needinfo?(mcooper)
I've nailed it down to AddonStudies.start throwing an exception while trying to add a study to the database. Putting log.warn(err) in the top of catch(err) prints:

InvalidStateError: A mutation operation was attempted on a database that did not allow mutations. JS Stack trace: transaction@IndexedDB.jsm:289:39
objectStore@IndexedDB.jsm:317:23
getStore@AddonStudies.jsm:87:10
start@AddonStudies.jsm:273:13
Async*wrapAsync/</<@SandboxManager.jsm:108:9
wrapAsync/<@SandboxManager.jsm:103:14
execute@SandboxManager.jsm:57:2272
async*registerAction/<@SandboxManager.jsm:61:18
@SandboxManager.jsm:58:53
evalInSandbox@SandboxManager.jsm:57:12
runAsyncCallback@ActionSandboxManager.jsm:74:26
async*runRecipe@ActionsManager.jsm:97:17
async*run@RecipeRunner.jsm:240:9
async*init@RecipeRunner.jsm:80:13
async*finishInit@Normandy.jsm:97:11
async*observe@Normandy.jsm:53:7
ssi_sendRestoreCompletedNotifications@SessionStore.jsm:4584:7
ssi_restoreWindow@SessionStore.jsm:3523:5
ssi_restoreWindows@SessionStore.jsm:3629:5
initializeWindow@SessionStore.jsm:1177:11
onBeforeBrowserWindowShown/<@SessionStore.jsm:1328:9
promise callback*onBeforeBrowserWindowShown@SessionStore.jsm:1313:5
ssi_observe@SessionStore.jsm:764:9
onLoad@browser.js:1337:5
onload@browser.xul:1:1

The reason this didn't show in the usual ways is that the first statement in the catch block tries to delete the recipe, which errors out for the same reason. (I recommend adding some logging to AddonStudies.start)

Not sure where I can go from here, as I don't know how Normandy uses IndexedDB. ni?mythmon for more info.

[1]: https://searchfox.org/mozilla-central/rev/11a2ae294f50049e12515b5821f5a396d951aacb/toolkit/components/normandy/lib/AddonStudies.jsm#273
Assignee: nobody → chutten
Status: REOPENED → ASSIGNED
Flags: needinfo?(mcooper)
Reassigning as :mythmon's on it.
Assignee: chutten → mcooper
This seems like a separate issue that was originally reported, since it would prevent the study from being installed. comment 0 says "The current study is displayed as active in about:studies" and "The extension is successfully installed and enabled". Neither of these would be true if the above error happened.

That being said, this is a real issue, and our error handling is subpar here. I'm going to look further into this, but we may want to consider a separate bug for it.
If you want to close this bug out and file follow-up bugs for the remaining work that needs to be done, I think that makes a lot of sense.
Depends on: 1455735
I've opened bug 1455735 for the IndexedDB problems chutten saw in comment 46. I'm going to leave this open as a tracker of sorts, until all the blocking bugs are fixed and we can verify that everything is working again.
Flags: needinfo?(mcooper)
(In reply to Ryan VanderMeulen [:RyanVM] from comment #44)
> (In reply to Nathan Froyd [:froydnj] from comment #43)
> > Did you mean to reset the status for fx60?
> 
> Yes, we fixed this for Fx60 by backing out bug 1443587. Commment 34 and 37
> confirmed that backing it out on Beta got us back into a known-good state
> (and I'm sure Iulia will eventually follow-up in this bug after testing
> 60.0b13 as well).

Back with the follow-up for the comment 34. We can confirm that after the 60 backout:
- the opt-out-study recipes are successfully fetched and executed in Browser Console
- the corresponding extensions are downloaded, installed and listed as enabled in about:addons
- all the active studies are properly displayed in about:studies
- the opt-out-study enrollment and unenrollment processes properly generates events in about:telemetry  
- the 'Invalid extra key for event ["normandy", "enroll", "addon_study"].  (unknown)'  JS warning is not displayed anymore in Browser Console
- the preference experiment recipes are successfully fetched and executed in Browser Console
- the corresponding preferences are properly updated
- the preference experiment enrollment and unenrollment processes properly generates events in about:telemetry

The server validation tests are also looking good. 
See all these results reflected here: https://testrail.stage.mozaws.net/index.php?/plans/view/9328.
To summarize this bug: There were two problems that became conflated over the last 51 comments.

The first issue was a bug in Telemetry, caused by a recent optimization to Telemetry. This caused events generated by Normandy about add-on studies (opt-out-studies) to not be considered valid, and therefore not be recorded or sent. This is the original bug described in comment 0. The resolution of this bug was twofold: On Beta 60, the optimization was backed out. On Nightly 61, the optimization was improved to fix the problem while keeping the optimization.

While verifying the fix above, another issue was uncovered. Normandy was totally unable to enroll in opt-out-studies. Since this was noticed soon after the above Telemetry bug affected Normandy, it was assumed they were related. After further investigation, it became clear that this problem was a separate, unrelated issue that only affected Nightly 61. The issue was a regression caused by a change to Normandy. A particular code path did not `await` an async function, causing unpredictable problems including failed enrollment, failure to write to IndexedDB, or sometimes no problem at all. This bug has also been fixed on Nightly 61.

Overall, all of the problem I know of related to this bug are now fixed, QA is unblocked, both for Preference Rollout on Nightly 61 and Normandy Telemetry Events on Beta 60. Thank you everyone for the help tracking down and fixing these issues.
Status: ASSIGNED → RESOLVED
Closed: 6 years ago6 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 61
Taking in account comment 51, comment 52 and given that on Nightly 61 there is no time left to smoke test Normandy & Normandy telemetry, I would still go for it on Beta 61, just to be on the safe side. 
Adding NI's to keep this under radar and leaving the qe+ flag on until we figure out what the next QA step is on Beta 61 .
Flags: needinfo?(iulia.cristescu)
Flags: needinfo?(adrian.florinescu)
We finished the full functional and exploratory testing session targeting the changes that might be occurred after the 61 fix. No regressions were triggered by this particular fix and also, the overall feature status has been significantly improved. Two new issues were logged (bug 1460893 and bug 1460900). See more details here: https://testrail.stage.mozaws.net/index.php?/plans/view/9676
Status: RESOLVED → VERIFIED
Flags: qe-verify+
Flags: needinfo?(iulia.cristescu)
Flags: needinfo?(adrian.florinescu)
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: