Closed Bug 1132479 Opened 9 years ago Closed 9 years ago

Fix test_IMAP_email_notification failure

Categories

(Firefox OS Graveyard :: Gaia::UI Tests, defect)

ARM
Gonk (Firefox OS)
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: RobertC, Assigned: martijn.martijn)

References

Details

Attachments

(3 files)

test_IMAP_email_notification is failing on Jenkins because the notification toaster is not displayed.
This issue is also reproduced locally with automation. Repro rate: 5 out of 5.
The issue cannot be reproduced manually.

http://jenkins1.qa.scl3.mozilla.com/view/UI/job/flame-kk-319.b2g-inbound.ui.functional.smoke/2594/HTML_Report/

The issue seems to be with the custom sync method we implemented. It changed the value of the "5 minutes" option to 20 seconds in order for the email to sync faster so the test doesn't take too long. Now when we look in the list there is the usual "5 minutes option" with value = 300000 and another 20 seconds option (which I assume is the one we added). The checked option is "20 seconds", but it doesn't seem to work, the sync is never triggered. After we introduce the 20 sec timer even if I chose the 5 minutes option the sync will not work.

We change the sync timer using this method:
https://github.com/mozilla-b2g/gaia/blob/master/tests/python/gaia-ui-tests/gaiatest/apps/email/regions/setup.py#L184

Regression range b2g-inbound
Last working 
Device firmware (base) 	L1TC100118D0
Device firmware (date) 	11 Feb 2015 09:56:40
Device firmware (incremental) 	eng.cltbld.20150211.125629
Device firmware (release) 	4.4.2
Device identifier 	flame
Gaia date 	11 Feb 2015 11:32:00
Gaia revision 	1166ea84708b
Gecko build 	20150211115741
Gecko revision 	8a8292948914
Gecko version 	38.0a1

First broken:
Device firmware (base) 	L1TC100118D0
Device firmware (date) 	11 Feb 2015 12:52:12
Device firmware (incremental) 	eng.cltbld.20150211.155203
Device firmware (release) 	4.4.2
Device identifier 	flame
Gaia date 	11 Feb 2015 11:52:34
Gaia revision 	428e1971257b
Gecko build 	20150211121740
Gecko revision 	2bc23e56e1eb
Gecko version 	38.0a1

I am not sure what changed in the range provided above to affect our test. The auto-sync works if we do not change the values in the list.
QA Whiteboard: [fxosqa-auto-backlog+]
This test is constantly failing on b2g-inbound.
Geo, as our work day is done, can someone from your side investigate/xfail the test? Thanks!
Flags: needinfo?(gmealer)
This is the failure: 
http://jenkins1.qa.scl3.mozilla.com/job/flame-kk-319.mozilla-central.tinderbox.ui.functional.smoke/360/testReport/junit/%28root%29/test_IMAP_email_notification_py%20TestEmailNotification_test_IMAP_email_notification/test_IMAP_email_notification_py_TestEmailNotification_test_IMAP_email_notification/
Traceback (most recent call last):
  File "/var/jenkins/2/workspace/flame-kk-319.mozilla-central.ui.functional.smoke/.env/local/lib/python2.7/site-packages/marionette_client-0.8.7-py2.7.egg/marionette/marionette_test.py", line 283, in run
    testMethod()
  File "/var/jenkins/2/workspace/flame-kk-319.mozilla-central.ui.functional.smoke/tests/python/gaia-ui-tests/gaiatest/tests/functional/email/test_IMAP_email_notification.py", line 51, in test_IMAP_email_notification
    system.wait_for_notification_toaster_displayed(timeout=60)
  File "/var/jenkins/2/workspace/flame-kk-319.mozilla-central.ui.functional.smoke/tests/python/gaia-ui-tests/gaiatest/apps/system/app.py", line 45, in wait_for_notification_toaster_displayed
    expected.element_displayed(*self._notification_toaster_locator), message=message)
  File "/var/jenkins/2/workspace/flame-kk-319.mozilla-central.ui.functional.smoke/.env/local/lib/python2.7/site-packages/marionette_client-0.8.7-py2.7.egg/marionette/wait.py", line 143, in until
    cause=last_exc)
Comment on attachment 8563515 [details] [review]
[gaia] mwargers:email_failure > mozilla-b2g:master

This seems to fix it.
Kicked of a test run: http://jenkins1.qa.scl3.mozilla.com/job/flame-kk.ui.adhoc/698/
Attachment #8563515 - Flags: review?(gmealer)
Assignee: nobody → martijn.martijn
Flags: needinfo?(gmealer) → needinfo?
Comment on attachment 8563515 [details] [review]
[gaia] mwargers:email_failure > mozilla-b2g:master

Hrm, on Jenkins, the test is still failing. For me locally, it is consistently passing with this change.
Flags: needinfo?
Attachment #8563515 - Flags: review?(gmealer)
Ok, I lied, not always passing locally either.
Ok, I think this is a regression from bug 1098289:
https://github.com/mozilla-b2g/gaia/commit/c23c5bce1322a758a6fbc2e63c6029466041e7b0

James, could you help me on this one. For this gaia-ui test, we need a syncinterval of 20s to speed up the test. We don't want to wait 5 minutes before we can verify if syncing email works correctly.
What is the best way to achieve this?
Blocks: 1098289
Flags: needinfo?(jrburke)
The navigator.sync API has a minimum threshold of 100 seconds built into it, so setting a value smaller than that via the email app will fail. To force an even faster interval than 100 seconds requires using the "requestsync-manager" permission to use navigator.syncManager.setPolicy[1] to set it smaller value.

I'm also not sure how that will work in the context of these tests, as the email app does not ask for the "requestsync-manager" permission, but perhaps you can do some special power granting in the test? This is how they do it in the tests for navigator.sync:

https://mxr.mozilla.org/mozilla-central/source/dom/requestsync/tests/test_basic.html?force=1#14

[1] https://github.com/jrburke/requestsyncmanager has an example of using setPolicy in the js/main.js file.
Flags: needinfo?(jrburke)
The requestSync mechanism's minimum interval is controlled by the pref "dom.requestSync.minInterval", which can be set to 1.  If you just force that preference in your runtime, the 20 second thing should work.
Thanks Andrew, very useful!
Comment on attachment 8563515 [details] [review]
[gaia] mwargers:email_failure > mozilla-b2g:master

Ok, this fixes it. The switch_to_frame call() is necessary to get the test also working on b2g desktop.
Attachment #8563515 - Flags: review?(jlorenzo)
(In reply to Martijn Wargers [:mwargers] (QA) from comment #13)
> http://jenkins1.qa.scl3.mozilla.com/job/flame-kk.ui.adhoc/699/

They are all failing. I don't understand how that can happen.
I repeated it myself 10 times on my device and it's failing only 1 time with this timeout.
I wonder if the date/time on the Jenkins devices are perhaps wrong? That can cause the notification to not appear (that happened to my device as well, when it was still on Mountain View time, while I'm now in Europe).
Ok, with using:
Gaia-Rev        f0b93e0668ef9565bd6f050b15b4f794d59feb65
Gecko-Rev       https://hg.mozilla.org/mozilla-central/rev/e0cb32a0b1aa
Build-ID        20150216010344
Version         38.0a1
I still get this failure.

With:
Gaia-Rev        0d7b35f23402c4cb29bca6b98280fec48a196dec
Gecko-Rev       https://hg.mozilla.org/mozilla-central/rev/3436787a82d0
Build-ID        20150208174128
Version         38.0a1
it passes with the pull request applied.

So perhaps there is a new regression that makes this fail?
Also, regarding bug 1133446, which mentions this is also failing manually.
It's possible there's a new regression, but none of the jenkins links are publicly accessible so I'm unable to see if there are logcats/etc. available.
Attached file logcat.txt
Here's a logcat from another run I triggered[1] in order to have only 1 execution of the email test and nothing else.

[1] http://jenkins1.qa.scl3.mozilla.com/job/flame-kk.ui.adhoc/700/
Comment on attachment 8563515 [details] [review]
[gaia] mwargers:email_failure > mozilla-b2g:master

Clearing the review while the perma fail on Jenkins gets addressed.
Attachment #8563515 - Flags: review?(jlorenzo)
The log shows this:

02-17 01:50:55.072  2173  2173 I GeckoDump: ERR: cronsync-main navigator.sync.register for IDs: 0 failed: ParamsError

The validation function does:
    if (!("minInterval" in aParams) ||
        aParams.minInterval < minInterval) {
      return false;
    }

Also, minInterval's units are seconds, not milliseconds.  (Which is why constant names that don't include units make me grumpy! :)  Which means that the pull request should not be setting the pref to 20000, but instead 19.  Or safer yet, 1.  (Email is still using millisecond units for consistency with what we did before and because that's the JS convention.)
Er, or rather, 20 is fine too.  But I still like 1.

Also, thanks very much for providing the log!
Comment on attachment 8563515 [details] [review]
[gaia] mwargers:email_failure > mozilla-b2g:master

Thanks, updated the pull request to use 19 and that works locally (and on desktop.
Attachment #8563515 - Flags: review?(jlorenzo)
(In reply to Martijn Wargers [:mwargers] (QA) from comment #22)
> http://jenkins1.qa.scl3.mozilla.com/job/flame-kk.ui.adhoc/701/

There are 3 failures, but not related to this bug or the fix provided. That makes this test still horribly unstable, but I'll file a follow-up on that issue, once this one is resolved.
Comment on attachment 8563515 [details] [review]
[gaia] mwargers:email_failure > mozilla-b2g:master

"Value" is not used anymore which makes the use of in function confusing in setup_active_sync_email and setup_IMAP_email.
Attachment #8563515 - Flags: review?(jlorenzo) → review-
Comment on attachment 8563515 [details] [review]
[gaia] mwargers:email_failure > mozilla-b2g:master

Updated pull request
Attachment #8563515 - Flags: review- → review?(jlorenzo)
Comment on attachment 8563515 [details] [review]
[gaia] mwargers:email_failure > mozilla-b2g:master

Looks good to me, with one note though: A comment explaining why we change the data_layer in a function which manipulates the UI would be helpful for future readers.
Attachment #8563515 - Flags: review?(jlorenzo) → review+
(In reply to Johan Lorenzo [:jlorenzo] (QA) from comment #26)
> Comment on attachment 8563515 [details] [review]
> [gaia] mwargers:email_failure > mozilla-b2g:master
> 
> Looks good to me, with one note though: A comment explaining why we change
> the data_layer in a function which manipulates the UI would be helpful for
> future readers.

Now with explaining comments.
Comment on attachment 8563515 [details] [review]
[gaia] mwargers:email_failure > mozilla-b2g:master

Johan asked if another person could take a look at this.
Attachment #8563515 - Flags: review?(gmealer)
(In reply to Martijn Wargers [:mwargers] (QA) from comment #28)
Martijn and I discussed the need of having a second pair of eyes here as the fix we have here is the first of its kind to my knowledge. I'd prefer to be sure we're okay regarding the content and the form.
I've left comments there. 

TL;DR: this particular change is fine, but analyzing the test shows that we have way too many chances to be hammering the IMAP server with quick checks, to the point of possibly risking getting ourselves blacklisted from the (not Mozilla) IMAP server we're hitting for this test. That's why that minimum exists in the first place.

I recognize the need to get this change in, and I'll gladly r+ it on its own merit after a reply, but I'm actually quite concerned with the test and Email app construction because of this. 

It's not a problem right now because this thing you're fixing doesn't work yet, but the moment this lands it will be, and unless I've misunderstood the test/app logic significantly, I might want to disable all the downstream tests because of that issue.
fwiw, I've accidentally left flame devices with 20-second sync intervals active for both yahoo.com and gmail.com for a few days and seen no negative impacts.  Because we kill idle IMAP connections after 30 seconds, 20 seconds actually isn't a bad value since it's arguably less resource-intensive for us to keep the TCP connection open and the folder potentially selected.

I do agree that from a testing perspective you would ideally want to have a minimum number of low-interval polling instances occurring, so the 'finally' construct does seem like a good idea.  I think you'd probably want to disable periodic sync entirely in the 'finally' block though.
This is merely fixing the test to the state of what it was. 
Before this test got broken, the test would already check every 20s to the IMAP server.
I updated the pull request to set the pref back to 100s. Let me know if that addresses your concerns.
(In reply to Andrew Sutherland [:asuth] from comment #31)
> fwiw, I've accidentally left flame devices with 20-second sync intervals
> active for both yahoo.com and gmail.com for a few days and seen no negative
> impacts.  Because we kill idle IMAP connections after 30 seconds, 20 seconds
> actually isn't a bad value since it's arguably less resource-intensive for
> us to keep the TCP connection open and the folder potentially selected.

OK, good to know. My concern is if we have a lab of 30 or 40 of these behind a NAT, it might look pretty bad. But to be fair, that would require all of them to crash out in this test and leave the device un-reset.

(In reply to Martijn Wargers [:mwargers] (QA) from comment #33)
> I updated the pull request to set the pref back to 100s. Let me know if that
> addresses your concerns.

Sure, should be good enough.
QA Whiteboard: [fxosqa-auto-backlog+] → [fxosqa-auto-s11]
Keywords: checkin-needed
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Comment on attachment 8573150 [details] [review]
[gaia] shinglyu:email_failure > mozilla-b2g:v2.2

We need this for v2.2 too. The smoketest has been failing for quite a while. Thank you.
Attachment #8573150 - Flags: review?(jlorenzo)
Attachment #8573150 - Flags: review?(gmealer)
Comment on attachment 8573150 [details] [review]
[gaia] shinglyu:email_failure > mozilla-b2g:v2.2

Nothing blocking in this PR. One more line has been added between the PR on master and the PR on 2.2. I left a comment about it.
Attachment #8573150 - Flags: review?(jlorenzo) → review+
Comment on attachment 8573150 [details] [review]
[gaia] shinglyu:email_failure > mozilla-b2g:v2.2

It's basically a r+, though I do think you should clean up Johan's issue, and also maybe do a couple of little fixes in the app object where (pre-existing code) wasn't passing parameters to the underlying wait properly.

For that reason, not marking this checkin-needed.
Attachment #8573150 - Flags: review?(gmealer) → review+
(In reply to Geo Mealer [:geo] from comment #39)
@Geo, Johan,

Thanks for the review. I've fixed the problems and tested it locally. Please merge it if you think its OK. 

> Comment on attachment 8573150 [details] [review]
> [gaia] shinglyu:email_failure > mozilla-b2g:v2.2
> 
> It's basically a r+, though I do think you should clean up Johan's issue,
> and also maybe do a couple of little fixes in the app object where
> (pre-existing code) wasn't passing parameters to the underlying wait
> properly.
> 
> For that reason, not marking this checkin-needed.
Flags: needinfo?(gmealer)
LGTM, marking for Autolander to merge.
Flags: needinfo?(gmealer)
Keywords: checkin-needed
http://docs.taskcluster.net/tools/task-graph-inspector/#9UIflj5kSUy8aeqZlIPcyg

The pull request failed to pass integration tests. It could not be landed, please try again.
Wow, I've never seen this task cluster thing! It seems like the tests didn't even run because of some docker issues. What should I do now?
Don't look like a bug in the code, try to re-trigger the test once
Keywords: checkin-needed
http://docs.taskcluster.net/tools/task-graph-inspector/#PXBtwupSSV6DY6CyJ8DuXQ

The pull request failed to pass integration tests. It could not be landed, please try again.
I am still seeing this issue on automation tests. Unable to reproduce manually, but this is appearing in nightly ui functional smoke automation test and an ad hoc jenkins run:

http://jenkins1.qa.scl3.mozilla.com/job/flame-kk-319.mozilla-central.nightly.ui.functional.smoke/87/HTML_Report/
Repro rate: 1/1

http://jenkins1.qa.scl3.mozilla.com/job/flame-kk.ui.adhoc/721/HTML_Report/
Repro rate: 1/10

Device firmware (base)	L1TC100118D0
Device firmware (date)	13 Mar 2015 01:19:06
Device firmware (incremental)	eng.cltbld.20150313.041854
Device firmware (release)	4.4.2
Device identifier	flame
Gaia date	12 Mar 2015 11:01:49
Gaia revision	eabe35cf054d
Gecko build	20150313010238
Gecko revision	42afc7ef5ccb
Gecko version	39.0a1

Traceback (most recent call last):
  File "/var/jenkins/2/workspace/flame-kk-319.mozilla-central.nightly.ui.functional.smoke/.env/local/lib/python2.7/site-packages/marionette_client-0.9-py2.7.egg/marionette/marionette_test.py", line 290, in run
    testMethod()
  File "/var/jenkins/2/workspace/flame-kk-319.mozilla-central.nightly.ui.functional.smoke/tests/python/gaia-ui-tests/gaiatest/tests/functional/email/test_IMAP_email_notification.py", line 69, in test_IMAP_email_notification
    email.wait_for_senders_email_displayed()
  File "/var/jenkins/2/workspace/flame-kk-319.mozilla-central.nightly.ui.functional.smoke/tests/python/gaia-ui-tests/gaiatest/apps/email/regions/read_email.py", line 25, in wait_for_senders_email_displayed
    Wait(self.marionette).until(lambda m: m.find_element(*self._senders_email_locator).text != '')
  File "/var/jenkins/2/workspace/flame-kk-319.mozilla-central.nightly.ui.functional.smoke/.env/local/lib/python2.7/site-packages/marionette_driver-0.2-py2.7.egg/marionette_driver/wait.py", line 122, in until
    rv = condition(self.marionette)
  File "/var/jenkins/2/workspace/flame-kk-319.mozilla-central.nightly.ui.functional.smoke/tests/python/gaia-ui-tests/gaiatest/apps/email/regions/read_email.py", line 25, in <lambda>
    Wait(self.marionette).until(lambda m: m.find_element(*self._senders_email_locator).text != '')
  File "/var/jenkins/2/workspace/flame-kk-319.mozilla-central.nightly.ui.functional.smoke/.env/local/lib/python2.7/site-packages/marionette_driver-0.2-py2.7.egg/marionette_driver/marionette.py", line 1448, in find_element
    response = self._send_message('findElement', 'value', **kwargs)
  File "/var/jenkins/2/workspace/flame-kk-319.mozilla-central.nightly.ui.functional.smoke/.env/local/lib/python2.7/site-packages/marionette_driver-0.2-py2.7.egg/marionette_driver/decorators.py", line 36, in _
    return func(*args, **kwargs)
  File "/var/jenkins/2/workspace/flame-kk-319.mozilla-central.nightly.ui.functional.smoke/.env/local/lib/python2.7/site-packages/marionette_driver-0.2-py2.7.egg/marionette_driver/marionette.py", line 670, in _send_message
    self._handle_error(response)
  File "/var/jenkins/2/workspace/flame-kk-319.mozilla-central.nightly.ui.functional.smoke/.env/local/lib/python2.7/site-packages/marionette_driver-0.2-py2.7.egg/marionette_driver/marionette.py", line 709, in _handle_error
    raise errors.NoSuchElementException(message=message, status=status, stacktrace=stacktrace)
NoSuchElementException: NoSuchElementException: Unable to locate element: .msg-reader-header-label
QA Whiteboard: [fxosqa-auto-s11] → [QAnalyst-Triage?][fxosqa-auto-s11]
Flags: needinfo?(pbylenga)
NI on Shing to take a look.
QA Whiteboard: [QAnalyst-Triage?][fxosqa-auto-s11] → [QAnalyst-Triage+][fxosqa-auto-s11]
Flags: needinfo?(pbylenga) → needinfo?(slyu)
I am having some trouble landing my patch on 2.2. I'll check if the problem on nightly is the same as on 2.2. Perhaps I'll submit another patch for nightly later.

(In reply to Brogan Zumwalt [:BroganZ] from comment #46)
> I am still seeing this issue on automation tests. Unable to reproduce
> manually, but this is appearing in nightly ui functional smoke automation
> test and an ad hoc jenkins run:
Flags: needinfo?(slyu)
Looks like the taskcluster issue is addressed in https://bugzilla.mozilla.org/show_bug.cgi?id=1142854. Retrying.
Keywords: checkin-needed
http://docs.taskcluster.net/tools/task-graph-inspector/#_F0I0O0DQvCoyizIvFwAoQ

The pull request failed to pass integration tests. It could not be landed, please try again.
Retry once again since bug 1142854 is now RESOLVED FIXED
Keywords: checkin-needed
Depends on: 1149024
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: