Closed Bug 871897 Opened 11 years ago Closed 10 years ago

[email] Change mail sending process to use streaming to reduce peak memory usage

Categories

(Firefox OS Graveyard :: Gaia::E-Mail, enhancement, P1)

x86_64
Linux
enhancement

Tracking

(blocking-b2g:1.3+, b2g-v1.1hd wontfix, b2g-v1.2 wontfix, b2g-v1.3 fixed, b2g-v1.3T fixed, b2g-v1.4 unaffected)

RESOLVED FIXED
1.3 Sprint 4 - 11/8
blocking-b2g 1.3+
Tracking Status
b2g-v1.1hd --- wontfix
b2g-v1.2 --- wontfix
b2g-v1.3 --- fixed
b2g-v1.3T --- fixed
b2g-v1.4 --- unaffected

People

(Reporter: asuth, Assigned: asuth)

References

Details

(Whiteboard: [TD-42468], burirun2, burirun3)

Attachments

(3 files, 3 obsolete files)

We build the text of the message we are sending via SMTP/ActiveSync and APPENDing to the sent folder (non-gmail IMAP) in a single go, rather than in a streaming fashion with flow control.

For SMTP/IMAP mozTCPSocket provides us flow control for free.  For ActiveSync, we either need to also use mozTCPSocket for that (bad), keep using XHR but stage the file to a Blob/File that XHR/necko can use to stream the contents from disk, or request an XHR enhancement along the lines of responseType's moz-blob/moz-chunked-* for the request.

Note that most e-mail servers will bounce messages above a certain size, so there is a fundamental upper bound on the size of the messages we could reasonably expect to send and have it work out.
Severity: normal → enhancement
See Also: → 871852
Hi Andrew,

I found the mailapi uses MailComposer to serve e-mail content. And MailComposer of node.js already supports streaming based "local file" uploading. If it has, why do we need to write one??

I had tried to turn on that feature in our email app. It sends correctly about 100MB attachment without OOM through my gmail.
Flags: needinfo?(bugmail)
sorry, I made a mistake. The mail didn't send because of without fs.* implementation. You are right. We need to use XHR to get them trunk by trunk.
Flags: needinfo?(bugmail)
Indeed, the node libraries already support streaming and so we don't need to implement any extensive changes to them.  The problems are these:

1) our node-net shim provides no flow control support
2) our composer.js reads the entirety of the attachments into memory and provides them to mailcomposer.js
3) our composer.js explicitly puts mailcomposer.js in a non-streaming mode to build its results.  My memory isn't great on this, but I recall it seeming like by using the synchronous mode of operation I was avoiding implementing some other shims, but it's very possible I have since implemented them or my concerns were unfounded.
4) Composer in composer.js provides a signature that only provides a fully formed buffer
5) SmtpAccount only uses that buffer
6) the resulting chained call to (non-gmail) IMAP accounts in mailapi/imap/jobs.js also just takes just a buffer.  That's an offline operation which can be persisted to our IndexedDB storage, and persisting the full array is not a great idea, so we also need to alter that logic to be able to either re-create the stream and/or have streamed the file to a blob which we then later stream back out over the socket.

Note that we don't APPEND messages to the 'sent' folder for gmail, so it will see better memory use.  Also, I don't know what type of device you tested on or how fast its network was.  (But regardless, it's good to hear that you were able to get a 100 meg file to send with only relatively minor changes!)

If you can post a patch I can better understand what you've tried so far.
Sorry, it didn't work.

What I did are:
1. change the code of MailComposer to use callback function
2. write everything direct into callback
3. use Blob.slice to get chunk of file
4. write arraybuffer to callback

I don't know what's wrong, but I will try to make it work. 

I am using Inari with wifi. In this environment, it should work to send large file.
Cool.  I think a key thing will be hooking up the 'ondrain' event of mozTCPSocket (https://developer.mozilla.org/en-US/docs/Web/API/TCPSocket) combined with checking the return value of its send() implementation, or approximating it by having node-net maintain an internal running tally of bytes in flight.  (We obviously want to avoid having the worker thread putting data in async flight that exceeds the what mozTCPSocket needs/wants.)
Assignee: nobody → johu
Status: NEW → ASSIGNED
Andrew, 

I am not fully understand what you talks. I found the connection for sending email is created with 'simplesmtp/lib/client'($smtpclient). With the tracking of this library, it is defined at smtp/probe.js. But I can't find any TCPSocket.

The only place which uses TCPSocket is main-frame.setup.js which defines 'mailapi/worker-support/net-main'. But in this definition, I can't find it redirects anything about ondrain event??

Do you means I need to redirect to event to smtpclient??
Attached file streaming implementation (obsolete) —
Andrew,

To make thing clear, I put my working code as the attachment. It can send files through streaming.

The main change is in composer.js file. I rewrite the "_serveFile" method to use FileReader, and slice method. I found it always uses "base64" as attachment
 encoding in MailComposer. So, I just use it to encode file. Another big change is writeCallback. I use writeCallback to replace its original "_outputBuffer" for direct writing. All of them are implemented synchronized mode.

The change in configurator is to close the message while "withMessageBuffer" is finished.

I had tested it to send files near 25MB(limit of gmail) through gmail to my gmail account correctly.
John, it looks like you are directly modifying the aggregated files in the gaia repo.  As per https://github.com/mozilla-b2g/gaia/blob/master/apps/email/README.md the e-mail back-end code actually lives in the gaia-email-libs-and-more repo.  You should be making your changes against that repo and providing a pull request or diff against that repo.  Zip files of modified files are much harder to get useful diffs from, especially without knowing the base revision your changes are based on.

Also, please be aware that mailcomposer.js is from the upstream node.js mailcomposer library.  We are currently using an unforked version, and ideally we would keep it that way, so we want to avoid making direct changes to mailcomposer.js unless it's something we can upstream.  Ideally, our various shims like node-net could be improved to more directly support the node.js flow control mechanisms.

As you found, we don't do anything with the ondrain event right now.  But we need to, or do something like it.  Think of it this way: if your modified implementation does not know how many bytes have actually been buffered versus sent over the network, then how can it know when it is appropriate to read more data from disk so we don't use up all the memory?

When I unzip your changes and diff them, it looks like everything is still done synchronously, it is just that we we issue more writes when building the message and read attachments in 48k chunks at a time.  This is a large improvement because the garbage collector is able to reduce our peak memory usage for the intermediary strings/buffers that don't get queued up for the network.  But if you switch to switch to a 2g network connection, the entire file is still going to end up backlogged in the main process.
Oh, and also, we don't want the worker thread tied up in a synchronous loop that lasts as long as the send process takes since it will cause the e-mail app's backend to become unresponsive.  (Currently, sending mail is a synchronous operation from the UI, but we want to fix that.)
Wow, that looks more complex than I thought. If we need to avoid change the code of mailcomposer.js, we may need to implement partial filesystem module of node to make it run under  streaming mode, and what I had done seems useless now.

Ok, I will make a fork of gaia-email-libs-and-more, and read codes under it to find out the proper way to do so.
It's okay to monkey-patch the MailComposer object where it makes sense.  AKA, we could do (new MailComposer())._serveFile = function() {} rather than shimming out a very awkward fs shim.  We just definitely want to make sure our unit tests would detect any breakage if we try and upgrade the MailComposer version we use.
Ok, I got it. I will move the code to gaia-email-libs-and-more with the version of monkey-patch.
Attached file patch for this bug (obsolete) —
Use monkey-patch to override _serveFile method for streaming uploading.

This version is still implemented under synchronous mode. And I think it's nice to fix a issue with one path.
Attachment #751646 - Flags: review?(bugmail)
Attachment #751646 - Attachment mime type: text/x-patch → text/html
Attachment #751579 - Attachment is obsolete: true
Comment on attachment 751646 [details]
patch for this bug

(In reply to John Hu [:johnhu] from comment #13)
> This version is still implemented under synchronous mode. And I think it's
> nice to fix a issue with one path.

I'm not sure by what you mean when you say "with one path" here.  We would switch to doing things entirely asynchronously in all cases.


In any event, as I noted in my previous comment, this does improve the large attachment situation, but does not actually implement streaming, so it's not a fix for this bug.  If you want to file a new bug to do this a stop-gap improvement, we can do that, but the key things you'll need to address is that this breaks the test_compose.js unit test.  It appears you are reordering events in a bad way or failing to send an \r\n that should be sent.  It would also be nice if you could avoid sending just '\r\n' on its own, but that's not the end of the world.

Also, you'll want to manually test that this works with an activesync account (or run the test_compose test for activesync on a real hotmail account) because the fake-server currently does not simulate mail sending.
Attachment #751646 - Flags: review?(bugmail) → review-
Sorry, "with one path" is a typo, I mean "with one patch"..

And to make things run in asynchronously may need to route the event and send result from net-main to composer. Currently, I don't have enough knowledge to do it.

As to unit-test, I don't know how to configure it correctly. I may need your help on this part. 

I had run once yesterday. Before applying this patch, most of test_account_logic, test_activesync_*, test_body_observers, test_compose, test_just_auth, and test_mutation are failed to pass. And after this patch, the situation is still the same. So, I just think this patch doesn't break anything. But it shows I am wrong.
Andrew, 

With the help of squib, I can run unit test with activesync, but not imap. So, I don't understand what you exactly mean about my breaks of test_compose.js, especially: "reordering events in a bad way or failing to send an \r\n that should be sent". I don't think I change the order of events, that's the most confused part. I also noticed that I forgot to send '\r\n' while file is not found. Is this related to what you said?
test_compose currently does not work for the activesync tests; you need to run the IMAP tests for that.

test_compose breaks because the send process never completes.  We don't send "<CRLF>.<CRLF>", so the SMTP server just sits there waiting for us, and the test times out.
Andrew, 

Thanks for your information. I will try to find out what's going wrong about missing "<CRLF>.<CRLF>". 

And, I found I made a huge mistake on this bug. I only modify the SMTP part. This patch doesn't work for ActiveSync part. It uses more complex than SMTP. I will try to find out how to do it after missing "<CRLF>.<CRLF>" fixed. 

BTW, I still need your help to set up imap working. I had following the doc to set up it: https://github.com/mozilla-b2g/gaia-email-libs-and-more/blob/master/test/dovecot.md.
Attachment #751646 - Attachment is obsolete: true
I found the attached patch is correct, because it should take care about smtp, active sync, and composite account which creates sent mail for non-gmail typed. But the attached patch only solves smtp part, and not take care of the response of TCPSocket.send and ondrain event. So, I make it obsolete.
Assignee: johu → nobody
Requesting to implement this, as when there are poor network connection, email app dies silently.
blocking-b2g: --- → leo+
Priority: -- → P1
Whiteboard: [TD-42468]
Target Milestone: --- → 1.1 QE3 (24jun)
Hi Andrew,

Is this something you can follow up on?
Flags: needinfo?(bugmail)
I'll take this or will try and get lightsofapollo to take it, depending on when the QE3 deadline is updated to.
Assignee: nobody → bugmail
Flags: needinfo?(bugmail)
Andrew, to clarify, this only impacts ActiveSync accounts (Windows Live Mail and Exchange users).  Correct?

Does the issue affect both Windows Live Mail and Exchange users the same?  Thanks.
(In reply to Chris Lee [:clee] from comment #23)
> Andrew, to clarify, this only impacts ActiveSync accounts (Windows Live Mail
> and Exchange users).  Correct?

Incorrect.  Currently, whenever you send an e-mail, be it IMAP (gmail, etc.) or ActiveSync (covering Live Mail and Exchange the same), we build the message we are sending synchronously by creating a (potentially giant) string.  This string and its origins may use up so much memory that our app gets killed before sending completes.  There is also the potential complicating factor that we don't hold a wake-lock through this process and/or we may be put into the background by the user so we may be okay at first but then get killed when another app causes the out-of-memory killer to look for prey and we are no longer protected by virtue of being the foreground app.

There are potential complications relating to placing a copy of the message in the sent folder on non-gmail IMAP accounts since we need to build a second copy of the message (or have previously persisted the message to storage) in those cases.  This is not required for ActiveSync or gmail since the act of sending the messages copies it into the sent folder.
Note that we previously addressed leo+ bug 871852 in order to limit the problem in 1.1.
Target Milestone: 1.1 QE3 (26jun) → 1.1 QE4 (15jul)
Blocks: 887733
Hi Andrew, is this still on your radar?
Flags: needinfo?(bugmail)
Yes, I'm working this, although for unit test reasons I am finishing up the IMAP/SMTP fake-server support first.  Last week had a lot of time spent on holiday and interview-related overhead, and this week we've got a functional team meeting, but I'm trying to make as much progress as possible.
Depends on: 813411
Flags: needinfo?(bugmail)
Thanks Andrew, can we get an estimate for this one here?
Flags: needinfo?(bugmail)
Any movement on this?  We're standing by to do a regression and exploratory test pass against email for 1.1.
I'm finishing out the IMAP/SMTP fake-server dependency that's essential for confidence in the streaming fix.  I'm within spitting distance; there were just a lot of moles to whack, many optimistically assumed down for the count.  Earliest the patch could land would be Monday when reviewers become available; I'm expecting to work on the fix through the weekend.  I believee there are no longer any unknowns for this fix, so I'm reasonably confident about the effort to fix.

Implementation will significantly affect localdrafts storage and sending, but nothing else.  I do understand this bug is considered risky given current time-frames, so it's conceivable it won't be allowed to be uplifted when it does land.
Flags: needinfo?(bugmail)
Leo agrees not to block on this for 1.1 but would like to see this when upgrading to 1.2 happens.
blocking-b2g: leo+ → koi?
Dylan Oliver changed story state to started in Pivotal Tracker
Depends on: 902070
Attachment #786667 - Attachment description: Pointer to Github pull request: https://github.com/mozilla-b2g/jswbxml/pull/6 → Pointer to Github pull request: https://github.com/mozilla-b2g/jswbxml/pull/6 (jswbxml changes to support Blobs)
Attachment #786667 - Flags: review?(squibblyflabbetydoo)
triage: well underway -- this is part of 1.2.
blocking-b2g: koi? → koi+
Target Milestone: 1.1 QE4 (15jul) → 1.2 FC (16sep)
Target Milestone: 1.2 FC (16sep) → 1.3 Sprint 1 - 9/27
Having an issue to download an email with a large attachment (22mb) to gmail.com IMAP account,
File is in the loading process, but never downloaded

Gaia   313599c293f596519604070fa378ffc89e61e98f
SourceStamp decf58f0d596
BuildID 20131008004009
Version 26.0a2
Whiteboard: [TD-42468] → [TD-42468], burirun2
Comment on attachment 814709 [details]
Pointer to Github pull request: https://github.com/mozilla-b2g/gaia-email-libs-and-more/pull/254/files

I think this is all good, if a bit large (diffstat says 47 changed files with 3,047 additions and 750 deletions.) but there are several things I need to do for various reasons:

- I probably need to rebase and squash.  Unfortunately this was outstanding long enough and I did a merge at some point and my sleep/naive squash attempts just turned out very badly, so I recovered to my pre-rebase state.  I need to revisit the current drift state.

- The JS engine crasher that affects ActiveSync seems to be in full force.  I get full green IMAP tests with "make post-tests TEST_LOG_ENABLE=true TEST_VARIANT=imap:fake" but activesync makes things crash.  I believe I got the ActiveSync variants to successfully run on their own where the run was short enough that the probability of failure was not overwhelming.  I'll probably need to help the JS team create a more minimal crasher repro; I'll try that after a delta.

- I haven't run in gaia.  Activities are broken in general in Firefox nightly for me right now so I can't test the new functionality and I can't do on-device testing until I get back home from Brussels to where I have viable networks.

- Note that the 'jsas' and 'wbxml' commits reference commits in my repo, not the mozilla-b2g repos.  You'll need to manually fetch from those yourself.  The WBXML commit seems generally good by :squib, I think, and the ActiveSync once is very mimimal.
Attachment #814709 - Flags: review?(mcav)
Target Milestone: 1.3 Sprint 1 - 9/27 → 1.3 Sprint 3 - 10/25
Update now that I'm back from PTO, etc.

I did the rebase/squash equivalent.  I got a Leo device largely working again (mozilla-central has terrible graphics problems, not to mention apparent layout regressions that I had also seen when I was trying to get my unagi working) and have done some testing.  The immediate feedback to myself is that I need to put back in some of the console.log output I had in there previously, because it is not clear what the innards get up to.  (This also was a take-away from the recent mail bugs that turned out to be RIL problems; it seemed like it wasn't us, but it also wasn't clear what we were up to at the time of the failure.)  Also, there seems to be a potential problem where we continue to autosave the draft while we are in the process of trying to send it.  That won't be helpful.  I'll do this and continue my testing tomorrow.
Depends on: 915223
Here are the relevant diff hunks from the install-into-gaia run.  (We may want to cause the install-into-gaia make step to produce something like this in the gaia repo so gaia merges show the changes.):

 gaia-symlink/apps/email/js/ext/mailapi/worker-bootstrap.js
 ----------------
 deps/alameda.js
 data/lib/mailapi/worker-bootstrap.js
 scripts/config.js
 data/lib/node-buffer.js
 data/lib/mailapi/shim-sham.js
 data/lib/js-shims/event-queue.js
 data/lib/js-shims/microtime.js
 data/lib/rdcommon/extransform.js
 data/lib/rdcommon/log.js
 data/lib/mailapi/util.js
 data/lib/mailapi/a64.js
 data/lib/mailapi/allback.js
 data/lib/mailapi/date.js
 data/lib/mailapi/syncbase.js
 data/lib/mailapi/mailslice.js
+data/lib/mailapi/db/mail_rep.js
 data/lib/mailapi/searchfilter.js
 data/lib/mailapi/worker-router.js
 data/lib/mailapi/jobmixins.js
+data/lib/mailapi/drafts/draft_rep.js
+data/lib/mailapi/b64.js
+data/lib/mailapi/async_blob_fetcher.js
+data/lib/mailapi/drafts/jobs.js
 data/lib/mailapi/accountmixins.js
 data/deps/browserify-builtins/events.js
 data/deps/browserify-builtins/util.js
 data/deps/browserify-builtins/stream.js
 data/lib/node-crypto.js
+data/lib/mix.js
 data/lib/js-shims/faux-encoding.js
 data/lib/mailapi/mailchew-strings.js
 data/lib/mailapi/slice_bridge_proxy.js
 data/lib/mailapi/mailbridge.js
 data/lib/rdcommon/logreaper.js
 data/lib/mailapi/maildb.js
 data/lib/mailapi/cronsync.js
 data/lib/mailapi/accountcommon.js
 data/lib/mailapi/mailuniverse.js
 data/lib/mailapi/worker-setup.js
 
 
 gaia-symlink/apps/email/js/ext/mailapi/main-frame-setup.js
 ----------------
 data/lib/mailapi/worker-support/shim-sham.js
 data/lib/mailapi/mailapi.js
 data/lib/mailapi/worker-support/main-router.js
 data/lib/mailapi/worker-support/configparser-main.js
 data/lib/mailapi/worker-support/cronsync-main.js
 data/lib/mailapi/worker-support/devicestorage-main.js
 data/lib/mailapi/worker-support/maildb-main.js
+data/lib/mailapi/async_blob_fetcher.js
 data/lib/mailapi/worker-support/net-main.js
 data/lib/mailapi/main-frame-setup.js

...

-gaia-symlink/apps/email/js/ext/mailapi/composer.js
+gaia-symlink/apps/email/js/ext/mailapi/drafts/composer.js
 ----------------
 data/deps/mailcomposer/lib/punycode.js
 data/deps/mailcomposer/lib/dkim.js
 data/lib/nop.js
 data/lib/nop2.js
 data/lib/nop3.js
 data/deps/mailcomposer/lib/urlfetch.js
 data/lib/nop4.js
 data/deps/mailcomposer/lib/mailcomposer.js
 data/deps/mailcomposer.js
-data/lib/mailapi/composer.js
+data/lib/mailapi/drafts/composer.js
(In reply to Andrew Sutherland (:asuth) from comment #39)
> - The JS engine crasher that affects ActiveSync seems to be in full force. 
> I get full green IMAP tests with "make post-tests TEST_LOG_ENABLE=true
> TEST_VARIANT=imap:fake" but activesync makes things crash.  I believe I got
> the ActiveSync variants to successfully run on their own where the run was
> short enough that the probability of failure was not overwhelming.  I'll
> probably need to help the JS team create a more minimal crasher repro; I'll
> try that after a delta.

This is unfortunately still a problem.  As I commented in https://bugzilla.mozilla.org/show_bug.cgi?id=915223#c43 this appears to be a compiler bug (or a race condition related to varying optimization levels/optimization strategies between compiler versions, but since this is in the JS engine which generally eschews races, direct compiler bug is more likely).  I'm going to see if this is still happening on device targets too.  If so, given the apparent localization of the problem to ActiveSync, it's possible we're still using some weird Iterator feature we can just remove and just avoid the problem.
No longer depends on: 915223
(In reply to Andrew Sutherland (:asuth) from comment #42)
> (In reply to Andrew Sutherland (:asuth) from comment #39)
> > - The JS engine crasher that affects ActiveSync seems to be in full force. 
...
> This is unfortunately still a problem.

Good news / bad news.

Good news: The ActiveSync crasher is no more that was bug 915223.

Bad news 1: There is a deterministic crasher that I have filed as bug 932143 that it looks I can work around in our unit tests.  This has been marked as a security bug so I guess I can't go into it too much, but it seems like it won't affect our operation on devices, or it will only affect them with low probability and the multi-process model means it's only e-mail would crash.  Based on when the badness happens, this seems unaffected by my fixes on this bug so landing this fix won't affect that.  It also seems like there is a timing dependency and the reason we aren't seeing it a lot on travis is that A) timing on travis must be sufficiently different, and B) we haven't been landing a lot of stuff on GELAM since this regression came to town.

Bad news 2: Once bad news 1 is worked-around, there appears to be a potentially unrelated deterministic-ish (seems to involve multi-threaded cycle-collected GC) crasher that is related to this bug and could potentially affect e-mail on all branches were we to land this fix.  The stack is on https://bugzilla.mozilla.org/show_bug.cgi?id=932143#c3 right now.  Given the multi-threaded GC stuff involved which is beyond the capabilities of the e-mail app to work around, this is concerning.  From feedback I've already gotten, it seems like it probably is unrelated, so I'm going to file another bug on that now and then be all cagey about what the problem is because it probably counts as a security bug too.
Depends on: 932162
Depends on: 932183
In device testing it turns out mozTCPSocket was more broken than I thought while validating actual streaming; I've filed bug 932183 on its broken bufferedAmount and drain semantics.  I have a work-around in place in our code but am seeing a hang that could either be a mozTCPSocket problem, some different gecko/RIL issue, or some different network problem.  There's a fair chance it's not a mozTCPSocket problem given that in my testing I was having trouble reliably initiating a connection with the SMTP server in the first place and the case where we gave everything to mozTCPSocket immediately still had us ending up with the server not receiving all the data we sent.  I previously did some setup work to facilitate getting tcpdump traces of the actual wi-fi traffic from my device so I could look into train-wrecks like this more easily.  I am unable to do much in unit tests right now because of the various platform crashes mentioned in comment 43 and because b2g-desktop doesn't do multi-process which is the thing that makes mozTCPSocket's flow control stuff unreliable.
Whiteboard: [TD-42468], burirun2 → [TD-42468], burirun2, burirun3
Given the latest developments, this one is too risky for 1.2 at this point in the cycle -- we'll keep working on it but push it out to 1.3.
blocking-b2g: koi+ → 1.3+
Update: All bad news is now in hand.

- Update for "Bad news 1".  This turns out to be a debugging artifact caused by our JS engine.  Bug 933807 is the bug about making sure other people don't get tricked by it.

- Update for "Bad news 2".  I just put a fix up for review that includes a unit test, so we should be good on this soon.  I will pursue a 1.2 uplift for this even though this bug is no longer targeted at v1.2.

- Update for the mozTCPSocket flow control breakage on bug 932183: :kk1fff has a WIP so we should be good on that front soon, too.
Target Milestone: 1.3 Sprint 3 - 10/25 → 1.3 Sprint 4 - 11/8
Comment on attachment 786667 [details]
Pointer to Github pull request: https://github.com/mozilla-b2g/jswbxml/pull/6 (jswbxml changes to support Blobs)

I think this got to r=squib a while back.
Attachment #786667 - Flags: review?(squibblyflabbetydoo) → review+
Attachment #814709 - Attachment is obsolete: true
Attachment #814709 - Flags: review?(mcav)
Attached file GELAM patch for review
Attachment #8338317 - Flags: review?(mcav)
(In reply to Andrew Sutherland (:asuth) from comment #47)
> Comment on attachment 786667 [details]
> Pointer to Github pull request:
> https://github.com/mozilla-b2g/jswbxml/pull/6 (jswbxml changes to support
> Blobs)
> 
> I think this got to r=squib a while back.

Yes, this sounds familiar.
For a gaia branch with the GELAM changes installed, use:
https://github.com/asutherland/gaia/tree/stream-test

I've created a pull request for that branch too so I can see what Travis does:
https://github.com/mozilla-b2g/gaia/pull/14049

Much of this you already reviewed, but there has been some cleanup/compensation for POP3 changes/further fixes/enhancements made/etc.


In my on-device testing I was able to send a 29 megabyte video, (~41 megabytes encoded) on my ZTE open after modifying compose.js's MAX_ATTACHMENT_SIZE constant by multiplying it by 100 and making sure my mail-server had its maximum message size limit raised up a bit.  It is my understanding that without this patch some of our 256 MB devices have had trouble even sending at our hard-coded 5 megabyte limit, so this is pretty good success.

The GELAM tests run green on my local machine.

I think the marionette tests may have some minor breakage?  Locally it said "12 passing, 1 pending, 1 failing" when I ran them in a batch and the failing one was "email notifications, disable disable notification, but still sync".  I didn't run them under a vnc X, so there could have been some oddness from that.

Important device testing notes: I haven't run with POP3 on the device yet.  Also, my IMAP testing was run on an account without a 'sent' folder in existence and I haven't fixed that "create a sent folder" bug.  The unit tests for compose make me feel that the POP3 case is probably pretty good.  The IMAP APPEND case is going to be worse on huge files because after we synchronously do the SMTP, we asynchronously try and APPEND the message.  (Unless we are on gmail, in which case we luck out and don't have to do anything.  (I'm somewhat tempted to change up our implementation from using APPEND to automatically using BCC and us just using a specialized task to find the message in the Inbox and move it/copy it to the sent folder.  There would be horrible edge cases if the user had filters/ran afoul of their own spam filters/etc. though.)

I am planning to run those additional manual tests tomorrow.
Attachment #8338327 - Flags: review?(mcav)
Awesome! Just a couple of notes:

- It took me a couple passes to get all the git submodules loaded properly (specifically, activesync et al); once I did, GELAM passed green for me too. So just make sure that when you land it, you've updated all the right submodule pointers.

- I tested a POP3 account on-device sending several largish attachments, and it all seemed to work just fine for me, so that's good.

- The one non-nit I noted in the PR is that we're using two event signatures for updateHeader/Body -- changeType and changeDetail. We ought to homogenize them.

Unrelated: As I expected, POP3 still has trouble downloading a large attachment due to OOM. I know we talked about this before but I don't remember if you said we should wait for the streaming fix, the new mailparser, or if it just got left unnoticed. I assume this would involve similar sorts of blob-stuffing like what happens in this patch. Let me know and I can create a tracking bug for that too.

As far as Gaia integration tests, it seems to like the email tests okay... but it's always a little sketchy for me.
Attachment #8338317 - Flags: review?(mcav) → review+
Attachment #8338327 - Flags: review?(mcav) → review+
(In reply to Marcus Cavanaugh [:mcav] from comment #51)
> - It took me a couple passes to get all the git submodules loaded properly
> (specifically, activesync et al); once I did, GELAM passed green for me too.
> So just make sure that when you land it, you've updated all the right
> submodule pointers.

I'm thinking we potentially want something like a "make sane-repo" button that does the correct combination of "git submodule update" with arguments and "git submodule sync".  This can then be used to gloss over any migration to using npm with git references, etc. too.

What I think I needed to do to make your life simpler was modify .gitmodules and for you to run submodule sync in the root and inside deps/activesync for its wbxml submodule sync.  (Which is, of course, somewhat ridiculous.)
 
> Unrelated: As I expected, POP3 still has trouble downloading a large
> attachment due to OOM. I know we talked about this before but I don't
> remember if you said we should wait for the streaming fix, the new
> mailparser, or if it just got left unnoticed. I assume this would involve
> similar sorts of blob-stuffing like what happens in this patch. Let me know
> and I can create a tracking bug for that too.

I wanted you to wait for the streaming fix for my the addition of "flushBecause" to updateMessageBody since when the option is passed it will trigger an account save that will allow us to convert any attachments' memory-backed Blobs to disk-backed Blobs.  This doesn't resolve the peak memory usage of a single message, but it avoids memory death due to our FolderStorage cache keeping those Blobs around for a long time.  So if we can survive downloading a single large message, downloading a second large message should not kill us.  This leaves all accounts in basically the same solution post-this bug (and bug 894834 which I am resolving in this bug with my change to to do_download to pass the flushBecause option).

Fully addressing that problem for our planned download-manager treatment where we cache attachments in IndexedDB requires platform fixes so that we can use FileHandle for IndexedDB in a sub-process.  That will require some changes to our parsing of attachments / entire message parsing (for POP3) and I think to the POP3 parsing so you can stream to the parser.  It may also require IMAP changes.  I would suggest that you file a bug to try and quantify what our maximum message download size is for POP3 and then we do some combination of 1) just adding a guard that says "oh hell no, this message is too big to download" to save ourselves and 2) trying to address low-hanging fruit in the garbage generation front.  As I noted in my POP3 review, I think your re-generation of the big Uint8Array/Buffer after having created hella strings will generate a lot of garbage.  I also suspect that mailparser may generate a lot of garbage too.  Since mailparser can already do some streaming consumption, a first stab at streaming the POP3 download as it happens to mailparser should reduce peak memory usage by letting the GC catch more stuff.

I'll add the flushBecause usage to your existing storeMessage, especially since I glitched and did not fix your updateMessageBody call signature there.  I'll also triple-check I didn't miss any other calls added to updateMessageBody in your patch.  I haven't done my own POP3 testing yet for this, so I will sanity check this and try with some various message sizes and let you know how that goes.
 
> As far as Gaia integration tests, it seems to like the email tests okay...
> but it's always a little sketchy for me.

Cool, I'll be sure the travis runs are happy, of course.
Verified Fixed - This issue does not need testing as this is a back-end implementation. This is covered by extensive automated back-end tests.
Status: RESOLVED → VERIFIED
QA Whiteboard: [QAnalyst-Triage?]
Flags: needinfo?(ktucker)
Please ignore comment 54, this is a back-end change and we have no way of verifying this issue.
Status: VERIFIED → RESOLVED
Closed: 11 years ago10 years ago
QA Whiteboard: [QAnalyst-Triage?] → [QAnalyst-Triage+][QAnalyst-verify-]
Flags: needinfo?(ktucker)
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: