Closed Bug 964561 Opened 10 years ago Closed 8 years ago

Use the size of structured clone data to choose whether it will be stored in the database or in the filesystem

Tracking

()

Status:

RESOLVED FIXED

Milestone:

mozilla52

Tracking Flags:

Tracking

Status

firefox52

---

fixed

People

(Reporter: bent.mozilla, Assigned: janv)

References

Details

(Whiteboard: [games])

Attachments

(11 files, 8 obsolete files)

Add JSAutoStructuredCloneBuffer::Copy() which takes a PRFileDesc 10 years ago Jan Varga [:janv] 3.26 KB, patch		Details \| Diff \| Splinter Review
patch v2 10 years ago Jan Varga [:janv] 2.64 KB, patch	luke : review+	Details \| Diff \| Splinter Review
Part 1: Add JSAutoStructuredCloneBuffer::Copy() which takes a FILE* 10 years ago Jan Varga [:janv] 2.54 KB, patch	luke : review+	Details \| Diff \| Splinter Review
patch v1 10 years ago Jan Varga [:janv] 46.57 KB, patch		Details \| Diff \| Splinter Review
Part 2: Use the size of structured clone data to choose whether it will be stored in the database or in the filesystem 10 years ago Jan Varga [:janv] 28.01 KB, patch		Details \| Diff \| Splinter Review
Part 3: Fix tests 10 years ago Jan Varga [:janv] 5.38 KB, patch		Details \| Diff \| Splinter Review
Part 2: Use the size of structured clone data to choose whether it will be stored in the database or in the filesystem 10 years ago Jan Varga [:janv] 28.11 KB, patch		Details \| Diff \| Splinter Review
Part 2: Use the size of structured clone data to choose whether it will be stored in the database or in the filesystem 10 years ago Jan Varga [:janv] 28.47 KB, patch		Details \| Diff \| Splinter Review
Part 1: Use a type in StructuredCloneFile instead of a boolean; r=asuth 8 years ago Jan Varga [:janv] 14.37 KB, patch	asuth : review+	Details \| Diff \| Splinter Review
Part 2: Refactor file ids handling for better expandability; r=asuth 8 years ago Jan Varga [:janv] 5.88 KB, patch	asuth : review+	Details \| Diff \| Splinter Review
Part 3: Implement an input stream wrapper around structured clone data; r=asuth 8 years ago Jan Varga [:janv] 4.25 KB, patch	asuth : review+	Details \| Diff \| Splinter Review
Part 4: Update keys directly in the structured clone buffer; r=asuth 8 years ago Jan Varga [:janv] 4.36 KB, patch	asuth : review+	Details \| Diff \| Splinter Review
Part 5: Add a data threshold pref; r=asuth 8 years ago Jan Varga [:janv] 5.31 KB, patch	asuth : review+	Details \| Diff \| Splinter Review
Part 6: Core changes for storing structured clone data outside of the database; r=asuth 8 years ago Jan Varga [:janv] 20.26 KB, patch	asuth : review+	Details \| Diff \| Splinter Review
Part 7: Compress externally stored structured clone data; r=asuth 8 years ago Jan Varga [:janv] 7.69 KB, patch	asuth : review+	Details \| Diff \| Splinter Review
Part 8: Disable externally stored structured clone data in file tests; r=asuth 8 years ago Jan Varga [:janv] 7.55 KB, patch	asuth : review+	Details \| Diff \| Splinter Review
Part 9: Add a new test for externally stored structured clone data; r=asuth 8 years ago Jan Varga [:janv] 9.88 KB, patch	asuth : review+	Details \| Diff \| Splinter Review
Part 10: Fix a possible leak in BindAdoptedBlobByName()/BindAdoptedBlobByIndex(); r=mak 8 years ago Jan Varga [:janv] 2.39 KB, patch	mak : review+	Details \| Diff \| Splinter Review
A follow-up fix. Pass correct buffer size to CopyFileData(); r=asuth 8 years ago Jan Varga [:janv] 1.13 KB, patch	asuth : review+	Details \| Diff \| Splinter Review

Ben Turner (not reading bugmail, use the needinfo flag!)

Reporter

Description

•

10 years ago

Currently all structured clone data (including ArrayBuffer contents) is stored in the database while all Blob/File objects are always stored as actual files in the filesystem. Neither decision takes the data size into account so in the worst case we can have very large values in the database and many tiny files in the filesystem. We should be smarter about this and consistently store small values in the database and larger values in the filesystem.

The size threshold will probably need to be tweakable via pref so that mobile/desktop can have different values.

Luke Wagner [:luke]

Comment 1

•

10 years ago

In particular, this would greatly help in avoiding the general problem described in bug 933398 comment 0.  We'd still have the 1 big copy from the ArrayBuffer (holding the VFS) to the structured clone buffer, but at least we'd avoid the subsequent copy when compressing and again later when storing in sqlite.  (To avoid the structured clone buffer copy, we discussed an orthogonal optimization.)

Blocks: 933398

Whiteboard: [games]

Jan Varga [:janv]

Assignee

Comment 2

•

10 years ago

I did a little investigation of Epic's Citadel demo.

<profile>/storage/persistenct/http+++www.unrealengine.com/idb/1873003896EEMH_CPARCE_LDOA
- empty directory (no stored files obviously)

sqlite3 <profile>/storage/persistenct/http+++www.unrealengine.com/idb/1873003896EEMH_CPARCE_LDOA.sqlite
sqlite> pragma page_size;
32768

sqlite> select length(data) from object_data;
102690803
105

so, just 2 rows, one of them almost 100MB (compressed)

There are some performance numbers about whether to store blobs in the database or a filesystem.
It seems the limit is 100k.

We can make this number dependent on the platform, and maybe even configurable from JS, e.g.
var request = indexedDB.open("myDb", 1, { maxInternalBlobSize: 102400 });

Jan Varga [:janv]

Assignee

Comment 3

•

10 years ago

Attached patch Add JSAutoStructuredCloneBuffer::Copy() which takes a PRFileDesc (obsolete) — Details — Splinter Review

We need this to avoid creating an extra buffer in IndexedDB implementation.

Assignee: nobody → Jan.Varga

Attachment #8366754 - Flags: review?(luke)

Luke Wagner [:luke]

Comment 4

•

10 years ago

Comment on attachment 8366754 [details] [diff] [review]
Add JSAutoStructuredCloneBuffer::Copy() which takes a PRFileDesc

Review of attachment 8366754 [details] [diff] [review]:
-----------------------------------------------------------------

Thus far we've tried to keep our dependency on NSPR quite limited.  Do you suppose you could change the API to take an 'int fd' instead so we can just use plain posix freads instead?  MSDN suggests this will just work on MSVC.

Jan Varga [:janv]

Assignee

Comment 5

•

10 years ago

ah, ok, I'll rework it using plain posix freads.

Jan Varga [:janv]

Assignee

Comment 6

•

10 years ago

Attached patch patch v2 (obsolete) — Details — Splinter Review

Attachment #8366754 - Attachment is obsolete: true

Attachment #8366754 - Flags: review?(luke)

Attachment #8366857 - Flags: review?(luke)

Luke Wagner [:luke]

Comment 7

•

10 years ago

Comment on attachment 8366857 [details] [diff] [review]
patch v2

Review of attachment 8366857 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/vm/StructuredClone.cpp
@@ +44,5 @@
>  
>  #include "jscntxtinlines.h"
>  #include "jsobjinlines.h"
>  
> +#define FILE_COPY_BUFFER_SIZE 32768

Could you move this to a static const unsigned inside JSAutoStructuredCloneBuffer::copy?

@@ +1688,5 @@
> +bool
> +JSAutoStructuredCloneBuffer::copy(FILE *fd, size_t size, uint32_t version)
> +{
> +    uint64_t *newData = static_cast<uint64_t *>(js_malloc(size));
> +    if (!newData)

Could you use a ScopedJSFreePtr<uint64_t>?

@@ +1707,5 @@
> +        }
> +
> +        js_memcpy(pos, copyBuffer, bytesRead);
> +        pos += bytesRead;
> +    } while (true);

Could you add a JS_ASSERT((char*)newData + size == (char*)pos) at the end of the loop?

Attachment #8366857 - Flags: review?(luke) → review+

Jan Varga [:janv]

Assignee

Updated

•

10 years ago

Depends on: 965084

Jonas Sicking (:sicking) No longer reading bugmail consistently

Comment 8

•

10 years ago

I think storing Blobs in the database if they are small enough makes a lot of sense.

However I don't think we should be in a hurry to add the complexity involved in storing large structured clone data outside of the database until we have data showing that there's significant performance improvements.

Storing large data in such that it doesn't get in the way of performance of database lookups seems like the job of SQLite. Do we know what it does as the data gets large?

Storing values of the size of 100MB is never going to be fast. No matter where the data is stored we're going to have to read 100MB into memory. This is hardly a common case.

The fact that the games code is doing it right now is hopefully temporary. As I understand it it's a workaround since they need synchronous access to a filesystem. So they keep the whole filesystem in memory. Once we have something like sync IDB or the sync message channel, that will be solved in much better ways by only loading the data that's needed into memory.

Jan Varga [:janv]

Assignee

Comment 9

•

10 years ago

Well, it seems that collapsing of small Blobs is harder than storing structured clones outside of the database.
I have the latter mostly implemented.

I agree that long term, they need real sync access to a filesystem, but in order to do that they also need WebGL on workers (that's what Luke told me), so it will take some time. For now, we could fix the crashes at least, by lowering number of allocated buffers and storing large structured clones externally.

Luke Wagner [:luke]

Comment 10

•

10 years ago

(In reply to Jonas Sicking (:sicking) from comment #8)
> Storing large data in such that it doesn't get in the way of performance of
> database lookups seems like the job of SQLite. Do we know what it does as
> the data gets large?

Copies happen even before it gets into the hands of SQLite.  It's copies all the way down :)

> Storing values of the size of 100MB is never going to be fast. No matter
> where the data is stored we're going to have to read 100MB into memory. This
> is hardly a common case.

It doesn't need to be fast, it needs to avoid keeping multiple full-size copies of the structured clone buffer in memory simultaneously.  It seems like this would be really important for FFOS as well where the memory spike from much smaller payloads would unacceptable.  Gregor was telling me yesterday he has seen this be a problem.

> The fact that the games code is doing it right now is hopefully temporary.

Do you expect that *everything* a game needs will be available in Web Workers in less than a year?

Jan Varga [:janv]

Assignee

Comment 11

•

10 years ago

btw, https://www.sqlite.org/intern-v-extern-blob.html, that's the page where I found the 100k boundary.

Jan Varga [:janv]

Assignee

Comment 12

•

10 years ago

Attached patch Part 1: Add JSAutoStructuredCloneBuffer::Copy() which takes a FILE* (obsolete) — Details — Splinter Review

Attachment #8366857 - Attachment is obsolete: true

Attachment #8367607 - Flags: review?(luke)

Luke Wagner [:luke]

Updated

•

10 years ago

Attachment #8367607 - Flags: review?(luke) → review+

Jan Varga [:janv]

Assignee

Comment 13

•

10 years ago

I posted all the patches on try:
https://tbpl.mozilla.org/?tree=Try&rev=43244dabf359

and also with enforced external storage of structured clones:
https://tbpl.mozilla.org/?tree=Try&rev=2ec7c5676cf4

I will need some help here to do some better testing on mobile before landing.

Jan Varga [:janv]

Assignee

Updated

•

10 years ago

Depends on: 965982

Jan Varga [:janv]

Assignee

Comment 14

•

10 years ago

try run with default "max internal blob size" (100 KB on android, 1 MB on other platforms):
https://tbpl.mozilla.org/?tree=Try&rev=a449f823796e

try run with "max internal blob size" set to 0 (every structured clone stored externally):
https://tbpl.mozilla.org/?tree=Try&rev=0bc8c6d7c317

The bc failures are caused by mochitest timeouts, it's interesting that especially windows 7 and windows 8 is very slow when storing a lot of tiny files.

Jan Varga [:janv]

Assignee

Comment 15

•

10 years ago

Attached patch patch v1 (obsolete) — Details — Splinter Review

ok, this is the main patch

Attachment #8368504 - Flags: review?(bent.mozilla)

Jan Varga [:janv]

Assignee

Updated

•

10 years ago

Depends on: 966233

Luke Wagner [:luke]

Comment 16

•

10 years ago

Random comment on HN complaining about the specific issue this bug solves:
https://news.ycombinator.com/item?id=7068171

Jan Varga [:janv]

Assignee

Comment 17

•

10 years ago

Ok, so who is going to make the final decision whether to take this patch or not ?

Jonas Sicking (:sicking) No longer reading bugmail consistently

Comment 18

•

10 years ago

It's unclear to me where this patch is attempting to prevent copies? I.e. what copies that the current code does is this patch attempting to eliminate? The fact that we're writing to sqlite rather than a separate file doesn't seem to be what creates more copies.

Also, do we really need a new column here? Couldn't we simply write the file-id in the existing file-ids column and leave the data empty to indicate that the data lives in a file rather than the database?

Jonas Sicking (:sicking) No longer reading bugmail consistently

Comment 19

•

10 years ago

But to more concretely answer your question, I think it's Ben's decision what to do here.

Jan Varga [:janv]

Assignee

Comment 20

•

10 years ago

(In reply to Jonas Sicking (:sicking) from comment #18)
> It's unclear to me where this patch is attempting to prevent copies? I.e.
> what copies that the current code does is this patch attempting to
> eliminate? The fact that we're writing to sqlite rather than a separate file
> doesn't seem to be what creates more copies.

I think mozStorage or sqlite creates a copy, but there's also the compression which is done before writing to sqlite, so it seems that this patch eliminates 2 copies.

> Also, do we really need a new column here? Couldn't we simply write the
> file-id in the existing file-ids column and leave the data empty to indicate
> that the data lives in a file rather than the database?

Yes, that might be possible, good catch.

Jonas Sicking (:sicking) No longer reading bugmail consistently

Comment 21

•

10 years ago

> I think mozStorage or sqlite creates a copy,

Ben thought otherwise, but I'll leave it to him.

> but there's also the compression which is done before writing to sqlite,

Compression seems orthogonal to where we store things. If want to eliminate the copy that compression creates, we can do that while still writing into the database, no?

Jan Varga [:janv]

Assignee

Comment 22

•

10 years ago

(In reply to Jonas Sicking (:sicking) from comment #21)
> > I think mozStorage or sqlite creates a copy,
> 
> Ben thought otherwise, but I'll leave it to him.

We have this comment in IDBObjectStore.cpp:
// This will hold our compressed data until the end of the method. The
// BindBlobByName function will copy it.

but it seems it's not correct, hm

> 
> > but there's also the compression which is done before writing to sqlite,
> 
> Compression seems orthogonal to where we store things. If want to eliminate
> the copy that compression creates, we can do that while still writing into
> the database, no?

Well, that's possible, but snappy doesn't support compression in chunks :(
and switching to some other library doesn't sound like a good solution.

Anyway, I still think that we should support storing big structured clones in separate files. We can set the limit high, like several megabytes. There always be people who store big things in the database and then complain about performance.

Jan Varga [:janv]

Assignee

Comment 23

•

10 years ago

I don't know all the internals of sqlite, but I have a feeling that even when we eliminate the extra copy for compression, it's better to not store huge blobs in the database.

I think we should also ask someone from sqlite devs.

Flags: needinfo?(drh)

Jonas Sicking (:sicking) No longer reading bugmail consistently

Comment 24

•

10 years ago

Storing things outside of the database for performance reasons seems like a good idea. It just sounded like we were storing it outside of the database in order to eliminate copies which didn't seem accurate. We can choose to store both compressed and uncompressed data both inside and outside the database.

D. Richard Hipp

Comment 25

•

10 years ago

(In reply to Jan Varga [:janv] from comment #23)
> I don't know all the internals of sqlite, but I have a feeling that even
> when we eliminate the extra copy for compression, it's better to not store
> huge blobs in the database.
> 
> I think we should also ask someone from sqlite devs.

For "smaller" blobs, it is faster to store them in the SQLite database.  SQLite is slower to write content to disk (because of the extra formatting it has to do, and because of transaction control) but storing in the database eliminates an open()/close() operation, which can be expensive.

The cutoff for what constitutes a "small" blob varies by OS and hardware.  For tests we ran on Linux, the cutoff point was about 100KB.  In other words, for blobs smaller than about 100KB, it is faster to store them as fields in an SQLite database than it is to write them into separate files.  See http://www.sqlite.org/intern-v-extern-blob.html for details.

Note that SQLite rewrites an entire row whenever you change any column in that row.  So if you do put massive blobs in a database, it is best to put them in their own table that assigned each blob an integer tracking number:

    CREATE TABLE somekindablob(id INTEGER PRIMARY KEY, content BLOB);

Then use the integer tracking number in the table that contains all the other properties of the blob.  Or if you really, really must put the large blob in the same table with other fields, at least put the large blob at the very end, so that the other fields can be read without having to seek past the entire blob.

Flags: needinfo?(drh)

Jan Varga [:janv]

Assignee

Comment 26

•

10 years ago

(In reply to D. Richard Hipp from comment #25)
> Or if you really, really must put the large
> blob in the same table with other fields, at least put the large blob at the
> very end, so that the other fields can be read without having to seek past
> the entire blob.

bent already did this optimization

Ben Turner (not reading bugmail, use the needinfo flag!)

Reporter

Comment 27

•

10 years ago

mozStorage does make a copy of the blob before we write it to SQLite. This is the call order:

  1. http://dxr.mozilla.org/mozilla-central/source/dom/indexedDB/IDBObjectStore.cpp?from=idbobjectstore.cpp#3204
  2. http://dxr.mozilla.org/mozilla-central/source/storage/src/mozStorageBindingParams.cpp#351
  3. http://dxr.mozilla.org/mozilla-central/source/storage/src/Variant.h#320
  4. http://dxr.mozilla.org/mozilla-central/source/storage/src/Variant.h#245

This is horrible, and we should fix it.

Ben Turner (not reading bugmail, use the needinfo flag!)

Reporter

Comment 28

•

10 years ago

1. I think we should pick a threshold, maybe 50k based on the chart drh provided in comment 25.
2. We should ask snappy how much space it needs to store our blob or structured clone data.
3. Anything smaller than the threshold should be compressed and stored in the database. I don't think we need to change our schema at all for this.
4. Anything larger should be stored outside the database.
5. I think it makes sense to store blobs uncompressed (they're likely in a format that is already compressed anyway).
6. Structured clone data maybe should always be compressed since our format is not very efficient (lots of 0s last I looked). However, this requires increased memory to give snappy a buffer to work with. We could try tricks like stitching together chunked snappy buffers but we'd need to see if that costs more performance than the speedup we get from writing less data.

Kyle Huey (Exited; not receiving bugmail, old account, do not use)

Comment 29

•

10 years ago

I spun off comment 27 as bug 979445.

Ben Turner (not reading bugmail, use the needinfo flag!)

Reporter

Comment 30

•

10 years ago

Comment on attachment 8368504 [details] [diff] [review]
patch v1

Canceling review until we get a new patch that avoids the extra column.

Attachment #8368504 - Flags: review?(bent.mozilla)

Jan Varga [:janv]

Assignee

Comment 31

•

10 years ago

When I force to store all structured clones externally all tests pass except browser_quotaPromptDelete.js and browser_quotaPromptDeny.js:
https://tbpl.mozilla.org/?tree=Try&rev=1daa655e88d4
It's not a big deal, this is just an edge case that shouldn't happen during normal browser-chrome-mochitest runs, but it would be better to fix it.
However, it seems that I need to fix bug 856921 first.

Depends on: 856921

Jan Varga [:janv]

Assignee

Comment 32

•

10 years ago

Attached patch Part 2: Use the size of structured clone data to choose whether it will be stored in the database or in the filesystem (obsolete) — Details — Splinter Review

Attachment #8368504 - Attachment is obsolete: true

Attachment #8400666 - Flags: review?(bent.mozilla)

Jan Varga [:janv]

Assignee

Comment 33

•

10 years ago

Attached patch Part 3: Fix tests (obsolete) — Details — Splinter Review

Attachment #8400667 - Flags: review?(bent.mozilla)

Jan Varga [:janv]

Assignee

Comment 34

•

10 years ago

https://tbpl.mozilla.org/?tree=Try&rev=b0d8c77a2611

Jan Varga [:janv]

Assignee

Comment 35

•

10 years ago

Attached patch Part 2: Use the size of structured clone data to choose whether it will be stored in the database or in the filesystem (obsolete) — Details — Splinter Review

Replaced custom FileAutoCloser with ScopedCloseFile from FileUtils.h

Attachment #8401073 - Flags: review?(bent.mozilla)

Jan Varga [:janv]

Assignee

Updated

•

10 years ago

Attachment #8400666 - Attachment is obsolete: true

Attachment #8400666 - Flags: review?(bent.mozilla)

Luke Wagner [:luke]

Comment 36

•

10 years ago

When the structured clone data is stored as a file, will it also be compressed?

Jan Varga [:janv]

Assignee

Comment 37

•

10 years ago

(In reply to Luke Wagner [:luke] from comment #36)
> When the structured clone data is stored as a file, will it also be
> compressed?

The current patch doesn't compress it to avoid additional memory copy since snappy doesn't support streaming at the moment. We could do the trick Ben mentioned, but that can be done later.
Hm, this reminds me that I wanted to create a flag for this, so it would be easy to add such support in future, w/o converting/compressing all stored files. Thanks for reminding me :)

Jan Varga [:janv]

Assignee

Updated

•

10 years ago

Attachment #8401073 - Flags: review?(bent.mozilla)

Jonas Sicking (:sicking) No longer reading bugmail consistently

Comment 38

•

10 years ago

I didn't look too closely at the patch, but do make sure that we properly handle things like

oS.put({ a: reallyHugeArrayBuffer, b: blob1, c: blob2 });

Also make sure to test this of course.

Jan Varga [:janv]

Assignee

Comment 39

•

10 years ago

(In reply to Jonas Sicking (:sicking) from comment #38)
> I didn't look too closely at the patch, but do make sure that we properly
> handle things like
> 
> oS.put({ a: reallyHugeArrayBuffer, b: blob1, c: blob2 });
> 
> Also make sure to test this of course.

That should work correctly, but I'll add a test for it.
I canceled the review because I forgot to add a flag for the compression of externally stored structured clone data.

Jan Varga [:janv]

Assignee

Comment 40

•

10 years ago

Attached patch Part 2: Use the size of structured clone data to choose whether it will be stored in the database or in the filesystem (obsolete) — Details — Splinter Review

Attachment #8401073 - Attachment is obsolete: true

Attachment #8406806 - Flags: review?(bent.mozilla)

Jan Varga [:janv]

Assignee

Comment 41

•

10 years ago

I'll attach the test separately.

(no longer active)

Comment 42

•

10 years ago

Jan, is this urgent or can we get to this after Jamun is done?  Thanks!

Flags: needinfo?(Jan.Varga)

Ben Turner (not reading bugmail, use the needinfo flag!)

Reporter

Comment 43

•

10 years ago

Also maybe a question for luke ^

Flags: needinfo?(luke)

Luke Wagner [:luke]

Comment 44

•

10 years ago

I don't know of any super-pressing needs for this feature, just that we've hit this (comment 1, and again more recently) and had to tell people to change the code to store incrementally in smaller ABs.

Flags: needinfo?(luke)

Jan Varga [:janv]

Assignee

Comment 45

•

10 years ago

(In reply to :Ehsan Akhgari (lagging on bugmail, needinfo? me!) from comment #42)
> Jan, is this urgent or can we get to this after Jamun is done?  Thanks!

Yeah, we'll get to it after Jamun is done.

Flags: needinfo?(Jan.Varga)

Jan Varga [:janv]

Assignee

Comment 46

•

10 years ago

Comment on attachment 8400667 [details] [diff] [review]
Part 3: Fix tests

Clearing review request for now.

Attachment #8400667 - Flags: review?(bent.mozilla)

Jan Varga [:janv]

Assignee

Updated

•

10 years ago

Attachment #8406806 - Flags: review?(bent.mozilla)

Jan Varga [:janv]

Assignee

Comment 47

•

9 years ago

I'm going to try to resurrect this patch.

Ben Turner (not reading bugmail, use the needinfo flag!)

Reporter

Comment 48

•

9 years ago

IDB code should be pretty stable now, the only other relevant big change pending is locale-aware indexes (bug 871846).

Jan Varga [:janv]

Assignee

Comment 49

•

8 years ago

I have a patch queue for this:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=dfb8873f8462600cd2bdcb6f1240fe15144dfcd1

It's a prerequisite for WASM in IDB.

This version also supports compression of externally stored structured clone data w/o
creating flat buffers.
So there are no big buffer copies. I created an input stream wrapper for structured clone data
and compression is done using the snappy compression output stream.

I also upgraded the snappy library to the latest version which gives better compression ratio
and is also slightly faster.

Jan Varga [:janv]

Assignee

Updated

•

8 years ago

Blocks: 1276029

Jan Varga [:janv]

Assignee

Updated

•

8 years ago

Depends on: 1272855, 768074

Jan Varga [:janv]

Assignee

Updated

•

8 years ago

Attachment #8367607 - Attachment is obsolete: true

Jan Varga [:janv]

Assignee

Updated

•

8 years ago

Attachment #8400667 - Attachment is obsolete: true

Jan Varga [:janv]

Assignee

Updated

•

8 years ago

Attachment #8406806 - Attachment is obsolete: true

Jan Varga [:janv]

Assignee

Updated

•

8 years ago

Status: NEW → ASSIGNED

Jan Varga [:janv]

Assignee

Updated

•

8 years ago

Blocks: 1311466

Jan Varga [:janv]

Assignee

Updated

•

8 years ago

No longer blocks: 1276029

Jan Varga [:janv]

Assignee

Comment 50

•

8 years ago

Attached patch Part 1: Use a type in StructuredCloneFile instead of a boolean; r=asuth — Details — Splinter Review

Attachment #8802722 - Flags: review?(bugmail)

Jan Varga [:janv]

Assignee

Comment 51

•

8 years ago

Attached patch Part 2: Refactor file ids handling for better expandability; r=asuth — Details — Splinter Review

Attachment #8802723 - Flags: review?(bugmail)

Jan Varga [:janv]

Assignee

Comment 52

•

8 years ago

Attached patch Part 3: Implement an input stream wrapper around structured clone data; r=asuth — Details — Splinter Review

Attachment #8802724 - Flags: review?(bugmail)

Jan Varga [:janv]

Assignee

Comment 53

•

8 years ago

Attached patch Part 4: Update keys directly in the structured clone buffer; r=asuth — Details — Splinter Review

Attachment #8802725 - Flags: review?(bugmail)

Jan Varga [:janv]

Assignee

Comment 54

•

8 years ago

Attached patch Part 5: Add a data threshold pref; r=asuth — Details — Splinter Review

Attachment #8802726 - Flags: review?(bugmail)

Jan Varga [:janv]

Assignee

Comment 55

•

8 years ago

Attached patch Part 6: Core changes for storing structured clone data outside of the database; r=asuth — Details — Splinter Review

Attachment #8802727 - Flags: review?(bugmail)

Jan Varga [:janv]

Assignee

Comment 56

•

8 years ago

Attached patch Part 7: Compress externally stored structured clone data; r=asuth — Details — Splinter Review

Attachment #8802728 - Flags: review?(bugmail)

Jan Varga [:janv]

Assignee

Comment 57

•

8 years ago

Attached patch Part 8: Disable externally stored structured clone data in file tests; r=asuth — Details — Splinter Review

Attachment #8802729 - Flags: review?(bugmail)

Jan Varga [:janv]

Assignee

Comment 58

•

8 years ago

Attached patch Part 9: Add a new test for externally stored structured clone data; r=asuth — Details — Splinter Review

Attachment #8802730 - Flags: review?(bugmail)

Jan Varga [:janv]

Assignee

Comment 59

•

8 years ago

try looks good
https://treeherder.mozilla.org/#/jobs?repo=try&revision=45818f1983265c40812f06078a0344f7becfb727

Jan Varga [:janv]

Assignee

Updated

•

8 years ago

Attachment #8802729 - Attachment description: Part 8: Disable externally stored structured clone dat a in file tests; r=asuth → Part 8: Disable externally stored structured clone data in file tests; r=asuth

Jan Varga [:janv]

Assignee

Updated

•

8 years ago

Attachment #8802730 - Attachment description: Part 9: Add a new test for externally stored structure d clone data; r=asuth → Part 9: Add a new test for externally stored structured clone data; r=asuth

Jan Varga [:janv]

Assignee

Comment 60

•

8 years ago

Comment on attachment 8802725 [details] [diff] [review]
Part 4: Update keys directly in the structured clone buffer; r=asuth

Review of attachment 8802725 [details] [diff] [review]:
-----------------------------------------------------------------

This is needed because when we compress externally stored structured clone data we don't need to create one big flat buffer. We read from structured clone buffers directly, so the key must be updated there. The writing is a bit complicated since we can cross buffers after each char of the key in theory.

Jan Varga [:janv]

Assignee

Comment 61

•

8 years ago

Comment on attachment 8802726 [details] [diff] [review]
Part 5: Add a data threshold pref; r=asuth

Review of attachment 8802726 [details] [diff] [review]:
-----------------------------------------------------------------

I decided to have 1 MB threshold for now. We can lower it once we are confident there are no bugs related to this new feature.

Jan Varga [:janv]

Assignee

Comment 62

•

8 years ago

Comment on attachment 8802727 [details] [diff] [review]
Part 6: Core changes for storing structured clone data outside of the database; r=asuth

Review of attachment 8802727 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/indexedDB/ActorsParent.cpp
@@ +25726,5 @@
> +  if (mDataOverThreshold) {
> +    // Higher 32 bits reserved for flags like compressed/uncompressed. For now,
> +    // we don't compress externally stored structured clone data since snappy
> +    // doesn't support streaming yet and we don't want to allocate another
> +    // large memory buffer for that.

This comment is updated in following patch for compression.

Jan Varga [:janv]

Assignee

Comment 63

•

8 years ago

Comment on attachment 8802729 [details] [diff] [review]
Part 8: Disable externally stored structured clone data in file tests; r=asuth

Review of attachment 8802729 [details] [diff] [review]:
-----------------------------------------------------------------

We need this just in case someone runs all the tests with the threshold set to 0.
In that case all structured clones need to allocate a file id.
However, file tests have static assumptions about blob file ids, so file tests don't pass.
It would be really hard and messy to fix that.

Andrew Sutherland [:asuth] (he/him)

Comment 64

•

8 years ago

Comment on attachment 8802722 [details] [diff] [review]
Part 1: Use a type in StructuredCloneFile instead of a boolean; r=asuth

Review of attachment 8802722 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/indexedDB/ActorsParent.cpp
@@ +8162,5 @@
>  };
>  
>  struct ObjectStoreAddOrPutRequestOp::StoredFileInfo final
>  {
> +  enum Type

note to other readers: Happily this duplicated enum type is removed in bug 1311466's part 1 patch.

Attachment #8802722 - Flags: review?(bugmail) → review+

Andrew Sutherland [:asuth] (he/him)

Updated

•

8 years ago

Attachment #8802723 - Flags: review?(bugmail) → review+

Andrew Sutherland [:asuth] (he/him)

Updated

•

8 years ago

Attachment #8802724 - Flags: review?(bugmail) → review+

Andrew Sutherland [:asuth] (he/him)

Updated

•

8 years ago

Attachment #8802725 - Flags: review?(bugmail) → review+

Andrew Sutherland [:asuth] (he/him)

Comment 65

•

8 years ago

Comment on attachment 8802726 [details] [diff] [review]
Part 5: Add a data threshold pref; r=asuth

Review of attachment 8802726 [details] [diff] [review]:
-----------------------------------------------------------------

Looking forward to having the threshold lowered!  I expect the target would want to be something in the neighborhood of PAGE_SIZE/20 (per https://www.sqlite.org/withoutrowid.html#when_to_use_without_rowid) since object_data is WITHOUT ROWID.  Of course, depending on per-file overhead, it may make sense to adjust the value upwards and/or consider introducing an additional file type for sizes from PAGE/20 up to 4*PAGE_SIZE where we use a dedicated blob table in the database.  Until/unless we have a blob pool database, a value around PAGE_SIZE might be least bad.

Specifically, when referring to per-file overhead, I mean:
- file system overhead for opening a new file, especially:
  - if we continue to use a flat directory structure and run the risk of file-systems not being happy about us accumulating tens of thousands of files in a single directory and encountering scale problems there.
  - if we find that antivirus software is increasing latencies.  We definitely would want to benchmark with a few AV solutions.
- disk waste/storage bloat if the file-system requires that at least one disk sector is used per file, etc.

::: dom/indexedDB/IndexedDatabaseManager.cpp
@@ +398,5 @@
>  #endif
>    Preferences::RegisterCallbackAndCall(LoggingModePrefChangedCallback,
>                                         kPrefLoggingEnabled);
>  
> +  Preferences::RegisterCallbackAndCall(DataThresholdPrefChangedCallback,

note for other readers: I had a question about using the callback idiom versus adding more atomic-capable cache variants like AddAtomicUintVarCache but part 8 adds logic that upgrades -1 to INT32_MAX which seems like an appropriate use for continuing the existing pattern.  (The AtomicBoolPrefChangedCallback could use a new atomic *Cache variant if the atomic bit is really necessary, but the churn doesn't seem worth it anytime soon.)

Attachment #8802726 - Flags: review?(bugmail) → review+

Andrew Sutherland [:asuth] (he/him)

Comment 66

•

8 years ago

Comment on attachment 8802727 [details] [diff] [review]
Part 6: Core changes for storing structured clone data outside of the database; r=asuth

Review of attachment 8802727 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/indexedDB/ActorsParent.cpp
@@ +4118,5 @@
>  nsresult
> +UpgradeSchemaFrom24_0To25_0(mozIStorageConnection* aConnection)
> +{
> +  // The only change between 24 and 25 was a different file_ids format,
> +  // but it's backwards-compatible.

Note for other readers: The "data" column is created with BLOB type affinity which means no conversion/coercion is performed by SQLite, so there is no schema change to worry about.

@@ +9586,5 @@
>    int32_t id;
>    StructuredCloneFile::Type type;
>  
> +  bool isDot = false;
> +  if ((isDot = aText.First() == '.') ||

Note for other readers: This lint-nit gets mooted in bug 1311466's part 6.

@@ +25748,3 @@
>      size_t compressedLength = snappy::MaxCompressedLength(uncompressedLength);
>  
> +    // We don't have a smart pointer class that calls free, so we need to

UniquePtr supports destructor policies and UniquePtrExtensions.h defines a FreePolicy that can be used.  I don't think this comment/code got mooted in subsequent patches, but I could be wrong.

::: dom/indexedDB/PBackgroundIDBSharedTypes.ipdlh
@@ +53,5 @@
>  };
>  
>  union BlobOrMutableFile
>  {
> +  null_t;

Note for other readers: The ambiguity with NullableMutableFile is eliminated in bug 1311466's Part 2 which eliminates NullableMutableFile by extracting PBackgroundMutableFile up into the BlobOrMutableFile union.

Attachment #8802727 - Flags: review?(bugmail) → review+

Andrew Sutherland [:asuth] (he/him)

Updated

•

8 years ago

Attachment #8802728 - Flags: review?(bugmail) → review+

Andrew Sutherland [:asuth] (he/him)

Comment 67

•

8 years ago

Comment on attachment 8802729 [details] [diff] [review]
Part 8: Disable externally stored structured clone data in file tests; r=asuth

Review of attachment 8802729 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/indexedDB/IndexedDatabaseManager.cpp
@@ +262,2 @@
>      Preferences::GetInt(aPrefName, kDefaultDataThresholdBytes);
> +  if (dataThresholdBytes == -1) {

This could benefit from a simple comment like your comment 63.  Ex:
// The magic -1 is for use only by tests that depend on stable blob file id's.

Attachment #8802729 - Flags: review?(bugmail) → review+

Andrew Sutherland [:asuth] (he/him)

Updated

•

8 years ago

Attachment #8802730 - Flags: review?(bugmail) → review+

Jan Varga [:janv]

Assignee

Comment 68

•

8 years ago

(In reply to Andrew Sutherland [:asuth] from comment #65)
> Comment on attachment 8802726 [details] [diff] [review]
> Part 5: Add a data threshold pref; r=asuth
> 
> Review of attachment 8802726 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Looking forward to having the threshold lowered!  I expect the target would
> want to be something in the neighborhood of PAGE_SIZE/20 (per
> https://www.sqlite.org/withoutrowid.html#when_to_use_without_rowid) since
> object_data is WITHOUT ROWID.  Of course, depending on per-file overhead, it
> may make sense to adjust the value upwards and/or consider introducing an
> additional file type for sizes from PAGE/20 up to 4*PAGE_SIZE where we use a
> dedicated blob table in the database.  Until/unless we have a blob pool
> database, a value around PAGE_SIZE might be least bad.
> 
> Specifically, when referring to per-file overhead, I mean:
> - file system overhead for opening a new file, especially:
>   - if we continue to use a flat directory structure and run the risk of
> file-systems not being happy about us accumulating tens of thousands of
> files in a single directory and encountering scale problems there.
>   - if we find that antivirus software is increasing latencies.  We
> definitely would want to benchmark with a few AV solutions.
> - disk waste/storage bloat if the file-system requires that at least one
> disk sector is used per file, etc.

Yeah, these are excellent points. However implementation of a blob pool is a complex task. Changing the flat directory structure doesn't look so complex and could be done sooner. But as you said, many small files can slow down things significantly, we should be careful about lowering the threshold.
Anyway, I think we will be in much better shape with the 1MB limit, since it removes the need for big flat buffers and the SQLite database won't be stressed so much.

Jan Varga [:janv]

Assignee

Comment 69

•

8 years ago

Attached patch Part 10: Fix a possible leak in BindAdoptedBlobByName()/BindAdoptedBlobByIndex(); r=mak — Details — Splinter Review

I think these methods are expected to free the value even when an error occurs. The AdoptedBlobVariant does the job, but NS_ENSURE_ARG_MAX can trigger an early return (before AdoptedBlobVariant is created).

Attachment #8803847 - Flags: review?(mak77)

Jan Varga [:janv]

Assignee

Comment 70

•

8 years ago

(In reply to Andrew Sutherland [:asuth] from comment #66)
> @@ +25748,3 @@
> >      size_t compressedLength = snappy::MaxCompressedLength(uncompressedLength);
> >  
> > +    // We don't have a smart pointer class that calls free, so we need to
> 
> UniquePtr supports destructor policies and UniquePtrExtensions.h defines a
> FreePolicy that can be used.  I don't think this comment/code got mooted in
> subsequent patches, but I could be wrong.

Ok, fixed. I also found a possible leak in mozStorage code. See patch Part 10.

Jan Varga [:janv]

Assignee

Comment 71

•

8 years ago

(In reply to Andrew Sutherland [:asuth] from comment #67)
> @@ +262,2 @@
> >      Preferences::GetInt(aPrefName, kDefaultDataThresholdBytes);
> > +  if (dataThresholdBytes == -1) {
> 
> This could benefit from a simple comment like your comment 63.  Ex:
> // The magic -1 is for use only by tests that depend on stable blob file
> id's.

Ok, added.

Marco Bonardo [:mak] (Away Apr 25 - May 5)

Comment 72

•

8 years ago

Comment on attachment 8803847 [details] [diff] [review]
Part 10: Fix a possible leak in BindAdoptedBlobByName()/BindAdoptedBlobByIndex(); r=mak

Review of attachment 8803847 [details] [diff] [review]:
-----------------------------------------------------------------

LGTM. fwiw, Andrew can review any change to Storage as well (and I honestly think you should act as a peer as well!)

Attachment #8803847 - Flags: review?(mak77) → review+

Jan Varga [:janv]

Assignee

Comment 73

•

8 years ago

(In reply to Marco Bonardo [::mak] from comment #72)
> Comment on attachment 8803847 [details] [diff] [review]
> Part 10: Fix a possible leak in
> BindAdoptedBlobByName()/BindAdoptedBlobByIndex(); r=mak
> 
> Review of attachment 8803847 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> LGTM. fwiw, Andrew can review any change to Storage as well (and I honestly
> think you should act as a peer as well!)

Oh, ok, will do next time :)

Jan Varga [:janv]

Assignee

Comment 74

•

8 years ago

Ok, this is ready, but I want to wait for bug 1311466 and land them together.
The other bug extends the format of file_ids column too. We don't need to add another schema upgrade of IDB databases if we land them together.

Jan Varga [:janv]

Assignee

Comment 75

•

8 years ago

Heads-up: Bug 1302036 just landed and increased the IDB schema version, I updated appropriate patch for this bug to use the same schema since both land at the same day. If there are any backouts I'll fix things to get it right.

Pulsebot

Comment 76

•

8 years ago

Pushed by jvarga@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/12213ad49dc0
Part 1: Use a type in StructuredCloneFile instead of a boolean; r=asuth
https://hg.mozilla.org/integration/mozilla-inbound/rev/e2166a89ee40
Part 2: Refactor file ids handling for better expandability; r=asuth
https://hg.mozilla.org/integration/mozilla-inbound/rev/a1fcb8001d2a
Part 3: Implement an input stream wrapper around structured clone data; r=asuth
https://hg.mozilla.org/integration/mozilla-inbound/rev/97e29df357f7
Part 4: Update keys directly in the structured clone buffer; r=asuth
https://hg.mozilla.org/integration/mozilla-inbound/rev/94f3aca1fe73
Part 5: Add a data threshold pref; r=asuth
https://hg.mozilla.org/integration/mozilla-inbound/rev/34a3a5034620
Part 6: Core changes for storing structured clone data outside of the database; r=asuth
https://hg.mozilla.org/integration/mozilla-inbound/rev/34b5720fa068
Part 7: Compress externally stored structured clone data; r=asuth
https://hg.mozilla.org/integration/mozilla-inbound/rev/9113736a76ce
Part 8: Disable externally stored structured clone data in file tests; r=asuth
https://hg.mozilla.org/integration/mozilla-inbound/rev/1e2c6afa8a5e
Part 9: Add a new test for externally stored structured clone data; r=asuth
https://hg.mozilla.org/integration/mozilla-inbound/rev/e1c649956724
Part 10: Fix a possible leak in BindAdoptedBlobByName()/BindAdoptedBlobByIndex(); r=mak

Jan Varga [:janv]

Assignee

Comment 77

•

8 years ago

Attached patch A follow-up fix. Pass correct buffer size to CopyFileData(); r=asuth — Details — Splinter Review

Attachment #8804440 - Flags: review?(bugmail)

Andrew Sutherland [:asuth] (he/him)

Updated

•

8 years ago

Attachment #8804440 - Flags: review?(bugmail) → review+

Pulsebot

Comment 78

•

8 years ago

Pushed by jvarga@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/97d5eb039b32
A follow-up fix. Pass correct buffer size to CopyFileData(); r=asuth

Carsten Book [:Tomcat]

Comment 79

•

8 years ago

bugherder

https://hg.mozilla.org/mozilla-central/rev/12213ad49dc0
https://hg.mozilla.org/mozilla-central/rev/e2166a89ee40
https://hg.mozilla.org/mozilla-central/rev/a1fcb8001d2a
https://hg.mozilla.org/mozilla-central/rev/97e29df357f7
https://hg.mozilla.org/mozilla-central/rev/94f3aca1fe73
https://hg.mozilla.org/mozilla-central/rev/34a3a5034620
https://hg.mozilla.org/mozilla-central/rev/34b5720fa068
https://hg.mozilla.org/mozilla-central/rev/9113736a76ce
https://hg.mozilla.org/mozilla-central/rev/1e2c6afa8a5e
https://hg.mozilla.org/mozilla-central/rev/e1c649956724
https://hg.mozilla.org/mozilla-central/rev/97d5eb039b32

Status: ASSIGNED → RESOLVED

Closed: 8 years ago

status-firefox52: --- → fixed

Resolution: --- → FIXED

Target Milestone: --- → mozilla52

You need to log in before you can comment on or make changes to this bug.