Closed Bug 1373089 Opened 7 years ago Closed 5 years ago

Use Sets rather than objects to store available locales in Intl API

Categories

(Core :: JavaScript: Internationalization API, enhancement, P3)

enhancement

Tracking

()

RESOLVED FIXED
mozilla71
Tracking Status
firefox57 --- wontfix
firefox71 --- fixed

People

(Reporter: kmag, Assigned: anba)

References

Details

(Whiteboard: [MemShrink:P2])

Attachments

(7 files, 1 obsolete file)

I was looking into a memory regression using the Dominator tree view of the devtools memory panel, and happened to notice that I had two _availableLocales objects in every window that were ~48K each, which about 36K of just shape data.

It seems this is caused by the fact that my system knows about ~750 locales, and we're storing a property for each on the availableLocales object, and a Shape object for each property.

It seems like we'd be much better off using Sets for these objects. They would be much more memory efficient, and I doubt that they'd be any slower.
(extLangMappings, incidentally, has a similar issue, but with only 10K of shape data per window.)
Attached patch bug1373089.patch (obsolete) — Splinter Review
Patch to use Sets instead of Objects for the available locales, in case someone wants to measure the memory impact if we switch the underlying storage type. The patch also makes the `extlangMappings` object a bit smaller by avoiding to store duplicate information (extlangMappings' keys and the value of the corresponding "preferred" properties are always the same strings, so we actually only need store the "prefix" values).
Comment on attachment 8884566 [details] [diff] [review]
bug1373089.patch

Review of attachment 8884566 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/builtin/Intl.cpp
@@ +817,2 @@
>      uint32_t count = countAvailable();
>      for (uint32_t i = 0; i < count; i++) {

Drive-by perf comment, feel free to ignore:

Given kmag was seeing 750 of these we might consider other perf improvements as well. We're effectively doing two strlens on the same data (one for the dup, one for the atomize).

Additionally in the while loop we're iterating on lang (the beginning of the string) rather than p + 1 (the current location in the string).

We could also avoid alloc/free churn by copying into a reasonably sized stack buffer instead.
Whiteboard: [MemShrink] → [MemShrink:P2]
Priority: -- → P3

Both caches (localeCandidateCache and localeCache) are keyed on the same value, so when there's a
cache miss, it always happens in both caches at the same time. That means the caching in
DefaultLocaleIgnoringAvailableLocales can be removed and instead we can directly inline the code
into its sole caller.

This will allow to share these hash-sets across multiple globals.

Depends on D40381

When we move the default locale computation to C++, we don't have to worry
about keeping the last-ditch locale and old-style locale mappings synchronised
between JS and C++ code.

Depends on D40384

Assignee: nobody → andrebargull
Status: NEW → ASSIGNED
Type: defect → enhancement
Depends on: 1570370
Attachment #8884566 - Attachment is obsolete: true

The new patches series does not only change the representation from JS objects to hash-sets, but additionally also moves the hash-sets into SharedIntlData, which will allow to reuse them across multiple globals.

(In reply to Eric Rahm [:erahm] (Out until Aug 6th) from comment #3)

Drive-by perf comment, feel free to ignore:

Given kmag was seeing 750 of these we might consider other perf improvements
as well. We're effectively doing two strlens on the same data (one for the
dup, one for the atomize).

Additionally in the while loop we're iterating on lang (the beginning of the
string) rather than p + 1 (the current location in the string).

We could also avoid alloc/free churn by copying into a reasonably sized
stack buffer instead.

Applied all three suggestions in the new patches. :-)

Blocks: 1433306
Blocks: 1568760

Because:

  • udat_countAvailable and unum_countAvailable directly return the result of uloc_countAvailable,
  • and udat_getAvailable and unum_getAvailable directly return the result of uloc_getAvailable,

we don't need to keep the list of DateFormat and NumberFormat available locales in separate
hash-sets, but instead can reuse the set used for ULocale.

Depends on D40385

Pushed by dluca@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/949651526ead
Part 1: Inline DefaultLocaleIgnoringAvailableLocales into its sole caller. r=jwalden
https://hg.mozilla.org/integration/autoland/rev/07abc37fc753
Part 2: Add LanguageTag::toString function. r=jwalden
https://hg.mozilla.org/integration/autoland/rev/72e1f59d79d7
Part 3: Add available-locales set to SharedIntlData. r=jwalden
https://hg.mozilla.org/integration/autoland/rev/9f6157252362
Part 4: Move BestAvailableLocale function to C++. r=jwalden
https://hg.mozilla.org/integration/autoland/rev/40a509a1f81f
Part 5: Remove no longer used functions. r=jwalden
https://hg.mozilla.org/integration/autoland/rev/8d6a70e38898
Part 6: Move default locale computation to C++. r=jwalden
https://hg.mozilla.org/integration/autoland/rev/d504123bd527
Part 7: Remove separate sets for DateFormat and NumberFormat available locales. r=jwalden

Keywords: checkin-needed
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: