Closed Bug 914414 Opened 11 years ago Closed 10 years ago

Refactor l10n.js

Categories

(Firefox OS Graveyard :: Gaia::L10n, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: stas, Assigned: stas)

References

Details

(Keywords: perf, Whiteboard: [perf-reviewed])

Attachments

(2 files, 5 obsolete files)

The first step of switching Gaia over to L20n (bug 913712) should minimize the impact of the change;  we therefore simply aim to replace the shared/js/l10n.js file.  No changes to the API nor the file format are being made at this point.

We created a branch of the main L20n codebase which supports the .properties files and is fully compatible with the navigator.mozL10n API.  This means that no changes are need to the l10n infrastructure, the build infrastructure and the developer process.  Everything should just work as it did before.

The code is developed in https://github.com/l20n/l20n.js/tree/gaia and bugs can be filed against the L20n Product in Bugzilla (Component: Gaia Bindings).  Test results are at https://travis-ci.org/l20n/l20n.js

Switching to this special version of L20n already brings a number of improvements:

 - modular code which encapsulates main concepts of localization with a clean OOP approach,
 - clean start-up which prevents race conditions on app launch and on language change,
 - robust language negotiation via the Intl object,
 - smart language fallback using the negotiatied fallback chain,
 - much better error reporting, especially on build time,
 - increased security thanks to the compilation step.

Other features of L20n can be implemented over time as progressive improvements.
Attached file Use L20n/webl10n (pull request #12050) (obsolete) —
https://github.com/mozilla-b2g/gaia/pull/12050

The pull request is split into a few commits (which I'll squash later) for better overview of the changes.

Some minimal changes were needed to the build scripts (because of the change in how the inlined JSON looks like and where we take the information about the available locales). The pull request also updates a few tests which made too many assumptions about how l10n.sj was loaded.

The last commit lands the actual L20n code in shared/js/l10n.js.  It deserves perhaps a little bit more explanation:

The structure of L20n in https://github.com/l20n/l20n.js/tree/gaia (names of modules are derived from folder and file names):
 - ./lib/l20n/ - the core library
 - ./lib/client/l20n - clientside overrides (for I/O)
 - ./bindings/l20n - platform bindings

We recommend to review the components in the following order, starting from most isolated classes, to more algorithmic code:

1) event.js - simple EventEmitter

2) io.js - simple XHR based I/O

3) intl.js 
 - subset of ECMA 402 implementation extended to support prioritizeLocales giving us full language negotiation
 - once we standardize prioritizeLocales in second edition of ECMA 402 we will be able to remove this
 - read more https://mail.mozilla.org/pipermail/es-discuss/2013-July/031771.html

4) plurals.js - pretty close copy of the l10n.js:getPluralRule

5) parser.js 
 - .properties parser.
 - one thing to notice is that we initially parse every string as "simple" and only mark it for lazy complexString resolution

6) compiler.js - the compiler class creates JavaScript objects and functions out of the AST that is the result of the parsing.  This provides a runtime environment for localization and helps safe-guard the evaluation of entities and macro expressions.  The full version of L20n supports custom macros and the compilation step ensures no malicious code is being interpreted. Since shared/js/l10n.js doesn't support custom macros, we turned them off as well in this version, but by keeping the compiler we're ready to re-add them in the future with little effort.

7) context.js - context has three classes:
  - Resource - single resource object,
  - Locale - single locale object with multiple resources for a given locale id,
  - Context - core class of L20n which manages loading resources, resolving them and locale fallback.

8) webl10n.js
  - platform bindings. HTML node translation, pretranslation, INI parsing, public events
Assignee: nobody → stas
Status: NEW → ASSIGNED
Attachment #801911 - Flags: review?(21)
Attached patch as a single patch (obsolete) — Splinter Review
attaching pull request as a patch for review
Attachment #813746 - Flags: review?(21)
This is an updated patch which:

 - fixes failing unit tests:  https://travis-ci.org/stasm/gaia/builds/12545604
 - improves performance by removing the Intl object (not needed at this stage because the language negotiation against navigator.language is simple) and by optimizing parsing (previously, strings were parsed lazily, but since we store everything in JSON anyways, it's faster to parse everything up-front).

The performance results are as follows.  The deltas are not statistically significant.

(median)      master    patch     delta
---------------------------------------
browser	        536       526       -10
calendar        641       639        -2
camera          420       426        +6
clock           557       547       -10
contacts        715       709        -6
email           317       314        -3
gallery         431       443       +12
messages        581       586        +5
music           529       534        +5
phone           582       587        +5
settings	775       768        -7
video           486       480        -6

(How I tested:

  Gecko: https://github.com/mozilla/mozilla-central/commit/05cf2965be979befc68c19954567f59c56e14d6a
  Gaia master: https://github.com/mozilla-b2g/gaia/commit/8ab98212881cfe656ef9d1e1fcfff586f3e24568

  make clean && make production MAKECMDGOALS=production MOZILLA_OFFICIAL=1 NOFTU=1

  python b2gperf.py --iterations 30 --settle-time 10 --delay 10 browser…
)

Vivien, can you take a look?
Attachment #813746 - Attachment is obsolete: true
Attachment #813746 - Flags: review?(21)
Attachment #816913 - Flags: review?(21)
The numbers from Comment 3 are from Keon device.


For Unagi we measured an average of 4ms difference, on Peak the average difference out of means for those 12 apps is 0ms.
Attachment mime type: text/plain → text/x-github-pull-request
Blocks: 809600
Comment on attachment 816913 [details] [diff] [review]
Refactor l10n.js (tests fixed, perf improved)

Review of attachment 816913 [details] [diff] [review]:
-----------------------------------------------------------------

This is just my start at r? this patch. I more or less follow the instructions in https://bugzilla.mozilla.org/show_bug.cgi?id=914414#c1 at first and then I end up reading the code starting from load sequence to understand how stuffs are working together and what is the workflow. So I basically did: 1), 2) and 8). 
3) does not exists anymore and I don't mind about 4) since this is basically a copy of the old code.

As a result I still need to dig into (5, 6 and 7) the Parser, the Context and the Compiler and the various objects it is spawning. I have started to look at it to understand how the many pieces interact but not into too much details yet.

Some general things that I'm thinking about when reading the code. There is a lot of |function something() {}| that the browser needs to parse at load time. I wonder if we should not |var something = function() | for functions that may not be needed by all apps. I'm thinking mostly about buildMessage, flushBuildMessages, getSubDictionary, parseIni and stuff like that. I don't think it worth doing that in this patch but I wonder if we can save a few ms this way.

Reading the code again I wonder if moving the the io/parser/compiler code into a separate worker may be useful. I know you guys are thinking of doing some stuff to add more locales directly from a server. Again this is not for this patch and for now spawning a worker is expensive in memory iirc since they basically need to start a new JS runtime.

Globally I like the architecture I'm seeing here, stuffs are logically separated and this sounds great. But one of the concern here is memory, this code will spawn much more objects than the previous and especially for the compiler part if I have read it correctly. Can you guys do a small memory analysis on a few apps?
I think that the main process, the homescreen and the settings app with a few panels opened are good candidates. What I'm usually doing to check memory is to open the app, way a little bit for it to stabilized and then turn off the screen (this should force a garbage collection). Then I monitor the application memory usage by looking at the USS part while doing: |watch -n 1 'adb shell b2g-procrank'|. I'm interested in the results here.

Related to memory I wonder if you don't want to use more prototype stuff for Locale and Resource since if iiuc there could be multiple Locale per context and multiple Resource per context. 

I will continue reviewing the patch asap. You will also noticed that there is a lot of nits, I really like explicit naming instead of shortcuts and line breaks (they are free!).

::: .jshintignore
@@ +15,4 @@
>  apps/keyboard/js/imes/jspinyin/libpinyin.js
>  shared/js/setImmediate.js
>  apps/gallery/test/unit/setImmediate_test.js
> +shared/js/l10n.js

Actually people are pushing hard to have more files working with jshint. Some argues that it helps them to find bugs. Let's not regress that by adding more files to this.

Is there anything in particular that prevents you to have your code working with jshint ?

::: build/webapp-optimize.js
@@ +332,5 @@
>    var resources = doc.querySelectorAll('link[type="application/l10n"]');
>    if (resources.length) {
> +    let meta = doc.createElement('meta');
> +    meta.name = 'locales';
> +    meta.content = l10nLocales.join(',');

Same thing than the comments into shared/js/l10n.js. Please remove this as it could be discussed into a separate bug and will likely ends into endless discussion in this bug and this is way out of the scope of this bug.

::: shared/js/l10n.js
@@ +1,1 @@
> +(function(window, undefined) {

Let's keep the small vim header.

@@ +1,5 @@
> +(function(window, undefined) {
> +
> +function define(name, payload) {
> +  define.modules[name] = payload;
> +};

nit: extra ;

@@ +13,5 @@
> +  var parts = path.split('/');
> +  var normalized = [];
> +  for (var i = 0; i < parts.length; i++) {
> +    if (parts[i] == '.') {
> +      // don't add it to `normalized`

nit: add |continue;| That will avoid a warning with some linter.

@@ +43,3 @@
>  
> +  var module = define.modules[name];
> +  if (typeof module == "function") {

nit: ' -> " all over the places

@@ +47,5 @@
> +    var reply = module(req.bind(null, name), exports, { id: name, uri: "" });
> +    module = (reply !== undefined) ? reply : exports;
> +  }
> +  return define.exports[name] = module;
> +};

nit: extra ;

@@ +54,5 @@
> +var require = req.bind(null, null);
> +define('l20n/webl10n', function(require, exports, module) {
> +  'use strict';
> +
> +  var DEBUG = false;

Since this variable is used only for logMessage, let's declare it just above the logMessage declaration.

@@ +57,5 @@
> +
> +  var DEBUG = false;
> +
> +  var L20n = require('../l20n');
> +  var io = require('./platform/io');

Can you declare io in bootstrap since this is the only place where it is needed for l20n/webl10n ?

@@ +61,5 @@
> +  var io = require('./platform/io');
> +
> +  var ctx;
> +  var isBootstrapped = false;
> +  var isPretranslated;

= false ?

@@ +62,5 @@
> +
> +  var ctx;
> +  var isBootstrapped = false;
> +  var isPretranslated;
> +  var rtlLocales = ['ar', 'fa', 'he', 'ps', 'ur'];

Move rtlLocales into createPublicAPI since this is where it is used.

@@ +63,5 @@
> +  var ctx;
> +  var isBootstrapped = false;
> +  var isPretranslated;
> +  var rtlLocales = ['ar', 'fa', 'he', 'ps', 'ur'];
> +  var buildMessages;

Move buildMessages closer to addMessage and created the property in addMessages if needed.

@@ +72,2 @@
>      }
> +    document.addEventListener('readystatechange', function() {

nit: Add a name to the anonymous function

@@ +73,5 @@
> +    document.addEventListener('readystatechange', function() {
> +      if (document.readyState === state) {
> +        callback();
> +      }
> +    });

This method will throw a warning since it does not always return a function. I assume |return callback();| is a shortcut where what you want is | return; \n callback();|.

Also let's declare waitFor below the initialization block just above, since this is the first thing that the function does and it makes it clearer for an external reader.

@@ +84,5 @@
> +        isBootstrapped = false;
> +        buildMessages = {
> +          error: [],
> +          warn: []
> +        };

As said above move this initialization in addMessage and create properties if they are needed.

@@ +93,5 @@
> +      },
> +      enumerable: true
> +    });
> +  } else {
> +    ctx = L20n.getContext();

Seems like you always want a context, so you can |var ctx = L20n.getContext();| in the variable declaration at the top of the function.

@@ +98,5 @@
> +    navigator.mozL10n = createPublicAPI(ctx);
> +    ctx.addEventListener('error', logMessage.bind(null, 'error'));
> +    ctx.addEventListener('warning', logMessage.bind(null, 'warn'));
> +
> +    isPretranslated = document.documentElement.lang === navigator.language;

I think it worth adding a comment explaining that the build system set the lang of the document if it has been translated at build time.

Maybe it even worth mentionning the different scenarios that can happen for a web page? Not sure that everybody in Gaia knows them, this is probably unknown by third party contributors.

@@ +100,5 @@
> +    ctx.addEventListener('warning', logMessage.bind(null, 'warn'));
> +
> +    isPretranslated = document.documentElement.lang === navigator.language;
> +
> +    if (isPretranslated) {

I guess we can |if (isPrelocalized || document.readyState === 'complete') |

@@ +104,5 @@
> +    if (isPretranslated) {
> +      // if the DOM has been pretranslated, defer bootstrap as long as possible
> +      waitFor('complete', function() {
> +        window.setTimeout(bootstrap);
> +      });

Do an early return to avoid some indentation and to make clear that our work here is done if the page has been pretranslated.

@@ +119,5 @@
> +  }
> +
> +  function pretranslate() {
> +    if (inlineLocalization()) {
> +      // if inlineLocalization succeeded, defer bootstrap

This comment is not really helpful. You probably want to remove it or to tweak it as: // if some resources has been directly inserted as a JSON dictionary inside the document, defer bootstrap until the page is fully loaded.

@@ +133,5 @@
> +    var body = document.body;
> +    var scripts = body.querySelectorAll('script[type="application/l10n"]');
> +    if (scripts.length === 0) {
> +      return false;
> +    }

nit: add a line break

@@ +136,5 @@
> +      return false;
> +    }
> +    var inline = L20n.getContext();
> +    inline.addEventListener('error', logMessage.bind(null, 'error'));
> +    inline.addEventListener('warning', logMessage.bind(null, 'warn'));

Why do we need a new context and can we not reuse the one created by |var ctx = L20n.getContext();| ?

@@ +144,5 @@
> +      var lang = scripts[i].getAttribute('lang');
> +      langs.push(lang);
> +      // pass the node to save memory
> +      inline.addDictionary(scripts[i], lang);
> +    }

nit: add a line break

@@ +145,5 @@
> +      langs.push(lang);
> +      // pass the node to save memory
> +      inline.addDictionary(scripts[i], lang);
> +    }
> +    inline.once(function() {

nit: name the anonymous function.

@@ +148,5 @@
> +    }
> +    inline.once(function() {
> +      translateFragment(inline);
> +      isPretranslated = true;
> +    });

nit: add a line break

@@ +160,5 @@
> +
> +  function bootstrap(forcedLocale) {
> +    isBootstrapped = true;
> +
> +    var availableLocales = [];

This seems to be used only in iniLoaded, so let's move the declaration there.

@@ +164,5 @@
> +    var availableLocales = [];
> +
> +    var head = document.head;
> +    var iniLinks = head.querySelectorAll('link[type="application/l10n"]' + 
> +                                         '[href$=".ini"]');

Move iniLinks closer to where it is used.

@@ +173,5 @@
> +      var uri = jsonLinks[i].getAttribute('href');
> +      ctx.linkResource(uri.replace.bind(uri, /\{\{\s*locale\s*\}\}/));
> +    }
> +
> +    ctx.ready(function() {

nit: add a name to the anonymous function.

@@ +193,5 @@
> +    }
> +
> +    var iniToLoad = iniLinks.length;
> +    if (iniToLoad === 0) {
> +      ctx.registerLocales('en-US', getAvailable());

I'm not sure the behavior of getAvailable is something we want since it basically moves content negociation at the web page level where on the web it is done at the server level afaict and apps already declare the supported languages in the manifest.

I bet that you have a completely different opinion on that :) For the scope of this patch can we remove that for now and discuss it as a separate bug?

@@ +199,4 @@
>        return;
>      }
>  
> +    for (var i = 0; i < iniLinks.length; i++) {

I guess that |i| has already been declared by the previous loop. Sorry there is not |let| here :(

@@ +206,5 @@
> +
> +    function iniLoaded(url, err, text) {
> +      if (err) {
> +        throw err;
> +      }

nit: add a line break

@@ +220,5 @@
> +        ctx.requestLocales(forcedLocale || navigator.language);
> +      }
> +    }
> +
> +  }

Globally the startup code is a bit painful to follow. I assume that it is related to our various startup optimizations and different file formats though.

@@ +229,5 @@
> +      return metaLocs.getAttribute('content').split(',').map(String.trim);
> +    } else {
> +      return [];
> +    }
> +  }

As asked above, can you please remove this code and we can discuss that as a separate bug.

@@ +236,5 @@
> +    ini: {
> +      section: /^\s*\[(.*)\]\s*$/,
> +      import: /^\s*@import\s+url\((.*)\)\s*$/i
> +    }
> +  };

Do we really need the ini indirections or have you planned to support more file formats?

@@ +239,5 @@
> +    }
> +  };
> +
> +  function parseINI(source, iniPath) {
> +    var entries = source.split(/[\r\n]+/);

Does this regexp belongs to patterns ?

@@ +256,1 @@
>        }

nit: add a line break

@@ +273,1 @@
>      }

nit: add a line break

@@ +273,5 @@
>      }
> +    var dirs = baseUrl.split('/')
> +      .slice(0, -1)
> +      .concat(url.split('/'))
> +      .filter(function(elem) {

nit: elem -> path

@@ +284,5 @@
> +  function createPublicAPI(ctx) {
> +    return {
> +      get: function l10n_get(id, data) {
> +        var entity = ctx.getEntity(id, data);
> +        // entity.locale is null if the entity could not be displayed in any of 

nit: extra whitespace

@@ +294,5 @@
> +      },
> +      localize: localizeElement.bind(null, ctx),
> +      translate: translateFragment.bind(null, ctx),
> +      language: {
> +        get code() { 

nit: extra whitespace

@@ +330,2 @@
>  
> +  function addMessage(type, e) {

nit: addMessage -> addBuildMessage to be more consistent.

@@ +345,5 @@
> +    if (buildMessages.error.length) {
> +      console.log('[l10n] [' + ctx.supportedLocales[0] + ']: ' +
> +                  buildMessages.error.length + ' broken ' + variant + ': ' +
> +                  buildMessages.error.join(', '));
> +    }

Seems like you can loop over the property of buildMessages to display your those buildMessages.

@@ +380,5 @@
> +      var entity = ctx.getEntity(attrs.id, attrs.args);
> +      for (var id in entity.identifiers) {
> +        if (!entity.identifiers.hasOwnProperty(id)) {
> +          continue;
> +        }

Do you really need this check since you are iterating over the property of identifiers?

@@ +413,5 @@
>        }
>      }
>      return { id: l10nId, args: args };
>    }
> +  

nit: extra whitespace.
Please clean them all over the places since there is a few of them lurking all over the place.

@@ +446,5 @@
> +  function translateNode(ctx, node) {
> +    var attrs = getL10nAttributes(node);
> +    if (!attrs.id) {
> +      return true;
> +    }

nit: add a line break

@@ +457,5 @@
> +      setTextContent(node, entity.value);
> +    }
> +
> +    for (var key in entity.attributes) {
> +      if (entity.attributes.hasOwnProperty(key)) {

Same comment as a few lines above. I'm unsure why you call hasOwnProperty. I have not read all the code yet so I may have missed some inheritance.

@@ +468,5 @@
> +        }
> +      }
> +    }
> +    return true;
> +  }

Also the old code is doing stuff with aria-label that I don't see here.

@@ +500,5 @@
> +  }
> +  
> +  // translate an array of HTML elements
> +  // -- returns an array of elements that could not be translated
> +  function translateElements(ctx, elements) {

translateNodes since you rename translateElement -> translateNode and you may want to change a little bit the comment.

@@ +522,5 @@
> +    return untranslated;
> +  }
> +
> +  function fireLocalizedEvent(ctx) {
> +    var event = document.createEvent('Event');

createEvent('Event') blabla. new CustomEvent('localized'). I know this is old code but it makes more sense to use a custom event for now.

@@ +529,5 @@
> +    window.dispatchEvent(event);
> +  }
> +
> +  return L20n;
> +});

nit: add a line break or 2.

@@ +536,5 @@
> +
> +  var Context = require('./l20n/context').Context;
> +
> +  exports.Context = Context;
> +  exports.getContext = function L20n_getContext(id) {

I see that most of the calls to getContext does not pass an Id. Also the id sounds unused for now, let's remove it until it has some usage.

@@ +540,5 @@
> +  exports.getContext = function L20n_getContext(id) {
> +      return new Context(id);
> +  };
> +
> +});

nit: add a line break or 2.

@@ +685,5 @@
> +    this.id = id;
> +
> +    this.registerLocales = registerLocales;
> +    this.requestLocales = requestLocales;
> +    this.addResource = addResource;

Let's remove this for now since there is no caller and this is likely an API extension.

@@ +705,5 @@
> +      enumerable: true
> +    });
> +
> +    // language negotiator function
> +    var _negotiator;

Seems unused.

@@ +753,5 @@
> +
> +    function getSource(id) {
> +      if (!_isReady) {
> +        throw new ContextError('Context not ready');
> +      }

nit: add a line break

@@ +759,5 @@
> +      var locale = getLocale(_fallbackChain[current]);
> +      // if the requested id doesn't exist in the first locale, fall back
> +      while (!locale.getEntry(id)) {
> +        warn(new TranslationError('Not found', id, _fallbackChain, locale));
> +        var nextLoc = _fallbackChain[++current];

nit: nextLoc -> nextLocale

@@ +762,5 @@
> +        warn(new TranslationError('Not found', id, _fallbackChain, locale));
> +        var nextLoc = _fallbackChain[++current];
> +        if (!nextLoc) {
> +          return null;
> +        }

nit: add a line break

@@ +904,5 @@
> +        locale.resources.push(res);
> +      }
> +    }
> +
> +    function registerLocales(defLocale, available) {

nit: defLocale -> defaultLocale
available -> availableLocales

@@ +910,5 @@
> +        throw new ContextError('Context is frozen');
> +      }
> +
> +      if (defLocale === undefined) {
> +        return;

Don't you want to throw here as well?

@@ +922,5 @@
> +      }
> +      available.push(defLocale);
> +
> +      // uniquify `available` into `_registered`
> +      available.forEach(function(loc) {

nit: add a name to your anonymous function and loc -> locale

@@ +928,5 @@
> +          throw new ContextError('Language codes must be strings');
> +        }
> +        if (_registered.indexOf(loc) === -1) {
> +          _registered.push(loc);
> +        }

By looking at the usage of _registered, I wonder if you don't want to use a map instead?

@@ +932,5 @@
> +        }
> +      });
> +    }
> +
> +    function negotiate(available, requested, defLoc) {

defLoc -> defaultLocale

@@ +994,5 @@
> +      _isFrozen = true;
> +      _requested = Array.prototype.slice.call(arguments);
> +
> +      if (_requested.length) {
> +        _requested.forEach(function(loc) {

nit: give a name to the anonymous function and loc -> locale

@@ +1089,5 @@
> +    var args = Array.prototype.slice.call(arguments);
> +    var type = args.shift();
> +    if (!this._listeners[type]) {
> +      return false;
> +    }

nit: add a line break.

@@ +1090,5 @@
> +    var type = args.shift();
> +    if (!this._listeners[type]) {
> +      return false;
> +    }
> +    var typeListeners = this._listeners[type].slice();

Why do you need a copy instead of iterating directly on the array?

@@ +1098,5 @@
> +    return true;
> +  };
> +
> +  EventEmitter.prototype.addEventListener = function ee_add(type, listener) {
> +    if (!this._listeners[type]) {

nit: Some JS impl will throw a warning, so let's use |if (!(type in this._listeners))|

(Same for the other calls).

@@ +1102,5 @@
> +    if (!this._listeners[type]) {
> +      this._listeners[type] = [];
> +    }
> +    this._listeners[type].push(listener);
> +    return this;

The current code does not use chaining, so let's remove that.

@@ +1109,5 @@
> +  EventEmitter.prototype.removeEventListener = function ee_rm(type, listener) {
> +    var typeListeners = this._listeners[type];
> +    var pos = typeListeners.indexOf(listener);
> +    if (pos === -1) {
> +      return this;

Same comment as above for chaining. Let's just use |return;| for now.

@@ +1110,5 @@
> +    var typeListeners = this._listeners[type];
> +    var pos = typeListeners.indexOf(listener);
> +    if (pos === -1) {
> +      return this;
> +    }

nit: add a line break;

@@ +1117,5 @@
> +  };
> +
> +  exports.EventEmitter = EventEmitter;
> +
> +});

nit: Add a line break (or 2) between 2 modules. That will make them easier to find by looking quickly at the code.

@@ +1149,5 @@
> +
> +    function parse(source) {
> +      var ast = {};
> +
> +      var entries = source.split(/[\r\n]+/);

Does this regexp belongs to patterns?

@@ +1161,5 @@
> +        while (patterns['multiline'].test(line) && i < entries.length) {
> +          line = line.slice(0, line.length - 1) + entries[++i].trim();
> +        }
> +
> +        var entity_match = line.match(patterns['entity']);

nit: guideline is entity_match -> entityMatch

@@ +1169,5 @@
> +          } catch (e) {
> +            if (e instanceof ParserError) {
> +              _emitter.emit('error', e);
> +              continue;
> +            }

Can we avoid the try/catch here? I feel like this method is a good candidate for jitting and try/catch will prevent that IIRC.

@@ +1191,5 @@
> +
> +      if (attr) {
> +        if (!entry.attrs) {
> +          entry.attrs = {};
> +        }

Use the !(attrs in entry) form to avoid some warning in strict mode (spidermonkey strict mode).

same for other places.

@@ +1252,5 @@
> +      }
> +
> +      var nameElements = name.split('.');
> +
> +      if (nameElements.length > 1) {

nit: if (nameElements.length)

@@ +1255,5 @@
> +
> +      if (nameElements.length > 1) {
> +        var attrElements = [];
> +        attrElements.push(nameElements.pop());
> +        if (nameElements.length > 1) {

nit: Same here

@@ +1283,5 @@
> +                 .replace(/\\{/g, '{')
> +                 .replace(/\\}/g, '}')
> +                 .replace(/\\"/g, '"')
> +                 .replace(/\\'/g, "'");
> +      }

Sounds like you want a unescape method that calls unescapeUnicode and unescapeControlCharacters or similar.

@@ +1339,5 @@
> +
> +    function addEventListener(type, listener) {
> +      if (!_emitter) {
> +        throw Error('Emitter not available');
> +      }

I'm not against additional checks but I believe this can not happen with the current code?

@@ +1346,5 @@
> +
> +    function removeEventListener(type, listener) {
> +      if (!_emitter) {
> +        throw Error('Emitter not available');
> +      }

Same as above.

@@ +1378,5 @@
> +    this.compile = compile;
> +    this.addEventListener = addEventListener;
> +    this.removeEventListener = removeEventListener;
> +
> +    // Private

I'm not against those Public/Private comments but to be consistent you may want them all over the place.

@@ +1397,5 @@
> +
> +    function compile(ast, env) {
> +      if (!env) {
> +        env = {};
> +      }

nit: add a line break. I do like code where blocks are easy to identify, sorry if I'm painful with that.

Basically one of my rule is more or less: add a line break after each blocks.

@@ +1408,5 @@
> +        try {
> +          env[id] = new constructor(id, entry, env);
> +        } catch (e) {
> +          requireCompilerError(e);
> +        }

As I said (above or upper) try/catch prevents jitting as far as I know. We may want to confirm we one JS folks just in case this has changed.

@@ +1413,5 @@
> +      }
> +      return env;
> +    }
> +
> +    function addEventListener(type, listener) {

This one does not check _emitter! \o/

@@ +1427,5 @@
> +    function emit(ctor, message, entry, source) {
> +      var e = new ctor(message, entry, source);
> +      _emitter.emit('error', e);
> +      return e;
> +    }

emit -> emitError ?

@@ +1876,5 @@
> +      xhr.overrideMimeType('text/plain');
> +    }
> +    xhr.onreadystatechange = function() {
> +      if (xhr.readyState == 4) {
> +        if (xhr.status == 200 || xhr.status == 0) {

I know this is old code, but will it be possible to replace this by xhr.addEventListener('load', ...); and xhr.addEventListener('error', ...); ?

@@ +2306,2 @@
>  
> +// close the function defined in build/prefix/microrequire.js

not sure it is needed here since it does not mean anything for third party repository like Gaia.
Comment on attachment 816913 [details] [diff] [review]
Refactor l10n.js (tests fixed, perf improved)

Review of attachment 816913 [details] [diff] [review]:
-----------------------------------------------------------------

That's an awesome review! Thanks so much Vivien!

We're working on applying your feedback, I'm landing updates to https://github.com/zbraniecki/l20n.js/tree/gaia which will be then merged to https://github.com/l20n/l20n.js/tree/gaia and then into pull-request. Since the changes are rather small, I hope that it doesn't block you from moving forward with your review.

Detail comments below:

::: shared/js/l10n.js
@@ +43,3 @@
>  
> +  var module = define.modules[name];
> +  if (typeof module == "function") {

I hope you meant the other way around :)

@@ +93,5 @@
> +      },
> +      enumerable: true
> +    });
> +  } else {
> +    ctx = L20n.getContext();

we actually want to delay this in this scenario.

@@ +136,5 @@
> +      return false;
> +    }
> +    var inline = L20n.getContext();
> +    inline.addEventListener('error', logMessage.bind(null, 'error'));
> +    inline.addEventListener('warning', logMessage.bind(null, 'warn'));

because Context may be frozen and then you cannot add more entities to it.

@@ +193,5 @@
> +    }
> +
> +    var iniToLoad = iniLinks.length;
> +    if (iniToLoad === 0) {
> +      ctx.registerLocales('en-US', getAvailable());

yeah, we'll file a new bug for that and remove that from this patch.

@@ +220,5 @@
> +        ctx.requestLocales(forcedLocale || navigator.language);
> +      }
> +    }
> +
> +  }

yup. Once we port more things from l20n we'll want to have separate startup code for bto and runtime. That will help.

@@ +380,5 @@
> +      var entity = ctx.getEntity(attrs.id, attrs.args);
> +      for (var id in entity.identifiers) {
> +        if (!entity.identifiers.hasOwnProperty(id)) {
> +          continue;
> +        }

yeah, we identified all places where variable is passed from an outside environment and we marked those as not trusted, so we check that to remove this vector of attack.

This is the easiest vector of attack against the current l10n.js code (entity values like "foo {{ __proto__ }}" or "foo {{ toSource }}" work all too well, and "foo {{ watch }}" kills the app).

We made sure that this cannot happen to our code.

@@ +468,5 @@
> +        }
> +      }
> +    }
> +    return true;
> +  }

we'll add this, it has been added after we submitted the patch.

@@ +536,5 @@
> +
> +  var Context = require('./l20n/context').Context;
> +
> +  exports.Context = Context;
> +  exports.getContext = function L20n_getContext(id) {

I'd prefer to keep id here. It helps us in debugging, when we try to identify which document the context belongs to. We will want to use it more.

@@ +928,5 @@
> +          throw new ContextError('Language codes must be strings');
> +        }
> +        if (_registered.indexOf(loc) === -1) {
> +          _registered.push(loc);
> +        }

that sounds reasonable. Didn't know that we can use ES6 magic already :)

@@ +1090,5 @@
> +    var type = args.shift();
> +    if (!this._listeners[type]) {
> +      return false;
> +    }
> +    var typeListeners = this._listeners[type].slice();

to prevent our code from falling into a trap when addEventListener callback removes itself, while we're looping over the list to fire them.

@@ +1117,5 @@
> +  };
> +
> +  exports.EventEmitter = EventEmitter;
> +
> +});

yup, we'll add it to our dryice code.

@@ +1169,5 @@
> +          } catch (e) {
> +            if (e instanceof ParserError) {
> +              _emitter.emit('error', e);
> +              continue;
> +            }

I will test the perf impact, but would prefer not to. It's the best way for us to keep recovery code in parsing.

It also doesn't affect Gaia perf because this code is not used at runtime.
erm, I used the bugzilla review view to add my responses to your comments. Didn't know that bz doesn't know how to deal with that. 

Please read my comments here: https://bugzilla.mozilla.org/page.cgi?id=splinter.html&bug=914414&attachment=816913
Vivien: I filed bugs for two of your suggestions that require BTO changes.
(In reply to Vivien Nicolas (:vingtetun) (:21) from comment #5)

> Some general things that I'm thinking about when reading the code. There is
> a lot of |function something() {}| that the browser needs to parse at load
> time. I wonder if we should not |var something = function() | for functions
> that may not be needed by all apps. I'm thinking mostly about buildMessage,
> flushBuildMessages, getSubDictionary, parseIni and stuff like that. I don't
> think it worth doing that in this patch but I wonder if we can save a few ms
> this way.

For these specific functions, I think we'd prefer to move the build-specific code path to a separate file.

> Reading the code again I wonder if moving the the io/parser/compiler code
> into a separate worker may be useful. I know you guys are thinking of doing
> some stuff to add more locales directly from a server. Again this is not for
> this patch and for now spawning a worker is expensive in memory iirc since
> they basically need to start a new JS runtime.

Yes, this is exactly something we're thinking about and also the reason why the whole API is asynchronous.  I'd like to do more perf testing with and without workers in the future and file bugs as needed.
 
> Globally I like the architecture I'm seeing here, stuffs are logically
> separated and this sounds great. But one of the concern here is memory, this
> code will spawn much more objects than the previous and especially for the
> compiler part if I have read it correctly. Can you guys do a small memory
> analysis on a few apps?

We've been working on memory over past few days, Gandalf or I will post results by the end of the week.

> Related to memory I wonder if you don't want to use more prototype stuff for
> Locale and Resource since if iiuc there could be multiple Locale per context
> and multiple Resource per context. 

I have a WIP branch which does that, but I didn't see any difference memory-consumption-wise.  It may add up eventually, though, so I think it's a good suggestion.

> Actually people are pushing hard to have more files working with jshint.
> Some argues that it helps them to find bugs. Let's not regress that by
> adding more files to this.
> 
> Is there anything in particular that prevents you to have your code working
> with jshint ?

We use jshint in our repo with the following config:  https://github.com/l20n/l20n.js/blob/master/.jshintrc

I think there shouldn't be much problem with passing Gaia's make hint, but I had troubles running it, tbh.  I'll try tomorrow and see if I can make it work.

> ::: build/webapp-optimize.js
> @@ +332,5 @@
> >    var resources = doc.querySelectorAll('link[type="application/l10n"]');
> >    if (resources.length) {
> > +    let meta = doc.createElement('meta');
> > +    meta.name = 'locales';
> > +    meta.content = l10nLocales.join(',');
> 
> Same thing than the comments into shared/js/l10n.js. Please remove this as
> it could be discussed into a separate bug and will likely ends into endless
> discussion in this bug and this is way out of the scope of this bug.

I spent some time yesterday thinking about how we could get rid of this change.  I really like your strategy of dividing the patch into small parts and moving certain decisions to future patches, but unfortunately, I didn't come up with any good solutions.   The idea that the list of available locales is known up-front is very deeply ingrained into L20n.

The list of all available locales is needed for language switching and language negotiation.  Even if we don't use Intl in this patch, L20n doesn't allow switching the language to one that is not already known.  This is done in order to avoid unnecessary network requests.

To use an example, currently if you change language.current and navigator.language to something unsupported, say "xx", l10n.js will try to fetch locales/xx.json or locales/xx.properties as if it was a valid and supported locale.  Only upon the network request errorring out will it fall back to en-US.  L20n, on the other hand, does a simple negotiation between the requested language "xx" and the available languages (e.g. "fr", "en-US", "pl") and instantly knows that "xx" is not available, and uses the second-best language, in this case the one defined with GAIA_DEFAULT_LOCALE.

The consequence of this is that if the list of available locales doesn't have the language you want to switch to (maybe "de" in our example above), you can't switch to it in L20n.  That's why we propose to put the information about available locales in the meta element in HTML.  The alternative would be to fetch and parse locales.ini, which obviously would slow things down.  And because the L20n context freezes at some point (in order to start making network requests), it's not possible to add more locales to the list of available ones.

To sum up, I'd love to be able to move this to a future patch and I appreciate the opportunity, but I think it would be very hard for us to do it :(
Reading my previous comment again, I realize that I hadn't clearly stated the main advantage of L20n's approach.  Thanks to the fact that we know the list of available locales up-front, we are able to avoid unnecessary network requests in some cases.

More importantly, this architecture allows us to scale in the future and expand to scenarios where the language negotiation is done via the Intl object.  We're laying foundations with this patch and it will pay off soon, when we have navigator.languages.

The reason why this works reasonably well for the current l10n.js is that it doesn't use the Locale abstraction at all and instead, all translations are stored in a flat key-value dictionary (the order of reading translations in is important).  With this approach, it's possible to "add a new locale" so to speak, by reading new translations in and overwriting values in the dictionary.  This approach, however, is limiting when it comes to language fallback because l10n.js doesn't know which language the translations are coming from.

With this change, we're making l10n.js ready for the next step of supporting proper language negotiation and fallback, which is one of the goals of the refactor.
Blocks: 936532
Blocks: 936534
I compared memory usage of gaia dropin branch with master from today, on Keon device.

	        master	dropin	delta
Main process	48.62	48.76	-0.14
Homescreen	16.50	16.48	 0.02
Settings	15.24	15.28	-0.04
Clock	        14.98	14.96	 0.02
Contacts	16.22	16.78	-0.56
Email    	18.56	18.46	 0.10
Music   	15.34	15.50	-0.16
Phone   	13.88	14.14	-0.26
Messages	14.00	14.00	 0.00
Browser   	 9.80	 9.80	 0.00
Calendar	13.22	13.38	-0.16
		          AVG	-0.11

Vivien: Is that good enough for you to move forward with the review?
A small addendum:  I think it's clearer when the delta uses 'master' for reference.  So the average change from master to l20n-dropin-master is +0.11 MB, or 110 KB more.

Following Vivien's advice, we moved Resource and Locale classes to use prototypes [1][2].  I tested moving the Context to prototype as well, but I didn't see any improvements in memory consumption.  I might still land that change, as I like the code better.

[1] https://github.com/l20n/l20n.js/commit/28cb711e17e16d4249011579a231182b601a8df3
[2] https://github.com/l20n/l20n.js/commit/965445b2f66f783211bfcfa1b56720dccb2f3401

We optimized the Compiler to not leak the AST nodes after compiling entities [3] and we also explicitly clean the Locale objects after their built and compiled [4].

[3] https://github.com/l20n/l20n.js/commit/572e474d735f7c70f443ad89313dd7acd1527854
[4] https://github.com/l20n/l20n.js/commit/b6ed91ab1c1d0cc5417ce37b3ce4451ecc739c6b

Lastly, we split the whole file into two versions [5]:  a runtime one, which lives in shared/js/l10n.js and which doesn't have all the additional cruft to make it work with webapp-optimize, and a buildtime one, which lives in build/l10n.js, which provides specifal facilities for built-time optimizations.

[5] https://github.com/l20n/l20n.js/commit/06b27dc31e4a5dcf89b302ae9aea7372df156f49

I researched further optimizations for the Compiler, but they often come at the cost of security.
(In reply to Zbigniew Braniecki [:gandalf] from comment #11)
> I compared memory usage of gaia dropin branch with master from today, on
> Keon device.
> 
> 	        master	dropin	delta
> Main process	48.62	48.76	-0.14
> Homescreen	16.50	16.48	 0.02
> Settings	15.24	15.28	-0.04
> Clock	        14.98	14.96	 0.02
> Contacts	16.22	16.78	-0.56
> Email    	18.56	18.46	 0.10
> Music   	15.34	15.50	-0.16
> Phone   	13.88	14.14	-0.26
> Messages	14.00	14.00	 0.00
> Browser   	 9.80	 9.80	 0.00
> Calendar	13.22	13.38	-0.16
> 		          AVG	-0.11
> 
> Vivien: Is that good enough for you to move forward with the review?

Sounds like a big diff for contacts. do you have any idea what's going on here?
Comment on attachment 816913 [details] [diff] [review]
Refactor l10n.js (tests fixed, perf improved)

Due to the memory target we have I don't think we want to regress memory usage in any way. Also from the tests it is also unclear how many panels you have opened. Let's fix a bit the architecture so it does not consume extra memory, and is at least identical to what we have today.

One of my theory here is that the compiler is costly due to the large number of JS objects it generates.

Also since this is a refactor request, please remove the language negociation part. This is fine that it deserves some purpose in the future but as I said this stuff could be discussed into a separate bug.
Attachment #816913 - Flags: review?(21) → review-
Stas, are you still working on this?
(In reply to Dietrich Ayala (:dietrich) from comment #15)
> Stas, are you still working on this?

Yes, I'm hoping to have a new patch early next week.  The WIP uses less memory and is faster than the current l10n.js.
Keywords: perf
Whiteboard: [Memshrink]
Stas, Evelyn and I has been working on bug 958440 as a result to shrink memory takes of JSON files (and loaded JS object) in ./locales-obj/ of app packages. Converting data into typed array will leverage bug 945152 too, making us even more memory efficient.

Are you working on similar thing in l20n.js or is there any plan to do so?
I'm removing the MemShrink tag because "don't regress memory usage when landing a new feature" is not really something we can prioritize, that's just a thing that needs to be done.  If there's something MemShrink can do here to help you to eliminate the regression let me know and we'll be happy to help.
Whiteboard: [Memshrink]
Attached patch Refactor l10n.js (obsolete) — Splinter Review
We decided to branch the work on refactor in order to accomodate for memory target on top of performance one.

The new pull request has better memory characteristics, preserves the 
performance win and passes the tests.

A quick rundown of this refactor's main features:
- better initialization path which leads to fewer race conditions,
- OOP design, with helpful abstractions like Locale class,
- better security features based on the security review L20n got last year (bug 925579),
- leaner AST format which results in smaller memory footprint of the JSON resources.
- sets us on a path for l20n and features like locale fallback, phone/tablet string selection, responsive l10n etc.


Startup time (milliseconds):

                master      refactor      delta
Browser           405          414            9
Calendar          976          967           -9
Camera            904          884          -20
Clock             887          891            4
Contacts         1009          990          -19
Email             691          679          -12
Messages         1282         1287            5
Music             955          893          -62
Phone             825          823           -2
Settings         1606         1552          -54
Usage            1270         1231          -39


                  1.3       refactor      delta
Browser           446          452            6
Calendar          608          604           -4
Camera            497          502            5
Clock             545          543           -2
Contacts          579          580            1
Email             301          305            4
Messages          866          855          -11
Music             461          447          -14
Phone             520          517           -3
Settings         1052         1034          -18
Usage             912          881          -31


Memory:
                master      refactor      delta
Contacts USS    12230        12061         -169
Settings USS    10371        10636          264
Email USS       14944        14928          -17
Phone USS        9521         9452          -68
Contacts GC     10238        10398          160
Settings GC      9469         9427          -42
Email GC        13024        13000          -24
Phone GC         8106         8137           32


                  1.3      refactor       delta
Contacts USS    16537        16369         -168
Settings USS    15068        15060           -8
Email USS       18971        19298          327
Phone USS       13932        13225          -12
Clock USS       15487        15408          -79
Contacts GC     14930        15049          119
Settings GC     14072        14076            4
Email GC        17684        17676           -8
Phone GC        12764        12825           60
Clock GC        13768        13777            9


New pull-request: https://github.com/mozilla-b2g/gaia/pull/15985
Attachment #801911 - Attachment is obsolete: true
Attachment #816913 - Attachment is obsolete: true
Attachment #8373480 - Flags: review?(21)
(In reply to Zbigniew Braniecki [:gandalf] from comment #20)
 
> The new pull request has better memory characteristics, preserves the 
> performance win and passes the tests.

Actually there is one last test remaining for us to fix and I was hoping Vivien or Dietrich could point us to a person who may know what is going on.  I'm out of ideas.  The net error page looks different with the refactored l10n.js, as you can see here: http://imgur.com/a/dp4dK.  The failing test can be seen here: https://travis-ci.org/mozilla-b2g/gaia/builds/18620421.
No sooner than I submitted the previous comment did I find out what was going on.  Apparently the net error page never fires an onreadystatechange event for the 'complete' value, on which we are relying. I'll be investigating this further today.
I think the issue which I described in comment 22 is related to bug 444165.  I'm working on a work-around.  The patch is ready for review anyways.
Whiteboard: [perf-reviewed]
Stas, will this refactor impact bug 811136, or does that require a different fix?
I'm not sure if we can reproduce the problem on master. While working on this refactor we tried to minimize the risk of reflows (we dont translate html at startup on production for default locale at runtime) so we may fix that.

I need str to confirm if we fix it.
Any progress here? It seems this bug had gone dark for now reason.
Flags: needinfo?(stas)
It looks like it was sitting in the review queue for more than a month?
Flags: needinfo?(stas) → needinfo?(21)
(In reply to Tim Guan-tin Chien [:timdream] (MoCo-TPE) (please ni?) from comment #27)
> It looks like it was sitting in the review queue for more than a month?

Yeah. I postpone the review until the end of 1.4 as it won't make sense for 1.4 and the stability target. That's on my todo list for 1.5.
Flags: needinfo?(21)
Attached file Pull request
Attached patch Refactor l10n.js, take 2 (obsolete) — Splinter Review
Here's an updated patch which uses less memory by relying on a smaller number of objects created in the runtime. The compiler is much simpler: the object model has been reduced to the minimum without compromising the goal of the feature parity with the current l10n.js library.
Attachment #8373480 - Attachment is obsolete: true
Attachment #8373480 - Flags: review?(21)
Attachment #8400079 - Flags: review?(21)
Comment on attachment 8400079 [details] [diff] [review]
Refactor l10n.js, take 2

Review of attachment 8400079 [details] [diff] [review]:
-----------------------------------------------------------------

Beside my line break addiction, I know that yesterday I was fine with having a dedicated l10n.js file for build time but I was kind of hoping that there will be only a simple file that overidde some of the shared/js/l10n.js file. Looking at all the code duplication with this version it does not seems a good idea anymore :/ 

r- but that's mostly a lot of nits.
The 2 big things are:
 - a better approach for the build l10n.js specific solution.
 - more tests

::: shared/js/l10n.js
@@ +1,3 @@
>  'use strict';
>  
> +(function(window, undefined) {

Let's move the 'use strict' instruction inside the function declaration.

@@ +1,3 @@
>  'use strict';
>  
> +(function(window, undefined) {

Also while this file is beeing completely refactored can you remove it from build/jshint/xfail.list and fix the related messages ? (it can be done as a followup but that seems like a good time to do it).

@@ +18,5 @@
> +            callback(null, xhr.responseText);
> +          } else {
> +            callback(new Error('Not found: ' + url));
> +          }
> +        }

Any reasons to not use xhr.onload/onerror ? Or an event listener ?

@@ +62,5 @@
>  
> +  EventEmitter.prototype.emit = function ee_emit() {
> +    if (!this._listeners) {
> +      return;
> +    }

nit: add a lin break after code blocks that returns.

@@ +88,5 @@
>  
> +  EventEmitter.prototype.removeEventListener = function ee_rm(type, listener) {
> +    if (!this._listeners) {
> +      return;
> +    }

nit: add a line break

@@ +576,5 @@
> +        continue;
> +      }
> +
> +      while (patterns.multiline.test(line) && i < entries.length) {
> +        line = line.slice(0, line.length - 1) + entries[++i].trim();

line.slice() without arguments may do what you are looking for here. But that's basically copying the entire content right ?

If yes, can't you simply use += ?

@@ +580,5 @@
> +        line = line.slice(0, line.length - 1) + entries[++i].trim();
> +      }
> +
> +      var entityMatch = line.match(patterns.entity);
> +      if (entityMatch && entityMatch.length === 3) {

Can you add a comment explaining why |3| ?

@@ +589,5 @@
> +            ctx._emitter.emit('error', e);
> +          } else {
> +            throw e;
> +          }
> +        }

I'm not sure try/catch is a good idea here. IIRC it will prevent this code to be eventually jitted.

@@ +611,1 @@
>      }

nit: add a line break

@@ +611,5 @@
>      }
> +    if (!key) {
> +      obj[prop] = value;
> +      return;
> +    } else {

No need for the else statement here. Let's save some indentations.

@@ +622,5 @@
> +          obj[prop]._[key] = value;
> +        } else {
> +          obj[prop]._[key] = value;
> +        }
> +      }

Can't you simply remove the 3 statements that are doing:

obj[prop]._[key] = value; in the if/else block and do that once after this block ?

@@ +624,5 @@
> +          obj[prop]._[key] = value;
> +        }
> +      }
> +    }
> +    return;

Not sure the return is useful here :)

@@ +627,5 @@
> +    }
> +    return;
> +  }
> +
> +  function parseEntity(match, ast) {

Instead of passing the match here which belongs more or less to the regexp of the parse method, can you direcly pass value/key ?

@@ +628,5 @@
> +    return;
> +  }
> +
> +  function parseEntity(match, ast) {
> +    var name, key, attr, pos;

nit: declare |attr| closer to where it will be used.

@@ +650,5 @@
> +        // special quirk to comply with webl10n's behavior
> +        if (nestedProps.indexOf(
> +              nameElements[nameElements.length - 1]) !== -1) {
> +                attrElements.push(nameElements.pop());
> +              }

This code and it's indentation seems a bit weird to me. You're not iterating over all the nameElements items, but you check if nameElements[nameElements.length - 1] is inside nestedProps and add only the last elements of nameElements to attrElements.

What happens to the possible elements in the middle ? (I may misunderstood this code).

@@ +671,5 @@
> +      .replace(/\\f/g, '\f')
> +      .replace(/\\{/g, '{')
> +      .replace(/\\}/g, '}')
> +      .replace(/\\"/g, '"')
> +      .replace(/\\'/g, '\'');

Any way that the multiple .replace calls here can be factorize into one single call that use a single regexp that use capture parenthesis ?

@@ +683,3 @@
>  
> +  function unescapeString(str) {
> +    if (str.lastIndexOf('\\') !== -1) {

Oh! I didn't know lastIndexOf!

@@ +725,5 @@
>  
> +  Entity.prototype.resolve = function E_resolve(ctxdata) {
> +    if (this.dirty) {
> +      return undefined;
> +    }

nit: add a line break

@@ +731,5 @@
> +    var val;
> +    // if resolve fails, we want the exception to bubble up and stop the whole
> +    // resolving process;  however, we still need to clean up the dirty flag
> +    try {
> +      val = resolve(this.value, this.env, ctxdata, this.index);

It's a bit confusing that this method name |resolve| calls an other method call resolve that does not belongs to this object but is part of the global scope. I first thought it was a typo.

@@ +734,5 @@
> +    try {
> +      val = resolve(this.value, this.env, ctxdata, this.index);
> +    } finally {
> +      this.dirty = false;
> +    }

Can you explain a little bit what the dirty flag is used for ? It seems like it is used as a kind of guard, but what do you want to prevent and in which cases it can happens that this code is reentrant.

@@ +743,5 @@
> +    try {
> +      return this.resolve(ctxdata);
> +    } catch (e) {
> +      return undefined;
> +    }

The try catch combo you use by not catching the error in resolve, but catching it in the caller is a bit confusing.

Can't you simply manage errors inside this.resolve and returns val as undefined from there ?

@@ +753,1 @@
>      }

nit: add a line break

@@ +753,5 @@
>      }
> +    var entity = {
> +      value: this.toString(ctxdata),
> +      attributes: {}
> +    };

nit: add a line break

@@ +758,5 @@
> +    for (var key in this.attributes) {
> +      if (this.attributes.hasOwnProperty(key)) {
> +        entity.attributes[key] = this.attributes[key].toString(ctxdata);
> +      }
> +    }

nit: add a line break

@@ +767,5 @@
> +    if (ctxdata && ctxdata.hasOwnProperty(id) &&
> +        (typeof ctxdata[id] === 'string' ||
> +         (typeof ctxdata[id] === 'number' && !isNaN(ctxdata[id])))) {
> +      return ctxdata[id];
> +    }

nit: add a line break

@@ +774,5 @@
> +        env[id] = new Entity(id, env[id], env);
> +      }
> +      var value = env[id].resolve(ctxdata);
> +      if (typeof value === 'string') {
> +        if (value.length >= MAX_PLACEABLE_LENGTH) {

Just curious. Where does the value of MAX_PLACEABLE_LENGTH comes from ?

@@ +785,3 @@
>    }
>  
> +  function interpolate(str, env, ctxdata) {

Let's move ctxdata as the first argument for all the methods that use it That will bring a bit of consistency. Then maybe env when it is needed ? The variable arguments parts beeing at the end.

@@ +785,4 @@
>    }
>  
> +  function interpolate(str, env, ctxdata) {
> +    var placeables_count = 0;

s/placeables_count/placeablesCount because of guidelines, python addicts!

@@ +787,5 @@
> +  function interpolate(str, env, ctxdata) {
> +    var placeables_count = 0;
> +    var value = str.replace(rePlaceables, function(match, id) {
> +      if (placeables_count++ >= MAX_PLACEABLES) {
> +        throw new Error('Too many placeables');

For this message you may want to dump the current number of placeables, and the max number. (Same comment for the MAX_PLACEABLES_LENGTH).

@@ +843,5 @@
> +    this.id = id;
> +    this.ctx = ctx;
> +    this.isReady = false;
> +    this.entries = {
> +      __plural: getPluralRule(id)

nit: Any reasons for a double _ instead of a single ?

@@ +847,5 @@
> +      __plural: getPluralRule(id)
> +    };
> +  }
> +
> +  Locale.prototype.getEntry = function L_getEntry(id) {

do a |var entries = this.entries;| and use it into this function.

@@ +850,5 @@
> +
> +  Locale.prototype.getEntry = function L_getEntry(id) {
> +    if (!this.entries.hasOwnProperty(id)) {
> +      return undefined;
> +    }

nit: add a line break

@@ +853,5 @@
> +      return undefined;
> +    }
> +    if (this.entries[id] instanceof Entity) {
> +      return this.entries[id];
> +    }

nit: add a line break

@@ +859,5 @@
> +  };
> +
> +  Locale.prototype.build = function L_build(callback) {
> +    var sync = !callback;
> +    var l10nLoads = this.ctx.resLinks.length;

Create a local variable for |ctx| and use it in this method.

@@ +866,5 @@
> +    function onL10nLoaded(err) {
> +      if (err) {
> +        self.ctx._emitter.emit('error', err);
> +      }
> +      if (--l10nLoads <= 0) {

Can we really be below 0 ?

@@ +880,5 @@
> +      return;
> +    }
> +
> +    function onJSONLoaded(err, json) {
> +      if (!err) {

Don't you want to check json instead of err ? Or can it be an error but a json ?

@@ +889,3 @@
>  
> +    function onPropLoaded(err, source) {
> +      if (!err) {

Don't you want to check source instead of err ? Or can it be an error but a source ?

@@ +922,5 @@
> +    var ast = parse(this.ctx, source);
> +    this.addAST(ast);
> +
> +    // on buildtime, we want to keep the AST in order to serialize it into JSON
> +    if (this.ctx.isBuildtime) {

I thought the build time code path was living in the other file ?

@@ +950,5 @@
> +
> +    this.id = id;
> +    this.isBuildtime = false;
> +    this.isReady = false;
> +    this.isFrozen = false;

I wonder if we can call Object.freeze here instead and use Object.isFrozen ? That would ensure there should really be no change in this object once it is declared as frozen.

@@ +976,5 @@
> +
> +    function getWithFallback(id) {
> +      if (!this.isReady) {
> +        throw new ContextError('Context not ready');
> +      }

nit: add a line break

@@ +989,5 @@
> +        }
> +        var entry = locale.getEntry(id);
> +        if (entry === undefined) {
> +          cur++;
> +          warn.call(this, new ContextError(id + ' not found in ' + loc, id,

s/warn/warning

@@ +1002,5 @@
>  
> +    function get(id, ctxdata) {
> +      var entry = getWithFallback.call(this, id);
> +      if (entry === null) {
> +        return '';

nit: add a line break

@@ +1012,5 @@
> +    function getEntity(id, ctxdata) {
> +      var entry = getWithFallback.call(this, id);
> +      if (entry === null) {
> +        return null;
> +      }

nit: add a line break

@@ +1019,5 @@
>  
> +
> +    // Helpers
> +
> +    function getLocale(code) {

var locales = this.locales; and use that.

@@ +1021,5 @@
> +    // Helpers
> +
> +    function getLocale(code) {
> +      if (this.locales[code]) {
> +        return this.locales[code];

nit: add a line break

@@ +1078,5 @@
> +    }
> +
> +    function ready(callback) {
> +      if (this.isReady) {
> +        setTimeout(callback);

Don't you want to early return after that point ?

@@ +1085,5 @@
> +    }
> +
> +    function once(callback) {
> +      if (this.isReady) {
> +        setTimeout(callback);

Don't you want to early return after that point ?

@@ +1186,5 @@
>  
> +  if (window.document) {
> +    isPretranslated = (document.documentElement.lang === navigator.language);
> +
> +    // this is a special case for netError bug

Can you add the bug number next to this line.

@@ +1223,5 @@
> +                         .querySelector('script[type="application/l10n"]' +
> +                         '[lang="' + navigator.language + '"]');
> +    if (!script) {
> +      return false;
> +    }

nit: add a line break

@@ +1261,5 @@
> +    function onIniLoaded(err) {
> +      if (err) {
> +        ctx._emitter.emit('error', err);
> +      }
> +      if (--iniLoads <= 0) {

Could it really be below 0 ?

@@ +1276,5 @@
> +    ctx.requestLocales(navigator.language);
> +    if (navigator.mozSettings) {
> +      navigator.mozSettings.addObserver('language.current', function(event) {
> +        navigator.mozL10n.language.code = event.settingValue;
> +      });

Can you add a comment explaining that his can be removed once bug 780953 is fixed ?

@@ +1462,5 @@
> +      setTextContent(element, entity.value);
> +    }
> +
> +    if (entity.attributes) {
> +      for (var key in entity.attributes) {

Just curious. do we need the entity.attributes check since iterating over attributes will not run when attributes does not exists, or does it will throw an error ?

@@ +1491,5 @@
> +    // (non-blank) child textNode and clear other child textNodes
> +    var found = false;
> +    var reNotBlank = /\S/;
> +    for (var child = element.firstChild; child; child = child.nextSibling) {
> +      if (child.nodeType === 3 && reNotBlank.test(child.nodeValue)) {

s/3/Node.TEXT_NODE
Attachment #8400079 - Flags: review?(21) → review-
Thanks Vivien, I appreciate the thoroughness of the review.  I don't mind the line break addiction, I'm a bit pedantic like that myself ;)

(In reply to Vivien Nicolas (:vingtetun) (:21) from comment #31)

> Beside my line break addiction, I know that yesterday I was fine with having
> a dedicated l10n.js file for build time but I was kind of hoping that there
> will be only a simple file that overidde some of the shared/js/l10n.js file.
> Looking at all the code duplication with this version it does not seems a
> good idea anymore :/ 
> 
> r- but that's mostly a lot of nits.

I fixed all of the nits and removed them from my reply since they were kind of obvious.  Thanks for taking the time to comment on them.

> The 2 big things are:
>  - a better approach for the build l10n.js specific solution.
>  - more tests

We're working on these two things and we'll update the patch tomorrow.


> Also while this file is beeing completely refactored can you remove it from
> build/jshint/xfail.list and fix the related messages ? (it can be done as a
> followup but that seems like a good time to do it).

Yup, good idea, I'll make the necessary changes.


> @@ +18,5 @@
> > +            callback(null, xhr.responseText);
> > +          } else {
> > +            callback(new Error('Not found: ' + url));
> > +          }
> > +        }
> 
> Any reasons to not use xhr.onload/onerror ? Or an event listener ?

webapp-optimize uses onreadystatechange, but Gandalf filed bug 934727 to get rid of it once we're done with this bug.


> @@ +734,5 @@
> > +    try {
> > +      val = resolve(this.value, this.env, ctxdata, this.index);
> > +    } finally {
> > +      this.dirty = false;
> > +    }
> 
> Can you explain a little bit what the dirty flag is used for ? It seems like
> it is used as a kind of guard, but what do you want to prevent and in which
> cases it can happens that this code is reentrant.

It prevents cyclic and recursive references between entities, e.g.:

  foo={{bar}}
  bar={{foo}}

or 

  foo={{foo}}

See https://github.com/l20n/l20n.js/blob/b3a14f2e6d2e973e95480ff8ea55107341e78a08/tests/lib/compiler/primitives.js#L63-L105


> @@ +743,5 @@
> > +    try {
> > +      return this.resolve(ctxdata);
> > +    } catch (e) {
> > +      return undefined;
> > +    }
> 
> The try catch combo you use by not catching the error in resolve, but
> catching it in the caller is a bit confusing.
> 
> Can't you simply manage errors inside this.resolve and returns val as
> undefined from there ?

Entity.toString is part of the public API so it catches errors and returns undefined, but Entity.resolve is an internal method and I need it to throw inside of subPlaceable if things go wrong.

 
> @@ +774,5 @@
> > +        env[id] = new Entity(id, env[id], env);
> > +      }
> > +      var value = env[id].resolve(ctxdata);
> > +      if (typeof value === 'string') {
> > +        if (value.length >= MAX_PLACEABLE_LENGTH) {
> 
> Just curious. Where does the value of MAX_PLACEABLE_LENGTH comes from ?

It's a sensible default that we chose in bug 803931 comment 1, based on the average length of one standard page of text.


> @@ +843,5 @@
> > +    this.id = id;
> > +    this.ctx = ctx;
> > +    this.isReady = false;
> > +    this.entries = {
> > +      __plural: getPluralRule(id)
> 
> nit: Any reasons for a double _ instead of a single ?

It's a convention used by node for library identifiers.  We also forbid entities from starting with __ to avoid conflicts, while a single _ is fine (for forward-compatibility with L20n's syntax).

> @@ +866,5 @@
> > +    function onL10nLoaded(err) {
> > +      if (err) {
> > +        self.ctx._emitter.emit('error', err);
> > +      }
> > +      if (--l10nLoads <= 0) {
> 
> Can we really be below 0 ?

Yes, we also use this callback in the shortcut just below it in which the decrement --l10nLoads will return -1:

if (l10nLoads === 0) {
  onL10nLoaded();
  return;
}

> @@ +922,5 @@
> > +    var ast = parse(this.ctx, source);
> > +    this.addAST(ast);
> > +
> > +    // on buildtime, we want to keep the AST in order to serialize it into JSON
> > +    if (this.ctx.isBuildtime) {
> 
> I thought the build time code path was living in the other file ?

You're right, we'll move this out of the runtime version of the library.
 
> @@ +950,5 @@
> > +
> > +    this.id = id;
> > +    this.isBuildtime = false;
> > +    this.isReady = false;
> > +    this.isFrozen = false;
> 
> I wonder if we can call Object.freeze here instead and use Object.isFrozen ?
> That would ensure there should really be no change in this object once it is
> declared as frozen.

The word freeze happens to be the same, but the logic is different.  It's supposed to be prevent running more than one ctx.requestLocales() in quick succession.  I renamed it to isLoading to avoid confusion.

> @@ +1078,5 @@
> > +    }
> > +
> > +    function ready(callback) {
> > +      if (this.isReady) {
> > +        setTimeout(callback);
> 
> Don't you want to early return after that point ?

No, we still need to register the event listener so that next time the context is ready (after a language change) the callback is executed again.

 
> @@ +1085,5 @@
> > +    }
> > +
> > +    function once(callback) {
> > +      if (this.isReady) {
> > +        setTimeout(callback);
> 
> Don't you want to early return after that point ?

Yes, thanks!

 
> @@ +1462,5 @@
> > +      setTextContent(element, entity.value);
> > +    }
> > +
> > +    if (entity.attributes) {
> > +      for (var key in entity.attributes) {
> 
> Just curious. do we need the entity.attributes check since iterating over
> attributes will not run when attributes does not exists, or does it will
> throw an error ?

It was an artifact of an earlier optimization where sometimes entity could be an object with no 'attributes' member.  This isn't needed anymore, thanks.
Great review Vivien! So many good catches! :)

(In reply to Vivien Nicolas (:vingtetun) (:21) from comment #31)
> @@ +576,5 @@
> > +        continue;
> > +      }
> > +
> > +      while (patterns.multiline.test(line) && i < entries.length) {
> > +        line = line.slice(0, line.length - 1) + entries[++i].trim();
> 
> line.slice() without arguments may do what you are looking for here. But
> that's basically copying the entire content right ?
> 
> If yes, can't you simply use += ?

The code is a copy from the current l10n.js and it removes the last character ('\') from the multiline.

We could probably use parenthesis in the regexp to capture the line minus that, but that feels overwhelming. I just shortened line.slice(0, -1).

Hope that's ok with you.
 
> @@ +580,5 @@
> > +        line = line.slice(0, line.length - 1) + entries[++i].trim();
> > +      }
> > +
> > +      var entityMatch = line.match(patterns.entity);
> > +      if (entityMatch && entityMatch.length === 3) {
> 
> Can you add a comment explaining why |3| ?

I actually removed that check. It matches or not. There are no conditional groups in that regexp.

> @@ +589,5 @@
> > +            ctx._emitter.emit('error', e);
> > +          } else {
> > +            throw e;
> > +          }
> > +        }
> 
> I'm not sure try/catch is a good idea here. IIRC it will prevent this code
> to be eventually jitted.

1) We do not use the Parser in runtime currently
2) The parser does nested logic and can throw multiple errors while parsing an entity. try/catch here allows us to recover from that without seriously increasing complexity of the parser.

I would like to keep it here, and when we start using Parser in runtime I'd be happy to test the try/catch impact on parsing performance.

> @@ +650,5 @@
> > +        // special quirk to comply with webl10n's behavior
> > +        if (nestedProps.indexOf(
> > +              nameElements[nameElements.length - 1]) !== -1) {
> > +                attrElements.push(nameElements.pop());
> > +              }
> 
> This code and it's indentation seems a bit weird to me. You're not iterating
> over all the nameElements items, but you check if
> nameElements[nameElements.length - 1] is inside nestedProps and add only the
> last elements of nameElements to attrElements.
> 
> What happens to the possible elements in the middle ? (I may misunderstood
> this code).

I added a pretty detailed explanation of this quirk - https://github.com/l20n/l20n.js/blob/gaia/lib/l20n/parser.js#L99-L114
 
> @@ +671,5 @@
> > +      .replace(/\\f/g, '\f')
> > +      .replace(/\\{/g, '{')
> > +      .replace(/\\}/g, '}')
> > +      .replace(/\\"/g, '"')
> > +      .replace(/\\'/g, '\'');
> 
> Any way that the multiple .replace calls here can be factorize into one
> single call that use a single regexp that use capture parenthesis ?

Wow. How did I miss it for so long?
I actually noticed that we generate all those regexps on each run, so I bundled it into a single regexp and moved it outside of the loop.

Then I realized that on production we don't use the parser so initializing all parser regexp's is a waste of memory and moved it to lazy initialization. Shaved 40kb per app. k'boom baby!

We're still working on applying your review comments. Updated patch will come soon.
We addressed the review comments and nits, removed shared/js/l10n.js from jshint's xfail.list and reduced the size of build/l10n.js to just the required minimum.  Here's the updated patch.  We also started working on porting our tests to the Gallery app, but this has proved to be a bit more time consuming and I believe it would be better to take care of it in a follow-up bug.
Attachment #8400079 - Attachment is obsolete: true
Attachment #8401261 - Flags: review?(21)
Comment on attachment 8401261 [details] [diff] [review]
Patch 5: Small buildtime file, fixed review comments and nits

Review of attachment 8401261 [details] [diff] [review]:
-----------------------------------------------------------------

I still have a mixed feeling about the build/l10n.js file. I think it's good that the buildtime logic and the runtime logic are splitted, but it makes me sad that there is so much differences :/

r+ but this should not land without the set of tests you have that works with our testsuite.

::: build/l10n.js
@@ +95,5 @@
> +      locale = this.getLocale(loc);
> +      if (!locale.isReady) {
> +        // build without callback, synchronously
> +        locale.build(null);
> +      }

nit: add a line break

@@ +98,5 @@
> +        locale.build(null);
> +      }
> +      if (locale.ast && locale.ast.hasOwnProperty(id)) {
> +        return locale.ast[id];
> +      }

nit: add a line break

@@ +186,5 @@
> +    }
> +  }
> +
> +  function flushBuildMessages(variant) {
> +    for (var type in buildMessages) {

|let messages = buildMessages[type];| and use that.

::: shared/js/l10n.js
@@ +16,2 @@
>  
> +      xhr.onreadystatechange = function() {

Can you add a comment explaining that this should be removed once bug 934727 is fixed.

@@ +638,2 @@
>  
> +    pos = id.indexOf('[');

nit: |var pos = id.indexOf('.');| instead of having pos declared in the previous |var| statement.

@@ +735,4 @@
>  
> +  Entity.prototype.resolve = function E_resolve(ctxdata) {
> +    if (this.dirty) {
> +      // prevent cyclic or recursive references by returning undefined early

You may want to consider to move this comment next to the dirty guard declaration.

@@ +994,5 @@
> +                                              loc));
> +          continue;
> +        }
> +        return entry;
> +      }

nit: add a line break
Attachment #8401261 - Flags: review?(21) → review+
(In reply to Vivien Nicolas (:vingtetun) (:21) (NOT reading bugmails, needinfo? please) from comment #35)
> Comment on attachment 8401261 [details] [diff] [review]
> Patch 5: Small buildtime file, fixed review comments and nits
> 
> Review of attachment 8401261 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> I still have a mixed feeling about the build/l10n.js file. I think it's good
> that the buildtime logic and the runtime logic are splitted, but it makes me
> sad that there is so much differences :/

Yeah. We'll aim at removing those differences.

> r+ but this should not land without the set of tests you have that works
> with our testsuite.

Great. Bug 991665 is ready.
Component: Gaia → Gaia::L10n
Blocks: 882592
With r+ in bug 991665 we're about to land this refactor.

One change since r+ is that we decided to temporarily keep launching l10n.js runtime code on 'interactive' event instead of the originally planned 'complete'.

The reason for that is that currently Settings app depend on the buggy behavior of mozL10n ready (bug 993189 ) which in turn depends on this (bug 993188)

We'll tackle each of those bugs separately and transition to complete in a followup bug 993193 .

Diff: https://pastebin.mozilla.org/4777319
Depends on: 994459
Blocks: 996479
No longer blocks: 996479
Depends on: 994417
No longer blocks: 993188
Depends on: 1005862
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: