Closed Bug 904723 Opened 11 years ago Closed 10 years ago

Implement ES6 Array.from

Categories

(Core :: JavaScript: Standard Library, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla32
Tracking Status
relnote-firefox --- 32+

People

(Reporter: bbenvie, Assigned: jorendorff)

References

(Blocks 1 open bug)

Details

(Keywords: dev-doc-complete, feature, Whiteboard: [js:p2])

Attachments

(4 files, 3 obsolete files)

This was originally in a bug with Array.of, but has been split off. See bug 866849. See ES6 draft spec (July 2013 revision) section 15.4.2.4.

  Array.from(arrayLike, mapfn=undefined, thisArg=undefined)


`Array.from` accepts either an arrayLike (indexed with a length) or iterable object and optionally a mapfn callback to transform the values as they are inserted into the new object. It operates on the |this| value, creating an instance of it if it is a constructor; otherwise it produces an Array.

The algorithm proceeds differently depending on whether it's operating on an arrayLike vs. an iterable. The iterable must be fully consumed in order to find the total number of elements, at which point the new instance can be created and filled. An arrayLike can create the instance up front based on the "length" property.
Actually ignore that last paragraph. That's only true for the TypedArray version of `from` (which is distinct from the Array version).
Assignee: general → egdelwonk
This doesn't add JIT support for the new intrinsic.
Previous patch was corrupted.
Attachment #826248 - Attachment is obsolete: true
Component: JavaScript Engine → JavaScript: Standard Library
Keywords: feature
Whiteboard: [js:p2]
Here's Will's work.

https://gist.github.com/egdelwonk/a9a1672f5707c2cb2124
https://gist.github.com/egdelwonk/c2ce8ed5780224d7b845

I'll get this organized for handoff tomorrow.

One known bug in the ArrayFrom implementation: `A[k] = mappedValue;` should be defining, not assigning.
Flags: needinfo?(jorendorff)
(In reply to Jason Orendorff [:jorendorff] from comment #4)
> Here's Will's work.
> 
> https://gist.github.com/egdelwonk/a9a1672f5707c2cb2124
> https://gist.github.com/egdelwonk/c2ce8ed5780224d7b845
> 
> I'll get this organized for handoff tomorrow.
> 
Thank, those patches look quite good.
> One known bug in the ArrayFrom implementation: `A[k] = mappedValue;` should
> be defining, not assigning.
I imagine bug 988416 would solve this?
(In reply to Tom Schuster [:evilpie] from comment #5)
> > One known bug in the ArrayFrom implementation: `A[k] = mappedValue;` should
> > be defining, not assigning.
>
> I imagine bug 988416 would solve this?

We can use the same thing ArrayMap is using (use UnsafePutElements and mess with the length to make it safe).
Flags: needinfo?(jorendorff)
Flags: needinfo?(jorendorff)
This change will be in the forthcoming TC39 meeting notes and in the next revision of the ES6 draft specification, but will be useful here: 


- Change usingIterator path callback signature to: value, index
- Change array like path callback signature to: value, index 
- Removes "items" from 17.d.3.1 (array like path)
- Adds "k" to 8.g.7.1


These section/instruction numbers are for revision 22 for http://people.mozilla.org/~jorendorff/es6-draft.html#sec-array.from
Too sleepy to turn those gists into patches tonight. Tomorrow.
Assignee: egdelwonk → jorendorff
Flags: needinfo?(jorendorff)
Attached patch bug-904723-array.from-v1.patch (obsolete) — Splinter Review
At last. I moved the tests to js/src/tests/ecma_6/Array and fixed the define/set thing. So I think this is done -- nothing left but review.
Attachment #826250 - Attachment is obsolete: true
Attachment #8420560 - Flags: review?(till)
Comment on attachment 8420560 [details] [diff] [review]
bug-904723-array.from-v1.patch

Review of attachment 8420560 [details] [diff] [review]:
-----------------------------------------------------------------

This is very nice! I have a lot of nits and a few additional tests that I'd like to see, but otherwise, it looks rock-solid. Also, Array.from is a really nice function.

::: js/src/builtin/Array.js
@@ +612,5 @@
> +
> +    // All elements defined by this algorithm have the same attrs:
> +    var attrs = ATTR_CONFIGURABLE | ATTR_ENUMERABLE | ATTR_WRITABLE;
> +
> +    // Steps 6-7.

Nit: "6-8", as step 8 is wholly contained in the following block

@@ +623,5 @@
> +            if (mapping)
> +                return [for (nextValue of items) callFunction(mapfn, thisArg, nextValue)];
> +            return [...items];
> +        }
> +        

Nit: whitespace

@@ +626,5 @@
> +        }
> +        
> +        // Steps 8.b-c.
> +        var A = new C();
> +        */

Did you intend to leave this commented-out block in? It looks like a left-over from an experiment. If you did intend to enable it, bear in mind that std_Array can't be used. This might be about the only place where explicitly accessing a builtin from the content compartment is in order. This can be done using `global.Array`.

@@ +627,5 @@
> +        
> +        // Steps 8.b-c.
> +        var A = new C();
> +        */
> +        var A = IsConstructor(C) ? new C() : [];

Please add the comment "// Steps a-c." above this (assuming you're removing the block above)

(We usually don't include the main step [at least in self-hosted code, I suppose] in these comments, hence "a-c" instead of "8.a-c". I didn't add nits for this below, but please make the comments consistent either way.)

@@ +632,5 @@
> +
> +        // Step 8.f.
> +        var k = 0;
> +
> +        // Steps 8.d-e and 8.g.i-vi.

Ok, this speaks to having the comments contain the full step names, I guess, as it's not clear to me how else to comment this.

@@ +635,5 @@
> +
> +        // Steps 8.d-e and 8.g.i-vi.
> +        for (var nextValue of items) {
> +            // Steps 8.g.vii-viii.
> +            var mappedValue = mapping ? callFunction(mapfn, thisArg, nextValue) : nextValue;

I guess it won't matter too much, but should we ever need to make this as fast as we can, introducing two different versions of this loop for mapping and non-mapping *might* help.

@@ +642,5 @@
> +            _DefineDataProperty(A, k++, mappedValue, attrs);
> +        }
> +
> +        // Here we're at step 8.g.iv.1. Fall through; it's implemented below.
> +    } else {

Please add "// Step 9 (omitted)."

@@ +648,5 @@
> +        // FIXME: Array operations should use ToLength (bug 924058).
> +        var len = ToInteger(items.length);
> +
> +        // Steps 13-15.
> +        var A = IsConstructor(C) ? new C(len) : std_Array(len);

Use NewDenseArray(len) instead if std_Array. (See comment in Utilities.js)

::: js/src/builtin/Utilities.js
@@ +27,5 @@
>  
>  // Remove unsafe builtin functions.
>  Object.defineProperty = null; // See bug 988416.
>  
>  // Cache builtin functions so using them doesn't require cloning the whole object they're 

Pre-existing nit: whitespace

@@ +31,5 @@
>  // Cache builtin functions so using them doesn't require cloning the whole object they're 
>  // installed on.
>  var std_isFinite = isFinite;
>  var std_isNaN = isNaN;
> +var std_Array = Array;

Full builtins can't be used like this: referring to std_Array would clone over all of Array, including its prototype and methods. I first thought that the resulting instances wouldn't even be instanceof Array in content script, but luckily, that at least isn't the case.

I added a warning about this to https://developer.mozilla.org/en-US/docs/Mozilla/Projects/SpiderMonkey/Internals/self-hosting

Can you maybe add a comment here warning about this footgun?

::: js/src/jsarray.cpp
@@ +3007,5 @@
>      JS_SELF_HOSTED_FN("every",       "ArrayStaticEvery", 2,0),
>      JS_SELF_HOSTED_FN("some",        "ArrayStaticSome",  2,0),
>      JS_SELF_HOSTED_FN("reduce",      "ArrayStaticReduce", 2,0),
>      JS_SELF_HOSTED_FN("reduceRight", "ArrayStaticReduceRight", 2,0),
> +    JS_SELF_HOSTED_FN("from",        "ArrayFrom", 3,0),

The length should be 1, not 3

::: js/src/tests/ecma_6/Array/from_basics.js
@@ +43,5 @@
> +
> +// Even on non-iterable objects.
> +assertDeepEq(Array.from({length: 4}), [undefined, undefined, undefined, undefined]);
> +
> +reportCompare(0, 0);

Guard this with `if (typeof reportCompare === "function")`, probably? (Same for all other test files.)

::: js/src/tests/ecma_6/Array/from_errors.js
@@ +33,5 @@
> +assertThrowsInstanceOf(() => ObjectWithReadOnlyLength.from([0, 10, 20, 30]), TypeError);
> +
> +// The same, but using a builtin type.
> +Uint8Array.from = Array.from;
> +assertThrowsInstanceOf(() => Uint8Array.from([]), TypeError);

Additional things to check for:
- a getter-only `length` accessor on the prototype throws, too
- a throwing setter for `length on the prototype throws, and with the correct type
- the new object has been created before the error is thrown
- the data properties have been defined on it
(the last two are essentially the reverse of the "mapfn not callable throws before everything else" below.)

@@ +55,5 @@
> +    get: function () { log += "2"; },
> +    getOwnPropertyDescriptor: function () { log += "3"; }
> +});
> +assertThrowsInstanceOf(() => Array.from.call(C, p, {}), TypeError);
> +assertEq(log, "");

Additional things to check for:
- the new object is instantiated and the first get operation happens if mapfn throws
- the correct error is thrown if mapfn throws

::: js/src/tests/ecma_6/Array/from_iterable.js
@@ +10,5 @@
> +var a = ['a', 'e', 'i', 'o', 'u'];
> +a["@@iterator"] = function* () {
> +    for (var i = this.length; i--; )
> +        yield this[i];
> +};

Can you also check that .length isn't queried at all if [@@iterator] is used?

::: js/src/tests/ecma_6/Array/from_realms.js
@@ +6,5 @@
> +    // is G.Array.prototype.
> +    var g = newGlobal();
> +    var ga = g.Array.from([1, 2, 3]);
> +    assertEq(ga instanceof g.Array, true);
> +    

Nit: whitespace

::: js/src/tests/ecma_6/Array/from_surfaces.js
@@ +5,5 @@
> +var desc = Object.getOwnPropertyDescriptor(Array, "from");
> +assertEq(desc.configurable, true);
> +assertEq(desc.enumerable, false);
> +assertEq(desc.writable, true);
> +assertEq(Array.from.length, 1);

Huh, I'm surprised this is the case, and you don't hit an assert when cloning the function, as you the length to 3 in the methods list. Guess I mis-remembered the conditions under which this throws.

::: js/src/tests/ecma_6/Array/from_this.js
@@ +19,5 @@
> +    assertEq(this, global);
> +    hits++;
> +}
> +Array.from("def", g);
> +assertEq(hits, 3);

For completeness' sake, please also test the strict case

::: js/src/vm/SelfHosting.cpp
@@ +446,5 @@
>      return true;
>  }
>  
>  bool
> +js::intrinsic_DefineDataProperty(JSContext *cx, unsigned argc, Value *vp)

Huh, I have no clue why I didn't call it that. Thanks for fixing.
Attachment #8420560 - Flags: review?(till) → review+
Oh, one thing I forgot to mention: I think it makes sense to split the patch up into preparatory stuff and Array.from-specific stuff. Specifically, the renaming of js_IsConstructor and the _DefineValueProperty intrinsic would be good to land independently, maybe even as two patches. I'll not hold up the landing of this feature of this, though, so up to you.
Comment on attachment 8420560 [details] [diff] [review]
bug-904723-array.from-v1.patch

Review of attachment 8420560 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/builtin/Array.js
@@ +620,5 @@
> +        // Array constructor, then the rest of the algorithm is equivalent to
> +        // some Array-specific syntactic sugar that compiles to faster code.
> +        if (C === std_Array || !IsConstructor(C)) {
> +            if (mapping)
> +                return [for (nextValue of items) callFunction(mapfn, thisArg, nextValue)];

At the last TC39 meeting it was agreed that mapfn would always receive two arguments: value and index, for both the "usingIterator" and fallback paths. This change is reflected in rev24 at http://people.mozilla.org/~jorendorff/es6-draft.html#sec-array.from 8.g.vii.1 (nextValue, k)

@@ +635,5 @@
> +
> +        // Steps 8.d-e and 8.g.i-vi.
> +        for (var nextValue of items) {
> +            // Steps 8.g.vii-viii.
> +            var mappedValue = mapping ? callFunction(mapfn, thisArg, nextValue) : nextValue;

Needs to be callFunction(mapfn, thisArg, nextValue, items.indexOf(nextValue)) 

(not literally, but the index needs to be passed to mapfn)
> > +        // Steps 8.b-c.
> > +        var A = new C();
> > +        */
> 
> Did you intend to leave this commented-out block in? It looks like a
> left-over from an experiment.

It works and I meant to uncomment it.

> If you did intend to enable it, bear in mind
> that std_Array can't be used. This might be about the only place where
> explicitly accessing a builtin from the content compartment is in order.
> This can be done using `global.Array`.

Is that safe, though? The Array property of the global is a writable, configurable property. Or is global.Array "magical" in self-hosted code, and it always gives you the original Array constructor for that global?

> (We usually don't include the main step [at least in self-hosted code, I
> suppose] in these comments, hence "a-c" instead of "8.a-c". I didn't add
> nits for this below, but please make the comments consistent either way.)

I kept the full step numbers. As you mentioned, it's hard to comment some things clearly without them.

> @@ +31,5 @@
> >  // Cache builtin functions so using them doesn't require cloning the whole object they're 
> >  // installed on.
> >  var std_isFinite = isFinite;
> >  var std_isNaN = isNaN;
> > +var std_Array = Array;
> 
> Full builtins can't be used like this: referring to std_Array would clone
> over all of Array, including its prototype and methods. [...]

I did put a warning in. As discussed on IRC, somehow we ended up with Std_Date and Std_String, introduced in:

changeset:   117469:50914b4cc5d8
user:        Norbert Lindenberg <mozilladev@lindenbergsoftware.com>
date:        Thu Jan 03 12:29:15 2013 -0600
summary:     Bug 769872 - Add self-hosted JavaScript core of Intl constructors Collator, NumberFormat, DateTimeFormat (part 1). r=jwalden

It's hard to enforce rules like this via WARNING comments and wiki pages, so please consider whether it's possible for the self-hosting infrastructure to assert that this isn't happening.

> > +    JS_SELF_HOSTED_FN("from",        "ArrayFrom", 3,0),
> 
> The length should be 1, not 3

Fixed. Should I be worried that this isn't asserting, or is it unused?

Thanks for all the testing suggestions. I implemented them all. And I separated the patch into three changesets as proposed.
Oh, there's the assertion. It fails if I change the value in the C++ file to 1. (!)
Notes:

1. The optimization, using global.Array, is buggy. So I'm just going to remove it. I've added a test that detects the bug:

    // Array.from does not get confused if global.Array is replaced with another
    // constructor.
    function NotArray() {
    }
    var RealArray = Array;
    NotArray.from = Array.from;
    Array = NotArray;
    assertEq(RealArray.from([1]) instanceof RealArray, true);
    assertEq(NotArray.from([1]) instanceof NotArray, true);

Too bad: the fast path that does [...items] is twice as fast as the slow path.  (I tried to improve the slow path by introducing two versions of the loop, mapping and non-mapping, as till suggested. It didn't help.)

2. It's ok to leave the `nargs` value for "ArrayFrom" as 3, because nargs() is just the number of arguments, not the value of the function's .length property:

    uint16_t        nargs_;       /* number of formal arguments
                                     (including defaults and the rest parameter unlike f.length) */
Carrying forward till's review of this part of the patch.
Attachment #8424818 - Flags: review+
This one too.
Attachment #8424820 - Flags: review+
Requesting review for the new code in ArrayFrom. Also these new tests in from_mapping.js:

>+// mapfn is called with two arguments: value and index.
>+var log = [];
>+function f() {
>+    log.push(Array.from(arguments));
>+    return log.length;
>+}
>+assertDeepEq(Array.from(['a', 'e', 'i', 'o', 'u'], f), [1, 2, 3, 4, 5]);
>+assertDeepEq(log, [['a', 0], ['e', 1], ['i', 2], ['o', 3], ['u', 4]]);
>+
>+// If the object to be copied is non-iterable, mapfn is still called with two
>+// arguments.
>+log = [];
>+assertDeepEq(Array.from({0: "zero", 1: "one", length: 2}, f), [0, 1]);
>+assertDeepEq(log, [["zero", 0], ["one", 1]]);
Attachment #8420560 - Attachment is obsolete: true
Attachment #8424837 - Flags: review?(till)
Comment on attachment 8424837 [details] [diff] [review]
bug-904723-part-3-Array.from-v1.patch

Review of attachment 8424837 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/builtin/Array.js
@@ +621,5 @@
> +        // Step 8.f.
> +        var k = 0;
> +
> +        // Steps 8.d-e and 8.g.i-vi.
> +        for (var nextValue of items) {

The for-of loop will query the @@iterator property a second time. The spec performs a single [[Get]] in step 6 (CheckIterable call). Follow-up bug of https://bugs.ecmascript.org/show_bug.cgi?id=2083 needed? :/
Comment on attachment 8424837 [details] [diff] [review]
bug-904723-part-3-Array.from-v1.patch

Review of attachment 8424837 [details] [diff] [review]:
-----------------------------------------------------------------

Sorry for the delay. This looks good, with the ['@@iterator'] double-get resolved.

(In reply to Jason Orendorff [:jorendorff] from comment #15)
> Notes:
> 
> 1. The optimization, using global.Array, is buggy. So I'm just going to
> remove it.

Yeah, I didn't really think this through, sorry.

> 
> Too bad: the fast path that does [...items] is twice as fast as the slow
> path.  (I tried to improve the slow path by introducing two versions of the
> loop, mapping and non-mapping, as till suggested. It didn't help.)

We have to come up with a good story for the Realm record [[intrinsics]] list in any case: it should be possible to expose that to self-hosted code and enable us to get at the original Array function that way. Until then, we probably just have to live with this.

> 
> 2. It's ok to leave the `nargs` value for "ArrayFrom" as 3, because nargs()
> is just the number of arguments, not the value of the function's .length
> property:
> 
>     uint16_t        nargs_;       /* number of formal arguments
>                                      (including defaults and the rest
> parameter unlike f.length) */

Right, this is ok indeed. I mixed it up with ...rest, which does cause problems (and which I have a patch for that I'll land together with the first code that actually needs it.)

::: js/src/builtin/Array.js
@@ +613,5 @@
> +    // All elements defined by this algorithm have the same attrs:
> +    var attrs = ATTR_CONFIGURABLE | ATTR_ENUMERABLE | ATTR_WRITABLE;
> +
> +    // Steps 6-8.
> +    if (items["@@iterator"] !== undefined) {

Too bad `'@@iterator' in items` returns `true` even if items['@@iterator] has the value `undefined`. :(

We can explicitly follow the spec here instead:
var usingIterator a['@@iterator'];
if (usingIterator !== undefined) {
... (see below)

@@ +621,5 @@
> +        // Step 8.f.
> +        var k = 0;
> +
> +        // Steps 8.d-e and 8.g.i-vi.
> +        for (var nextValue of items) {

...
var iterator = callFunction(usingIterator, items);
for (var next; !(next = iterator.next()).done;) {
    var nextValue = next.value;
    ....

::: js/src/builtin/Utilities.js
@@ +33,5 @@
> +//
> +// WARNING: Do not make std_ references to builtin constructors (like Array and
> +// Object) below. Setting `var std_Array = Array;`, for instance, would cause
> +// the entire Array constructor, including its prototype and methods, to be
> +// cloned into content compartments.

Thanks, that's a nice comment.

Theoretically we could do something like nulling out all the builtin ctors after we take references to their prototype functions, and then have an assert that no vars with a naming scheme of std_{All The Builtin Ctors} exist. That would prevent people from creating these references because they didn't read this comment. OTOH, maybe we should trust our review process here?
Attachment #8424837 - Flags: review?(till) → review+
(In reply to Till Schneidereit [:till] from comment #20)
> We have to come up with a good story for the Realm record [[intrinsics]]
> list in any case: it should be possible to expose that to self-hosted code
> and enable us to get at the original Array function that way.

Agreed.

> ::: js/src/builtin/Array.js
> @@ +613,5 @@
> > +    // All elements defined by this algorithm have the same attrs:
> > +    var attrs = ATTR_CONFIGURABLE | ATTR_ENUMERABLE | ATTR_WRITABLE;
> > +

> > +    // Steps 6-8.
> > +    if (items["@@iterator"] !== undefined) {
> 
> Too bad `'@@iterator' in items` returns `true` even if items['@@iterator]
> has the value `undefined`. :(

That would also have observable effects, contrary to the spec, when items is a proxy.

> We can explicitly follow the spec here instead:

This is what I'm doing now...

> [...] OTOH, maybe we should trust our review process here?

It might be a little naive to trust our review process to enforce this rule, which we've already failed to enforce twice. If we just can assert in SelfHosting.cpp:CloneObject that we never clone an object that's stored in any of the global's reserved slots, that would be enough, right?
Depends on: 1008441
I discovered the spec requires a type check that we don't perform (<http://people.mozilla.org/~jorendorff/es6-draft.html#sec-iteratornext> step 4), so I implemented that as well, with a test.

However I only implemented it in Array.from; filed bug 1016936 to fix all the other places where we iterate (in particular, emitting this check into the bytecode that we emit for for-of loops will be a pain).
This fixes the double @@iterator get:

> // When the argument passed to Array.from is a Proxy, Array.from
> // calls handler.get on it.
> log = [];
> assertDeepEq(Array.from(new LoggingProxy([3, 4, 5])), [3, 4, 5]);
>-assertDeepEq(log, ["get @@iterator", "get @@iterator",
>+assertDeepEq(log, ["get @@iterator",
>                    "get length", "get 0", "get length", "get 1", "get length", "get 2",
>                    "get length"]);

With this, we can finally land whenever bug 1008441 gets review.
Attachment #8430025 - Flags: review?(till)
Comment on attachment 8430025 [details] [diff] [review]
bug-904723-part-4-fix-double-iterator-get-v1.patch

Review of attachment 8430025 [details] [diff] [review]:
-----------------------------------------------------------------

Nice! I don't even think it's much uglier than the for-of solution. It'd be sad if it's less optimizable, of course.
Attachment #8430025 - Flags: review?(till) → review+
Comment on attachment 8430025 [details] [diff] [review]
bug-904723-part-4-fix-double-iterator-get-v1.patch

Review of attachment 8430025 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/builtin/Array.js
@@ +631,5 @@
> +        var next;
> +        while (true) {
> +            // Steps 8.g.ii-vi.
> +            next = iterator.next();
> +            if (!IsObject(next))

I think this step won't be required once we fix bug 1016936?
It will still be required, because this code is really a separate copy of the for-of iteration logic.

That is, with this patch, we will have three implementations of the for-of logic rather than two.
Comment on attachment 8430025 [details] [diff] [review]
bug-904723-part-4-fix-double-iterator-get-v1.patch

Review of attachment 8430025 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/builtin/Array.js
@@ +619,5 @@
>          // Steps 8.a-c.
>          var A = IsConstructor(C) ? new C() : [];
>  
> +        // Steps 8.d-e.
> +        var iterator = callFunction(usingIterator, items);

Nitpick: `if (!IsObject(iterator)) ThrowError(...)`
Depends on: 1021835
I think Array.from is worth being mentioned in release notes.
Added in the release notes with the wording:
New Array built-in: Array.from() (learn more)

Learn more points to https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Array/from
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: