Closed Bug 1347644 Opened 7 years ago Closed 7 years ago

Baldr: implement WebAssembly.compileStreaming/instantiateStreaming

Categories

(Core :: JavaScript Engine, enhancement, P3)

enhancement

Tracking

()

RESOLVED FIXED
mozilla58
Tracking Status
firefox57 --- wontfix
firefox58 --- fixed

People

(Reporter: luke, Assigned: luke)

References

(Blocks 1 open bug)

Details

Attachments

(10 files, 7 obsolete files)

66.30 KB, patch
bkelly
: review+
till
: review+
Details | Diff | Splinter Review
2.75 KB, patch
mccr8
: review+
bkelly
: feedback+
Details | Diff | Splinter Review
41.92 KB, patch
till
: review+
Details | Diff | Splinter Review
841 bytes, patch
till
: review+
Details | Diff | Splinter Review
25.30 KB, patch
till
: review+
Details | Diff | Splinter Review
32.11 KB, patch
till
: review+
Details | Diff | Splinter Review
1.01 KB, patch
till
: review+
Details | Diff | Splinter Review
1.17 KB, patch
bkelly
: review+
Details | Diff | Splinter Review
2.74 KB, patch
lth
: review+
Details | Diff | Splinter Review
35.32 KB, patch
bkelly
: review+
Details | Diff | Splinter Review
As described in:
  https://github.com/WebAssembly/design/blob/master/Web.md#additional-web-embedding-api

For the ordinary network fetch(), a Response is a stream so this also implies streaming compilation of wasm.
(taking)
Assignee: nobody → luke
Attached patch simplify-async-task (obsolete) — Splinter Review
This patch changes how async tasks deal with shutdown. Previously, this responsibility was implicitly on the embedding with a somewhat complicated JS API contract.  Now, the JS API is much simplified and JSRuntimes track all their live async tasks and wait on them when the JSRuntime is about to be destroyed.  This generalization extends naturally to the new type of AsyncTask I'd like to add to implement streaming.

Ben, could you review the (now much simplified) Gecko bits?  See changes in jsapi.h for how the interface changed.

Till, could you review the SM bits?  I ended up slurping the relevant JSRuntime global state (callbacks, the list of live AsyncTasks, the Vector that used to be called asyncTasks) into a new AsyncPromiseTasks (stored in the JSRuntime).  I'd start by reading the .h file changes which are fairly commented.  Note, the SetShellBuildIdOp() change in WorkerMain() is so that I can use evalInWorker() to compile wasm (previously, it'd error out).
Attachment #8896035 - Flags: review?(till)
Attachment #8896035 - Flags: review?(bkelly)
Comment on attachment 8896035 [details] [diff] [review]
simplify-async-task

Review of attachment 8896035 [details] [diff] [review]:
-----------------------------------------------------------------

I agree in principle with what you are doing, but I'd like better documentation.  Also, I think you probably want a control runnable instead of a normal runnable to avoid a situation where you are blocking during thread shutdown and can never complete the task.

::: dom/workers/RuntimeService.cpp
@@ +717,5 @@
>    return asmjscache::OpenEntryForWrite(principal, aBegin, aEnd, aSize, aMemory,
>                                         aHandle);
>  }
>  
> +class AsyncTaskRunnable final : public WorkerRunnable

Please add a comment explaining the mechanism that prevents the WorkerPrivate from closing the thread here.

I assume something in the jsapi blocks the worker threads event loop?  What is that?

Also, does that blocking mechanism require this runnable to actually fire?  If you end up blocking the worker thread when its in the process of closing then this WorkerRunnable will not be allowed to dispatch to the thread.  That would seem to lead to deadlock.

Maybe this is all accounted for, but I think we need some good documentation here since its so different from our current mechanism.
Attachment #8896035 - Flags: review?(bkelly) → review-
Attached patch simplify-async-task (obsolete) — Splinter Review
I added a big comment to AsyncTaskRunnable explaining why a WorkerRunnable is needed and why this shutdown deadlocks.  See also comments in DispatchToEventLoop() on why the WorkerPrivate* is valid.

Also, just to check: if WorkerRunnable::Dispatch() succeeds, the WorkerRunnable is *definitely* run (WorkerRun or Cancel), right?
Attachment #8896035 - Attachment is obsolete: true
Attachment #8896035 - Flags: review?(till)
Attachment #8896333 - Flags: review?(till)
Attachment #8896333 - Flags: review?(bkelly)
heh, I mean to say "and why this *avoids* shutdown deadlocks"
Comment on attachment 8896333 [details] [diff] [review]
simplify-async-task

Review of attachment 8896333 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/workers/RuntimeService.cpp
@@ +731,5 @@
> +// the JSContext of a WorkerPrivate, the JS engine synchronously waits for all
> +// AsyncTasks to finish. Deadlock is avoided because, by this point in time,
> +// DispatchToEventLoop always returns false (WorkerRunanble::Dispatch() fails)
> +// and all WorkerRunnable's that were previously successfully Dispatch()ed have
> +// already been WorkerRun().

This confuses me.  We destroy the JSContext here:

http://searchfox.org/mozilla-central/source/dom/workers/RuntimeService.cpp#2941

However, by the time you get here the WorkerPrivate's status has already reached KILLED.

Consider:

1. You start an async task.
2. Worker shutdowns, reaches KILLED, and blocks in JS_GC(cx) for the async task.
3. Async task completes and tries to issue a runnable back to the worker to signal this.
4. Runnable back from the worker fails because its Dispatch() returns an error code.

Basically the DispatchToEventLoop() can return false in this condition.  What happens in this case?
Attachment #8896333 - Flags: review?(bkelly)
(In reply to Ben Kelly [:bkelly] from comment #6)
> Consider:
> 
> 1. You start an async task.
> 2. Worker shutdowns, reaches KILLED, and blocks in JS_GC(cx) for the async
> task.
> 3. Async task completes and tries to issue a runnable back to the worker to
> signal this.
> 4. Runnable back from the worker fails because its Dispatch() returns an
> error code.
> 
> Basically the DispatchToEventLoop() can return false in this condition. 
> What happens in this case?

The key is that step 2 isn't waiting for the AsyncTask to *finish*; if the DispatchToEventLoop() fails, we're already in the JS engine and we know we're synchronously waiting to shutdown, so we can just deal with it.

In more detail: DispatchToEventLoop() fails on some helper thread and then we increment a counter on the JSRuntime that records "live, but canceled AsyncTasks".  The synchronous wait in step 2 is waiting on the condition that the number of still-live AsyncTasks == the number of canceled AsyncTasks.  Once this condition is met, the JSRuntime knows that none of the live AsyncTasks are concurrently executing and so it simply deletes everything.  So the only thing that JS requires of the embedding is that DispatchToEventLoop() doesn't lie: if it returns "true", the embedding must follows through *before destroying the JSContext*.

Make sense?
Flags: needinfo?(bkelly)
Ok.  Can you expand the comment to explain shutdown does not require the runnable to be dispatched?  We have a lot of other code that does require that, so distinguishing this is different would help future readers.
Flags: needinfo?(bkelly)
Ok, I rewrote the AsyncTaskRunnable comment to more directly explain this and added comments in DispatchToEventLoop().
Attachment #8896333 - Attachment is obsolete: true
Attachment #8896333 - Flags: review?(till)
Attachment #8896360 - Flags: review?(till)
Attachment #8896360 - Flags: review?(bkelly)
Comment on attachment 8896360 [details] [diff] [review]
simplify-async-task

Review of attachment 8896360 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks!  I only reviewed the dom pieces.

::: dom/base/nsJSEnvironment.cpp
@@ +2763,5 @@
> +
> +  // This callback may execute either on the main thread or a random JS-internal
> +  // helper thread. This callback can be called during shutdown so we cannot
> +  // simply NS_DispatchToMainThread. Failure during shutdown is expected and
> +  // properly handled by the JS engine.

We won't block the main thread waiting for async tasks to complete when a window is closed, right?
Attachment #8896360 - Flags: review?(bkelly) → review+
(In reply to Ben Kelly [:bkelly] from comment #10)
> We won't block the main thread waiting for async tasks to complete when a
> window is closed, right?

No, because window-closing shouldn't destroy any JSContexts.  Only process/XPCOM shutdown (which already blocks on various helper threads to finish).
I should say "Right" instead of the more ambiguous "No" ("No, it won't block the main thread during widow close").
This patch (to be folded into the previous one) fixes a shutdown leak.  The problem is that if we only drop roots at the final JS_DestroyContext(), it's too late for cycle collection.  Instead, I believe we need to flush out the async tasks in nsCycleCollector::ShutdownCollect(), right before the Collect() cycle that is suppose to wipe up everything.  The specific failure I was seeing is the PromiseObject (held by the AsyncPromiseTask) rooting the callback rooting the worker's global object.  Happy anyone has any alternative solutions here.
Comment on attachment 8896360 [details] [diff] [review]
simplify-async-task

Review of attachment 8896360 [details] [diff] [review]:
-----------------------------------------------------------------

r=me on the JS bits with the below nits addressed. I really like how much simpler the API surface for Promise tasks becomes with this.

::: js/src/builtin/Promise.cpp
@@ +3459,5 @@
> +    live_.clear();
> +    numCanceled_ = 0;
> +
> +    // After shutdown, there should be no AsyncPromiseTask activity associated
> +    // with this JSRuntime. Revert to the uninitialized to catch bugs.

"the uninitialized state"?

::: js/src/builtin/Promise.h
@@ +170,5 @@
> +// resolve the promise afterwards. Because AsyncPromiseTask contains a
> +// PersistentRooted, it must be destroyed on an active JSContext thread of the
> +// promise's JSRuntime. AsyncPromiseTasks may be dispatched off-thread in
> +// various ways (e.g., see PromiseHelperTask). When the task completes and it's
> +// time to resolve the promise, AsyncPromiseTasks::.dispatchToEventLoop(task)

s/::./::/

@@ +171,5 @@
> +// PersistentRooted, it must be destroyed on an active JSContext thread of the
> +// promise's JSRuntime. AsyncPromiseTasks may be dispatched off-thread in
> +// various ways (e.g., see PromiseHelperTask). When the task completes and it's
> +// time to resolve the promise, AsyncPromiseTasks::.dispatchToEventLoop(task)
> +// should be called. After this, if shutdown does not interrupt, the derive

nit: s/derive/derived/

@@ +203,2 @@
>  
> +    // An initialized AsyncPromiseTask can be dispatched

Nit: missing "."

@@ +210,5 @@
> +                                    SystemAllocPolicy>;
> +
> +using AsyncTaskVector = Vector<JS::AsyncTask*, 0, SystemAllocPolicy>;
> +
> +class AsyncPromiseTasks

I'm not too happy with how similar this name is to the singular. Can we call this PendingAsyncPromiseTasks or AsyncPromiseTasksQueue or something like that?

::: js/src/jsapi.h
@@ +4804,3 @@
>  {
> +  protected:
> +    /* AsyncTasks are created and destroyed by SpiderMonkey. */

I know there are quite a few C-style single-line comments in this file, but there are also lots of C++-style ones, so no need to add new C-style ones.

@@ +4811,2 @@
>      /**
> +     * Methods called by the embedding after DispatchToEventLoopCallback

Perhaps "to be called", or "the embedding must call"?

@@ +4822,5 @@
> + * DispatchToEventLoopCallback may be called from any thread, being passed the
> + * same 'closure' passed to InitDispatchToEventLoop() and a 'task' from the
> + * same JSRuntime. If the embedding returns 'true', the embedding must call
> + * finish() or finishDuringShutdown() on an active JSContext thread for the same
> + * JSRuntime on which 'closure' was registered. If this the callback fails,

s/this the/the/

::: js/src/vm/HelperThreads.h
@@ +472,5 @@
>  
>  /*
> + * If helper threads are available, start executing the PromiseHelperTask
> + * on a helper thread. In all cases, the given task will either (1) be deleted
> + * or (2) have finish(DuringShutdown) called, in both cases from an active

Is this left over from a refactoring before which DuringShutdown was an enum, or do you mean finish/finishDuringShutdown? In either case, it's a bit confusing, perhaps change it to "finish()/finishDuringShutdown()" or "finish{DuringShutdown}()".
Attachment #8896360 - Flags: review?(till) → review+
Attachment #8896504 - Flags: review?(bkelly)
(In reply to Till Schneidereit [:till] from comment #14)
Great points, all fixed.

> > +class AsyncPromiseTasks
> 
> I'm not too happy with how similar this name is to the singular. Can we call
> this PendingAsyncPromiseTasks or AsyncPromiseTasksQueue or something like
> that?

Hah, indeed I confused myself on several occasions.  I'm currently thinking "AsyncTaskRuntimeState".
+1 on AsyncTaskRuntimeState, much better name.
Comment on attachment 8896504 [details] [diff] [review]
fix-shutdown-leak

This seems straightforward and reasonable to me, but I don't feel comfortable reviewing cycle collector stuff.  Redirecting flag.
Attachment #8896504 - Flags: review?(continuation)
Attachment #8896504 - Flags: review?(bkelly)
Attachment #8896504 - Flags: feedback+
Comment on attachment 8896504 [details] [diff] [review]
fix-shutdown-leak

Review of attachment 8896504 [details] [diff] [review]:
-----------------------------------------------------------------

::: xpcom/base/nsCycleCollector.cpp
@@ +3604,5 @@
>  void
>  nsCycleCollector::ShutdownCollect()
>  {
>    FinishAnyIncrementalGCInProgress();
> +  JS::ShutdownAsyncTasks(CycleCollectedJSContext::Get()->Context());

You might want to guard this call on mCCJSRuntime. IIRC in some weird test harness we can run the CC without the JS engine running. At least, that used to be the case. (If a full try run on a platform comes back green, then it should be okay.)
Attachment #8896504 - Flags: review?(continuation) → review+
Attached patch renameSplinter Review
Patch does renamings and tweaks as discussed on IRC.  Briefly:
 - JS::AsyncTask -> JS::Dispatchable (since Runnable is taken :)
 - AsyncTaskRunnable -> JSDispatchableRunnable
 - finish+finishDuringShutdown -> run(cx, MaybeShuttingDown)
 - AsyncPromiseTask -> OffThreadPromiseTask
 - AsyncPromiseRuntimeState -> OffThreadPromiseRuntimeState
 - AsyncPromisTask::dispatchToEventLoop -> OffThreadPromiseTask::dispatchResolve
 - internal*JobQueue* -> internal*DispatchQueue*
Attachment #8897607 - Flags: review?(till)
Comment on attachment 8897607 [details] [diff] [review]
rename

Review of attachment 8897607 [details] [diff] [review]:
-----------------------------------------------------------------

r=me, I like these names much better.

::: js/src/vm/MutexIDs.h
@@ +41,5 @@
>    _(DateTimeInfoMutex,           500) \
>    _(IcuTimeZoneStateMutex,       500) \
>    _(ProcessExecutableRegion,     500) \
>    _(WasmCodeProfilingLabels,     500) \
> +  _(OffThreadPromise,            500) \

Perhaps keep the "State" suffix same as for the TraceLogger-related IDs?
Attachment #8897607 - Flags: review?(till) → review+
(In reply to Till Schneidereit [:till] from comment #20)
> Perhaps keep the "State" suffix same as for the TraceLogger-related IDs?

Agreed
Pushed by lwagner@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/8dfe4d26c70f
Simplify/rename JS::AsyncTask (r=till,bkelly,mccr8)
Keywords: leave-open
Depends on: 1395587
Summary: Baldr: implement Response overload for WebAssembly.compile/instantiate → Baldr: implement WebAssembly.compileStreaming/instantiateStreaming
Attached patch propagate-quitSplinter Review
Without this, quit() within a reaction job fails to propagate all the way out of drainJobQueue() and end execution.
Attachment #8912900 - Flags: review?(till)
This patch just contains superficial renamings and refactorings to prepare for the next patch:
 - instead of passing a promise explicitly to resolve(), a protected promise() getter is used since the next patch needs to get the promise in other non-result() contexts
 - instead of treating the registration of OffThreadPromiseTasks as initialization, it's now named blockShutdownUntilDispatch() so that it's more clear (1) what this means and when you want to do it, (2) it doesn't have to happen immediately (b/c it won't in the next patch).
 - WebAssembly.compile() and instantiate() now share a single task type.  This reduces duplication in a way that is more clear in the next patch.
Attachment #8912901 - Flags: review?(till)
This patch adds the public JSAPI interface for SM to consume a stream, a trivial implementation in the shell, and a trivial consumer in SM.  The patch doesn't add browser support or do actual compilation in parallel with streaming; those both are for later patches.

WebAssembly.(compile|instantiate)Streaming are described in:
  https://github.com/WebAssembly/design/blob/master/Web.md#additional-web-embedding-api
Attachment #8912903 - Flags: review?(till)
Attached patch fix-thread-raceSplinter Review
Short-lived threads created by previous patch exposed a pre-existing race condition on try.
Attachment #8913008 - Flags: review?(till)
Comment on attachment 8912900 [details] [diff] [review]
propagate-quit

Review of attachment 8912900 [details] [diff] [review]:
-----------------------------------------------------------------

Huh, interesting. I could've sworn I fixed this at some point.
Attachment #8912900 - Flags: review?(till) → review+
Comment on attachment 8912901 [details] [diff] [review]
rename-and-tweak-tasks

Review of attachment 8912901 [details] [diff] [review]:
-----------------------------------------------------------------

r=me with or without question addressed.

::: js/src/wasm/WasmJS.cpp
@@ +855,5 @@
>      return true;
>  }
>  
>  static bool
> +InitCompileArgs(JSContext* cx, SharedCompileArgs* compileArgs)

Couldn't this now just return SharedCompileArgs* instead of using an outparam?
Attachment #8912901 - Flags: review?(till) → review+
(In reply to Till Schneidereit [:till] from comment #29)
> Couldn't this now just return SharedCompileArgs* instead of using an
> outparam?

Yes, and will do, thanks!
Comment on attachment 8912903 [details] [diff] [review]
add-initial-shell-streaming

Review of attachment 8912903 [details] [diff] [review]:
-----------------------------------------------------------------

This is surprisingly clean. I really wish we had better support for internal Promise reactions that don't need JSFunctions as reactions, but even without that, this is all quite easy to follow.

::: js/src/js.msg
@@ +366,5 @@
>  MSG_DEF(JSMSG_USE_ASM_TYPE_OK,         1, JSEXN_WARN,    "Successfully compiled asm.js code ({0})")
>  
>  // wasm
>  MSG_DEF(JSMSG_WASM_COMPILE_ERROR,      1, JSEXN_WASMCOMPILEERROR, "{0}")
> +MSG_DEF(JSMSG_WASM_STREAM_ERROR,       0, JSEXN_WASMCOMPILEERROR, "streaming interrupted")

Can you change this to use a slightly more verbose error message? Ideally including that it's about wasm streaming.

::: js/src/jsapi.h
@@ +4950,5 @@
>  InitDispatchToEventLoop(JSContext* cx, DispatchToEventLoopCallback callback, void* closure);
>  
>  /**
> + * The ConsumeStreamCallback is called on the JSContext's owner thread, passing
> + * an StreamConsumer that wishes to consume the given host object as a stream of

nit: s/an/a/

@@ +4952,5 @@
>  /**
> + * The ConsumeStreamCallback is called on the JSContext's owner thread, passing
> + * an StreamConsumer that wishes to consume the given host object as a stream of
> + * bytes. On failure, the embedding must report the appropriate error on 'cx'.
> + * On success, the embedding shall call consumer->consumeChunk() repeatedly

"Shall" is somewhat unclear here. Do you mean "should", or "must"? Perhaps use either of those?

@@ +4968,5 @@
> +    virtual ~StreamConsumer() = default;
> +
> +  public:
> +    // Called by the embedding as each chunk of bytes becomes available.
> +    // If this function returns 'false', the stream shall drop all pointers to

Same comment about "shall" as above.

@@ +4973,5 @@
> +    // this StreamConsumer.
> +    virtual bool consumeChunk(const uint8_t* begin, size_t length) = 0;
> +
> +    // Called by the embedding when the stream is closed according to the
> +    // contract described below.

s/below/above/

::: js/src/shell/js.cpp
@@ +6958,5 @@
>  "  to the 'wasmEval' function together with the specified imports object."),
>  
> +    JS_FN_HELP("setBufferStreamParams", SetBufferStreamParams, 2, 0,
> +"setBufferStreamParams(delayMillis, chunkByteSize)",
> +"  Set the delay time (between calls to StreamConsumer::consumeChunk) and size\n"

s/size of chunk size/chunk size/

::: js/src/vm/TypedArrayObject.h
@@ +328,5 @@
>  
> +// In WebIDL terminology, a BufferSource is either an ArrayBuffer or a typed
> +// array view. In either case, extract the dataPointer/byteLength.
> +bool
> +IsBufferSource(JSObject* object, SharedMem<uint8_t*>* dataPointer, size_t* byteLength);

So this doesn't work for ArrayBufferView objects in WebIDL? If so, I'm a bit surprised, but ok. If it does work for ABVs, please support those here, too.

::: js/src/wasm/WasmJS.cpp
@@ +2171,5 @@
> +
> +    static bool onSourceFulfilled(JSContext* cx, UniquePtr<CompileStreamTask> task, HandleValue v) {
> +        if (!v.isObject()) {
> +            JS_ReportErrorNumberASCII(cx, GetErrorMessage, nullptr, JSMSG_WASM_BAD_RESP_ARG);
> +            return RejectWithPendingException(cx, task->promise(cx));

You can use RejectWithErrorNumber here, right?
Attachment #8912903 - Flags: review?(till) → review+
Comment on attachment 8913008 [details] [diff] [review]
fix-thread-race

Review of attachment 8913008 [details] [diff] [review]:
-----------------------------------------------------------------

r=me
Attachment #8913008 - Flags: review?(till) → review+
Attached patch browser-streaming (WIP) (obsolete) — Splinter Review
Here's a WIP that almost entirely works but if a Worker is shutdown while a fetch()'s Promise is still resolving, the Promise returned by compileStreaming() is leaked, so I need to figure out how that is supposed to work.
Depends on: 1405661
Attached patch browser-streaming (obsolete) — Splinter Review
This patch implements the ConsumeStreamCallback added to jsapi.h in comment 26 and adds a bunch of tests.  Ultimately, the approach seems straightforward; the only complication was listening for shutdown of windows and workers which required splitting out the (Window|Worker)StreamOwner classes.
Attachment #8914921 - Attachment is obsolete: true
Attachment #8915442 - Flags: review?(bkelly)
Attached patch make-proxy-release-cancelable (obsolete) — Splinter Review
This patch makes the runnables created by NS_ProxyRelease() cancelable.
Attachment #8915443 - Flags: review?(bkelly)
Comment on attachment 8915443 [details] [diff] [review]
make-proxy-release-cancelable

Review of attachment 8915443 [details] [diff] [review]:
-----------------------------------------------------------------

::: xpcom/threads/nsProxyRelease.h
@@ +31,4 @@
>  {
>  public:
>    ProxyReleaseEvent(const char* aName, already_AddRefed<T> aDoomed)
> +  : CancelableRunnable(aName), mDoomed(aDoomed.take()) {}

You need to override Cancel() to actually release the doomed values on the target thread.  The default Cancel() does nothing:

https://searchfox.org/mozilla-central/source/xpcom/threads/nsThreadUtils.cpp#77
Attachment #8915443 - Flags: review?(bkelly) → review-
Hah, I *swear* that's what I wrote, I must have qref'd in the wrong patch or something...
Attachment #8915443 - Attachment is obsolete: true
Attachment #8915583 - Flags: review?(bkelly)
Comment on attachment 8915583 [details] [diff] [review]
make-proxy-release-cancelable

Review of attachment 8915583 [details] [diff] [review]:
-----------------------------------------------------------------

::: xpcom/threads/nsProxyRelease.h
@@ +40,5 @@
>    }
>  
> +  NS_IMETHOD Cancel() override
> +  {
> +    NS_IF_RELEASE(mDoomed);

I think its slightly better to call Run() directly, but this works.
Attachment #8915583 - Flags: review?(bkelly) → review+
The beefy new many-streams-at-once tests I added in the other patch found a pre-existing deadlock: canStartPromiseHelperTask wasn't using checkTaskThreadLimit().
Attachment #8915721 - Flags: review?(lhansen)
Attached patch browser-streaming (obsolete) — Splinter Review
Try server found case where closing stream (in WorkerHolder::Notify()) was transitively deleting 'this' for valid reasons, so this patch adds RefPtr(this) to Notify() and Observe().
Attachment #8915442 - Attachment is obsolete: true
Attachment #8915442 - Flags: review?(bkelly)
Attachment #8915722 - Flags: review?(bkelly)
Comment on attachment 8915721 [details] [diff] [review]
fix-promise-helper-task-deadlock

Review of attachment 8915721 [details] [diff] [review]:
-----------------------------------------------------------------

Nice catch.  This would not have happened if we had had a framework to fit all these tasks into, as that would have forced us to think about resource allocation.  Just one more vote for a scheduler rewrite, I guess.
Attachment #8915721 - Flags: review?(lhansen) → review+
(In reply to Lars T Hansen [:lth] from comment #41)
Agreed.
Blocks: 1406421
Pushed by lwagner@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/d252581d930e
Propagate quit() out of drainJobQueue() (r=till)
https://hg.mozilla.org/integration/mozilla-inbound/rev/c504f7082a1f
Baldr: refactor promise compile tasks in prepration for streaming (r=till)
https://hg.mozilla.org/integration/mozilla-inbound/rev/f61947de473f
Baldr: fix race in POSIX Thread impl (r=till)
https://hg.mozilla.org/integration/mozilla-inbound/rev/6c386d38608a
Fix possible deadlock with PromiseHelperTasks (r=lth)
https://hg.mozilla.org/integration/mozilla-inbound/rev/fd99f3454c51
Make NS_ProxyRelease runnables cancelable (r=bkelly)
Comment on attachment 8915722 [details] [diff] [review]
browser-streaming

Review of attachment 8915722 [details] [diff] [review]:
-----------------------------------------------------------------

Some initial comments.  Sorry, I ran out of time today.  I'll try to finish this over the weekend or first thing Monday.

::: dom/fetch/FetchUtil.cpp
@@ +454,5 @@
> +
> +    // Check mConsumerAborted before NS_FAILED to avoid calling streamClosed()
> +    // if consumeChunk() returned false per JS API contract.
> +    uint32_t written;
> +    rv = aStream->ReadSegments(WriteSegment, this, available < 100 ? available : 100, &written);

Capping the buffered reads to 100 byte chunks seems like a de-opt to me.

@@ +512,5 @@
> +
> +  if (response->Type() != ResponseType::Basic &&
> +      response->Type() != ResponseType::Cors &&
> +      response->Type() != ResponseType::Default) {
> +    return ThrowException(aCx, JSMSG_BAD_RESPONSE_CORS_SAME_ORIGIN);

Yea, I think this message should be written in terms of Response.type.

@@ +538,5 @@
> +
> +  IgnoredErrorResult error;
> +  response->SetBodyUsed(aCx, error);
> +  if (NS_WARN_IF(error.Failed())) {
> +    return ThrowException(aCx, JSMSG_RESPONSE_CANNOT_BE_USED);

I don't think devs will understand this error message.  I think it would be better to say "there is an error consuming the Response" and abort if you can.

::: dom/workers/RuntimeService.cpp
@@ +819,5 @@
> +              JS::HandleObject aObj,
> +              JS::MimeType aMimeType,
> +              JS::StreamConsumer* aConsumer)
> +{
> +  WorkerPrivate* worker = GetWorkerPrivateFromContext(aCx);

GetWorkerPrivateFromContext() is fallible during worker shutdown.

if (!worker) {
  return false;
}

::: js/src/js.msg
@@ +636,5 @@
>  
>  // Response-related
>  MSG_DEF(JSMSG_BAD_RESPONSE_VALUE,                        0, JSEXN_TYPEERR,  "expected Response or Promise resolving to Response")
> +MSG_DEF(JSMSG_BAD_RESPONSE_MIME_TYPE,                    0, JSEXN_TYPEERR,  "Response has unsupported MIME type")
> +MSG_DEF(JSMSG_BAD_RESPONSE_CORS_SAME_ORIGIN,             0, JSEXN_TYPEERR,  "response not CORS-same-origin")

Could we rephrase this message in terms of what devs can inspect on the Response object?  For example, does this mean the Response.type is 'opaque'?

nit: Capitalize Response to reflect the interface name in the fetch API?

@@ +638,5 @@
>  MSG_DEF(JSMSG_BAD_RESPONSE_VALUE,                        0, JSEXN_TYPEERR,  "expected Response or Promise resolving to Response")
> +MSG_DEF(JSMSG_BAD_RESPONSE_MIME_TYPE,                    0, JSEXN_TYPEERR,  "Response has unsupported MIME type")
> +MSG_DEF(JSMSG_BAD_RESPONSE_CORS_SAME_ORIGIN,             0, JSEXN_TYPEERR,  "response not CORS-same-origin")
> +MSG_DEF(JSMSG_BAD_RESPONSE_STATUS,                       0, JSEXN_TYPEERR,  "response does not have ok status")
> +MSG_DEF(JSMSG_RESPONSE_ALREADY_USED,                     0, JSEXN_TYPEERR,  "response already used")

nit: Maybe s/used/consumed/ to distinguish between any use and reading the body.

@@ +639,5 @@
> +MSG_DEF(JSMSG_BAD_RESPONSE_MIME_TYPE,                    0, JSEXN_TYPEERR,  "Response has unsupported MIME type")
> +MSG_DEF(JSMSG_BAD_RESPONSE_CORS_SAME_ORIGIN,             0, JSEXN_TYPEERR,  "response not CORS-same-origin")
> +MSG_DEF(JSMSG_BAD_RESPONSE_STATUS,                       0, JSEXN_TYPEERR,  "response does not have ok status")
> +MSG_DEF(JSMSG_RESPONSE_ALREADY_USED,                     0, JSEXN_TYPEERR,  "response already used")
> +MSG_DEF(JSMSG_RESPONSE_CANNOT_BE_USED,                   0, JSEXN_TYPEERR,  "response cannot be used")

What does this mean?
(In reply to Ben Kelly [:bkelly] from comment #45)
> > +    rv = aStream->ReadSegments(WriteSegment, this, available < 100 ? available : 100, &written);
> 
> Capping the buffered reads to 100 byte chunks seems like a de-opt to me.

Oh dear, that was a temporary modification I was using to stress-test, I didn't mean to include it in the posted patch.
Attached patch browser-streaming (obsolete) — Splinter Review
Thanks!
Attachment #8915722 - Attachment is obsolete: true
Attachment #8915722 - Flags: review?(bkelly)
Attachment #8916176 - Flags: review?(bkelly)
Comment on attachment 8915722 [details] [diff] [review]
browser-streaming

Review of attachment 8915722 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/js.msg
@@ +635,5 @@
>  MSG_DEF(JSMSG_STREAM_INVALID_HIGHWATERMARK,             0, JSEXN_RANGEERR, "'highWaterMark' must be a non-negative, non-NaN number.")
>  
>  // Response-related
>  MSG_DEF(JSMSG_BAD_RESPONSE_VALUE,                        0, JSEXN_TYPEERR,  "expected Response or Promise resolving to Response")
> +MSG_DEF(JSMSG_BAD_RESPONSE_MIME_TYPE,                    0, JSEXN_TYPEERR,  "Response has unsupported MIME type")

When changing js.msg, please consider adding documentation for them on MDN:
  https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Errors

Ideally before they appear on:
  https://nbp.github.io/arewedocumentedyet/

You should be able to clone one of the page as a source of inspiration, otherwise ask me on irc.
Comment on attachment 8916176 [details] [diff] [review]
browser-streaming

Review of attachment 8916176 [details] [diff] [review]:
-----------------------------------------------------------------

This looks mostly good, but has a few issues to address.  In particular I feel I need to r- for:

1. I believe this can currently perform JSAPI work on the wrong thread in OnInputStreamReady().
2. I'd like to see if we can avoid the extra manual AddRef/Release on JSStreamConsumer.

I didn't review the tests.

::: dom/fetch/FetchUtil.cpp
@@ +166,5 @@
>  
> +class StreamOwner : public nsISupports
> +{
> +protected:
> +  nsCOMPtr<nsIAsyncInputStream> mStream;

Maybe add a comment that this is set and cleared from the owning thread, but can be read from any thread.  The life cycle of the object should ensure that its not read/cleared simultaneously.

@@ +190,5 @@
> +  nsCOMPtr<nsIGlobalObject> mGlobal;
> +
> +  ~WindowStreamOwner() override
> +  {
> +    MOZ_ASSERT(NS_IsMainThread());

This should remove itself from the ObserverService on destruction.  While its using a weak ref the observer service still has a list entry for it.  Its better for perf to remove these on destruction instead of letting them build up and require additional work to clean later.

@@ +214,5 @@
> +    }
> +
> +    RefPtr<WindowStreamOwner> self = new WindowStreamOwner(aStream, aGlobal);
> +
> +    nsresult rv = os->AddObserver(self, DOM_WINDOW_DESTROYED_TOPIC, true);

Please add a comment noting that this is using a weak ref.

@@ +261,5 @@
> +{
> +  ~WorkerStreamOwner() = default;
> +
> +public:
> +  NS_DECL_THREADSAFE_ISUPPORTS

Its a bit dangerous to make the Owner classes use threadsafe ref-counting.  While you proxy release them on the right thread it opens the risk that something on the wrong thread will grab a ref-count and end up holding it alive past the proxy release.

I think you should be able to just make these NS_DECL_ISUPPORTS instead here.  You create them on the right thread and only .forget() them on the other thread.

A more explicit option would be to not make these implement ISUPPORTS or use ref-counting at all.  Instead just use a UniquePtr<> that you Move() around.  You would have to do your proxy release runnable, though.  This is more work to write, of course.

@@ +319,5 @@
> +   , mConsumer(aConsumer)
> +   , mConsumerAborted(false)
> +  {
> +    MOZ_ASSERT(mStreamOwner);
> +    MOZ_ASSERT(mConsumer);

Please make these and other cheap assertions in this patch MOZ_DIAGNOSTIC_ASSERT().

@@ +327,5 @@
> +  {
> +    // Both WindowStreamOwner and WorkerStreamOwner need to be destroyed on
> +    // their global's event target thread.
> +    NS_ProxyRelease("~JSStreamConsumer", mOwningEventTarget,
> +                    mStreamOwner.forget());

Actually, can you mark JSStreamConsumer as NS_DECL_ISUPPORTS without threadsafe ref-counting at all?  If you pass the correct event target to AsyncWait() I think the streams should never ref-count you on another thread.

@@ +378,5 @@
> +    if (NS_WARN_IF(NS_FAILED(rv))) {
> +      return false;
> +    }
> +
> +    // Use a pipe to create an nsIAsyncInputStream if we don't already have one.

Andrea is on the verge of adding a NS_MakeAsyncNonBlockingInputStream() helper in bug 1371699.  This has an added advantage of avoiding pipe creation in some places where its not necessary, like fixed length string streams.

If you don't want to wait for that, maybe file a follow-up to adjust this code after he lands it.

@@ +414,5 @@
> +    if (!streamConsumer) {
> +      return false;
> +    }
> +
> +    rv = asyncStream->AsyncWait(streamConsumer, 0, 0, nullptr);

Please be aware that this can synchronously trigger a read and stream completion.  The streamConsumer may be dead after this AsyncWait() call.

Also, it would be nice to add a comment about the ref-cycle in place here:

  stream -> consumer -> owner -> stream

And that its broken when the stream consumption completes or errors out.

Also, you really need to pass mOwningEventTarget here!  If this is a pipe being populated on another thread then passing nullptr here allows pipe to call OnInputStreamReady() on that IO thread.  Your code then does JSAPI stuff from OnInputStreamReady() which is definitely not safe.

@@ +419,5 @@
> +    if (NS_WARN_IF(NS_FAILED(rv))) {
> +      return false;
> +    }
> +
> +    // This ref-count will dropped by Finish().

nit: "will be dropped"

Also, why is this necessary?  The AsyncWait() call means the stream is holding a ref to the stream consumer.  This ref should be held across the OnInputStreamReady() and re-newed if you end up calling AsyncWait again.

I just worry about leaks or other issues from these kinds of manual AddRef/Release schemes.

For example, if the AsyncWait() above triggers an immediate completion then you will get Finish() and Release() called synchronously.  This will make the streamConsumer dead even though this code looks like it has a strong ref to.

I really think it would be safer if you could drop the manual Release() in Finish() and just make this:

  return asyncStream->AsyncWait(streamConsumer, 0, 0, nullptr);

As the final line.

@@ +436,5 @@
> +    MOZ_ASSERT(!mConsumerAborted);
> +
> +    nsresult rv;
> +
> +    uint64_t available;

nit: initialize to 0 or something

@@ +453,5 @@
> +    }
> +
> +    // Check mConsumerAborted before NS_FAILED to avoid calling streamClosed()
> +    // if consumeChunk() returned false per JS API contract.
> +    uint32_t written;

nit: initialize to zero

@@ +462,5 @@
> +    if (NS_WARN_IF(NS_FAILED(rv))) {
> +      return OnError();
> +    }
> +
> +    rv = aStream->AsyncWait(this, 0, 0, nullptr);

You must pass mOwningEventTarget here.

@@ +465,5 @@
> +
> +    rv = aStream->AsyncWait(this, 0, 0, nullptr);
> +    if (NS_WARN_IF(NS_FAILED(rv))) {
> +      mConsumer->streamClosed(JS::StreamConsumer::Error);
> +      return Finish();

It looks like this should be return OnError().

::: dom/workers/RuntimeService.cpp
@@ +585,5 @@
>  {
>    WorkerPrivate* worker = GetWorkerPrivateFromContext(aCx);
> +  if (!worker) {
> +    JS_ReportErrorNumberASCII(aCx, js::GetErrorMessage, nullptr,
> +                              JSMSG_ERROR_CONSUMING_RESPONSE);

Did you mean to put this in ConsumeStream()?.

The message works.  The NS_ERROR_DOM_INVALID_STATE_ERR message would also seem appropriate.

@@ +823,5 @@
> +              JS::HandleObject aObj,
> +              JS::MimeType aMimeType,
> +              JS::StreamConsumer* aConsumer)
> +{
> +  WorkerPrivate* worker = GetWorkerPrivateFromContext(aCx);

Need a runtime check for worker existence here.
Attachment #8916176 - Flags: review?(bkelly) → review-
Blocks: 1407084
(In reply to Ben Kelly [:bkelly] from comment #49)
Thanks!  Great review comments.  All addressed except where commented below; patch in a sec.

> This should remove itself from the ObserverService on destruction.  While
> its using a weak ref the observer service still has a list entry for it. 
> Its better for perf to remove these on destruction instead of letting them
> build up and require additional work to clean later.

As a side note, what an awful API that you get a silent performance fault if you rely on the dtor to do what it is usually supposed to do.  It seems like this wouldn't be the case for a good observer service implementation that lazily removed dead elements during iteration, std::remove-style.

> A more explicit option would be to not make these implement ISUPPORTS or use
> ref-counting at all.  Instead just use a UniquePtr<> that you Move() around.
> You would have to do your proxy release runnable, though.  This is more work
> to write, of course.

I totally agree and had the same inclination.  The problem is that JSStreamConsumer wants to store the StreamOwner base class (polymorphically) and WindowStreamOwner does need to be ref-counted (b/c nsIWeakReference).  It's easy, just a bit messier, to drop the polymorphism, and I thought this might be overkill, but if you had the same feeling, I'll do it; I like lifetimes more rigidly defined.

> > +    NS_ProxyRelease("~JSStreamConsumer", mOwningEventTarget,
> > +                    mStreamOwner.forget());
> 
> Actually, can you mark JSStreamConsumer as NS_DECL_ISUPPORTS without
> threadsafe ref-counting at all?  If you pass the correct event target to
> AsyncWait() I think the streams should never ref-count you on another thread.

I considered that, but then it seems like every chunk's delivery serializes through the main thread and could end up blocking on JS, which I'd like to avoid since load-time is often a time for janky expensive main-thread stuff.

> Please be aware that this can synchronously trigger a read and stream
> completion.  The streamConsumer may be dead after this AsyncWait() call.

Indeed, I have a final "Note:" in the comment for ConsumeStreamCallback in jsapi.h about this.

> Also, you really need to pass mOwningEventTarget here!  If this is a pipe
> being populated on another thread then passing nullptr here allows pipe to
> call OnInputStreamReady() on that IO thread.  Your code then does JSAPI
> stuff from OnInputStreamReady() which is definitely not safe.

Not just any JS API code, though: only the two callbacks that are specifically commented (in jsapi.h) to be callable from any thread.  I designed this JS::StreamConsumer around streaming not involving the main thread (b/c of above).

> Also, why is this necessary?  The AsyncWait() call means the stream is
> holding a ref to the stream consumer.  This ref should be held across the
> OnInputStreamReady() and re-newed if you end up calling AsyncWait again.

Good point!  This entered at an earlier iteration of the patch and I didn't realize later it was redundant.
Attachment #8916176 - Attachment is obsolete: true
Attachment #8916834 - Flags: review?(bkelly)
Priority: -- → P3
Comment on attachment 8916834 [details] [diff] [review]
browser-streaming

Review of attachment 8916834 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/fetch/FetchUtil.cpp
@@ +174,5 @@
> +  nsCOMPtr<nsIGlobalObject> mGlobal;
> +
> +  ~WindowStreamOwner()
> +  {
> +    MOZ_DIAGNOSTIC_ASSERT(NS_IsMainThread());

Oh, I think NS_IsMainThread() might be a bit too heavy for diagnostic.  It uses a TLS variable.  I usually leave these as MOZ_ASSERT().  Sorry for the confusion.

@@ +328,5 @@
> +class JSStreamConsumer final : public nsIInputStreamCallback
> +{
> +  nsCOMPtr<nsIEventTarget> mOwningEventTarget;
> +  RefPtr<WindowStreamOwner> mWindowStreamOwner;
> +  UniquePtr<WorkerStreamOwner> mWorkerStreamOwner;

Sorry, I was ok with your explanation of why you wanted nsISupports for the owners.  You could have left them both that way with non-threadsafe ref-counting.

Now I feel compelled to suggest a Variant<> for holding the two different types of owners here.

In the end, though, this is fine.

@@ +380,5 @@
> +                               uint32_t aToOffset,
> +                               uint32_t aCount,
> +                               uint32_t* aWriteCount)
> +  {
> +    JSStreamConsumer* self = reinterpret_cast<JSStreamConsumer*>(aClosure);

Please add a comment that this may be called on any thread and its explicitly supported by JSAPI.

@@ +461,5 @@
> +
> +  NS_IMETHOD
> +  OnInputStreamReady(nsIAsyncInputStream* aStream) override
> +  {
> +    MOZ_DIAGNOSTIC_ASSERT(!mConsumerAborted);

Please add a comment like:

// Can be called on any stream.

@@ +493,5 @@
> +      mConsumer->streamClosed(JS::StreamConsumer::Error);
> +      return NS_OK;
> +    }
> +
> +    rv = aStream->AsyncWait(this, 0, 0, nullptr);

Ok, so I see why you don't want to force on-thread reading now.  This, however, does not actually guarantee you will be off thread.  You can still end up copying data on the current thread this way.  Is this what you intend?

Or do you really want to pass the StreamTransportService as the target for AsyncWait()?

Whats more, if you really support off thread reading and intend to do so, I'm not sure you need to require an async non-blocking stream.  You could perform this logic if thats the case, but otherwise perform blocking reads on STS.  That would in theory avoid copying to a pipe just so you can copy into your JS memory.

I would structure that logic like:

  // if async stream

    // trigger AsyncWait() like you have here

  // Otherwise do this on STS or another IO thread:

    // Check if ReadSegments is supported.  If not create a nsBufferedInputStream wrapper.

    // Perform blocking ReadSegments until the stream is consumed

All of this can be done in a follow-up, though.  Its an optimization.
Attachment #8916834 - Flags: review?(bkelly) → review+
(In reply to Ben Kelly [:bkelly] from comment #52)
Thanks!

> Now I feel compelled to suggest a Variant<> for holding the two different
> types of owners here.
> 
> In the end, though, this is fine.

Yeah, I was wondering about a Variant, but it feels to me to add more verbosity than it's worth in this simple situation.

> All of this can be done in a follow-up, though.  Its an optimization.

Cool, IIUC, commented in bug 1407084 as alternative to optimizing the pipe.
Pushed by lwagner@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/b9053d53c1ca
Baldr: shell WebAssembly.compileStreaming and instantiateStreaming (r=till)
https://hg.mozilla.org/integration/mozilla-inbound/rev/2db8ced6f6cc
Baldr: implement ConsumeStreamCallback in browser (r=bkelly)
Keywords: leave-open
No longer depends on: 1445973
Regressions: 1877586
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: