Closed Bug 1134073 Opened 9 years ago Closed 8 years ago

Show cause of a network request in netmonitor

Categories

(DevTools :: Netmonitor, defect)

defect
Not set
normal

Tracking

(firefox49 verified, relnote-firefox 49+, firefox50 verified)

VERIFIED FIXED
Firefox 50
Tracking Status
firefox49 --- verified
relnote-firefox --- 49+
firefox50 --- verified

People

(Reporter: bgrins, Assigned: jsnajdr, Mentored)

References

(Depends on 2 open bugs, Blocks 1 open bug)

Details

(Keywords: dev-doc-complete, Whiteboard: [lang=js])

Attachments

(6 files, 30 obsolete files)

40.69 KB, image/png
Details
29.22 KB, patch
ochameau
: review+
Details | Diff | Splinter Review
599.64 KB, image/png
Details
312.18 KB, image/png
Details
21.67 KB, patch
jsnajdr
: review+
Details | Diff | Splinter Review
25.66 KB, patch
jsnajdr
: review+
Details | Diff | Splinter Review
We should add a column with the reason for a network request happened (XHR, <img>, <script>, form submit, etc).

If it was triggered by JS we may be able to show additional details like a stack trace to help figure out why a request happened.
Implementation notes:

- the NetworkMonitor (network-monitor.js) should record and transmit the request types defined here:
https://dxr.mozilla.org/mozilla-central/source/dom/base/nsIContentPolicyBase.idl
- we would need to maintain a (localized) number-to-string lookup table in the frontend for these types, but transfer the number in the protocol packet
- NM__matchRequest already has code that checks the right field for TYPE_BEACON
Mentor: past
Whiteboard: [lang=js]
Isn't this a little redundant with the "Type" column ?
Well I thought about it and I came to this conclusion: the current "Type" column is for the type of the resource, and the new column would be for the type of the request (or why it happened). Am I wrong ?
Assignee: nobody → anush
Mentor: jaws
Status: NEW → ASSIGNED
Jaws im working on this, sorry it has been a little long. I just have one question what shall I name the new column "Reason" or any other...?
Flags: needinfo?(jaws)
I'll let bgrins answer that :)
Flags: needinfo?(jaws) → needinfo?(bgrinstead)
In the meantime though, you can use "Reason". We can always change it later.
(In reply to Jared Wein [:jaws] (please needinfo? me) from comment #6)
> In the meantime though, you can use "Reason". We can always change it later.

Agreed - just ship it and we can discus the right label as an aside.

bgrins: I like 'reason', 'cause' maybe. I really dislike chrome's 'initiator':

http://note.io/1aLm1Me

Whatever we do here, I think it should be consistent with other language we might use in performance or other tools for similar reasons.
(In reply to Anush [:bmx] from comment #4)
> Jaws im working on this, sorry it has been a little long. I just have one
> question what shall I name the new column "Reason" or any other...?

"Reason" sounds great.  We may move to "Cause" before landing but feel free to get reviews started with either one.

(In reply to Jeff Griffiths (:canuckistani) from comment #7)
> Whatever we do here, I think it should be consistent with other language we
> might use in performance or other tools for similar reasons.

Sure, it looks like "Cause" is currently being used in the patch in Bug 1050500 for timeline markers.  We may want to stick with the same name for the two tools if the features are sufficiently similar.
Flags: needinfo?(bgrinstead)
See Also: → 1050500
Anush, any update here?
Flags: needinfo?(anush)
Sorry I just got a little Lazy, Im on it....I managed to get the tab I'm working to capture the requests in backed (network-monitor.js) looks like i need some help.
Flags: needinfo?(anush)
Anush, any updates? You can paste a work-in-patch if you have something working but don't have time to polish.
Flags: needinfo?(anush)
Gabriel I couldn't get to know how the network-monitor works, got busy with university exams last month and will be for a few more days.

I just created new tab in template do you want that.?
Flags: needinfo?(anush)
Added a WIP patch that does the following:
- add a new 'cause' field to the NetworkEvent actor and protocol packet
- display the field in a 'Cause' column in the network monitor

What needs to be done:

1. Add localization for the 'Cause' column label (I don't know how this process works).

2. Decide if we like the string names of the various causes (document, subdocument, img, xhr, beacon, ...) and if we want to localize them (my personal opinion is that localization is near to useless here).

3. Use symbolic names for constants in the LOAD_CAUSE_STRINGS map - is it OK to import from Ci module in netmonitor-view.js? I know these modules are discouraged for extensions authors, not sure what the devtools policy is. By the way, very similar code is already in toolkit/modules/addons/WebRequestCommon.jsm

4. Write new tests for this feature and make sure that existing ones are passing.

5. It would be nice if the new feature was documented somehow - providing the user with some detailed information about what the different 'causes' are. For example, I didn't know that things like 'ping' or 'beacon' exist before I started working on this bug.

6. Should the 'Cause' really be a column in the request list? As the columns are not resizable and can't be added/removed, the new columns occupies precious screen real-estate. Maybe the request detail pane would be a better place for it.

7. It would be very nice to include a sourceURL and sourceLine fields pointing to the exact location from where the request was triggered. The sources include CSS, HTML, JS (maybe stack trace would be good here, too) and maybe other places. This would probably require further extension of the NetworkEventActor and the corresponding protocol packets.

8. The NetworkEvent's 'isXHR' field is now redundant, as it's value can be always computed from the 'cause' value. What about removing it? It would be a breaking change in the debugger protocol. I don't know how much public the protocol is and what is the policy for making changes - wouldn't it break Firebug, for example?
Flags: needinfo?(jaws)
Assignee: anush → jsnajdr
Comment on attachment 8724688 [details] [diff] [review]
Show cause of a network request in netmonitor. r=jaws

Thanks for the patch. Honza is in a better position to review this patch. I'll let him take it from here.
Flags: needinfo?(jaws)
Attachment #8724688 - Flags: feedback?(odvarko)
Mentor: past → odvarko
Attachment #8725175 - Flags: feedback?(odvarko)
Comment on attachment 8725175 [details] [diff] [review]
Show cause of a network request in netmonitor. r=Honza

Added a new version of the patch:
- use constants from Ci.nsIContentPolicy to convert ints to strings
- added a mochitest for the feature
(In reply to Jarda Snajdr [:jsnajdr] from comment #14)
> Added a WIP patch that does the following:
> - add a new 'cause' field to the NetworkEvent actor and protocol packet
> - display the field in a 'Cause' column in the network monitor
Thanks for the patch Jardo!

I like that feature in general and I think we should ship it.

> What needs to be done:
> 
> 1. Add localization for the 'Cause' column label (I don't know how this
> process works).
The column label is properly defined in netmonitor.dtd file so, 
we are good here (someone in the l10n team will take care of it).

> 2. Decide if we like the string names of the various causes (document,
> subdocument, img, xhr, beacon, ...) and if we want to localize them (my
> personal opinion is that localization is near to useless here).
I definitely agree.

> 3. Use symbolic names for constants in the LOAD_CAUSE_STRINGS map - is it OK
> to import from Ci module in netmonitor-view.js? I know these modules are
> discouraged for extensions authors, not sure what the devtools policy is. By
> the way, very similar code is already in
> toolkit/modules/addons/WebRequestCommon.jsm
Looks ok to me.


> 4. Write new tests for this feature and make sure that existing ones are
> passing.
Push to try:
treeherder.mozilla.org/#/jobs?repo=try&revision=3f00a361eb95

let's see how tests are working.


> 5. It would be nice if the new feature was documented somehow - providing
> the user with some detailed information about what the different 'causes'
> are. For example, I didn't know that things like 'ping' or 'beacon' exist
> before I started working on this bug.
Yes, flag doc-needed set.

> 6. Should the 'Cause' really be a column in the request list? As the columns
> are not resizable and can't be added/removed, the new columns occupies
> precious screen real-estate. Maybe the request detail pane would be a better
> place for it.
I like having a new column and perhaps it can help to increase priority
of a 'column picker' feature that would allow the user to show/hide columns.
I might look at this, since I did it in Firebug the other day.
As soon as this is done we might discuss what columns should be visible
by default.

Helen what do you think?

> 7. It would be very nice to include a sourceURL and sourceLine fields
> pointing to the exact location from where the request was triggered. The
> sources include CSS, HTML, JS (maybe stack trace would be good here, too)
> and maybe other places. This would probably require further extension of the
> NetworkEventActor and the corresponding protocol packets.
Sounds like a nice follow up to me. 

> 8. The NetworkEvent's 'isXHR' field is now redundant, as it's value can be
> always computed from the 'cause' value. What about removing it? It would be
> a breaking change in the debugger protocol. I don't know how much public the
> protocol is and what is the policy for making changes - wouldn't it break
> Firebug, for example?
Firebug should be fine. But, not sure who's responsible for RDP back comp.
Patrick who should we ask?

Honza
Flags: needinfo?(pbrosset)
Flags: needinfo?(hholmes)
Keywords: user-doc-needed
Attachment #8725175 - Flags: feedback?(odvarko) → feedback+
Comment on attachment 8724688 [details] [diff] [review]
Show cause of a network request in netmonitor. r=jaws

Making the previous patch obsolete.

Honza
Attachment #8724688 - Attachment is obsolete: true
Attachment #8724688 - Flags: feedback?(odvarko)
(In reply to Jan Honza Odvarko [:Honza] from comment #18)
> > 8. The NetworkEvent's 'isXHR' field is now redundant, as it's value can be
> > always computed from the 'cause' value. What about removing it? It would be
> > a breaking change in the debugger protocol. I don't know how much public the
> > protocol is and what is the policy for making changes - wouldn't it break
> > Firebug, for example?
> Firebug should be fine. But, not sure who's responsible for RDP back comp.
> Patrick who should we ask?
Until not long ago, we very much cared about one aspect of our protocol compatibility: that connecting to older debugger servers would still work.
So, in the present case, a network monitor UI would have to both know how to handle communicating with a server that sends the isXHR flag, and one that doesn't (but has the cause instead).
This has changed when we moved FirefoxOS to tier 3 support.
Having said this, we still want to support Firefox for Android up to release. So, if you open the network monitor UI with a connection to an Android phone, this should still work.
See this thread about backward compatibility: https://groups.google.com/forum/#!searchin/mozilla.dev.developer-tools/Tier$203$20support/mozilla.dev.developer-tools/RiKZhVbgn8o/TpExfI9VAwAJ
And: https://wiki.mozilla.org/DevTools/Backwards_Compatibility
Flags: needinfo?(pbrosset)
(In reply to Jan Honza Odvarko [:Honza] from comment #18)
> > 6. Should the 'Cause' really be a column in the request list? As the columns
> > are not resizable and can't be added/removed, the new columns occupies
> > precious screen real-estate. Maybe the request detail pane would be a better
> > place for it.
> I like having a new column and perhaps it can help to increase priority
> of a 'column picker' feature that would allow the user to show/hide columns.
> I might look at this, since I did it in Firebug the other day.
> As soon as this is done we might discuss what columns should be visible
> by default.
> 
> Helen what do you think?

Having it as a column makes sense to me (I'm not a huge fan of our sliding panels, so I prefer an additional column). I agree that a column picker is worth looking into, although that's not within the scope of this bug.
Flags: needinfo?(hholmes)
(In reply to Helen V. Holmes (:helenvholmes) (:✨)(Please needinfo) from comment #22)
> Having it as a column makes sense to me (I'm not a huge fan of our sliding
> panels, so I prefer an additional column). I agree that a column picker is
> worth looking into, although that's not within the scope of this bug.
Great, here is a bug for the column picker: Bug 1252858

Honza
(In reply to Patrick Brosset [:pbrosset] [:pbro] from comment #21)
> (In reply to Jan Honza Odvarko [:Honza] from comment #18)
> > > 8. The NetworkEvent's 'isXHR' field is now redundant, as it's value can be
> > > always computed from the 'cause' value. What about removing it? It would be
> > > a breaking change in the debugger protocol. I don't know how much public the
> > > protocol is and what is the policy for making changes - wouldn't it break
> > > Firebug, for example?
> > Firebug should be fine. But, not sure who's responsible for RDP back comp.
> > Patrick who should we ask?
> Until not long ago, we very much cared about one aspect of our protocol
> compatibility: that connecting to older debugger servers would still work.
> So, in the present case, a network monitor UI would have to both know how to
> handle communicating with a server that sends the isXHR flag, and one that
> doesn't (but has the cause instead).
> This has changed when we moved FirefoxOS to tier 3 support.
> Having said this, we still want to support Firefox for Android up to
> release. So, if you open the network monitor UI with a connection to an
> Android phone, this should still work.
> See this thread about backward compatibility:
> https://groups.google.com/forum/#!searchin/mozilla.dev.developer-tools/
> Tier$203$20support/mozilla.dev.developer-tools/RiKZhVbgn8o/TpExfI9VAwAJ
> And: https://wiki.mozilla.org/DevTools/Backwards_Compatibility

If anyone's unsure if compatibility is being done correctly, please flag me on the patch.  I am happy to take a look to double check.
New try push here:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=a4437075d005

@Jarda Snajdr: Here are some tips for pushing to try.

* MDN page: https://wiki.mozilla.org/ReleaseEngineering/TryServer
* Try Chooser: https://wiki.mozilla.org/Build:TryChooser
* My typical try-commit message for devtools: "try: -b do -p linux64,macosx64,win32 -u xpcshell,mochitest-dt,mochitest-o,mochitest-e10s-devtools-chrome -t none"
* The push command: hg push -f ssh://jodvarko@mozilla.com@hg.mozilla.org/try/

Honza
One more comment to this feature.

My test page has an image in CSS file. Somethings like as follows:

.myRule {
  border-image: url(image.png) 1 1 1 1 repeat;
}

The HTTP request for the image.png displayed in the Net panel says cause: 'img'. But, what I would really expect here (as a web developer) is 'stylesheet'. Since it was the CSS rule in my stylesheet that caused the request to happen. I know the patch is just consuming platform API so, there is not much we (devtools) can do, but if we agree with such behavior we might want to ask the platform folks to improve the API.

Honza
Attached patch Show cause of a network request in netmonitor. (obsolete) — — Splinter Review
Attachment #8726691 - Flags: review?(odvarko)
Attachment #8725175 - Attachment is obsolete: true
Uploaded a new version of the patch with one improvement - retrieve also the URI of the document that originated the request and show it as a tooltip of the 'cause' column value.

These values are exactly the same ones that are passed to nsIContentPolicy::shouldLoad as the aContentType and aRequestOrigin arguments. It's a public API commonly used by content-blocking extensions, so changing and extending it (to report a different TYPE_ for images loaded from CSS, for example) can be quite a big deal. It's probably not possible to get more information about the requests from the current platform API.

What's not very nice is that the requestOrigin URI is always the URI of the loading document (top-level or iframe) instead of being the URL of the stylesheet loading an image, for example.

I decided not to remove the isXHR field from the protocol packet - it requires nontrivial effort for very little benefit, and would require update of MDN pages like this one: https://developer.mozilla.org/en-US/docs/Tools/Web_Console/remoting

The feature can't be significantly improved without some larger update to the platform code. Honza, please decide if it's worth shipping in the current state.
Comment on attachment 8726691 [details] [diff] [review]
Show cause of a network request in netmonitor.

(In reply to Jarda Snajdr [:jsnajdr] from comment #28)
> The feature can't be significantly improved without some larger update to
> the platform code. Honza, please decide if it's worth shipping in the
> current state.

I've been playing with this a bit more and I'll also attach a screenshot that compares the 'Cause' and 'Type' columns. In most cases these columns are mostly showing the same info (slightly different wording). For example: cause for 'css' is 'stylesheet', cause for 'js' is 'script', 'html' -> 'subdocument', 'png' -> 'img', etc. The additional value is very low.

Since the requestOrigin URI is also not adding much (if there was a line number pointing to the source that caused the traffic, that would be something! ... and Chrome does that), so I am voting for not landing this feature. 

But, this was great prototyping! Trying to build a new feature on top of platform API is always great and it's nice to see such effort. I'd recommend to file a platform bug with explanation why the API are not much useful for devtools. Such feedback could help to get better platform support in the future.

To be sure, Victor what do you think? (you might want to read from comment #26

Honza
Flags: needinfo?(vporof)
Attachment #8726691 - Flags: review?(odvarko) → review-
There are actually two different meanings of the "file/line that caused the traffic".

If a page loads a stylesheet.css file that contains one rule:

.highlighted { background-image: url(highlight.png); }

then the event of loading 'highlight.png' can have two sources:

1. The obvious one: stylesheet.css:1 - the place where the image file is referenced.

2. Because the 'highlight.png' is lazy-loaded at the time when the '.highlighted' rule is first used, the source can be the JS code that adds the 'highlighted' class to an element for the first time. This is what Chrome network inspector shows in their 'Initiator' column.

Source #2 could be probably extracted without a new platform API - by looking at the components.stack.caller object. The ConsoleReflowListener code already uses that to report places that trigger CSS reflows.

About source #1, I don't know. The nsIContentPolicy::shouldLoad API gets an aContext parameter that is the DOM node that triggered the request. Maybe something could be extracted from that. But I'm afraid that it's the top-level document/tab element, not the actual <script>, <link> or <img> one.

I'll explore these two ideas and if nothing else, it will make me more qualified to file a relevant platform bug.
I think there's a few cases where the distinction can be useful, e.g. when a script is requesting js, or a xhr is requesting json. In those cases there's quite a few ways in which such actions can be performed. It would be infinitely more useful if these would link to the actual request site (e.g. the source/line in which that xhr was performed, for instance). I personally wouldn't find that much value in seeing these columns as they are, but I haven't been doing actual web development in quite a long time.
Flags: needinfo?(vporof)
(In reply to Jarda Snajdr [:jsnajdr] from comment #31)
> I'll explore these two ideas and if nothing else, it will make me more
> qualified to file a relevant platform bug.
Alright. Please let us know what you've found and whether there is room for any improvement (using the current platform API) and we'll re-evaluate. Victor is voting for it so, we might want to land it.

Honza
(In reply to Victor Porof [:vporof][:vp] from comment #32)

>  It would be
> infinitely more useful if these would link to the actual request site (e.g.
> the source/line in which that xhr was performed, for instance). 

Adding stacks to XHR is bug 9062329.
That one is waiting for the network view refactor.
(In reply to Tom Tromey :tromey from comment #34)
> Adding stacks to XHR is bug 9062329.

It's actually bug 906239.
Attachment #8726691 - Attachment is obsolete: true
New WIP version of the 'show request cause' feature that also shows the stack trace, if available. It appears as a popup bubble over the 'cause' column (very similar to Chrome's Initiator column).

Part 1:
- C++ patch to fire http-on-opening-request also in the content process
- copied without change from tromey's patch in bug 906239

Part 2:
- collect the information about request's stacktrace and type on the server
- based on tromey's patch in bug 906239, merged with my 'request cause' patch back from March

Part 3:
- Tooltip patch needed for the UI - don't hide the tooltip popup when mouse is over it - allows it to be interactive
- I want to eventually land this as a separate bug that improves the behavior of tooltips

Part 4:
- frontend UI: new "Cause" column in netmonitor, shows stacktrace in a popup tooltip on mouseover

Part 5:
- mochitest for the cause column, doesn't include the new stacktrace stuff

Things to do before this can be put up for review and landed:

Backend: There is a serious bug if there are multiple tabs open. The http-on-opening-request observer is observing ALL requests from all tabs, and sending them to all network monitor actors, even for other tabs. Just open multiple tabs and watch the console.logs: many requests are pushed to the this.stacktraces queue, but they are not always shifted.

Frontend: Properly design the UI and finalize it:
- is a popup bubble the right place to show the stack trace?
- the stack should be also displayed in the HTTP request details in the console
- how to properly display the stack and the links to source? At various places, it's done slightly differently. There is console.trace, event handler viewer in inspector, and probably also others
Flags: needinfo?(odvarko)
Depends on: 1269765
I made the backend code work better than before, in both e10s and non-e10s. The whole approach is still very problematic - it's fragile and can easily break. As a next step, I'll focus on the client process notifications in bug 1269765.

To make the code cleaner, I did the following:
- extracted the stack-collecting code to a separate class StackTraceCollector
- extracted the matchRequest function out of the NetworkMonitor class - it's now used by the StackTraceCollector, too

How it works?

In e10s:
- the child's NetworkMonitorChild has a StackTraceCollector that listens on the "http-on-opening-request" and collects a stack trace
- the parent's NetworkMonitor creates a new network event, either on nsIHttpActivityObserver's ACTIVITY_SUBTYPE_REQUEST_HEADER event, or on the "http-on-examine-cached-response" event, in case when the response is from cache
- the parent sends the event through NetworkEventActorProxy to the child
- the child has a stack trace for this request in its queue and attaches it to the network event

In non-e10s:
- there is no parent and child, both the stack collection and network event creation happens in the same NetworkMonitor object

In both cases, we rely on the precise ordering of the events:
1. the "http-on-opening-request" in the child must always fire before the message about the new event arrives from the parent
2. all the events always fire in the same order - they must never be swapped
3. the "http-on-opening-request" fires if and only if the ACTIVITY_SUBTYPE_REQUEST_HEADER/http-on-examine-cached-response in the parent fires, too. I must never happen that one of these event doesn't fire

Assuption #3 doesn't always hold. I encountered problems when the toolbox is being opened in the middle of network activity. Then some events are lost. Also, the "http-on-opening-request" doesn't always fire in case there is going to be a cached response.

Another problem is that the matchRequest function should have exactly the same results in the parent and in the child. It's called for the same request in both processes. However, in parent it tests against topFrame, and in the child it tests against window. I'm not sure if the results are always 100% the same. And of course, I didn't test any b2g stuff (appId).

Things I didn't investigate much yet:
- requests from service workers
- network monitor in the browser toolbox
Attachment #8748622 - Flags: feedback?(poirot.alex)
Attachment #8747763 - Attachment is obsolete: true
Improved mochitest to test the stack traces, too
Attachment #8747766 - Attachment is obsolete: true
Depends on: 1270111
Comment on attachment 8748622 [details] [diff] [review]
Part 2: collect information about request cause and stacktrace in network monitor

Review of attachment 8748622 [details] [diff] [review]:
-----------------------------------------------------------------

::: devtools/shared/webconsole/network-monitor.js
@@ +162,5 @@
> +      }
> +    }
> +
> +    console.log("Queued stacktrace:", this.stacktraces.length, uri);
> +    this.stacktraces.push({ uri, stacktrace });

I looked over nsIHttpChannel, nsIChannel and nsIRequest and I haven't seen any meaningful identifier for a request.
If we end up piling hacks on top of the parent <-> child piping, we should find a better id than just the url. It would address the fact that we mostly rely on various cross process call orders, which is just frightening!

@@ +179,5 @@
> +      return null;
> +    }
> +
> +    console.log("Retrieved stacktrace:", this.stacktraces.length, uri);
> +    return trace.stacktrace;

On non-e10s, and let's say on e10s if bug 1269765 is addressable.
Here we could match against nsIHttpChannel object safely without weak ordering issues, right?

@@ +951,5 @@
> +    let loadingPrincipal = channel.loadInfo.loadingPrincipal;
> +    let causeUri = loadingPrincipal ? loadingPrincipal.URI : null;
> +    event.cause = {
> +      type: causeType,
> +      uri: causeUri ? causeUri.spec : null

event.url, event.cause.uri. Too many uri/l.
What about naming that documentUri or something similar?
Just to be explicit about what causeUri is. It's not the request uri, but the loading document one.
Attachment #8748622 - Flags: feedback?(poirot.alex)
(In reply to Alexandre Poirot [:ochameau] from comment #44)
> I looked over nsIHttpChannel, nsIChannel and nsIRequest and I haven't seen
> any meaningful identifier for a request.
> If we end up piling hacks on top of the parent <-> child piping, we should
> find a better id than just the url. It would address the fact that we mostly
> rely on various cross process call orders, which is just frightening!

I'm now relying on the order of events and I compare the URL just to assert that everything is right. But I don't believe this approach will ever work reliably - I'm already getting random errors when responses are served from cache. I'll focus on studying the C++ netwerk code and bug 1269765 now.

> On non-e10s, and let's say on e10s if bug 1269765 is addressable.
> Here we could match against nsIHttpChannel object safely without weak
> ordering issues, right?

Yes, then the collecting the stack and attaching it to the network event happen in the same process and the channels are the same object. It will be similar to how we use NetworkMonitor.interceptedChannels to identify service worker requests.

> event.url, event.cause.uri. Too many uri/l.
> What about naming that documentUri or something similar?
> Just to be explicit about what causeUri is. It's not the request uri, but
> the loading document one.

I might drop the event.cause.uri altogether, as it's not very useful. But if we keep it, loadingDocumentUri is a better name.
Attachment #8747762 - Attachment is obsolete: true
Attachment #8748622 - Attachment is obsolete: true
Attachment #8747764 - Attachment is obsolete: true
Attachment #8747765 - Attachment is obsolete: true
Attachment #8748623 - Attachment is obsolete: true
Just attached the latest version of the patch that adds a way how to reliably connect the stacktrace collected in the child with the other info collected in the parent (and sent to child over messageManager).

nsIHttpChannel now has a new attribute: channelId. It's a uint32 allocated in the parent when creating a HttpChannel. It's retrieved by NetworkMonitor and send over the messageManager as a property of the network event.

Child can read the property over a synchronous IPDL method "GetChannelId". This happens in the http-on-opening-request handler, while collecting the stacktrace, and the channelId can be used as a key to store the stacktrace in a map (and later picked up when other events arrive from the parent).

Several things that should be improved in the next step:
1. The method "GetChannelId" can cause trouble, because it's sync. For example, when you set a tab.contentWindow.location property in the parent, it sends a sync CPOW request to the child. The handler on the child side immediately (synchronously) initiates a HTTP request to load the new URL, which, in case when NetworkMonitor is active, triggers a sync GetChannelId call back to the parent, and causes a "cross-process JS call failed" error. Try to run the browser_net_cause.js mochitest to see it. I'm going to replace the sync method called by the child by an async notification sent from the parent to child.

2. Redirects are not handled. When a request is redirected (HSTS or HTTP redirect), parent creates a new channel with a new channelId and tells the child to connect to the new one, but the stacktrace is still stored under the old channelId. When an event from parent arrives over messageManager, it has the new channelId, however. And the stack is not found.

3. When redirecting, nsRedirectChannelRegistrar assigns its own "channelId" to the channel, but it's a different ID from the internal channel's channelId. Very ugly and confusing. The solution is to move the channelId from nsIHttpChannel up the hierarchy to nsIChannel. This will require touching many more classes that also implement nsIChannel. But the registrar will be able to reuse the nsIChannel's ID and won't have to assign its own ones.

Alex, what do you think? Jason, can I get some feedback from the Necko team about this?

All the code I just described is in Part 1 of the patch, the rest is "just the remaining easy parts"
Flags: needinfo?(poirot.alex)
Flags: needinfo?(jduell.mcbugs)
Attachment #8751734 - Attachment is obsolete: true
Attachment #8751735 - Attachment is obsolete: true
Attachment #8751736 - Attachment is obsolete: true
Attachment #8751737 - Attachment is obsolete: true
Attachment #8751738 - Attachment is obsolete: true
New version of the patch with the following improvements:

- getting the information about the channelId is no more a sync IPDL request sent by the child, but rather an async notification sent by the parent. The workflow is:

1. Child sends an async constructor request to the parent
2. Until the child is notified back about the ID, its GetChannelId method returns NS_ERROR_NOT_AVAILABLE
3. After the parent channel is initialized, it notifies the child about the assigned ID

- there is a new nsIObserverService notification "http-on-channel-id", fired in the child process after a child channel gets its ID

- the NetworkMonitor's StackCollector does a slightly complicated dance to coordinate all this async action:

1. When the JS stack is collected, the channel doesn't have an ID yet, so we store it in a "stackTracesByChannel" map keyed by the channel object.
2. When the http-on-channel-id notification arrives, we move the stacktrace from stackTracesByChannel map to a stackTracesById map, keyed by the ID.
3. When an information about further network activity arrives from parent to child, the stacktrace is picked up from the stackTracesById map.

How do I handle redirections?

- if it's a normal HTTP redirect, the request has two records in the network monitor: the first is the one that got the 30x response, and the JS stack will be associated with it. The next request, to the redirected URL, won't have a stack trace.

- if it's an "internal" redirect (e.g., a HSTS redirect), the original channel is replaced by another one with a different ID, no network activity ever occurs for the original channel, and we need to associate the JS stack (stored with original ID as the key) with the replacement channel's ID. We do this by implementing a nsIChannelEventSink that gets notified about redirects from channel X to channel Y. The network monitor will then display only one request (from channel Y), with the right stack trace.

There's a mochitest that tests the HSTS redirect.

What needs to be done next?

- check how the channelIds behave when HTTP requests from different processes start to mix in one network monitor. This happens when a service worker is used - service worker's network activity happens in the child process.

- move the channelId property up the hiearchy to nsIChannel, and use the channelId in the RedirectChannelRegistrar.
Haven't looked at the patches yet, but I was wondering, just by reading your comments, if it would be easier to compute the id in the child? I imagine the channels are always created first in the child and then forwarded back to the parent??

See this code which computes cross process unique ids from any process:
  http://mxr.mozilla.org/mozilla-central/source/dom/ipc/ContentChild.cpp#3163
(there is most likely other ways to do similar thing)

What about keeping proper stacks for redirected channels for a followup?
Or pull it out from attachment 8752691 [details] [diff] [review].
(In reply to Alexandre Poirot [:ochameau] from comment #58)
> Haven't looked at the patches yet, but I was wondering, just by reading your
> comments, if it would be easier to compute the id in the child? I imagine
> the channels are always created first in the child and then forwarded back
> to the parent??

Because of redirects that are initiated in parent, it actually works both ways:

1. If child initiates a request, the child creates a channel and then tells the parent to create the corresponding channel in the parent.

2. If a redirect happens, the parent creates a new channel and then tells the child to create the corresponding child channel.

So if there is one authority assigning the ID, there is always a case when the other side needs to be notified about the ID.

> See this code which computes cross process unique ids from any process:
>   http://mxr.mozilla.org/mozilla-central/source/dom/ipc/ContentChild.cpp#3163
> (there is most likely other ways to do similar thing)

This is a great idea, however - give all processes authority to allocate the IDs, and make them unique by making  the process ID a part of the channel ID. I'll definitely try it out.

> What about keeping proper stacks for redirected channels for a followup?
> Or pull it out from attachment 8752691 [details] [diff] [review].

I can divide the backend patch into multiple ones, if it helps the reviewers to understand it.
(In reply to Jarda Snajdr [:jsnajdr] from comment #59)
> (In reply to Alexandre Poirot [:ochameau] from comment #58)
> > Haven't looked at the patches yet, but I was wondering, just by reading your
> > comments, if it would be easier to compute the id in the child? I imagine
> > the channels are always created first in the child and then forwarded back
> > to the parent??
> 
> Because of redirects that are initiated in parent, it actually works both
> ways:
> 
> 1. If child initiates a request, the child creates a channel and then tells
> the parent to create the corresponding channel in the parent.

If the channelID is created in the child, would it be possible to pass its id to the parent when the child "tells the parent to create the corresponding channel in the parent"?
Then both child and parent would have the same id immediately.

> 2. If a redirect happens, the parent creates a new channel and then tells
> the child to create the corresponding child channel.

You have the event-sink listener which handles this, right? It wouldn't depend on http-on-channel-id notification. The original request would have an id computed from the child while the redirected request would be computed from the parent, but that's not an issue. The event-sink doesn't care, it just matches the id. But it depends on the previous comment. ID's should be synced between processes on creation.
(In reply to Alexandre Poirot [:ochameau] from comment #60)
> > 1. If child initiates a request, the child creates a channel and then tells
> > the parent to create the corresponding channel in the parent.
> 
> If the channelID is created in the child, would it be possible to pass its
> id to the parent when the child "tells the parent to create the
> corresponding channel in the parent"?
> Then both child and parent would have the same id immediately.

Yes, it's possible. It would be passed as extra constructor parameter here:
https://dxr.mozilla.org/mozilla-central/source/netwerk/ipc/PNecko.ipdl#64

> 
> > 2. If a redirect happens, the parent creates a new channel and then tells
> > the child to create the corresponding child channel.
> 
> You have the event-sink listener which handles this, right? It wouldn't
> depend on http-on-channel-id notification. The original request would have
> an id computed from the child while the redirected request would be computed
> from the parent, but that's not an issue.

Yes, if the IDs can be allocated both in parent and in child, at the place of channel creation, and are guaranteed to be unique, then we can get rid of the http-on-channel-id notification. The IDs will be passed as constructor arguments and will be in sync from the beginning.

The event sink serves to notify the JS code in NetworkMonitor that channel X was redirected to channel Y. Assigning of the IDs must happen at lower level - in the C++ HttpChannelParent/Child.cpp objects and the IPDL protocol between them.
Flags: needinfo?(poirot.alex)
Flags: needinfo?(odvarko)
Flags: needinfo?(jduell.mcbugs)
Depends on: 1274556
Attachment #8752688 - Attachment is obsolete: true
Attachment #8752689 - Attachment is obsolete: true
Attachment #8752690 - Attachment is obsolete: true
Attachment #8752691 - Attachment is obsolete: true
Attachment #8752692 - Attachment is obsolete: true
The NetworkMonitor backend part (part 1) is ready for review. The UI patch I'll submit for review after I polish it and add the stacktrace also to the HTTP request log in console.

You need patches from bug 1270111 (tooltips), bug 1274556 (channelId in Necko) and bug 1269765 (http-on-opening-request in Necko) to make it work. They should be landing in m-c very soon, but are not there yet at this moment.
Comment on attachment 8758150 [details] [diff] [review]
Part 1: Collect information about request cause and stacktrace in netmonitor backend

Review of attachment 8758150 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good, I just have some tweaks here and there.
I haven't tested your patches yet. I'll get to that for the review of UI patches.

::: devtools/server/actors/webconsole.js
@@ +603,5 @@
> +            // NetworkMonitorChild (getting messages about requests from parent) and
> +            // by the NetworkMonitor that directly watches service workers requests.
> +            this.stackTraceCollector = new StackTraceCollector({
> +              window: window,
> +              appId: appId

{ window, appId } ?

::: devtools/shared/webconsole/network-monitor.js
@@ +127,5 @@
> +
> +  QueryInterface: XPCOMUtils.generateQI([Ci.nsIChannelEventSink, Ci.nsIFactory]),
> +
> +  init() {
> +    let registrar = Cm.nsIComponentRegistrar;

That's the first time I see this. It doesn't seem to be very common and it doesn't seem to be "specififed" as it isn't in the related idl.

This version is more common:
  let registrar = Cm.QueryInterface(Ci.nsIComponentRegistrar);

@@ +135,5 @@
> +
> +    registrar.registerFactory(this.classID, this.classDescription, this.contractID, this);
> +    let catMan = Cc["@mozilla.org/categorymanager;1"].getService(Ci.nsICategoryManager);
> +    catMan.addCategoryEntry("net-channel-event-sinks", this.contractID, this.contractID,
> +      false, true);

nit: you may use loader.lazyServiceGetter() like activity distributor.

@@ +144,5 @@
> +    this.collectors.add(collector);
> +  },
> +
> +  unregister(collector) {
> +    this.collectors.delete(collector);

What about unregistering it when this.collectors.size == 0?

@@ +158,5 @@
> +  asyncOnChannelRedirect(oldChannel, newChannel, flags, callback) {
> +    for (let collector of this.collectors) {
> +      collector.onChannelRedirect(oldChannel, newChannel, flags);
> +    }
> +    callback.onRedirectVerifyCallback(Cr.NS_OK);

What happens if collector.onChannelRegister throws?
Does that cancel the redirect or change its behavior in any wait?
If yes, I would suggest adding try/catch around the for() to prevent changing content behavior in unexpected error within devtools.

@@ +166,5 @@
> +};
> +
> +function StackTraceCollector(filters) {
> +  this.filters = filters;
> +  this.stacktracesById = new Map();

WeakMap doesn't work with nsIHttpChannel as key, right?

@@ +173,5 @@
> +}
> +
> +StackTraceCollector.prototype = {
> +  init() {
> +    Services.obs.addObserver(this._onOpeningRequest, "http-on-opening-request", false);

nit: addObserver also accept an object, similar to addEventListener.
It prevents having to bind a callback.
It would do something like this:
  init() {
    Services.obs.addObserver(this, ...)
  },
  observe(subject, topic, data) {
    
  }

@@ +187,5 @@
> +    this.stacktracesById.set(channel.channelId, stacktrace);
> +  },
> +
> +  _onOpeningRequest(subject) {
> +    let channel = subject.QueryInterface(Ci.nsIHttpChannel);

Can't it happen that http-on-opening-request is called on non-http channel? Like chrome://, resource://, about: ?
We can do XHR from these documents.
I'm not asking this patch to support non-http requests.
I'm just wondering if that throws here. If yes, we should just prevent such exception.

To check for interfaces you can use (object instanceof Ci.nsIHttpChannel). It also act as QueryInterface if object implements the interface.

@@ +192,5 @@
> +
> +    if (!matchRequest(channel, this.filters)) {
> +      return;
> +    }
> +

You may briefly explain why you need to serialize the stack here 
instead of just passing components.stack.

@@ +203,5 @@
> +          filename: frame.filename,
> +          lineNumber: frame.lineNumber,
> +          columnNumber: frame.columnNumber,
> +          functionName: frame.name,
> +          asyncCause: frame.asyncCause,

Please ensure copying only the field you actually plan to use in the UI.

@@ +236,5 @@
> +    if (!trace) {
> +      return null;
> +    }
> +
> +    return trace;

nit: `return trace || null;`
But is it really important to be null instead of undefined?

@@ +1436,5 @@
> +
> +    if (this.owner.stackTraceCollector) {
> +      event.cause.stacktrace =
> +        this.owner.stackTraceCollector.getStackTrace(event.channelId);
> +    }

Could you drop a comment about why you also need the stack collector over here? (and not only in createNetworkEvent())
Attachment #8758150 - Flags: review?(poirot.alex) → review+
Attachment #8758150 - Attachment is obsolete: true
(In reply to Alexandre Poirot [:ochameau] from comment #67)
> { window, appId } ?

Fixed.

> > +    let registrar = Cm.nsIComponentRegistrar;
> 
> That's the first time I see this. It doesn't seem to be very common and it
> doesn't seem to be "specififed" as it isn't in the related idl.
> 
> This version is more common:
>   let registrar = Cm.QueryInterface(Ci.nsIComponentRegistrar);

Fixed.

> > +    catMan.addCategoryEntry("net-channel-event-sinks", this.contractID, this.contractID,
> > +      false, true);
> 
> nit: you may use loader.lazyServiceGetter() like activity distributor.

Used the XPCOMUtils.categoryManager getter, which is even better.

> > +  unregister(collector) {
> > +    this.collectors.delete(collector);
> 
> What about unregistering it when this.collectors.size == 0?

Added deleteCategoryEntry when the collectors count drops to zero. However, I failed to unregister the factory. The unregisterFactory expects the factory object that was registered as an argument, but then refuses the JS object I pass with NS_ERROR_XPC_BAD_CONVERT_JS.

I think that calling deleteCategoryEntry is the important part of the cleanup - it stops the event sink handler being called when no network monitor is active. Keeping the component registered doesn't really harm anyone.

> What happens if collector.onChannelRegister throws?
> Does that cancel the redirect or change its behavior in any wait?
> If yes, I would suggest adding try/catch around the for() to prevent
> changing content behavior in unexpected error within devtools.

Good catch, really bad things happen when the channels are not nsIHttpChannels - the redirect is cancelled. Added a try/catch block.

> WeakMap doesn't work with nsIHttpChannel as key, right?

Exactly, I've been advised to not even hope it would work :) Something with cycle collection.

> nit: addObserver also accept an object, similar to addEventListener.

Fixed.

> Can't it happen that http-on-opening-request is called on non-http channel?

No, unlike asyncOnChannelRedirect, here the channel is guaranteed to be nsIHttpChannel. XHRs for other schemes won't be noticed by NetworkMonitor at all.

> To check for interfaces you can use (object instanceof Ci.nsIHttpChannel).
> It also act as QueryInterface if object implements the interface.

Hmm, doesn't work for me. "channel.channelId" stopped being undefined only after explicit "channel.QueryInterface(Ci.nsIHttpChannel)".

> You may briefly explain why you need to serialize the stack here 
> instead of just passing components.stack.

Well, I copied this without thinking from the original tromey's patch :) But: components.stack is an XPCOM wrapper. We need to send it as a JSON through message manager. Explained in a comment.

> Please ensure copying only the field you actually plan to use in the UI.

Removed the asyncCause property, will add back when it's needed.

> @@ +236,5 @@
> > +    if (!trace) {
> > +      return null;
> > +    }
> > +
> > +    return trace;
> 
> nit: `return trace || null;`
> But is it really important to be null instead of undefined?

There used to be a "console.log" inside the "if". Removed the "if" block completely.

> Could you drop a comment about why you also need the stack collector over
> here? (and not only in createNetworkEvent())

One place is in parent process, the other is in child. In order to work in both e10s and non-e10s, it must be at both places.
Comment on attachment 8758274 [details] [diff] [review]
Part 1: Collect information about request cause and stacktrace in netmonitor backend

Review of attachment 8758274 [details] [diff] [review]:
-----------------------------------------------------------------

> > > +  unregister(collector) {
> > > +    this.collectors.delete(collector);
> > 
> > What about unregistering it when this.collectors.size == 0?
> 
> Added deleteCategoryEntry when the collectors count drops to zero. However,
> I failed to unregister the factory. The unregisterFactory expects the
> factory object that was registered as an argument, but then refuses the JS
> object I pass with NS_ERROR_XPC_BAD_CONVERT_JS.

Isn't that because of the duplicated instances introduces by the browser toolbox?
I'm unregistering a component successfully here:
  http://mxr.mozilla.org/mozilla-central/source/devtools/client/framework/about-devtools-toolbox.js#57
But you shouldn't pass the instance. You should pass the component.

My suggestion to use ChannelEventSinkComponent may help there.

::: devtools/shared/webconsole/network-monitor.js
@@ +158,5 @@
> +  register(collector) {
> +    this.init();
> +
> +    // Add a collector to the registered service instance. If we load this file several
> +    // times into different globals, "this" is no longer the right singleton.

s/globals/loaders/

@@ +159,5 @@
> +    this.init();
> +
> +    // Add a collector to the registered service instance. If we load this file several
> +    // times into different globals, "this" is no longer the right singleton.
> +    let me = Cc[this.contractID].getService(Ci.nsIChannelEventSink).wrappedJSObject;

It may be slightly more clear with something like that:
const ChannelEventSinkComponent = {
  ...
}
ChannelEventSinkComponent.init();

And then wherever you use it do:
const ChannelEventSink = Cc[ChannelEventSinkComponent.contractID].getService(Ci.nsIChannelEventSink).wrappedJSObject;
// wrappedJSObject to have access to register/unregister which aren't part of nsIChannelEventSink.
Attachment #8758274 - Flags: review?(poirot.alex) → review+
Attachment #8758274 - Attachment is obsolete: true
I divided the event sink code into ChannelEventSink that's the actual sink, and ChannelEventSinkFactory that implements all the bookkeeping. Suddenly everything is much cleaner and works like a charm - registering, unregistering, multiple loaders...

BTW the NS_ERROR_XPC_BAD_CONVERT_JS is XPConnect's way to say that a function argument is undefined.
Comment on attachment 8758605 [details] [diff] [review]
Part 1: Collect information about request cause and stacktrace in netmonitor backend

Review of attachment 8758605 [details] [diff] [review]:
-----------------------------------------------------------------

Looks perfect.

Thanks for addressing every single comment I made!
Attachment #8758605 - Flags: review?(poirot.alex) → review+
UI for the network request cause and stack trace. Adds a new column to the request list, displays a "JS" marker for requests that have a stacktrace, and displays a popup with the stack trace on mouseover. The stack frames are cliclickable and open the source in the debugger.

Helen, please have a look. How do you like it? I attached also a screenshot to make review easier.

As a next and final step, I'll add the stack trace also to the network event details in the console.
Attachment #8758697 - Flags: ui-review?(hholmes)
Attachment #8758697 - Flags: review?(poirot.alex)
Attachment #8758151 - Attachment is obsolete: true
Mochitests for the cause/stacktrace feature. They test various kind of requests, including redirects, service worker requests, and internal redirects like HSTS.
Attachment #8758698 - Flags: review?(poirot.alex)
Attachment #8758152 - Attachment is obsolete: true
Comment on attachment 8758697 [details] [diff] [review]
Part 2: Show network request cause and stacktrace in netmonitor UI

Looking at the screenshot also attached, I think we might want to make the background of these popups full opacity. It's opaque enough that you can't get any info from below the tooltip anyway, and the slight opacity makes it harder to read.

I notice in other HTML tooltip work (such as bug 1266450) their heights are bound to the size of devtools—is this bug affected by that work at all?
(In reply to Helen V. Holmes (:helenvholmes) (:✨)(pls ni?) from comment #77)
> Looking at the screenshot also attached, I think we might want to make the
> background of these popups full opacity. It's opaque enough that you can't
> get any info from below the tooltip anyway, and the slight opacity makes it
> harder to read.

Good idea, I'll remove the opacity.

> I notice in other HTML tooltip work (such as bug 1266450) their heights are
> bound to the size of devtools—is this bug affected by that work at all?

No, this is still pure XUL. I plan to migrate to the HTML tooltip as soon as Julian's changes are stable and get landed. In the meantime, I'll limit the height. A very long stack trace could break the UI now.
- changed the tooltip background to non-opaque theme background color
- when the tooltip is too high, allow scrolling with overflow-y: auto
- did some tooltip sizing tweaks similar to the Event tooltip in Inspector

It's hard to improve further with the XUL tooltip. I plan to migrate this to the HTML tooltip when it's ready.
Attachment #8758726 - Flags: ui-review?(hholmes)
Attachment #8758726 - Flags: review?(poirot.alex)
Attachment #8758697 - Attachment is obsolete: true
Attachment #8758697 - Flags: ui-review?(hholmes)
Attachment #8758697 - Flags: review?(poirot.alex)
Attached image buggy borders —
Oh! This is so great to finally see the stacks appear in the UI!!

I'm seeing some minor style issues.
Here the right border is missing, most likely some css/layout issue.
You can reproduce that while inspecting XHR requests on google search on linux/ubuntu.

Also, the stacks may end up very long, I'm wondering if we should have an overflowable popup with a maximal height.
Oh! And we get stacks on not just XHR, this is so handy!

But some are leaking chrome behavior. Like the stacks for documents shows a chrome stack which may mislead beginners by showing sources from browser chrome...
(In reply to Alexandre Poirot [:ochameau] from comment #82)
> But some are leaking chrome behavior. Like the stacks for documents shows a
> chrome stack which may mislead beginners by showing sources from browser
> chrome...

I plan to solve this as one of the followup bugs (I already have quite a long todo list). Hide the chrome: stack frames in page toolbox, but show them in browser toolbox. I'll have to figure out the best way to do this.

And yes, whenever there is a something on the JS stack at the moment we open the HTTP channel, we display it. Unfortunately, it doesn't work with the Fetch API, because it does something like executeSoon(() => channel.asyncOpen(...)). Another followup, will probably need a small FetchDriver patch.
Comment on attachment 8758726 [details] [diff] [review]
Part 2: Show network request cause and stacktrace in netmonitor UI

Review of attachment 8758726 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good codewise. But please address the overflow and border issue.
You may keep the chrome stacks for a followup. We can ship with them being displayed.

::: devtools/client/netmonitor/netmonitor-view.js
@@ +2295,5 @@
>      }
>  
>      let hovered = requestItem.attachment;
> +    if (hovered.responseContent &&
> +        target.closest(".requests-menu-icon-and-file", requestItem.target)) {

Here and in the `else if`, closest only accepts a selector parameter and not a second argument:
http://mxr.mozilla.org/mozilla-central/source/dom/webidl/Element.webidl#58

@@ +2325,5 @@
> +          frameEl.className = "requests-menu-stack-frame devtools-monospace";
> +
> +          let funcEl = doc.createElement("label");
> +          funcEl.className = "requests-menu-stack-frame-function-name";
> +          funcEl.setAttribute("value", functionName || "<anonymous>");

You may use a l10n string for <anonymous>:
stacktrace.anonymousFunction

@@ +2352,5 @@
> +
> +        tooltip.content = el;
> +        tooltip.panel.setAttribute("wide", "");
> +
> +        return true;

You may split each popup creation: image popup and stack popup in two dedicated functions.

::: devtools/client/themes/netmonitor.css
@@ +699,5 @@
> +.requests-menu-stack-trace {
> +  background-color: var(--theme-body-background);
> +  margin: -4px; /* Compensate for the .panel-arrowcontent padding. */
> +  width: 600px;
> +  overflow-y: auto;

I don't know why, but this overflow doesn't seem to work?
Or the maximal height is too big?
Attachment #8758726 - Flags: review?(poirot.alex) → review+
Comment on attachment 8758698 [details] [diff] [review]
Part 3: Show network request cause and stacktrace in netmonitor - mochitests

Review of attachment 8758698 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good.

It would be great to augment browser_net_cause.js or provide yet another test to assert at least one stack.

::: devtools/client/netmonitor/test/browser_net_cause.js
@@ +36,5 @@
> +
> +  EXPECTED_REQUESTS.forEach((spec, i) => {
> +    let [ method, url, causeType, causeUri, hasStack ] = spec;
> +
> +    let requestItem = RequestsMenu.getItemAtIndex(i);

Are you confident requests are going to be in the same order, always?
Could you spawn multiple runs on linux32 or win debug to ensure it won't be intermittent?

@@ +42,5 @@
> +      method, url, { cause: { type: causeType, loadingDocumentUri: causeUri } }
> +    );
> +
> +    let { stacktrace } = requestItem.attachment.cause;
> +    let stackLen = (stacktrace || []).length;

nit: (here and in other tests) I would have done: stacktrace ? stacktrace.length : 0
instead of creating an array for nothing.

::: devtools/client/netmonitor/test/browser_net_cause_service_worker.js
@@ +31,5 @@
> +  info("Registering the service worker...");
> +  yield debuggee.registerServiceWorker();
> +
> +  info("Performing requests...");
> +  debuggee.performXHRRequests();

Given that you are using status-code.html
I think you should merge this test into browser_net_service-worker-status.js
(do not hesitate to rename this test accordingly)
Attachment #8758698 - Flags: review?(poirot.alex) → review+
Addressed review issues:
- removed second argument to Element.closest
- use l10n.getStr("stacktrace.anonymousFunction")
- divided tooltip onHover callback into image/stacktrace sub-methods

Also tried to solve the overflow issue:
- removed the ugly "margin: -4px" CSS hack
- had to revert to the transparent background, otherwise the tooltip would look bad in dark theme. Helen's no-transparency suggestion will be addressed in the HTMLTooltip followup
- specified fixed 400px max-height for the tooltip, should now limit the height and scroll on all platforms

Alex, please verify the tooltip overflow/scrolling on Ubuntu. It was always right on OS X.
Attachment #8759164 - Flags: review?(poirot.alex)
Attachment #8758726 - Attachment is obsolete: true
Attachment #8758726 - Flags: ui-review?(hholmes)
Attachment #8759173 - Flags: review?(poirot.alex)
Attachment #8758698 - Attachment is obsolete: true
(In reply to Alexandre Poirot [:ochameau] from comment #85)
> It would be great to augment browser_net_cause.js or provide yet another
> test to assert at least one stack.

Added a check of the function name to browser_net_cause and also to the service worker test.

> Are you confident requests are going to be in the same order, always?
> Could you spawn multiple runs on linux32 or win debug to ensure it won't be
> intermittent?

I ran this test many times both locally and in try runs and the order was always right. The requests are added to the NetworkMonitor immediately after they are started (http-on-opening-request), which is a quite synchronous operation that shouldn't leave much room for race conditions. I started 3 try runs today.

> nit: (here and in other tests) I would have done: stacktrace ?
> stacktrace.length : 0
> instead of creating an array for nothing.

Fixed.

> Given that you are using status-code.html
> I think you should merge this test into browser_net_service-worker-status.js
> (do not hesitate to rename this test accordingly)

Good idea. Merged both tests into one - I just added stacktrace check to browser_net_service-worker-status.js. I also removed the crazy workaround that created an embedded iframe and replaced it with normal XHR request. The reason this hack was needed is that the page is not immediately controlled by the service worker on first load. Solved this by adding code where the SW explicitly claims control over the page when activated, and the test waits for the page being controlled before it issues requests.
Comment on attachment 8759164 [details] [diff] [review]
Part 2: Show network request cause and stacktrace in netmonitor UI

Review of attachment 8759164 [details] [diff] [review]:
-----------------------------------------------------------------

Overflow works fine now, but border are still broken.
I've debugged that and provided multiple ways to address that.

I've also another UI minor issue. When you hover "JS" logo,
let the tooltip display, move over the tooltip, and then move against over the same "JS" logo,
the tooltip is going to blink (hide and redisplayed).

::: devtools/client/themes/netmonitor.css
@@ +697,5 @@
>  
> +/* Requests menu stacktrace tooltip */
> +.requests-menu-stack-trace {
> +  max-height: 400px;
> +  width: 590px;

The border is still broken here.
But you can fix it by setting this width to 586px (max, you can set less).

Or can set set this max-width to 603px (min, you can set more)
.devtools-tooltip[wide] {max-width: 603px;}
Attachment #8759164 - Flags: review?(poirot.alex) → review+
Comment on attachment 8759173 [details] [diff] [review]
Part 3: Show network request cause and stacktrace in netmonitor - mochitests

Review of attachment 8759173 [details] [diff] [review]:
-----------------------------------------------------------------

That would be great to test more than just function names in one test.
At least the values we display in the toolbox: uri, line and column.

Thanks for addressing all my comments in your patches.
Everything looks good. Great piece of work there and in all dependent bugs!
Attachment #8759173 - Flags: review?(poirot.alex) → review+
Attachment #8759164 - Attachment is obsolete: true
Attachment #8759173 - Attachment is obsolete: true
(In reply to Alexandre Poirot [:ochameau] from comment #91)
> I've also another UI minor issue. When you hover "JS" logo,
> let the tooltip display, move over the tooltip, and then move against over
> the same "JS" logo,
> the tooltip is going to blink (hide and redisplayed).

Yes, this is a bug in TooltipToggle. If there is an image request in netmonitor, it shows a little preview icon next to the file name. If you move the mouse from the icon to the filename label and back, the tooltip will be blinking, too. I plan to fix it as a followup during the HTMLTooltip migration.

> The border is still broken here.
> But you can fix it by setting this width to 586px (max, you can set less).
> 
> Or can set set this max-width to 603px (min, you can set more)
> .devtools-tooltip[wide] {max-width: 603px;}

I changed the width to 586px. The .devtools-tooltip[wide] style is used also by other tooltips and could break if the style was changed.
(In reply to Alexandre Poirot [:ochameau] from comment #92)
> That would be great to test more than just function names in one test.
> At least the values we display in the toolbox: uri, line and column.

At first I wanted to avoid this, because it makes the tests fragile (adding a line to the HTML file breaks the test), but then I remembered that the debugger tests do this all the time - they wait for a breakpoint to be triggered at line 34 etc. I added a check for funcName/fileName/lineNo to the browser_net_cause test.

> Thanks for addressing all my comments in your patches.
> Everything looks good. Great piece of work there and in all dependent bugs!

Thanks for all your help! I can't believe this is finally landing. At first it looked like a good-first-bug :)
Change "const L10N" back to "var L10N" in netmonitor-view.js. Some tests rely on L10N being a property of the window.
Attachment #8759588 - Flags: review+
Attachment #8759556 - Attachment is obsolete: true
The try runs often have failures in browser_net_statistics_0x.js - already reported in bug 1253794. Let's watch if this patch increases the frequency of these intermittents. I don't see any direct reason why it should. I'll have a look at the statistics intermittents asap.
Keywords: checkin-needed
Pushed by ryanvm@gmail.com:
https://hg.mozilla.org/integration/fx-team/rev/f12a69ba9122
Part 1: Collect information about request cause and stacktrace in netmonitor backend. r=ochameau
https://hg.mozilla.org/integration/fx-team/rev/572ebec612e8
Part 2: Show network request cause and stacktrace in netmonitor UI. r=ochameau
https://hg.mozilla.org/integration/fx-team/rev/a43d99734390
Part 3: Show network request cause and stacktrace in netmonitor - mochitests. r=ochameau
Keywords: checkin-needed
Release Note Request (optional, but appreciated)
[Why is this notable]: New Developer Tools feature
[Suggested wording]: Network monitor now displays the cause and the stack trace of each request
[Links (documentation, blog post, etc)]: none yet
relnote-firefox: --- → ?
I'm going to back this out because it's my first guess at what caused the clipboard tests to fail like https://treeherder.mozilla.org/logviewer.html#?job_id=9754279&repo=fx-team

If it's still busted after the backout, feel free to ask for this to be re-landed.

https://hg.mozilla.org/integration/fx-team/rev/ab5e81678aaa
Flags: needinfo?(jsnajdr)
This feature shouldn't affect anything clipboard related at all, and has no relation to devtools/client/eyedropper. If there are no suspicious failures in devtools/client/netmonitor/*, then please re-land.
Flags: needinfo?(jsnajdr)
Keywords: checkin-needed
After sorting through all of the bustage on fx-team at the moment, there actually have been some netmonitor failures like https://treeherder.mozilla.org/logviewer.html#?job_id=9758518&repo=fx-team
(In reply to Wes Kocher (:KWierso) from comment #107)
> After sorting through all of the bustage on fx-team at the moment, there
> actually have been some netmonitor failures like
> https://treeherder.mozilla.org/logviewer.html#?job_id=9758518&repo=fx-team

They are unrelated according to comment 101.
Keywords: checkin-needed
Pushed by ntim.bugs@gmail.com:
https://hg.mozilla.org/integration/fx-team/rev/691b0e0a6ca1
Part 1: Collect information about request cause and stacktrace in netmonitor backend r=ochameau
https://hg.mozilla.org/integration/fx-team/rev/616b2bbf8fd0
Part 3: Show network request cause and stacktrace in netmonitor - mochitests r=ochameau
https://hg.mozilla.org/integration/fx-team/rev/a58743e540f5
Part 2: Show network request cause and stacktrace in netmonitor UI r=ochameau
Keywords: checkin-needed
(In reply to Tim Nguyen :ntim from comment #108)
> They are unrelated according to comment 101.

It's possible that for some mysterious reason, this patch increased the frequency of the browser_net_statistics_* intermittents. I'm getting a lot of failures on linux64/e10s and when looking at these tests, they are not very e10s friendly - they start waiting for events at times when these event could have already fired etc.

If these tests turn out to be a problem, I recommend to disable them in e10s (skip-if=e10s) until we get to rewrite them properly.
(In reply to Jarda Snajdr [:jsnajdr] from comment #110)
> (In reply to Tim Nguyen :ntim from comment #108)
> > They are unrelated according to comment 101.
> 
> It's possible that for some mysterious reason, this patch increased the
> frequency of the browser_net_statistics_* intermittents. I'm getting a lot
> of failures on linux64/e10s and when looking at these tests, they are not
> very e10s friendly - they start waiting for events at times when these event
> could have already fired etc.
> 
> If these tests turn out to be a problem, I recommend to disable them in e10s
> (skip-if=e10s) until we get to rewrite them properly.

Looks like this patch caused browser_net_statistics_01.js to perma-fail in addition to browser_net_statistics_02.js failures getting its frequency increased. The first failure is mentioned in comment 107, although I misread it as the second failure (which I why I thought it was OK to reland). Going to back this out again.
(In reply to Tim Nguyen :ntim from comment #111)
> Looks like this patch caused browser_net_statistics_01.js to perma-fail in
> addition to browser_net_statistics_02.js failures getting its frequency
> increased. The first failure is mentioned in comment 107, although I misread
> it as the second failure (which I why I thought it was OK to reland). Going
> to back this out again.

Does the statistics_01.js test fail on all platforms, or just linux64? I saw it fail many times on Linux, but never on OS X, both on try and localhost. Any difference between e10s and non-e10s?

Can I see these stats anywhere?
(In reply to Jarda Snajdr [:jsnajdr] from comment #112)
> (In reply to Tim Nguyen :ntim from comment #111)
> > Looks like this patch caused browser_net_statistics_01.js to perma-fail in
> > addition to browser_net_statistics_02.js failures getting its frequency
> > increased. The first failure is mentioned in comment 107, although I misread
> > it as the second failure (which I why I thought it was OK to reland). Going
> > to back this out again.
> 
> Does the statistics_01.js test fail on all platforms, or just linux64? I saw
> it fail many times on Linux, but never on OS X, both on try and localhost.
> Any difference between e10s and non-e10s?
> 
> Can I see these stats anywhere?

You can see the failures here: https://treeherder.mozilla.org/#/jobs?repo=fx-team&revision=a58743e540f5894ce8d802784bbd16737f5ae2c3
Notably dt5 on Windows and Linux. 

Those seemed to happen every time when it was landed.

You can retrigger jobs to check the frequency of intermittents.

Here's the backout changeset in case you missed it: https://hg.mozilla.org/integration/fx-team/rev/a139db504ae189bb3115ff9156ab47fc7bbddcda
I've rewritten the failing tests, let's see if it fixes things: https://treeherder.mozilla.org/#/jobs?repo=try&revision=7fddea9e5134
Try push with `devtools.dump.emit` --> true: https://treeherder.mozilla.org/#/jobs?repo=try&revision=f1716b4ed77bb105218bf478a1efeb1efb469a5a

My guess is that the order of events is not being respected on Linux for whatever reason. It works fine on OSX though.
The browser_net_statistics test failures were solved in bug 1253794. After it's landed, please re-land also this bug.

Try run with both patches:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=908cc361d1a3
Depends on: 1253794
Keywords: checkin-needed
Pushed by pbrosset@mozilla.com:
https://hg.mozilla.org/integration/fx-team/rev/372af96174c3
Part 1: Collect information about request cause and stacktrace in netmonitor backend r=ochameau
https://hg.mozilla.org/integration/fx-team/rev/91dbcbbb29ef
Part 2: Show network request cause and stacktrace in netmonitor UI r=ochameau
https://hg.mozilla.org/integration/fx-team/rev/9f5dcabac6c7
Part 3: Show network request cause and stacktrace in netmonitor - mochitests r=ochameau
Keywords: checkin-needed
It would be awesome to get this in time for FF49 as this is one of the main devtools features of this release.
If we can't, uplifting after it lands in 50 seems low risk. It relies on platform changes that have landed in other bugs a week ago, and is mostly new feature code, so risks of regressions in the netmonitor are low.
Comment on attachment 8758605 [details] [diff] [review]
Part 1: Collect information about request cause and stacktrace in netmonitor backend

Approval Request Comment
[Feature/regressing bug #]: new devtools feature we want to land in 49, see comment 120, we plan to write about it in the hacks post in DE 49.
[User impact if declined]: feature not landed
[Describe test coverage new/current, TreeHerder]: on mc, local testing done, soon in nightly
[Risks and why]: low, very early in the cycle, and see comment 120 as well
[String/UUID change made/needed]: Yes, 1 new string.
Attachment #8758605 - Flags: approval-mozilla-aurora?
Attachment #8759557 - Flags: approval-mozilla-aurora?
Attachment #8759588 - Flags: approval-mozilla-aurora?
Depends on: 1278778
Depends on: 1278797
flod, are you ok with this uplift to aurora 49 given that there is a new string? It seems fairly minor. Is this ok with the l10n team?
Flags: needinfo?(francesco.lodolo)
(In reply to Liz Henry (:lizzard) (needinfo? me) from comment #123)
> flod, are you ok with this uplift to aurora 49 given that there is a new
> string? It seems fairly minor. Is this ok with the l10n team?

Yes, we also already warned localizers about this bug on merge day.
Flags: needinfo?(francesco.lodolo)
Please note that bug 1253794 is a dependency of this bug.
Comment on attachment 8758605 [details] [diff] [review]
Part 1: Collect information about request cause and stacktrace in netmonitor backend

OK to uplift to aurora, new feature for dev tools.
Attachment #8758605 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
All 3 patches should uplift.
Depends on: 1280247
Depends on: 1280266
Attachment #8759557 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Attachment #8759588 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Depends on: 1285895
I've updated the network monitor MDN page with information about the new Cause column. This includes new screenshots so you can see it in action, an entry in the Network request fields list:

https://developer.mozilla.org/en-US/docs/Tools/Network_Monitor#Network_request_list

And a separate section to provide more information:

https://developer.mozilla.org/en-US/docs/Tools/Network_Monitor#Cause_column

I've also updated the Fx 49 release notes to mention this change:

https://developer.mozilla.org/en-US/Firefox/Releases/49#Developer_Tools

A quick review would be much appreciated. Thanks!
(In reply to Chris Mills (Mozilla, MDN editor) [:cmills] from comment #130)
> A quick review would be much appreciated. Thanks!

Looks good, I only have one idea for improvement: displaying the JS stack trace is actually the most important and valuable aspect of this feature, so maybe the documentation should give it more attention.

There's also a typo (reuqest) in the note describing JS stack traces.
(In reply to Jarda Snajdr [:jsnajdr] from comment #131)
> (In reply to Chris Mills (Mozilla, MDN editor) [:cmills] from comment #130)
> > A quick review would be much appreciated. Thanks!
> 
> Looks good, I only have one idea for improvement: displaying the JS stack
> trace is actually the most important and valuable aspect of this feature, so
> maybe the documentation should give it more attention.
> 
> There's also a typo (reuqest) in the note describing JS stack traces.

Thanks! Typo fixed; I've also added in a screenshot showing the stack trace in action, plus some more details about it. Let me know if you think the text needs more work.
(In reply to Chris Mills (Mozilla, MDN editor) [:cmills] from comment #132)
> Thanks! Typo fixed; I've also added in a screenshot showing the stack trace
> in action, plus some more details about it. Let me know if you think the
> text needs more work.

I would leave out the description of the tooltip with the full URL - it's nothing else but a full expansion of the URL that's already visible, only truncated. That's a widely used UI pattern that doesn't need extra documentation. The screenshot would also look better and cleaner without the tooltip.
(In reply to Jarda Snajdr [:jsnajdr] from comment #133)
> (In reply to Chris Mills (Mozilla, MDN editor) [:cmills] from comment #132)
> > Thanks! Typo fixed; I've also added in a screenshot showing the stack trace
> > in action, plus some more details about it. Let me know if you think the
> > text needs more work.
> 
> I would leave out the description of the tooltip with the full URL - it's
> nothing else but a full expansion of the URL that's already visible, only
> truncated. That's a widely used UI pattern that doesn't need extra
> documentation. The screenshot would also look better and cleaner without the
> tooltip.

Ok, no problem - updated.
Depends on: 1290868
Added to the release notes using the same wording as mdn
"The Network Monitor now shows a Cause column, which provides an indication of what caused each particular network request"
I have successfully reproduce this bug on firefox nightly 38.0a1 (2015-02-17)
with windows 7 (32 bit)
Mozilla/5.0 (Windows NT 6.1; rv:38.0) Gecko/20100101 Firefox/38.0

I found this fix on latest beta 49.0b3

Mozilla/5.0 (Windows NT 6.1; rv:49.0) Gecko/20100101 Firefox/49.0
Build ID : 20160811031722

[testday-20160812]
Status: RESOLVED → VERIFIED
I have reproduced this bug with Nightly 38.0a1 (2015-02-17) on Ubuntu 14.04, 64 bit!

The bug's fix is now verified on latest Beta 49.0b4, Aurora 50.0a2.

Beta 49.0b4:
Build ID 	20160814184416
User Agent 	Mozilla/5.0 (X11; Linux x86_64; rv:49.0) Gecko/20100101 Firefox/49.0

Aurora 50.0a2:
Build ID 	20160817004002
User Agent 	Mozilla/5.0 (X11; Linux x86_64; rv:50.0) Gecko/20100101 Firefox/50.0

[bugday-20160817]
Depends on: 1306261
See Also: → 1350812
Depends on: 1392408
Depends on: 1392411
Product: Firefox → DevTools
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: