Closed Bug 1151899 Opened 9 years ago Closed 8 years ago

Integrate the rust-url parser into necko

Categories

(Core :: Networking, defect)

x86_64
Linux
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla53
Tracking Status
firefox53 --- fixed

People

(Reporter: valentin, Assigned: valentin)

References

Details

(Whiteboard: [necko-would-take])

Attachments

(5 files, 14 obsolete files)

58 bytes, text/x-review-board-request
valentin
: review+
bagder
: review+
Details
58 bytes, text/x-review-board-request
bagder
: review+
Details
58 bytes, text/x-review-board-request
bagder
: review+
Details
58 bytes, text/x-review-board-request
bagder
: review+
Details
58 bytes, text/x-review-board-request
manishearth
: review+
Details
The rust-url official repo:
https://github.com/servo/rust-url

The C wrapper I'm building to be able to call it from Necko
https://github.com/valenting/rust-url-capi

While we could just strip out the C++ parser and use the Rust one, even if it's a build option, I think it would be a good idea to be able to run them both in parallel to see if there are any differences. For this we would need a build option, and a pref to determine which implementation is active.
I get an LLVM failure trying to build the capi on rustc 1.0.0-nightly (00978a987 2015-04-18) (built 2015-04-19)

rustc: /home/rustbuild/src/rust-buildbot/slave/nightly-dist-rustc-linux/build/src/llvm/lib/CodeGen/LexicalScopes.cpp:179: llvm::LexicalScope* llvm::LexicalScopes::getOrCreateRegularScope(llvm::MDNode*): Assertion `DISubprogram(Scope).describes(MF->getFunction())' failed.

Have you seen this before?
(In reply to Manish Goregaokar [:manishearth] from comment #2)
> I get an LLVM failure trying to build the capi on rustc 1.0.0-nightly
> (00978a987 2015-04-18) (built 2015-04-19)
> 
> rustc:
> /home/rustbuild/src/rust-buildbot/slave/nightly-dist-rustc-linux/build/src/
> llvm/lib/CodeGen/LexicalScopes.cpp:179: llvm::LexicalScope*
> llvm::LexicalScopes::getOrCreateRegularScope(llvm::MDNode*): Assertion
> `DISubprogram(Scope).describes(MF->getFunction())' failed.
> 
> Have you seen this before?

Yes, it's a bug with LTO builds. https://github.com/rust-lang/rust/issues/24220
I disabled query_encoding for now. Be sure to pull the latest changes from https://github.com/valenting/rust-url-capi . I hadn't pushed all the changes needed for this patch to work :)
Steps to try this out (trying to make this accessible to both Rust folks and Firefox folks):

 - If you don't have one already, set up a simple firefox build (https://developer.mozilla.org/en/docs/Simple_Firefox_build). Apply the patch above (`hg qqueue -c rusturl; hg qimport /path/to/patch; hg qpush -a` is my preferred method)
 - Install the latest Cargo and Rust nightlies from rust-lang.org and crates.io
 - Clone https://github.com/valenting/rust-url-capi
 - In rust-url-capi, run `cargo build` (alternatively, `cargo build --release` for a non-debug+opt version)
 - In netwerk/base/nsStandardURL.h, modify the include path to rust-url-capi.h to point to your local clone
 - In toolkit/library/moz.build, modify the `OS_LIBS` line that adds a path to rust-url-capi to point to the .a file you generated. If you ran `cargo build`, this .a file should be /path/to/rust-url-capi/target/debug/librust_url_capi-somesha.a; and if you ran `cargo build --release`, it will be in target/release
 - Cancel any existing builds and rerun ./mach build for Firefox
 - When the build completes, launch Firefox with ./mach run
 - Open about:config and set the network.standard-url.use-rust to true


To test this out, try `div=document.createElement('div'); div.innerHTML='<a href="http://localhost:99999/"></a>'; console.log(div.childNodes[0].port)` in the JS console on a page with a DOM. On a regular build this should output 99999, but the one with rust-url will overflow.

If you want to test performance differences, be sure to use `cargo build --release`, and make the comparisons by toggling the `about:config` pref (not by switching browsers).
Whiteboard: [necko-would-take]
I have a pull request for rust-url with large breaking changes: https://github.com/servo/rust-url/pull/176

I’d appreciate feedback (preferably in github comments) on the API design. What would make Gecko integration easier?
That pull request has landed. rust-url 1.0.0 is now published on crates.io and in use in Servo.
Are we waiting for better rust or cargo support in gecko, or is there some other reason rust-url-capi has not been updated after the breaking changes to rust-url?
Flags: needinfo?(valentin.gosu)
I've got this working at https://github.com/Manishearth/gecko-dev/tree/rust-url-capi , using the toolkit/library/rust cargo support (I'm not using it correctly, but it's close enough for now). It works with the latest rust-url (see https://github.com/valenting/rust-url-capi/pull/1/files)

Sadly, it segfaults:

Exception Type:        EXC_BAD_ACCESS (SIGSEGV)
Exception Codes:       KERN_INVALID_ADDRESS at 0x00000000000000a8

VM Regions Near 0xa8:
--> 
    __TEXT                 0000000108633000-0000000108637000 [   16K] r-x/rwx SM=COW  /Users/USER/*/NightlyDebug.app/Contents/MacOS/firefox

Thread 0 Crashed:: Dispatch queue: com.apple.main-thread
0   XUL                           	0x000000010c7046e5 nsScrollbarFrame::SetScrollbarMediatorContent(nsIContent*) + 37 (nsCOMPtr.h:374)
1   XUL                           	0x000000010c5858df nsHTMLScrollFrame::TryLayout(mozilla::ScrollReflowInput*, mozilla::ReflowOutput*, bool, bool, bool) + 463 (nsGfxScrollFrame.cpp:357)
2   XUL                           	0x000000010c586ed9 nsHTMLScrollFrame::ReflowContents(mozilla::ScrollReflowInput*, mozilla::ReflowOutput const&) + 921 (nsGfxScrollFrame.cpp:707)
3   XUL                           	0x000000010c588229 nsHTMLScrollFrame::Reflow(nsPresContext*, mozilla::ReflowOutput&, mozilla::ReflowInput const&, unsigned int&) + 649 (BaseMargin.h:89)
4   XUL                           	0x000000010c5505ff nsContainerFrame::ReflowChild(nsIFrame*, nsPresContext*, mozilla::ReflowOutput&, mozilla::ReflowInput const&, int, int, unsigned int, unsigned int&, nsOverflowContinuationTracker*) + 159 (nsContainerFrame.cpp:1071)

There are also a bunch of mismatches print out beforehand. I can look into those, but I have no idea why there's a segfault in the first place.
Ah, there were some typos/copy paste errors I introduced when merging. Fixed now, and it works. Will clean up the patches and push up in a bit.
Only mismatches I can find now are:

[RUST] Difference occured in ::Resolve
	 c++  : https://login.wikimedia.org
	 rust : https://login.wikimedia.org/
[RUST] Difference occured in ::GetHasRef
	 c++  : 0
	 rust : -2147024809
[RUST] Difference occured in ::GetScheme
	 c++  : https
	 rust :

(the second two are from Google, the first is on wikipedia)
Attached patch Part 1 - Include rust-url-capi (obsolete) — Splinter Review
Attachment #8596878 - Attachment is obsolete: true
Flags: needinfo?(valentin.gosu)
Attachment #8802602 - Flags: review?(valentin.gosu)
Attached file Part 2 - Vendor rust-url-capi deps (obsolete) —
Attachment #8802603 - Flags: review?(valentin.gosu)
Attachment #8802602 - Flags: review?(ted)
Attachment #8802603 - Flags: review?(ted)
Attachment #8802606 - Flags: review?(ted)
Attachment #8802603 - Attachment mime type: text/plain → application/x-url
Attachment #8802603 - Attachment mime type: application/x-url → text/x-github-pull-reques
Attachment #8802603 - Attachment mime type: text/x-github-pull-reques → text/x-github-pull-request
Pushed cleaned-up versions of the patches. Patch #2 is 21MB and reviewboard keels over trying to handle it, so I've just linked to a commit instead.

The older patch said WIP so I assume there are still bits missing from this. It works, though, with some minor hiccups (IDNA parsing and ::Resolve appending a slash).

Should we look into landing this? I'm not sure what work is left to be done here. I personally would like to make this use mystor's nsstring work, and I would like to reduce some of the cloning here, but those are minor fixes we can land afterwards. I'd love to land it first so that we can iterate on it, and even perhaps get some nightly telemetry.

A few questions:

 - Where should rust-url-capi be placed? I put it inside netwerk/base, but it might belong elsewhere
 - Do we need some compile-time flags for this? It's currently an off-by-default runtime pref.
 - What bits are missing?

If not the whole thing, I'd like to at least land the first two commits (perhaps without `extern crate rust_url_capi` to avoid the binary bloat), since they don't change any behavior.
Attachment #8802603 - Attachment mime type: text/x-github-pull-request → text/plain
(In reply to Manish Goregaokar [:manishearth] from comment #16)
> Should we look into landing this? I'm not sure what work is left to be done
> here. I personally would like to make this use mystor's nsstring work, and I
> would like to reduce some of the cloning here, but those are minor fixes we
> can land afterwards. I'd love to land it first so that we can iterate on it,
> and even perhaps get some nightly telemetry.

I think it should be fine to land this, as long as it's off-by-default. If we enable it we'll almost certainly want to do side-by-side comparisons with telemetry like we have with the mp4 parser.

>  - Where should rust-url-capi be placed? I put it inside netwerk/base, but
> it might belong elsewhere

Is this a Gecko-specific C API? If so then somewhere in netwerk is probably fine. You might ask the necko module owner what they prefer. If it's generic to rust-url then we could just vendor it into third_party/rust.

>  - Do we need some compile-time flags for this? It's currently an
> off-by-default runtime pref.

As long as it's obeying the existing `MOZ_RUST` configure option I think a runtime pref should be fine.
Sounds great!

> Is this a Gecko-specific C API?

Basically, yes. The C API uses the quirks module which is browser-specific. But we can vendor if valentin wants rust-url-capi to live out of tree.

> As long as it's obeying the existing `MOZ_RUST` configure option I think a runtime pref should be fine.

I don't think it uses MOZ_RUST yet. But I can make it do that.
(In reply to Ted Mielczarek [:ted.mielczarek] from comment #17)
> (In reply to Manish Goregaokar [:manishearth] from comment #16)
> >  - Do we need some compile-time flags for this? It's currently an
> > off-by-default runtime pref.
> 
> As long as it's obeying the existing `MOZ_RUST` configure option I think a
> runtime pref should be fine.

The existing mp4parser support has a build-time flag associated with it (separate from the MOZ_RUST configure option), so we could not enable it on some platforms if we didn't want to take the compile-size hit.
I generally follow the YAGNI principle here. :) If we decide we need such a thing later we can always add it.
Attachment #8802606 - Attachment is obsolete: true
Attachment #8802606 - Flags: review?(valentin.gosu)
Attachment #8802606 - Flags: review?(ted)
Attachment #8802603 - Attachment is obsolete: true
Attachment #8802603 - Flags: review?(valentin.gosu)
Attachment #8802603 - Flags: review?(ted)
Attachment #8802602 - Attachment is obsolete: true
Attachment #8802602 - Flags: review?(valentin.gosu)
Attachment #8802602 - Flags: review?(ted)
I reordered the commits and pushed to mozreview. The vendoring commit is still there in the hg history (https://reviewboard-hg.mozilla.org/gecko/rev/bc2d6405b18b), but it's not included in the review.
Try at https://treeherder.mozilla.org/#/jobs?repo=try&revision=5e05b470d2881f73dcbc5002d7bf668f56f927f9 . I should make it crash on a mismatch and see what happens on try as well.
From try:

rust-url also seems to return punycode serializations, which nsstandardurl doesn't. We parse into punycode form. We'll need to explicitly de-punycode. Perhaps we should add a parse flag to not do this?

There are also IP issues like "Got [::192.168.1.1]/file.ext, expected [::192.168.1.1]</file.ext>", and I recall seeing an IPV6 issue somewhere.

dom/tests/mochitest/fetch/test_fetch_cors.html  fails with IDNA issues and password issues, but they pass locally. Not sure what that's about.

There are a bunch of crashes which I'll look into later. We probably should not land this enabled by default for now.
added conditional compilation.
(In reply to Manish Goregaokar [:manishearth] from comment #25)
> From try:
> 
> rust-url also seems to return punycode serializations, which nsstandardurl
> doesn't. We parse into punycode form. We'll need to explicitly de-punycode.

There is work under way to make nsStandardURL do the same: bug 945240
> I think it should be fine to land this, as long as it's off-by-default. If we enable it we'll almost certainly want to do side-by-side comparisons with telemetry like we have with the mp4 parser.

FWIW with the current patch, an --enable-rust build with the pref set to off will still parse the thing using rust-url, it just won't use the result. Is this behavior acceptable? Might be a perf hit since we're parsing twice.

We can remedy this by gating more things on sUseRust or turning MOZ_RUST_URLPARSE off in releases
That's probably a better question for a Necko peer. Assuming we're properly handling failures and not going to add any extraneous crashes from panicing it might be OK. The perf hit might suck, maybe rillian can comment on what he did with the mp4parser.
We're still calling both the rust and C++ mp4parser when playback starts, so we can compare the output. The performance hit isn't significant for something which is done once per resource. Always parsing and having a pref to switch which result you use sounds fine in the near term.
Comment on attachment 8802624 [details]
Bug 1151899 - Add code to run both URL parsers at the same time;

https://reviewboard.mozilla.org/r/86944/#review87002

The build integration looks fine.
Attachment #8802624 - Flags: review?(ted) → review+
Comment on attachment 8802625 [details]
Bug 1151899 - Make Necko support rust-url as a parser;

https://reviewboard.mozilla.org/r/86958/#review87004

The build system bits are fine, just a couple of comments on other things.

::: modules/libpref/init/all.js:1866
(Diff revision 2)
>  pref("network.standard-url.encode-utf8", true);
>  
>  // The maximum allowed length for a URL - 1MB default
>  pref("network.standard-url.max-length", 1048576);
>  
> +pref("network.standard-url.use-rust", false);

Wouldn't hurt to put a comment here like the surrounding prefs have.

::: netwerk/test/unit/test_rusturl.js:1
(Diff revision 2)
> +const StandardURL = Components.Constructor("@mozilla.org/network/standard-url;1",

Does this test do anything to explicitly test the Rust URL parser, or is this just "we're running both in parallel so we get test coverage by parsing URLs"?

You might want to put a comment here mentioning this, since it's not clear.

Alternately: make the test flip the pref to actually test just the Rust URL parser.
Attachment #8802625 - Flags: review?(ted) → review+
> Does this test do anything to explicitly test the Rust URL parser, or is this just "we're running both in parallel so we get test coverage by parsing URLs"?

I think it's the latter. Added the comments and made it use the pref.

(Bear in mind -- I've not actually written that patch, I just took the old one and made it work with rust-url/m-c master. This also means that the patches probably need review from a Necko peer other than valentin)
(having trouble pushing again, so those issues are marked as fixed but the corresponding patch hasn't been pushed up)
Comment on attachment 8802624 [details]
Bug 1151899 - Add code to run both URL parsers at the same time;

https://reviewboard.mozilla.org/r/86944/#review87426

::: netwerk/base/rust-url-capi/test/Makefile:4
(Diff revision 1)
> +all:
> +	cd .. && cargo build
> +	g++ -Wall -o test test.cpp ../target/debug/librust*.a -ldl -lpthread -lrt -lgcc_s -lpthread -lc -lm -std=c++0x
> +	./test

I don't think this would be needed anymore.

::: toolkit/library/rust/Cargo.lock:19
(Diff revision 1)
>  name = "gkrust-shared"
>  version = "0.1.0"
>  dependencies = [
>   "mp4parse_capi 0.5.1",
>   "nsstring 0.1.0",
> + "rust_url_capi 0.0.1",

Maybe it would have been better to use nsstring in the API? I can think of both pros and cons.
Comment on attachment 8802624 [details]
Bug 1151899 - Add code to run both URL parsers at the same time;

https://reviewboard.mozilla.org/r/86944/#review87428
Attachment #8802624 - Flags: review?(valentin.gosu) → review+
Comment on attachment 8802625 [details]
Bug 1151899 - Make Necko support rust-url as a parser;

https://reviewboard.mozilla.org/r/86958/#review87430

So, one of the reasons why I felt it's best to change nsStandardURL was that a few other classes extend nsStandardURL and just override a couple of methods.
However, the changes are a bit too verbose + a lot of duplicated code. And this has the side effect on making it difficult to test just the rust URL parser by itself.

What are your thoughts on creating a new class which implements nsIURI/nsIURL/etc and calls into rust-url? This would make it easy to test.
Running them in parallel would mean doing about the same, but instead of exposing nsStandardURL to the rust object/working with raw pointers, and ifdefs, we'd just have a RefPtr<RustURL> and use that in a cleaner way.

::: netwerk/base/nsStandardURL.cpp:1728
(Diff revision 3)
> +#ifdef MOZ_RUST_URLPARSE
> +    nsAutoCString rustSpec;
> +    rusturl_get_spec(mURL, &rustSpec);
> +    nsAutoCString cSpec(mSpec);
> +    CheckDifference(__FUNCTION__, cSpec, rustSpec);
> +#endif

These are way too many ifdefs to include in nsStandardURL. It feels a lot like we're polluting the code :)

The best way to handle this is to come up with a couple of macros that depend on MOZ_RUST_URLPARSE and become no-ops when it is not defined, and use the macros to forward setter operations, and check getter operations have the same result.

::: netwerk/base/nsStandardURL.cpp:1728
(Diff revision 3)
> +#ifdef MOZ_RUST_URLPARSE
> +    nsAutoCString rustSpec;
> +    rusturl_get_spec(mURL, &rustSpec);
> +    nsAutoCString cSpec(mSpec);
> +    CheckDifference(__FUNCTION__, cSpec, rustSpec);
> +#endif

These are way too many ifdefs to include in nsStandardURL. It feels a lot like we're polluting the code :)

The best way to handle this is to come up with a couple of macros that depend on MOZ_RUST_URLPARSE and become no-ops when it is not defined, and use the macros to forward setter operations, and check getter operations have the same result.
Attachment #8802625 - Flags: review?(valentin.gosu) → review-
> Maybe it would have been better to use nsstring in the API? I can think of both pros and cons.

This is the plan, I just don't want to do that in this patch -- I'd prefer to land a minimal working product which we can iterate on once it lands.

> What are your thoughts on creating a new class which implements nsIURI/nsIURL/etc and calls into rust-url? This would make it easy to test.

Wouldn't it make it harder to replace nsIStandardURI with this (given the amount of places that refer to it directly)? We want to be able to compare both side-by-side as well, so we'd end up with code in nsIStandardURI anyway.

> The best way to handle this is to come up with a couple of macros that depend on MOZ_RUST_URLPARSE and become no-ops when it is not defined,

Will do.
After chatting in IRC with valentin I'm landing the first two patches (the Cargo vendoring patch not shown on mozreview, and the rust-url-capi vendoring). The nsStandardURI changes can be made on top of this by either me or valentin.
Yeah, parts of the third patch (which I hadn't included in the push) are necessary for the second patch to work. Will amend, push to try, and push again.
Flags: needinfo?(valentin.gosu)
Pushed by manishearth@gmail.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/3d88f715c2d2
Vendor rust-url-capi deps; r=valentin,ted
https://hg.mozilla.org/integration/mozilla-inbound/rev/54a28d6fbed4
Include rust-url-capi (leave-open); r=valentin,ted
https://hg.mozilla.org/mozilla-central/rev/3d88f715c2d2
https://hg.mozilla.org/mozilla-central/rev/54a28d6fbed4
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla52
Looks like leave-open needs to be used on both.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
(In reply to Manish Goregaokar [:manishearth] from comment #48)
> Looks like leave-open needs to be used on both.

leave-open needs to be set as a keyword on the bug, not as a thing in the commit message.

Also, now when I build on Linux with --enable-rust, I get linker errors:

23:10.38 rust/x86_64-unknown-linux-gnu/release/libgkrust.a(gkrust.0.o):gkrust.cgu-0.rs:function rusturl_get_spec: error: undefined reference to 'c_fn_set_size'
23:10.38 rust/x86_64-unknown-linux-gnu/release/libgkrust.a(gkrust.0.o):gkrust.cgu-0.rs:function rusturl_get_spec: error: undefined reference to 'c_fn_get_buffer'
23:10.38 rust/x86_64-unknown-linux-gnu/release/libgkrust.a(gkrust.0.o):gkrust.cgu-0.rs:function rusturl_get_scheme: error: undefined reference to 'c_fn_set_size'
23:10.38 rust/x86_64-unknown-linux-gnu/release/libgkrust.a(gkrust.0.o):gkrust.cgu-0.rs:function rusturl_get_scheme: error: undefined reference to 'c_fn_get_buffer'
23:10.38 rust/x86_64-unknown-linux-gnu/release/libgkrust.a(gkrust.0.o):gkrust.cgu-0.rs:function rusturl_get_username: error: undefined reference to 'c_fn_set_size'
23:10.38 rust/x86_64-unknown-linux-gnu/release/libgkrust.a(gkrust.0.o):gkrust.cgu-0.rs:function rusturl_get_username: error: undefined reference to 'c_fn_get_buffer'
23:10.38 rust/x86_64-unknown-linux-gnu/release/libgkrust.a(gkrust.0.o):gkrust.cgu-0.rs:function rusturl_get_username: error: undefined reference to 'c_fn_set_size'
23:10.38 rust/x86_64-unknown-linux-gnu/release/libgkrust.a(gkrust.0.o):gkrust.cgu-0.rs:function rusturl_get_password: error: undefined reference to 'c_fn_get_buffer'

Do you know if there's an external library that needs to be linked to make this work?
Oh, I see.

I thought I fixed the linker errors in https://hg.mozilla.org/mozilla-central/rev/54a28d6fbed4#l2.12 Does removing the ifdef work?
Removing the ifdef does work. I also tried building with --enable-rust on OS X and that worked fine even with the #ifdef. It might be that my Linux build needs clobbering or something. I'll try that at some point and let you know.
I think it's fine to land a patch with the ifdefs removed anyway. Those functions are extraneous in non-rust builds, but won't break anything. On the plus side if you had a build with rust on but MOZ_RUST_URLPARSE off it would still compile if the ifdefs are removed. These functions will be moved or removed eventually anyway, their current position is temporary (and we might be able to just use the nsstring bindings here).
I think it was just an issue in my local setup, a clobber build didn't run into any problems. Sorry for the noise.
What's the license of files under netwerk/base/rust-url-capi/ ? According to https://www.mozilla.org/en-US/MPL/

  Correctly Licensing New Source Code

  Any new code checked into Mozilla's source repositories needs to
  comply with Mozilla's source code licensing policy. Please use
  the appropriate header text at the top of each file.
The only contributors to that are me and Valentin. I release it under any license Valentin wants to put it under. Recommend MPL, since it's supposed to be part of m-c anyway.
Attachment #8807884 - Attachment is obsolete: true
Attachment #8807885 - Attachment is obsolete: true
Attachment #8807886 - Attachment is obsolete: true
Attachment #8807887 - Attachment is obsolete: true
Comment on attachment 8808407 [details]
Bug 1151899 - Add RustURL which implements nsIURI,nsIURL,nsIStandardURL,etc

https://reviewboard.mozilla.org/r/91216/#review91414

::: netwerk/base/RustURL.cpp:9
(Diff revision 1)
> +
> +#ifndef MOZ_RUST_URLPARSE
> +  #error "Should be defined"
> +#endif
> +
> +extern "C" int32_t c_fn_set_size(void * container, size_t size)

Leave a comment here that this is temporary and should be replaced with use of the rust nsstring bindings.
Attachment #8802625 - Attachment is obsolete: true
Looks like rusturl_get_path only got the path, not the path+ref+query (which nsStandardURL::GetPath does). Fixed that.

Also updated all getters and setters. It no longer complains about any mismatches :D Last patch ready for review?
Comment on attachment 8807883 [details]
Bug 1151899 - rust-url-capi error code changes

https://reviewboard.mozilla.org/r/90870/#review91332
Attachment #8807883 - Flags: review?(manishearth) → review+
Attachment #8808410 - Flags: review?(valentin.gosu)
Attachment #8808410 - Flags: review?(mcmanus)
Surprisingly no failures on try? https://treeherder.mozilla.org/#/jobs?repo=try&revision=a4d99b46032b

IIRC we run with rust enabled -- is rust-url being disabled somehow? I didn't need any special config other than --enable-rust locally.

The Web seems to work fine locally though. Still getting errors on IDNA, but that is to be expected.

The patch currently isn't pref gated fwiw, so we should be careful about landing.
Aha, the failures on try are because our getter macros fallback to nsStandardURL's result if things fail.

We probably want this for the shipped version, but for try results I'm going to make it possible to switch the fallback mode off.
Old try run: https://treeherder.mozilla.org/#/jobs?repo=try&revision=5e05b470d2881f73dcbc5002d7bf668f56f927f9&selectedJob=29482689

Seems like dom/tests/mochitest/whatwg/test_postMessage_origin.xhtml still fails because of IDNA (locally, try still running), but  browser/base/content/test/urlbar/browser_urlHighlight.js does not, even though it does report mismatches (the corresponding rust-url fix hasn't synced to m-c yet.)
That test didn't fail on try, I guess rust-urlparse is disabled somehow?
(In reply to Manish Goregaokar [:manishearth] from comment #77)
> That test didn't fail on try, I guess rust-urlparse is disabled somehow?

If MOZ_RUST isn't enabled, the test is skipped.

(In reply to Manish Goregaokar [:manishearth] from comment #76)
> Old try run:
> https://treeherder.mozilla.org/#/
> jobs?repo=try&revision=5e05b470d2881f73dcbc5002d7bf668f56f927f9&selectedJob=2
> 9482689
> 
> Seems like dom/tests/mochitest/whatwg/test_postMessage_origin.xhtml still
> fails because of IDNA (locally, try still running), but 
> browser/base/content/test/urlbar/browser_urlHighlight.js does not, even
> though it does report mismatches (the corresponding rust-url fix hasn't
> synced to m-c yet.)

Rust-url does the right thing here. We are trying to fix this on the gecko side, in bug 945240.
> If MOZ_RUST isn't enabled, the test is skipped.

Right, but IIRC it's enabled by default.

(This isn't the rust-url test, this is a different test that's known to fail with rust-url)


> Rust-url does the right thing here.

Yeah, I'm using the test as a canary to see if the infra is running with Rust on.

----


Looking at the raw log, it is indeed running with the new rust url stuff. See https://public-artifacts.taskcluster.net/O5eIMi03TVGDMUeVF3vCvQ/0/public/logs/live_backing.log :


> [task 2016-11-09T23:15:46.954962Z] 23:15:46     INFO - Spec diff detected after setter (SetSpec): rust: http://sub1.xn--hxajbheg2az3al.xn--jxalpdlp/tests/dom/tests/mochitest/whatwg/postMessage_origin_helper.xhtml standard-url: http://sub1.παÏάδειγμα.δοκιμή/tests/dom/tests/mochitest/whatwg/postMessage_origin_helper.xhtml

So the CI did run with rust-url enabled, but somehow the test only failed on my local machine. Huh.


------

Regardless, we can land this now if we want, with the fallback macro turned on. Just needs review.
Attachment #8808407 - Flags: review?(mcmanus) → review?(daniel)
Attachment #8808408 - Flags: review?(mcmanus) → review?(daniel)
Attachment #8808410 - Flags: review?(mcmanus) → review?(daniel)
Attachment #8808409 - Flags: review?(mcmanus) → review?(daniel)
Comment on attachment 8808407 [details]
Bug 1151899 - Add RustURL which implements nsIURI,nsIURL,nsIStandardURL,etc

https://reviewboard.mozilla.org/r/91216/#review93102

I don't have any major concernts, it all looks pretty good. But because of the number of little nits I'm marking this a r-.

::: netwerk/base/RustURL.cpp:18
(Diff revision 1)
> +}
> +
> +extern "C" char * c_fn_get_buffer(void * container)
> +{
> +  return ((nsACString *) container)->BeginWriting();
> +}

Maybe this is obvious to everyone else, then free to ignore. But these two functions provided here look very magical so maybe they could get a short little comment describing their purposes?

::: netwerk/base/RustURL.cpp:38
(Diff revision 1)
> +  NS_INTERFACE_MAP_ENTRY(nsIStandardURL)
> +  // NS_INTERFACE_MAP_ENTRY(nsISerializable)
> +  NS_INTERFACE_MAP_ENTRY(nsIClassInfo)
> +  NS_INTERFACE_MAP_ENTRY(nsIMutable)
> +  // NS_INTERFACE_MAP_ENTRY(nsIIPCSerializableURI)
> +  // NS_INTERFACE_MAP_ENTRY(nsISensitiveInfoHiddenURI)

Is there a reason to keep the commented out lines here? If so, mention that. If not, remove them?

::: netwerk/base/RustURL.cpp:101
(Diff revision 1)
> +  if (NS_FAILED(rv)) {
> +    return rv;
> +  }
> +  if (!part.IsEmpty()) {
> +      rustResult += part;
> +      rustResult += "@";

Suddenly 4-spaces indent? The rest seems to be 2-spaces!

::: netwerk/base/RustURL.cpp:117
(Diff revision 1)
> +}
> +
> +NS_IMETHODIMP
> +RustURL::GetScheme(nsACString & aScheme)
> +{
> +  return (nsresult) rusturl_get_scheme(mURL.get(), &aScheme);

This style of C typecast is done on multiple places in this file but is generally frowned upon in our styleguide. static_cast<> ?

::: netwerk/base/RustURL.cpp:213
(Diff revision 1)
> +  if (NS_FAILED(rv)) {
> +      return rv;
> +  }
> +  if (port >= 0) {
> +      aHostPort.Append(':');
> +      aHostPort.AppendInt(port);

A few 4-space indents again

::: netwerk/base/RustURL.cpp:230
(Diff revision 1)
> +
> +NS_IMETHODIMP
> +RustURL::SetHostAndPort(const nsACString & hostport)
> +{
> +  ENSURE_MUTABLE();
> +  // TODO: check that this behaves properly

Is this still pending? What check is there left to do?

::: netwerk/base/RustURL.cpp:302
(Diff revision 1)
> +
> +  return SetSpec(path);
> +}
> +
> +NS_IMETHODIMP
> +RustURL::Equals(nsIURI *other, bool *_retval)

aRetVal?

::: netwerk/base/RustURL.cpp:323
(Diff revision 1)
> +  return NS_OK;
> +}
> +
> +NS_IMETHODIMP
> +RustURL::SchemeIs(const char * aScheme, bool *_retval)
> +{

and here too, and a few more places I won't repeat. In general the a-naming for arguments seems not very consistently used. On purpose?

::: netwerk/base/RustURL.cpp:380
(Diff revision 1)
> +}
> +
> +NS_IMETHODIMP
> +RustURL::GetOriginCharset(nsACString & aOriginCharset)
> +{
> +  // TODO: do we want to support other charsets?

Seems like an unnecssary question. If we need to, something will break with this code (and we fix it). If not, the comment just adds confusion.
Attachment #8808407 - Flags: review?(daniel) → review-
Comment on attachment 8808407 [details]
Bug 1151899 - Add RustURL which implements nsIURI,nsIURL,nsIStandardURL,etc

https://reviewboard.mozilla.org/r/91216/#review93102

> Maybe this is obvious to everyone else, then free to ignore. But these two functions provided here look very magical so maybe they could get a short little comment describing their purposes?

It's a temporary function, will be replaced by mystor's string work later. But I'll add a comment.

> Is there a reason to keep the commented out lines here? If so, mention that. If not, remove them?

valentin would know
I'm not sure what's going on here. There seem to be two review requests open?

https://reviewboard.mozilla.org/r/86944/ is the one that got updated when I just pushed, but https://reviewboard.mozilla.org/r/91216/#issue-summary is the one with your comments. Furthermore, I don't see the buttons to mark your comments as addressed.

Let's use https://reviewboard.mozilla.org/r/86944/#issue-summary from now on I guess?
Comment on attachment 8810988 [details]
Bug 1151899 - rust-url-capi error code changes;

https://reviewboard.mozilla.org/r/93246/#review93224
Attachment #8810988 - Flags: review?(manishearth) → review+
(In reply to Daniel Stenberg [:bagder] from comment #80)
> > +NS_IMETHODIMP
> > +RustURL::GetScheme(nsACString & aScheme)
> > +{
> > +  return (nsresult) rusturl_get_scheme(mURL.get(), &aScheme);
> 
> This style of C typecast is done on multiple places in this file but is
> generally frowned upon in our styleguide. static_cast<> ?

IMO in this context static_cast<nsresult>() is too verbose and offers no real safety improvement.
I would keep using the C-style casts, or change the C-API to return nsresult or uint32_t so no cast is needed.
Attachment #8808408 - Flags: review?(daniel)
Attachment #8808409 - Flags: review?(daniel)
Attachment #8808410 - Flags: review?(valentin.gosu)
Attachment #8808410 - Flags: review?(daniel)
Attachment #8807883 - Attachment is obsolete: true
Attachment #8808407 - Attachment is obsolete: true
Attachment #8808408 - Attachment is obsolete: true
Attachment #8808409 - Attachment is obsolete: true
Attachment #8808410 - Attachment is obsolete: true
Comment on attachment 8802624 [details]
Bug 1151899 - Add code to run both URL parsers at the same time;

https://reviewboard.mozilla.org/r/86944/#review93726

LGTM. Just one little question/nit that I leave to you to deal with as you see fit.

::: netwerk/base/nsStandardURL.cpp:68
(Diff revision 9)
> +#define CALL_RUST_GETTER_INT(result, func, ...)  \
> +do {                                             \
> +    int32_t backup = *result;                    \
> +    mRustURL->func(__VA_ARGS__);                 \
> +    if (backup != *result) {                     \
> +        printf("Diff detected calling getter (%s): rust: %d standard-url: %d\n", #func, *result , backup); \

Shouldn't these printf() lines rather be LOG() lines ?
Attachment #8802624 - Flags: review?(daniel) → review+
Comment on attachment 8810983 [details]
Bug 1151899 - Add RustURL which implements nsIURI,nsIURL,nsIStandardURL,etc;

https://reviewboard.mozilla.org/r/93240/#review93728

Looks good! I've mentioned some nits but I leave those to you to handle as you see fit.

::: netwerk/base/RustURL.cpp:603
(Diff revision 3)
> +}
> +
> +// nsIStandardURL
> +
> +NS_IMETHODIMP
> +RustURL::Init(uint32_t aUrlType, int32_t aDefaultPort, const nsACString & aSpec, const char * aOriginCharset, nsIURI *aBaseURI)

Our style guide says try to keep code < 80 columns so maybe line break this and a few other really long lines?

::: netwerk/base/RustURL.cpp:608
(Diff revision 3)
> +RustURL::Init(uint32_t aUrlType, int32_t aDefaultPort, const nsACString & aSpec, const char * aOriginCharset, nsIURI *aBaseURI)
> +{
> +  ENSURE_MUTABLE();
> +
> +  if (aBaseURI && net_IsAbsoluteURL(aSpec)) {
> +      aBaseURI = nullptr;

4-indent

::: netwerk/base/RustURL.cpp:617
(Diff revision 3)
> +      return SetSpec(aSpec);
> +  }
> +
> +  nsAutoCString buf;
> +  nsresult rv = aBaseURI->Resolve(aSpec, buf);
> +  if (NS_FAILED(rv)) return rv;

braces!
Attachment #8810983 - Flags: review?(daniel) → review+
Comment on attachment 8810984 [details]
Bug 1151899 - Add RustURL to netmodules;

https://reviewboard.mozilla.org/r/93242/#review93732
Attachment #8810984 - Flags: review?(daniel) → review+
Comment on attachment 8810985 [details]
Bug 1151899 - Add test for RustURL;

https://reviewboard.mozilla.org/r/93244/#review93734
Attachment #8810985 - Flags: review?(daniel) → review+
Okay, looks like we're almost ready!

I've turned on MOZ_RUST_URLPARSE_FALLBACK so that these patches will only report failures, and not break anything (by falling back to nsStandardURL when RustURL doesn't match). This means that users of --enable-rust will just see a slight perf hit since we parse twice. We can hammer away at the tests later, and worry about turning it on by default then.

I'm not sure if we should turn the printf into a LOG. valentin, what do you think?


ted: If the try build passes, are we okay to land this? As I said above, this should not impact behavior, only perf. (I'm a bit wary of turning on a chunk of Rust code in Firefox, even if it has no impact on behavior)
Flags: needinfo?(valentin.gosu)
Flags: needinfo?(ted)
I don't see any reason not to. If we find any serious problems after landing you can disable it.
Flags: needinfo?(ted)
(In reply to Manish Goregaokar [:manishearth] from comment #109)
> Okay, looks like we're almost ready!
> 
> I've turned on MOZ_RUST_URLPARSE_FALLBACK so that these patches will only
> report failures, and not break anything (by falling back to nsStandardURL
> when RustURL doesn't match). This means that users of --enable-rust will
> just see a slight perf hit since we parse twice. We can hammer away at the
> tests later, and worry about turning it on by default then.
> 
> ted: If the try build passes, are we okay to land this? As I said above,
> this should not impact behavior, only perf. (I'm a bit wary of turning on a
> chunk of Rust code in Firefox, even if it has no impact on behavior)

The performance hit would indeed be a problem. The URL parser is quite slow as it is.
I would propose this:
1. Only build RustURL.cpp on nightly, until we can iron out all of the differences.
2. Have a runtime pref to decide if we call the rust methods.

After landing this we can push to fix all of the issues that are left, and encourage people to report differences.
When we are confident enough with the implementation, we can run them in parallel for aurora/beta maybe even release, depending on how much slower it gets. Then we can run experiments, switching people to the rust implementation, and finally remove the older C++ URL parser completely.
We will probably need some telemetry for how often the parsers differ, and in which part of the URL.

> I'm not sure if we should turn the printf into a LOG. valentin, what do you
> think?
If I remember correctly, printfs don't show up in opt builds, so we should convert them to LOG()
Flags: needinfo?(valentin.gosu)
> Only build RustURL.cpp on nightly

Yeah, I think this was part of the plan. Doing.

> Have a runtime pref to decide if we call the rust methods.

Do you mean have a pref that decides whether or not we call the Rust methods or a pref that decides whether or not we use the Rust result? I.e. will it be the rough equivalent of MOZ_RUST_URLPARSE or of MOZ_RUST_URLPARSE_FALLBACK?

Also, given that we are landing this with fallback on, does this need to be done in these patches itself?

> If I remember correctly, printfs don't show up in opt builds, so we should convert them to LOG()

It seems like you need to pass special env vars to make LOG show up, though: https://developer.mozilla.org/en-US/docs/Mozilla/Debugging/HTTP_logging
Made it conditional on it being nightly/default.

Didn't fix change to LOG because it seems like LOG is getting used for other things; this difference should be a rare occasion and shouldn't require changes to the env to display imo. Let me know what you think.
(In reply to Manish Goregaokar [:manishearth] from comment #115)
> Made it conditional on it being nightly/default.
> 
> Didn't fix change to LOG because it seems like LOG is getting used for other
> things; this difference should be a rare occasion and shouldn't require
> changes to the env to display imo. Let me know what you think.

Recently we added the ability to start logging at runtime (bug 1303762) - Go to about:networking in the Logging tab.
I still haven't checked, but if we keep it with printfs, we might not get any output at all in opt builds. Please change to LOG. After that let's land it. We'll see what the performance hit is, and we'll then add the runtime pref.

Thanks for all the great work!
Fixed log, fixed configure.
Pushed by manishearth@gmail.com:
https://hg.mozilla.org/integration/autoland/rev/4984c5478d52
rust-url-capi error code changes; r=manishearth
https://hg.mozilla.org/integration/autoland/rev/f87f327d7ef2
Add RustURL which implements nsIURI,nsIURL,nsIStandardURL,etc; r=bagder
https://hg.mozilla.org/integration/autoland/rev/2b97d2ef1895
Add RustURL to netmodules; r=bagder
https://hg.mozilla.org/integration/autoland/rev/c778f23ccaf9
Add test for RustURL; r=bagder
https://hg.mozilla.org/integration/autoland/rev/910b4b74261d
Add code to run both URL parsers at the same time; r=bagder,ted,valentin
Backed out for build bustage:

https://hg.mozilla.org/integration/autoland/rev/099dfc7c8f94fca23a079ee00299d5223ea20ebb
https://hg.mozilla.org/integration/autoland/rev/7b53b23cea1670245f05c2ecc9d43a549babd3b6
https://hg.mozilla.org/integration/autoland/rev/e53848c09086409b9c4728d38c289b2e36f64fab
https://hg.mozilla.org/integration/autoland/rev/752fa0d6f6ed2f27e5345bc2657de4d7dfc9b496
https://hg.mozilla.org/integration/autoland/rev/c1ff69d38065f9205fdf04289dc60eab36ebd7e2

Push with failures: https://treeherder.mozilla.org/#/jobs?repo=autoland&revision=910b4b74261d8b974cf023fa6eb981fa9dfce66d
Failure log: https://treeherder.mozilla.org/logviewer.html#?job_id=6818395&repo=autoland

 /builds/slave/autoland-lx-000000000000000000/build/src/netwerk/base/RustURL.cpp:342:37: error: cannot allocate an object of abstract type 'mozilla::net::RustURL'
/builds/slave/autoland-lx-000000000000000000/build/src/netwerk/base/RustURL.cpp:474:53: error: no 'nsresult mozilla::net::RustURL::GetFilePath(nsACString_internal&)' member function declared in class 'mozilla::net::RustURL'
/builds/slave/autoland-lx-000000000000000000/build/src/netwerk/base/RustURL.cpp:480:59: error: no 'nsresult mozilla::net::RustURL::SetFilePath(const nsACString_internal&)' member function declared in class 'mozilla::net::RustURL'
/builds/slave/autoland-lx-000000000000000000/build/src/netwerk/base/RustURL.cpp:487:47: error: no 'nsresult mozilla::net::RustURL::GetQuery(nsACString_internal&)' member function declared in class 'mozilla::net::RustURL'
/builds/slave/autoland-lx-000000000000000000/build/src/netwerk/base/RustURL.cpp:493:53: error: no 'nsresult mozilla::net::RustURL::SetQuery(const nsACString_internal&)' member function declared in class 'mozilla::net::RustURL'
/builds/slave/autoland-lx-000000000000000000/build/src/netwerk/base/RustURL.cpp:554:37: error: cannot allocate an object of abstract type 'mozilla::net::RustURL'
/builds/slave/autoland-lx-000000000000000000/build/src/netwerk/base/RustURL.cpp:570:37: error: cannot allocate an object of abstract type 'mozilla::net::RustURL'
Flags: needinfo?(valentin.gosu)
bug 1310483 added a new interface to the mix
Rebased build works locally, and logging works when the log string is nsStandardURL:5.
Pushed by manishearth@gmail.com:
https://hg.mozilla.org/integration/autoland/rev/627cc696e6e5
rust-url-capi error code changes; r=manishearth
https://hg.mozilla.org/integration/autoland/rev/96f002110a9a
Add RustURL which implements nsIURI,nsIURL,nsIStandardURL,etc; r=bagder
https://hg.mozilla.org/integration/autoland/rev/ca90dbb06b74
Add RustURL to netmodules; r=bagder
https://hg.mozilla.org/integration/autoland/rev/17436aa19280
Add test for RustURL; r=bagder
https://hg.mozilla.org/integration/autoland/rev/79ec544204fc
Add code to run both URL parsers at the same time; r=bagder,ted,valentin
this landed in 53, not 52, updating status.
Target Milestone: mozilla52 → mozilla53
Depends on: 1318943
Depends on: 1318978
Blocks: 1319141
Depends on: 1319347
valentin, report on this in HI?
Flags: needinfo?(valentin.gosu)
Depends on: 1340695
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: