Closed Bug 808292 Opened 12 years ago Closed 10 years ago

Implement path-level host-source matching to CSP

Categories

(Core :: DOM: Core & HTML, enhancement, P1)

enhancement

Tracking

()

RESOLVED FIXED
mozilla35

People

(Reporter: emk, Assigned: ckerschb)

References

(Blocks 2 open bugs, )

Details

(Keywords: dev-doc-complete, Whiteboard: [CSP 1.1])

Attachments

(5 files, 8 obsolete files)

18.88 KB, patch
ckerschb
: review+
Details | Diff | Splinter Review
11.71 KB, patch
ckerschb
: review+
Details | Diff | Splinter Review
6.58 KB, patch
ckerschb
: review+
Details | Diff | Splinter Review
9.65 KB, patch
ckerschb
: review+
Details | Diff | Splinter Review
15.18 KB, patch
ckerschb
: review+
Details | Diff | Splinter Review
      No description provided.
Severity: normal → enhancement
Priority: -- → P2
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → DUPLICATE
Why is this bug duped forward?
I duped it because the initial bug comment is empty and the other bug had more info.  I should not have duped it because I think maybe this is not the same as bug 916054.  This is about including the path in host-source matching, not calling to disregard paths in source-expressions, right?
Status: RESOLVED → REOPENED
Resolution: DUPLICATE → ---
(In reply to Sid Stamm [:geekboy or :sstamm] from comment #3)
> This is about including the path in host-source matching, not
> calling to disregard paths in source-expressions, right?

Right. We should wait until the new parser is landed, it's not worth it to add support to the old parser since it's going to be removed soon.
The new parser already supports the parsing of paths and also stores the path along with the scheme, host, port, etc. Still needs code for enforcing the path though.
Providing a WIP patch for enforcing path-level host source matching. This needs testing but should work pretty much as is.

Further, this patch changes ::toString functions so that nsCSPHostSrc also appends the path after a URI. In other words, we have to update TestCSPParser to also include a path.
Assignee: nobody → mozilla
Priority: P2 → P1
Attached patch 1_bug_808292_parser_tests.patch (obsolete) — Splinter Review
Attachment #8435284 - Attachment is obsolete: true
Attachment #8472557 - Flags: review?(grobinson)
Attachment #8472558 - Flags: review?(grobinson)
Attached patch 3_bug_808292_utils_changes.patch (obsolete) — Splinter Review
Attachment #8472559 - Flags: review?(grobinson)
Attached patch 4_bug_808292_mochitests.patch (obsolete) — Splinter Review
Attachment #8472561 - Flags: review?(grobinson)
Attachment #8472562 - Flags: review?(grobinson)
Everything should be pretty much self explanatory, the only thing important to note is, that I cleaned up the parser by removing fileAndArguments. Since all that information should be stored in the path, I don't think it makes sense to have functions and also members explicit for files. Storing all in mPath is what we want.

Garrett, wanna take a first look?
Once I get your green lights, I will flag sstamm.
Comment on attachment 8472557 [details] [diff] [review]
1_bug_808292_parser_tests.patch

Review of attachment 8472557 [details] [diff] [review]:
-----------------------------------------------------------------

r+, see comment and consider using sensible extensions for tests to avoid general "wat" reactions :)

::: content/base/test/TestCSPParser.cpp
@@ +629,4 @@
>      { "report-uri /report.py",
>        "report-uri http://www.selfuri.com/report.py"},
>      { "img-src http://foo.org:34/report.py",
> +      "img-src http://foo.org:34/report.py" },

It's valid, but .py is a kind of a weird choice for testing img-src. Same goes for media-src and font-src.
Attachment #8472557 - Flags: review?(grobinson) → review+
Attachment #8472558 - Flags: review?(grobinson) → review+
Comment on attachment 8472559 [details] [diff] [review]
3_bug_808292_utils_changes.patch

Review of attachment 8472559 [details] [diff] [review]:
-----------------------------------------------------------------

::: content/base/src/nsCSPUtils.cpp
@@ +345,5 @@
> +  // If there is a path, we have to enforce path-level matching,
> +  // unless the channel got redirected, see:
> +  // http://www.w3.org/TR/CSP11/#source-list-paths-and-redirects
> +  if (!aRedirected && !mPath.IsEmpty()) {
> +    // cloning uri so we can irgnore the ref

Typo: irgnore
Attachment #8472559 - Flags: review?(grobinson) → review+
Attachment #8472561 - Flags: review?(grobinson) → review+
Comment on attachment 8472562 [details] [diff] [review]
5_bug_808292_redirect_tests.patch

Review of attachment 8472562 [details] [diff] [review]:
-----------------------------------------------------------------

Would it make sense to incorporate the functionality of file_csp_path_matching_redirect_server.sjs into file_csp_testserver.sjs instead? I want to avoid adding lots of .sjs files that do *almost* the same thing. For example, this redirect .sjs is very similar to file_csp_redirects_resource.sjs and file_redirect_content.sjs.

::: content/base/test/csp/test_csp_path_matching_redirect.html
@@ +50,5 @@
> +    var divcontent = testframe.contentWindow.document.getElementById('testdiv').innerHTML;
> +    is(divcontent, curTest.expected, "should be blocked in test " + (counter - 1) + "!");
> +  }
> +  catch (e) {
> +    ok(false, "ERROR: could not access content in test " + (coutner - 1) + "!");

Typo: coutner
Attachment #8472562 - Flags: review?(grobinson)
(In reply to Garrett Robinson [:grobinson] from comment #15)
> Comment on attachment 8472562 [details] [diff] [review]
> 5_bug_808292_redirect_tests.patch
> 
> Review of attachment 8472562 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Would it make sense to incorporate the functionality of
> file_csp_path_matching_redirect_server.sjs into file_csp_testserver.sjs
> instead? I want to avoid adding lots of .sjs files that do *almost* the same
> thing. For example, this redirect .sjs is very similar to
> file_csp_redirects_resource.sjs and file_redirect_content.sjs.

I agree, it's not great to have so many *.sjs files around that almost do the same thing. I would like to land it the way the tests are right now anyway and file a follow up where we can collapse as many .sjs files into one as possible.
Attachment #8472557 - Flags: review+ → review?(sstamm)
Attachment #8472558 - Flags: review+ → review?(sstamm)
Attachment #8472559 - Flags: review+ → review?(sstamm)
Attachment #8472561 - Flags: review+ → review?(sstamm)
Attachment #8472562 - Flags: review?(sstamm)
Since Garret r+ed the patches, I am going to flag Sid for his input right away and incorporate Garret's changes once I got review comments from Sid.
Comment on attachment 8472557 [details] [diff] [review]
1_bug_808292_parser_tests.patch

Review of attachment 8472557 [details] [diff] [review]:
-----------------------------------------------------------------

::: content/base/test/TestCSPParser.cpp
@@ +326,5 @@
> +      "script-src http://www.example.com" },
> +    { "script-src http://www.example.com:8888#foo",
> +      "script-src http://www.example.com:8888" },
> +    { "script-src http://www.example.com:8888?foo",
> +      "script-src http://www.example.com:8888" },

What about query string *and* fragment? http://x.com/path?foo#bar
Attachment #8472557 - Flags: review?(sstamm) → review+
Chris: FYI, when you have an r+ on a patch, you can use the "additional review" flag to request another one without clearing the first one.
Comment on attachment 8472561 [details] [diff] [review]
4_bug_808292_mochitests.patch

Review of attachment 8472561 [details] [diff] [review]:
-----------------------------------------------------------------

These mochitests look pretty good.  It would be cleaner if the test used postmessage instead of the DOM "load" event (since we're waiting on a script to load and run, it could post-message), but I think this should be fine since it would be harder to catch the "blocked" cases with postmessage and the scripts should run before "load".

::: content/base/test/csp/test_csp_path_matching.html
@@ +31,5 @@
> +
> +  ["allowed", "test1.example.com?foo"],
> +  ["allowed", "test1.example.com/?foo"],
> +  ["allowed", "test1.example.com/tests/content/base/test/csp/?foo"],
> +  ["allowed", "test1.example.com/tests/content/base/test/csp/file_csp_path_matching.js?foo"],

You test querystrings here, maybe add a fragment # test too.  And/or a querystring with values (?foo=bar).
Attachment #8472561 - Flags: review?(sstamm) → review+
Comment on attachment 8472558 [details] [diff] [review]
2_bug_808292_parser_changes.patch

Review of attachment 8472558 [details] [diff] [review]:
-----------------------------------------------------------------

Sorry this took so long.  It's close, but not quite there yet...

::: content/base/src/nsCSPParser.cpp
@@ +248,5 @@
>    // in case we are parsing unrecognized characters in the following loop.
>    uint32_t charCounter = 0;
>  
> +  while (!atEndOfURI()) {
> +    advance();

I'm not sure this is correct.  The ABNF says things, including:

 unreserved    = ALPHA / DIGIT / "-" / "." / "_" / "~"
 segment       = *pchar
 pchar         = unreserved / pct-encoded / sub-delims / ":" / "@"
 sub-delims    = "!" / "$" / "&" / "'" / "(" / ")"
                  / "*" / "+" / "," / ";" / "="

http://tools.ietf.org/html/rfc3986#section-2.3
http://tools.ietf.org/html/rfc3986#section-3.3

"unreserved" does not include all characters except '#' and '?'.  For example, '[' is not a valid character (nor is whitespace) for a URI path segment.

You should rename atEndOfURI to atEndOfPath and also look for characters not allowed by the RFC's productions for URI paths so we don't accidentally consume too much.
Attachment #8472558 - Flags: review?(sstamm) → review-
Comment on attachment 8472559 [details] [diff] [review]
3_bug_808292_utils_changes.patch

Review of attachment 8472559 [details] [diff] [review]:
-----------------------------------------------------------------

As much as I don't like adding another arg to permits(), I think it's probably the best option right now.  This part looks good.

::: content/base/src/nsCSPContext.cpp
@@ +172,5 @@
>                                 aContentLocation,
>                                 nonce,
> +                               // aExtra is only non-null if
> +                               // the channel got redirected.
> +                               aExtra ? true : false,

eugh, can we just convert aExtra directly into a boolean?  Maybe use "aExtra == nullptr"?

::: content/base/src/nsCSPUtils.cpp
@@ +213,5 @@
>  // ::permits is only called for external load requests, therefore:
>  // nsCSPKeywordSrc and nsCSPHashSource fall back to this base class
>  // implementation which will never allow the load.
>  bool
> +nsCSPBaseSrc::permits(nsIURI* aUri, const nsAString& aNonce, bool aRedirected) const

This is super nitty (feel free to disagree and ignore), but wouldn't aRedirected be a bit more clearly named if it were aWasRedirected)?

@@ +396,5 @@
>        return false;
>      }
>    }
>  
> +  // At the end: scheme, host, parth, and port match -> allow the load.

typo: parth -> path
Attachment #8472559 - Flags: review?(sstamm) → review+
(In reply to Sid Stamm [:geekboy or :sstamm] from comment #21)
> Comment on attachment 8472558 [details] [diff] [review]
> 2_bug_808292_parser_changes.patch
> Sorry this took so long.  It's close, but not quite there yet...

No problem at all.

> ::: content/base/src/nsCSPParser.cpp
> @@ +248,5 @@
> >    // in case we are parsing unrecognized characters in the following loop.
> >    uint32_t charCounter = 0;
> >  
> > +  while (!atEndOfURI()) {
> > +    advance();
> 
> I'm not sure this is correct.  The ABNF says things, including:
> 
>  unreserved    = ALPHA / DIGIT / "-" / "." / "_" / "~"
>  segment       = *pchar
>  pchar         = unreserved / pct-encoded / sub-delims / ":" / "@"
>  sub-delims    = "!" / "$" / "&" / "'" / "(" / ")"
>                   / "*" / "+" / "," / ";" / "="
> 
> http://tools.ietf.org/html/rfc3986#section-2.3
> http://tools.ietf.org/html/rfc3986#section-3.3
> 
> "unreserved" does not include all characters except '#' and '?'.  For
> example, '[' is not a valid character (nor is whitespace) for a URI path
> segment.

I am aware of the spec here, so to answer your specific concern:
The tokenizer splits tokens based on whitespace, so there can't be a whitespace within a uri.
I agree that the spec does not include '[' for example, but a filename can contain '[' (e.g. my[file.js). I would rather relax the spec here a little and also support such special characters and only stop parsing once we hit '#' or '?'. Hence my decision. I think that is reasonable.

> You should rename atEndOfURI to atEndOfPath and also look for characters not
> allowed by the RFC's productions for URI paths so we don't accidentally
> consume too much.

The path is part of an URI in this parsing context. Also, consider the following example: www.test.com#foo. There is no path involved, but the uri ends right before #. I would rather keep the current name.
(In reply to Sid Stamm [:geekboy or :sstamm] from comment #22)
> Comment on attachment 8472559 [details] [diff] [review]
> 3_bug_808292_utils_changes.patch
> 
> Review of attachment 8472559 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> As much as I don't like adding another arg to permits(), I think it's
> probably the best option right now.  This part looks good.
> 
> ::: content/base/src/nsCSPContext.cpp
> @@ +172,5 @@
> >                                 aContentLocation,
> >                                 nonce,
> > +                               // aExtra is only non-null if
> > +                               // the channel got redirected.
> > +                               aExtra ? true : false,
> 
> eugh, can we just convert aExtra directly into a boolean?  Maybe use "aExtra
> == nullptr"?

We can, but I think we are more explicit by what we are doing if we use 'true' or 'false'. Easier to read, but if you want, I can incorporate that change. Doesn't matter a whole lot.

> ::: content/base/src/nsCSPUtils.cpp
> @@ +213,5 @@
> >  // ::permits is only called for external load requests, therefore:
> >  // nsCSPKeywordSrc and nsCSPHashSource fall back to this base class
> >  // implementation which will never allow the load.
> >  bool
> > +nsCSPBaseSrc::permits(nsIURI* aUri, const nsAString& aNonce, bool aRedirected) const
> 
> This is super nitty (feel free to disagree and ignore), but wouldn't
> aRedirected be a bit more clearly named if it were aWasRedirected)?

That sounds reasonable to me.
Comment on attachment 8472562 [details] [diff] [review]
5_bug_808292_redirect_tests.patch

Review of attachment 8472562 [details] [diff] [review]:
-----------------------------------------------------------------

I think the tests here are fine, but it's a little complicated to figure out what's happening in the test given all the various files involved.  Maybe add some more comments to clear it up before landing. (See my comments below.)

::: content/base/test/csp/test_csp_path_matching_redirect.html
@@ +26,5 @@
> + * as described in the spec, see:
> + * http://www.w3.org/TR/CSP11/#source-list-paths-and-redirects
> + */
> +
> +var policy = "script-src http://example.com http://test1.example.com/fooFolder";

We should test a more representative path (use more variety of characters, please)

@@ +31,5 @@
> +
> +var tests = [
> +  {
> +    expected: "blocked",
> +    uri: "tests/content/base/test/csp/file_csp_path_matching.html"

This is a little confusing.  This document *does* load, but the script it references is blocked because the path in test1.example.com is different than "fooFolder", right?  That's not at all obvious from this data structure (I have to go dig into this file to figure out what the hell is going on).  Can you merge the comment above (description of the test) with this data structure so without opening file_csp_path_matching.html, I can see what it's supposed to do?  (Same with the second test, it's not clear why it's allowed or that file_csp_path_matching_redirect.html loads a script that redirects.)
Attachment #8472562 - Flags: review?(sstamm) → review+
(In reply to Garrett Robinson [:grobinson] from comment #15)
> Would it make sense to incorporate the functionality of
> file_csp_path_matching_redirect_server.sjs into file_csp_testserver.sjs
> instead? I want to avoid adding lots of .sjs files that do *almost* the same
> thing. For example, this redirect .sjs is very similar to
> file_csp_redirects_resource.sjs and file_redirect_content.sjs.

I think actually file_csp_redirects_resource.sjs can be reused for this purpose with very little (if any) modification.
(In reply to Christoph Kerschbaumer [:ckerschb] from comment #23)
> The tokenizer splits tokens based on whitespace, so there can't be a
> whitespace within a uri.

Great.  Whitespace won't be an issue here.

> I agree that the spec does not include '[' for example, but a filename can
> contain '[' (e.g. my[file.js).

Not according to rfc3986; the ABNF doesn't differentiate between a path segment and a filename, so "[" is not valid in a filename.

> I would rather relax the spec here a little
> and also support such special characters and only stop parsing once we hit
> '#' or '?'. Hence my decision. I think that is reasonable.

Sure... but what about '@' or ':'?  What does it mean to encounter them in the path?

> The path is part of an URI in this parsing context. Also, consider the
> following example: www.test.com#foo. There is no path involved, but the uri
> ends right before #. I would rather keep the current name.

But you are not actually at the end of the URI in your example, so this name is confusing!  The URI does not end until the second o in foo.  A fragment is part of the URI, so the name is wrong.  An empty path is still a valid path (as per the RFC), so I'd prefer atEndOfPath or atEndOfURIPath to atEndOfURI.


(In reply to Christoph Kerschbaumer [:ckerschb] from comment #24)
> > > +                               aExtra ? true : false,
> > 
> > eugh, can we just convert aExtra directly into a boolean?  Maybe use "aExtra
> > == nullptr"?
> 
> We can, but I think we are more explicit by what we are doing if we use
> 'true' or 'false'. Easier to read, but if you want, I can incorporate that
> change. Doesn't matter a whole lot.

Your comment says "aExtra is only non-null if the channel got redirected."  If the goal is to pass in "true" if the channel got redirected, then we want "!(aExtra == nullptr)" for an explicit value mapping or x!=z instead of !(x==z).  

One thing that drives me nuts in code everywhere is any sort of "if(x) { return true; } else { return false; }" or equivalent.
I incorporated all of your suggestions and nits in all of the different patches. In particular I added tests involving querystrings and fragments for both, the compiled code tests and mochitests. I added and updated comments where appropriate. Of particular importance is to closely review the parser part again. You convinced me to use atEndOfPath() and also to introcude a new function called isValidPathChar() which allows us keep the current structure in everything else but subPath() but with the upside of additionaly checking for valid path characters in subPath(). It's more restrictive than my inital approach but probably adds more robustness to the parser for paths.
Carrying over r+ from sstamm and grobinson.
Attachment #8472557 - Attachment is obsolete: true
Attachment #8472558 - Attachment is obsolete: true
Attachment #8472559 - Attachment is obsolete: true
Attachment #8472561 - Attachment is obsolete: true
Attachment #8472562 - Attachment is obsolete: true
Attachment #8493008 - Flags: review+
Attachment #8493009 - Flags: review?(sstamm)
Attached patch 3_bug_808292_utils_changes.patch (obsolete) — Splinter Review
Carrying over r+ from sstamm and grobinson.
Attachment #8493010 - Flags: review+
Carrying over r+ from sstamm and grobinson.
Attachment #8493011 - Flags: review+
Carrying over r+ from sstamm and grobinson.
Attachment #8493012 - Flags: review+
Comment on attachment 8493009 [details] [diff] [review]
2_bug_808292_parser_changes.patch

Review of attachment 8493009 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good.  Just a couple minors.  r=me

::: content/base/src/nsCSPParser.cpp
@@ +157,5 @@
> +  if (atEnd() || peek(QUESTIONMARK) || peek(NUMBER_SIGN)) {
> +    return true;
> +  }
> +  return false;
> +}

Please optimize this to:
return atEnd() || peek(QUESTIONMARK) || peek(NUMBER_SIGN);

@@ +167,5 @@
> +      peek(DASH) || peek(DOT) ||
> +      peek(UNDERLINE) || peek(TILDE)) {
> +    return true;
> +  }
> +  return false;

Again, same reduction here.  return X || X || X;

Also, this function name suggests you would pass it a parameter.  Maybe consider renaming it "atValidPathChar()" or tell me to shut up.

@@ +280,1 @@
>      if (charCounter > kSubHostPathCharacterCutoff) {

You could put the ++ on charCounter inside this if condition if you want to skip adding a line, but it's not necessary.
Attachment #8493009 - Flags: review?(sstamm) → review+
Incorporated minor nits from Comment 34. Carrying over r+
Attachment #8493009 - Attachment is obsolete: true
Attachment #8495179 - Flags: review+
Unbitroted patch. Carrying over r+ from sstamm.

Also, since this is a new CSP feature, pushing to try, testing this on all platforms:

 https://tbpl.mozilla.org/?tree=Try&rev=e9f83a563a27
Attachment #8493010 - Attachment is obsolete: true
Attachment #8495181 - Flags: review+
Depends on: 1122445
Depends on: 1147026
Sounds like a behavior change that should be covered in docs, even if only briefly.
Keywords: dev-doc-needed
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: