Closed Bug 1050063 Opened 10 years ago Closed 10 years ago

http2 npn not aware of tls version - twitter.com occasional failure

Categories

(Core :: Networking: HTTP, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla34

People

(Reporter: masayuki, Assigned: mcmanus)

References

()

Details

(Whiteboard: [spdy])

Attachments

(1 file)

When I access twitter.com with Nightly, the tab becomes just about:blank. If I turn off network.http.spdy.enabled.http2draft, I can access it.

I'm not sure if this is an evangelism bug.
Reproduced on Mac.
OS: Windows 8.1 → All
Hardware: x86_64 → All
Blocks: 1047594
Reproduced on Windows 7 64-bit.
(In reply to kellybmcc from comment #2)
> Reproduced on Windows 7 64-bit.

Reproduced again on Windows 7 64-bit
Reproduced on Windows 8.1 X64.
Also Linux
Nick,

Can you have a look at this?
Flags: needinfo?(hurley)
Another data-point: curl also doesn't get any content from twitter.com with h2-14.
twitter does negotiate h2-14 this morning.

its surprising that twitter negotiates h2-14 - I reached out to them before turning this on and they said they weren't quite ready.

the first byte sequence from the server is 

2014-08-07 10:57:15.202000 UTC - 3348[c0f530]: 00000000: 00 00 08 07 00 00 00 00 00

That's a GOAWAY (and its also not a settings, which is the required first frame)

I'll reach out to our twitter h2 contact
And another point of this bug is, no error shows to user in this case...
It's working now.
Still Not working :(
(In reply to Rufus C Sharma from comment #12)
> Still Not working :(

Yeah, sorry for a whole minute it works for me but it's broken again.
For clarification: Sometimes you encounter non-HTTP2 enabled twitter server (or somehow a fallback to HTTP1.1 occurs). That's when it's working for a short time.
curl fails because twitter sends back cookie which includes NUL (0x00) character, which is illegal according to RFC 7230.
I think this is draft-13 legacy, where concatenation of header field values with NUL is allowed (or sometimes required).
Bug 1047594 has now been backed out; nightlies respinning as we speak.
As soon as I either (1) get confirmation that this is (as we suspect) a twitter bug and they've pulled back h2-14, or (2) get information that allows me to fix a bug on our end (and I fix it), I'll be re-landing bug 1047594
Flags: needinfo?(hurley)
(In reply to Nicholas Hurley [:hurley] from comment #17)
> As soon as I either (1) get confirmation that this is (as we suspect) a
> twitter bug

Surely there's something better we could be doing on our end, to make it clearer to the user what's going on? (showing a some flavor of about:neterror perhaps?)

From the user's perspective, it looks to the user very much like Firefox is just rejecting the user-input.  I type in a URL, hit enter, and nothing happens. (I'm still at the page I was at before.)  Same if I click a link.

Can we make this sort of server-bug trigger some flavor of about:neterror?
(ah, I guess the lack-of-an-error-page was brought up in comment 10, as well.)
(In reply to Daniel Holbert [:dholbert] from comment #18)
> Can we make this sort of server-bug trigger some flavor of about:neterror?

That's bug 1050329 :)
See Also: → 1050329
Reproduced on Fedora 20 64-bit running from the Nightly tarballs
(In reply to Daniel Holbert [:dholbert] from comment #19)
> (ah, I guess the lack-of-an-error-page was brought up in comment 10, as
> well.)

this is the equivalent of h1 doing a connect and not returning any bytes.. same behavior over there fwiw and not traditionally an issue. So a worthy idea, but not a blocker.
This should be fixed. Let me know if you run into any more issues :)
Thanks, Jeff! I'll be re-enabling h2-14 in Firefox as soon as inbound re-opens. Sadly, it probably won't make it into tonight's nightly :(
(In reply to Jeff Pinner from comment #24)
> This should be fixed. Let me know if you run into any more issues :)

Hmm, not currently working for me.
Working for me. (latest nightly, fresh profile with network.http.spdy.enabled.http2draft manually turned on, with a browser-restart after the pref-flip for good measure.)
(In reply to Blair McBride [:Unfocused] from comment #26)
> (In reply to Jeff Pinner from comment #24)
> > This should be fixed. Let me know if you run into any more issues :)
> 
> Hmm, not currently working for me.

wfm

Blair - this is usually the easiest route to diagnosis

https://developer.mozilla.org/en-US/docs/Mozilla/Debugging/HTTP_logging

(either start a new profile to ensure you don't get private data in there, or mark the attachment private)
Blair, at some point in here, we had server-dependent behavior. Given you're at the other end of the world, you might. DNS offers me (in Berlin):

twitter.com has address 199.16.156.230
twitter.com has address 199.16.156.102
twitter.com has address 199.16.156.70
twitter.com has address 199.16.156.6
This was working again for me yesterday but now broken again.

twitter.com has address 199.16.156.102
twitter.com has address 199.16.156.6
twitter.com has address 199.16.156.70
twitter.com has address 199.16.156.38
(In reply to Dave Townsend [:mossop] from comment #30)
> This was working again for me yesterday but now broken again.
> 
> twitter.com has address 199.16.156.102
> twitter.com has address 199.16.156.6
> twitter.com has address 199.16.156.70
> twitter.com has address 199.16.156.38

wfm today at all 4 of those addresses. They all do h2-14 for me most of the time, but the first gave me spdy/3.1 on a couple reloads. (welcome to load balancers, I assume).

if it persists, the log from comment 28 is what we would need.
(In reply to Patrick McManus [:mcmanus] from comment #31)
> if it persists, the log from comment 28 is what we would need.

Unfortunately, I can only reproduce the issue on my main profile. Oddly, a clean profile works fine. So the log I get ends up being a couple of hundred Mb, and includes all sorts of stuff :\

Looking through it, I believe this may be the issue:

2014-08-12 06:52:54.748000 UTC - 11632[c0f530]: Http2Session::ConfirmTLSProfile 2a504400 sslsocketcontrol=ae1a9e8
2014-08-12 06:52:54.748000 UTC - 11632[c0f530]: Http2Session::ConfirmTLSProfile 2a504400 version=301
2014-08-12 06:52:54.748000 UTC - 11632[c0f530]: Http2Session::ConfirmTLSProfile 2a504400 FAILED due to lack of TLS1.2


This is with twitter.com resolving to 199.59.149.198.
Same problem here:

2014-08-12 11:12:40.566000 UTC - 4188[250f530]: nsHttpConnection::StartSpdy [this=ad17600]
2014-08-12 11:12:40.566000 UTC - 4188[250f530]: ReportSpdyConnection twitter.com 199.16.156.198~.:443 ent=1d7b4c80 preferred=1d7b4c80
2014-08-12 11:12:40.566000 UTC - 4188[250f530]: nsHttpConnectionMgr::ProcessPendingQ [ci=.S..twitter.com:443]
2014-08-12 11:12:40.566000 UTC - 4188[250f530]: STS dispatch [10e06aa0]
2014-08-12 11:12:40.566000 UTC - 4188[250f530]: STS dispatch [11a4c0a0]
2014-08-12 11:12:40.566000 UTC - 4188[250f530]: Http2Session::Http2Session a97a400 serial=0x5
2014-08-12 11:12:40.566000 UTC - 4188[250f530]: Http2Session::SendHello a97a400
2014-08-12 11:12:40.566000 UTC - 4188[250f530]: Http2Session::LogIO a97a400 stream=0 id=0x0 [Magic Connection Header]
2014-08-12 11:12:40.566000 UTC - 4188[250f530]: 00000000: 50 52 49 20 2A 20 48 54 54 50 2F 32 2E 30 0D 0A 
2014-08-12 11:12:40.566000 UTC - 4188[250f530]: 00000010: 0D 0A 53 4D 0D 0A 0D 0A 
2014-08-12 11:12:40.566000 UTC - 4188[250f530]: Http2Session::LogIO a97a400 stream=0 id=0x0 [Generate Settings]
2014-08-12 11:12:40.566000 UTC - 4188[250f530]: 00000000: 00 00 0C 04 00 00 00 00 00 00 04 00 02 00 00 00 
2014-08-12 11:12:40.566000 UTC - 4188[250f530]: 00000010: 05 00 00 40 00 
2014-08-12 11:12:40.566000 UTC - 4188[250f530]: Session Window increase at start of session a97a400 268369921
2014-08-12 11:12:40.566000 UTC - 4188[250f530]: Http2Session::LogIO a97a400 stream=0 id=0x0 [Session Window Bump ]
2014-08-12 11:12:40.566000 UTC - 4188[250f530]: 00000000: 00 00 04 08 00 00 00 00 00 0F FF 00 01 
2014-08-12 11:12:40.566000 UTC - 4188[250f530]: nsHttpConnection::StartSpdy moves single transaction ad4fc00 into SpdySession a97a400
2014-08-12 11:12:40.566000 UTC - 4188[250f530]: nsHttpConnection::AddTransaction for SPDY
2014-08-12 11:12:40.566000 UTC - 4188[250f530]: Http2Stream::Http2Stream 1492c760
2014-08-12 11:12:40.566000 UTC - 4188[250f530]: Http2Session::AddStream session=a97a400 stream=1492c760 serial=5 NextID=0x0 (tentative)
2014-08-12 11:12:40.566000 UTC - 4188[250f530]: Http2Session::AddStream a97a400 stream 1492c760 activated immediately.
2014-08-12 11:12:40.566000 UTC - 4188[250f530]: Http2Session::AddStream a97a400 activating stream 1492c760 Currently 1 streams in session, high water mark is 1
2014-08-12 11:12:40.566000 UTC - 4188[250f530]: nsHttpConnection::ResumeSend [this=ad17600]
2014-08-12 11:12:40.566000 UTC - 4188[250f530]: nsSocketOutputStream::AsyncWait [this=c5eedc8]
2014-08-12 11:12:40.566000 UTC - 4188[250f530]: nsHttpConnection::ResumeSend [this=ad17600]
2014-08-12 11:12:40.566000 UTC - 4188[250f530]: nsSocketOutputStream::AsyncWait [this=c5eedc8]
2014-08-12 11:12:40.566000 UTC - 4188[250f530]: nsHttpConnection::DisableTCPKeepalives [ad17600]
2014-08-12 11:12:40.566000 UTC - 4188[250f530]: nsSocketTransport::SetKeepaliveEnabled [c5eed00] disabled, idle time[10s] retry interval[1s] packet count[10]: globally enabled.
2014-08-12 11:12:40.566000 UTC - 4188[250f530]: JIMB: ReleaseFD_Locked: mFDref = 2
2014-08-12 11:12:40.566000 UTC - 4188[250f530]: nsHttpConnection::EnsureNPNComplete setting complete to true
2014-08-12 11:12:40.566000 UTC - 4188[250f530]:   writing transaction request stream
2014-08-12 11:12:40.566000 UTC - 4188[250f530]: Http2Session::ConfirmTLSProfile a97a400 mConnection=c581710
2014-08-12 11:12:40.566000 UTC - 4188[250f530]: nsHttpConnection::GetSecurityInfo trans=a97a400 tlsfilter=0 socket=c5eed10
2014-08-12 11:12:40.566000 UTC - 4188[250f530]: Http2Session::ConfirmTLSProfile a97a400 sslsocketcontrol=abb7348
2014-08-12 11:12:40.566000 UTC - 4188[250f530]: Http2Session::ConfirmTLSProfile a97a400 version=301
2014-08-12 11:12:40.566000 UTC - 4188[250f530]: Http2Session::ConfirmTLSProfile a97a400 FAILED due to lack of TLS1.2
2014-08-12 11:12:40.566000 UTC - 4188[250f530]: nsHttpConnection::OnSocketWritable ad17600 ReadSegments returned [rv=80070057 read=0 sock-cond=0]
2014-08-12 11:12:40.566000 UTC - 4188[250f530]: nsHttpConnection::CloseTransaction[this=ad17600 trans=a97a400 reason=80070057]
2014-08-12 11:12:40.566000 UTC - 4188[250f530]: Http2Session::Close a97a400 80070057
2014-08-12 11:12:40.566000 UTC - 4188[250f530]: Http2Session::CloseStream a97a400 1492c760 0x0 80004004

Setting network.http.spdy.enabled.http2draft to false fixes it for me.
blair, thomas - thanks for the logs.

They show a twitter.com problem that I will get in touch with twitter about.

Specifically they show twitter negotiating h2-14 (http/2) over TLS 1.0. Using >= TLS 1.2 is a MUST level requirement for HTTP/2. If for some reason we are rolled back to 1.0, then the server must choose some other protocol (presumably http/1).
Actually, I know what's going on and its our bug.

twitter uses NPN not ALPN for negotiation. That's not the standard way (but we allow it) so this pattern didn't come up in the design. The basic difference between the two is that the server normally selects between h1 and h2 with alpn, but the client makes the choice with NPN. So its our code that isn't enforcing the appropriate requirement.
Assignee: nobody → mcmanus
Keywords: qawanted
Summary: Cannot access twitter.com with Nightly if network.http.spdy.enabled.http2draft is true → http2 npn not aware of tls version - twitter.com occasional failure
Whiteboard: [spdy]
(In reply to Patrick McManus [:mcmanus] from comment #35)
> twitter uses NPN not ALPN for negotiation. That's not the standard way (but
> we allow it) so this pattern didn't come up in the design.

We really should just disable NPN now and only use ALPN. Allowing this is a violation of spec and will probably encourage more interop problems like this the longer it's enabled.
(In reply to Dave Garrett from comment #36)
> (In reply to Patrick McManus [:mcmanus] from comment #35)
> > twitter uses NPN not ALPN for negotiation. That's not the standard way (but
> > we allow it) so this pattern didn't come up in the design.
> 
> We really should just disable NPN now and only use ALPN. Allowing this is a
> violation of spec and will probably encourage more interop problems like
> this the longer it's enabled.

spdy is defined to use NPN, so until we transition off spdy we will use them both.

I don't actually believe h2 over NPN is a violation of anything - though ALPN is defined as the base case (and its the one firefox tries to get the server to use). the h2 spec explicitly leaves room for other undefined ways for discovering h2 support ('prior knowledge') - npn is a reasonable one.

We could disable h2 for NPN but not ALPN, but I don't really see a point in doing so. It would effectively require a similar PSM patch
(In reply to Patrick McManus [:mcmanus] from comment #37)
> spdy is defined to use NPN, so until we transition off spdy we will use them
> both.

SPDY can be negotiated over ALPN and really should be at this point.

> I don't actually believe h2 over NPN is a violation of anything - though
> ALPN is defined as the base case (and its the one firefox tries to get the
> server to use). the h2 spec explicitly leaves room for other undefined ways
> for discovering h2 support ('prior knowledge') - npn is a reasonable one.

I don't agree that this counts as 'prior knowledge'. ALPN is the explicit replacement to NPN. I think the expectation is that both shouldn't be used for the same protocol.

> We could disable h2 for NPN but not ALPN, but I don't really see a point in
> doing so. It would effectively require a similar PSM patch

Well, it'd help avoid issues like this one. In lieu of disabling NPN altogether, requiring ALPN for HTTP/2 really seems like a good idea.
(In reply to Dave Garrett from comment #38)
> (In reply to Patrick McManus [:mcmanus] from comment #37)
> > spdy is defined to use NPN, so until we transition off spdy we will use them
> > both.
> 
> SPDY can be negotiated over ALPN and really should be at this point.

there is no basis for the "really should be" part of that statement. NPN is fine - in most ways it is better than ALPN other than the fact that it isn't standards track. As we retire the non-standards protocols that use it (and we will), we will retire NPN too.

> I think the expectation is that both shouldn't be used
> for the same protocol.

I think you'll find its actually the common case at this stage:
https://github.com/http2/http2-spec/wiki/Implementations
(In reply to Patrick McManus [:mcmanus] from comment #39)
> there is no basis for the "really should be" part of that statement. NPN is
> fine - in most ways it is better than ALPN other than the fact that it isn't
> standards track. As we retire the non-standards protocols that use it (and
> we will), we will retire NPN too.

What I'm saying is that phasing it out should start with not using it for new protocols or in cases where ALPN can be used instead. ALPN should be heavily preferred.

> > I think the expectation is that both shouldn't be used
> > for the same protocol.
> 
> I think you'll find its actually the common case at this stage

Unfortunately true, however I'm saying it would be better for interop to move away from doing this. There's only one implementation that only supports NPN with the current draft and they really need to fix that. If ALPN isn't required everywhere there's the real risk of multiple implementations lingering on a non-standardized negotiation method when ALPN was pretty much created for HTTP/2 to use. The transition has to start at some point, and I think an interop failure like this bug is encouragement enough to do so now.

Granted, Twitter also needs to fix its server to not use an invalid TLS version for HTTP/2 negotiation, regardless of how it's done.
Twitter will negotiate TLSv1.2 -- so something is interfering with your negotiation.

As for downgrading to HTTP/1.1 for TLS < 1.2, you're correct that the server is not currently enforcing the TLS version requirement for negotiation.
As for ALPN, we will transition from NPN to ALPN once it is available in OpenSSL. Last I checked it was included in the 1.0.2 build that is currently in beta.
There are a few related problems here that can be thought about distinctly.

1] we send a 1.0 HELLO with h2 in the ALPN extension list - h2 requires 1.2. That's our bug - I'll attach a patch

2] twitter, in response to our 1.0 hello, responds with a 1.0 HELLO of its own with an NPN offer list that includes h2. That's twitter's bug, by the same logic #1 is our bug.

3a] we accept h2 from the NPN offer list even though we sent a 1.0 Hello. That's our bug.
3b] we accept h2 from the NPN offer list even though we received a 1.0 Hello in response to a 1.2 client hello. That's our bug - but it isn't actually in evidence here.

I'm going to attach a patch that fixes #1 and #3a. That will make #2 moot for us and everything should run fine.

3b is going to take a different patch and is dependent on a tiny nss tweak. There is no reason to think it needs to be rushed as it isn't happening in practice. I have written that patch and will put it in a separate bug.
Blocks: 1054347
Comment on attachment 8474162 [details] [diff] [review]
consider tls client hello version in alpn/npn offer list

Review of attachment 8474162 [details] [diff] [review]:
-----------------------------------------------------------------

r?nick for the necko parts and r?keleer for the little psm bit
Attachment #8474162 - Flags: review?(hurley)
Comment on attachment 8474162 [details] [diff] [review]
consider tls client hello version in alpn/npn offer list

Review of attachment 8474162 [details] [diff] [review]:
-----------------------------------------------------------------

::: netwerk/protocol/http/nsHttpConnection.cpp
@@ +498,3 @@
>          for (uint32_t index = SpdyInformation::kCount; index > 0; --index) {
> +            if (info->ProtocolEnabled(index - 1) &&
> +                info->ALPNCallbacks[index -1](ssl)) {

nit: space after the -
Attachment #8474162 - Flags: review?(hurley) → review+
Comment on attachment 8474162 [details] [diff] [review]
consider tls client hello version in alpn/npn offer list

Review of attachment 8474162 [details] [diff] [review]:
-----------------------------------------------------------------

LGTM.
Attachment #8474162 - Flags: review?(dkeeler) → review+
https://hg.mozilla.org/mozilla-central/rev/bd50eac73b51
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla34
See Also: → 1077806
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: