Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Editorial: make WebSocket use obtain a connection #1241

Closed
wants to merge 1 commit into from

Conversation

annevk
Copy link
Member

@annevk annevk commented May 25, 2021

On closer inspection "obtain a WebSocket connection" does not appear to do anything that "obtain a connection" does not do, especially now the latter has a dedicated argument.

This brings "obtain a connection" closer to the goal of covering all web platform connections and a being a policy-enforcement point for them.

cc @MattMenke2


Preview | Diff

On closer inspection "obtain a WebSocket connection" does not appear to do anything that "obtain a connection" does not do, especially now the latter has a dedicated argument.

This brings "obtain a connection" closer to the goal of covering all web platform connections and a being a policy-enforcement point for them.
@annevk annevk requested a review from ricea May 25, 2021 13:11
@annevk annevk added the do not merge yet Pull request must not be merged per rationale in comment label May 25, 2021
@annevk
Copy link
Member Author

annevk commented May 25, 2021

As per #1122 forcing a new connection should only happen for H1.

"<code>http</code>", and true otherwise.

<li><p>Follow the requirements stated in step 2 to 5, inclusive, of the first set of steps in
<a href=http://tools.ietf.org/html/rfc6455#section-4.1>section 4.1</a> of The WebSocket Protocol
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It appears that this loses the WebSocket-specific connection throttling in section 4.1.2?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess we could try to reinstate that somehow although it seems that would require all connections to be in some kind of collection, right?

Also, how does that work with partitioning?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the only choice is to make it work per-partition. That's unfortunate if an attacker can manage to use many partitions at the same time, but I think we generally assume that is unlikely.

Moving the logic here rather than delegating to the RFC would let us make that unambiguous.

How it essentially works is you have a per-ip:port queue, where only the WebSocket at the front of the queue is permitted to be in a connecting/handshake state. WebSockets are removed from the queue when the handshake succeeds, fails, or is cancelled.

What's troublesome is that you need to do a DNS lookup to work out which queue a WebSocket belongs in. And if there are multiple IP addresses you might need to try multiple queues before you find one that connects.

Dealing with proxies is awful and Chromium does it completely wrong.

Now that I say all this I think there's definite value is specifying this in the Fetch standard (or in the WebSocket standard, whichever comes first).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ddragana convinced me that the win from partitioning might be small, as it might be indistinguishable from network jitter (as it's only blocking during connection creation). So maybe that's not needed.

But I also reached the conclusion that this brings up the define "DNS lookup" question again. @sleevi would like to keep everything in "obtain a connection", but this seems to make that rather hard. I suppose we could wave our hands a bunch and describe it implicitly as the current RFC is doing, but ugh.

(I'll try to clean this up in due course by moving this discussion into its own issue, but for now this suffices I think.)

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@annevk Yeah, entirely fair, as you also mentioned in WICG/private-network-access#51

No need to rehash the discussion here, but mostly just wanted to acknowledge that I’m not in principle opposed to trying to tackle it, just that it’s going to be complex either way to do so.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ricea Could you expand on the following comment?

Dealing with proxies is awful and Chromium does it completely wrong.

I’m having trouble making sense of it in the context of this PR, or in general for Chrome, and just trying to sort through if this was a remark specific to WebSockets and the throttling not being able to adhere fully to the spec (because the spec is wrong), or if this was something else.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ricea Could you expand on the following comment?

Dealing with proxies is awful and Chromium does it completely wrong.

The problem at hand is the following text from RFC6455:

   If the client cannot determine the IP address of the remote host
   (for example, because all communication is being done through a
   proxy server that performs DNS queries itself), then the client
   MUST assume for the purposes of this step that each host name
   refers to a distinct remote host, and instead the client SHOULD
   limit the total number of simultaneous pending connections to a
   reasonably low number (e.g., the client might allow simultaneous
   pending connections to a.example.com and b.example.com, but if
   thirty simultaneous connections to a single host are requested,
   that may not be allowed).  

Problems:

  1. On most networks we can determine the IP address, even if we're behind a proxy, However, for privacy reasons we shouldn't look up the IP address unless we already looked it up in the process of determining the proxy to use (ie. for proxy.pac). But AFAIK in Chromium we don't have a concept of "I already looked up this IP address", nor do we have a concept of "check the cache for this IP address, but don't hit the network".
  2. Does MUST assume ... that each host name refers a distinct remote host imply that we throttle by hostname:port the way we usually throttle by ip:port? That's what I've been assuming up until now, but the rest of the sentence maybe implies a completely different throttling strategy?
  3. In practice what we do in Chromium is throttle by the IP address of the proxy server. This is completely wrong, but causes surprisingly few problems in practice.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the only choice is to make it work per-partition. That's unfortunate if an attacker can manage to use many partitions at the same time, but I think we generally assume that is unlikely.

Note that chrome uses top-level-frame schemeful site and innermost iframe schemeful site, so an attacker could, within the context of a frame, open as many iframes as it wants (with different sites it controls) to get as many NIKs as it needs for any putative attack.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that chrome uses top-level-frame schemeful site and innermost iframe schemeful site, so an attacker could, within the context of a frame, open as many iframes as it wants (with different sites it controls) to get as many NIKs as it needs for any putative attack.

I see. So it's a potential DDoS vector. However, because the attacker needs to load HTML from a whole bunch of different sites to make it work, the attack magnification looks small.

<dd><p>Let <var>connection</var> be the result of
<a lt="obtain a WebSocket connection">obtaining a WebSocket connection</a>, given
<var>request</var>'s <a for=request>current URL</a>.
<li><p>Let <var>dedicatedConnection</var> be true if <var>request</var>'s <a for=request>mode</a>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might change with WebSocket over HTTP/2?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, so what should we do. Add a dedicatedIfHTTP1 parameter?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not pretty, but it would work.

@annevk
Copy link
Member Author

annevk commented May 26, 2021

Let's continue the discussion in #1243. Really appreciate your feedback @ricea (and @ddragana via Matrix)!

@annevk annevk closed this May 26, 2021
@annevk annevk deleted the annevk/connections-websocket branch May 26, 2021 16:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
do not merge yet Pull request must not be merged per rationale in comment
Development

Successfully merging this pull request may close these issues.

4 participants