``` Filename: 319-wide-everything.md Title: RELAY_FRAGMENT cells Author: Nick Mathewson Created: 11 May 2020 Status: Open ```
(This proposal is part of the Walking Onions spec project.)
# Introduction
Proposal 249 described a system for `CREATE` cells to become wider, in order to accommodate hybrid crypto. And in order to send those cell bodies across circuits, it described a way to split `CREATE` cells into multiple `EXTEND` cells.
But there are other cell types that can need to be wider too. For example, `INTRODUCE` and `RENDEZVOUS` cells also contain key material used for a handshake: if handshakes need to grow larger, then so do these cells.
This proposal describes an encoding for arbitrary "wide" relay cells, that can be used to send a wide variant of anything.
To be clear, although this proposal describes a way that all relay cells can become "wide", I do not propose that wide cells should actually be _allowed_ for all relay cell types.
# Proposal
We add a new relay cell type: `RELAY_FRAGMENT`. This cell type contains part of another relay cell. A `RELAY_FRAGEMENT` cell can either introduce a new fragmented cell, or can continue one that is already in progress.
The format of a RELAY_FRAGMENT body is one of the following:
// First body in a series struct fragment_begin { // What relay_command is in use for the underlying cell? u8 relay_command; // What will the total length of the cell be once it is reassembled? u16 total_len; // Bytes for the cell body u8 body[]; }
// all other cells. struct fragment_continued { // More bytes for the cell body. u8 body[]; }
To send a fragmented cell, first a party sends a RELAY_FRAGMENT cell containing a "fragment_begin" payload. This payload describes the total length of the cell, the relay command
Fragmented cells other than the last one in sequence MUST be sent full of as much data as possible. Parties SHOULD close a circuit if they receive a non-full fragmented cell that is not the last fragment in a sequence.
Fragmented cells MUST NOT be interleaved with other relay cells on a circuit, other than cells used for flow control. (Currently, this is only SENDME cells.) If any party receives any cell on a circuit, other than a flow control cell or a RELAY_FRAGEMENT cell, before the fragmented cell is complete, than it SHOULD close the circuit.
Parties MUST NOT send extra data in fragmented cells beyond the amount given in the first 'total_len' field.
Not every relay command may be sent in a fragmented cell. In this proposal, we allow the following cell types to be fragmented: EXTEND2, EXTENDED2, INTRODUCE1, INTRODUCE2, RENDEZVOUS. Any party receiving a command that they believe should not be fragmented should close the circuit.
Not all lengths up to 65535 are valid lengths for a fragmented cell. Any length under 499 bytes SHOULD cause the circuit to close, since that could fit into a non-fragmented RELAY cell. Parties SHOULD enforce maximum lengths for cell types that they understand.
All `RELAY_FRAGMENT` cells for the fragmented cell must have the same Stream ID. (For those cells allowed above, the Stream ID is always zero.) Implementations SHOULD close a circuit if they receive fragments with mismatched Stream ID.
# Onion service concerns.
We allocate a new extension for use in the ESTABLISH_INTRO by onion services, to indicate that they can receive a wide INTRODUCE2 cell. This extension contains:
struct wide_intro2_ok { u16 max_len; }
We allocate a new extension for use in the `ESTABLISH_RENDEZVOUS` cell, to indicate acceptance of wide `RENDEZVOUS2` cells. This extension contains:
struct wide_rend2_ok { u16 max_len; }
(Note that `ESTABLISH_RENDEZVOUS` cells do not currently have a an extension mechanism. They should be extended to use the same extension format as `ESTABLISH_INTRO` cells, with extensions placed after the rendezvous cookie.)
# Handling RELAY_EARLY
The first fragment of each EXTEND cell should be tagged with `RELAY_EARLY`. The remaining fragments should not. Relays should accept `EXTEND` cells if and only if their _first_ fragment is tagged with `RELAY_EARLY`.
Rationale: We could allow any fragment to be tagged, but that would give hostile guards an opportunity to move RELAY_EARLY tags around and build a covert channel. But if we later move to a relay encryption method that lets us authenticate RELAY_EARLY, we could then require only that _any_ fragment has RELAY_EARLY set.
# Compatibility
This proposal will require the allocation of a new 'Relay' protocol version, to indicate understanding of the RELAY_FRAGMENTED command.
Hi Nick,
On 12 May 2020, at 06:47, Nick Mathewson nickm@torproject.org wrote:
In this proposal, we allow the following cell types to be fragmented: EXTEND2, EXTENDED2, INTRODUCE1, INTRODUCE2, RENDEZVOUS. Any party receiving a command that they believe should not be fragmented should close the circuit.
Do you mean RENDEZVOUS1 and RENDEZVOUS2 here?
T
On 11 May (16:47:24), Nick Mathewson wrote:
Filename: 319-wide-everything.md Title: RELAY_FRAGMENT cells Author: Nick Mathewson Created: 11 May 2020 Status: Open
(This proposal is part of the Walking Onions spec project.)
# Introduction
Proposal 249 described a system for `CREATE` cells to become wider, in order to accommodate hybrid crypto. And in order to send those cell bodies across circuits, it described a way to split `CREATE` cells into multiple `EXTEND` cells.
But there are other cell types that can need to be wider too. For example, `INTRODUCE` and `RENDEZVOUS` cells also contain key material used for a handshake: if handshakes need to grow larger, then so do these cells.
This proposal describes an encoding for arbitrary "wide" relay cells, that can be used to send a wide variant of anything.
To be clear, although this proposal describes a way that all relay cells can become "wide", I do not propose that wide cells should actually be _allowed_ for all relay cell types.
# Proposal
We add a new relay cell type: `RELAY_FRAGMENT`. This cell type contains part of another relay cell. A `RELAY_FRAGEMENT` cell can either introduce a new
Typo: RELAY_FRAGEMENT --> RELAY_FRAGMENT
fragmented cell, or can continue one that is already in progress.
The format of a RELAY_FRAGMENT body is one of the following:
// First body in a series struct fragment_begin { // What relay_command is in use for the underlying cell? u8 relay_command; // What will the total length of the cell be once it is reassembled? u16 total_len; // Bytes for the cell body u8 body[]; } // all other cells. struct fragment_continued { // More bytes for the cell body. u8 body[]; }
To send a fragmented cell, first a party sends a RELAY_FRAGMENT cell containing a "fragment_begin" payload. This payload describes the total length of the cell, the relay command
Fragmented cells other than the last one in sequence MUST be sent full of as much data as possible. Parties SHOULD close a circuit if they receive a non-full fragmented cell that is not the last fragment in a sequence.
Fragmented cells MUST NOT be interleaved with other relay cells on a circuit, other than cells used for flow control. (Currently, this is only SENDME cells.) If any party receives any cell on a circuit, other than a flow control cell or a RELAY_FRAGEMENT cell, before the fragmented cell is
Typo: RELAY_FRAGEMENT --> RELAY_FRAGMENT
complete, than it SHOULD close the circuit.
Parties MUST NOT send extra data in fragmented cells beyond the amount given in the first 'total_len' field.
Should the circuit be closed or the fragments dropped?
Not every relay command may be sent in a fragmented cell. In this proposal, we allow the following cell types to be fragmented: EXTEND2, EXTENDED2, INTRODUCE1, INTRODUCE2, RENDEZVOUS. Any party receiving a command that they believe should not be fragmented should close the circuit.
Probably we want RENDEZVOUS1 and RENDEZVOUS2.
Not all lengths up to 65535 are valid lengths for a fragmented cell. Any length under 499 bytes SHOULD cause the circuit to close, since that could fit into a non-fragmented RELAY cell. Parties SHOULD enforce maximum lengths for cell types that they understand.
All `RELAY_FRAGMENT` cells for the fragmented cell must have the same Stream ID. (For those cells allowed above, the Stream ID is always zero.) Implementations SHOULD close a circuit if they receive fragments with mismatched Stream ID.
# Onion service concerns.
We allocate a new extension for use in the ESTABLISH_INTRO by onion services, to indicate that they can receive a wide INTRODUCE2 cell. This extension contains:
struct wide_intro2_ok { u16 max_len; }
We allocate a new extension for use in the `ESTABLISH_RENDEZVOUS` cell, to indicate acceptance of wide `RENDEZVOUS2` cells. This extension contains:
struct wide_rend2_ok { u16 max_len; }
(Note that `ESTABLISH_RENDEZVOUS` cells do not currently have a an extension mechanism. They should be extended to use the same extension format as `ESTABLISH_INTRO` cells, with extensions placed after the rendezvous cookie.)
Why would a client need to announce wide cells in the ESTABLISH phase as opposed to using protover "Relay=N" ?
The maximum length of a fragmented cell is capped to 2^16 (u16) so we don't really need the establish process to inform us of the maximum expected length but rather use the max_len in the first fragment?
Furthermore, ESTABLISH_INTRO has extensions (only 1 as of today) so they could also be fragments themselves and thus I'm not sure I see the point of having two different ways of "expecting" fragments for the ESTABLISH_* cells and the INTRO/RENDEZVOUS cells?
# Handling RELAY_EARLY
The first fragment of each EXTEND cell should be tagged with `RELAY_EARLY`. The remaining fragments should not. Relays should accept `EXTEND` cells if and only if their _first_ fragment is tagged with `RELAY_EARLY`.
Rationale: We could allow any fragment to be tagged, but that would give hostile guards an opportunity to move RELAY_EARLY tags around and build a covert channel. But if we later move to a relay encryption method that lets us authenticate RELAY_EARLY, we could then require only that _any_ fragment has RELAY_EARLY set.
# Compatibility
This proposal will require the allocation of a new 'Relay' protocol version, to indicate understanding of the RELAY_FRAGMENTED command.
Here is a thought about a DoS vector. Here goes:
As an upper limit of 65KB total fragment size, it represents ~126 cells in total so I could basically send *125* cells and then stop which will put in memory a bit more than 64KB and it will stay there until the last fragment is received.
And then I do that on 1000 different circuits bringing the total count in memory to 64GB. All stuck there, all "waiting" for the last fragment.
Our OOM would kick in killing circuits but it just seems to me a very easy way to continously kick the OOM of a _service_ which is pretty bad side channel.
Thoughts?
Cheers! David
On Thu, May 14, 2020 at 3:15 PM David Goulet dgoulet@torproject.org wrote:
On 11 May (16:47:24), Nick Mathewson wrote:
[...]
# Onion service concerns.
We allocate a new extension for use in the ESTABLISH_INTRO by onion services, to indicate that they can receive a wide INTRODUCE2 cell. This extension contains:
struct wide_intro2_ok { u16 max_len; }
We allocate a new extension for use in the `ESTABLISH_RENDEZVOUS` cell, to indicate acceptance of wide `RENDEZVOUS2` cells. This extension contains:
struct wide_rend2_ok { u16 max_len; }
(Note that `ESTABLISH_RENDEZVOUS` cells do not currently have a an extension mechanism. They should be extended to use the same extension format as `ESTABLISH_INTRO` cells, with extensions placed after the rendezvous cookie.)
Why would a client need to announce wide cells in the ESTABLISH phase as opposed to using protover "Relay=N" ?
This is not for announcing support of wide cells -- this is for reporting a setting for how wide fragmented cells should be.
The maximum length of a fragmented cell is capped to 2^16 (u16) so we don't really need the establish process to inform us of the maximum expected length but rather use the max_len in the first fragment?
This all comes back to an earlier part of the proposal:
Not all lengths up to 65535 are valid lengths for a fragmented cell. Any length under 499 bytes SHOULD cause the circuit to close, since that could fit into a non-fragmented RELAY cell. Parties SHOULD enforce maximum lengths for cell types that they understand.
In other words, I'm imagining that there is a maximum length for each cell type that is much shorter than 65535, even though we're using two bytes for the length field.
The extension in the establish_intro cell is to tell the intro point the longest introduce1 cell that it should accept; this extension in the establish_rend cell is to tell the rendezvous point the longest rendezvous1 cell that it should accept.
Another way we could do this would be with a set of network parameters to describe the maximum length of each fragmented cell. Do you think that would be simpler?
(I can't quite remember why I specified it this way in the first place.)
Furthermore, ESTABLISH_INTRO has extensions (only 1 as of today) so they could also be fragments themselves and thus I'm not sure I see the point of having two different ways of "expecting" fragments for the ESTABLISH_* cells and the INTRO/RENDEZVOUS cells?
The difference thing here is that everybody can tell which protocols that a relay supports, but there is no automatic way to tell which protocols an onion service or client supports. Since INTRODUCE2/RENDEZVOUS2 cells are handled by these clients, they need to get opted into by the relays.
(I'm not sure I understood the question completely.)
# Compatibility
This proposal will require the allocation of a new 'Relay' protocol version, to indicate understanding of the RELAY_FRAGMENTED command.
Here is a thought about a DoS vector. Here goes:
As an upper limit of 65KB total fragment size, it represents ~126 cells in total so I could basically send *125* cells and then stop which will put in memory a bit more than 64KB and it will stay there until the last fragment is received.
And then I do that on 1000 different circuits bringing the total count in memory to 64GB. All stuck there, all "waiting" for the last fragment.
Our OOM would kick in killing circuits but it just seems to me a very easy way to continously kick the OOM of a _service_ which is pretty bad side channel.
A few responses here:
First, we shouldn't allow 65535-byte fragmented cells. The actual maximum length should be something more like 1024 or 4096 bytes.
Second, we should make sure that when we are reassembling cells, we use the same buf_t buffers that we use for other stuff. Our buffers are timestamped, so we can tell which buffer has had data stalling for the longest, and we should use that to make sure we're killing off the right circuits preferentially.
Third, fragments should only be allowed at an onion service for INTRODUCE2, and those should only come one at a time from each introduction point, so the number that it's reassembling at the time will be limited by the number of intro circuits it has open. It'll be the the intro points that have to be keeping a bunch of cells in assembly at once, and be ready to kill off circuits that dawdle too long.
Does this make more sense? If so I'll try to clarify it in the proposal.