Since no one is posting it here and talking about it, I will post it.
https://nvd.nist.gov/vuln/detail/CVE-2020-8516
The guy: http://www.hackerfactor.com/blog/index.php?/archives/868-Deanonymizing-Tor-C...
Is this real?
Are we actually not verifying if the IP of the Rend is a node in the Tor network?
On 04 Feb (19:03:38), juanjo wrote:
[snip]
We (the network team) don't think this is a bug; it is done on purpose for specific reasons. Please see asn's answer at https://bugs.torproject.org/33129, which explains why that is.
Onto the bigger issue at hand that the post explains. I'm going to extract the relevant quote that the post is all about:
Remember: the guard rarely changes but the other two hops change often. If he can repeatedly map out my circuit's last node, then he can build a large exclusion list. If he can exclude everything else, then he can find my guard node. And if he can't exclude everything, then he can probably whittle it down to a handful of possible guard nodes.
That is indeed a known attack. By observing the 3rd node (the last one before the rendezvous point) selected by the service, and making enough requests to the service, one can build a very large set of relays that can _not_ be the Guard, due to how path selection works, as explained in the blog post.
You probably won't end up with one single Guard but rather a small set of relays that could be it. For instance, if the service has set up ExcludeNodes, those relays will all be in your set.
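The exclusion-set logic above can be sketched in a few lines of Python. This is a hypothetical illustration; the relay names and the tiny stand-in "consensus" are made up:

```python
# An adversary running the rendezvous point records, for each rendezvous
# circuit the target onion service builds to it, the relay adjacent to the
# rendezvous point (the service's 3rd hop). A relay observed as a 3rd hop
# can be excluded as a guard candidate, so the candidate set shrinks.

all_relays = {"relayA", "relayB", "relayC", "relayD"}   # stand-in consensus
observed_third_hops = set()

def record_rendezvous(third_hop: str) -> None:
    """Called once per observed rendezvous circuit from the target service."""
    observed_third_hops.add(third_hop)

# After many coerced rendezvous circuits...
for hop in ["relayB", "relayC", "relayD"]:
    record_rendezvous(hop)

# The guard must be among the relays never seen as a 3rd hop
# (plus any ExcludeNodes relays, which also never appear).
guard_candidates = all_relays - observed_third_hops
```

In practice the attacker never quite reaches a single relay, which is why the text above speaks of a small set of candidates rather than one Guard.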
And the reason for using private nodes is probably that this way you eliminate noise from other Tor traffic, so _anything_ connecting back to your ORPort is related to the onion service connections you've made. You don't need to filter out the circuits with custom code (which is very easy to do anyway).
That is unfortunately a problem that onion services have. These types of guard discovery attacks exist, and they are the primary reason why we came up with Vanguards a couple of years ago:
https://blog.torproject.org/announcing-vanguards-add-onion-services
But one thing is for sure: simply forcing rendezvous points to be part of the consensus will _not_ fix this problem, as it is fairly easy to pull off this type of attack using a normal relay that is in the consensus.
Hope this helps! David
On Tue, Feb 04, 2020 at 04:15:23PM -0500, David Goulet wrote:
On 04 Feb (19:03:38), juanjo wrote:
[snip]
And the reason for using private nodes is probably that this way you eliminate noise from other Tor traffic, so _anything_ connecting back to your ORPort is related to the onion service connections you've made. You don't need to filter out the circuits with custom code (which is very easy to do anyway).
That is unfortunately a problem that onion services have. These types of guard discovery attacks exist, and they are the primary reason why we came up with Vanguards a couple of years ago:
https://blog.torproject.org/announcing-vanguards-add-onion-services
Indeed. Just to underscore the point: we demonstrated those attacks in the wild and proposed versions of vanguards in the same work where we introduced guards in the first place, published way back in 2006.
But one thing is for sure: simply forcing rendezvous points to be part of the consensus will _not_ fix this problem, as it is fairly easy to pull off this type of attack using a normal relay that is in the consensus.
+1
aloha, Paul
On 2/4/20 3:15 PM, David Goulet wrote:
[snip]
For completeness, there is an interesting wrinkle in http://www.hackerfactor.com/blog/index.php?/archives/868-Deanonymizing-Tor-C... that might deserve some additional investigation.
Specifically: the "Forward in Reverse" subsection, which is covered in more detail under "The Reverse Attack" here: https://www.hackerfactor.com/blog/index.php?/archives/779-Behind-the-Tor-Att...
The "Oddly, sometimes the connection would succeed" sentence is a red flag sentence. If you are inclined to be paranoid, there is indeed a way to hide a real attack in what looks like a simple ntohl() bug here.
This "sometimes" connection behavior is often seen in tagging attacks, where the adversary abuses Tor's AES-CTR mode stream-cipher-style properties to XOR a tag at one end of a circuit, and undo that tag only if the other endpoint is present. In this way, only the connections that actually succeed are those that the adversary is *certain* that they are in both positions in the circuit (to perform Guard discovery, or if they are the Guard relay, to confirm deanonymization).
If you want to hide your tagging attack as what looks like a simple ntohl() bug, you send your intro2 with the reversed IP address. Then, when your middle node suspects a candidate rend cell (via timing plus circuit setup fingerprinting, to form a guess), it can confirm that guess by undoing the tag: XORing the cipherstream with ntohl(ip) XOR ip.
Because of our stream-cipher-style use of AES-CTR, the busted rend cell contains AES-CTR-cipherstream XOR ip. This means that when the adversary XORs this position *again* with ip XOR ntohl(ip), they undo the tag: AES-CTR-cipherstream XOR ip XOR ip XOR ntohl(ip) = AES-CTR-cipherstream XOR ntohl(ip)
In other words: a correctly performing rend cell, with the tag hidden in what looks like a very common networking bug.
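The XOR algebra above can be checked mechanically. This is a minimal sketch, not Tor's actual relay crypto; the keystream here is just a random stand-in for the AES-CTR cipherstream:

```python
import os
import socket

# 10.0.0.1 as a 32-bit integer, and its byte-swapped form.
# (socket.ntohl() swaps bytes on little-endian hosts; on a big-endian
# host it is the identity and the "bug" would be invisible.)
ip = int.from_bytes(bytes([10, 0, 0, 1]), "big")
swapped = socket.ntohl(ip)

keystream = int.from_bytes(os.urandom(4), "big")  # stand-in for AES-CTR cipherstream

# The busted rend cell position on the wire holds: cipherstream XOR ip.
cell = keystream ^ ip

# The colluding middle node XORs that same position with ip XOR ntohl(ip):
corrected = cell ^ ip ^ swapped

# The tag is undone; what remains is cipherstream XOR ntohl(ip),
# exactly as in the derivation above.
assert corrected == keystream ^ swapped
```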
A few proposals have been made to fix this cipherstream tagging weakness, most recently: https://gitweb.torproject.org/torspec.git/tree/proposals/295-relay-crypto-wi...
BUT DON'T PANIC: There is also an alternate explanation for the "sometimes succeed" red flag in this particular case, other than a tagging attack.
Because Tor will actually use the rend relay fingerprint to try to find an already-open connection before opening a new one, it is possible for rends with a correct fingerprint to connect successfully, even if the IP address is wrong, so long as a previous TLS connection to the correct IP exists.
So most likely this is just a poorly written Tor client, *but* there is still the possibility that it is an attack cleverly disguised as one. :/
For this reason, it may be a good idea for Neal's/our monitoring infrastructure to keep an eye on this behavior too: to distinguish side channel usage with the rend XOR "correction" from a dumb bug that sometimes connects by getting lucky (and thus never properly reverses the rend IP address). If this is indeed just a bug, then when these rends do succeed, the IP address should never be correct.
The way to do that would be to build rend circuits using 3rd hops that you (the service operator) control, so that the 3rd hop can check whether the rend succeeds because the TLS connection happened to be open (benign behavior) or because the reversed ntohl() got corrected somehow (attack).
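The decision a controlled 3rd hop would make can be sketched as follows (the function and parameter names are hypothetical, not part of any real monitoring tool):

```python
def classify_rend(extend_ip: str, relay_real_ip: str, tls_conn_was_open: bool) -> str:
    """Classify a successful rendezvous observed at a 3rd hop we control.

    extend_ip:         the rend point IP carried in the circuit extension
    relay_real_ip:     the rend relay's actual IP, known from the consensus
    tls_conn_was_open: whether a TLS connection to the real IP already existed
    """
    if extend_ip == relay_real_ip:
        # Correctly formed rend request: nothing odd.
        return "normal"
    if tls_conn_was_open:
        # Reversed IP, but the fingerprint matched an already-open
        # connection: consistent with a dumb client bug getting lucky.
        return "benign-bug"
    # Reversed IP somehow "corrected" in transit: consistent with tagging.
    return "suspicious"
```

If the bug hypothesis is right, every success with a wrong extend_ip should land in the "benign-bug" bucket; anything in "suspicious" would warrant a much closer look.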
Thank you for your vigilance, Neal!
David Goulet dgoulet@torproject.org writes:
[snip]
For what it's worth, I'm glad this discussion has been restarted, because we did lots of research work in 2018 on these sorts of attacks, but we kind of drowned in the various tradeoffs and ended up not doing much after releasing the vanguards tool.
For people who are following from home and would like to help out, here is some reading material:
https://lists.torproject.org/pipermail/tor-dev/2018-April/013070.html
https://lists.torproject.org/pipermail/tor-dev/2018-May/013162.html
https://trac.torproject.org/projects/tor/ticket/25754
Basically, from what I remember, to defend against such attacks we either need to change our path selection logic (#24487), abandon the path restrictions that cause infoleaks (the big thread above), or use two guards (prop#291, plus the big thread above). Each of these options has its own tradeoffs and we need to analyze them again. If someone could write a summary, that would be a great way to get this started again...
For now, if you are afraid of such attacks, you should use and love vanguards!
Thanks a lot! :-)
On 2/6/20 5:54 AM, George Kadianakis wrote:
[snip]
Basically, from what I remember, to defend against such attacks we either need to change our path selection logic (#24487), or abandon the path restrictions that cause infoleaks (big thread above), or use two guards (prop#291 plus big thread above). Each of these options has its own tradeoffs and we need to analyze them again. If someone could do a summary that would be great to get this started again...
For now, if you are afraid of such attacks, you should use and love vanguards!
Yes, specifically vanguards always uses two guards and disables all path restrictions to mitigate info-leak route disclosure attacks like the above.
Vanilla Tor uses two guards only sometimes, and that is part of the info leak. Using two guards is not enough by itself, though, for cases where you get unlucky and choose both guards from the same Family, or some other restriction-violating case that leaks info by making it clear which relays you *don't* use for your 3rd hop.
Since MyFamily and the other path restrictions are of questionable value, but the info leak from using path restrictions is clear and measurable, we opted to make the tradeoff of disabling path restrictions entirely in the vanguards addon. This is done via the HSLayer*Nodes torrc options that we use. It was significantly less disruptive to disable path restrictions only when those options are set than to redo all path selection in Tor itself. See the parent ticket and its children for details: https://trac.torproject.org/projects/tor/ticket/25546
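For illustration, the torrc options in question look like this (the fingerprints are placeholders, not real relays; the addon picks, sets, and rotates these for you via the control port):

```
# Maintained automatically by the vanguards addon:
HSLayer2Nodes $FP_OF_LAYER2_GUARD_1,$FP_OF_LAYER2_GUARD_2,...
HSLayer3Nodes $FP_OF_LAYER3_GUARD_1,$FP_OF_LAYER3_GUARD_2,...
```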
I filed https://github.com/mikeperry-tor/vanguards/issues/53 to make these choices clear in the vanguards docs. Right now you have to read the tor manpage for the torrc options we used in order to even know about these choices, which is not great.
The creation of the vanguards addon itself was also somewhat controversial: asn and I basically had to go rogue to get all of these defenses prototyped in a reasonable time frame. Merging these defenses into Tor is a significantly larger engineering task than making a python control port addon.
It is rather sad that we have to make such choices for security features/improvements like this, and for the tagging problem. But security features are surprisingly difficult to obtain funding for (!), and so we often have to find ways to do whatever we can in these areas.
juanjo wrote:
[snip]
When I saw `CVE-2020*` combined with Hidden Service deanonymization in the title, I thought I was going to have an interesting evening.
Then I saw: "This vulnerability is currently undergoing analysis and not all information is available. Please check back soon to view the completed vulnerability summary."
I don't think this should be a CVE. First of all, it's not really deanonymization, technically speaking. It's a 'Guard discovery attack'. Of course, it can potentially lead to onion service deanonymization if combined with another attack, so it's no secret that this is quite possible.
However, this is a well known problem. The onion service client (think of it as the visitor of a .onion website) is the one who chooses the rendezvous point. This means it can choose a hostile one under its control and, at the same time, run more middle relays and establish rendezvous circuits with a particular onion service continuously, until a path that is workable from the attacker's perspective is chosen.
Whether the rendezvous point is part of the consensus or not does not actually make the slightest difference. In fact, if we make it a requirement for the rendezvous point to be in the consensus (from the onion service server's view of the network), we only end up with performance limitations, because the onion service client and the onion service server can have (and often have) different views of the network. This is expected to get worse if we want to really scale Tor (check the walking onions proposal), so making this requirement would bottleneck Tor scaling without actually fixing the slightest thing.
The only fix here is for Tor, when running in onion service server mode, to keep track of its historic established rendezvous circuits and detect such attacks, because they are very trivial to detect:
A Tor relay that is not in the consensus is unmeasured, so it has some weight n, and thus the probability of it being genuinely selected by an honest onion service client is n%.
Now, you only look for insanely high n values. If your onion service server has established 90% of its last "m" rendezvous circuits to the same rendezvous point, you can't possibly think it's a coincidence, right?
The same logic applies to rendezvous relays that are in the consensus as well, only you might allow a higher "n" value there, because you know their weights and they might be fast relays.
Thus, either way, your "n" cannot possibly be anywhere near the rate an attacker needs to perform a guard discovery attack.
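A toy version of that bookkeeping could look like this (the class, thresholds, and window size are illustrative only; a real defense, like the RendGuard logic in the vanguards addon, weighs observed use against consensus bandwidth weights):

```python
from collections import Counter, deque

class RendUseTracker:
    """Track the last `window` rendezvous points and flag overuse."""

    def __init__(self, window: int = 100, max_fraction: float = 0.5,
                 min_samples: int = 20):
        self.recent = deque(maxlen=window)   # the last "m" rend points
        self.max_fraction = max_fraction     # crude stand-in for "insanely high n"
        self.min_samples = min_samples       # don't judge on too little history

    def observe(self, rend_point: str) -> bool:
        """Record one established rendezvous circuit; True means anomalous."""
        self.recent.append(rend_point)
        if len(self.recent) < self.min_samples:
            return False
        fraction = Counter(self.recent)[rend_point] / len(self.recent)
        return fraction > self.max_fraction

# One rend point dominating the window is flagged; diverse use is not.
t = RendUseTracker()
for _ in range(30):
    flagged = t.observe("suspect-rend")      # 100% of the window

t2 = RendUseTracker()
for i in range(30):
    ok = t2.observe(f"relay{i}")             # each relay used once
```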
I wrote a proposal sketch back in 2016 about this, it mitigates exactly this attack: https://lists.torproject.org/pipermail/tor-dev/2016-January/010291.html
The protection is called "RendGuard" and is part of the Vanguards defense written by Mike Perry in 2018.
The RendGuard part of that could be in Tor by default, because it doesn't face as many load-balancing and anti-fingerprinting issues as the layer 2 and layer 3 guards do.