Hello relay and/or bridge operators,
you might already know this: we're publishing sanitized versions of bridge descriptors on Tor Metrics.
https://metrics.torproject.org/collector.html#bridge-descriptors
We're using these sanitized descriptors to visualize interesting facts about Tor bridges, among other things:
https://metrics.torproject.org/bridges-ipv6.html
We're now considering dropping one step of the sanitizing process, namely the removal of contact information. The result would be that we'd keep contact information in sanitized descriptors exactly as the bridge operator put it into their torrc file.
https://metrics.torproject.org/bridge-descriptors.html#contact
https://trac.torproject.org/projects/tor/ticket/20983
Possible advantages are:
- Relay Search would support searching for bridges by contact information.
- People who keep a watching eye on the Tor network could reach out to bridge operators to inform them that they're running an outdated tor/PT version, or that running bridges and exits together is not cool.
- If somebody ever revives OnionTip/TorTip, bridges could participate and receive donations for running a bridge. Or t-shirts, who knows. Note that I'm not promising either here, but without contact information, neither would even be possible.
Possible disadvantages are:
- If somebody runs a relay and a bridge, both with the same contact information, a censoring adversary might guess that the bridge runs on an IP address close to the relay's. However, they could just as well assume that for all relays and block or scan the IP space around all known relays.
- Bridge operators might be surprised to see their contact information in a public archive. We do have a warning in the tor manual https://www.torproject.org/docs/tor-manual.html.en#ContactInfo, but maybe nobody reads the fine manual.
Your opinion counts. Is this a good or a bad idea, and why?
I'll keep this discussion open for a week and not start changing anything during that time. If you want to state your opinion, please do it on this list by Wednesday, February 14.
Thanks!
All the best, Karsten
On Wed, Feb 7, 2018, at 4:45 PM, Karsten Loesing wrote:
Possible disadvantages are:
- If somebody runs a relay and a bridge, both with the same contact information, a censoring adversary might guess that the bridge might run on a nearby IP address as the relay. However, they could as well assume that for all relays and block or scan the IP space around all known relays.
- Bridge operators might be surprised to see their contact information in a public archive. We do have a warning in the tor manual https://www.torproject.org/docs/tor-manual.html.en#ContactInfo, but maybe nobody reads the fine manual.
An email address may be linked to an IP address in public sources, e.g. mailing list archives, forum postings.
On 7. Feb 2018, at 18:55, Geoff Down geoffdown@fastmail.net wrote:
On Wed, Feb 7, 2018, at 4:45 PM, Karsten Loesing wrote:
Possible disadvantages are:
- If somebody runs a relay and a bridge, both with the same contact information, a censoring adversary might guess that the bridge might run on a nearby IP address as the relay. However, they could as well assume that for all relays and block or scan the IP space around all known relays.
- Bridge operators might be surprised to see their contact information in a public archive. We do have a warning in the tor manual https://www.torproject.org/docs/tor-manual.html.en#ContactInfo, but maybe nobody reads the fine manual.
An email address may be linked to an IP address in public sources, e.g. mailing list archives, forum postings.
... or whois information.
On 2018-02-07 19:37, Sebastian Hahn wrote:
On 7. Feb 2018, at 18:55, Geoff Down geoffdown@fastmail.net wrote:
On Wed, Feb 7, 2018, at 4:45 PM, Karsten Loesing wrote:
Possible disadvantages are:
- If somebody runs a relay and a bridge, both with the same contact information, a censoring adversary might guess that the bridge might run on a nearby IP address as the relay. However, they could as well assume that for all relays and block or scan the IP space around all known relays.
- Bridge operators might be surprised to see their contact information in a public archive. We do have a warning in the tor manual https://www.torproject.org/docs/tor-manual.html.en#ContactInfo, but maybe nobody reads the fine manual.
An email address may be linked to an IP address in public sources, e.g. mailing list archives, forum postings.
... or whois information.
Okay.
These sound like variants of the first disadvantage listed above. There are two additional assumptions in here, though:
1) bridge operators use the same or a similar email address as their bridge contact information and for mailing list/forum postings or in their whois information;
2) bridge operators are running their bridges close to the host they're using to post to mailing lists/forums or close to the host where they're hosting a registered domain.
I can see situations where both assumptions are met. But I think, overall, that the low likelihood of locating a bridge by connecting contact information to mailing list archives, forum postings, or whois information makes this attack rather unattractive.
I'd say let's list this as another possible disadvantage, and let's compare them all to the possible advantages at the end.
Unless you thought of this as a show-stopper, in which case I'd kindly ask you to elaborate.
Thanks for the feedback, Geoff and Sebastian!
All the best, Karsten
Hi there,
I don't want to declare it a showstopper outright, but:
On 8. Feb 2018, at 09:42, Karsten Loesing karsten@torproject.org wrote:
These sound like variants of the first disadvantage listed above. There are two additional assumptions in here, though:
- bridge operators use the same or a similar email address as their bridge contact information and for mailing list/forum postings or in their whois information;
- bridge operators are running their bridges close to the host they're using to post to mailing lists/forums or close to the host where they're hosting a registered domain.
Neither is required. The only assumptions are that it is possible to enumerate whois information for the entire v4 internet (which should be the case) and that it is possible to link the email address provided in the contact line with the name that's used in whois (which might or might not be easy, in my case it'd actually be trivial because the name is a part of my email address).
I can see situations where both assumptions are met. But I think, overall, that the likelihood of locating a bridge by connecting contact information to mailing list archives, forum postings, or whois information makes this attack rather unattractive.
I'd say let's list this as another possible disadvantage, and let's compare them all to the possible advantages at the end.
Unless you thought of this as a show-stopper, in which case I'd kindly ask you to elaborate.
Thanks for the feedback, Geoff and Sebastian!
Just to summarize how the attack would work: you link the email to anything containing a real name, then you crawl whois for IPs assigned to people with that name. Unless they use some anonymizing technique, you get a (small) list of candidate IP addresses to test.
Cheers Sebastian
On 2018-02-08 10:54, Sebastian Hahn wrote:
Hi there,
Hello!
I don't want to declare it a showstopper outright, but:
On 8. Feb 2018, at 09:42, Karsten Loesing karsten@torproject.org wrote:
These sound like variants of the first disadvantage listed above. There are two additional assumptions in here, though:
- bridge operators use the same or a similar email address as their bridge contact information and for mailing list/forum postings or in their whois information;
- bridge operators are running their bridges close to the host they're using to post to mailing lists/forums or close to the host where they're hosting a registered domain.
Neither is required.
Hmm? Not sure I understand.
The only assumptions are that it is possible to enumerate whois information for the entire v4 internet (which should be the case)
Right.
and that it is possible to link the email address provided in the contact line with the name that's used in whois (which might or might not be easy, in my case it'd actually be trivial because the name is a part of my email address).
Yes, but this is what I mean in assumption 1) above. You could easily have used a new address for the bridge.
I can see situations where both assumptions are met. But I think, overall, that the likelihood of locating a bridge by connecting contact information to mailing list archives, forum postings, or whois information makes this attack rather unattractive.
I'd say let's list this as another possible disadvantage, and let's compare them all to the possible advantages at the end.
Unless you thought of this as a show-stopper, in which case I'd kindly ask you to elaborate.
Thanks for the feedback, Geoff and Sebastian!
Just to summarize how the attack would work, you link the email to anything containing a real name, you crawl whois for IPs assigned to people with that name, unless they use some anonymizing technique you get a (small) list of candidate IP addresses to test.
Yes, but this only works if assumption 2) above is met. You could easily have run your bridge on a different host than the one that is connected to whois information under your name/address.
To be clear, I see how this could be used to locate some unknown fraction of bridges with relatively small effort. Similar to the first attack that I mentioned under possible disadvantages, and similar to how similar relay and bridge nicknames could give hints on bridge locations.
The question in the end will be whether we want to trade these disadvantages for the advantages from making bridge contact information available to more than a handful of people. I don't have the answer to that question yet.
Cheers Sebastian
Thanks for your input!
All the best, Karsten
Possible advantages are:
another advantage I can come up with: we will be able to analyze bridge shares (if most have contactInfo set), meaning: are one or two entities running all bridges? How many operators are there?
Obviously you could also see this as a disadvantage.
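A minimal sketch of what such a bridge-share analysis could look like, assuming descriptors have already been parsed into (fingerprint, contact) pairs. The sample data and the `operator_shares` helper are hypothetical, and grouping by the exact (lowercased) contact string is only a rough proxy for "same operator", since contact lines are free-form.

```python
from collections import Counter


def operator_shares(bridges):
    """Return {contact: fraction of bridges} among bridges that set a contact.

    `bridges` is an iterable of (fingerprint, contact) pairs; contact may be
    None or empty when the operator did not set ContactInfo.
    """
    # Normalize lightly so trivial differences do not split one operator.
    contacts = [c.strip().lower() for _, c in bridges if c and c.strip()]
    total = len(contacts)
    if total == 0:
        return {}
    counts = Counter(contacts)
    return {contact: n / total for contact, n in counts.items()}


# Hypothetical sample data, not real bridges.
sample = [
    ("A1", "alice@example.org"),
    ("A2", "Alice@example.org "),
    ("B1", "bob@example.net"),
    ("C1", None),
]
shares = operator_shares(sample)
```

Bridges without contact information are left out of the denominator here, so the shares only describe the subset of operators who set one.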
When discussing bridge IP:port secrecy, it is probably worth noting that the IP:port information of about 2k bridges (that is, most bridges) was published in September 2017 (see the metrics-team mailing list post from October 2017). I'm not saying that we should not try to keep hiding that information.
On 2018-02-08 12:19, nusenu wrote:
Possible advantages are:
another advantage I can come up with: we will be able to analyze bridge shares (if most have contactInfo set), meaning: are one or two entities running all bridges? How many operators are there?
Obviously you could also see this as a disadvantage.
Makes sense. I'd count that as an advantage. We're not trying to hide who's running a bridge. We're just trying to hide where bridges are located, so that they're harder to block.
When discussing bridge IP:port secrecy it is probably worth noting that IP:port information of about 2k bridges (that is most bridges) got published last Sept. 2017 (see metrics-team mailing list post from Oct. 2017). I'm not saying that we should not try to keep hiding that information.
Just to give enough context for folks on this list, it wasn't us who published that information, it was a group of researchers.
https://lists.torproject.org/pipermail/metrics-team/2017-October/000489.html
All the best, Karsten
Whatever you decide, I think you should have this mentioned in the setup docs for bridges.
tor-relays mailing list tor-relays@lists.torproject.org https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-relays
On 2018-02-08 19:48, torix@protonmail.com wrote:
Whatever you decide, I think you should have this mentioned in the setup docs for bridges.
We have the following explanation in the manual:
"Administrative contact information for this relay or bridge. This line can be used to contact you if your relay or bridge is misconfigured or something else goes wrong. Note that we archive and publish all descriptors containing these lines and that Google indexes them, so spammers might also collect them. You may want to obscure the fact that it’s an email address and/or generate a new address for this purpose."
https://www.torproject.org/docs/tor-manual.html.en#ContactInfo
What other docs do you have in mind that we should change in case we decide to publish bridge contact information?
All the best, Karsten
Once the decision has been made to publish contactInfo, people with access to the current contactInfo (bridgeDB, isis?) should send current bridge operators a pre-notice about the upcoming change so they have a chance to react to it.
I assume you will not implement this change retroactively (only contactInfo going forward from a given date will be published but not from past descriptors).
We could also reach out to the IMDEA researcher who wrote the bridge paper "Dissecting Tor Bridges: a Security Evaluation of Their Private and Public Infrastructures", as they might have some additional ideas why this could be a bad idea?
On 2018-02-12 11:39, nusenu wrote:
Once the decision has been made to publish contactInfo, people with access to the current contactInfo (bridgeDB, isis?) should send current bridge operators a pre-notice about the upcoming change so they have a chance to react to it.
I assume you will not implement this change retroactively (only contactInfo going forward from a given date will be published but not from past descriptors).
Fine question. I guess there's not much value in having contact information for bridges running in the past, but only for current bridges.
We could pick a date like March 1 or April 1 and start including contact information in descriptors published after that date.
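As a rough illustration of such a cutoff, here is a minimal sketch, assuming each descriptor carries a publication timestamp and an optional contact string. The `sanitize_contact` helper and the choice of March 1 as the default are hypothetical, and dropping the contact to `None` stands in for whatever the sanitizer actually does with removed contact lines.

```python
from datetime import datetime, timezone

# Example cutoff only; March 1 and April 1 are both just candidates.
CUTOFF = datetime(2018, 3, 1, tzinfo=timezone.utc)


def sanitize_contact(published, contact, cutoff=CUTOFF):
    """Keep the contact only for descriptors published on or after the cutoff."""
    if contact is None or published < cutoff:
        return None
    return contact


# Hypothetical descriptors: one published before the cutoff, one after.
old = sanitize_contact(datetime(2018, 2, 20, tzinfo=timezone.utc), "op@example.org")
new = sanitize_contact(datetime(2018, 3, 5, tzinfo=timezone.utc), "op@example.org")
```

Keying the decision on the descriptor's publication time means that archives of older descriptors would stay exactly as they are today.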
Then it also makes sense to reach out to bridge operators with such an announcement.
We could also reach out to the IMDEA researcher who wrote the bridge paper "Dissecting Tor Bridges: a Security Evaluation of Their Private and Public Infrastructures", as they might have some additional ideas why this could be a bad idea?
Good idea, I'll do that now.
(Again, adding more context for this list, this is a different research group from the one that disclosed a list of bridge IP addresses a couple of months ago.)
All the best, Karsten
Possible advantages are:
- Relay Search would support searching for bridges by contact information.
- People who keep a watching eye on the Tor network could reach out to bridge operators to inform them that they're running an outdated tor/PT version, or that running bridges and exits together is not cool.
some more come to mind:
- we could tell operators of obfs2 and obfs3 bridges that they would be much more useful if they run obfs4 PT (increase the usefulness of current resources)
- we could tell operators that running obfs3 and obfs4 is a bad idea
- we could tell operators that exposing their vanilla ORPort is a bad idea
On 2018-02-11 00:43, nusenu wrote:
- we could tell operators that running obfs3 and obfs4 is a bad idea
Are you saying obfs3 and obfs4 shouldn't run simultaneously on the same bridge? That would be good to know indeed.
- we could tell operators that exposing their vanilla ORPort is a bad idea
If you block the ORPort, won't the reachability check fail?
Kind regards, Alexander
On 2018-02-12 11:19, Alexander Dietrich wrote:
On 2018-02-11 00:43, nusenu wrote:
- we could tell operators that running obfs3 and obfs4 is a bad idea
Are you saying obfs3 and obfs4 shouldn't run simultaneously on the same bridge? That would be good to know indeed.
Citing from the IMDEA paper that was mentioned later in this thread:
"The combination of PTs with different security properties raises several security concerns, since the security of the bridge is only as strong as its weakest link. First, an adversary detecting the weakest transport and blocking the IP disables also stronger transports for free, e.g., for the nearly 100 bridges that offer obfs3 or obfs4 in combination with obfs2, which is deprecated and trivial to identify through traffic analysis. Second, it allows an adversary to confirm a bridge, even in presence of transports that implement replay protection. For example, for the most popular combination obfs3+obfs4+ScrambleSuit, offered by 524 bridges, an adversary can confirm a bridge, e.g., identified through traffic analysis [39], through a vertical scan using obfs3 on the candidate IP address."
https://software.imdea.org/~juanca/papers/torbridges_ndss17.pdf
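To make the "weakest link" point concrete, here is a minimal sketch of a check that flags bridges mixing weaker and stronger transports. The split into `WEAK` and `STRONG` sets is my own reading of the quoted passage, not an official classification, and `risky_combination` is a hypothetical helper.

```python
# Assumed classification based on the quoted passage: obfs2 is deprecated,
# and obfs3 can be used to confirm a bridge via a scan.
WEAK = {"obfs2", "obfs3"}
# Transports the passage treats as the stronger ones in a combination.
STRONG = {"obfs4", "scramblesuit"}


def risky_combination(transports):
    """Return True if a bridge offers at least one weak and one strong transport."""
    offered = {t.lower() for t in transports}
    return bool(offered & WEAK) and bool(offered & STRONG)


# The most popular combination mentioned in the paper.
popular = risky_combination(["obfs3", "obfs4", "ScrambleSuit"])
only_obfs4 = risky_combination(["obfs4"])
```

A bridge offering only weak transports is not flagged here, because nothing stronger is being undermined; whether such a bridge is still useful is a separate question.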
- we could tell operators that exposing their vanilla ORPort is a bad idea
If you block the ORPort, won't the reachability check fail?
Fine question. At least this has been the case in the past, though I know there was discussion and maybe development to overcome this weakness. But even if it's not possible yet, having bridge contact information would allow us in the _future_ to reach out to bridge operators to inform them that they don't have to keep their OR port open anymore, and maybe even shouldn't.
I see how this doesn't fully answer your question. Maybe somebody else knows more about the current state of things.
Kind regards, Alexander
All the best, Karsten
If you block the ORPort, won't the reachability check fail?
Fine question. At least this has been the case in the past, though I know there was discussion and maybe development to overcome this weakness. But even if it's not possible yet, having bridge contact information would allow us in the _future_ to reach out to bridge operators to inform them that they don't have to keep their OR port open anymore, and maybe even shouldn't.
https://trac.torproject.org/projects/tor/ticket/7349 https://trac.torproject.org/projects/tor/ticket/17159
On Monday, 12 February 2018 at 12:33, nusenu nusenu-lists@riseup.net wrote:
If you block the ORPort, won't the reachability check fail?
Fine question. At least this has been the case in the past, though I know there was discussion and maybe development to overcome this weakness. But even if it's not possible yet, having bridge contact information would allow us in the _future_ to reach out to bridge operators to inform them that they don't have to keep their OR port open anymore, and maybe even shouldn't.
https://trac.torproject.org/projects/tor/ticket/7349 https://trac.torproject.org/projects/tor/ticket/17159
Sorry, I'm not sure where to ask these questions, but reading this thread I realized that I misunderstood this howto: https://trac.torproject.org/projects/tor/wiki/doc/PluggableTransports/obfs4p...
Is it necessary for the ExtOrPort to be random? Does the port change automatically? Is it possible to specify the port?
And how can the wiki pages be changed?
Thanks :)
On 2018-02-11 00:43, nusenu wrote:
Possible advantages are:
- Relay Search would support searching for bridges by contact information.
- People who keep a watching eye on the Tor network could reach out to bridge operators to inform them that they're running an outdated tor/PT version, or that running bridges and exits together is not cool.
some more come to mind:
- we could tell operators of obfs2 and obfs3 bridges that they would be much more useful if they run obfs4 PT (increase the usefulness of current resources)
- we could tell operators that running obfs3 and obfs4 is a bad idea
- we could tell operators that exposing their vanilla ORPort is a bad idea
Yes, those all make sense. They're sort of variants of the second bullet point above, so I think we should just combine them.
I'm summarizing advantages and disadvantages that we have so far below:
Possible advantages are:
- Relay Search would support searching for bridges by contact information.
- People who keep a watching eye on the Tor network could reach out to bridge operators to inform them that they're running an outdated tor/PT version, that running bridges and exits together is not cool, that they might better be running different PTs, or that running a PT together with another PT or with an exposed vanilla OR port might be a bad idea.
- If somebody ever revives OnionTip/TorTip, bridges could participate and receive donations for running a bridge. Or t-shirts, who knows. Note that I'm not promising either here, but without contact information, neither would even be possible.
- We will be able to analyze bridge shares and in particular bridge operator diversity.
Possible disadvantages are:
- If somebody runs a relay and a bridge, both with the same contact information, a censoring adversary might guess that the bridge runs on an IP address close to the relay's. However, they could just as well assume that for all relays and block or scan the IP space around all known relays.
- Somebody might use an email address as bridge contact information that can be linked to an IP address in public sources, e.g. mailing list archives, forum postings, or whois information. If that IP address is the same as or close to a bridge IP address, then the bridge can be located quite easily.
- Bridge operators might be surprised to see their contact information in a public archive. We do have a warning in the tor manual https://www.torproject.org/docs/tor-manual.html.en#ContactInfo, but maybe nobody reads the fine manual.
All the best, Karsten
On 2018-02-07 17:45, Karsten Loesing wrote:
We're now considering dropping one step of the sanitizing process, namely the removal of contact information. The result would be that we'd keep contact information in sanitized descriptors exactly as the bridge operator put it into their torrc file.
[...]
I'll keep this discussion open for a week and not start changing anything during that time. If you want to state your opinion, please do it on this list by Wednesday, February 14.
FWIW, we collected all feedback from this thread, discussed this change in the metrics team, and forwarded our planned change to the Tor Research Safety Board. I don't know how fast that will move, but I could imagine it's a matter of weeks, not days.
I'll get back to this list when we have a decision whether or not to publish bridge contact information. This will be before we start publishing anything new, with enough heads-up time for bridge operators to either remove or change their contact information.
Thanks again for the discussion.
All the best, Karsten
On Tue, Feb 20, 2018 at 05:51:44PM +0100, Karsten Loesing wrote:
FWIW, we collected all feedback from this thread, discussed this change in the metrics team, and forwarded our planned change to the Tor Research Safety Board. I don't know how fast that will move, but I could imagine it's a matter of weeks, not days.
I just put in a review over on the safety board page, but I'm publishing it here too for completeness / efficiency:
Thought 1: I wouldn't worry that much about whether published contactinfo would help an adversary do blocking. There are many ways that bridge enumeration might happen, and this one seems pretty tame and limited.
But thought 1b: Is there a way to discover if we were wrong and it *is* helping an adversary? It would be nice to have some way to validate this decision not to worry, and some way to detect if it turns out we were wrong. I can't think of a good way, and the lack of a feedback mechanism makes the assumption more risky to act on.
Thought 2: Ordinarily, research groups would do the analysis privately on their data set, and publish only the results. That is, the safety board question would be "Can I collect this data? I'll throw it away afterwards and only publish my analysis." But this is a different situation: the goal is to provide a public data set so others can do their own analysis. It's a tradeoff: potential surprises to bridge operators vs potential benefits to community. This is really a community growth strategy decision. When phrased that way, you might be able to include some more concrete points in the "positive" category, such as: ability for more external researchers to get involved, and increased chance that a community of bridge operators develops. And speaking of community-building, are there volunteers lined up who would contact bridge operators if given the chance, or is this more of a theoretical "maybe it would happen"?
Thought 3: I think sending mail to the current contactinfos, telling them that starting in a few weeks their contactinfo will go public, is a fine approach on the "notice / consent" spectrum -- especially since as you say they technically already got notice when they were editing the torrc file, so this follow-up attempt wouldn't be the first try.
Thought 4: In retrospect, it would be good to have some initial analysis of the (currently secret) data set. For example, how many bridges set contactinfo, and how many don't? How many of each of those are 'fast' (popular) bridges? What fraction of the contactinfos are actually a usable email address? How many bridge families are there now, i.e. bridges that use the same contact email address? Maybe most bridges don't set it currently, so this whole question doesn't matter much, or maybe many of them set it but obfuscate it, which will make your notification plan harder than you predicted.
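A minimal sketch of how some of these numbers could be computed, assuming the contact strings have been extracted into a plain list. The `contact_stats` helper, the sample data, and the deliberately loose email pattern are all hypothetical; the sketch does not cover the 'fast' bridge question, which would need bandwidth or usage data as well.

```python
import re
from collections import Counter

# Loose pattern on purpose; obfuscated addresses such as
# "user at example dot org" will not match and count as unusable.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def contact_stats(contacts):
    """Summarize a list of contact strings (None or "" means not set)."""
    set_contacts = [c.strip() for c in contacts if c and c.strip()]
    with_email = [c for c in set_contacts if EMAIL_RE.search(c)]
    # Count how many bridges share each extracted email address.
    emails = Counter(EMAIL_RE.search(c).group(0).lower() for c in with_email)
    families = sum(1 for n in emails.values() if n > 1)
    return {
        "total": len(contacts),
        "with_contact": len(set_contacts),
        "with_usable_email": len(with_email),
        "families": families,
    }


# Hypothetical sample data, not real bridge contacts.
sample = [
    "Alice <alice@example.org>",
    "alice@example.org",
    "bob at example dot net",
    "",
    None,
]
stats = contact_stats(sample)
```

The gap between `with_contact` and `with_usable_email` gives a first impression of how many operators obfuscate their address, which is the part that would make a notification mail harder to send.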
--Roger
Roger Dingledine:
And speaking of community-building, are there volunteers lined up who would contact bridge operators if given the chance, or is this more of a theoretical "maybe it would happen"?
I'll eventually add a check for "is exit operator also a bridge operator?" and might contact operators in such cases (I'm not saying this is 'community-building').
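Such a check could be sketched roughly as follows, assuming contact strings for exits and bridges are available as two lists. This is only an illustration with a hypothetical `operators_running_both` helper and made-up data; matching on the normalized contact string will miss operators who use different contact lines for their exits and bridges.

```python
def operators_running_both(exit_contacts, bridge_contacts):
    """Return normalized contact strings that appear on both exits and bridges."""

    def norm(contact):
        return contact.strip().lower()

    exits = {norm(c) for c in exit_contacts if c and c.strip()}
    bridges = {norm(c) for c in bridge_contacts if c and c.strip()}
    return sorted(exits & bridges)


# Hypothetical sample data, not real operators.
both = operators_running_both(
    ["ops@example.org", "carol@example.com"],
    ["OPS@example.org ", None, "dave@example.net"],
)
```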
Thought 4: In retrospect, it would be good to have some initial analysis of the (currently secret) data set. For example, how many bridges set contactinfo, and how many don't? How many of each of those are 'fast' (popular) bridges? What fraction of the contactinfos are actually a usable email address? How many bridge families are there now, i.e. bridges that use the same contact email address? Maybe most bridges don't set it currently, so this whole question doesn't matter much, or maybe many of them set it but obfuscate it, which will make your notification plan harder than you predicted.
I really like T4.
thanks for your thoughts!
On 2018-03-04 23:26, Roger Dingledine wrote:
On Tue, Feb 20, 2018 at 05:51:44PM +0100, Karsten Loesing wrote:
FWIW, we collected all feedback from this thread, discussed this change in the metrics team, and forwarded our planned change to the Tor Research Safety Board. I don't know how fast that will move, but I could imagine it's a matter of weeks, not days.
I just put in a review over on the safety board page, but I'm publishing it here too for completeness / efficiency:
Thanks for sharing your response here. That's very helpful!
I didn't respond yet, because I wasn't sure whether there will be more answers from other safety board folks. Do you know whether that's the case? (Maybe they submitted their answers to the mystical safety board system, and somebody needs to close the request before all reviews are sent out.)
Should I just move forward, or should I give it another week or two?
All the best, Karsten