On Mon, 1 Jan 2018 08:45:57 +0000 nullius <nullius@nym.zone> wrote:
On 2017-12-31 at 10:48:52 +0000, Yawning Angel <yawning@schwanenlied.me> wrote:
This is pointless because internationalized domain names are standardized around Punycode encoding (Unicode<->ASCII), and said standard is supported by applications that support IDN queries.
I am firmly against this change, and I'm not particularly thrilled by the thought of homograph attacks either.
Happy New Year, Yawning; and apologies for the delayed reply. I thought I’d best work up some code for an object demonstration of why I urge the importance of UTF-8 (and also embedded spaces, which I forgot to mention explicitly).
I'm aware of the use cases for IDNs.
As for Punycode vs. UTF-8:
Homograph attacks are not “solved” by Punycode any more than they would be fixed by base64ing all addresses. Punycode is not a security feature; to the contrary! CVE-2013-7424, CVE-2015-8948, CVE-2016-6261, CVE-2016-6262, CVE-2017-14062.... Need I say more?
Sigh, the problem is encoding format agnostic.
My point was, by allowing non-ASCII characters the onus is on *someone* to solve the problem of homograph attacks (which admittedly is a bit of a tangent). I'm painfully aware that all browsers, including Tor Browser, have utterly inadequate solutions here.
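To make the "encoding format agnostic" point concrete, here is a minimal Python sketch; it is not taken from prop-279 or from any Tor code. The names `apple.onion` and its spoofed variant, and the helpers `scripts` and `describe`, are made up for illustration. The mixed-script check is a naive heuristic based on Unicode character names, and Python's built-in `idna` codec implements the older IDNA 2003 rules, so treat this as a rough demonstration only.

```python
import unicodedata

def scripts(name):
    """Return the set of script names (first word of each Unicode
    character name) for the alphabetic characters in `name`.
    Naive heuristic, for illustration only."""
    return {unicodedata.name(ch).split()[0] for ch in name if ch.isalpha()}

def describe(name):
    """Show one name in both wire encodings plus its script set."""
    return {
        "utf8": name.encode("utf-8"),   # raw UTF-8 bytes
        "ace": name.encode("idna"),     # Punycode-based ASCII form
        "scripts": scripts(name),
    }

# Hypothetical names: the second starts with CYRILLIC SMALL LETTER A
# (U+0430), which renders like LATIN SMALL LETTER A in most fonts.
latin = "apple.onion"
spoof = "\u0430pple.onion"

for candidate in (latin, spoof):
    info = describe(candidate)
    mixed = len(info["scripts"]) > 1
    print(info["utf8"], info["ace"], sorted(info["scripts"]), "mixed" if mixed else "single")
```

Both the UTF-8 bytes and the ASCII-compatible form tell the two names apart for a machine, but neither helps a person looking at the rendered glyphs. Whichever encoding is on the wire, some separate policy (such as the script check above) has to catch the spoof.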
I know that as you say, applications which handle a string as a “domain” will Punycode it before Tor even sees it. But my thinking from the beginning was not in terms of DNS names. One of my constructive criticisms of prop-279 is that it makes that assumption.
It makes that assumption because it is an entirely reasonable thing to do in the context of Tor.
Dare to dream outside the quasi-DNS box about how .onion addresses can be represented!
I will quote Alec Muffett here:
a) if Onion addresses suddenly stop looking very-similar-to DNS addresses, Tor risks returning to a world where special expertise is necessary to build software for it, thereby harming growth/adoption
The current proposal can get "very-similar-to DNS addresses" IDNs by using the same encoding format that DNS uses.
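For reference, a rough sketch of what "the same encoding format that DNS uses" looks like in practice, using Python's standard `idna` codec. The helper names `to_ace` and `from_ace` and the name `bücher.onion` are hypothetical; how prop-279 name plugins would actually handle such input is an assumption here, not something the proposal text confirms. The codec follows IDNA 2003 rather than IDNA 2008.

```python
def to_ace(name):
    """Convert a Unicode name to its ASCII-compatible (Punycode) form."""
    return name.encode("idna").decode("ascii")

def from_ace(ace):
    """Convert an ASCII-compatible name back to Unicode."""
    return ace.encode("ascii").decode("idna")

# Hypothetical internationalized name; \u00fc is LATIN SMALL LETTER U
# WITH DIAERESIS.
unicode_name = "b\u00fccher.onion"

ace_name = to_ace(unicode_name)
print(ace_name)            # xn--bcher-kva.onion
print(from_ace(ace_name))  # the original Unicode name

# Plain ASCII names pass through unchanged.
print(to_ace("example.onion"))
```

An application that already handles IDNs for DNS would hand the resolver the `xn--` form, so the resolver side only ever sees ASCII labels.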
Regards,